Chapter 5 Summary

Section 28.1 Chapter 5 Summary

In this chapter, you explored procedures for analyzing two or more groups and for analyzing relationships. As in earlier chapters, you saw that different study designs can lead to the same mathematical computations, but you must still consider the study design when drawing your final conclusions. In particular, you should think about which variables were controlled by the researchers and which were not.

A very common procedure to use with two-way tables is the chi-squared procedure. You saw that, in the case of two groups with a binary response, this is equivalent to the two-sided two-sample $z$-test (the chi-squared statistic equals $z^2$). The chi-squared procedure is a large-sample procedure; Fisher’s Exact Test can be used with smaller sample sizes. A parallel procedure for comparing several means is Analysis of Variance (ANOVA), which models the ratio of the variability between groups to the variability within groups using the $F$ distribution.
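
The link between the chi-squared test and the two-sample $z$-test for two groups can be checked numerically. The following is a minimal R sketch, not taken from the chapter: the counts are made up for illustration, and the continuity correction is turned off (`correct = FALSE`) so that the two statistics match exactly.

```r
# Hypothetical 2x2 table of counts (made up for illustration only)
counts <- matrix(c(30, 20,
                   18, 32),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(group = c("A", "B"),
                                 outcome = c("success", "failure")))

# Chi-squared test without the continuity correction
chisq_out <- chisq.test(counts, correct = FALSE)

# Two-sample z-statistic using the pooled proportion of successes
n <- rowSums(counts)
p_hat <- counts[, "success"] / n
p_pool <- sum(counts[, "success"]) / sum(n)
z <- (p_hat[1] - p_hat[2]) /
  sqrt(p_pool * (1 - p_pool) * (1 / n[1] + 1 / n[2]))

# The chi-squared statistic should equal z^2,
# and the two-sided z-test p-value should equal the chi-squared p-value
unname(chisq_out$statistic)
unname(z^2)
chisq_out$p.value
2 * pnorm(-abs(unname(z)))

# Expected counts, for checking the validity conditions
chisq_out$expected
```

With more than two groups or more than two response categories there is no single $z$-statistic to compare against, which is why the chi-squared statistic is the more general tool.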

Regression is a very important field of study, and your next course in statistics will probably be entirely about regression models. We scratched the surface here by looking at the appropriate numerical and graphical summaries for analyzing the relationship between two quantitative variables (scatterplots and the correlation coefficient) and at least-squares regression models for linear relationships. In deciding whether you have a statistically significant relationship, the $t$-statistic uses the common standardized statistic form that you saw in earlier chapters, $(\text{observed} - \text{hypothesized})/\text{standard error}\text{,}$ where the standard error of the sample slope depends on the sample size, the variability in the $x$ values, and the vertical spread about the regression line.
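
To make the standardized form concrete, here is a hedged R sketch that rebuilds the slope's $t$-statistic by hand and compares it with the output of `summary(lm(...))`. The data values are invented for illustration. The standard error is computed as $s / (s_x \sqrt{n-1})$, where $s$ is the residual standard error and $s_x$ is the standard deviation of the $x$ values; this is one standard way to write it, showing the three ingredients named above.

```r
# Hypothetical data (made up for illustration only)
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 2.9, 3.6, 5.2, 5.8, 7.1, 7.4, 9.0)

fit <- lm(y ~ x)
n <- length(x)

# Observed sample slope
b1 <- unname(coef(fit)["x"])

# Vertical spread about the line: residual standard error with n - 2 df
s <- sqrt(sum(residuals(fit)^2) / (n - 2))

# Standard error of the slope: depends on s, the spread in x, and n
se_b1 <- s / (sd(x) * sqrt(n - 1))

# (observed - hypothesized) / standard error, with hypothesized slope 0
t_stat <- (b1 - 0) / se_b1
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# Compare the hand calculation with the software output
c(estimate = b1, se = se_b1, t = t_stat, p = p_value)
coef(summary(fit))["x", ]
```

Reading the formula for `se_b1`: a larger sample, more spread-out $x$ values, or less scatter about the line each make the standard error smaller and the $t$-statistic larger in magnitude.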

These calculations are complex to carry out by hand, so our focus is on knowing when to use which procedure, how to check the validity of the procedure, and how to interpret the resulting output. Learning to effectively interpret and communicate statistical results is at least as important a skill as learning which computer menus to use!

Subsection 28.1.1 Technology Summary

In this chapter, you learned how to use software to perform chi-squared tests, ANOVA, and inference for regression. You also learned how to create scatterplots, including coded scatterplots, and calculate correlation coefficients. You used applets to explore properties of $F$-statistics, regression lines, and regression coefficients. Though we hope these have given you some memorable visual images, you will generally use software to carry out specific analyses.

Subsection 28.1.2 Choice of Procedures for Comparing Several Populations, Exploring Relationships

Setting	$EV\text{:}$ Categorical, $RV\text{:}$ Categorical	$EV\text{:}$ Categorical, $RV\text{:}$ Quantitative	$EV\text{:}$ Quantitative, $RV\text{:}$ Quantitative
Graphical summary	Segmented bar graph	Stacked boxplots	Scatterplot
Numerical summary	Conditional proportions	Group means and standard deviations	Correlation coefficient
Procedure name	Chi-squared	ANOVA	Regression
Valid to use theory-based procedure if	All expected counts $> 1\text{,}$ at least 80% of expected counts $> 5$; data are simple random samples, independently chosen random samples, or from random assignment	Independent SRSs or random assignment; populations normal (probability plots of samples); variances are equal ($s_{\max}/s_{\min} \lt 2$)	Linear relationship (residuals vs. $x$); independent observations (e.g., random samples or randomization); normality of response at each $x$ value (probability plot/histogram of residuals); equal variance of $y$ at each $x$ value (residuals vs. $x$)
Null hypothesis	$H_0\text{:}$ Response variable distributions are the same or $H_0\text{:}$ no association between response variable and explanatory variable	$H_0: \mu_1 = \cdots = \mu_I$ or $H_0\text{:}$ treatment means are equal	$H_0: \beta_1 = 0$ (no relationship between response variable and explanatory variable)
Test statistic	$\chi^2 = \sum \frac{(O-E)^2}{E}$	$F = MST/MSE$	$t_0 = \frac{b_1 - \text{hypothesized value}}{SE(b_1)}$
Distribution for $p$-value	Chi-squared with $(c-1)(r-1)$ df	$F$ with $I-1, n-I$ df	$t$ with $n-2$ df


R	`chisq.test`, `summary(aov(response~x))`, `summary(lm(response~x))`
Minitab	Stat > Tables > Chi-square Test for Association; Stat > ANOVA > One-way; Stat > Regression > Regression > Fit Regression Model
JMP	Analyze > Fit Y by X (for all three procedures)
Applet	Analyzing Two-way Tables; Comparing Groups (Quantitative); Analyzing Two Quantitative Variables

Subsection 28.1.3 Quick Reference to ISCAM R Workspace Functions and Other R Commands

Procedure Desired	Function Name (Options)
Normal Probability Plot	`qqnorm(data)`
Probability Plot	`qqplot(data from distribution, your data)`
Probabilities from Chi-squared distribution	`iscamchisqprob(xval, df)` returns upper tail probability
Chi-squared Test	`chisq.test(table(data))`; `chisq.test(table(data))$expected`; `$residuals`
One-way ANOVA	`summary(aov(response~explanatory))`
Probabilities from F distribution	`pf(x, df1, df2, lower.tail=FALSE)`
Scatterplot	`plot(explanatory, response)`
Correlation coefficient	`cor(x, y)`
Least Squares Regression Line	`lm(response~explanatory)`; `lm(response~explanatory)$residuals`
Coded Scatterplot	`plot(response~explanatory, col=groups)`
Inference for regression	`summary(lm(response~explanatory))`
Prediction intervals	`predict(lm(response~explanatory), newdata=data.frame(explanatory=value), interval="prediction")` (or `interval="confidence"`)
Superimpose y = x line	`abline(0, 1)` or `lines(response, response)`
Superimpose regression line	`abline(lm(response~explanatory))`
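
As one illustration of how the ANOVA and $F$-distribution entries in this table fit together, the sketch below computes $F = MST/MSE$ by hand and compares it with `summary(aov(...))`. The data and the names `response`, `explanatory`, and `n_groups` are hypothetical, chosen only for this example.

```r
# Hypothetical data: three groups of five observations (made up)
response <- c(12, 15, 14, 11, 13,
              16, 18, 17, 19, 15,
              10,  9, 12, 11,  8)
explanatory <- factor(rep(c("g1", "g2", "g3"), each = 5))

n <- length(response)
n_groups <- nlevels(explanatory)
group_means <- tapply(response, explanatory, mean)
group_sizes <- tapply(response, explanatory, length)
group_sds <- tapply(response, explanatory, sd)

# Validity check: ratio of largest to smallest group SD should be below 2
sd_ratio <- max(group_sds) / min(group_sds)

# Between-group variability (MST) and within-group variability (MSE)
MST <- sum(group_sizes * (group_means - mean(response))^2) / (n_groups - 1)
MSE <- sum((response - ave(response, explanatory))^2) / (n - n_groups)

# F-statistic and upper-tail p-value with I - 1 and n - I df
F_stat <- MST / MSE
p_value <- pf(F_stat, n_groups - 1, n - n_groups, lower.tail = FALSE)

# Compare with the built-in one-way ANOVA table
anova_table <- summary(aov(response ~ explanatory))[[1]]
c(F = F_stat, p = p_value, sd_ratio = sd_ratio)
anova_table
```

Here `ave(response, explanatory)` returns each observation's group mean, so the squared differences in `MSE` measure spread within groups only.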

Subsection 28.1.4 Quick Reference to Minitab Commands

Procedure Desired	Menu
Probability Plot	Graph > Probability Plot
Probabilities from Chi-Square distribution	Graph > Probability Distribution Plot
Chi-square Test	Stat > Tables > Chi-Square Test for Association; Stat > Tables > Cross Tabulation and Chi-Square
One-way ANOVA	Stat > ANOVA > One-way
Probabilities from F distribution	Graph > Probability Distribution Plot
Scatterplot	Graph > Scatterplot
Correlation coefficient	Stat > Basic Statistics > Correlation
Least Squares Regression Line	Stat > Regression > Fitted Line Plot; Storage: Residuals
Coded Scatterplot	Graph > Scatterplot, With Groups
Inference for Regression	Stat > Regression > Regression > Fit Regression Model
Prediction Intervals	After running the model: Stat > Regression > Regression > Predict
Superimpose y = x line	right click, Add > Calculated Line
Superimpose regression line	Stat > Regression > Fitted Line Plot; right click on scatterplot, Add > Regression Fit

Subsection 28.1.5 Quick Reference to JMP Commands

Procedure Desired	Menu; Hot spot
Normal Probability Plot	Analyze > Distribution; Normal quantile plot
Probability Plot	Analyze > Distribution; Continuous Fit
Probabilities from Chi-squared distribution	Distribution Calculator
Chi-squared Test	Analyze > Fit Y by X
One-way ANOVA	Analyze > Fit Y by X; Means/ANOVA/Pooled t
Probabilities from F distribution	Distribution Calculator
Scatterplot	Analyze > Fit Y by X
Correlation coefficient	Analyze > Multivariate Methods > Multivariate
Least Squares Regression Line	Analyze > Fit Y by X; Fit Line
Coded Scatterplot	Analyze > Fit Y by X; By
Inference for regression	Analyze > Fit Y by X; Fit Line
Prediction intervals	Analyze > Fit Y by X; Fit Line; Linear Fit > Mean Confidence Interval Formula or Individual Confidence Interval Formula
Superimpose y = x line	Analyze > Fit Y by X; Fit Special
Superimpose regression line	Analyze > Fit Y by X; Fit Line
