Chapter 3 Summary

Section 16.3 Chapter 3 Summary

In this chapter, you focused on comparing two groups on a categorical variable. You began by extending the simulations of sampling from a finite population to consider two independent random samples from the same population (same probability of success). You again used a normal approximation to find approximate p-values and confidence intervals. You were soon cautioned against jumping to cause-and-effect conclusions when significant differences are found between the two samples. In particular, confounding variables are always a possibility with observational studies. When describing a potential confounding variable, make sure you explain its connection to both the explanatory variable and the response variable, and how it provides an alternative explanation for the observed difference.
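As a reminder of how the normal approximation works for two independent samples, here is a minimal sketch in Python. The counts (30 of 100 versus 18 of 100) and the function name `two_proportion_z` are hypothetical and chosen only for illustration. The sketch assumes the usual large-sample formulas: a pooled standard error for the test statistic and an unpooled standard error for the confidence interval.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2, conf_level=0.95):
    """Two-sided z-test and confidence interval for p1 - p2 (large samples)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Under the null hypothesis the groups share one success probability,
    # so the standard error for the test uses the pooled proportion.
    pooled = (x1 + x2) / (n1 + n2)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The interval does not assume equal probabilities, so it uses the
    # unpooled standard error.
    se_ci = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    interval = (diff - z_star * se_ci, diff + z_star * se_ci)
    return z, p_value, interval


# Hypothetical data: 30 successes out of 100 versus 18 out of 100.
z, p_value, interval = two_proportion_z(30, 100, 18, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({interval[0]:.3f}, {interval[1]:.3f})")
```

A small p-value here tells you only that the difference is unlikely to arise by chance alone; whether it can be attributed to the explanatory variable depends on how the data were collected.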

This led you to consider comparative experiments, which use active imposition of the explanatory variable and random assignment to guard against confounding variables. In a properly designed experiment, which may include placebos, double-blindness, and other controls, there should not be any other substantial differences between the explanatory variable groups. Thus, if you do find a statistically significant difference in the response variable, you should feel comfortable attributing that difference to the imposed explanatory variable.

With a randomized experiment, the analysis focuses on the randomness induced by random assignment to groups. You again approximated p-values through simulation, but this time modeling the random assignment process under the null hypothesis. You modeled "shuffling" the observed responses across the explanatory variable groups, and then calculated a statistic with each shuffle to build up the distribution of that statistic under the null hypothesis. The hypergeometric probability distribution was seen again, this time as a mathematical model for this random assignment process (leading to Fisher’s Exact Test for calculating exact p-values). Again, a normal probability model can often be applied to approximate the p-value and to obtain confidence intervals. Keep in mind that the normal model should be applied only with suitably large sample sizes.
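The shuffling idea and its hypergeometric counterpart can be sketched side by side. The following Python example is illustrative only: the data (10 successes of 12 in group A, 4 of 12 in group B), the function names, the number of shuffles, and the seed are all hypothetical. It computes a one-sided p-value, taking "as extreme" to mean a difference in conditional proportions at least as large as the one observed.

```python
import random
from math import comb


def shuffle_p_value(success_a, n_a, success_b, n_b, reps=10_000, seed=1):
    """Approximate one-sided p-value by re-randomizing responses to groups."""
    rng = random.Random(seed)
    total_success = success_a + success_b
    # One entry per experimental unit: 1 = success, 0 = failure.
    responses = [1] * total_success + [0] * (n_a + n_b - total_success)
    observed = success_a / n_a - success_b / n_b

    count = 0
    for _ in range(reps):
        rng.shuffle(responses)          # mimic a fresh random assignment
        a = sum(responses[:n_a])        # successes landing in group A
        b = total_success - a           # the rest land in group B
        diff = a / n_a - b / n_b
        if diff >= observed - 1e-12:    # small tolerance for float rounding
            count += 1
    return count / reps


def exact_p_value(success_a, n_a, success_b, n_b):
    """Exact one-sided p-value from the hypergeometric distribution."""
    total = n_a + n_b
    total_success = success_a + success_b
    upper = min(total_success, n_a)
    # Count assignments with at least success_a successes in group A.
    favorable = sum(
        comb(total_success, x) * comb(total - total_success, n_a - x)
        for x in range(success_a, upper + 1)
    )
    return favorable / comb(total, n_a)


# Hypothetical data: 10 of 12 successes in group A, 4 of 12 in group B.
simulated = shuffle_p_value(10, 12, 4, 12)
exact = exact_p_value(10, 12, 4, 12)
print(f"simulated p-value: {simulated:.4f}")
print(f"exact p-value:     {exact:.4f}")
```

With enough shuffles the simulated p-value should settle near the exact hypergeometric value, which is the calculation behind the one-sided version of Fisher’s Exact Test.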

Lastly, you considered limitations of the difference in conditional probabilities as the parameter of interest and explored properties of relative risk and odds ratios as alternative ways to summarize the association between two categorical variables. In particular, with case-control studies, only the odds ratio provides a meaningful comparison. You applied your knowledge of simulation methods to obtain approximate p-values for these statistics, learning the usefulness of transformations in normalizing distributions. This allowed you to apply some normal-based methods for obtaining confidence intervals for the population relative risk and population odds ratio. Don’t forget to back-transform these intervals to return to the original scale.
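To make the transform-then-back-transform step concrete, here is a minimal Python sketch. It assumes the transformation is the natural log and uses the standard large-sample standard errors for the log relative risk and log odds ratio; the two-way table counts and the function name `log_scale_intervals` are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def log_scale_intervals(a, b, c, d, conf_level=0.95):
    """Confidence intervals for relative risk and odds ratio.

    a, b = successes and failures in group 1
    c, d = successes and failures in group 2
    """
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

    # Relative risk: ratio of the two conditional proportions.
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

    # Odds ratio: ratio of the two odds of success.
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def back_transformed(estimate, se_log):
        # Build the interval on the log scale, then exponentiate the
        # endpoints to return to the original (ratio) scale.
        center = log(estimate)
        return exp(center - z_star * se_log), exp(center + z_star * se_log)

    return (rr, back_transformed(rr, se_log_rr),
            odds_ratio, back_transformed(odds_ratio, se_log_or))


# Hypothetical table: 30 successes / 70 failures vs. 18 successes / 82 failures.
rr, rr_ci, odds_ratio, or_ci = log_scale_intervals(30, 70, 18, 82)
print(f"relative risk = {rr:.3f}, 95% CI ({rr_ci[0]:.3f}, {rr_ci[1]:.3f})")
print(f"odds ratio    = {odds_ratio:.3f}, 95% CI ({or_ci[0]:.3f}, {or_ci[1]:.3f})")
```

Because the endpoints are exponentiated, the resulting intervals are not symmetric about the estimate, and a value of 1 (rather than 0) corresponds to no association.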
