Section 7.3 Chapter Summary
To summarize, with a data set that includes 51 data points with the numbers of people in states, and a bit of work using R to construct a distribution of sampling means, we have learned the following:
-
Run a statistical process a large number of times and you get a consistent pattern of results.
-
Taking the means of a large number of samples and plotting them on a histogram shows that the sample means are fairly well normally distributed and that the center of the distribution is very, very close to the mean of the original raw data.
-
This resulting distribution of sample means can be used as a basis for comparisons. By making cut points at the extreme low and high ends of the distribution, for example 2.5% and 97.5%, we have a way of comparing any new information we get.
-
If we get a new sample mean, and we find that it is in the extreme zone defined by our cut points, we can tentatively conclude that the sample that made that mean is a different kind of thing than the samples that made the sampling distribution.
-
A shortcut and more accurate way of figuring the cut points involves calculating the "standard error" based on the standard deviation of the original raw data.
Weβre not statisticians at this point, but the process of reasoning based on sampling distributions is at the heart of inferential statistics, so if you have followed the logic presented in this chapter, you have made excellent progress towards being a competent user of applied statistics.
You have attempted of activities on this page.