
Advanced High School Statistics: Third Edition

Section 6.2 Difference of two proportions

We often wish to compare two groups to each other. In this section, we will answer the following questions:
  • How much more effective is a blood thinner than a placebo for those who undergo CPR for a heart attack?
  • How different is the approval of the 2010 healthcare law under two different question phrasings?
  • Does the use of fish oils reduce heart attacks better than a placebo?

Subsection 6.2.1

Subsection 6.2.2 Sampling distribution for the difference of two proportions (review)

In this section we want to compare proportions from two independent groups. When comparing two proportions, the quantity that we generally want to estimate is the difference \(p_1-p_2\text{,}\) which tells us how far apart the two population proportions are.
Before we perform inference for the two proportion case, we must review the sampling distribution for \(\hat{p}_1-\hat{p}_2\text{,}\) which will represent our point estimate. We know from an earlier section that when the independence condition is satisfied, the sampling distribution for \(\hat{p}_1-\hat{p}_2\) is centered on \(p_{1}-p_{2}\) and the standard deviation is given by:
\begin{gather*} \sigma_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{p_{1}(1-p_{1})}{n_{1}} + \frac{p_{2}(1-p_{2})}{n_{2}} } \end{gather*}
When the individual population proportions are unknown, we estimate the standard deviation of \(\hat{p}_1-\hat{p}_2\) using the Standard Error, abbreviated SE. The SE of \(\hat{p}_1-\hat{p}_2\) is found by substituting in our best estimates of \(p_1\) and \(p_2\) using the sample values:
\begin{gather*} SE_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_{1}} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_{2}} } \end{gather*}
The difference of two sample proportions \(\hat{p}_1-\hat{p}_2\) follows a nearly normal distribution when two conditions are met. First, the sampling distribution for each sample proportion must be nearly normal. Second, the observations must be independent, both within and between groups. We cover these conditions in greater detail next.
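The center and spread described above can be checked with a small simulation. The sketch below is only an illustration: the population proportions and sample sizes are hypothetical, the helper names are not from the text, and the seed is fixed so the run is repeatable. It draws many pairs of independent samples and compares the simulated differences with \(p_1-p_2\) and with the standard deviation formula.

```python
import math
import random
import statistics


def sd_formula(p1, n1, p2, n2):
    """Standard deviation of p1_hat - p2_hat, using the formula from the text."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def simulate_differences(p1, n1, p2, n2, reps, seed=0):
    """Draw `reps` pairs of independent samples; return each p1_hat - p2_hat."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))  # successes in sample 1
        x2 = sum(rng.random() < p2 for _ in range(n2))  # successes in sample 2
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical population values, chosen only for illustration.
p1, n1, p2, n2 = 0.50, 100, 0.40, 80
diffs = simulate_differences(p1, n1, p2, n2, reps=5000)
mean_diff = statistics.mean(diffs)
sd_sim = statistics.stdev(diffs)

print(f"mean of simulated differences: {mean_diff:.3f} (p1 - p2 = {p1 - p2:.3f})")
print(f"SD of simulated differences:   {sd_sim:.3f} (formula: {sd_formula(p1, n1, p2, n2):.3f})")
```

The simulated mean and standard deviation should land close to the theoretical values, though not exactly on them, because the simulation itself is subject to random variation.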

Subsection 6.2.3 Checking conditions for inference using a normal distribution

When comparing two proportions, we carry out inference on \(p_1-p_2\text{.}\) To do so, we assume that the sampling distribution of \(\hat{p}_1-\hat{p}_2\) is nearly normal. This assumption is reasonable when two conditions are met:
Independence. The observations across the two samples are independent. This condition is generally satisfied by checking whether the data are collected from two independent random samples or from an experiment with two randomly assigned treatments. Randomly assigning subjects to treatments is equivalent to randomly assigning treatments to subjects. When sampling without replacement, the observations can be considered independent when the sample size is less than 10% of the population size for both samples.
Success-failure condition. In the two-sample case, the number of successes and failures should be at least 10 for both groups, so there are four inequalities to check.
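The numerical parts of these conditions are easy to automate. The sketch below is one possible way to do so, with hypothetical function names and made-up counts. Because \(n\hat{p}\) is simply the number of successes and \(n(1-\hat{p})\) is the number of failures, the four inequalities reduce to comparing four counts with 10. Whether the data come from random samples or a randomized experiment still has to be judged from the study design, not from code.

```python
def success_failure_ok(x1, n1, x2, n2, minimum=10):
    """Return True when all four success and failure counts reach the minimum."""
    counts = [x1, n1 - x1, x2, n2 - x2]  # n*p_hat and n*(1 - p_hat) for each group
    return all(count >= minimum for count in counts)


def ten_percent_ok(n, population_size):
    """Return True when a sample drawn without replacement is under 10% of its population."""
    return n < 0.10 * population_size


# Hypothetical counts: 30 successes out of 80 in group 1, 6 out of 45 in group 2.
print(success_failure_ok(30, 80, 6, 45))  # group 2 has only 6 successes
print(ten_percent_ok(80, 5000))           # 80 compared with 10% of 5000
```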

Subsection 6.2.4 Confidence interval for the difference of two proportions

We consider an experiment for patients who underwent CPR for a heart attack and were subsequently admitted to a hospital. These patients were randomly divided into a treatment group where they received a blood thinner or the control group where they did not receive a blood thinner. The outcome variable of interest was whether the patients survived for at least 24 hours. The results are shown in Table 6.2.1.
Table 6.2.1. Results for the CPR study. Patients in the treatment group were given a blood thinner, and patients in the control group were not.
Survived Died Total
Treatment 14 26 40
Control 11 39 50
Total 25 65 90
Here, the parameter of interest is a difference of population proportions, specifically, the difference in the proportion of similar patients that would survive for at least 24 hours if in the treatment group versus if in the control group. Let:
\begin{align*} p_1:\amp \text{ proportion that would survive in treatment group, and }\\ p_2:\amp \text{ proportion that would survive in control group } \end{align*}
Then the parameter of interest is \(p_1 - p_2\text{.}\) In order to use a Z-interval to estimate this difference, we must see if the point estimate, \(\hat{p}_{1} - \hat{p}_{2}\text{,}\) follows a normal distribution. Because the patients were randomly assigned to one of the two groups and one heart attack patient is unlikely to influence the next that was in the study, the observations are considered independent, both within the samples and between the samples. Next, the success-failure condition should be verified for each group. We use the sample proportions along with the sample sizes to check the condition.
\begin{align*} n_1\hat{p}_1\amp \ge 10 \amp n_1(1-\hat{p}_1)\amp \ge 10 \amp n_2\hat{p}_2\amp \ge 10 \amp n_2(1-\hat{p}_2)\amp \ge 10\\ 40 \times \frac{14}{40} \amp \ge 10 \amp 40 \times (1-\frac{14}{40}) \amp \ge 10 \amp 50 \times \frac{11}{50} \amp \ge 10 \amp 50 \times (1-\frac{11}{50}) \amp \ge 10 \end{align*}
Because all conditions are met, the normal model can be used for the point estimate of the difference in survival rate.
The point estimate is:
\begin{gather*} \hat{p}_{1} - \hat{p}_{2} = \frac{14}{40} - \frac{11}{50} = 0.35 - 0.22 = 0.13 \end{gather*}
We compute the standard error for the difference of sample proportions in the same way that we compute the standard deviation for the difference of sample proportions — the only difference is that we use the sample proportions in place of the population proportions:
\begin{align*} SE \amp = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}= \sqrt{\frac{0.35 (1 - 0.35)}{40} + \frac{0.22 (1 - 0.22)}{50}} = 0.095 \end{align*}
Let us estimate the true difference in survival rate with 90% confidence. For a 90% confidence level, we use \(z^{\star} = 1.645\text{,}\) which is rounded to 1.65 in the calculation below. The 90% confidence interval is calculated as:
\begin{align*} \text{ point estimate } \pm\amp z^{\star} \times SE\ \text{ of estimate }\\ 0.13 \pm\amp 1.65\times 0.095\\ \amp (-0.027, 0.287) \end{align*}
We are 90% confident that the true difference in the survival rate (treatment \(-\) control) lies between -0.027 and 0.287. That is, we are 90% confident that the treatment of blood thinners changes survival rate for patients like those in the study by \(-2.7\) to \(+28.7\) percentage points. Because this interval contains both negative and positive values, we do not have enough information to say with confidence whether blood thinners harm or help heart attack patients who have been admitted after they have undergone CPR.

Constructing a confidence interval for the difference of two proportions.

To carry out a complete confidence interval procedure to estimate the difference of two proportions \(p_1-p_2\text{,}\)
Identify: Identify the parameter and the confidence level, C%.
  • The parameter will be a difference of proportions, e.g. the true difference in the proportion of 17 and 18 year olds with a summer job (proportion of 18 year olds \(-\) proportion of 17 year olds).
Choose: Identify the correct interval procedure and identify it by name.
  • Here we choose the 2-proportion Z-interval.
Check: Check conditions for the sampling distribution of \(\hat{p}_1-\hat{p}_2\) to be nearly normal.
  1. Independence: Data come from 2 independent random samples or from a randomized experiment with two treatments. When sampling without replacement, check that the sample size is less than 10% of the population size for both samples.
  2. Success-failure: \(n_1\hat{p}_1\geq10\text{,}\) \(n_1(1-\hat{p}_1)\geq10\text{,}\) \(n_2\hat{p}_2\geq10\text{,}\) and \(n_2(1-\hat{p}_2)\geq10\)
Calculate: Calculate the confidence interval and record it in interval form.
  • \(\text{ point estimate } \pm\ z^{\star} \times SE \text{ of estimate }\)
    • point estimate: the difference of sample proportions \(\hat{p}_1 - \hat{p}_2\)
    • \(SE\) of estimate: \(\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)
    • \(z^{\star}\text{:}\) use a \(t\)-table at row \(\infty\) and confidence level C
  • (___, ___)
Conclude: Interpret the interval and, if applicable, draw a conclusion in context.
  • We are C% confident that the true difference in the proportion of [...] is between ___ and ___. If applicable, draw a conclusion based on whether the interval is entirely above, is entirely below, or contains the value 0.
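The Calculate step can also be carried out in software. The following is a minimal sketch, not a replacement for checking conditions: the function name `two_prop_z_interval` is hypothetical, and \(z^{\star}\) is obtained from Python's standard-library `statistics.NormalDist` instead of a table. It is applied here to the counts from the CPR study in Table 6.2.1.

```python
import math
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, conf_level):
    """Confidence interval for p1 - p2 from success counts and sample sizes."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    point_estimate = p1_hat - p2_hat
    # Unpooled standard error: each group uses its own sample proportion.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value that leaves (1 - conf_level) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_star * se
    return point_estimate - margin, point_estimate + margin


# CPR study: 14 of 40 survived with the blood thinner, 11 of 50 without it.
lower, upper = two_prop_z_interval(14, 40, 11, 50, conf_level=0.90)
print(f"90% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

Because the function carries full precision through every step, its endpoints can differ in the last digit from a hand calculation that rounds the standard error first.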

Example 6.2.2.

A remote control car company is considering a new manufacturer for wheel gears. The new manufacturer would be more expensive but their higher quality gears are more reliable, resulting in happier customers and fewer warranty claims. However, management must be convinced that the more expensive gears are worth the conversion before they approve the switch. The quality control engineer collects a sample of gears, examining 1000 gears from each company and finds that 879 gears pass inspection from the current supplier and 958 pass inspection from the prospective supplier. Using these data, construct a 95% confidence interval for the difference in the proportion from each supplier that would pass inspection. Use the five step framework described above to organize your work.
Solution.
Identify: First we identify the parameter of interest. Here the parameter we wish to estimate is the true difference in the proportion of gears from each supplier that would pass inspection, \(p_1-p_2\text{.}\) We will take the difference as: current \(-\) prospective, so \(p_1\) is the true proportion that would pass from the current supplier and \(p_2\) is the true proportion that would pass from the prospective supplier. We will estimate the difference using a 95% confidence level.
Choose: Because the parameter to be estimated is a difference of proportions, we will use a 2-proportion Z-interval.
Check: The samples are independent, but not necessarily random, so to proceed we must assume the gears are all independent. For this sample we will suppose this assumption is reasonable, but the engineer would be more knowledgeable as to whether this assumption is appropriate. We also must verify the minimum sample size conditions:
\begin{align*} 1000 \times \frac{879}{1000} \amp \ge 10 \amp 1000 \times \frac{121}{1000} \amp \ge 10 \amp 1000 \times \frac{958}{1000} \amp \ge 10 \amp 1000 \times \frac{42}{1000} \amp \ge 10 \end{align*}
The success-failure condition is met for both samples.
Calculate: We will calculate the interval:
\begin{gather*} \text{ point estimate } \pm\ z^{\star} \times SE \text{ of estimate } \end{gather*}
The point estimate is the difference of sample proportions: \(\hat{p}_1-\hat{p}_2 = 0.879 - 0.958 = -0.079\text{.}\)
The \(SE\) of the difference of sample proportions is:
\(\sqrt{\frac{\ \hat{p}_1(1-\hat{p}_1)\ }{n_1}+ \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}\ } = \sqrt{\frac{0.879(1-0.879)}{1000} +\frac{0.958(1-0.958)}{1000}}= 0.0121\)
So the 95% confidence interval is given by:
\begin{align*} 0.879 - 0.958 \ \pm \ \amp 1.96 \times \sqrt{\frac{0.879(1-0.879)}{1000} +\frac{0.958(1-0.958)}{1000}}\\ -0.079\ \pm\ \amp 1.96 \times 0.0121\\ \amp (-0.103, -0.055) \end{align*}
Conclude: We are 95% confident that the true difference (current \(-\) prospective) in the proportion that would pass inspection is between -0.103 and -0.055, meaning that we are 95% confident that the prospective supplier's rate of passing inspection would be between 5.5 and 10.3 percentage points higher. Because the entire interval is below zero, the data provide sufficient evidence that the prospective gears pass inspection more often than the current gears. The remote control car company should go with the new manufacturer.

Subsection 6.2.5 Calculator: the 2-proportion Z-interval

As with the 1-proportion Z-interval, a calculator can be helpful for evaluating the final interval.

TI-83/84: 2-proportion Z-interval.

Use STAT, TESTS, 2-PropZInt.
  1. Choose STAT.
  2. Right arrow to TESTS.
  3. Down arrow and choose B:2-PropZInt.
  4. Let x1 be the number of yeses (must be an integer) in sample 1 and let n1 be the size of sample 1.
  5. Let x2 be the number of yeses (must be an integer) in sample 2 and let n2 be the size of sample 2.
  6. Let C-Level be the desired confidence level.
  7. Choose Calculate and hit ENTER, which returns:
    (,) the confidence interval
    \(\hat{p}_1\) sample 1 proportion \(n_1\) size of sample 1
    \(\hat{p}_2\) sample 2 proportion \(n_2\) size of sample 2

Casio fx-9750GII: 2-proportion Z-interval.

  1. Navigate to STAT (MENU button, then hit the 2 button or select STAT).
  2. Choose the INTR option (F4 button).
  3. Choose the Z option (F1 button).
  4. Choose the 2-P option (F4 button).
  5. Specify the interval details:
    • Confidence level of interest for C-Level.
    • Enter the number of successes for each group, x1 and x2.
    • Enter the sample size for each group, n1 and n2.
  6. Hit the EXE button, which returns
    Left, Right the ends of the confidence interval
    \(\hat{p}1\text{,}\) \(\hat{p}2\) the sample proportions
    n1, n2 sample sizes

Guided Practice 6.2.3.

From Example 6.2.2, we have that a quality control engineer collects a sample of gears, examining 1000 gears from each company and finds that 879 gears pass inspection from the current supplier and 958 pass inspection from the prospective supplier. Use a calculator to find a 95% confidence interval for the difference (current - prospective) in the proportion that would pass inspection.
Solution.
Navigate to the 2-proportion Z-interval on the calculator. Let x1 \(= 879\text{,}\) n1 \(= 1000\text{,}\) x2 \(= 958\text{,}\) and n2\(= 1000\text{.}\) C-Level is .95. This should lead to an interval of \((-0.1027, -0.0553)\text{,}\) which matches what we found previously.

Subsection 6.2.6 Hypothesis testing when \(H_0\text{:}\) \(p_1 = p_2\)

Here we use a new example to examine a special estimate of the standard error when the null hypothesis is that two population proportions equal each other, i.e. \(H_0\text{:}\) \(p_1 = p_2\text{.}\) We investigate whether the way a question is phrased can influence a person’s response. Pew Research Center conducted a survey with the following question:
As you may know, by 2014 nearly all Americans will be required to have health insurance. [People who do not buy insurance will pay a penalty] while [People who cannot afford it will receive financial help from the government]. Do you approve or disapprove of this policy?
For each randomly sampled respondent, the statements in brackets were randomized: either they were kept in the original order given above, or they were reversed. Results are presented in Table 6.2.4.
Table 6.2.4. Results for a Pew Research Center poll where the ordering of two statements in a question regarding healthcare were randomized.
Sample size Approve law (%) Disapprove law (%) Other (%)
“People who do not buy insurance will pay a penalty” is given first (original order) 771 47 49 4
“People who cannot afford it will receive financial help from the government” is given first (reversed order) 732 34 63 3

Guided Practice 6.2.5.

Is this study an experiment or an observational study?
Solution.
There is a random sample involved, but there are also two treatments. Half of the respondents are given the original statement order and the other half, randomly, are given the reversed statement order. This is an experiment because there are randomly assigned treatments.
The approval percents of 47% and 34% seem far apart. However, could this difference be due to random chance? We will answer this question using a hypothesis test. To simplify things, let
\begin{align*} p_1\amp \text{ : the proportion of respondents that would approve of policy with the original statement ordering, and }\\ p_2\amp \text{ : the proportion of respondents that would approve of policy with the reversed statement ordering. } \end{align*}

Example 6.2.6.

Set up hypotheses to test whether the two statement orders produce the same response.
Solution.
The null claim is that the question order does not matter, that is, that the two proportions should be equal. The alternate claim, the one that bears the burden of proof, is that the question ordering does matter.
\(H_0\text{:}\) \(p_1 = p_2\)
\(H_A\text{:}\) \(p_1 \ne p_2\)
Now, we can note that:
\begin{align*} p_1=p_2 \amp \text{ is equivalent to } p_1-p_2=0\text{ , and }\\ p_1\ne p_2 \amp \text{ is equivalent to } p_1-p_2\ne 0\text{.} \end{align*}
We can now see that the hypotheses are really about a difference of proportions: \(p_1-p_2\text{.}\) In the last section, we used a 2-proportion Z-interval to estimate the parameter \(p_1-p_2\text{;}\) here, we will use a 2-proportion Z-test to test the null hypothesis that \(p_1-p_2=0\text{,}\) i.e. that \(p_1=p_2\text{.}\)
Recall that the test statistic Z has the form:
\begin{gather*} Z = \frac{\text{ point estimate } - \text{ null value } }{SE\ \text{ of estimate } } \end{gather*}
The parameter of interest is \(p_1-p_2\text{,}\) so the point estimate will be the observed difference of sample proportions: \(\hat{p}_{1} - \hat{p}_{2} = 0.47 - 0.34 = 0.13\text{.}\)
The null value depends on the null hypothesis. The null hypothesis is that the approval rate would be the same for both statement orderings, i.e. that the difference is 0, therefore, the null value is 0. In this section we consider only the case where \(H_0\text{:}\) \(p_1=p_2\text{,}\) so the null value for the difference will always be 0.
The \(SD\) of a difference of sample proportions has the form:
\begin{gather*} SD = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} \end{gather*}
However, in a hypothesis test, the distribution of the point estimate is always examined assuming the null hypothesis is true, i.e. in this case, \(p_1 = p_2\text{.}\) Both the success-failure check and the standard error formula should reflect this equality in the null hypothesis. We will use \(p_c\) to represent the common proportion that support the healthcare law regardless of statement order:
\begin{align*} SD \amp = \sqrt{\frac{p_c(1-p_c)}{n_1} + \frac{p_c(1-p_c)}{n_2}}\\ \amp = \sqrt{p_c(1-p_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \end{align*}
We don’t know the true proportion \(p_c\text{,}\) but we can obtain a good estimate of it, \(\hat{p}_c\text{,}\) by pooling the results of both samples. We find the total number of “yeses” or “successes” and divide that by the total number of cases. This is equivalent to taking a weighted average of \(\hat{p}_1\) and \(\hat{p}_2\text{.}\) We call \(\hat{p}_c\) the pooled sample proportion, and we use it to check the success-failure condition and to compute the standard error when the null hypothesis is that \(p_1 = p_2\text{.}\) Here:
\begin{equation*} \hat{p}_c = \frac{771(0.47) + 732(0.34)}{771+732}= 0.407 \end{equation*}

Pooled sample proportion.

When the null hypothesis is \(p_1 = p_2\text{,}\) it is useful to find the pooled sample proportion:
\begin{gather*} \hat{p}_c = \frac{\text{ number of “successes” } }{\text{ number of cases } } = \frac{\text{x} _1+\text{x} _2}{n_1+n_2}=\frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} \end{gather*}
Here \(\text{x} _1\) represents the number of successes in sample 1. If \(\text{x} _1\) is not given, it can be computed as \(n_1\times \hat{p}_1\text{.}\) Similarly, \(\text{x} _2\) represents the number of successes in sample 2 and can be computed as \(n_2\times \hat{p}_2\text{.}\)
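The two forms of the pooled proportion can be written as a pair of small functions. This is only a sketch with hypothetical function names. The first form uses counts directly, and the second uses sample proportions, which is convenient for the healthcare poll where only percentages are reported.

```python
def pooled_from_counts(x1, n1, x2, n2):
    """Pooled proportion: total successes divided by total cases."""
    return (x1 + x2) / (n1 + n2)


def pooled_from_proportions(p1_hat, n1, p2_hat, n2):
    """Pooled proportion as a weighted average of the two sample proportions."""
    return (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)


# Healthcare poll: 47% of 771 approve (original order), 34% of 732 (reversed order).
p_c_hat = pooled_from_proportions(0.47, 771, 0.34, 732)
print(round(p_c_hat, 3))  # 0.407
```

Since the weights are the sample sizes, the pooled value always falls between the two sample proportions and sits closer to the proportion from the larger sample.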

Use the pooled sample proportion when \(H_0\text{:}\) \({p}_1 = {p}_2\).

When the null hypothesis states that the proportions are equal, we use the pooled sample proportion (\(\hat{p}_c\)) to check the success-failure condition and to estimate the standard error:
\begin{gather} SE =\sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\tag{6.2.1} \end{gather}

Example 6.2.7.

Verify that conditions for using the normal are met and find the \(SE\) of estimate for this hypothesis test. Recall that the pooled proportion \(\hat{p}_c=0.407\text{,}\) \(n_1 = 771\text{,}\) and \(n_2=732\text{.}\)
Solution.
The data do come from two randomly assigned treatments, where the treatments are the two different orderings of the question regarding healthcare. Also, the success-failure condition (minimums of 10) easily holds for each group.
\begin{align*} 771 \times 0.407 \amp \ge 10 \amp 771 \times (1-0.407) \amp \ge 10 \amp 732 \times 0.407 \amp \ge 10 \amp 732 \times (1-0.407) \amp \ge 10 \end{align*}
Here, we compute the \(SE\) for the difference of sample proportions as:
\begin{equation*} SE =\sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}=\sqrt{0.407(1-0.407)}\sqrt{\frac{1}{771} + \frac{1}{732}}=0.025 \end{equation*}

Example 6.2.8.

Complete the hypothesis test using a significance level of 0.01.
Solution.
We have already set up the hypotheses and verified that the difference of proportions can be modeled using a normal distribution. We can now calculate the test statistic and p-value.
\begin{gather*} Z = \frac{\text{ point estimate } - \text{ null value } }{SE\ \text{ of estimate } }= \frac{(0.47-0.34) - 0}{0.025} = 5.2 \end{gather*}
This is a two-tailed test as \(H_A\) is that \(p_1\ne p_2\text{.}\) We can find the area in one tail and double it. Here, the p-value \(\approx\) 0. Because the p-value is smaller than \(\alpha = 0.01\text{,}\) we reject the null hypothesis and conclude that the order of the statements affects how likely a respondent is to support the 2010 healthcare law.
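The calculation in this example can be reproduced step by step. The sketch below assumes only the reported sample sizes and approval percentages, and it uses the standard-library `statistics.NormalDist` for the tail area. Because it does not round the standard error to 0.025, the Z-statistic comes out near 5.13 instead of 5.2. The conclusion is unchanged.

```python
import math
from statistics import NormalDist

# Reported poll results: sample sizes and approval proportions.
n1, p1_hat = 771, 0.47  # original statement order
n2, p2_hat = 732, 0.34  # reversed statement order
alpha = 0.01

# Pooled proportion, used because H0 states p1 = p2.
p_c_hat = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)

# Standard error under H0.
se = math.sqrt(p_c_hat * (1 - p_c_hat)) * math.sqrt(1 / n1 + 1 / n2)

# Test statistic: (point estimate - null value) / SE, with null value 0.
z = ((p1_hat - p2_hat) - 0) / se

# Two-tailed p-value: area in one tail, doubled.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"SE = {se:.4f}, Z = {z:.2f}, p-value = {p_value:.2e}")
print("reject H0" if p_value < alpha else "do not reject H0")
```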

Hypothesis testing for the difference of two proportions.

To carry out a complete hypothesis test to test the claim that two proportions \(p_1\) and \(p_2\) are equal to each other,
Identify: Identify the hypotheses and the significance level, \(\alpha\text{.}\)
  • \(H_0\text{:}\) \(p_1=p_2\)
  • \(H_A\text{:}\) \(p_1\ne p_2\text{;}\) \(H_A\text{:}\) \(p_1>p_2\text{;}\) or \(H_A\text{:}\) \(p_1\lt p_2\)
Choose: Choose the correct test procedure and identify it by name.
  • Here we choose the 2-proportion Z-test.
Check: Check conditions for the sampling distribution for \(\hat{p}_1-\hat{p}_2\) to be nearly normal, assuming \(H_{0}:p_{1}=p_{2}\) is true.
  1. Independence: Data come from 2 independent random samples or from a randomized experiment with two treatments. When sampling without replacement, check that the sample size is less than 10% of the population size for both samples.
  2. Success-failure: \(n_1\hat{p}_c\geq 10\text{,}\) \(n_1(1-\hat{p}_c)\geq 10\text{,}\) \(n_2\hat{p}_c\geq 10\text{,}\) and \(n_2(1-\hat{p}_c)\geq 10\)
Calculate: Calculate the Z-statistic and p-value.
  • \(Z = \frac{\text{ point estimate } - \text{ null value } }{SE \text{ of estimate } }\)
    • point estimate: the difference of sample proportions \(\hat{p}_1 - \hat{p}_2\)
    • \(SE\) of estimate: \(\sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\text{,}\) where \(\hat{p}_c\) is the pooled proportion
    • null value: 0
  • p-value = (based on the Z-statistic and the direction of \(H_A\))
Conclude: Compare the p-value to \(\alpha\text{,}\) and draw a conclusion in context.
  • If the p-value is \(\lt \alpha\text{,}\) reject \(H_0\text{;}\) there is sufficient evidence that [\(H_A\) in context].
  • If the p-value is \(> \alpha\text{,}\) do not reject \(H_0\text{;}\) there is not sufficient evidence that [\(H_A\) in context].

Example 6.2.9.

A 5-year experiment was conducted to evaluate the effectiveness of fish oils on reducing heart attacks, where each subject was randomized into one of two treatment groups. We’ll consider heart attack outcomes in these patients:
heart_attack no_event Total
fish_oil 145 12788 12933
placebo 200 12738 12938
Carry out a complete hypothesis test at the 10% significance level to test whether the use of fish oils is effective in reducing heart attacks.
Solution.
Identify: Define \(p_1\) and \(p_2\) as follows:
\(p_1\text{:}\) the true proportion that would suffer a heart attack if given fish oil
\(p_2\text{:}\) the true proportion that would suffer a heart attack if given placebo
We will test the following hypotheses at the \(\alpha=0.10\) significance level.
\(H_0\text{:}\) \(p_1=p_2\) Fish oil and placebo are equally effective.
\(H_A\text{:}\) \(p_1 \lt p_2\) Fish oil is effective in reducing heart attacks.
Choose: Because we are testing whether two proportions equal each other, we choose the 2-proportion Z-test.
Check: We must verify that the difference of sample proportions can be modeled using a normal distribution. First we note that there are two randomly assigned treatments. Second, we calculate the pooled proportion as follows:
\begin{equation*} \hat{p}_c = \frac{x_1+x _2}{n_1+n_2}=\frac{145 + 200}{12933 + 12938}=0.0133 \end{equation*}
We can now verify: \(12933(0.0133)\geq10\text{,}\) \(12933(1-0.0133)\geq10\text{,}\) \(12938(0.0133)\geq10\text{,}\) and \(12938(1-0.0133)\geq10\text{,}\) so both conditions are met.
Calculate: We will calculate the Z-statistic and the p-value.
\begin{gather*} Z = \frac{\text{ point estimate } - \text{ null value } }{SE \text{ of estimate } } \end{gather*}
The point estimate is the difference of sample proportions: \(\hat{p}_1-\hat{p}_2 = 0.0112 - 0.0155 = -0.0043\text{.}\)
The value hypothesized for the parameter in \(H_0\) is the null value: null value = 0.
The pooled proportion, calculated above, is: \(\hat{p}_c = 0.0133\text{.}\)
The \(SE\) of the difference of sample proportions, assuming \(H_0\) is true, is:
\(\sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{0.0133(1-0.0133)}\sqrt{\frac{1}{12933} + \frac{1}{12938}}=0.00142\text{.}\)
\begin{gather*} Z = \frac{-0.0043 - 0}{0.00142} = -3.0 \end{gather*}
Because \(H_A\) uses a less than, meaning that it is a lower-tail test, the p-value is the area to the left of \(Z=-3.0\) under the standard normal curve. This area can be found using a normal table or a calculator. The area or p-value = \(0.0013\text{.}\)
Conclude: The p-value of 0.0013 is \(\lt 0.10\text{,}\) so we reject \(H_0\text{;}\) there is sufficient evidence that fish oil is effective in reducing heart attacks.
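For readers who want to check the arithmetic in software, here is a minimal sketch of the Calculate step as a reusable function. The function name `two_prop_z_test` and the labels `"less"`, `"greater"`, and `"two-sided"` are hypothetical choices, not from the text. The tail area comes from the standard-library `statistics.NormalDist`. The function is applied to the fish oil counts.

```python
import math
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Z-statistic, p-value, and pooled proportion for H0: p1 = p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_c_hat = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    se = math.sqrt(p_c_hat * (1 - p_c_hat)) * math.sqrt(1 / n1 + 1 / n2)
    z = (p1_hat - p2_hat - 0) / se   # null value is 0
    std_normal = NormalDist()
    if alternative == "less":        # H_A: p1 < p2, lower tail
        p_value = std_normal.cdf(z)
    elif alternative == "greater":   # H_A: p1 > p2, upper tail
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided": # H_A: p1 != p2, both tails
        p_value = 2 * std_normal.cdf(-abs(z))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value, p_c_hat


# Fish oil experiment: 145 heart attacks of 12933 (fish oil), 200 of 12938 (placebo).
z, p_value, p_c_hat = two_prop_z_test(145, 12933, 200, 12938, alternative="less")
print(f"Z = {z:.3f}, p-value = {p_value:.5f}, pooled proportion = {p_c_hat:.4f}")
```

Without intermediate rounding, the Z-statistic is close to \(-2.98\) and the p-value is close to 0.0015, slightly different from the hand-rounded values of \(-3.0\) and 0.0013. Either way the p-value is well below \(\alpha = 0.10\).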

Subsection 6.2.7 Calculator: the 2-proportion Z-test

TI-83/84: 2-proportion Z-test.

Use STAT, TESTS, 2-PropZTest.
  1. Choose STAT.
  2. Right arrow to TESTS.
  3. Down arrow and choose 6:2-PropZTest.
  4. Let x1 be the number of yeses (must be an integer) in sample 1 and let n1 be the size of sample 1.
  5. Let x2 be the number of yeses (must be an integer) in sample 2 and let n2 be the size of sample 2.
  6. Choose \(\ne\text{,}\) \(\lt\text{,}\) or \(\gt\) to correspond to \(H_A\text{.}\)
  7. Choose Calculate and hit ENTER, which returns:
    z Z-statistic p p-value
    \(\hat{p}_1\) sample 1 proportion \(\hat{p}\) pooled sample proportion
    \(\hat{p}_2\) sample 2 proportion

Casio fx-9750GII: 2-proportion Z-test.

  1. Navigate to STAT (MENU button, then hit the 2 button or select STAT).
  2. Choose the TEST option (F3 button).
  3. Choose the Z option (F1 button).
  4. Choose the 2-P option (F4 button).
  5. Specify the test details:
    • Specify the sidedness of the test using the F1, F2, and F3 keys.
    • Enter the number of successes for each group, x1 and x2.
    • Enter the sample size for each group, n1 and n2.
  6. Hit the EXE button, which returns
    z Z-statistic \(\hat{p}1\text{,}\) \(\hat{p}2\) sample proportions
    p p-value \(\hat{p}\) pooled proportion
    n1, n2 sample sizes

Guided Practice 6.2.10.

Use a calculator to find the test statistic, p-value, and pooled proportion for a test with: \(H_A\text{:}\) \(p\) for fish oil \(\lt p\) for placebo.
Solution.
Correctly going through the calculator steps should lead to a solution with the test statistic z \(= -2.977\) and the p-value p \(= 0.00145\text{.}\) These two values match our calculated values from the previous example to within rounding error. The pooled proportion is given as \(\hat{p} = 0.0133\text{.}\) Note: values for x1 and x2 were given in the table. If, instead, proportions are given, find x1 and x2 by multiplying the proportions by the sample sizes and rounding the result to an integer.

Subsection 6.2.8 Section summary

In the previous section, we looked at inference for a single proportion. In this section, we compared two groups to each other with respect to a proportion or a percent.
  • We are interested in whether the true proportion of yeses is the same or different between two distinct groups. Call these proportions \(p_1\) and \(p_2\text{.}\) The difference, \(p_1-p_2\text{,}\) tells us whether \(p_1\) is greater than, less than, or equal to \(p_2\text{.}\)
  • When comparing two proportions to each other, the parameter of interest is the difference of proportions, \(p_1-p_2\text{,}\) and we use the difference of sample proportions, \(\hat{p}_1-\hat{p}_2\text{,}\) as the point estimate.
  • The sampling distribution for \(\hat{p}_1-\hat{p}_2\) is nearly normal when the success-failure condition is met for both groups and when the observations are independent between and within groups. When the sampling distribution of \(\hat{p}_1-\hat{p}_2\) is nearly normal, the standardized test statistic also follows a normal distribution.
  • When the null hypothesis is that the two population proportions are equal to each other, use the pooled sample proportion \(\hat{p}_c=\frac{x_1+x_2}{n_1+n_2}\text{,}\) i.e. the combined number of yeses over the combined sample sizes, when verifying the success-failure condition and when finding the \(SE\text{.}\) For the confidence interval, do not use the pooled sample proportion; use the separate values of \(\hat{p}_1\) and \(\hat{p}_2\text{.}\)
  • When there are two samples or treatments and the parameter of interest is a difference of proportions, e.g. the true difference in proportion of 17 and 18 year olds with a summer job (proportion of 18 year olds \(-\) proportion of 17 year olds):
    • Estimate \(p_1-p_2\) at the C% confidence level using a 2-proportion Z-interval.
    • Test \(H_{0}: p_1-p_2=0\) at the \(\alpha\) significance level using a 2-proportion Z-test.
  • The two proportion Z-interval and Z-test require the sampling distribution for \(\hat{p}_1-\hat{p}_2\) to be nearly normal. For this reason we must check that the following conditions are met.
    1. Independence: Data come from 2 independent random samples or from a randomized experiment with 2 treatments. When sampling without replacement, check that the sample size is less than 10% of the population size for both samples.
    2. Success-failure for CI: \(n_1\hat{p}_1\ge 10\text{,}\) \(n_1(1-\hat{p}_1)\ge 10\text{,}\) \(n_2\hat{p}_2\ge 10\text{,}\) and \(n_2(1-\hat{p}_2)\ge 10\)
      Success-failure for Test: \(n_1\hat{p}_c\ge 10\text{,}\) \(n_1(1-\hat{p}_c)\ge 10\text{,}\) \(n_2\hat{p}_c\ge 10\text{,}\) and \(n_2(1-\hat{p}_c)\ge 10\)
  • When the conditions are met, we calculate the confidence interval and the test statistic using the same structure as in the previous section.
    • Confidence interval: \(\text{ point estimate } \pm\ z^{\star} \times SE \text{ of estimate }\)
    • Test statistic: \(Z = \frac{\text{ point estimate}- \text{null value }}{SE \text{ of estimate } }\)
    Here the point estimate is the difference of sample proportions \(\hat{p}_1 - \hat{p}_2\text{.}\)
    The \(SE\) of estimate is the \(SE\) of a difference of sample proportions.
    • For a CI, use: \(SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\text{.}\)
    • For a Test, use: \(SE = \sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\text{.}\)

Exercises 6.2.9 Exercises

1. Social experiment, Part I.

A “social experiment” conducted by a TV program questioned what people do when they see a very obviously bruised woman getting picked on by her boyfriend. On two different occasions at the same restaurant, the same couple was depicted. In one scenario the woman was dressed “provocatively” and in the other scenario the woman was dressed “conservatively”. The table below shows how many restaurant diners were present under each scenario, and whether or not they intervened.
                         Scenario
                  Provocative  Conservative  Total
Intervene  Yes              5            15     20
           No              15            10     25
           Total           20            25     45
Explain why the sampling distribution of the difference between the proportions of interventions under provocative and conservative scenarios does not follow an approximately normal distribution.
Solution.
This is not a randomized experiment, and it is unclear whether people would be affected by the behavior of their peers. That is, independence may not hold. Additionally, there are only 5 interventions under the provocative scenario, so the success-failure condition does not hold. Even if we consider a hypothesis test where we pool the proportions, the success-failure condition will not be satisfied. Since one condition is questionable and the other is not satisfied, the difference in sample proportions will not follow a nearly normal distribution.
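The success-failure part of this solution can be checked numerically. The sketch below is illustrative only: it takes the counts from the table in this exercise, and the helper name `meets_success_failure` is hypothetical. It checks the condition both with the observed counts (the confidence interval version) and with the expected counts under the pooled proportion (the test version).

```python
# Counts from the table in this exercise; an intervention counts as a "success".
groups = {
    "provocative": {"yes": 5, "no": 15},
    "conservative": {"yes": 15, "no": 10},
}


def meets_success_failure(counts, threshold=10):
    """Return True when every count is at least the threshold."""
    return all(c >= threshold for c in counts)


# CI version: observed successes and failures in each group.
observed = [g[k] for g in groups.values() for k in ("yes", "no")]

# Test version: expected successes and failures under the pooled proportion.
total_yes = sum(g["yes"] for g in groups.values())
total_n = sum(g["yes"] + g["no"] for g in groups.values())
p_pooled = total_yes / total_n  # 20 / 45

expected = []
for g in groups.values():
    n = g["yes"] + g["no"]
    expected += [n * p_pooled, n * (1 - p_pooled)]

ci_ok = meets_success_failure(observed)
test_ok = meets_success_failure(expected)
print(ci_ok, test_ok)  # prints: False False
```

Both checks fail: the smallest observed count is 5, and the smallest expected count under pooling is about 8.9.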

2. Heart transplant success.

The Stanford University Heart Transplant Study was conducted to determine whether an experimental heart transplant program increased lifespan. Each patient entering the program was officially designated a heart transplant candidate, meaning that he was gravely ill and might benefit from a new heart. Patients were randomly assigned into treatment and control groups. Patients in the treatment group received a transplant, and those in the control group did not. The table below displays how many patients survived and died in each group.
B. Turnbull et al. “Survivorship of Heart Transplant Data”. In: Journal of the American Statistical Association 69 (1974), pp. 74-80.
       control  treatment
alive        4         24
dead        30         45
Suppose we are interested in estimating the difference in survival rate between the control and treatment groups using a confidence interval. Explain why we cannot construct such an interval using the normal approximation. What might go wrong if we constructed the confidence interval despite this problem?

3. Gender and color preference.

A study asked 1,924 male and 3,666 female undergraduate college students their favorite color. A 95% confidence interval for the difference between the proportions of males and females whose favorite color is black \((p_{male} - p_{female})\) was calculated to be (0.02, 0.06). Based on this information, determine if the following statements are true or false, and explain your reasoning for each statement you identify as false.
L Ellis and C Ficek. “Color preferences according to gender and sexual orientation”. In: Personality and Individual Differences 31.8 (2001), pp. 1375-1379.
  1. We are 95% confident that the true proportion of males whose favorite color is black is 2% lower to 6% higher than the true proportion of females whose favorite color is black.
  2. We are 95% confident that the true proportion of males whose favorite color is black is 2% to 6% higher than the true proportion of females whose favorite color is black.
  3. 95% of random samples will produce 95% confidence intervals that include the true difference between the population proportions of males and females whose favorite color is black.
  4. We can conclude that there is a significant difference between the proportions of males and females whose favorite color is black and that the difference between the two sample proportions is too large to plausibly be due to chance.
  5. The 95% confidence interval for \((p_{female} - p_{male})\) cannot be calculated with only the information given in this exercise.
Solution.
  1. False. The entire confidence interval is above 0.
  2. True.
  3. True.
  4. True.
  5. False. It is simply the negated and reordered values: \((-0.06,-0.02)\text{.}\)

4. The Daily Show.

A Pew Research foundation poll indicates that among 1,099 college graduates, 33% watch The Daily Show. Meanwhile, 22% of the 1,110 people with a high school degree but no college degree in the poll watch The Daily Show. A 95% confidence interval for \((p_\text{ college grad } - p_\text{ HS or less } )\text{,}\) where \(p\) is the proportion of those who watch The Daily Show, is (0.07, 0.15). Based on this information, determine if the following statements are true or false, and explain your reasoning if you identify the statement as false.
The Pew Research Center, Americans Spending More Time Following the News, data collected June 8-28, 2010.
  1. At the 5% significance level, the data provide convincing evidence of a difference between the proportions of college graduates and those with a high school degree or less who watch The Daily Show.
  2. We are 95% confident that 7% less to 15% more college graduates watch The Daily Show than those with a high school degree or less.
  3. 95% of random samples of 1,099 college graduates and 1,110 people with a high school degree or less will yield differences in sample proportions between 7% and 15%.
  4. A 90% confidence interval for \((p_\text{ college grad } - p_\text{ HS or less } )\) would be wider.
  5. A 95% confidence interval for \((p_\text{ HS or less } - p_\text{ college grad } )\) is (-0.15,-0.07).

5. National Health Plan, Part III.

Exercise 6.1.9.9 presents the results of a poll evaluating support for a generically branded “National Health Plan” in the United States. 79% of 347 Democrats and 55% of 617 Independents support a National Health Plan.
  1. Calculate a 95% confidence interval for the difference between the proportion of Democrats and Independents who support a National Health Plan \((p_{D} - p_{I})\text{,}\) and interpret it in this context. We have already checked conditions for you.
  2. True or false: If we had picked a random Democrat and a random Independent at the time of this poll, it is more likely that the Democrat would support the National Health Plan than the Independent.
Solution.
  1. Standard error:
    \begin{gather*} SE=\sqrt{ \frac{0.79(1-0.79)}{347} + \frac{0.55(1-0.55)}{617} }=0.03 \end{gather*}
    Using \(z^{*}=1.96\text{,}\) we get:
    \begin{gather*} 0.79-0.55 \pm 1.96 \times 0.03 \rightarrow (0.181, 0.299) \end{gather*}
    We are 95% confident that the proportion of Democrats who support the plan is 18.1% to 29.9% higher than the proportion of Independents who support the plan.
  2. True.
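The interval in part 1 can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic, not part of the original solution, and the variable names are illustrative. It carries the standard error at full precision instead of rounding it to 0.03 before multiplying.

```python
from math import sqrt

p_d, n_d = 0.79, 347   # Democrats: sample proportion and sample size
p_i, n_i = 0.55, 617   # Independents: sample proportion and sample size
z_star = 1.96          # critical value for 95% confidence

# Unpooled SE, as used for a confidence interval.
se = sqrt(p_d * (1 - p_d) / n_d + p_i * (1 - p_i) / n_i)
diff = p_d - p_i
lower, upper = diff - z_star * se, diff + z_star * se

print(f"SE = {se:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

With the unrounded standard error (about 0.0297) the interval is roughly (0.182, 0.298). The solution's (0.181, 0.299) comes from rounding the standard error to 0.03 first, and the interpretation is unchanged.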

6. Sleep deprivation, CA vs. OR, Part I.

According to a report on sleep deprivation by the Centers for Disease Control and Prevention, the proportion of California residents who reported insufficient rest or sleep during each of the preceding 30 days is 8.0%, while this proportion is 8.8% for Oregon residents. These data are based on simple random samples of 11,545 California and 4,691 Oregon residents. Calculate a 95% confidence interval for the difference between the proportions of Californians and Oregonians who are sleep deprived and interpret it in context of the data.

7. Sleep deprived transportation workers.

The National Sleep Foundation conducted a survey on the sleep habits of randomly sampled transportation workers and a control sample of non-transportation workers. The results of the survey are shown below.
                                        Transportation Professionals
                                               Truck      Train  Bus/Taxi/Limo
                            Control  Pilots  Drivers  Operators        Drivers
Less than 6 hours of sleep       35      19       35         29             21
6 to 8 hours of sleep           193     132      117        119            131
More than 8 hours                64      51       51         32             58
Total                           292     202      203        180            210
Conduct a hypothesis test to evaluate if these data provide evidence of a difference between the proportions of truck drivers and non-transportation workers (the control group) who get less than 6 hours of sleep per day, i.e. are considered sleep deprived.
Solution.
Subscript \(_C\) means control group. Subscript \(_T\) means truck drivers. \(H_{0} : p_{C} = p_{T}\text{.}\) \(H_{A} : p_{C} \ne p_{T}\text{.}\) Independence is satisfied (random samples), as is the success-failure condition, which we would check using the pooled proportion \((\hat{p}_{pool} = 70/495 = 0.141)\text{.}\) \(Z = -1.65 \rightarrow \text{p-value } = 0.0989\text{.}\) Since the p-value is greater than the default significance level \(\alpha = 0.05\text{,}\) we fail to reject \(H_{0}\text{.}\) The data do not provide strong evidence that the rates of sleep deprivation are different for non-transportation workers and truck drivers.
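The test statistic and p-value in this solution can be checked with the sketch below, which uses the counts from the table (35 of 292 in the control group and 35 of 203 truck drivers sleeping less than 6 hours). It relies only on the standard library. The helper `normal_cdf` is a hypothetical name for the standard normal CDF, built here from `math.erf`.

```python
from math import sqrt, erf

x_c, n_c = 35, 292   # control group: sleep deprived, total
x_t, n_t = 35, 203   # truck drivers: sleep deprived, total

p_c_hat = x_c / n_c
p_t_hat = x_t / n_t
p_pool = (x_c + x_t) / (n_c + n_t)   # pooled proportion, 70 / 495

# Pooled SE, appropriate because H0 states that p_C = p_T.
se = sqrt(p_pool * (1 - p_pool)) * sqrt(1 / n_c + 1 / n_t)

# Test statistic: (point estimate - null value) / SE, with null value 0.
z = (p_c_hat - p_t_hat - 0) / se


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


# Two-sided p-value, since H_A is p_C != p_T.
p_value = 2 * (1 - normal_cdf(abs(z)))

print(f"pooled = {p_pool:.3f}, Z = {z:.2f}, p-value = {p_value:.3f}")
```

The result agrees with the solution: a pooled proportion of about 0.141, Z of about -1.65, and a two-sided p-value of about 0.099.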

8. Sleep deprivation, CA vs. OR, Part II.

Exercise 6.2.9.6 provides data on sleep deprivation rates of Californians and Oregonians. The proportion of California residents who reported insufficient rest or sleep during each of the preceding 30 days is 8.0%, while this proportion is 8.8% for Oregon residents. These data are based on simple random samples of 11,545 California and 4,691 Oregon residents.
  1. Conduct a hypothesis test to determine if these data provide strong evidence the rate of sleep deprivation is different for the two states. (Reminder: Check conditions)
  2. It is possible the conclusion of the test in part (a) is incorrect. If this is the case, what type of error was made?

9. Prenatal vitamins and Autism.

Researchers studying the link between prenatal vitamin use and autism surveyed the mothers of a random sample of children aged 24 - 60 months with autism and conducted another separate random sample for children with typical development. The table below shows the number of mothers in each group who did and did not use prenatal vitamins during the three months before pregnancy (periconceptional period).
R.J. Schmidt et al. “Prenatal vitamins, one-carbon metabolism gene variants, and risk for autism”. In: Epidemiology 22.4 (2011), p. 476.
                                        Autism
                              Autism  Typical development  Total
Periconceptional  No vitamin     111                   70    181
prenatal vitamin  Vitamin        143                  159    302
                  Total          254                  229    483
  1. State appropriate hypotheses to test for independence of use of prenatal vitamins during the three months before pregnancy and autism.
  2. Complete the hypothesis test and state an appropriate conclusion. (Reminder: Verify any necessary conditions for the test.)
  3. A New York Times article reporting on this study was titled “Prenatal Vitamins May Ward Off Autism”. Do you find the title of this article to be appropriate? Explain your answer. Additionally, propose an alternative title.
    R.C. Rabin. “Patterns: Prenatal Vitamins May Ward Off Autism”. In: New York Times (2011).
Solution.
  1. Subscript V means vitamin group. Subscript NV means no-vitamin group. \(H_{0}: p_{V} = p_{NV}\text{.}\) \(H_{A}: p_{V} \neq p_{NV}\text{.}\) Independence is satisfied (random samples, \(\lt 10\%\) of the population), as is the success-failure condition, which we would check using the pooled proportion \((\hat{p}_{pool}=254/483=0.53)\text{.}\) \(Z=2.99 \rightarrow \text{p-value} = 0.0028\text{.}\) Since the p-value is low, we reject \(H_{0}\text{.}\) There is strong evidence of a difference in the rates of autism of children of mothers who did and did not use prenatal vitamins during the three months before pregnancy.
  2. The title of this newspaper article makes it sound like using prenatal vitamins can prevent autism, which is a causal statement. Since this is an observational study, we cannot make causal statements based on the findings of the study. A more accurate title would be “Mothers who use prenatal vitamins before pregnancy are found to have children with a lower rate of autism”.

10. An apple a day keeps the doctor away.

A physical education teacher at a high school wanting to increase awareness of issues of nutrition and health asked her students at the beginning of the semester whether they believed the expression “an apple a day keeps the doctor away”, and 40% of the students responded yes. Throughout the semester she started each class with a brief discussion of a study highlighting positive effects of eating more fruits and vegetables. She conducted the same apple-a-day survey at the end of the semester, and this time 60% of the students responded yes. Can she use a two-proportion method from this section for this analysis? Explain your reasoning.