Recall Investigation 1.4, where you learned that 8 of the 10 most recent heart transplantation operations at St. George's Hospital resulted in a death.
Use technology to find the "exact binomial" (Clopper-Pearson) 95% confidence interval for the probability of death in the heart transplantation process at St. George's Hospital.
The midpoint of this interval is (0.4439 + 0.9748)/2 = 0.709, which is lower than \(\hat{p}\) = 0.80. This makes sense: having observed \(\hat{p}\) = 0.80, the largest \(\pi\) could be is 1.0 (0.20 away), but \(\pi\) could conceivably be as small as 0.0 (0.80 away), so we don't want an interval that is symmetric around 0.80. The width is 0.9748 - 0.4439 = 0.531.
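For readers without the applet at hand, the Clopper-Pearson limits can be sketched in plain Python. This is a minimal illustration, not the software the investigation has in mind: the function names (`binom_cdf`, `solve_decreasing`, `clopper_pearson`) are hypothetical, and the limits are found by bisection on the binomial CDF, whereas most packages use beta quantiles.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_decreasing(f, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for f(p) = target, assuming f is decreasing in p on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid   # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, conf=0.95):
    """Exact binomial interval: each tail probability equals (1 - conf) / 2 at a limit."""
    alpha = 1 - conf
    # Lower limit: the pi with P(X >= x) = alpha/2, i.e. P(X <= x - 1) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit: the pi with P(X <= x) = alpha/2.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

lower, upper = clopper_pearson(8, 10)   # 8 deaths in 10 operations
print(f"interval: ({lower:.4f}, {upper:.4f})")
print(f"midpoint: {(lower + upper) / 2:.3f}, width: {upper - lower:.3f}")
```

With 8 deaths in 10 operations, the printed limits should agree with the interval (0.4439, 0.9748) quoted above.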
Use your calculator to calculate the 95% one proportion \(z\) confidence interval \(\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\text{.}\) Also report the midpoint and width.
Use technology to verify your calculation of the one proportion \(z\)-interval (also called the Wald interval). Does the technology output match your calculation from the previous question?
The technology output should match your calculation: (0.552, 1.05), though some software packages may round the upper value to 1.
Not Verified
Check your calculation again. The technology should produce the same interval: (0.552, 1.05), though some software packages may round the upper limit to 1.
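The hand calculation can also be checked with a few lines of Python. This is a sketch using only the standard library; `wald_interval` is a name chosen here for illustration, and \(z^*\) comes from `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(x, n, conf=0.95):
    """One proportion z (Wald) interval: p_hat +/- z* sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # about 1.96 for 95%
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = wald_interval(8, 10)   # 8 deaths in 10 operations
print(f"interval: ({lower:.3f}, {upper:.3f})")
print(f"midpoint: {(lower + upper) / 2:.3f}, width: {upper - lower:.3f}")
```

The upper limit is not truncated here, so it exceeds 1. That is a reminder that this interval is always symmetric about \(\hat{p}\), even when \(\hat{p}\) sits near a boundary.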
The confidence level of an interval procedure is supposed to indicate the long-run percentage of intervals that capture the actual parameter value, if repeated random samples were taken from the identical process. As such, the confidence level presents a measure of the reliability of the method. A confidence interval procedure is considered valid when the achieved long-run coverage rate matches (or comes close to) the stated (aka nominal) confidence level.
To consider whether the one sample \(z\)-interval is valid for this study, we can pretend we know the process probability, generate many random samples from that process, calculate confidence intervals from the simulated samples, and see what percentage of these confidence intervals capture the actual value of the process probability that we specified.
That would be very unlikely! Each simulation uses a different random sample from the process with \(\pi = 0.15\text{,}\) so you'll get different intervals. Question 3 used the actual data with 8 out of 10 deaths.
No
Correct! Each simulation uses a different random sample from the process with \(\pi = 0.15\text{,}\) so you'll get different intervals. Question 3 used the actual data with 8 out of 10 deaths.
Solution.
Most likely no: each simulation uses a different random sample from the process with \(\pi = 0.15\text{,}\) so you'll get different intervals. Question 3 used the actual data with 8 out of 10 deaths.
Did your interval succeed in capturing the value of \(\pi = 0.15\text{?}\) (Check if 0.15 falls between your interval endpoints, or if the interval is colored green in the applet.)
How many of these random intervals succeed in capturing the actual value of the process probability (0.15)? Does this method appear to succeed in capturing the value of the parameter (which we assumed to be equal to 0.15) 95% "of the time"?
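The applet's experiment can be imitated with a short simulation. The sketch below assumes \(\pi = 0.15\) and \(n = 10\) as in the questions above. The function names, the number of repetitions, and the random seed are arbitrary choices made for illustration, not settings taken from the applet.

```python
import random
from math import sqrt
from statistics import NormalDist

def wald_interval(x, n, conf=0.95):
    """One proportion z (Wald) interval for x successes in n trials."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def simulate_coverage(pi, n, reps=10_000, conf=0.95, seed=1):
    """Fraction of simulated Wald intervals that capture the assumed pi."""
    rng = random.Random(seed)        # fixed seed so the run is reproducible
    hits = 0
    for _ in range(reps):
        # One simulated sample: count successes among n independent trials.
        x = sum(rng.random() < pi for _ in range(n))
        lower, upper = wald_interval(x, n, conf)
        hits += lower <= pi <= upper
    return hits / reps

coverage = simulate_coverage(pi=0.15, n=10)
print(f"simulated coverage of the 95% z-interval: {coverage:.3f}")
```

In this setting the simulated coverage should land well below 0.95. One reason is that any sample with zero successes produces the degenerate interval (0, 0), which cannot capture 0.15.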
Because the \(z\)-interval procedure relies on the sample size conditions of the Central Limit Theorem, there are scenarios where we should not use it to determine a confidence interval. In these cases, one option is the Exact Binomial (Clopper-Pearson) method.
Use the Method pull-down menu to select Exact Binomial and continue to press Sample until you generate at least 1,000 confidence intervals from a process with \(\pi = 0.15\) and \(n = 10\text{.}\)
The Clopper-Pearson method will have a coverage rate of at least 95%, but tends to be "conservative," meaning the coverage rate is higher than 95% so the intervals are longer than they need to be. The method is also fairly complicated (as opposed to: "go two standard deviations on each side"). An alternative method that has been receiving much attention of late is often called the Plus Four procedure:
Increase the number of successes by two and the sample size by four, and calculate a new proportion: \(\tilde{p} = (X + 2)/(n + 4)\text{.}\) Make this value the midpoint of the interval.
Use the \(z\)-interval procedure as above for the augmented sample size of \((n + 4)\text{:}\)\(\tilde{p} \pm z^* \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}\)
In short, with 95% confidence, the idea is to pretend you have 2 more successes and 2 more failures than you really did. For confidence levels other than 95%, researchers have recommended using the Adjusted Wald method: \(\tilde{p} = (X + 0.5(z^*)^2)/(n + (z^*)^2)\) and \(\tilde{n} = n + (z^*)^2\) so the interval becomes \(\tilde{p} \pm z^* \sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}}\text{.}\)
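Both adjustments are easy to express in code. The following is a sketch under the formulas just given, with hypothetical function names, applied to the 8 deaths in 10 operations from earlier in the investigation.

```python
from math import sqrt
from statistics import NormalDist

def plus_four_interval(x, n):
    """95% Plus Four interval: add 2 successes and 2 failures, then use the z formula."""
    z_star = NormalDist().inv_cdf(0.975)
    p_tilde = (x + 2) / (n + 4)
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin

def adjusted_wald_interval(x, n, conf=0.95):
    """Adjusted Wald interval: add z*^2 / 2 successes and z*^2 trials in total."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n_tilde = n + z_star**2
    p_tilde = (x + 0.5 * z_star**2) / n_tilde
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - margin, p_tilde + margin

for name, (lower, upper) in [
    ("Plus Four", plus_four_interval(8, 10)),
    ("Adjusted Wald (95%)", adjusted_wald_interval(8, 10)),
    ("Adjusted Wald (90%)", adjusted_wald_interval(8, 10, conf=0.90)),
]:
    print(f"{name}: ({lower:.4f}, {upper:.4f}), midpoint {(lower + upper) / 2:.4f}")
```

At 95% confidence the two intervals nearly coincide, because \((z^*)^2 \approx 3.84\) is close to 4.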
What do you notice about how the midpoints of the intervals compare? (Also think about how \(\tilde{p}\) and \(\hat{p}\) differ; these are the values shown in the "Sample statistics (CI Midpoints)" graph.) Why is the change in the midpoints of the intervals useful in this scenario?
Use the pull-down menu to select the Blaker method (the Exact Binomial using the tail probabilities), and wait a bit. (If it's just 100 at a time, it's not too bad to hit Sample a few more times.)
The Plus Four method is more likely to be a 95% confidence interval method than the one-sample \(z\)-interval method. It is never a bad idea to use the Plus Four method.
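Because only \(n + 1\) different samples are possible, the long-run coverage rate can also be computed exactly rather than simulated: add up the binomial probabilities of every outcome whose interval captures \(\pi\). The sketch below swaps that calculation in for the applet's simulation, with \(\pi = 0.15\) and \(n = 10\). All function names are hypothetical, and the Clopper-Pearson limits are found by bisection.

```python
from math import comb, sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(k, n, p):
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def wald(x, n):
    p_hat = x / n
    margin = Z95 * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def plus_four(x, n):
    # Same z formula, pretending there were 2 more successes and 2 more failures.
    return wald(x + 2, n + 4)

def clopper_pearson(x, n, alpha=0.05):
    def solve(k, target):
        # Bisection: binom_cdf(k, n, p) decreases from 1 to 0 as p goes from 0 to 1.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper

def exact_coverage(method, n, pi):
    """Total probability of the outcomes x whose interval captures pi."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = method(x, n)
        if lower <= pi <= upper:
            total += binom_pmf(x, n, pi)
    return total

for name, method in [("z-interval", wald), ("Plus Four", plus_four),
                     ("Clopper-Pearson", clopper_pearson)]:
    print(f"{name}: coverage {exact_coverage(method, 10, 0.15):.4f}")
```

For this one setting, the \(z\)-interval falls far short of 95%, Plus Four lands close to it, and Clopper-Pearson overshoots. Other values of \(\pi\) and \(n\) give different numbers, so try several before generalizing.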
"Fudging our data" may seem like cheating, but there is actually some nice theory behind this method. The plus four adjustment is not that different in principle from the continuity correction we suggested with p-values. Many statisticians now recommend using this procedure over the one proportion \(z\)-interval procedure because it achieves the stated confidence level for any sample size and over the Clopper-Pearson binomial interval procedure because the intervals tend to be shorter (narrower). In fact, the more general case (something other than 95%) is becoming the default in many software packages. (The Blaker method in others.)
17. Plus Four Interval for St. George's Hospital.
Determine and then interpret a Plus Four confidence interval for the probability of a death during a heart transplant operation at St. George's Hospital.
You can do the calculation either by hand, first finding \(\tilde{p}\) and \(z^*\text{,}\) or with software (e.g., the Theory-Based Inference applet) by telling the technology there were 4 more operations consisting of two more deaths than in the actual sample.
Return to your exploration using the Simulating Confidence Intervals applet. Assume that \(\pi = 0.667\) is the probability that a kissing couple leans to the right.
From the Method pull-down menu select the Score interval method. Evaluate the performance of this interval method, clearly explaining your steps. [Hint: Does the performance depend on the value of \(n\) that you use? What if you change the value of \(\pi\) to something closer to 0 or 1? Average widths?]
The Score interval method performs well across different sample sizes and parameter values. The coverage rate is close to the nominal 95% confidence level even for small sample sizes and extreme values of \(\pi\text{.}\) As sample size increases, the average width of the intervals decreases while maintaining the appropriate coverage rate.
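One way to check this claim outside the applet is to compute the Score interval's coverage rate exactly for a few sample sizes. This is a sketch: the function names are hypothetical, the interval uses the standard closed form with \(z^*\) as the critical value, and coverage is found by summing binomial probabilities instead of simulating.

```python
from math import comb, sqrt
from statistics import NormalDist

def score_interval(x, n, conf=0.95):
    """Score (Wilson) interval for x successes in n trials."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def exact_coverage(n, pi, conf=0.95):
    """Total probability of the outcomes x whose Score interval captures pi."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = score_interval(x, n, conf)
        if lower <= pi <= upper:
            total += comb(n, x) * pi**x * (1 - pi)**(n - x)
    return total

def mean_width(n, pi, conf=0.95):
    """Expected interval width when sampling from a process with probability pi."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = score_interval(x, n, conf)
        total += (upper - lower) * comb(n, x) * pi**x * (1 - pi)**(n - x)
    return total

for pi in (0.667, 0.15):
    for n in (10, 25, 100):
        print(f"pi={pi}, n={n}: coverage {exact_coverage(n, pi):.3f}, "
              f"mean width {mean_width(n, pi):.3f}")
```

The coverage rate moves up and down with \(n\) and \(\pi\) because the binomial distribution is discrete, so "close to 95%" is the realistic standard here rather than "exactly 95%".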
Starting with \(z = \frac{\hat{p} - \pi}{\sqrt{\pi(1-\pi)/n}}\text{,}\) use the quadratic formula to solve for \(\pi\) and verify the first formula for the Score interval.
Starting with \(z = \frac{\hat{p} - \pi}{\sqrt{\pi(1-\pi)/n}}\text{,}\) square both sides and rearrange to get a quadratic equation in \(\pi\text{.}\) Using the quadratic formula yields the Score interval formula:
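One way to carry out the algebra is sketched below, writing \(z\) for the critical value. The final expression is the standard form of the Score interval and may be arranged differently from the text's first formula.

```latex
\begin{align*}
z^2\,\frac{\pi(1-\pi)}{n} &= (\hat{p} - \pi)^2 \\
\left(1 + \frac{z^2}{n}\right)\pi^2 - \left(2\hat{p} + \frac{z^2}{n}\right)\pi + \hat{p}^2 &= 0 \\
\pi &= \frac{\left(2\hat{p} + \frac{z^2}{n}\right) \pm \sqrt{\left(2\hat{p} + \frac{z^2}{n}\right)^2 - 4\left(1 + \frac{z^2}{n}\right)\hat{p}^2}}{2\left(1 + \frac{z^2}{n}\right)} \\
&= \frac{\hat{p} + \frac{z^2}{2n} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}
\end{align*}
```

The last step uses the fact that the expression under the square root simplifies to \(4z^2\left[\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}\right]\text{.}\)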
The midpoint of the Score interval is \(\frac{\hat{p} + \frac{z^2}{2n}}{1 + \frac{z^2}{n}}\text{.}\) With \(z^* = 1.96 \approx 2\text{,}\) this becomes approximately \(\frac{\hat{p} + \frac{4}{2n}}{1 + \frac{4}{n}} = \frac{\hat{p} + \frac{2}{n}}{1 + \frac{4}{n}}\text{,}\) which matches the midpoint of the Adjusted Wald (Plus Four) procedure.
Checkpoint 4.6.4. Relate Adjusted Wald to Plus Four.
Show that with 95% confidence the midpoint of the Adjusted Wald procedure can be approximated with \(\frac{X + 2}{n + 4}\text{,}\) the Plus Four method.
Checkpoint 4.6.5. Probability of Selecting Capturing Interval.
Suppose that you take 1000 samples from a random process and generate a 95% confidence interval for \(\pi\) with each sample. If you randomly select one of these intervals, what is the probability that you will select an interval that captures \(\pi\text{?}\)
The probability is approximately 0.95 (or 95%). Since these are 95% confidence intervals, in the long run about 95% of the intervals will capture the true parameter value \(\pi\text{.}\) Therefore, randomly selecting one interval gives you about a 95% chance of selecting one that captures \(\pi\text{.}\)
Checkpoint 4.6.6. Probability for Specific Interval.
Suppose you take one sample from a random process and generate a 95% confidence interval to be (0.15, 0.25). What is the probability that this interval captures \(\pi\text{?}\)
The probability is either 0 or 1; we just don't know which! Once a specific interval is constructed, the parameter \(\pi\) either is or isn't in that interval. The "95% confidence" refers to the method, not to any particular interval. If we were to use this method repeatedly, about 95% of the intervals constructed would contain \(\pi\text{.}\)
Incorrect. To be more confident, we need a wider interval and higher coverage rate.
Width decreases, coverage rate increases
Incorrect. Higher confidence level requires a wider interval.
Width increases, coverage rate increases
Correct! Increasing confidence level requires a larger critical value, which increases the margin of error and makes the interval wider. It also increases the coverage rate - a higher percentage of intervals will capture the true parameter.
Width increases, coverage rate decreases
Incorrect. Higher confidence level means we capture the parameter more often, not less.
Correct! Larger sample sizes reduce the standard error, which decreases the margin of error and makes intervals narrower. For a valid confidence interval procedure, the coverage rate should remain at the stated confidence level (e.g., 95%) regardless of sample size.
Width decreases, coverage rate increases
Incorrect. While larger samples produce narrower intervals, the coverage rate should match the confidence level regardless of sample size for a valid procedure.
Width stays the same, coverage rate stays the same
Incorrect. Sample size affects the width through the standard error.
Width increases, coverage rate stays the same
Incorrect. Larger samples give more precise estimates with narrower intervals, not wider.