Confidence Intervals and Sample Sizes Part 2: Proportions Lecture 6

DePaul University
Bill Qualls
Objectives
At the end of this section you should be able to answer
questions concerning point and interval estimates of
a population proportion, and determining the
requisite sample size for a given confidence level.
Specifically, you should understand:
• the difference between a point estimate and an
interval estimate
• how to calculate a confidence interval for a
population proportion
• how to determine the requisite sample size given a
desired margin of error and confidence level.
Confidence Intervals
about a
Population Proportion
Point Estimate of a Population Proportion
• We used the following example when we introduced
the binomial distribution: Assume my free-throw
average is 40%. If I throw 3 free-throws, what is the
probability that I will miss all three? Hit 1? Hit 2? Hit
3?
• In our problems dealing with binomial probabilities,
we have been given p. But what if we would like to
estimate p with a known level of confidence?
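As a refresher, the free-throw question above can be answered with the binomial formula. This is a minimal sketch; the function name `binomial_pmf` is just an illustrative choice, not something from the lecture.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Free-throw example from the slide: p = 0.40, n = 3 attempts.
for hits in range(4):
    print(f"P(hit {hits} of 3) = {binomial_pmf(hits, 3, 0.40):.3f}")
```

The four probabilities come out to .216, .432, .288 and .064, and they sum to 1. Notice that p had to be supplied; the rest of this lecture is about estimating it.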
Point Estimate of a Population Proportion
• A point estimate is a single value used to
approximate a population parameter.
• The best point estimate ("p-hat") of a population
proportion (p) is the sample proportion.
$$\hat{p} = \frac{\text{successes}}{\text{trials}}$$
• The caret symbol (^) over a letter is read as
"hat", and indicates an estimated value.
Point Estimate of a Population Variance
• Given that the variance for a binomial
distribution is defined as σ²=npq, where q = 1-p,
and that p-hat is the best estimate for the population
parameter p, it follows then that the best estimate
for the population variance is:
$$\hat{\sigma}^2 = n\hat{p}\hat{q} \quad \text{where } \hat{q} = 1 - \hat{p}$$
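Both point estimates can be computed directly from the counts. A minimal sketch follows; the helper name `point_estimates` is hypothetical, and the sample counts (4 of 10, 40 of 100) are the free-throw numbers used in this lecture.

```python
def point_estimates(successes: int, trials: int) -> tuple[float, float]:
    """Return (p-hat, estimated binomial variance n * p-hat * q-hat)."""
    p_hat = successes / trials
    q_hat = 1 - p_hat
    variance_hat = trials * p_hat * q_hat
    return p_hat, variance_hat

for successes, trials in [(4, 10), (40, 100)]:
    p_hat, variance_hat = point_estimates(successes, trials)
    print(f"{successes}/{trials}: p-hat = {p_hat:.2f}, variance estimate = {variance_hat:.1f}")
```

Both samples give the same p-hat of .40, even though the second is based on ten times as many trials.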
The Problem with Point Estimates
• If I sink 4 free-throws out of 10, then my point
estimate for p is .4.
• Likewise, if I sink 40 free-throws out of 100, then my
point estimate for p is still .4.
• We would intuitively have more confidence in the
second statistic than in the first.
• But these are both point estimates, and the
problem with a point estimate is that we cannot
assign any statistical level of confidence to it.
Interval Estimates
• We can, however, assign a level of confidence to an
interval estimate.
• If you were asked to come up with a 95% confidence
interval for the first case (4 free-throws out of 10),
you might say you were 95% confident that the true
proportion is between .3 and .5.
• But in the second case (40 free-throws out of 100),
you might say you were 95% confident that the true
proportion is between .35 and .45.
(Numbers used above are "guesses" only, for illustrative purposes.)
CI for Population Proportion
• The formula for the confidence interval (CI) for a
population proportion is usually shown as:
$$p = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
• Some texts prefer the notation:
$$p = \hat{p} \pm E$$
where E is the margin of error and is calculated as:
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
• These formulas require np ≥ 15 and nq ≥ 15 (otherwise
the sampling distribution is too skewed to be treated as normal).
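The formula and its sample-size condition can be sketched in a few lines of Python. This is an illustration only: the function name `proportion_ci` is hypothetical, and because the true p is unknown the check below substitutes p-hat for p, which is an assumption on top of the rule as stated.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes: int, trials: int, confidence: float = 0.95) -> tuple[float, float]:
    """z-based confidence interval for a population proportion."""
    p_hat = successes / trials
    q_hat = 1 - p_hat
    # Rule of thumb from the slide, applied with p-hat standing in for p.
    if trials * p_hat < 15 or trials * q_hat < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)       # critical value z_(alpha/2)
    margin = z * sqrt(p_hat * q_hat / trials)     # margin of error E
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(40, 100)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

With 4 successes in 10 trials the function raises `ValueError`, since that sample is too small for the normal approximation.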
90% Confidence Interval
95% Confidence Interval
99% Confidence Interval
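The critical value z_(α/2) for each of these three confidence levels can be looked up in a table or computed. A minimal sketch using Python's standard library; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided critical value z_(alpha/2) of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The values are approximately 1.645, 1.960 and 2.576. Higher confidence means a larger z and therefore a wider interval.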
Calculating Confidence Intervals
Together
• I attempt 100 free throws, and make a basket 40
times. Calculate a 95% confidence interval for my
true free throw percentage.
• Solution:
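$$
\begin{aligned}
p &= \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \\
  &= .4 \pm 1.96\sqrt{\frac{(.4)(.6)}{100}} \\
  &= .4 \pm .096 \\
  &= [.304, .496]
\end{aligned}
$$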
Interpretation
So what does it mean?
Wrong: We are 95% confident that the true population
proportion is between .304 and .496.
Correct: If the sampling process were repeated many
times, and the interval calculated each time, 95% of
those intervals would capture the true population
proportion.
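The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under stated assumptions: the true proportion (.40), the sample size (100), the number of repetitions and the random seed are all arbitrary choices made for illustration, and `coverage_rate` is a hypothetical name.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(true_p: float = 0.40, n: int = 100, confidence: float = 0.95,
                  repetitions: int = 2000, seed: int = 6) -> float:
    """Fraction of simulated intervals that capture the true proportion."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    captured = 0
    for _ in range(repetitions):
        # Draw one sample of n free throws and build its interval.
        hits = sum(rng.random() < true_p for _ in range(n))
        p_hat = hits / n
        margin = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            captured += 1
    return captured / repetitions

print(f"Share of intervals capturing p: {coverage_rate():.3f}")
```

The printed share should land near .95, not exactly on it, because the simulation itself is a sample and the normal approximation is not exact.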
Interpretation
A miss like this will
occur 5% of the time.
Using the TI-83 Plus
• Press [STAT] [TESTS] [1-PropZInt]
• Intervals for a proportion are always "z", never "t".
• Careful! Don't choose 1-PropZTest (yet).
Together
In a survey of 1002 people, 701 said that they voted in
a recent presidential election (based on data from ICR
Research Group). Voting records show that 61% of
eligible voters actually did vote.
a. Find a 99% confidence interval estimate of the
proportion of people who say that they voted.
b. Are the survey results consistent with the actual
voter turnout of 61%? Why or why not?
(Source: Triola, Page 333, Section 7-2, #34)
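One way to work part (a), offered as a sketch that assumes z_(α/2) = 2.576 for 99% confidence (some tables round this to 2.575):

$$\hat{p} = \frac{701}{1002} \approx .6996, \qquad E = 2.576\sqrt{\frac{(.6996)(.3004)}{1002}} \approx .0373$$

$$p = .6996 \pm .0373 \approx [.662, .737]$$

For part (b), compare .61 with these limits: it falls below the lower limit of .662, which suggests the survey responses are not consistent with the recorded turnout.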
Margin of Error
Given a confidence interval of [0.25, 0.39].
• What is p-hat? (Answer: 0.32)
• What is the margin of error? (Answer: 0.07)
(Diagram: the interval from 0.25 to 0.39, with a margin of error E on either side of p-hat.)
• What is the margin of error for the previous problem?
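The two answers for the interval [0.25, 0.39] follow from the fact that p-hat sits at the center of the interval and E is half its width:

$$\hat{p} = \frac{0.25 + 0.39}{2} = 0.32, \qquad E = \frac{0.39 - 0.25}{2} = 0.07$$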
Together
Assume that a sample is used to estimate the
population proportion p. Find the margin of error E
that corresponds to the given statistics and
confidence level: n = 1200, x = 800, 99%
confidence.
(Source: Triola, Page 333, Section 7-2, #18)
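A quick way to check a hand calculation for this kind of problem is a short script. The sketch below is illustrative; `margin_from_counts` is a hypothetical helper name, and it uses the exact normal critical value, so the last digit may differ slightly from a table-based answer.

```python
from math import sqrt
from statistics import NormalDist

def margin_from_counts(x: int, n: int, confidence: float) -> float:
    """Margin of error E for x successes in n trials."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)

print(f"E = {margin_from_counts(800, 1200, 0.99):.4f}")
```

For n = 1200, x = 800 and 99% confidence, E works out to roughly .035.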
Together
Find the margin of error:
Determining the Proper
Sample Size
Sample Size
• How large does the sample need to be to get an estimate
of p with an acceptable margin of error?
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \;\rightarrow\; \text{solve for } n \;\rightarrow\; n = \frac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2}$$
• In the above formula, E might be, for example, .03
for a 3% margin of error.
• If no prior estimate of p is known, then use .5 for p-hat:
p-hat × q-hat is largest at .5, so this always gives the
maximum (safest) sample size.
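The sample-size formula translates directly into code. This is a minimal sketch; `required_sample_size` is a hypothetical name, and rounding up with `ceil` reflects the usual convention that a fractional result is always rounded to the next whole trial.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(margin: float, confidence: float = 0.95, p_hat: float = 0.5) -> int:
    """Smallest n whose margin of error is no larger than `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p_hat * (1 - p_hat) / margin**2)

# With E = .03 and 95% confidence, see how n depends on the prior estimate.
for guess in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p-hat = {guess:.1f}: n = {required_sample_size(0.03, 0.95, guess)}")
```

The loop shows n peaking at p-hat = .5 and falling off symmetrically on either side, which is why .5 is the safe default.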
Together
• My earlier attempts indicate that my free throw
percentage is around 40%. But I would like a
narrower confidence interval than the ±9.6% I got with
n=100. How many free throws should I attempt in
order to get a 95% confidence interval with a 3%
margin of error?
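One way to set this up, as a sketch that takes p-hat = .4 from the earlier attempts and z_(α/2) = 1.96:

$$n = \frac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2} = \frac{(1.96)^2(.4)(.6)}{(.03)^2} = \frac{.9220}{.0009} \approx 1024.4$$

Always round up, so about 1025 free throws would be needed.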
What about Population Size?
"Many people incorrectly believe that the sample size
should be some percentage of the population, but
(the above formula) shows that the population size is
irrelevant. (In reality, the population size is
sometimes used, but only in cases in which we
sample without replacement from a relatively small
population.) Polls commonly use sample sizes in the
range of 1000 to 2000 and, even though such polls
may involve a very small percentage of the total
population, they can provide results that are quite
good." (Triola, page 330)
Together
Use the given data to find the minimum sample size
required to estimate a population proportion or
percentage. Margin of error: four percentage points;
confidence level: 95%; no prior estimate of p-hat is
available.
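With no prior estimate, a sketch of the calculation uses .5 for both p-hat and q-hat, E = .04, and z_(α/2) = 1.96:

$$n = \frac{(1.96)^2(.5)(.5)}{(.04)^2} = \frac{.9604}{.0016} = 600.25$$

Rounding up gives a minimum sample size of 601.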
Together
Toyota provides an option of a sunroof and side air bag
package for its Corolla model. This package costs $1400
($1159 invoice price). Assume that prior to offering this
option package, Toyota wants to determine the
percentage of Corolla buyers who would pay $1400 extra
for the sunroof and side air bags. How many Corolla
buyers must be surveyed if we want to be 95% confident
that the sample percentage is within four percentage
points of the true percentage for all Corolla buyers?
(Source: Triola, Page 333, Section 7-2, #44)
Do parts of this problem sound familiar?
Effect of Sample Size on C.I. Width
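The effect can be tabulated with a short script. This is only a sketch: it assumes p-hat = .40 and 95% confidence, and the sample sizes are arbitrary values chosen so that each is four times the previous one.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat: float, n: int, confidence: float = 0.95) -> float:
    """Margin of error E for a given p-hat and sample size."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)

for n in (100, 400, 1600, 6400):
    e = margin_of_error(0.40, n)
    print(f"n = {n:>5}   E = {e:.3f}   interval width = {2 * e:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the width of the interval.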
Gut check
Estimate of margin of error given sample size:
$$E \approx \frac{1}{\sqrt{n}}$$
Estimate of sample size for given margin of error:
$$n \approx \frac{1}{E^2}$$
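These shortcuts can be traced back to the margin-of-error formula. The sketch below assumes 95% confidence (z = 1.96) and the worst case p-hat = .5, neither of which is stated on the slide:

$$E = 1.96\sqrt{\frac{(.5)(.5)}{n}} = \frac{.98}{\sqrt{n}} \approx \frac{1}{\sqrt{n}} \quad\Longrightarrow\quad n \approx \frac{1}{E^2}$$

For example, n = 100 gives E ≈ .10, close to the ±.096 found for the 100 free throws.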