Download Report

Author: Will A. Eagon
STAT 350 (Fall 2014)
Lab 5: SAS Solution
1
Lab 5 - Interpretation of Confidence Intervals and Power Analysis
for Z tests
Objectives: A Better Understanding of Confidence Intervals and
Power Curves.
A. (55 points) Interpretation of a Confidence Interval. Use software to generate 40 observations
from a normal distribution with µ = 10 and σ = 2. Repeat this 50 times.
Solution:
First, generate the data
Calc → Random Data → Normal (Number of Rows of data to generate=40, store in column
C4,C5,…,C53, Mean= 10, standard Deviation=2) → OK
1.
(30 points) From each set of observations, compute a 90% confidence interval. No data is
required, however, you need to include all 50 confidence intervals.
Solution:
Stat → Basic Statistics → 1-sample t → Insert column name in “One or more samples, each in a
column” (C4 – C53) → Options → Confidence level: 90.0 → Alternative hypothesis: “Mean ≠
hypothesized mean” → OK → OK
One-Sample T: C4, C5, C6, C7, C8, C9, C10, C11, ...
Variable
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33
C34
N
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
Mean
10.504
10.616
9.627
9.903
10.243
9.943
10.005
9.607
10.260
10.051
10.185
10.269
10.222
10.072
10.026
9.418
10.022
9.929
9.755
10.422
9.506
10.385
9.464
9.679
10.500
9.903
9.971
10.160
10.470
9.948
10.255
StDev
2.008
1.969
2.085
1.934
2.037
2.068
1.907
2.089
2.165
1.929
2.113
2.600
1.612
1.923
1.911
1.985
2.040
1.747
2.296
1.901
2.005
1.821
1.715
1.873
2.187
2.095
2.062
2.315
2.123
2.259
1.943
SE Mean
0.317
0.311
0.330
0.306
0.322
0.327
0.302
0.330
0.342
0.305
0.334
0.411
0.255
0.304
0.302
0.314
0.323
0.276
0.363
0.301
0.317
0.288
0.271
0.296
0.346
0.331
0.326
0.366
0.336
0.357
0.307
90%
( 9.969,
(10.092,
( 9.072,
( 9.388,
( 9.700,
( 9.392,
( 9.496,
( 9.050,
( 9.683,
( 9.537,
( 9.622,
( 9.577,
( 9.793,
( 9.560,
( 9.517,
( 8.889,
( 9.478,
( 9.463,
( 9.143,
( 9.916,
( 8.972,
( 9.900,
( 9.007,
( 9.180,
( 9.917,
( 9.345,
( 9.422,
( 9.543,
( 9.905,
( 9.346,
( 9.737,
CI
11.039)
11.141)
10.183)
10.419)
10.785)
10.494)
10.513)
10.163)
10.836)
10.565)
10.747)
10.962)
10.652)
10.584)
10.535)
9.947)
10.565)
10.394)
10.366)
10.929)
10.040)
10.871)
9.920)
10.178)
11.083)
10.461)
10.521)
10.777)
11.036)
10.550)
10.773)
Author: Will A. Eagon
STAT 350 (Fall 2014)
C35
C36
C37
C38
C39
C40
C41
C42
C43
C44
C45
C46
C47
C48
C49
C50
C51
C52
C53
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
40
10.359
9.629
9.914
9.902
9.399
10.002
9.496
10.154
9.503
9.719
10.260
9.925
9.894
10.364
9.847
10.948
9.783
10.012
10.683
Lab 5: SAS Solution
1.778
2.147
1.724
2.057
1.915
2.132
1.609
1.974
2.013
2.059
1.847
2.192
1.826
1.821
2.119
1.889
1.889
1.958
2.216
0.281
0.339
0.273
0.325
0.303
0.337
0.254
0.312
0.318
0.326
0.292
0.347
0.289
0.288
0.335
0.299
0.299
0.310
0.350
( 9.885,
( 9.057,
( 9.455,
( 9.354,
( 8.889,
( 9.434,
( 9.067,
( 9.628,
( 8.967,
( 9.170,
( 9.768,
( 9.341,
( 9.407,
( 9.879,
( 9.282,
(10.444,
( 9.280,
( 9.490,
(10.092,
2
10.832)
10.201)
10.373)
10.450)
9.909)
10.570)
9.924)
10.679)
10.040)
10.267)
10.752)
10.509)
10.380)
10.849)
10.411)
11.451)
10.287)
10.534)
11.273)
2. (10 points) Determine how many of these intervals contain the population mean, µ = 10. Please
indicate for each confidence interval if it contains the value or not. Is this number that you would
expect? Why or why not?
Solution
We can see that, in this special case, there are 7 confidence intervals that do not contain the true
population mean, 10. This is roughly what we would expect. The confidence level is 90%, so we would
expect (0.9)(50) = 45 of the confidence intervals to include the population mean, 10. Now we have
43, which is not far from what we expect.
3. (15 points) GROUP PART: Combine your data with 3 or 4 other students (in any of my sections)
and answer the following questions: 1) Is the number of intervals that contain the mean what you
would expect for the combined data? 2) How are the results from part 2 and part 3 different? (This
is due on the following Monday and must be submitted online also.)
Solution:
You will need to show your work to the number of intervals that contain the mean (just add up the
numbers from each student) and then calculate the percentage by dividing by the total number.
I would expect that this number would be more accurate because this is for a larger sample size. This
percentage is only valid at large numbers.
B. (45 points) Water quality testing. The Deely Laboratory is a drinking-water testing and analysis
service. One of the common contaminants it tests for is lead. Lead enters drinking water through
corrosion of plumbing materials, such as lead pipes, fixtures, and solder. The service knows that their
analysis procedure is unbiased but not perfectly precise, so the laboratory analyzes each water sample
three times and reports the mean result. The repeated measurements follow a Normal distribution
quite closely. The standard deviation of this distribution is a property of the analytic procedure and is
known to be σ = 0.25 parts per billion (ppb).
The Deely Laboratory has been asked by the university to evaluate a claim that the drinking water in
the Student Union has a lead concentration of 6 ppb, well below the Environmental Protection
Agency’s action level of 15 ppb. Since the true concentration of the sample is the mean μ of the
population of repeated analyses, the hypotheses are
Author: Will A. Eagon
STAT 350 (Fall 2014)
Lab 5: SAS Solution
3
The lab chooses the 1% level of significance,  = 0.01. They plan to perform three analyses of one
specimen (n=3).
1. (30 points, 6 points each part) Using computer software, calculate the following powers:
a. At the 1% level of significance, what is the power of this test against the specific alternative μ =
6.5?
Solution:
Stat → Power and Sample Size → 1-Sample Z Test → Set sample size, n=3, standard deviation=.25,
differences =.5 → Options: significance level=.01 → OK → OK
Thus, the power is 0.812803
b. At the 5% level of significance, what is the power of this test against the specific alternative μ =
6.5?
Solution:
Stat → Power and Sample Size → 1-Sample Z Test → Set sample size, n=3, standard deviation=.25,
differences =.5 → Options: significance level=.05 → OK → OK
Thus, this test is more powerful than the previous one at 0.933727
c. At the 1% level of significance, what is the power of this test against the specific alternative μ =
6.75?
Author: Will A. Eagon
STAT 350 (Fall 2014)
Lab 5: SAS Solution
4
Solution:
Stat->Power and Sample Size-> 1-Sample Z Test->Set sample size, n=3, standard deviation=.25,
differences =.75->Options: significance level=.05
Thus, this test is the most powerful of the three at .9956
d. If the lab performs five analyses of one specimen (n=5), what is the power of this test against the
specific alternative μ = 6.5?
Solution:
Stat → Power and Sample Size → 1-Sample Z Test → Set sample size, n=5, standard deviation=.25,
differences =.65 → Options: significance level=.01 → OK → OK
The power is .99998, this the most powerful of all the tests
e. Write a short paragraph explaining the consequences of changing the significance level, alternative
μ and sample size on the power.
Solution:
We have the following conclusion:
The larger the value of  (or equivalently the higher the significance level), the greater the power is.
The larger the distance between the ’ and 0 is, the greater the power.
The larger the sample size, the greater the power.
Author: Will A. Eagon
STAT 350 (Fall 2014)
Lab 5: SAS Solution
5
2. (10 points) Generate a power curve when n =3 at a 1% significance level. Please use an interval
length of 4.
Solution:
Stat → Power and Sample Size → 1-Sample Z Test → Set sample size, n=3, standard deviation=.25,
differences =-2:2/.05 → Options: significance level=.05 → OK → OK
3. (5 points) What sample size would be required for the power to be at least 0.90 at the 1% level of
significance against the specific alternative μ = 6.5?
Solution:
Stat → Power and Sample Size → 1-Sample Z Test → Set sample size, left blank, standard
deviation=.25, differences =.5, Power=.9 → Options: significance level=.01 → OK → OK
Thus, I would need a sample size of at least 4 to secure a power of at least .9 at the 1% level of
significance against the specific alternative µ=6.5