
(b) Using $m = b + c = 8 + 26 = 34$, the large sample test statistic is
\[ z = \frac{b - c - 1}{\sqrt{b + c}} = \frac{8 - 26 - 1}{\sqrt{8 + 26}} = -3.258. \]
Then the P-value is
\[ P = 2(1 - \Phi(|-3.258|)) = 2 \times 0.0006 = 0.0012. \]
Since $P < \alpha = 0.05$, reject $H_0$ and conclude that there was a change in opinion of the students.

Solutions to Section 9.3

9.17 $H_0: p_1 = p_2 = \cdots = p_8 = 1/8$ vs. $H_1$: Not $H_0$. The expected frequencies in each case are $np = 144 \times 1/8 = 18$. Then
\[ \chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(29 - 18)^2}{18} + \frac{(19 - 18)^2}{18} + \cdots + \frac{(11 - 18)^2}{18} = 16.333. \]
Since $\chi^2 > \chi^2_{8-1,0.05} = 14.067$, reject $H_0$ and conclude that the horse's chances of winning are not the same for each starting gate.
9.18 $H_0: p_1 = p_2 = \cdots = p_{12} = 1/12$ vs. $H_1$: Not $H_0$. The expected frequencies in each case are $np = 700 \times 1/12 = 58.333$. Then
\[ \chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(66 - 58.333)^2}{58.333} + \frac{(63 - 58.333)^2}{58.333} + \cdots + \frac{(42 - 58.333)^2}{58.333} = 19.726. \]
Since $\chi^2 > \chi^2_{12-1,0.05} = 19.675$, reject $H_0$ and conclude that the first births are not spread uniformly throughout the year.
9.19 (a) $H_0: p_i = \binom{7}{i}(0.5)^7$ vs. $H_1$: Not $H_0$. Using $e_i = np_i = 98\binom{7}{i}(0.5)^7$, the results are summarized below:

Sons     n_i      e_i      (n_i - e_i)^2/e_i
0          0     0.766
1          6     5.359     0.003   (cells 0 and 1 combined)
2         14    16.078     0.269
3         25    26.797     0.120
4         21    26.797     1.254
5         22    16.078     2.181
6          9     5.359     2.452   (cells 6 and 7 combined)
7          1     0.766
Total     98    98         chi^2 = 6.278

Note that cell 0 was combined with cell 1, and cell 7 was combined with cell 6, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 < \chi^2_{6-1,.10} = 9.236$, do not reject $H_0$ and conclude that the binomial distribution with $p = 0.5$ is a plausible distribution.
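The calculation in (a) can be sketched in a few lines of Python. This is only an illustrative check, not part of the original solution: the function name `binomial_gof` and the `groups` argument (which cells are pooled) are hypothetical, and the observed counts are the ones read from the table above.

```python
from math import comb

def binomial_gof(counts, p, groups):
    """Chi-square goodness-of-fit statistic for a binomial model.

    counts[i] is the observed number of families with i sons,
    groups lists which cells are pooled before computing the statistic.
    """
    trials = len(counts) - 1          # 7 children per family
    n = sum(counts)                   # 98 families
    expected = [n * comb(trials, i) * p**i * (1 - p)**(trials - i)
                for i in range(trials + 1)]
    chi2 = 0.0
    for group in groups:
        obs = sum(counts[i] for i in group)
        exp = sum(expected[i] for i in group)
        chi2 += (obs - exp) ** 2 / exp
    return chi2

# Observed counts for 0..7 sons (as read from the table above).
counts = [0, 6, 14, 25, 21, 22, 9, 1]
# Cells 0 and 1 are pooled, as are cells 6 and 7.
groups = [(0, 1), (2,), (3,), (4,), (5,), (6, 7)]
chi2 = binomial_gof(counts, 0.5, groups)
print(round(chi2, 3))  # close to the 6.278 reported above
```

Calling the same function with `p = 364 / (7 * 98)` gives the statistic for part (b), up to rounding of the estimated proportion.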
(b) $H_0$: $p_i$ is binomial vs. $H_1$: Not $H_0$. Using
\[ \hat{p} = \frac{\text{Number of Sons}}{\text{Number of Children}} = \frac{364}{7 \times 98} = 0.531 \]
and
\[ e_i = n\hat{p}_i = 98\binom{7}{i}(0.531)^i(0.469)^{7-i}, \]
the results are summarized below:

Sons     n_i      e_i      (n_i - e_i)^2/e_i
0          0     0.492
1          6     3.893     0.595   (cells 0 and 1 combined)
2         14    13.203     0.048
3         25    24.874     0.001
4         21    28.119     1.802
5         22    19.072     0.450
6          9     7.187     0.327   (cells 6 and 7 combined)
7          1     1.161
Total     98    98         chi^2 = 3.223

Note that cell 0 was combined with cell 1, and cell 7 was combined with cell 6, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 < \chi^2_{6-2,.10} = 7.779$, do not reject $H_0$ and conclude that the binomial distribution is a plausible distribution. This agrees with (a), since $\hat{p} = .531 \approx .5$.
9.20 (a) $H_0: p_1 = 9/16,\ p_2 = p_3 = 3/16,\ p_4 = 1/16$ vs. $H_1$: Not $H_0$.
(b) Using $e_i = np_i = 1611p_i$, the results are summarized below:

Phenotype              n_i       e_i       (n_i - e_i)^2/e_i
Tall, cut-leaf         926     906.188     0.433
Dwarf, cut-leaf        293     302.063     0.272
Tall, potato-leaf      288     302.063     0.655
Dwarf, potato-leaf     104     100.688     0.109
Total                 1611                 1.469

Since $\chi^2 < \chi^2_{4-1,.05} = 7.815$, do not reject $H_0$ and conclude that the proportions 9:3:3:1 fit the observed frequencies well.

9.21 Since
\[ \bar{x} = \frac{229 \times 0 + 211 \times 1 + \cdots + 1 \times 7}{229 + 211 + \cdots + 1} = 0.932, \]
then using the Poisson formula,
\[ p_i = \frac{e^{-0.932}(0.932)^i}{i!}, \]
and the expected frequencies are $e_i = np_i = 576p_i$. The results are summarized below:
Hits     n_i     p_i       e_i       (n_i - e_i)^2/e_i
0        229    0.394    226.743     0.022
1        211    0.367    211.390     0.001
2         93    0.171     98.539     0.311
3         35    0.053     30.622     0.626
4          7    0.012      7.137     0.003
5          0    0.002      1.331     0.204   (cells 5, 6 and 7 combined)
6          0    0.000      0.207
7          1    0.000      0.028
Total    576                         chi^2 = 1.167

Note that cells 6 and 7 were combined with cell 5, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 < \chi^2_{6-2,0.05} = 9.488$, do not reject $H_0$ and conclude that the Poisson distribution is a plausible model.
9.22 (a) Since $\hat{\lambda} = 0.519$, then
\[ p_i = \frac{e^{-0.519}(0.519)^i}{i!} \]
and $e_i = np_i$. The results are summarized below:

Passengers    n_i     p_i       e_i       (n_i - e_i)^2/e_i
0             678    0.595    601.662       9.686
1             227    0.309    312.262      23.281
2              56    0.080     81.032       7.733
3              28    0.014     14.019      13.944
4               8    0.002      1.819     196.998   (cells 4 and >= 5 combined)
>= 5           14    0.000      0.206
Total        1011                         chi^2 = 251.642

Note that cell $\geq 5$ was combined with cell 4, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 > \chi^2_{5-2,.05} = 7.815$, reject $H_0$ and conclude that the Poisson distribution is not a plausible distribution for the number of passengers.
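As a rough check of the Poisson fit in (a), the sketch below recomputes the expected frequencies and the statistic. The function name `poisson_gof` is hypothetical; it assumes the last observed cell is an open-ended tail (here, 5 or more passengers) whose probability is one minus the sum of the others, and that the last two cells are pooled as in the note above.

```python
from math import exp, factorial

def poisson_gof(counts, lam):
    """Chi-square statistic for a Poisson fit; the last cell is an open tail.

    The last two cells are pooled before the statistic is computed,
    mirroring the combination of cell 4 with cell >= 5 above.
    """
    n = sum(counts)
    k = len(counts) - 1                       # index of the tail cell
    probs = [exp(-lam) * lam**i / factorial(i) for i in range(k)]
    probs.append(1.0 - sum(probs))            # P(X >= k)
    expected = [n * p for p in probs]
    obs = counts[:k - 1] + [counts[k - 1] + counts[k]]
    exp_pooled = expected[:k - 1] + [expected[k - 1] + expected[k]]
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp_pooled))

# Observed counts for 0, 1, 2, 3, 4 and >= 5 passengers.
counts = [678, 227, 56, 28, 8, 14]
chi2 = poisson_gof(counts, 0.519)
print(round(chi2, 1))  # roughly 251.6, far above the critical value 7.815
```

Small differences from the tabulated 251.642 come from rounding in the hand calculation.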
(b) Since $\hat{p} = 1/(1 + 0.519) = 0.658$, then
\[ p_i = (1 - p)^{i-1}p = (0.342)^{i-1}(0.658) \]
and $e_i = np_i$. The results are summarized below:

Occupants     n_i     p_i       e_i       (n_i - e_i)^2/e_i
1             678    0.658    665.569      0.232
2             227    0.225    227.407      0.001
3              56    0.077     77.698      6.060
4              28    0.026     26.547      0.079
5               8    0.009      9.071      0.126
>= 6           14    0.005      4.708     18.342
Total        1011                         chi^2 = 24.841

Since $\chi^2 > \chi^2_{6-2,.05} = 9.488$, reject $H_0$ and conclude that the geometric distribution is not a plausible distribution for the number of occupants.
(c) While neither is a plausible distribution for the data, the geometric distribution fits much better, since its $\chi^2$ value is much smaller. Also note that the lack of fit of the geometric distribution comes primarily from the tail category ($\geq 6$).
9.23 (a)
\[ \chi^2 = \sum_{i=1}^{2} \frac{(n_i - e_i)^2}{e_i} = \frac{(x - np_0)^2}{np_0} + \frac{(n - x - n(1 - p_0))^2}{n(1 - p_0)} = \frac{(x - np_0)^2(1 - p_0) + (np_0 - x)^2 p_0}{np_0(1 - p_0)} = \frac{(x - np_0)^2}{np_0(1 - p_0)} = z^2. \]
(b) We reject $H_0$ if $|z| > z_{\alpha/2}$ or if $z^2 > z^2_{\alpha/2} = \chi^2_{1,\alpha}$. Hence the two tests are equivalent.

9.24 Using $p = q = 0.5$, then
\[ p_i = \binom{i-1}{3}\left[p^4 q^{i-4} + q^4 p^{i-4}\right] = \binom{i-1}{3}(0.5)^{i-1} \]
and $e_i = np_i$.
The results are summarized below:

Games (i)    n_i     p_i       e_i      (n_i - e_i)^2/e_i
4              6    0.125     6.500     0.038
5             11    0.250    13.000     0.308
6             21    0.313    16.250     1.388
7             14    0.313    16.250     0.312
Total         52                        chi^2 = 2.046

Since $\chi^2 < \chi^2_{3,0.05} = 7.815$, we do not reject $H_0$ and conclude that this model does fit the data well.
9.25 (a) Multinomial sampling. $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to religious affiliation and $j$ refers to political party affiliation.
(b) Product multinomial sampling. $H_0: p_{ij} = p_j$ for all $i,j$, where $i$ refers to the type of mutual fund and $j$ refers to the return classification.

9.26 (a) Product multinomial sampling. $H_0: p_{ij} = p_j$ for all $i,j$, where $i$ refers to the age group and $j$ refers to the willingness to use the internet grocery service.
(b) Multinomial sampling. $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to the severity of injury and $j$ refers to the use of a safety restraint.

9.27 (a) Product multinomial sampling.
(b) Using
\[ \hat{e}_{ij} = \frac{n_{i\cdot}n_{\cdot j}}{n}, \]
the results are summarized below:

Word        Twain     Twain     Q.C.S.    Q.C.S.
Length      n_ij      e_ij      n_ij      e_ij      Total
1            312       233       424       503        736
2           1146      1213      2685      2618       3831
3           1394      1313      2752      2833       4146
4           1177      1102      2302      2377       3479
5            661       663      1431      1429       2092
6            442       454       992       980       1434
7            367       400       896       863       1263
8            231       275       638       594        869
9            181       205       465       441        646
10           109       122       276       263        385
11            50        64       152       138        202
12            24        40       101        85        125
13+           12        23        61        50         73
Total       6106               13175                19281

Then
\[ \chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(312 - 233)^2}{233} + \frac{(1146 - 1213)^2}{1213} + \cdots + \frac{(61 - 50)^2}{50} = 101.494. \]
Since $\chi^2 > \chi^2_{(2-1)(13-1),0.01} = 26.217$, reject $H_0$ and conclude that the Q.C.S. letters do not match Mark Twain's word length patterns.
9.28 (a) Product multinomial sampling.
(b) $H_0: p_{ij} = p_j$, where $i$ refers to the type of city and $j$ refers to whether or not the wallet was returned.
(c) Using
\[ \hat{e}_{ij} = \frac{n_{i\cdot}n_{\cdot j}}{n}, \]
the results are summarized below:
Type of            Returned   Returned    Kept     Kept
Cities             n_ij       e_ij        n_ij     e_ij     Total
Cities              21         20           9       10        30
Suburbs             18         20          12       10        30
Medium Cities       17         20          13       10        30
Small Cities        24         20           6       10        30
Total               80                     40                120
Then
\[ \chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(21 - 20)^2}{20} + \frac{(9 - 10)^2}{10} + \cdots + \frac{(6 - 10)^2}{10} = 4.5. \]
Since $\chi^2 < \chi^2_{(4-1)(2-1),0.10} = 6.251$, do not reject $H_0$ and conclude that there are no differences in the return rates among the different types of cities.
9.29 (a) Multinomial sampling.
(b) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to gender and $j$ refers to height above ground.
(c) Using
\[ \hat{e}_{ij} = \frac{n_{i\cdot}n_{\cdot j}}{n}, \]
the results are summarized below:
                   Height above ground
Gender       3 feet              35 feet             Total
             n_ij    e_ij        n_ij    e_ij
Males        173    184.75       125    113.25        298
Females      150    138.25        73     84.75        223
Total        323                 198                  521
Then
\[ \chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(173 - 184.75)^2}{184.75} + \frac{(125 - 113.25)^2}{113.25} + \cdots + \frac{(73 - 84.75)^2}{84.75} = 4.593. \]
Since $\chi^2 > \chi^2_{(2-1)(2-1),0.05} = 3.841$, reject $H_0$ and conclude that gender and trap height are associated and are not independent.
9.30 (a)

Treatment        Airsick     OK     Row Total
Dramamine           31       77        108
Placebo             60       48        108
Column Total        91      125        216

(b) $H_0: p_{ij} = p_j$, where $i$ refers to the treatment given and $j$ refers to whether or not the volunteer became airsick.
(c) Using
\[ \hat{e}_{ij} = \frac{n_{i\cdot}n_{\cdot j}}{n}, \]
the results are summarized below:

Treatment        Airsick             OK                  Row Total
                 n_ij    e_ij        n_ij    e_ij
Dramamine         31    45.50         77    62.50           108
Placebo           60    45.50         48    62.50           108
Column Total      91                 125                    216

Then
\[ \chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(31 - 45.50)^2}{45.50} + \frac{(77 - 62.50)^2}{62.50} + \cdots + \frac{(48 - 62.50)^2}{62.50} = 15.970. \]
Since $\chi^2 > \chi^2_{(2-1)(2-1),0.05} = 3.841$, reject $H_0$ and conclude that Dramamine is effective in reducing the chances of airsickness.
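The expected counts and the statistic in (c) follow mechanically from the row and column totals. A minimal sketch, assuming the observed table is given as a list of rows (the function name `chi_square_independence` is hypothetical):

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # e_ij = n_i. n_.j / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2

# Rows: Dramamine, Placebo; columns: Airsick, OK.
observed = [[31, 77], [60, 48]]
chi2 = chi_square_independence(observed)
print(round(chi2, 3))  # close to the 15.970 reported above
```

The same function applies unchanged to the larger tables in this section, since the formula for $\hat{e}_{ij}$ does not depend on the table's dimensions.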
9.31 (a)

                     Cholesterol Level
Personality Type     <= 250     > 250
A                      12          8
B                      17          3

(b) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to the personality type and $j$ refers to the cholesterol level. Using
\[ \hat{e}_{ij} = \frac{n_{i\cdot}n_{\cdot j}}{n}, \]
the results are summarized below:

                     Cholesterol Level
Personality Type     <= 250              > 250              Row Total
                     n_ij    e_ij        n_ij    e_ij
A                     12     14.5          8      5.5           20
B                     17     14.5          3      5.5           20
Total                 29                  11                    40

Then
\[ \chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(12 - 14.50)^2}{14.50} + \frac{(8 - 5.50)^2}{5.50} + \cdots + \frac{(3 - 5.50)^2}{5.50} = 3.135. \]
Since $\chi^2 > \chi^2_{(2-1)(2-1),0.10} = 2.706$, reject $H_0$ and conclude that personality type and cholesterol level are associated and are not independent.
9.32 (a) Multinomial sampling.
(b) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to eye color and $j$ refers to hair color. Using
\[ \hat{e}_{ij} = \frac{n_{i\cdot}n_{\cdot j}}{n}, \]
the results are summarized below, with $\hat{e}_{ij}$ in parentheses after each $n_{ij}$:

                                Hair Color
Eye Color    Black          Brown            Red            Blond          Total
Brown        68 (40.14)     119 (106.28)     26 (26.39)      7 (47.20)      220
Blue         20 (39.22)      84 (103.87)     17 (25.79)     94 (46.12)      215
Hazel        15 (16.97)      54 (44.93)      14 (11.15)     10 (19.95)       93
Green         5 (11.68)      29 (30.92)      14 (7.68)      16 (13.73)       64
Total       108             286              71            127              592

Then
\[ \chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(68 - 40.14)^2}{40.14} + \frac{(119 - 106.28)^2}{106.28} + \cdots + \frac{(16 - 13.73)^2}{13.73} = 138.290. \]
Since $\chi^2 > \chi^2_{(4-1)(4-1),0.05} = 16.919$, reject $H_0$ and conclude that eye color and hair color are associated and are not independent.

9.33 (a) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to the opinion on full evacuation and $j$ refers to the distance from Three Mile Island.
(b) Using
\[ \hat{e}_{ij} = \frac{n_{i\cdot}n_{\cdot j}}{n}, \]
the results are summarized below:
7.04
974-6
413-15
11
112.88
64.84
4.4
10-12
7-9
53
05.6
1-3
6.16
11
29.92
10.12
9.68
12.32
38.08
29
22
23
68
10
39
84
Total
66
150
16
Full 8.96
15+
Distance
(in miles) fromRow
Three Mile Island
~j
l
Then
\[ \chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(7 - 7.04)^2}{7.04} + \frac{(11 - 9.68)^2}{9.68} + \cdots + \frac{(39 - 38.08)^2}{38.08} = 0.449. \]
j refers
Chapter 10 Solutions

Solutions to Section 10.1

10.1 (a) Theoretical and deterministic.
(b) Empirical and probabilistic.
(c) Empirical and probabilistic.

10.2 (a) Empirical and probabilistic.
(b) Empirical and probabilistic.
(c) Theoretical and deterministic.

10.3 Experimental Study: Chemical reaction, where the temperature is controlled at various settings to determine its effect on yield.
Observational Study: Model income as a function of years of education for a sample of workers.

Solutions to Section 10.2
10.4 (a)
[Figure: Scatterplot of NEXT vs. LAST]
This shows a positive approximately linear relationship.
(b) Using $\bar{x} = 3.238$ and $\bar{y} = 62.714$,
\[ S_{xy} = \sum x_i y_i - n\bar{x}\bar{y} = [2.0 \times 50 + 1.8 \times 57 + \cdots + 4.3 \times 72] - 21(3.238)(62.714) = 217.629, \]
\[ S_{xx} = \sum x_i^2 - n\bar{x}^2 = [(2.0)^2 + (1.8)^2 + \cdots + (4.3)^2] - 21(3.238)^2 = 22.230, \]
\[ S_{yy} = \sum y_i^2 - n\bar{y}^2 = [(50)^2 + (57)^2 + \cdots + (72)^2] - 21(62.714)^2 = 2844.286. \]
Then
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{217.629}{22.230} = 9.790, \]
and
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x} = 62.714 - 9.790 \times 3.238 = 31.013. \]
Therefore, the least squares regression line is
\[ \hat{y} = 31.013 + 9.790x. \]
For a previous eruption of 3 minutes, the predicted time to the next eruption would be
\[ \hat{y} = 31.013 + 9.790(3) = 60.383. \]
(c) Since
\[ SSR = \hat{\beta}_1^2 S_{xx} = (9.790)^2 \times 22.230 = 2130.599, \]
and $SST = S_{yy}$, then
\[ r^2 = \frac{SSR}{SST} = \frac{2130.599}{2844.286} = 0.749, \]
so that approximately 75% of the variability in NEXT is accounted for by LAST. LAST appears to be a pretty good predictor of NEXT.
(d) Since
\[ MSE = \frac{SSE}{n - 2} = \frac{SST - SSR}{n - 2} = \frac{2844.286 - 2130.599}{21 - 2} = 37.562, \]
the estimate of $\sigma$ is
\[ \hat{\sigma} = \sqrt{37.562} = 6.129. \]
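The steps in (b) through (d) use only the summary quantities $S_{xy}$, $S_{xx}$, $S_{yy}$, $\bar{x}$, $\bar{y}$ and $n$. The sketch below (the function name `simple_regression_summary` is hypothetical) reproduces them; small differences in the last digit arise because the hand calculation rounds intermediate values.

```python
from math import sqrt

def simple_regression_summary(sxy, sxx, syy, xbar, ybar, n):
    """Least squares fit of y = b0 + b1*x from summary statistics."""
    b1 = sxy / sxx                    # slope
    b0 = ybar - b1 * xbar             # intercept
    ssr = b1 ** 2 * sxx               # regression sum of squares
    r2 = ssr / syy                    # SST = Syy
    mse = (syy - ssr) / (n - 2)       # SSE / (n - 2)
    return b0, b1, r2, sqrt(mse)

b0, b1, r2, sigma_hat = simple_regression_summary(
    sxy=217.629, sxx=22.230, syy=2844.286, xbar=3.238, ybar=62.714, n=21)
print(round(b1, 3), round(r2, 3), round(sigma_hat, 3))
print(round(b0 + b1 * 3, 2))  # predicted NEXT for LAST = 3 minutes
```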
10.5 (a)
[Figure: Scatterplot of jump length by year]
This shows a positive approximately linear relationship.
(b) From Minitab, the least squares regression line is
\[ \hat{y} = -62.312 + 0.040x. \]
(c) From Minitab, the MSE is 0.104, so the estimate of $\sigma$ is
\[ \hat{\sigma} = \sqrt{0.104} = 0.322. \]
10.6 (a)
[Figure: Scatterplot of boiling point vs. barometric pressure]
This shows a positive approximately linear relationship.
(b) From Minitab, the least squares regression line is
\[ \hat{y} = 155.296 + 1.902x. \]
Since $r^2 = 0.994$, approximately 99% of the variation in the boiling point is accounted for by linear regression on the barometric pressure.
(c) From Minitab, the MSE is 0.197, so the estimate of $\sigma$ is
\[ \hat{\sigma} = \sqrt{0.197} = 0.444. \]
10.7 (a)
[Figure: Scatterplot of Winning Time by Year]
This shows a negative approximately linear relationship.
(b) From Minitab, the least squares regression line is
\[ \hat{y} = 774.012 - 0.359x. \]
(c) From Minitab, the MSE is 1.728, so the estimate of $\sigma$ is
\[ \hat{\sigma} = \sqrt{1.728} = 1.315. \]

10.8 Since
\[ Q = \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2, \]
to minimize $Q$ we solve
\[ \frac{dQ}{d\beta_1} = -2\sum_{i=1}^{n} x_i(y_i - \beta_1 x_i) = 0 \]
or
\[ \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i, \]
yielding
\[ \hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2}. \]
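The closed form for the no-intercept slope is easy to check numerically. The sketch below uses made-up illustrative data (not from the exercise) lying exactly on the line $y = 2x$, so the estimate should recover the slope 2; the function name `slope_through_origin` is hypothetical.

```python
def slope_through_origin(xs, ys):
    """Least squares slope for the model y = b1 * x (no intercept)."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / sum_xx

# Hypothetical data on the line y = 2x.
xs = [1, 2, 3]
ys = [2, 4, 6]
b1 = slope_through_origin(xs, ys)
print(b1)
```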
Solutions to Section 10.3

10.9 (a) Since
\[ SE(\hat{\beta}_1) = \frac{s}{\sqrt{S_{xx}}} = \frac{0.3227}{\sqrt{18153}} = 0.00247, \]
the test statistic for $H_0: \beta_1 = 0$ is
\[ t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)} = \frac{0.040}{0.00247} = 16.312. \]
Since $t > t_{19,0.05} = 1.729$, we reject $H_0$ and conclude that there is a significant linear trend.
(b) The predicted value is
\[ \hat{y}^* = -62.3 + 0.0403 \times 2004 = 18.461. \]
Then a 95% PI is given by
\[ \hat{y}^* \pm t_{21-2,0.025}\, s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 18.461 \pm 2.093(0.3227)\sqrt{1 + \frac{1}{21} + \frac{(2004 - 1947.429)^2}{18153}} = 18.461 \pm 0.747 = [17.714, 19.208]. \]
This is unreliable because we are extrapolating beyond the domain of the data. A 95%
CI for the winning jump does not have a meaningful interpretation because there will
only be one winning jump in 2004, and we are not concerned about the average winning
jump in 2004 if we were to hold the competition over and over again.
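The prediction interval in (b) can be reproduced from the quantities quoted in the solution. In the sketch below the function name `prediction_interval` is hypothetical, and the critical value $t_{19,0.025} = 2.093$ is passed in directly rather than computed, so that only the standard library is needed.

```python
from math import sqrt

def prediction_interval(y_hat, t_crit, s, n, x_star, x_bar, sxx):
    """Prediction interval for a single future observation at x_star."""
    half_width = t_crit * s * sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)
    return y_hat - half_width, y_hat + half_width

lower, upper = prediction_interval(
    y_hat=18.461, t_crit=2.093, s=0.3227, n=21,
    x_star=2004, x_bar=1947.429, sxx=18153)
print(round(lower, 3), round(upper, 3))
```

Dropping the leading `1 +` under the square root turns this into the confidence interval for the mean response, which is why a CI is always narrower than the PI at the same $x^*$.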
10.10 The Minitab output is shown below:

Regression Analysis

The regression equation is
Temp = 155 + 1.90 Pressure

Predictor        Coef       StDev          T        P
Constant      155.296       0.927     167.47    0.000
Pressure      1.90178     0.03676      51.74    0.000

S = 0.4440    R-Sq = 99.4%    R-Sq(adj) = 99.4%

(a) The predicted value at $x = 28$ is
\[ \hat{\mu}^* = 155.296 + 1.902 \times 28 = 208.546. \]
Then a 95% CI for $\mu^*$ is given by
\[ \hat{\mu}^* \pm t_{17-2,0.025}\, s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 208.546 \pm 2.131(0.444)\sqrt{\frac{1}{17} + \frac{(28 - 25.059)^2}{145.886}} = 208.546 \pm 0.325 = [208.221, 208.871]. \]
(b) The predicted value at $x = 31$ is
\[ \hat{\mu}^* = 155.296 + 1.902 \times 31 = 214.258. \]
Then a 95% CI for $\mu^*$ is given by
\[ \hat{\mu}^* \pm t_{17-2,0.025}\, s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 214.258 \pm 2.131(0.444)\sqrt{\frac{1}{17} + \frac{(31 - 25.059)^2}{145.886}} = 214.258 \pm 0.519 = [213.739, 214.777]. \]
This is wider than the CI at $x = 28$ because the point of extrapolation is outside the range of data, where the regression line is less reliable.
10.11 (a) The Minitab output is shown below:

Regression Analysis

The regression equation is
NEXT = 31.0 + 9.79 LAST

Predictor        Coef       StDev          T        P
Constant       31.013       4.417       7.02    0.000
LAST            9.790       1.300       7.53    0.000

R-Sq = 74.9%    R-Sq(adj) = 73.6%

Analysis of Variance

Source            DF        SS        MS         F        P
Regression         1    2130.6    2130.6     56.72    0.000
Residual Error    19     713.7      37.6
Total             20    2844.3

Predicted Values

Fit      StDev Fit       95.0% CI             95.0% PI
60.38      1.37      (57.51, 63.26)      (47.24, 73.53)    for x=3
40.80      3.20      (34.10, 47.51)      (26.33, 55.28)    for x=1

A 95% PI at $x = 3$ is [47.24, 73.53].
(b) From the output above, a 95% CI at $x = 3$ is [57.51, 63.26]. This is narrower than the PI, because it is a confidence interval for the average of all future observations at $x = 3$, and not a single future observation.
(c) A 95% PI at $x = 1$ is [26.33, 55.28]. We would not expect this PI to be reliable because it extrapolates beyond the domain of the data.
10.12 (a) The Minitab output is shown below:
From the regression output, $s = 0.1246$, and
\[ S_{xx} = \frac{s^2}{[SE(\hat{\beta}_1)]^2} = \frac{(0.1246)^2}{(0.0137)^2} = 82.717. \]
Then a 95% PI for 1995 is
\[ 7.95 \pm 2.306(0.1246)\sqrt{1 + \frac{1}{10} + \frac{(15 - 5.5)^2}{82.717}} = 7.95 \pm 0.425, \]
or [7.525, 8.375].
10.14 Write $\bar{Y} = \frac{1}{n}\sum Y_i$ and $\hat{\beta}_1 = \sum c_i Y_i$, where
\[ c_i = \frac{x_i - \bar{x}}{S_{xx}} \]
and $\sum c_i = 0$. Then
\[ \mathrm{Cov}(\bar{Y}, \hat{\beta}_1) = \sum_i \sum_j \frac{1}{n} c_j\, \mathrm{Cov}(Y_i, Y_j) = \frac{1}{n}\sum_i c_i\, \mathrm{Var}(Y_i) = \frac{\sigma^2}{n}\sum_i c_i = 0. \]
Since $\bar{Y}$ and $\hat{\beta}_1$ are both normally distributed (as linear functions of normal random variables), a correlation of 0 implies that they are independent.
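Solutions to Section 10.4

10.15 Since
\[ p = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}, \]
then
\[ p + pe^{\beta_0 + \beta_1 x} = e^{\beta_0 + \beta_1 x} \]
and
\[ \frac{p}{1 - p} = e^{\beta_0 + \beta_1 x}. \]
Then the linearizing transformation is
\[ h(p) = \log_e \frac{p}{1 - p} = \beta_0 + \beta_1 x. \]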
10.16 (a)
[Figures: Scatterplot of p(x) vs. 10,000/x; Scatterplot of p(x) vs. 1,000/sqrt(x); Scatterplot of p(x) vs. 1/log10(x)]
The plot of $p(x)$ vs. $1/\log_{10} x$ appears the most linear.
(b) The Minitab output is shown below:

Regression Analysis

The regression equation is
p(x) = 0.0152 + 0.404 1/log(x)

Predictor         Coef        StDev          T        P
Constant      0.015188     0.007720       1.97    0.085
1/log(x)       0.40419      0.01961      20.61    0.000

R-Sq = 98.2%    R-Sq(adj) = 97.9%

Analysis of Variance

Source            DF         SS         MS          F        P
Regression         1    0.11303    0.11303     424.78    0.000
Residual Error     8    0.00213    0.00027
Total              9    0.11516

From the Minitab output, $\hat{\beta}_1 = 0.404$. To test whether it is significantly different from $\log_{10} e = 0.4343$, compute the test statistic for testing $H_0: \beta_1 = 0.4343$,
\[ t = \frac{\hat{\beta}_1 - 0.4343}{SE(\hat{\beta}_1)} = \frac{0.4042 - 0.4343}{0.0196} = -1.535. \]
Since $|t| < t_{8,0.025} = 2.306$, we do not reject $H_0$ and conclude that it does not significantly differ from the theoretical value, $\log_{10} e = 0.4343$.
(c) To verify the Prime Number Theorem, we must first test whether $\beta_0$ differs significantly from 0. From the Minitab output above, the P-value is $0.085 > \alpha = 0.05$. Therefore, $\beta_0 \approx 0$ and the approximate relationship between $p(x)$ and $x$ is
\[ p(x) \approx \frac{\log_{10} e}{\log_{10} x} = \frac{1}{\log_e x}. \]
10.17 (a)
[Figure: Scatterplot of p vs. ln(t)]
This plot of the transformed data shows a negative linear relationship.
(b) The Minitab output is shown below:

Regression Analysis

The regression equation is
p = 0.846 - 0.0792 ln(t)
Log(Distance) = 3.12 + 0.521 Number

Predictor        Coef       StDev          T        P
Constant       3.1231      0.1102      28.33    0.000
Number        0.52107     0.01776      29.33    0.000

S = 0.1614    R-Sq = 99.1%    R-Sq(adj) = 99.0%

Analysis of Variance

Source            DF        SS        MS          F        P
Regression         1    22.400    22.400     860.38    0.000
Residual Error     8     0.208     0.026
Total              9    22.608

(c) Using the regression equation,
\[ \log(\text{Distance}) = 3.12 + 0.521\,(\text{Number}), \]
for planet X, with planet number 11, the predicted distance would be
\[ \text{Distance} = \exp\{3.12 + 0.521(11)\} = 6981.4. \]
10.19 The original data and the transformation $1/\text{speed}^2$ appear below:
[Figures: Scatterplot of Speed vs. Distance; Scatterplot of 1/(Speed)^2 vs. Distance]
This transformation appears to have linearized the relationship, suggesting that the appropriate transformation is $h(\text{speed}) = 1/\text{speed}^2$.
10.20 (a) The Minitab output from fitting a LS straight line to the original data is below:

Regression Analysis

The regression equation is
y = - 70.7 + 4.14 x

Predictor        Coef       StDev          T        P
Constant       -70.65       22.88      -3.09    0.015
x              4.1350      0.5394       7.67    0.000

R-Sq = 88.0%    R-Sq(adj) = 86.5%

Analysis of Variance

Source            DF       SS       MS         F        P
Regression         1    34196    34196     58.77    0.000
Residual Error     8     4655      582
Total              9    38852

This LS fit results in the residual plot against speed below:
[Figure: Plot of Residuals against Speed]
(b) The residual plot above appears curved, indicating that a transformation of the data is needed. Also, the spread of the residuals appears to increase with larger speed, violating the assumption of homoscedasticity, or constant variance. However, note that $r^2 = 0.880$, indicating that a linear fit explains part of the variation in braking distance.
(c) Since (Kinetic Energy) $\propto$ (Speed)$^2$ and (Braking Energy) $\propto$ (Distance), we might fit a linear relationship between Distance and (Speed)$^2$.
(d) Fitting the equation Distance $= \beta_0 + \beta_1(\text{Speed})^2$, Minitab gives the output below:

Regression Analysis

The regression equation is
y = 1.62 + 0.0517 x^2

Predictor         Coef        StDev          T        P
Constant          1.62        12.92       0.13    0.903
x^2           0.051739     0.006056       8.54    0.000

R-Sq = 90.1%    R-Sq(adj) = 88.9%

Analysis of Variance

Source            DF       SS       MS         F        P
Regression         1    35014    35014     72.98    0.000
Residual Error     8     3838      480
Total              9    38852

The residual plot against Speed is below:
[Figure: Plot of New Residuals against Speed]
This residual plot no longer has the nonlinear pattern seen before, but it still shows the variability of the residuals increasing with speed. This indicates that the variance is not constant, but increases with $E(Y)$. The predicted stopping distance at a speed of 40 mph is
\[ 1.62 + 0.0517(40)^2 = 84.34 \text{ feet}. \]

10.21 (a)
[Figure: Scatterplot of log(gestation time) vs. weight (kg)]
This plot of the transformed data shows a positive approximately linear relationship.
(c) The Minitab output is shown below:

Regression Analysis

The regression equation is
log(t) = 5.28 + 0.0104 Weight

Predictor         Coef        StDev          T        P
Constant       5.27880      0.08817      59.87    0.000
Weight        0.010411     0.001717       6.06    0.000

S = 0.2163    R-Sq = 80.3%    R-Sq(adj) = 78.1%

Analysis of Variance

Source            DF        SS        MS         F        P
Regression         1    1.7194    1.7194     36.75    0.000
Residual Error     9    0.4211    0.0468
Total             10    2.1405

(d) For a weight of 1.2 kg, the predicted gestation time is
\[ \hat{t} = \exp\{5.279 + 0.0104(1.2)\} = 198.600. \]
10.24 (a) After taking the $\log_e$ transformation, $\beta_0 = \log_e a$ and $\beta_1 = b$.
(b)
[Figure: Scatterplot of log(h) vs. log(l), where l is the length of stay]
This plot of the transformed data shows a positive approximately linear relationship.
(c) The Minitab output is shown below:

The regression equation is
log(h) = 7.09 + 0.691 log(l)

Predictor        Coef       StDev          T        P
Constant       7.0916      0.2554      27.76    0.000
log(l)        0.69096     0.09975       6.93    0.000

S = 0.5538    R-Sq = 60.7%    R-Sq(adj) = 59.5%

Analysis of Variance

Source            DF        SS        MS         F        P
Regression         1    14.716    14.716     47.98    0.000
Residual Error    31     9.508     0.307
Total             32    24.225

(d) For a hospital stay of 3 days, the predicted hospital cost would be
\[ \hat{h} = \exp\{7.09 + 0.691 \log 3\} = 2563. \]
10.25 (a) Since
\[ \mathrm{Var}(\hat{p}) = \frac{p(1 - p)}{n}, \]
then
\[ g(p) = \sqrt{\frac{p(1 - p)}{n}}. \]
The appropriate transformation is
\[ h(p) = \int \frac{dp}{g(p)} = \int \frac{\sqrt{n}\,dp}{\sqrt{p(1 - p)}} = 2\sqrt{n}\,\sin^{-1}\sqrt{p}. \]
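Chapter 11 Solutions

Solutions to Section 11.2

11.1
\[ Q = \sum (y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2)^2. \]
To minimize this, set the partial derivatives equal to 0,
\[ \frac{\partial Q}{\partial \beta_0} = -2\sum (y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2) = 0, \]
\[ \frac{\partial Q}{\partial \beta_1} = -2\sum x_i(y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2) = 0, \]
\[ \frac{\partial Q}{\partial \beta_2} = -2\sum x_i^2(y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2) = 0. \]
From the first equation,
\[ \sum y_i = n\beta_0 + \beta_1 \sum x_i + \beta_2 \sum x_i^2. \]
From the second equation,
\[ \sum x_i y_i = \beta_0 \sum x_i + \beta_1 \sum x_i^2 + \beta_2 \sum x_i^3. \]
From the third equation,
\[ \sum x_i^2 y_i = \beta_0 \sum x_i^2 + \beta_1 \sum x_i^3 + \beta_2 \sum x_i^4. \]
These are the normal equations.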
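To see the normal equations of 11.1 in action, the sketch below builds the three equations from the power sums and solves them by Gaussian elimination. Everything here is illustrative: the function name `fit_quadratic` is hypothetical, and the data are made up to lie exactly on $y = 1 + 2x + 3x^2$ so that the solution is known in advance.

```python
def fit_quadratic(xs, ys):
    """Solve the three normal equations for y = b0 + b1*x + b2*x^2."""
    # sx[k] = sum of x_i^k (sx[0] is n); sxy[k] = sum of x_i^k * y_i.
    sx = [sum(x ** k for x in xs) for k in range(5)]
    sxy = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix: row r is the r-th normal equation.
    a = [[sx[r + c] for c in range(3)] + [sxy[r]] for r in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    b = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(a[r][c] * b[c] for c in range(r + 1, 3))
        b[r] = (a[r][3] - tail) / a[r][r]
    return b

# Hypothetical data generated from y = 1 + 2x + 3x^2.
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x * x for x in xs]
b0, b1, b2 = fit_quadratic(xs, ys)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

For real data a library least squares routine would normally be preferred, since the normal equations can be poorly conditioned when the $x_i$ are large.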
11.2 The fitted model is $\hat{y} = -1.571 + 0.02573\,\text{Verbal} + 0.03361\,\text{Math}$. $r^2 = 0.681$, so 68.1% of the variability in GPA is accounted for by math and verbal scores.

11.3 The fitted model is $\hat{y} = 111.354 + 2.060x_1 - 2.732x_2 + 0.000x_3$. $r^2 = 0.295$, so 29.5% of the variability in PIQ is accounted for by the brain size, height, and weight of a person.

11.4 (a)
[Figures: Scatterplot of Y vs. x1 (Alkalinity); Scatterplot of log Y vs. log x1 (Alkalinity)]