(b) Using $m = b + c = 8 + 26 = 34$, the large sample test statistic is
$$z = \frac{b - c - 1}{\sqrt{b + c}} = \frac{8 - 26 - 1}{\sqrt{8 + 26}} = -3.258.$$
Then the P-value is $P = 2(1 - \Phi(|-3.258|)) = 2 \times 0.0006 = 0.0012$. Since $P < \alpha = 0.05$, reject $H_0$ and conclude that there was a change in opinion of the students.

Solutions to Section 9.3

9.17 $H_0: p_1 = p_2 = \cdots = p_8 = 1/8$ vs. $H_1$: Not $H_0$. The expected frequencies in each case are $np = 144 \times 1/8 = 18$. Then
$$\chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(29 - 18)^2}{18} + \frac{(19 - 18)^2}{18} + \cdots + \frac{(11 - 18)^2}{18} = 16.333.$$
Since $\chi^2 > \chi^2_{8-1,0.05} = 14.067$, reject $H_0$ and conclude that the horse's chances of winning are not the same for each starting gate.

9.18 $H_0: p_1 = p_2 = \cdots = p_{12} = 1/12$ vs. $H_1$: Not $H_0$. The expected frequencies in each case are $np = 700 \times 1/12 = 58.333$. Then
$$\chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(66 - 58.333)^2}{58.333} + \frac{(63 - 58.333)^2}{58.333} + \cdots + \frac{(42 - 58.333)^2}{58.333} = 19.726.$$
Since $\chi^2 > \chi^2_{12-1,0.05} = 19.675$, reject $H_0$ and conclude that the first births are not spread uniformly throughout the year.

9.19 (a) $H_0: p_i = \binom{7}{i}(0.5)^7$ vs. $H_1$: Not $H_0$. Using $e_i = np_i = 98\binom{7}{i}(0.5)^7$, the results are summarized below:

    Sons     n_i      e_i      (n_i - e_i)^2/e_i
    0          0     0.766
    1          6     5.359     0.003  (cells 0 and 1 combined)
    2         14    16.078     0.269
    3         25    26.797     0.120
    4         21    26.797     1.254
    5         22    16.078     2.181
    6          9     5.359     2.452  (cells 6 and 7 combined)
    7          1     0.766
    Total     98    98         6.278

Note that cell 0 was combined with cell 1, and cell 7 was combined with cell 6, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 < \chi^2_{6-1,.10} = 9.236$, do not reject $H_0$ and conclude that the binomial distribution with $p = 0.5$ is a plausible distribution.

(b) $H_0$: $p_i$ is binomial vs. $H_1$: Not $H_0$.
Using $\hat{p} = \dfrac{\text{Number of Sons}}{\text{Number of Children}} = \dfrac{364}{7 \times 98} = 0.531$ and $e_i = n\hat{p}_i = 98\binom{7}{i}(0.531)^i(0.469)^{7-i}$, the results are summarized below:

    Sons     n_i      e_i      (n_i - e_i)^2/e_i
    0          0     0.492
    1          6     3.893     0.595  (cells 0 and 1 combined)
    2         14    13.203     0.048
    3         25    24.874     0.001
    4         21    28.119     1.802
    5         22    19.072     0.450
    6          9     7.187     0.327  (cells 6 and 7 combined)
    7          1     1.161
    Total     98    98         3.223

Note that cell 0 was combined with cell 1, and cell 7 was combined with cell 6, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 < \chi^2_{6-2,.10} = 7.779$, do not reject $H_0$ and conclude that the binomial distribution is a plausible distribution. This agrees with (a), since $\hat{p} = .531 \approx .5$.

9.20 (a) $H_0: p_1 = 9/16,\ p_2 = p_3 = 3/16,\ p_4 = 1/16$ vs. $H_1$: Not $H_0$.

(b) Using $e_i = np_i = 1611p_i$, the results are summarized below:

    Phenotype              n_i       e_i       (n_i - e_i)^2/e_i
    Tall, cut-leaf         926     906.188     0.433
    Dwarf, cut-leaf        293     302.063     0.272
    Tall, potato-leaf      288     302.063     0.655
    Dwarf, potato-leaf     104     100.688     0.109
    Total                 1611                 1.469

Since $\chi^2 < \chi^2_{4-1,.05} = 7.815$, do not reject $H_0$ and conclude that the 9:3:3:1 proportions fit the observed frequencies well.

9.21 Since
$$\hat{\lambda} = \frac{229 \times 0 + 211 \times 1 + \cdots + 1 \times 7}{229 + 211 + \cdots + 1} = 0.932,$$
then using the Poisson formula, $p_i = e^{-0.932}(0.932)^i / i!$, and the expected frequencies are $e_i = np_i$. The results are summarized below:

    Hits     n_i     p_i       e_i       (n_i - e_i)^2/e_i
    0        229    0.394    226.743     0.022
    1        211    0.367    211.390     0.001
    2         93    0.171     98.539     0.311
    3         35    0.053     30.622     0.626
    4          7    0.012      7.137     0.003
    5          0    0.002      1.331     0.204  (cells 5, 6 and 7 combined)
    6          0    0.000      0.207
    7          1    0.000      0.028
    Total    576

Note that cells 6 and 7 were combined with cell 5, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 < \chi^2_{6-2,0.05} = 9.488$, do not reject $H_0$ and conclude that the Poisson distribution is a plausible model.

9.22 (a) Since $\hat{\lambda} = 0.519$, then $p_i = e^{-0.519}(0.519)^i / i!$
and $e_i = np_i$. The results are summarized below:

    Passengers    n_i     p_i       e_i       (n_i - e_i)^2/e_i
    0             678    0.595    601.662       9.686
    1             227    0.309    312.262      23.281
    2              56    0.080     81.032       7.733
    3              28    0.014     14.019      13.944
    4               8    0.002      1.819     196.998  (cells 4 and >= 5 combined)
    >= 5           14    0.000      0.206
    Total        1011                         251.642

Note that cell $\geq 5$ was combined with cell 4, to satisfy the requirement that no cell can have $e_i < 1$ and no more than 1/5th of the $e_i$ can be $< 5$. Since $\chi^2 > \chi^2_{5-2,.05} = 7.815$, reject $H_0$ and conclude that the Poisson distribution is not a plausible distribution for the number of passengers.

(b) Since $\hat{p} = 1/(1 + 0.519) = 0.658$, then $p_i = (1 - \hat{p})^{i-1}\hat{p} = (0.342)^{i-1}(0.658)$ and $e_i = np_i$. The results are summarized below:

    Occupants     n_i     p_i       e_i       (n_i - e_i)^2/e_i
    1             678    0.658    665.569      0.232
    2             227    0.225    227.407      0.001
    3              56    0.077     77.698      6.060
    4              28    0.026     26.547      0.079
    5               8    0.009      9.071      0.126
    >= 6           14    0.005      4.708     18.342
    Total        1011                         24.841

Since $\chi^2 > \chi^2_{6-2,.05} = 9.488$, reject $H_0$ and conclude that the geometric distribution is not a plausible distribution for the number of occupants.

(c) While neither is a plausible distribution for the data, the geometric distribution fits much better, since its $\chi^2$ value is much smaller. Also note that the lack of fit of the geometric distribution comes primarily from the tail category ($\geq 6$).

9.23 (a)
$$\chi^2 = \sum_{i=1}^{2} \frac{(n_i - e_i)^2}{e_i} = \frac{(x - np_0)^2}{np_0} + \frac{(n - x - n(1 - p_0))^2}{n(1 - p_0)} = \frac{(x - np_0)^2(1 - p_0) + (np_0 - x)^2 p_0}{np_0(1 - p_0)} = \frac{(x - np_0)^2}{np_0(1 - p_0)} = z^2.$$

(b) We reject $H_0$ if $|z| > z_{\alpha/2}$ or, equivalently, if $z^2 = \chi^2 > z^2_{\alpha/2} = \chi^2_{1,\alpha}$. Hence the two tests are equivalent.

9.24 Using $p = q = 0.5$, then
$$p_i = \binom{i-1}{3}\left[p^4 q^{i-4} + q^4 p^{i-4}\right] = \binom{i-1}{3}(0.5)^{i-1}$$
and $e_i = np_i$. With $n = 52$ series, this gives:

    Games (i)     p_i       e_i
    4            0.125     6.500
    5            0.250    13.000
    6            0.313    16.250
    7            0.313    16.250

The observed counts are 6, 11, 14 and 21, with cell contributions $(n_i - e_i)^2/e_i$ of 0.038, 0.308, 0.312 and 1.388 respectively (listed in increasing order of contribution), so that $\chi^2 = 2.046$. Since $\chi^2 < \chi^2_{3,0.05} = 7.815$, we do not reject $H_0$ and conclude that this model does fit the data well.

9.25 (a) Multinomial sampling. $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to religious affiliation and $j$ refers to political party affiliation.

(b) Product multinomial sampling. $H_0: p_{ij} = p_j$ for all $i,j$, where $i$ refers to the type of mutual fund and $j$ refers to the return classification.

9.26 (a) Product multinomial sampling. $H_0: p_{ij} = p_j$ for all $i,j$, where $i$ refers to the age group and $j$ refers to the willingness to use the internet grocery service.

(b) Multinomial sampling. $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to the severity of injury and $j$ refers to the use of a safety restraint.

9.27 (a) Product multinomial sampling.

(b) Using $\hat{e}_{ij} = \dfrac{n_{i\cdot}n_{\cdot j}}{n}$:

[Table of observed counts $n_{ij}$ and expected counts $\hat{e}_{ij}$ by word length (1 to 13+) for the two sets of writings.]

$$\chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(312 - 233)^2}{233} + \frac{(1146 - 1213)^2}{1213} + \cdots + \frac{(61 - 50)^2}{50} = 101.494.$$
Since $\chi^2 > \chi^2_{(2-1)(13-1),0.01} = 26.217$, reject $H_0$ and conclude that the Q.C.S. letters do not match Mark Twain's word length patterns.

9.28 (a) Product multinomial sampling.

(b) $H_0: p_{ij} = p_j$, where $i$ refers to type of city and $j$ refers to whether or not the wallet was returned.

(c) Using $\hat{e}_{ij} = \dfrac{n_{i\cdot}n_{\cdot j}}{n}$: each of the four types of cities has $n_{i\cdot} = 30$ wallets, and of the $n = 120$ wallets, 80 were returned and 40 were kept, so $\hat{e}_{ij} = 20$ (returned) and $\hat{e}_{ij} = 10$ (kept) for every type of city. Then
$$\chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(21 - 20)^2}{20} + \frac{(9 - 10)^2}{10} + \cdots + \frac{(6 - 10)^2}{10} = 4.5.$$
Since $\chi^2 < \chi^2_{(4-1)(2-1),0.10} = 6.251$, do not reject $H_0$ and conclude that there are no differences in the return rates among the different types of cities.

9.29 (a) Multinomial sampling.

(b) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to gender and $j$ refers to height above ground.
(c) Using $\hat{e}_{ij} = \dfrac{n_{i\cdot}n_{\cdot j}}{n}$ with row (gender) totals 298 and 223, column (height above ground) totals 323 and 198, and $n = 521$, the observed counts 173, 125, 150 and 73 have expected counts 184.75, 113.25, 138.25 and 84.75, respectively. Then
$$\chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(173 - 184.75)^2}{184.75} + \frac{(125 - 113.25)^2}{113.25} + \cdots + \frac{(73 - 84.75)^2}{84.75} = 4.593.$$
Since $\chi^2 > \chi^2_{(2-1)(2-1),0.05} = 3.841$, reject $H_0$ and conclude that gender and trap height are associated and are not independent.

9.30 (a)

    Treatment        Airsick     OK     Row Total
    Dramamine           31       77        108
    Placebo             60       48        108
    Column Total        91      125        216

(b) $H_0: p_{ij} = p_j$, where $i$ refers to the treatment given and $j$ refers to whether or not the volunteer became airsick.

(c) Using $\hat{e}_{ij} = \dfrac{n_{i\cdot}n_{\cdot j}}{n}$, the results are summarized below (expected counts in parentheses):

    Treatment        Airsick        OK             Row Total
    Dramamine        31 (45.50)     77 (62.50)        108
    Placebo          60 (45.50)     48 (62.50)        108
    Column Total     91            125                216

Then
$$\chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(31 - 45.50)^2}{45.50} + \frac{(77 - 62.50)^2}{62.50} + \cdots + \frac{(48 - 62.50)^2}{62.50} = 15.970.$$
Since $\chi^2 > \chi^2_{(2-1)(2-1),0.05} = 3.841$, reject $H_0$ and conclude that Dramamine is effective in reducing the chances of airsickness.

9.31 (a)

    Personality Type     Cholesterol <= 250     Cholesterol > 250
    A                           12                      8
    B                           17                      3

(b) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to the personality type and $j$ refers to the cholesterol level. Using $\hat{e}_{ij} = \dfrac{n_{i\cdot}n_{\cdot j}}{n}$, the results are summarized below (expected counts in parentheses):

    Personality Type     <= 250         > 250         Row Total
    A                    12 (14.5)      8 (5.5)          20
    B                    17 (14.5)      3 (5.5)          20
    Total                29            11                40

Then
$$\chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(12 - 14.50)^2}{14.50} + \frac{(8 - 5.50)^2}{5.50} + \cdots + \frac{(3 - 5.50)^2}{5.50} = 3.135.$$
Since $\chi^2 > \chi^2_{(2-1)(2-1),0.10} = 2.706$, reject $H_0$ and conclude that personality type and cholesterol level are associated and are not independent.

9.32 (a) Multinomial sampling.

(b) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to eye color and $j$ refers to hair color.
Using $\hat{e}_{ij} = \dfrac{n_{i\cdot}n_{\cdot j}}{n}$, the results are summarized below (expected counts in parentheses):

                                     Hair Color
    Eye Color    Black          Brown            Red           Blond          Total
    Brown        68 (40.14)    119 (106.28)     26 (26.39)      7 (47.20)      220
    Blue         20 (39.22)     84 (103.87)     17 (25.79)     94 (46.12)      215
    Hazel        15 (16.97)     54 (44.93)      14 (11.15)     10 (19.95)       93
    Green         5 (11.68)     29 (30.92)      14 (7.68)      16 (13.73)       64
    Total       108            286              71            127              592

Then
$$\chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(68 - 40.14)^2}{40.14} + \frac{(119 - 106.28)^2}{106.28} + \cdots + \frac{(16 - 13.73)^2}{13.73} = 138.290.$$
Since $\chi^2 > \chi^2_{(4-1)(4-1),0.05} = 16.919$, reject $H_0$ and conclude that eye color and hair color are associated and are not independent.

9.33 (a) $H_0: p_{ij} = p_{i\cdot}p_{\cdot j}$ for all $i,j$, where $i$ refers to the opinion on full evacuation and $j$ refers to the distance from Three Mile Island.

(b) Using $\hat{e}_{ij} = \dfrac{n_{i\cdot}n_{\cdot j}}{n}$, the results are summarized below:

[Table of observed and expected counts by opinion on full evacuation and distance (in miles) from Three Mile Island (1-3, 4-6, 7-9, 10-12, 13-15, 15+); $n = 150$.]

Then
$$\chi^2 = \sum_{i,j} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} = \frac{(7 - 7.04)^2}{7.04} + \frac{(11 - 9.68)^2}{9.68} + \cdots + \frac{(39 - 38.08)^2}{38.08} = 0.449.$$

Chapter 10 Solutions

Solutions to Section 10.1

10.1 (a) Theoretical and deterministic. (b) Empirical and probabilistic. (c) Empirical and probabilistic.

10.2 (a) Empirical and probabilistic. (b) Empirical and probabilistic. (c) Theoretical and deterministic.

10.3 Experimental Study: Chemical reaction, where the temperature is controlled at various settings to determine its effect on yield. Observational Study: Model income as a function of years of education for a sample of workers.

Solutions to Section 10.2

10.4 (a)

[Scatterplot of NEXT vs. LAST]

This shows a positive approximately linear relationship.

(b) Using $\bar{x} = 3.238$ and $\bar{y} = 62.714$,
$$S_{xy} = \sum x_i y_i - n\bar{x}\bar{y} = [2.0 \times 50 + 1.8 \times 57 + \cdots + 4.3 \times 72] - 21(3.238)(62.714) = 217.629,$$
$$S_{xx} = \sum x_i^2 - n\bar{x}^2 = [(2.0)^2 + (1.8)^2 + \cdots + (4.3)^2] - 21(3.238)^2 = 22.230,$$
$$S_{yy} = \sum y_i^2 - n\bar{y}^2 = [(50)^2 + (57)^2 + \cdots + (72)^2] - 21(62.714)^2 = 2844.286.$$
Then
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{217.629}{22.230} = 9.790, \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x} = 62.714 - 9.790 \times 3.238 = 31.013.$$
Therefore, the least squares regression line is $\hat{y} = 31.013 + 9.790x$. For a previous eruption of 3 minutes, the predicted time to the next eruption would be $\hat{y} = 31.013 + 9.790(3) = 60.383$.

(c) Since $SSR = \hat{\beta}_1^2 S_{xx} = (9.790)^2 \times 22.230 = 2130.599$, and $SST = S_{yy}$, then
$$r^2 = \frac{SSR}{SST} = \frac{2130.599}{2844.286} = 0.749,$$
so that approximately 75% of the variability in NEXT is accounted for by LAST. LAST appears to be a pretty good predictor of NEXT.

(d) Since
$$MSE = \frac{SSE}{n - 2} = \frac{SST - SSR}{n - 2} = \frac{2844.286 - 2130.599}{21 - 2} = 37.562,$$
the estimate of $\sigma$ is $\hat{\sigma} = \sqrt{37.562} = 6.129$.

10.5 (a)

[Scatterplot of jump length by year]

This shows a positive approximately linear relationship.

(b) From Minitab, the least squares regression line is $\hat{y} = -62.312 + 0.040x$.

(c) From Minitab, the MSE is 0.104, so the estimate of $\sigma$ is $\hat{\sigma} = \sqrt{0.104} = 0.322$.

10.6 (a)

[Scatterplot of boiling point vs. barometric pressure]

This shows a positive approximately linear relationship.

(b) From Minitab, the least squares regression line is $\hat{y} = 155.296 + 1.902x$. Since $r^2 = 0.994$, approximately 99% of the variation in the boiling point is accounted for by linear regression on the barometric pressure.

(c) From Minitab, the MSE is 0.197, so the estimate of $\sigma$ is $\hat{\sigma} = \sqrt{0.197} = 0.444$.

10.7 (a)

[Scatterplot of Winning Time by Year]

This shows a negative approximately linear relationship.

(b) From Minitab, the least squares regression line is $\hat{y} = 774.012 - 0.359x$.

(c) From Minitab, the MSE is 1.728, so the estimate of $\sigma$ is $\hat{\sigma} = \sqrt{1.728} = 1.315$.

10.8 Since
$$Q = \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2,$$
to minimize $Q$ we solve
$$\frac{dQ}{d\beta_1} = -2\sum_{i=1}^{n} x_i(y_i - \beta_1 x_i) = 0 \quad \text{or} \quad \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i,$$
yielding
$$\hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2}.$$

Solutions to Section 10.3

10.9 (a) Since
$$SE(\hat{\beta}_1) = \frac{s}{\sqrt{S_{xx}}} = \frac{0.3227}{\sqrt{18153}} = 0.00247,$$
the test statistic for $H_0: \beta_1 = 0$ is
$$t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)} = \frac{0.040}{0.00247} = 16.312.$$
Since $t > t_{19,0.05} = 1.729$, we reject $H_0$ and conclude that there is a significant linear trend.

(b) The predicted value is $\hat{y}^* = -62.3 + 0.0403 \times 2004 = 18.461$. Then a 95% PI is given by
$$\hat{y}^* \pm t_{21-2,0.025}\, s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 18.461 \pm 2.093(0.3227)\sqrt{1 + \frac{1}{21} + \frac{(2004 - 1947.429)^2}{18153}} = 18.461 \pm 0.747 = [17.714, 19.208].$$
This is unreliable because we are extrapolating beyond the domain of the data. A 95% CI for the winning jump does not have a meaningful interpretation because there will only be one winning jump in 2004, and we are not concerned about the average winning jump in 2004 if we were to hold the competition over and over again.

10.10 (a) The Minitab output is shown below:

    Regression Analysis

    The regression equation is
    Temp = 155 + 1.90 Pressure

    Predictor       Coef      StDev        T        P
    Constant     155.296      0.927   167.47    0.000
    Pressure     1.90178    0.03676    51.74    0.000

    S = 0.4440     R-Sq = 99.4%     R-Sq(adj) = 99.4%

The predicted value at $x = 28$ is $\hat{\mu}^* = 155.296 + 1.902 \times 28 = 208.546$. Then a 95% CI for $\mu^*$ is given by
$$\hat{\mu}^* \pm t_{17-2,0.025}\, s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 208.546 \pm 2.131(0.444)\sqrt{\frac{1}{17} + \frac{(28 - 25.059)^2}{145.886}} = 208.546 \pm 0.325 = [208.221, 208.871].$$

(b) The predicted value at $x = 31$ is $\hat{\mu}^* = 155.296 + 1.902 \times 31 = 214.258$. Then a 95% CI for $\mu^*$ is given by
$$\hat{\mu}^* \pm t_{17-2,0.025}\, s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 214.258 \pm 2.131(0.444)\sqrt{\frac{1}{17} + \frac{(31 - 25.059)^2}{145.886}} = 214.258 \pm 0.519 = [213.739, 214.777].$$
This is wider than the CI at $x = 28$ because $x = 31$ is a point of extrapolation outside the range of the data, where the regression line is less reliable.
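The interval half-widths in 10.9(b) and 10.10 can be checked numerically. The sketch below is only an illustration: the function name `interval_half_width` is hypothetical, and the critical values (2.093 and 2.131) and summary statistics are copied from the solutions above rather than computed from the raw data.

```python
import math


def interval_half_width(t_crit, s, n, x_star, x_bar, s_xx, prediction=False):
    """Half-width of a CI for the mean response at x_star.

    With prediction=True the extra "1 +" term is included, giving the
    half-width of a prediction interval for a single new observation.
    """
    leverage = 1.0 / n + (x_star - x_bar) ** 2 / s_xx
    if prediction:
        leverage += 1.0
    return t_crit * s * math.sqrt(leverage)


# 10.9(b): 95% PI at x* = 2004 (values taken from the solution text).
pi_2004 = interval_half_width(2.093, 0.3227, 21, 2004, 1947.429, 18153,
                              prediction=True)

# 10.10: 95% CIs at x* = 28 and x* = 31 (values taken from the solution text).
ci_28 = interval_half_width(2.131, 0.444, 17, 28, 25.059, 145.886)
ci_31 = interval_half_width(2.131, 0.444, 17, 31, 25.059, 145.886)

print(round(pi_2004, 3), round(ci_28, 3), round(ci_31, 3))
```

The half-width grows with $(x^* - \bar{x})^2$, which is why the interval at $x = 31$ is wider than the one at $x = 28$.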
10.11 (a) The Minitab output is shown below:

    Regression Analysis

    The regression equation is
    NEXT = 31.0 + 9.79 LAST

    Predictor       Coef      StDev        T        P
    Constant      31.013      4.417     7.02    0.000
    LAST           9.790      1.300     7.53    0.000

    R-Sq = 74.9%     R-Sq(adj) = 73.6%

    Analysis of Variance

    Source            DF        SS        MS        F        P
    Regression         1    2130.6    2130.6    56.72    0.000
    Residual Error    19     713.7      37.6
    Total             20    2844.3

    Predicted Values

      Fit   StDev Fit         95.0% CI               95.0% PI
    60.38        1.37    ( 57.51,  63.26)     ( 47.24,  73.53)    for x=3
    40.80        3.20    ( 34.10,  47.51)     ( 26.33,  55.28)    for x=1

A 95% PI at $x = 3$ is $[47.24, 73.53]$.

(b) From the output above, a 95% CI at $x = 3$ is $[57.51, 63.26]$. This is narrower than the PI, because it is a confidence interval for the average of all future observations at $x = 3$, and not a single future observation.

(c) A 95% PI at $x = 1$ is $[26.33, 55.28]$. We would not expect this PI to be reliable because it extrapolates beyond the domain of the data.

10.12 (a) The Minitab output is shown below:

From the regression output, $S = 0.1246$, and
$$S_{xx} = \frac{\hat{\sigma}^2}{\widehat{\text{Var}}(\hat{\beta}_1)} = \frac{(0.1246)^2}{(0.0137)^2} = 82.717.$$
Then a 95% PI for 1995 is
$$7.95 \pm 2.306(0.1246)\sqrt{1 + \frac{1}{10} + \frac{(15 - 5.5)^2}{82.717}} = 7.95 \pm 0.425,$$
or $[7.525, 8.375]$.

10.14 Write $\bar{Y} = \frac{1}{n}\sum Y_i$ and $\hat{\beta}_1 = \sum c_i Y_i$, where $c_i = (x_i - \bar{x})/S_{xx}$ and $\sum c_i = 0$. Since the $Y_i$ are independent, $\text{Cov}(Y_i, Y_j) = 0$ for $i \neq j$, so
$$\text{Cov}(\bar{Y}, \hat{\beta}_1) = \sum_i \sum_j \frac{1}{n} c_j\, \text{Cov}(Y_i, Y_j) = \frac{1}{n}\sum_i c_i\, \text{Var}(Y_i) = \frac{\sigma^2}{n}\sum_i c_i = 0.$$
Since $\bar{Y}$ and $\hat{\beta}_1$ are both normally distributed (as linear functions of normal random variables), a correlation of 0 implies that they are independent.

Solutions to Section 10.4

10.15 Since
$$p = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}},$$
then $p + pe^{\beta_0 + \beta_1 x} = e^{\beta_0 + \beta_1 x}$ and
$$\frac{p}{1 - p} = e^{\beta_0 + \beta_1 x}.$$
Then the linearizing transformation is
$$h(p) = \log_e \frac{p}{1 - p} = \beta_0 + \beta_1 x.$$

10.16 (a)

[Scatterplots of p(x) vs. 10,000/x, p(x) vs. 1,000/sqrt(x), and p(x) vs. 1/log10(x)]

The plot of $p(x)$ vs. $1/\log_{10} x$ appears the most linear.

(b) The Minitab output is shown below:

    Regression Analysis

    The regression equation is
    p(x) = 0.0152 + 0.404 1/log(x)

    Predictor        Coef       StDev        T        P
    Constant     0.015188    0.007720     1.97    0.085
    1/log(x)      0.40419     0.01961    20.61    0.000

    R-Sq = 98.2%     R-Sq(adj) = 97.9%

    Analysis of Variance

    Source            DF         SS         MS         F        P
    Regression         1    0.11303    0.11303    424.78    0.000
    Residual Error     8    0.00213    0.00027
    Total              9    0.11516

From the Minitab output, $\hat{\beta}_1 = 0.4042$. To test whether it is significantly different from $\log_{10} e = 0.4343$, compute the test statistic for testing $H_0: \beta_1 = 0.4343$,
$$t = \frac{\hat{\beta}_1 - 0.4343}{SE(\hat{\beta}_1)} = \frac{0.4042 - 0.4343}{0.0196} = -1.535.$$
Since $|t| < t_{8,0.025} = 2.306$, we do not reject $H_0$ and conclude that $\hat{\beta}_1$ does not differ significantly from the theoretical value, $\log_{10} e = 0.4343$.

(c) To verify the Prime Number Theorem, we must first test whether $\beta_0$ differs significantly from 0. From the Minitab output above, the P-value is $0.085 > \alpha = 0.05$, so we cannot reject $\beta_0 = 0$. Therefore, taking $\beta_0 \approx 0$, the approximate relationship between $p(x)$ and $x$ is
$$p(x) \approx \frac{\log_{10} e}{\log_{10} x} = \frac{1}{\log_e x}.$$

10.17 (a)

[Scatterplot of p vs. ln(t)]

This plot of the transformed data shows a negative linear relationship.

(b) The Minitab output is shown below:

    Regression Analysis

    The regression equation is
    p = 0.846 - 0.0792 ln(t)

    The regression equation is
    Log(Distance) = 3.12 + 0.521 Number

    Predictor       Coef      StDev        T        P
    Constant      3.1231     0.1102    28.33    0.000
    Number       0.52107    0.01776    29.33    0.000

    S = 0.1614     R-Sq = 99.1%     R-Sq(adj) = 99.0%

    Analysis of Variance

    Source            DF        SS        MS         F        P
    Regression         1    22.400    22.400    860.38    0.000
    Residual Error     8     0.208     0.026
    Total              9    22.608

(c) Using the regression equation, log(Distance) = 3.12 + 0.521(Number), for planet X, with planet number 11, the predicted distance would be
$$\text{Distance} = \exp\{3.12 + 0.521(11)\} = 6981.4.$$
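The back-transformed prediction for planet X in part (c) above, and the slope test against $\log_{10} e$ in 10.16(b), can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function names `predicted_distance` and `slope_t_statistic` are hypothetical, and the coefficients and standard error are the rounded values quoted from the Minitab output.

```python
import math


def predicted_distance(planet_number, b0=3.12, b1=0.521):
    """Back-transform a fitted log-linear model: Distance = exp(b0 + b1 * Number)."""
    return math.exp(b0 + b1 * planet_number)


def slope_t_statistic(estimate, hypothesized, std_error):
    """t statistic for H0: beta1 = hypothesized."""
    return (estimate - hypothesized) / std_error


# Planet X, planet number 11 (coefficients as rounded in the text).
distance_x = predicted_distance(11)

# 10.16(b): test the fitted slope against log10(e) = 0.4343.
t_slope = slope_t_statistic(0.4042, 0.4343, 0.0196)

print(round(distance_x, 1), round(t_slope, 2))
```

Because the model is fitted on the log scale, the prediction is obtained by exponentiating the fitted value rather than by fitting Distance directly.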
10.19 The original data and the transformation $1/\text{speed}^2$ appear below:

[Scatterplot of Speed vs. Distance from Sun; Scatterplot of 1/(Speed)^2 vs. Distance from Sun]

This transformation appears to have linearized the relationship, suggesting that the appropriate transformation is $h(\text{speed}) = 1/\text{speed}^2$.

10.20 (a) The Minitab output from fitting a LS straight line to the original data is below:

    Regression Analysis

    The regression equation is
    y = - 70.7 + 4.14 x

    Predictor       Coef      StDev        T        P
    Constant      -70.65      22.88    -3.09    0.015
    x             4.1350     0.5394     7.67    0.000

    R-Sq = 88.0%     R-Sq(adj) = 86.5%

    Analysis of Variance

    Source            DF       SS       MS        F        P
    Regression         1    34196    34196    58.77    0.000
    Residual Error     8     4655      582
    Total              9    38852

This LS fit results in the residual plot against speed below:

[Plot of Residuals against Speed, x]

(b) The residual plot above appears curved, indicating that a transformation of the data is needed. Also, the spread of the residuals appears to increase with larger speed, violating the assumption of homoscedasticity, or constant variance. However, note that $r^2 = 0.880$, indicating that a linear fit explains part of the variation in braking distance.

(c) Since (Kinetic Energy) $\propto$ (Speed)$^2$ and (Braking Energy) $\propto$ (Distance), we might fit a linear relationship between Distance and (Speed)$^2$.

(d) Fitting the equation Distance $= \beta_0 + \beta_1(\text{Speed})^2$, Minitab gives the output below:

    Regression Analysis

    The regression equation is
    y = 1.62 + 0.0517 x^2

    Predictor        Coef       StDev       T        P
    Constant         1.62       12.92    0.13    0.903
    x^2          0.051739    0.006056    8.54    0.000

    R-Sq = 90.1%     R-Sq(adj) = 88.9%
    Analysis of Variance

    Source            DF       SS       MS        F        P
    Regression         1    35014    35014    72.98    0.000
    Residual Error     8     3838      480
    Total              9    38852

The residual plot against Speed is below:

[Plot of New Residuals against Speed, x]

This residual plot no longer has the nonlinear pattern seen before, but it still shows the variability of the residuals increasing with speed. This indicates that the variance is not constant, but increases with $E(Y)$. The predicted stopping distance at a speed of 40 mph is $1.62 + 0.0517(40)^2 = 84.34$ feet.

10.21 (a)

[Scatterplot of log(gestation time) vs. Weight (kg)]

This plot of the transformed data shows a positive approximately linear relationship.

(c) The Minitab output is shown below:

    Regression Analysis

    The regression equation is
    log(t) = 5.28 + 0.0104 Weight

    Predictor        Coef       StDev        T        P
    Constant      5.27880     0.08817    59.87    0.000
    Weight       0.010411    0.001717     6.06    0.000

    S = 0.2163     R-Sq = 80.3%     R-Sq(adj) = 78.1%

    Analysis of Variance

    Source            DF        SS        MS        F        P
    Regression         1    1.7194    1.7194    36.75    0.000
    Residual Error     9    0.4211    0.0468
    Total             10    2.1405

(d) For a weight of 1.2 kg, the predicted gestation time is $t = \exp\{5.279 + 0.0104(1.2)\} = 198.600$.

10.24 (a) After taking the $\log_e$ transformation, $\beta_0 = \log_e a$ and $\beta_1 = b$.

(b)

[Scatterplot of log(h) vs. log(l); x-axis: log(Length of Stay)]

This plot of the transformed data shows a positive approximately linear relationship.

(c) The Minitab output is shown below:
    The regression equation is
    log(h) = 7.09 + 0.691 log(l)

    Predictor       Coef      StDev        T        P
    Constant      7.0916     0.2554    27.76    0.000
    log(l)       0.69096    0.09975     6.93    0.000

    S = 0.5538     R-Sq = 60.7%     R-Sq(adj) = 59.5%

    Analysis of Variance

    Source            DF        SS        MS        F        P
    Regression         1    14.716    14.716    47.98    0.000
    Residual Error    31     9.508     0.307
    Total             32    24.225

(d) For a hospital stay of 3 days, the predicted hospital cost would be $h = \exp\{7.09 + 0.691 \log 3\} = 2563$.

10.25 (a) Since
$$\text{Var}(\hat{p}) = \frac{p(1 - p)}{n},$$
then $g(p) = \sqrt{\dfrac{p(1 - p)}{n}}$. The appropriate transformation is
$$h(p) = \int \frac{dp}{g(p)} = \int \frac{\sqrt{n}\, dp}{\sqrt{p(1 - p)}} = 2\sqrt{n}\, \sin^{-1}\sqrt{p}.$$

Chapter 11 Solutions

Solutions to Section 11.2

11.1 $Q = \sum (y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2)^2$. To minimize this, set the partial derivatives equal to 0,
$$\frac{\partial Q}{\partial \beta_0} = -2\sum (y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2) = 0,$$
$$\frac{\partial Q}{\partial \beta_1} = -2\sum x_i(y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2) = 0,$$
$$\frac{\partial Q}{\partial \beta_2} = -2\sum x_i^2(y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2) = 0.$$
From the first equation,
$$\sum y_i = n\beta_0 + \beta_1 \sum x_i + \beta_2 \sum x_i^2.$$
From the second equation,
$$\sum x_i y_i = \beta_0 \sum x_i + \beta_1 \sum x_i^2 + \beta_2 \sum x_i^3.$$
From the third equation,
$$\sum x_i^2 y_i = \beta_0 \sum x_i^2 + \beta_1 \sum x_i^3 + \beta_2 \sum x_i^4.$$
These are the normal equations.

11.2 The fitted model is $\hat{y} = -1.571 + 0.02573\,\text{Verbal} + 0.03361\,\text{Math}$. $r^2 = 0.681$, so 68.1% of the variability in GPA is accounted for by math and verbal scores.

11.3 The fitted model is $\hat{y} = 111.354 + 2.060x_1 - 2.732x_2 + 0.000x_3$. $r^2 = 0.295$, so 29.5% of the variability in PIQ is accounted for by the brain size, height, and weight of a person.

11.4 (a)

[Scatterplot of Y vs. x1 (Alkalinity); Scatterplot of log Y vs. log x1 (Alkalinity)]
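The normal equations of 11.1 form a 3 x 3 linear system in $\beta_0, \beta_1, \beta_2$ and can be solved numerically. The sketch below is an illustration only, not the manual's method: `fit_quadratic` is a hypothetical helper that builds the sums $\sum x_i^k$ and $\sum x_i^k y_i$ and solves the system by Gauss-Jordan elimination, and the data are made up (generated exactly from $y = 1 + 2x + 3x^2$) so the recovered coefficients can be verified.

```python
def fit_quadratic(xs, ys):
    """Solve the normal equations for y = b0 + b1*x + b2*x^2 by Gauss-Jordan elimination."""
    # s[k] = sum of x_i^k for k = 0..4 (s[0] equals n); t[k] = sum of x_i^k * y_i.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]

    # Augmented matrix: row i is [s[i], s[i+1], s[i+2] | t[i]].
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]

    for col in range(3):
        # Partial pivoting: move the largest remaining entry into the pivot row.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        # Eliminate this column from every other row.
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]

    return [a[i][3] / a[i][i] for i in range(3)]


# Hypothetical data lying exactly on y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 for x in xs]
b0, b1, b2 = fit_quadratic(xs, ys)
print(b0, b1, b2)
```

For real data the system can be badly conditioned when the $x_i$ are large, so a library least-squares routine is usually preferable; the hand-rolled solver here only mirrors the derivation.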
© Copyright 2024