SMAM 319      Exam 1  Name______________________ 

1. Pick the best choice for the multiple choice questions below. (10 points, 2 each)

A. __b__ In Metropolis there are some houses for sale. Superman and Lois Lane are interested in the average cost of houses in Metropolis. A random sample of five houses that are on the market has asking prices $101,400, $131,200, $98,400, $562,600, $101,400. The best estimator of the average cost of houses in Metropolis is the
a. mean  b. median  c. standard deviation  d. mode  e. range

B. __d__ The five number summary for a set of data is 10, 12, 14, 16, 24. Based on this information the data set has
a. no outliers  b. exactly one outlier  c. at most one outlier  d. at least one outlier  e. two outliers

C. __d__ The percentage of variation accounted for by a linear regression model is 64%. The least square equation has a negative slope. The correlation coefficient is
a. 0.64  b. −0.64  c. 0.8  d. −0.8  e. 64

D. __b__ Two variables have a correlation coefficient of −0.2. That indicates that
a. there is no relationship between the two variables
b. there is a weak negative linear relationship between the two variables
c. there is a strong positive linear relationship between the two variables
d. there is a strong negative linear relationship between the two variables
e. the least square equation is a perfect fit

E. __a__ The mean of a normally distributed set of data is 15 and the standard deviation is 2. According to the empirical rule about 95% of the data points should lie between the numbers
a. 11 and 19  b. 13 and 17  c. 9 and 21  d. 0 and 16  e. 13 and infinity

2. Given the set of numbers 1, 2, 2, 2, 3, 5, 7.
A. Use the appropriate buttons on your calculator to find the sample (15 points, 3 per numbered part)
(1) mean  x̄ = 3.143
(2) standard deviation  s = 2.116
B. What is the
(1) median  median = 2
(2) mode  mode = 2
(3) range  range = 6

3.
The data in the ordered stem and leaf display below is for the number of grams of fat in 20 sandwiches served at McDonald's.

0 | 8 9
1 | 0 2 4
1 | 6 6 7 8 9
2 | 1 3 3 4
2 | 6 6 8 8 9
3 |
3 |
4 | 2
A. Make a five number summary. (5 points)
Minimum = 8, Q1 = 15, median = 20, Q3 = 26, maximum = 42
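The five number summary can be checked with a short script. This is only a sketch: the list `fat` is my reading of the stem and leaf display (stems as tens, leaves as ones), and the helper name `five_number_summary` is made up for illustration. The quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one common textbook convention; other software may use slightly different quartile rules.

```python
from statistics import median

# Fat content (g) of the 20 sandwiches, read off the stem and leaf display.
# Assumption of this sketch: stems are tens and leaves are ones.
fat = [8, 9, 10, 12, 14, 16, 16, 17, 18, 19,
       21, 23, 23, 24, 26, 26, 28, 28, 29, 42]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # observations below the median position
    upper = data[len(data) - half:]   # observations above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

summary = five_number_summary(fat)
print(summary)
```

With 20 observations each half holds 10 values, so Q1 and Q3 are each the average of two neighbouring observations.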
B. Find the interquartile range and use it to determine whether there are any outliers. (5 points)
IQR = 26 − 15 = 11 and 1.5 × 11 = 16.5. The fences are 15 − 16.5 = −1.5 and 26 + 16.5 = 42.5. Every observation lies between them, so there are no outliers.
C. Draw a boxplot. (5 points)
D. Would the empirical rule apply to this data? Explain why or why not. (4 points)
The empirical rule would not apply because the data is not normally distributed.
E. x̄ = 20.45, s = 8.29. How many observations are at most 1.25 standard deviations from the mean? (3 points)
20.45 ± 1.25(8.29) gives the interval (10.1, 30.8), which contains 16 observations.
F. According to Chebyshev's rule at least how many observations should be at most 1.25 standard deviations from the mean? (3 points)
1 − 1/(1.25)² = 0.36. At least 36% of the observations, or at least 7 observations.

4. An examination in elementary statistics for Section 1 has an average grade of 80 with a standard deviation of 5. The examination in Section 2 has an average grade of 70 with a standard deviation of 8.
A. Tom in Section 1 earns a 78 on the exam. What is his Z score? (5 points)
Z = (78 − 80)/5 = −0.4
B. Robert in Section 2 earns 72 on his exam. What is his Z score? (5 points)
Z = (72 − 70)/8 = 0.25
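The two Z scores above follow the same formula, (score − section mean) / section standard deviation. A minimal sketch, with the function name `z_score` chosen only for illustration:

```python
def z_score(value, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) its mean."""
    return (value - mean) / sd

# Section 1: mean 80, SD 5. Section 2: mean 70, SD 8.
tom = z_score(78, mean=80, sd=5)
robert = z_score(72, mean=70, sd=8)
print(tom, robert)
```

Because each score is measured against its own section, the two results are on a common scale and can be compared directly.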
C. Who does better relative to his class? Explain. (5 points)
Robert does better because he has the higher z score (0.25 versus −0.4).

5. A data set has regression equation y = 4.6 − 1.1x.
A. Would the correlation coefficient be positive or negative? Explain. (5 points)
The correlation coefficient and the slope have the same sign. The correlation coefficient would be negative.
B. The correlation coefficient is −0.9. What percentage of the variation is accounted for by the regression model? (5 points)
(−0.9)² = 0.81, so 81%.
C. The observed value of y when x = 2 is y = 2.6. What is the residual when x = 2? (5 points)
Predicted y = 4.6 − 1.1(2) = 2.4, so residual = 2.6 − 2.4 = 0.2.

6. Classified ads in the Ithaca Journal offered several used Toyota Corollas for sale. Listed below are the ages of the cars and the advertised prices.
Row  Age(yr)  Advertised Price($)
  1     1          13990
  2     1          13495
  3     3          12999
  4     4           9500
  5     4          10495
  6     5           8995
  7     5           9495
  8     6           6999
  9     7           6950
 10     7           7850
 11     8           6999
 12     8           5995
 13    10           4950
 14    10           4495
 15    13           2850
The following Minitab output was generated.

Regression Analysis: Price Advertised($) versus Age(yr)

The regression equation is
Price Advertised($) = 14286 - 959 Age(yr)

Predictor      Coef  SE Coef       T      P
Constant    14285.9    448.7   31.84  0.000
Age(yr)     -959.05    64.58  -14.85  0.000

S = 816.214   R-Sq = 94.4%   R-Sq(adj) = 94.0%

Analysis of Variance

Source           DF         SS         MS       F      P
Regression        1  146917777  146917777  220.53  0.000
Residual Error   13    8660659     666205
Total            14  155578436

Unusual Observations

                   Price
Obs  Age(yr)  Advertised($)    Fit  SE Fit  Residual  St Resid
  3      3.0          12999  11409     292      1590     2.09R

R denotes an observation with a large standardized residual.
MTB > print c1 c2
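The coefficients, S and R-Sq in the printout can be reproduced from the listed ages and prices with the usual least squares formulas. The sketch below is plain Python, not Minitab's own procedure; the pairing of ages and prices is my reading of the data listing, and the function name `least_squares` is hypothetical.

```python
from math import sqrt

# Ages (yr) and advertised prices ($) for the 15 cars, rows 1 to 15,
# as read from the data listing (an assumption of this sketch).
ages = [1, 1, 3, 4, 4, 5, 5, 6, 7, 7, 8, 8, 10, 10, 13]
prices = [13990, 13495, 12999, 9500, 10495, 8995, 9495, 6999,
          6950, 7850, 6999, 5995, 4950, 4495, 2850]

def least_squares(x, y):
    """Return (slope, intercept, r_squared, s) for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = syy - slope * sxy                    # residual (error) sum of squares
    r_squared = 1 - sse / syy
    s = sqrt(sse / (n - 2))                    # residual standard error, n - 2 df
    return slope, intercept, r_squared, s

slope, intercept, r_squared, s = least_squares(ages, prices)
print(round(slope, 2), round(intercept, 1), round(r_squared, 3), round(s, 3))
```

Here `syy` corresponds to the Total line of the Analysis of Variance table and `sse` to the Residual Error line.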
A. Fill in the blanks below using the computer printout and your knowledge of Statistics.
(1) The least square equation has slope __−959__ and y intercept __14286__. The least square equation is __y = 14286 − 959x__. (6 points, 2 each)
(2) The percentage of variation accounted for by the regression model is __94.4%__. As a result the correlation coefficient is __−0.972__, the negative square root of 0.944, since the slope is negative. (6 points: 2, 4)
B. Predict the price of an 8 year old car. (4 points)
y = 14286 − 959(8) = 6614
C. Suppose I asked you to find the residual for a 4 year old car.
(1) What difficulty would you encounter? (2 points)
The values of y differ between the replicated observations at x = 4 (the data list two 4 year old cars, at $9500 and $10495).
(2) Suggest a way around this difficulty. (2 points)
Obtain the mean of the y values for x = 4. Subtract this from the observed value.
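The arithmetic in parts A(2), B and C can be sketched as follows. The code uses the rounded equation y = 14286 − 959x from the printout and the two 4 year old cars from the data listing. The function name `predicted_price` is hypothetical, and showing one residual per car is just one way to illustrate the difficulty in part C.

```python
from math import sqrt

def predicted_price(age):
    """Fitted price from the rounded least squares equation y = 14286 - 959x."""
    return 14286 - 959 * age

# Part A(2): r takes the sign of the slope, so r = -sqrt(R-Sq).
r = -sqrt(0.944)

# Part B: predicted price of an 8 year old car.
price_at_8 = predicted_price(8)

# Part C: two cars share x = 4, so there are two residuals (observed - fitted).
fit_at_4 = predicted_price(4)
residuals_at_4 = [observed - fit_at_4 for observed in (9500, 10495)]

print(round(r, 4), price_at_8, fit_at_4, residuals_at_4)
```

The two residuals at age 4 have opposite signs, which shows why a single "residual for a 4 year old car" is not well defined without first deciding which observed value to use.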