Name: __________________________ Date: _____________
1. A study is conducted to determine if one can predict the yield of a crop based on the
amount of yearly rainfall. The response variable in this study is
A) the yield of the crop.
B) the amount of yearly rainfall.
C) the experimenter.
D) either bushels or inches of water.
E) the month the crop is harvested.
2. A researcher is interested in determining if one can predict the score a student gets on a
statistics exam from the amount of time the student spends studying for the exam. In this
study, the explanatory variable is
A) the researcher.
B) the students taking the exam.
C) the score on the exam.
D) the fact that this is a statistics exam.
E) the amount of time spent studying for the exam.
3. When creating a scatterplot, one should
A) use only positive values of the explanatory variable.
B) use the horizontal axis for the explanatory variable.
C) use a different plotting symbol depending on whether the explanatory variable is
categorical or the response variable is categorical.
D) use a plotting scale that makes the overall trend roughly linear.
E) use the horizontal axis for the response variable.
Use the following to answer questions 4-5:
A researcher measures the height (in feet) and volume of usable lumber (in cubic feet) of 32
cherry trees. The goal is to determine if the volume of a tree's usable lumber can be estimated
from the height of the tree. The results are plotted below.
4. In the study above, the response variable is
A) number of trees.
B) volume.
C) height or volume; it doesn't matter which is considered the response variable.
D) neither height nor volume; the measuring instrument used to measure height is the
response variable.
E) height.
5. The scatterplot above suggests that
A) there is a positive association between height and volume.
B) there is an outlier in the plot.
C) both A and B.
D) neither A nor B.
E) the relationship between height and volume is nonlinear.
6. At a large university, the office responsible for scheduling classes notices that demand is
low for classes that meet before 10:00 AM or after 3:00 PM and is high for classes that
meet between 10:00 AM and 3:00 PM. Which of the following may we conclude?
A) There is an association between demand for classes and the time the classes meet.
B) The association between demand for classes and time for classes is linear.
C) There is a negative association between demand for classes and the time the classes
meet.
D) There is no association between demand for classes and the time the classes meet.
E) There is a positive association between demand for classes and the time the classes
meet.
7. The graph below plots the gas mileage (miles per gallon) of various 1978 model cars
versus the weight of these cars in thousands of pounds.
In the graph, the points denoted by the plotting symbol x correspond to cars made in
Japan. From this plot, we may conclude that
A) in 1978 there was little difference between Japanese cars and cars made in other
countries.
B) in 1978 Japanese cars tended to be lighter in weight than other cars.
C) in 1978 Japanese cars tended to get poorer gas mileage than other cars.
D) there is a positive association between weight and gas mileage for Japanese cars.
E) the plot is invalid. A scatterplot is used to represent quantitative variables, and the
country that makes a car is a qualitative variable.
8. Volunteers for a research study were divided into three groups. Group 1 listened to
Western religious music, group 2 listened to Western rock music, and group 3 listened
to Chinese religious music. The blood pressure of each volunteer was measured before
and after listening to the music, and the change in blood pressure (blood pressure before
listening minus blood pressure after listening) was recorded. To explore the relationship
between type of music listened to and change in blood pressure, we could
A) see if blood pressure decreases as type of music increases by examining a
scatterplot.
B) make a histogram of the change in blood pressure for all of the volunteers.
C) make side-by-side boxplots of the change in blood pressure, with a separate
boxplot for each group.
D) make a pie chart displaying the distribution of type of music listened to for all of
the volunteers.
E) do all of the above.
9. A school guidance counselor examines the number of extracurricular activities of
students and their grade point average. The guidance counselor says, “The evidence
indicates that the correlation between the number of extracurricular activities a student
participates in and his or her grade point average is close to zero.” A correct
interpretation of this statement would be that
A) active students tend to be students with poor grades, and vice versa.
B) students with good grades tend to be students that are not involved in many
activities, and vice versa.
C) students involved in many extracurricular activities are just as likely to get good
grades as bad grades. The same is true for students involved in few extracurricular
activities.
D) as a student becomes more involved in extracurricular activities, there will be a
change in his/her grades.
E) involvement in many extracurricular activities and good grades go hand in hand.
10. A student wonders if people of similar heights tend to date each other. She measures
herself, her dormitory roommate, and the women in the adjoining rooms; then she
measures the next man each woman dates. Here are the data (heights in inches):
Women   66   64   66   65   70   65
Men     72   68   70   68   74   69
Which of the following statements is true?
A) The variables measured are all categorical.
B) There is a strong negative association between the heights of men and women,
since the women are always smaller than the men they date.
C) Tall women tend to date short men.
D) Any height above 70 inches must be considered an outlier.
E) There is a positive association between the heights of men and women who date
each other.
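Worked check for question 10: a minimal Python sketch that computes the correlation for the six couples. It assumes the heights pair up in the order listed (first woman with first man, and so on); the helper name pearson_r is ours and is not part of the test.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Heights in inches; pairing by position is an assumption.
women = [66, 64, 66, 65, 70, 65]
men = [72, 68, 70, 68, 74, 69]

r = pearson_r(women, men)
print(f"r = {r:.3f}")  # the sign of r gives the direction of the association
```

The fact that every woman is shorter than her date does not affect the sign of r; only how the two heights vary together matters.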
11. Which of the following statements about the correlation coefficient is true?
A) The correlation coefficient measures the proportion of variability between the two
variables.
B) The correlation coefficient will be equal to 1 only if all the data lie on a perfectly
horizontal straight line.
C) The correlation coefficient measures the fraction of outliers that appear in a
scatterplot.
D) The correlation coefficient has no unit of measurement and must always lie
between –1 and 1, inclusive.
E) The correlation coefficient equals the proportion of times two variables lie on a
straight line.
12. A study found a correlation of r = –0.61 between the gender of a worker and his or her
income. We may correctly conclude that
A) women earn more than men on the average.
B) women earn less than men on the average.
C) an arithmetic mistake was made, since correlation must always be positive.
D) this result is incorrect, because computing r makes no sense in this situation.
E) on average, women earn 61% less than men.
13. Consider the scatterplot below.
According to the scatterplot, which of the following is a plausible value for the
correlation coefficient between weight and MPG?
A) 1.0 .
B) 0.9 .
C) 0.5 .
D) 0.2.
E) 0.7.
14. Consider the scatterplot below.
The correlation between X and Y is approximately
A) 0.999.
B) 0.8.
C) 0.5.
D) 0.
E) –0.7.
15. Consider the scatterplot below.
We may conclude that
A) the correlation between X and Y must be close to 1 since there is a nearly perfect
relationship between them.
B) the correlation between X and Y shows a quadratic relationship.
C) the correlation between X and Y is close to 0.
D) the correlation between X and Y could be any number between –1 and 1. Without
knowing the actual values of X and Y we can say nothing more.
E) the correlation between X and Y must be close to –1 since there is a nearly perfect
relation between them, but it is not a straight-line relation.
Use the following to answer questions 16-17:
I wish to determine the correlation between the height (in inches) and weight (in pounds) of
21-year-old males. To do this, I measure the height and weight of two 21-year-old men. The
measured values are
           Height   Weight
Male #1    70       160
Male #2    75       200
16. Referring to the information above, the correlation r computed from the measurements
on these males is
A) equal to 1.
B) positive and between 0.25 and 0.75.
C) near 0, but could be either positive or negative.
D) exactly 0.
E) meaningless, since the slope is greater than 1.
17. Referring to the information above, which of the following units would the correlation
coefficient r have?
A) Inches.
B) Pounds.
C) Pounds per inch.
D) None, because r has no units.
E) Inches-pounds.
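Worked check for questions 16-17: a small Python sketch, under the assumption that r is the usual Pearson correlation. It computes r for the two men and then again after converting to metric units, to illustrate that r carries no units. The helper pearson_r and the conversion factors are ours, not part of the test.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights_in = [70, 75]     # inches
weights_lb = [160, 200]   # pounds

# Two distinct points always lie exactly on a line.
r = pearson_r(heights_in, weights_lb)

# Same men, metric units (2.54 cm per inch, about 0.4536 kg per pound).
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_metric = pearson_r(heights_cm, weights_kg)

print(r, r_metric)
```

With only two observations the result says nothing about 21-year-old males in general; it only reflects that any two points determine a line.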
18. Which of the following is true of the correlation coefficient r?
A) It is a resistant measure of association.
B) It does not change if either all the X-data or all the Y-data are multiplied by a
constant.
C) If r is the correlation between X and Y, then –r is the correlation between Y and X.
D) r can never be 0 if there is a linear relationship between X and Y.
E) All of the above.
19. The scatterplot below is from a small data set.
The data were classified as either type 1 or type 2. Those of type 1 are indicated by o's,
those of type 2 by x's. The overall correlation of the data in this scatterplot is
A) positive.
B) near 0, since the overall data do not show a distinct pattern.
C) near 0, because the o's display a negative trend and the x's display a negative trend,
but the trend from the o's to the x's is positive. The different trends cancel.
D) impossible to compute for such a data set.
E) negative, since the o's display a negative trend and the x's display a negative trend.
20. A scatterplot of a variable Y versus a variable X produced the results below. The value
of Y for all values of X is exactly 1.0.
The correlation between Y and X is
A) 1, because the points lie perfectly on a line.
B) either 1 or –1, because the points lie perfectly on a line.
C) 0, because Y does not change as X increases.
D) impossible to determine, since there is no slope to the data.
E) none of the above.
21. The profits (in multiples of $100,000) versus the sales (in multiples of $100,000) for a
number of companies are plotted below. The correlation between profits and sales is
0.814. Suppose we removed the point that is circled from the data represented in the
plot. The correlation between profits and sales would then be
A) 0.814.
B) significantly larger than 0.814.
C) significantly smaller than 0.814.
D) slightly larger than 0.814.
E) slightly smaller than 0.814.
22. Volunteers for a research study were divided into three groups. Group 1 listened to
Western religious music, group 2 listened to Western rock music, and group 3 listened
to Chinese religious music. The blood pressure of each volunteer was measured before
and after listening to the music, and the change in blood pressure (blood pressure before
listening minus blood pressure after listening) was recorded. A scatterplot of change in
blood pressure versus type of music listened to is given below.
The correlation between change in blood pressure and type of music listened to is
A) negative.
B) positive.
C) first negative, then positive.
D) nearly 0.
E) none of the above.
23. The profits (in multiples of $100,000) versus the sales (in multiples of $100,000) for a
number of companies are plotted below.
Notice that in the plot, profits is treated as the response variable and sales as the
explanatory variable. The correlation between profits and sales is 0.814. Suppose we
had taken sales to be the response variable and profits to be the explanatory variable. In
this case, the correlation between sales and profits would be
A) 0.814.
B) –0.814.
C) 0.
D) any number between –0.814 and 0.814, but we can't state the exact value.
E) 1, since the direction of the data doesn't change.
24. Below is a scatterplot of the calories and sodium content (in milligrams) of several
brands of meat hot dogs. The least-squares regression line has been drawn on the plot.
Based on the least-squares regression line in this scatterplot, one would predict that a
hot dog containing 100 calories would have a sodium content (in milligrams) of about
A) 70.
B) 350.
C) 375.
D) 400.
E) 600.
25. The British government conducts regular surveys of household spending. The average
weekly household spending on tobacco products and alcoholic beverages for each of 11
regions in Great Britain was recorded. A scatterplot of spending on alcohol versus
spending on tobacco is given below.
Which of the following statements is true?
A) The observation (4.5, 6.0) is an outlier.
B) There is clear evidence of a negative association between spending on alcohol and
spending on tobacco.
C) The equation of the least-squares line for this plot would be approximately
y  10  2 x
D) The correlation coefficient for this data is 0.99.
E) The observation in the lower right corner of the plot is influential.
26. The fraction of the variation in the values of y that is explained by the least-squares
regression of y on x is
A) the correlation coefficient.
B) the slope of the least-squares regression line.
C) the square of the correlation coefficient.
D) the intercept of the least-squares regression line.
E) the residual.
27. In a statistics course, a linear regression equation was computed to predict a student's
final exam score from his/her score on the first test. The equation of the least-squares
regression line was
ŷ = 10 + 0.9x
where y represents the final exam score and x is the score on the first exam. Suppose Joe
scores a 90 on the first exam. What would be the predicted value of his score on the
final exam?
A) 91.
B) 90.
C) 89.
D) 81.
E) It cannot be determined from the information given. We also need to know the
correlation coefficient.
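Worked check for question 27: a short Python sketch of plugging a first-test score into the regression line, assuming the line is ŷ = 10 + 0.9x (intercept 10, slope 0.9). The function name predict_final is ours.

```python
def predict_final(first_test_score, intercept=10.0, slope=0.9):
    """Predicted final exam score from the least-squares line."""
    return intercept + slope * first_test_score

joe_prediction = predict_final(90)
print(round(joe_prediction, 1))
```

Only the intercept and slope are needed for a prediction; the correlation coefficient plays no part in the arithmetic.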
28. John's parents recorded his height at various ages up to 66 months. Below is a record of
the results.
Age (months)      36   48   54   60   66
Height (inches)   35   38   41   43   45
Which of the following is the equation of the least-squares regression line of John's
height on age? (NOTE: You do not need to directly calculate the least-squares
regression line to answer this question.)
A) predicted height = 12 × (Age).
B) predicted height = 0.34 + 22.3 × (Age).
C) predicted height = Age/12.
D) predicted height = 60 – 0.22 × (Age).
E) predicted height = 22.3 + 0.34 × (Age).
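Worked check for question 28: the question can be answered by elimination, but a direct least-squares fit is a useful confirmation. The sketch below assumes the five (age, height) pairs read column by column from the record; the function name least_squares is ours.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # line passes through the means
    return intercept, slope

age_months = [36, 48, 54, 60, 66]
height_inches = [35, 38, 41, 43, 45]

intercept, slope = least_squares(age_months, height_inches)
print(f"height = {intercept:.1f} + {slope:.2f} * age")
```

A quick sanity check without any fitting: John grows 10 inches over 30 months, so the slope must be roughly a third of an inch per month.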
29. Foresters use regression to predict the volume of timber in a tree using easily measured
quantities such as diameter. Let y be the volume of timber in cubic feet and x be the
tree's diameter in feet (measured at three feet above ground level). One set of data gives
the following least-squares regression equation:
ŷ = –30 + 60x
The predicted volume of timber in a tree of diameter 18 inches is
A) 1080 cubic feet.
B) 1050 cubic feet.
C) 90 cubic feet.
D) 60 cubic feet.
E) 30 cubic feet.
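Worked check for question 29: the regression equation takes diameter in feet, while the tree is described in inches, so a unit conversion is needed first. A minimal Python sketch, assuming the line ŷ = –30 + 60x; the function name predict_volume is ours.

```python
INCHES_PER_FOOT = 12

def predict_volume(diameter_feet):
    """Predicted timber volume (cubic feet) from diameter in feet."""
    return -30 + 60 * diameter_feet

diameter_feet = 18 / INCHES_PER_FOOT   # convert 18 inches to feet
volume = predict_volume(diameter_feet)
print(diameter_feet, volume)
```

Passing 18 directly into the equation would treat the tree as 18 feet across and give a wildly inflated volume.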
30. A researcher wishes to determine whether the rate of water flow (in liters per second)
over an experimental soil bed can be used to predict the amount of soil washed away (in
kilograms). The researcher measures the amount of soil washed away for various flow
rates and from these data calculates the least-squares regression line to be
predicted amount of eroded soil = 0.4 + 1.3 × (flow rate)
The correlation between amount of eroded soil and flow rate would be
A) 1/1.3.
B) 0.4.
C) 1.3.
D) positive, but we cannot say what the exact value is using the information given.
E) either positive or negative. It is impossible to say anything about the correlation
from the information given.
31. The least-squares regression line is
A) the line that makes the square of the correlation in the data as large as possible.
B) the line that makes the sum of the squares of the vertical distances of the data
points from the line as small as possible.
C) the line that passes through the greatest number of data points.
D) the line that best splits the data in half, with half of the points above the line and
half below the line.
E) all of the above.
32. Which of the following is true of the least-squares regression line?
A) The slope is the change in the response variable that would be predicted by a unit
change in the explanatory variable.
B) It always passes through the point (X̄, Ȳ), the means of the explanatory and
response variables, respectively.
C) It will only pass through all the data points if r = ± 1.
D) No more than 50% of the residual values will be positive.
E) All of the above.
33. A researcher wishes to study how the average weight Y (in kilograms) of children
changes during the first year of life. He plots these averages versus the children's age X
(in months) and decides to fit a least-squares regression line to the data with X as the
explanatory variable and Y as the response variable. He computes the following
quantities.
r = correlation between X and Y = 0.9
X̄ = mean of the values of X = 6.5
Ȳ = mean of the values of Y = 6.6
s_X = standard deviation of the values of X = 3.6
s_Y = standard deviation of the values of Y = 1.2
The slope of the least-squares line is
A) 0.30.
B) 0.88.
C) 1.01.
D) 2.7.
E) 3.0.
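Worked check for question 33: the least-squares slope can be obtained from the summary statistics as r times the ratio of the standard deviation of Y to that of X, and the intercept follows because the line passes through the two means. A minimal Python sketch using the values given in the question; the variable names are ours.

```python
r = 0.9        # correlation between X and Y
mean_x = 6.5   # mean age (months)
mean_y = 6.6   # mean weight (kilograms)
sd_x = 3.6     # standard deviation of X
sd_y = 1.2     # standard deviation of Y

slope = r * sd_y / sd_x                # kilograms per month
intercept = mean_y - slope * mean_x    # line passes through the means

print(round(slope, 2), round(intercept, 2))
```

Swapping the two standard deviations is the most common slip here; the response variable's standard deviation goes in the numerator.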
34. Recall that when we standardize the values of a variable, the distribution of standardized
values has mean 0 and standard deviation 1. Suppose we measure two variables X and Y
on each of several subjects. We standardize both variables and then compute the
least-squares regression line of Y on X for these standardized values. Suppose the slope
of this least-squares regression line is –0.44. We may conclude that
A) the correlation will be 1/–0.44.
B) the intercept will also be –0.44.
C) the intercept will be 1.0.
D) the correlation will be 1.0.
E) the correlation will also be –0.44.
35. In a study of 1991 model cars, a researcher found that the fraction of the variation in the
price of cars that was explained by the least-squares regression on horsepower was
about 0.64. For the cars in this study, the correlation between the price of the car and its
horsepower was found to be positive. The actual value of the correlation
A) is 0.80.
B) is 0.64.
C) is 0.41.
D) is –0.80.
E) cannot be determined from the information given.
36. In a study of 1991 model cars, a researcher computed the least-squares regression line of
price (in dollars) on horsepower. He obtained the following equation for this line.
predicted price = –6677 + 175 × horsepower
Based on the least-squares regression line, we would predict that a 1991 model car with
horsepower equal to 200 would cost
A) $41,677.
B) $35,000.
C) $34,175.
D) $28,323.
E) $13,354.
37. A scatterplot of the calories and sodium content of several brands of meat hot dogs is
shown below. The least-squares regression line has been drawn on the plot.
Referring to this scatterplot, the value of the residual for the point labeled x
A) is about 40.
B) is about 125.
C) is about 425.
D) is about 1300.
E) cannot be determined from the information given.
38. A researcher wishes to determine whether the rate of water flow (in liters per second)
over an experimental soil bed can be used to predict the amount of soil washed away (in
kilograms). The researcher measures the amount of soil washed away for various flow
rates and from these data calculates the least-squares regression line to be
predicted amount of eroded soil = 0.4 + 1.3 × (flow rate)
One of the flow rates used by the researcher was 0.3 liters per second; for this flow rate,
the amount of eroded soil was 0.8 kilograms. These values were used in the calculation
of the least-squares regression line. The residual corresponding to these values is
A) 0.01.
B) –0.01.
C) 0.5.
D) –0.5.
E) –3.5.
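Worked check for question 38: a residual is the observed response minus the response predicted by the line. A minimal Python sketch, assuming the fitted line is 0.4 + 1.3 × (flow rate); the function name predict_eroded_soil is ours.

```python
def predict_eroded_soil(flow_rate):
    """Predicted eroded soil (kg) for a flow rate in liters per second."""
    return 0.4 + 1.3 * flow_rate

observed = 0.8                         # kilograms, at 0.3 liters per second
predicted = predict_eroded_soil(0.3)
residual = observed - predicted        # observed minus predicted

print(round(predicted, 2), round(residual, 2))
```

The order of subtraction fixes the sign: a point above the line has a positive residual.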
39. A response variable Y and explanatory variable X were measured on each of several
subjects. A scatterplot of the measurements is shown below. The least-squares
regression line is shown in the plot.
Which of the following five plots is a plot of the residuals for the data shown in the
scatterplot above versus X?
A)
B)
C)
D)
E)
40. A least-squares regression line is fitted to a set of data. If one of the data points has a
positive residual, then
A) the correlation between the values of the response and explanatory variables must
be positive.
B) the point must lie above the least-squares regression line.
C) the slope of the least-squares regression line must be positive.
D) the point must lie near the right edge of the scatterplot.
E) all of the above.
41. Which of the following statements concerning residuals is true?
A) The sum of the residuals is always 0.
B) A plot of the residuals is useful for assessing the fit of the least-squares regression
line.
C) The value of a residual is the observed value of the response minus the value of the
response that one would predict from the least-squares regression line.
D) If the data are linear, then the plot of the residuals should have no discernible
pattern.
E) All of the above.
42. Consider the scatterplot below.
The point indicated by the plotting symbol x would be
A) a residual.
B) influential.
C) a z-score.
D) a least-squares point.
E) a partial outlier.
43. A sample of 79 companies was taken, and the annual profits (y) were plotted against
annual sales (x). The plot is given below. All values in the plots are in units of $100,000.
The correlation between sales and profits is found to be 0.814. Based on this
information, we may conclude which of the following?
A) If the sales were less than $20,000, the equation of the least-squares regression line
would predict the profits quite accurately.
B) There are clearly influential observations present.
C) If we group the companies in the plot into those that are small in size, those that are
medium in size, and those that are large in size and compute the correlation
between sales and profits for each group of companies separately, the correlation in
each group will be about 0.8.
D) Not surprisingly, increasing sales causes an increase in profits. This is confirmed
by the large positive correlation.
E) All of the above.
Answer Key
1. A
2. E
3. B
4. B
5. C
6. A
7. B
8. C
9. C
10. E
11. D
12. D
13. B
14. B
15. C
16. A
17. D
18. B
19. A
20. C
21. C
22. E
23. A
24. B
25. E
26. C
27. A
28. E
29. D
30. D
31. B
32. E
33. A
34. E
35. A
36. D
37. A
38. A
39. A
40. B
41. E
42. B
43. B