Texas Middle School Math Series Course 3

Data Displays
and Analysis
© Carnegie Learning
Whether
you're a Panther,
or a Knight, or a Tiger,
it stands to reason that
your school has a mascot and
nickname to represent your
school. Many times, logos are
used on sweatshirts,
banners, and even face
paint to promote
school spirit.
14.1 School Spirit and Scatter Plots
Using Scatter Plots to Display and Analyze
Two-Variable Relationships........................................... 731
14.2Jump In! The Water’s Fine!
Interpreting Patterns in Scatter Plots........................... 741
14.3 How Fast Are Your Nerve Impulses?
Connecting Tables and Scatter Plots
for Collected Data........................................................ 751
14.4 Picking a Player
Calculating and Interpreting the Mean
Absolute Deviation.................................................................759
14.5 Texas-Sized Loving
Sample Populations................................................................773
729
462351_CL_C3_CH14_pp729-794.indd 729
28/08/13 3:00 PM
© Carnegie Learning
730 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 730
28/08/13 3:00 PM
SchoolTItle
Spiritgoes
and
Lesson
Scatter Plots
here
Using Scatter Plots to Display
Lesson Subtitle
and Analyze Two-Variable
Relationships
Learning Goals
Key Terms
In this lesson, you will:
 variable
 two-variable data set
 Define the meaning of two-variable data.
 Collect and record two-variable data.
 Construct and interpret a scatter plot.
 Determine if a change in the value of one variable
results in a change in the value of the second variable.
 Identify patterns in a scatter plot.
C
hances are that your school has a nickname. Perhaps your school may even
have a mascot dress in the school colors or in a costume to promote school spirit.
This spirit is especially seen during competition between other schools.
School names and mascots can range from very common names like the Auburn
Tigers in Georgia, to the sometimes rare names like the Banana Slugs of the
University of California at Santa Cruz. So, what’s your school mascot? Does your
© Carnegie Learning
school mascot get your school spirit juices flowing?
14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 731
462351_CL_C3_CH14_pp729-794.indd 731
28/08/13 3:00 PM
Problem 1 School Spirit, Sweatpants, Sweatshirts,
and Scatter Plots
The School Spirit Club plans to sell sweatpants and sweatshirt
sets with the school’s logo. The club is determining if there is a
way to package sweatshirt and sweatpants sets so that most
of the students can buy a set that will fit.
1. Do you think there might be a relationship between the
sweatpant size and the sweatshirt size a person would
buy? Why or why not?
When collecting information about a person or thing, the specific characteristic of the
information gathered can be called a variable. Previously, you have seen variables in
mathematics refer to a letter or symbol to represent a number. In this case, a variable can
refer to any characteristic that can change, or vary.
2. Name a variable that can affect a sweatshirt size.
3. Do you think collecting information about one sweatshirt characteristic is enough to
The School Spirit Club decides to collect students’ heights and arm spans. They hope that
collecting this information can determine if there is a relationship between sweatshirt size
and sweatpant size. Collecting information about two separate characteristics for the
© Carnegie Learning
determine which shirt sizes should be paired with which pant sizes?
same person or thing can be called a two-variable data set.
4. Why is it important to record each student’s height and arm span?
732 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 732
28/08/13 3:00 PM
© Carnegie Learning
5. Analyze the results the School Spirit Club collected.
Name
Height
(cm)
Arm Span
(cm)
Donna
141
143
Edward
152
157
Betina
150
151
Thomas
162
159
Aaron
176
172
Julianna
155
153
Amy
156
159
Larry
162
164
Carlos
167
162
Jermont
163
168
Shayla
154
154
Mary
161
157
Latisha
164
167
Blake
172
180
Genifer
154
152
Ella
161
167
Felix
167
168
Ordered Pair
Can you determine whether there is a relationship between height and arm span, and
sweatshirt size and sweatpant size? Why or why not?
14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 733
462351_CL_C3_CH14_pp729-794.indd 733
28/08/13 3:00 PM
One way to show the relationship between two different characteristics is to create a
graph that can represent the two variables. As you learned previously, a scatter plot is a
graph in the coordinate plane of a collection of two-variable data plotted as ordered pairs.
The ordered pairs are formed by using the values of the first characteristic as the
x-coordinate, and using the values of the second characteristic as the y-coordinate.
6. Write the ordered pairs for the data in the table.
7. Complete the scatter plot shown by plotting the data points that the School Spirit
Club collected.
Height vs. Arm Span
Arm Span (cm)
180
170
160
150
140
0
0
140
150
160
170
180
Height (cm)
8. What patterns do you notice in the scatter plot?
Do you
connect the
points in this
type of graph?
9. Circle the point (150, 151) on the scatter plot.
© Carnegie Learning
Then, explain what this point means.
734 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 734
28/08/13 3:00 PM
10. Suppose two points on the scatter plot are (162, 164) and (164, 162).
a. Grace says that the two points are in different locations, but Jorge disagrees. He
thinks they are both the same point. Who is correct? Explain your reasoning.
© Carnegie Learning
b. Describe the locations of the two points on the scatter plot relative to each other.
14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 735
462351_CL_C3_CH14_pp729-794.indd 735
28/08/13 3:00 PM
Problem 2 Video Games and Math Grades
Ms. Liu is trying to determine if there is a relationship between the math grade percentage
of her students, and the time spent playing video games. Ms. Liu constructed the scatter
plot for the number of hours her math students spent playing video games per weeknight
(Monday through Thursday), and their grade percentage in her math class.
Math Grades and Time Spent
Playing Video Games
100
Math Grades
(percent)
90
80
70
60
50
0
0
1
2
3
4
5
6
7
8
Time Playing Video Games
(hours)
1. Circle the point (3.5, 80) on the scatter plot. Explain the meaning of the point.
© Carnegie Learning
2. Describe any patterns you see in Ms. Liu’s scatter plot.
736 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 736
28/08/13 3:00 PM
3. Complete the table with the data from Ms. Liu’s scatter plot.
Math Grades
(percent)
© Carnegie Learning
Time spent playing
video games
(hours)
4. Using the table, identify the student who earned a 76% in math class. How many
hours did that student spend playing video games?
14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 737
462351_CL_C3_CH14_pp729-794.indd 737
28/08/13 3:00 PM
Problem 3 Creating a Scatter Plot with a Graphing Calculator
You can use a graphing calculator to create a scatter plot.
Follow the steps to create a scatter plot for the height and arm span data the School Spirit
Club collected. Begin by entering the data set into List 1 and List 2.
Step 1: Press STAT . The EDIT and 1: EDIT should be highlighted.
Press ENTER .
To clear data
in a list, move the
cursor up to L1 and
press the Clear
button.
Enter the data points for the students’ height and arm span. It is critical that the
x-coordinates in List 1 and the y-coordinates in List 2 match the ordered pairs
from your table.
L1
L2
157
161
167
164
180
172
152
154
167
161
168
167
----L2(18)
L1 2:SortA(
L3
2
This table
shows the last six
data entries. The
School Spirit Club
collected information
from 17 students.
Step 2: Press Y5 and delete any equations that have been entered.
Press 2nd followed by Y5 . Make certain that only Plot 1
is turned on; all other plots are should be turned off.
Press ENTER . Use the arrow keys to highlight the scatter plot,
and press ENTER .
Plot1 Plot2 Plot3
On Off
Xlist:L1
Ylist:L2
Mark:
+
© Carnegie Learning
Type:
738 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 738
28/08/13 3:00 PM
Step 3: Make certain Xlist: displays L1 and Ylist: displays L2.
There are two ways to have the calculator display the graph.
a. Press WINDOW and enter the values for Xmin and Xmax. Then,
press GRAPH .
b. Press ZOOM followed by 9 .
1. Would a student who is 145 centimeters tall be more likely to have an arm span of
150 centimeters or 180 centimeters? Why?
2. Would a student who is 170 centimeters tall be more likely to have an arm span of
© Carnegie Learning
150 centimeters or 180 centimeters? Why?
14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 739
462351_CL_C3_CH14_pp729-794.indd 739
28/08/13 3:00 PM
Talk the Talk
At the beginning of the lesson, the School Spirit Club wanted to see if there was a way to
determine if there might be a relationship between the sweatpant size and the sweatshirt
size a person could buy together.
1. What conclusions did you reach about the relationship between a student’s height
and arm span from the data the School Spirit Club collected?
2. How could your conclusion help the School Spirit Club decide how to package their
© Carnegie Learning
sets of sweatpants and sweatshirts?
Be prepared to share your solutions and methods.
740 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 740
28/08/13 3:00 PM
Jump In! The
Water’s Fine!
Interpreting Patterns in
Scatter Plots
Learning Goals
Key Terms
In this lesson, you will:






 Interpret patterns in a scatter plot.
 Determine if a pattern in a scatter
plot has a linear relationship.
 Identify potential outliers in a
scatter plot.
independent variable (explanatory variable)
dependent variable (response variable)
association
linear association
cluster
positive association
 negative association
 outlier
F
or many East Coast residents visiting sunny Southern California for the first
time, a trip to a white sandy beach and warm water is usually on the agenda.
However, the ocean temperature in the Pacific Ocean along California’s coast is
part of a cold current. This means that the average ocean temperature rarely
breaks 70ºF! The world’s oceans and seas fall into one of two categories: cold
currents and warm currents. Generally, the warmest bodies of water are found
near the equator while many of the cold currents are closer to the earth’s poles.
© Carnegie Learning
Why does it seem that warm currents are near the equator while cold currents
seem to be closer to the North and South Poles?
14.2 Interpreting Patterns in Scatter Plots • 741
462351_CL_C3_CH14_pp729-794.indd 741
28/08/13 3:00 PM
Problem 1 Ocean Temperatures
As you learned previously, a two-data variable set is
a data set in which two separate characteristics are
measured for a person or a thing. Sometimes, the
two-data variable set is designated as two different
variables: the independent variable and the
The independent
variable can also be called
the explanatory variable. The
dependent variable can also be
called the response variable
because this is the variable that
responds to what occurs to the
explanatory variable.
dependent variable. The independent variable is the
variable whose value is not determined by another variable.
Generally, the independent variable is represented by the
x-coordinate. The dependent variable is the variable whose value is
determined by an independent variable. Generally, this dependent
variable is represented by the y-coordinate.
1. Erica, who is an oceanographer, is measuring the temperature of
the ocean at different depths. Her results are listed in the table.
Depth
(m)
100
200
300
400
500
600
700
800
900
Temperature
(ºF)
76
73
70
66
61
56
52
48
43
a. Identify the independent and dependent variables in Erica’s data table.
b. Create a scatter plot using the data Erica gathered for the ocean temperatures at
different depths.
Ocean Temperature at Different Depths
80
72
68
64
© Carnegie Learning
Temperature (oF)
76
60
56
52
48
44
40
0
0
100 200 300 400 500 600 700 800 900 1000
Depth of Water (meters)
742 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 742
28/08/13 3:00 PM
2. Explain the meaning of the point (400, 66).
3. What relationship does the scatter plot show between the depth of the ocean water
and the temperature of the water?
As you have experienced, scatter plots can be great tools to identify patterns in a
two-variable data set. Sometimes, these patterns or relationships are called associations.
One common pattern that exists in data is when the points on a scatter plot form a linear
association. A linear association occurs when the points on the scatter plot seem to form
a line. In most cases the points will not form a perfect line, but they are clustered. When
data values are clustered, the data values are arranged in such a way that as you look at
the graph from left to right, you can imagine a line going through the scatter plot with most
of the points being clustered close to the line.
4. Explain how there seems to be a linear association between the depth of the ocean
water and the water temperature.
If the two variables have a linear association, you can then determine the type of
association between two variables. You can typically look for a pattern in the dependent
variable on the y-axis as the independent variable on the x-axis increases from left to right.
The two variables have a positive association if, as the independent variable increases,
the dependent variable also increases. If the dependent variable decreases as the
© Carnegie Learning
independent variable increases, then the two variables have a negative association.
Once you identify the pattern for two variables with a linear relationship, you can state the
association between the two variables.
14.2 Interpreting Patterns in Scatter Plots • 743
462351_CL_C3_CH14_pp729-794.indd 743
28/08/13 3:00 PM
5. Describe the type of association that exists
between the depth of the ocean water and the
water temperature. State the association in
terms of the variables.
I see! So, with
the height and arm
span data, there seems
to be a positive association
because as the height
increased, the arm span
increased as well.
6. Analyze each scatter plot shown. Then, identify the
following for each:
●
Identify the independent and dependent variables.
●
Determine whether the scatter plots show a linear
association.
●
If there is a linear relationship, determine whether it
has a positive association or a negative association.
●
State the association in terms of the variables.
Height and Weight of Soccer Players
Height
© Carnegie Learning
Weight
a.
744 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 744
28/08/13 3:00 PM
Wait Time and Movie Attendance
Movie Attendance
b.
Time Waiting in Line
c.
Student Grades in Homeroom 6
100
Grade in Art
90
80
70
© Carnegie Learning
60
50
50
60
70
80
Grade in Math
90
100
14.2 Interpreting Patterns in Scatter Plots • 745
462351_CL_C3_CH14_pp729-794.indd 745
28/08/13 3:00 PM
d.
Degrees in Celsius
Temperature in Fahrenheit and Celsius
Degrees in Fahrenheit
e.
Time
© Carnegie Learning
Height
Height of Ball Tossed into Air
746 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 746
28/08/13 3:00 PM
Problem 2 Other Patterns in Scatter Plots
Another pattern that can occur in a scatter plot is an outlier. An outlier for two-variable
data is a point that varies greatly from the overall pattern of the data.
1. The scatter plot shows the fat and calories in 11 different foods.
Fat and Calories in Various Foods
650
Calories
550
450
350
250
0
0
10
15
Fat
(grams)
20
25
a. Determine the independent and dependent variables in the two-variable data set.
b. Does there appear to be a linear association between the fat and calories
© Carnegie Learning
of the foods?
14.2 Interpreting Patterns in Scatter Plots • 747
462351_CL_C3_CH14_pp729-794.indd 747
28/08/13 3:00 PM
2. Complete a table of values for the points displayed in the scatter plot, beginning with
the item with the lowest amount of fat.
Fat
(g)
Calories
So, an outlier
is like a point
that doesn't
belong.
around any outliers in the scatter plot and identify the outlier in the table.
4. Explain why the point (25, 300) is a potential outlier.
© Carnegie Learning
3. Do any of the points appear to vary greatly from the other points? If so, place a star
5. Examine the values in the table. How can you determine that (25, 300) is a
possible outlier?
748 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 748
28/08/13 3:00 PM
6. Place your finger on top of the point (25, 300) and examine the scatter plot. What do
you notice?
Talk the Talk
1. Explain the difference between a positive association and a negative association of a
two-variable data set.
2. Explain how you can identify an outlier in a two-variable data set. Do the data need to
© Carnegie Learning
have a linear association?
Be prepared to share your solutions and methods.
14.2 Interpreting Patterns in Scatter Plots • 749
462351_CL_C3_CH14_pp729-794.indd 749
28/08/13 3:00 PM
© Carnegie Learning
750 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 750
28/08/13 3:00 PM
How Fast are Your
Nerve Impulses?
Connecting Tables and Scatter
Plots for Collected Data
Learning Goals
In this lesson, you will:
 Conduct an experiment and collect the data.
 Connect tables and scatter plots for collected data.
 Interpret collected data displayed on a scatter plot and in a table.
Y
ou’ve done it before. Even though your server told you your plate was hot,
your hand accidentally touches a side and—ouch! Just as quickly as your skin
touches the plate, your muscles quickly move your hand away from the plate
thanks to the central nervous system! One of the quickest communication
systems resides in your body. The central nervous system is responsible for
communicating information between your brain, spinal cord, and other tissues,
coordinating movements of the muscles and other body parts. It regulates
breathing, your heartbeat, and other bodily functions that you don’t even think
© Carnegie Learning
about. So, just how fast can the central nervous system work?
14.3 Connecting Tables and Scatter Plots for Collected Data • 751
462351_CL_C3_CH14_pp729-794.indd 751
28/08/13 3:00 PM
Problem 1 Human Wrist Chain
Your class is going to explore the speed of nerve impulses in the body by performing an
experiment that involves a human chain. In this experiment, a group of students forms a
circle with each person gently holding the wrist of the person to his or her right. One
student begins the chain, and another student ends the chain. Also, one student in the
group must be the timekeeper. The group members must keep their eyes closed. When the
experiment begins, the timekeeper says, “Go,” and the first student carefully, but quickly
squeezes the student’s wrist to their right, then this next student squeezes a wrist, and so
on. After the last student’s wrist is squeezed, he or she says “Stop,” and releases the next
student’s wrist. The amount of time from when the word go is spoken until the word stop is
spoken (the amount of time it takes to complete the chain) is recorded by the timekeeper.
1. Conduct the experiment 10 times, using a different number of students in the chain
each time. For each number of students in the chain, run 3 trials of the experiment.
Record the data for the experiment in the table shown. Then, calculate the mean time
for each row and record the result in the last column of the table. Round your average
times to the nearest tenth.
Trial 1
Trial 2
Trial 3
Average time
(seconds)
© Carnegie Learning
Chain length
(number of students)
752 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 752
28/08/13 3:00 PM
2. Write the ordered pairs from the table with the number of students in the chain as the
independent variable and the mean time as the dependent variable.
3. Create a scatter plot of the ordered pairs on the coordinate plane.
Human Wrist Chain
10.0
9.0
8.0
Mean Time
(sec)
7.0
6.0
5.0
4.0
3.0
2.0
1.0
0.0
0
5
10
15
20
Chain Length
(number of students)
25
30
35
4. Does there appear to be a linear association between the chain length and mean
time? Explain your reasoning. If so, identify the linear association as either a positive
© Carnegie Learning
or negative association.
14.3 Connecting Tables and Scatter Plots for Collected Data • 753
462351_CL_C3_CH14_pp729-794.indd 753
28/08/13 3:00 PM
5. State the association between the two variables.
6. On the scatter plot, identify the point representing the longest chain. Then, identify
the values of the point in the table. Explain how you identified the point and values.
7. On the scatter plot, identify the point representing the least mean time. Then, identify
© Carnegie Learning
the values of the point in the table. Explain how you identified the point and values.
754 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 754
28/08/13 3:00 PM
Problem 2 Video Games and Math Grades Revisited
The scatter plot and table of values represent the mean time spent playing video games
and grades for Ms. Liu’s math students.
Math Grades and Time Spent Playing Video Games
100
Average Time
Spent Playing
Video Games
Math Grades
0
98
1
100
2
90
60
2.5
90
50
3
76
3
88
3.5
80
4
78
4.5
72
5
68
5
70
5
72
5.5
65
6
70
7
55
Math Grades
(percent)
90
80
70
­
0
© Carnegie Learning
0
1
2
3
4
5
6
7
Mean Time Playing Video Games
(hours)
8
14.3 Connecting Tables and Scatter Plots for Collected Data • 755
462351_CL_C3_CH14_pp729-794.indd 755
28/08/13 3:00 PM
1. Place a star around the point on the scatter plot representing the student with the
greatest grade percentage. Explain how you identified the point.
2. Identify the student with the greatest grade percentage in the table. Explain how you
identified the student.
3. Write the ordered pair for the student with the highest grade. Then, explain the
meaning of each of the coordinates.
4. Place a star around the point on the scatter plot representing the student who spent
5. Identify the student with the greatest mean time playing video games in the table.
Explain how you identified the student.
© Carnegie Learning
the most mean time playing video games. Explain how you identified the point.
756 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 756
28/08/13 3:00 PM
6. Write the ordered pair for the student with the greatest mean time playing video
games. Then, explain the meaning of each of the coordinates.
7. Write the ordered pairs for two students who earned the same grade in math. What
grade did the students receive, and how many mean hours did each student spend
playing video games?
8. Write the ordered pairs for three students who had the same mean time playing video
games. What grade did these students receive, and on average how many hours did
each student spend playing video games?
© Carnegie Learning
9. Martha is puzzled by the scatter plot. There are only 15 points on the plot, but there
are 20 students in the class. Explain to Martha how all 20 students could be
represented in the scatter plot.
14.3 Connecting Tables and Scatter Plots for Collected Data • 757
462351_CL_C3_CH14_pp729-794.indd 757
28/08/13 3:00 PM
Talk the Talk
1. Does every data set in a scatter plot always have a linear association?
2. What must be true for the data points in a two-data variable set to have a positive or
© Carnegie Learning
negative association?
Be prepared to share your solutions and methods.
758 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 758
28/08/13 3:00 PM
Picking a Player
Calculating and Interpreting
the Mean Absolute Deviation
Learning Goals
Key Terms
In this lesson, you will:
 measures of variation
or variability
 Calculate the deviations of each data value from the
mean of a data set.
 Calculate the absolute deviations of each data value
from the mean of a data set.
 deviation
 absolute deviation
 mean absolute deviation
 Calculate and interpret the mean absolute deviation
for a data set.
F
antasy sports has become very popular in the last 25 years. In fantasy sports,
participants take the role of manager and coach by drafting players, trading
players, and determining who should start in upcoming games. Weekly, and even
daily, serious participants scour over player statistics to determine which player
might score the most points for the team. These statistics are based on what
each player does in a real-life game. Why do you think there is an appeal for
common people to want to be a “fantasy” coach and manager? Do you think
© Carnegie Learning
fantasy sports leagues are popular in other countries too?
14.4 Calculating and Interpreting the Mean Absolute Deviation • 759
462351_CL_C3_CH14_pp729-794.indd 759
28/08/13 3:00 PM
Problem 1 On to the Championship—But Who Should Start?
Previously, you examined the points scored by three players on the Olive Street Middle
School girls’ basketball team. Suppose the team won the game and will advance to the
district championship. Coach Harris needs to choose between Tamika and Lynn as
starters for the game. In the past six games, Tamika scored 11, 11, 6, 26, 6, and 12 points.
In the past six games, Lynn scored 15, 12, 13, 10, 9, and 13 points.
The dot plots for each player’s scoring over the past six games played are shown.
Number of Points Scored by Tamika
X
X
5
6
X
X X
7
8
X
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Mean 5 12
Number of Points Scored by Lynn
X X
5
6
7
8
X
X X
X
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Mean 5 12
1. Based on the dot plots, which player do you think
What does it
mean for both players
to have the same mean?
Does it matter who
Coach Harris puts in
the game?
When analyzing a data set, measures of center give you an idea of
where the data is centered, or what a typical data value might be.
There is another measure that can help you analyze data. Measures
of variation or variability describe the spread of the data values.
Just as there are several measures of central tendency, there are
also several measures of variation.
© Carnegie Learning
Coach Harris should choose?
The deviation of a data value indicates how far that data value is
from the mean. To calculate the deviation, subtract the mean from
the data value:
deviation 5 data value 2 mean
760 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 760
28/08/13 3:00 PM
2. Calculate the deviations for the points scored for each player using the equation.
Record your results in the tables. Tamika’s deviation is completed for you.
Tamika
Oh, I see!
1 1 is 1 less than
the mean of 12,
and 26 is 14
more than the
mean of 12.
Lynn
Points Scored
Describe the
Deviation from
the Mean
Points Scored
11
1 less than the
mean
15
11
1 less than the
mean
12
6
6 less than the
mean
13
26
14 more than the
mean
10
6
6 less than the
mean
9
12
12 is the mean
13
Describe the
Deviation from
the Mean
3. What is the meaning if a deviation is less than the mean? Is more than the mean?
© Carnegie Learning
Is equivalent to the mean?
4. What do you notice about the deviations for each player?
14.4 Calculating and Interpreting the Mean Absolute Deviation • 761
462351_CL_C3_CH14_pp729-794.indd 761
28/08/13 3:00 PM
5. Carly claims the sum of the deviations for a data set will always be 0. Do you agree?
Why or why not?
The sum of all the distances less than the mean are equivalent to all the distances greater
than the mean. Because the mean is the balance point, the sum of data points on either
side of the balance point are equivalent to each other.
In order to get an idea of the spread of the data values, you can take the absolute value of
each deviation, and then determine the mean of those absolute values. The absolute value
of each deviation is called the absolute deviation. The mean absolute deviation is the
average or mean of the absolute deviations.
6. Determine the absolute deviation for the points scored for each player. Record your
answers in the tables.
Lynn
Tamika
11
11–12
Absolute
Deviation
21
1
Calculation
of
Deviation
Points
Deviation from the
Scored
from the
Mean
Mean
Absolute
Deviation
15
11
12
6
13
26
10
6
9
12
13
© Carnegie Learning
Calculation
of
Deviation
Points
Deviation from the
Scored
from the
Mean
Mean
762 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 762
28/08/13 3:00 PM
7. Calculate the mean absolute deviation for the points scored by each player.
8. What does the mean absolute deviation tell you about the points scored by
each player?
9. If you were Coach Harris, which player would you choose to play in the championship
game? Explain how you determined your decision.
Problem 2 Determining the Absolute Deviation of Two Data Sets
Let’s examine the two data sets and their dot plots.
Set 1: 50, 50, 55, 60, 60 X X
XXX
© Carnegie Learning
20
30
40
50
Set 2: 25, 50, 55, 60, 60 60
Mean 5 55
X
XXX
X
20
30
40
50
60
Mean 5 50
1. Which data set do you think will have the greater mean absolute deviation?
Explain your reasoning.
14.4 Calculating and Interpreting the Mean Absolute Deviation • 763
462351_CL_C3_CH14_pp729-794.indd 763
28/08/13 3:00 PM
2. Calculate the mean absolute deviation for each data set.
Data
Value
Describe the
Deviation
from the
Mean
Absolute
Deviation
Set 2 (mean = 50)
Data
Value
50
25
50
50
55
55
60
60
60
60
3. Interpret each of the mean absolute deviations:
Describe the
Deviation
from the
Mean
Absolute
Deviation
© Carnegie Learning
Set 1 (mean = 55)
764 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 764
28/08/13 3:00 PM
Problem 3 Using Technology to Calculate Mean
Absolute Deviation
You can use a graphing calculator to determine the mean absolute deviation of a data set.
For this example, you will use a graphing calculator to determine the mean absolute
deviation for the exercise data of the students at Sharpe Middle School.
Begin by entering the data set into List 1 of the calculator. The steps to enter data values
into the graphing calculator are shown.
Step 1: Press STAT , EDIT and 1: Edit should be highlighted. Then, press ENTER .
Step 2:Enter the data into L1. Remember, it is not necessary to enter the data in
numerical order. Press ENTER after you enter each data value. Remember,
these data points represent the time in minutes that students exercised during
one week.
0 40 60 30 60 10 45 30 300 90 30 120 60 0 20
Step 3:Obtain the value of the mean.
●
Press STAT and scroll left to CALC . The 1: 1-Var Stats should be
© Carnegie Learning
highlighted.
●
Press ENTER . The screen should display 1-Var Stats.
14.4 Calculating and Interpreting the Mean Absolute Deviation • 765
462351_CL_C3_CH14_pp729-794.indd 765
28/08/13 3:00 PM
●
The default setting is List 1 (L1),so you can press ENTER . You should get
the screen shown.
●
__
The x ​
​  represents the mean of the data set. The mean number of minutes the
students exercised was approximately 59.67 minutes.
Step 4:Calculate the deviations from the mean.
●
Return to the lists by pressing STAT , EDIT , 1: Edit, and ENTER .
●
Use the arrow keys to highlight L2.
●
Type the following formula in L2: 2nd
L1
 59.67. Press ENTER .
In some calculators, L1 is found on the 1 key.
●
The deviations for each data value should appear in List 2.
Step 5:Calculate the absolute deviations from the mean.
●
●
Use the arrows to highlight L3.
Type the following formula in L3: 2nd CATALOG .The CATALOG key is
sometimes located on the 0 key. The abs( should be the first entry in
the catalog.
Press ENTER . abs( should appear after the = sign at the bottom of
the screen.
L2 and ) . Press ENTER .
●
Type 2nd
●
The absolute deviations for each data value should appear in List 3.
© Carnegie Learning
●
766 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 766
28/08/13 3:00 PM
Step 6:Calculate the mean absolute deviation.
●
Press STAT and scroll to CALC . The 1: 1-Var Stats should be highlighted.
●
Press ENTER . The screen should display 1-Var Stats.
●
Type 2nd
L3 . The L3 button is sometimes located on the 3 key. Press
ENTER .
In this case, x represents the mean absolute deviation as 44.27 minutes.
You can also use a spreadsheet to determine the mean absolute deviation of a data set.
Determine the mean absolute deviation for the exercise data of the students at Sharpe
Middle School. The data values are shown.
0 minutes
10 minutes
30 minutes
40 minutes
45 minutes
120 minutes
60 minutes
30 minutes
60 minutes
30 minutes
300 minutes
0 minutes
60 minutes
90 minutes
20 minutes
Step 1: Open a computer spreadsheet and enter the first value in cell A1. Press ENTER
© Carnegie Learning
and continue entering the data values, pressing ENTER after each entry.
14.4 Calculating and Interpreting the Mean Absolute Deviation • 767
462351_CL_C3_CH14_pp729-794.indd 767
28/08/13 3:00 PM
Step 2: Click in cell A17. On the menu bar select Formulas , then select
More Functions , and then Statistical , and finally AVERAGE .
Click on AVERAGE. This will open a dialog box that is titled AVERAGE. In the
space labeled Number 1, enter A1:A15. This tells the computer to determine the
average of the numbers in cells A1 through A15.
Click on OK . Cell A17 should now show the value 59.66667.
In cell B1 type the following: = A1-$A$17. This tells the computer to take the
value in A1 and subtract the mean. The value –59.6667 should appear in cell B1.
© Carnegie Learning
Step 3:Determine the deviations from the mean.
768 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 768
28/08/13 3:00 PM
Step 4: ●
Click in cell B1 and drag to highlight cells B2 through B15.
●
Make certain the “Home” tab is selected and click on the icon of an arrow that
represents filling shaded cells. Highlight “Down.” This will copy the formula
you wrote in cell B1 to the remaining cells.
●
Click on “Down.” The deviations from the mean should now be in cells B2
© Carnegie Learning
through B15.
14.4 Calculating and Interpreting the Mean Absolute Deviation • 769
462351_CL_C3_CH14_pp729-794.indd 769
28/08/13 3:00 PM
Step 5: Click in cell C1. You will now take the absolute value of the deviations.
●
Type 5 ABS(B1) and press Enter. The absolute value of 259.667 should
appear in cell C1.
●
Follow Step 4 to find the absolute deviations for the remaining cells in
column C.
Step 6: Click in cell C17. Follow Step 2 to calculate the mean absolute deviation.
The mean absolute deviation for the exercise data is 44.26667.
© Carnegie Learning
●
770 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 770
28/08/13 3:00 PM
Talk the Talk
1. Use your graphing calculator to complete the table shown. You will determine the
deviation from the mean and the absolute deviation of a data set.
Height of 6th-grade basketball players in inches:
68, 64, 60, 58, 62, 65, 64, 60, 61, 65
Data Value (L1)
Deviation from the Mean (L2)
Absolute Deviation (L3)
68
64
60
58
62
65
64
60
© Carnegie Learning
61
65
Be prepared to share your solutions and methods.
14.4 Calculating and Interpreting the Mean Absolute Deviation • 771
462351_CL_C3_CH14_pp729-794.indd 771
28/08/13 3:00 PM
© Carnegie Learning
772 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 772
28/08/13 3:00 PM
Texas-Sized Loving
Sample Populations
Learning Goal
In this lesson, you will:
 Determine that a random sample has characteristics that are representative of the
whole population.
Y
ou may not have known this, but the large state of Texas is home to the
“smallest” county in the whole country. Loving County covers 677 square miles,
but is home to only slightly more than 80 people! The 2000 census reported just
31 total households, with only 13 total people under the age of 18. The school-aged
residents attend school outside of Loving, though, because in 1972 the district
had to consolidate with a neighboring county after enrollment dropped to 2
pupils! How’s that for a student-teacher ratio?! In many respects the county is far
from struggling, though. The per capita income of its residents is one of the
highest in the whole nation. In fact, the 2010 census reported it as the only
county in the nation without a single resident living below the poverty level.
Loving is defined by much more than its small size, though. It was named after
the pioneer of the cattle drive. Ranches still exist in the county, but the economy
© Carnegie Learning
hinges almost entirely on oil. Loving’s storied history includes being the first
county in Texas to elect a female sheriff. Edna Dewees, elected sheriff in the
1940’s, never carried a firearm and ended her illustrious career with a whopping
two arrests.
Every county has interesting people and events in its history. What are some
interesting facts about the county you live in?
14.5 Sample Populations • 773
462351_CL_C3_CH14_pp729-794.indd 773
28/08/13 3:00 PM
Problem 1 Sample Population
In the United States, the Constitution mandates that a census be taken every ten
years. A census is an official count or survey of the population, in which detailed
demographic information is collected. Populations at the city, state, and county levels
are documented annually.
Census data is used by the government to determine the number of representatives
in government. It can also serve as a guide for spending and taxation. Outside of
government, social services can use census data as a guide for their programs, and
companies can use census data for marketing purposes.
The population for each county in the state of Texas for 2011 is shown in the tables at the
end of this lesson.
1. Analyze the Texas county population data for 2011.
a. List several characteristics of the data.
Recall that a sample is a subset of a larger data set. It is often impractical to analyze all
data points when gathering information. Statisticians can reliably determine characteristics
of a whole group by analyzing samples of the whole population.
© Carnegie Learning
b. Does the data contain any outliers? Explain your reasoning.
774 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 774
28/08/13 3:00 PM
2. Why is it important to choose a sample group randomly?
3. Micah and Sergio are studying the Texas county population data. They each randomly
chose the populations of the counties shown to conduct their study.
Micah
Sergio
Angelina County . . . 87,669
Wise County . . . 59,833
Wichita County . . . 130,698
Bowie County . . . 92,793
Walker County . . . 68,087
Victoria County . . . 87,545
Coryell County . . . 76,508
Tom Green County . . . 111,823 Taylor County . . . 132,755
Ector County . . . 140,111
Starr County . . . 61,715
Rockwall County . . . 81,290
Grayson County . . . 121,419
Randall County . . . 123,351
Potter County . . . 122,285
Parker County . . . 118,376
Orange County . . . 82,487
Midland County . . . 140,308 Liberty County . . . 76,206
Hunt County . . . 86,531
Comal County . . . 111,963
Each student says that his own sample is more likely to accurately represent the
population data. Without performing any calculations, who do you think is correct?
© Carnegie Learning
Explain your reasoning.
14.5 Sample Populations • 775
462351_CL_C3_CH14_pp729-794.indd 775
28/08/13 3:00 PM
4. Choose a sample of ten counties randomly. List the counties and their populations in
the first two columns of the table.
County
Population
(people)
Absolute
Deviation from
the Mean
Remember, you
can use the
randInt( function on
your calculator to
generate random
numbers.
5. Calculate each measure of central tendency for your sample.
a. Mean
c. Mode
© Carnegie Learning
b. Median
776 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 776
28/08/13 3:00 PM
6. Calculate each quartile for your sample.
a. First quartile
b. Third quartile
7. Create a box-and-whisker plot to represent your sample.
8. Determine each county’s absolute deviation from the mean. Write your answers in the
third column of the table in Question 4.
© Carnegie Learning
9. Calculate the mean absolute deviation of your sample.
14.5 Sample Populations • 777
462351_CL_C3_CH14_pp729-794.indd 777
28/08/13 3:00 PM
The mean population of a Texas county in the year 2011
Notice the jagged line
on the box-and-whisker
plot. This indicates a
break on the number line,
because the range of
values is so large.
is 101,081 people, while the median is 18,395.
The first quartile is 6866, and the third quartile is 49,783.
A box-and-whisker plot of the population data for all
counties in Texas is shown.
Q1 = 6866
Q2 = 18,395
Q3 = 49,783
min = 94
0
max = 4,180,894.
5000 10,000 15,000 20,000 25,000 30,000 35,000 40,000 45,000 50,000
4,180,864
The average absolute deviation for the whole population is 135,279.
10. Compare your sample statistics to the statistics for the whole state of Texas. Do you
think your sample accurately reflects the characteristics of the whole population?
© Carnegie Learning
Explain your reasoning.
778 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 778
28/08/13 3:00 PM
Problem 2 More Samples
1. Choose two more different samples of ten counties randomly. Then, calculate the
mean of each sample and use it to calculate the absolute deviations. Write your
answers in the table.
Sample 2
County
Population
Sample 3
Absolute
Deviation
from the
Mean
County
Population
Absolute
Deviation
from the
Mean
2. Calculate each statistic in the table on the following page for Samples 2 and 3. Also,
© Carnegie Learning
record each statistic for Sample 1 from the previous problem in this same table.
14.5 Sample Populations • 779
462351_CL_C3_CH14_pp729-794.indd 779
28/08/13 3:00 PM
Mean
Median
Mode
Quartiles
Q1 5
Q3 5
Sample 2
Q1 5
Q3 5
Sample 3
Q1 5
Q3 5
© Carnegie Learning
Sample 1
Mean
Absolute
Deviation
Population
101,081
18,395
None
Q1 5 6866
Q3 5 49,783
135,279
780 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 780
28/08/13 3:00 PM
3. How do the statistics of your three samples compare to the statistics for the entire
data set?
4. Use the additional space in the table to record the statistics for the samples from
other groups in your class.
5. Calculate the average of each statistic for all of the samples you and your
classmates gathered.
a. average mean
© Carnegie Learning
b. average median
c. average mode
d. average first quartile
14.5 Sample Populations • 781
462351_CL_C3_CH14_pp729-794.indd 781
28/08/13 3:00 PM
e. average third quartile
f. average mean absolute deviation
6. How do the averages of the statistics compare to the statistics for the entire data set?
7. When conducting a study, do you think it is better to use the mean of one random
sample, or to use the average of the means of multiple samples? Explain your
reasoning.
8. When conducting a study, do you think it is better to use the mean absolute deviation
of one random sample, or to use the average of the mean absolute deviations of
© Carnegie Learning
multiple samples? Explain your reasoning.
Be prepared to share your solutions and methods.
782 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 782
28/08/13 3:00 PM
© Carnegie Learning
ID #
County Name
2011 Population
ID #
County Name
2011 Population
1
Anderson County
58,308
36
Chambers County
35,552
2
Andrews County
15,445
37
Cherokee County
51,140
3
Angelina County
87,669
38
Childress County
6,984
4
Aransas County
23,374
39
Clay County
10,721
5
Archer County
8,842
40
Cochran County
3,109
6
Armstrong County
1,928
41
Coke County
3,305
7
Atascosa County
45,579
42
Coleman County
8,763
8
Austin County
28,665
43
Collin County
812,226
9
Bailey County
7,247
44
Collingsworth County
3,092
10
Bandera County
20,538
45
Colorado County
20,816
11
Bastrop County
75,115
46
Comal County
111,963
12
Baylor County
3,741
47
Comanche County
13,891
13
Bee County
32,095
48
Concho County
4,081
14
Bell County
315,196
49
Cooke County
38,396
15
Bexar County
1,756,153
50
Coryell County
76,508
16
Blanco County
10,600
51
Cottle County
1,499
17
Borden County
626
52
Crane County
4,383
18
Bosque County
18,306
53
Crockett County
3,722
19
Bowie County
92,793
54
Crosby County
6,092
20
Brazoria County
319,973
55
Culberson County
2,383
21
Brazos County
197,632
56
Dallam County
6,866
22
Brewster County
9,386
57
Dallas County
2,416,014
23
Briscoe County
1,651
58
Dawson County
13,751
24
Brooks County
7,222
59
Deaf Smith County
19,595
25
Brown County
38,186
60
Delta County
5,212
26
Burleson County
17,251
61
Denton County
686,406
27
Burnet County
43,117
62
DeWitt County
20,255
28
Caldwell County
38,442
63
Dickens County
2,402
29
Calhoun County
21,442
64
Dimmit County
10,118
30
Callahan County
13,515
65
Donley County
3,651
31
Cameron County
414,123
66
Duval County
11,669
32
Camp County
12,407
67
Eastland County
18,633
33
Carson County
6,259
68
Ector County
140,111
34
Cass County
30,256
69
Edwards County
1,966
35
Castro County
8,116
70
Ellis County
152,753
14.5 Sample Populations • 783
462351_CL_C3_CH14_pp729-794.indd 783
28/08/13 3:00 PM
County Name
2011 Population
ID #
County Name
2011 Population
71
El Paso County
820,790
106
Hemphill County
3,970
72
Erath County
38,266
107
Henderson County
78,826
73
Falls County
17,944
108
Hidalgo County
797,810
74
Fannin County
33,958
109
Hill County
35,392
75
Fayette County
24,732
110
Hockley County
22,892
76
Fisher County
3,914
111
Hood County
51,670
77
Floyd County
6,394
112
Hopkins County
35,371
78
Foard County
1,341
113
Houston County
23,484
79
Fort Bend County
606,953
114
Howard County
35,122
80
Franklin County
10,551
115
Hudspeth County
3,423
81
Freestone County
19,684
116
Hunt County
86,531
82
Frio County
17,400
117
Hutchinson County
22,132
83
Gaines County
18,003
118
Irion County
1,620
84
Galveston County
295,747
119
Jack County
9,035
85
Garza County
6,562
120
Jackson County
14,032
86
Gillespie County
25,114
121
Jasper County
36,296
87
Glasscock County
1,251
122
Jeff Davis County
2,288
88
Goliad County
7,243
123
Jefferson County
252,802
89
Gonzales County
19,904
124
Jim Hogg County
5,265
90
Gray County
22,755
125
Jim Wells County
41,339
91
Grayson County
121,419
126
Johnson County
152,734
92
Gregg County
123,081
127
Jones County
20,146
93
Grimes County
26,887
128
Karnes County
14,946
94
Guadalupe County
135,757
129
Kaufman County
105,358
95
Hale County
36,498
130
Kendall County
34,781
96
Hall County
3,377
131
Kenedy County
437
97
Hamilton County
8,472
132
Kent County
825
98
Hansford County
5,577
133
Kerr County
49,783
99
Hardeman County
4,176
134
Kimble County
4,632
100
Hardin County
55,246
135
King County
255
101
Harris County
4,180,894
136
Kinney County
3,630
102
Harrison County
66,296
137
Kleberg County
32,196
103
Hartley County
5,971
138
Knox County
3,758
104
Haskell County
5,989
139
Lamar County
50,074
105
Hays County
164,050
140
Lamb County
14,167
© Carnegie Learning
ID #
784 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 784
28/08/13 3:00 PM
© Carnegie Learning
ID #
County Name
2011 Population
ID #
County Name
2011 Population
141
Lampasas County
19,891
176
Newton County
14,454
142
La Salle County
7,001
177
Nolan County
15,269
143
Lavaca County
19,347
178
Nueces County
343,281
144
Lee County
16,666
179
Ochiltree County
10,530
145
Leon County
16,916
180
Oldham County
2,074
146
Liberty County
76,206
181
Orange County
82,487
147
Limestone County
23,634
182
Palo Pinto County
28,115
148
Lipscomb County
3,327
183
Panola County
24,058
149
Live Oak County
11,447
184
Parker County
118,376
150
Llano County
19,181
185
Parmer County
10,332
151
Loving County
94
186
Pecos County
15,716
152
Lubbock County
283,910
187
Polk County
45,725
153
Lynn County
5,888
188
Potter County
122,285
154
McCulloch County
8,290
189
Presidio County
7,761
155
McLennan County
238,564
190
Rains County
11,059
156
McMullen County
690
191
Randall County
123,351
157
Madison County
13,747
192
Reagan County
3,390
158
Marion County
10,507
193
Real County
3,426
159
Martin County
4,934
194
Red River County
12,703
160
Mason County
3,984
195
Reeves County
13,757
161
Matagorda County
36,809
196
Refugio County
7,291
162
Maverick County
55,405
197
Roberts County
816
163
Medina County
46,367
198
Robertson County
16,740
164
Menard County
2,264
199
Rockwall County
81,290
165
Midland County
140,308
200
Runnels County
10,524
166
Milam County
24,699
201
Rusk County
53,759
167
Mills County
4,848
202
Sabine County
10,740
168
Mitchell County
9,426
203 San Augustine County
8,874
169
Montague County
19,710
204
San Jacinto County
26,801
170
Montgomery County
471,734
205
San Patricio County
64,726
171
Moore County
21,954
206
San Saba County
6,023
172
Morris County
12,848
207
Schleicher County
3,309
173
Motley County
1,230
208
Scurry County
16,919
174
Nacogdoches County
65,466
209
Shackelford County
3,318
175
Navarro County
48,054
210
Shelby County
25,772
14.5 Sample Populations • 785
462351_CL_C3_CH14_pp729-794.indd 785
28/08/13 3:00 PM
County Name
2011 Population
ID #
County Name
2011 Population
211
Sherman County
3,040
246
Williamson County
442,782
212
Smith County
213,381
247
Wilson County
43,789
213
Somervell County
8,451
248
Winkler County
7,178
214
Starr County
61,715
249
Wise County
59,833
215
Stephens County
9,548
250
Wood County
42,164
216
Sterling County
1,158
251
Yoakum County
8,005
217
Stonewall County
1,472
252
Young County
18,484
218
Sutton County
4,007
253
Zapata County
14,282
219
Swisher County
7,801
254
Zavala County
11,849
220
Tarrant County
1,849,815
221
Taylor County
132,755
222
Terrell County
960
223
Terry County
12,675
224
Throckmorton County
1,609
225
Titus County
32,596
226
Tom Green County
111,823
227
Travis County
1,063,130
228
Trinity County
14,659
229
Tyler County
21,666
230
Upshur County
39,826
231
Upton County
3,346
232
Uvalde County
26,535
233
Val Verde County
49,106
234
Van Zandt County
52,776
235
Victoria County
87,545
236
Walker County
68,087
237
Waller County
44,013
238
Ward County
10,716
239
Washington County
33,791
240
Webb County
256,496
241
Wharton County
41,314
242
Wheeler County
5,465
243
Wichita County
130,698
244
Wilbarger County
13,404
245
Willacy County
22,095
© Carnegie Learning
ID #
786 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 786
28/08/13 3:00 PM
Chapter 14 Summary
Key Terms








 variable (14.1)
 two-variable data set (14.1)
 independent variable
(explanatory variable) (14.2)
 dependent variable
(response variable) (14.2)
 association (14.2)
 linear association (14.2)
cluster (14.2)
positive association (14.2)
negative association (14.2)
outlier (14.2)
measures of variation or variability (14.4)
deviation (14.4)
absolute deviation (14.4)
mean absolute deviation (14.4)
Creating and Interpreting Scatter Plots
A scatter plot is a graph of a set of ordered pairs. The ordered pairs are formed by using
the values of the first characteristic as the first, or x-coordinate, and the values of the
second characteristic as the second, or y-coordinate. By analyzing the data points, it can
be determined if there is a relationship between the two variables.
Example
The graph shown represents players’ height on a baseball team and the cap size
they wear.
Height vs. Baseball Cap Size
8
Baseball Cap Size
© Carnegie Learning
7½
7
6½
6
0
0
72
74
76
78
80
Height (in.)
It appears that as a player’s height becomes greater, the player’s cap size increases.
Chapter 14 Summary • 787
462351_CL_C3_CH14_pp729-794.indd 787
28/08/13 3:00 PM
In a linear plot, if one variable increases as the other variable increases, then the two
variables are said to have a positive association. If one variable increases as the other
variable decreases, then the two variables are said to have a negative association. An
outlier is a point that varies greatly from the overall pattern of the scatter plot.
Example
Fifteen students recently took a math test. The table lists the time each student spent
studying for the test and the student’s corresponding test score.
Minutes
Spent
30
Studying
Test
Score
0
40
10
60
0
120 100 90 60 20 100 80 30 90
80 50 75 65 90 70 100
95
55 80 70 100 85 70 90
Minutes Spent Studying vs. Test Score
100
90
Test Score
80
70
60
50
40
0
0
20
40
60
80
100
120
140
Minutes Spent Studying
The data are linear and have a positive association because the test scores increase, in
© Carnegie Learning
general, as the minutes spent studying increase. There is an outlier at the point (90, 55).
788 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 788
28/08/13 3:01 PM
Conducting an Experiment to Display and Interpret Data
Using a Scatter Plot
Data from an experiment can be collected in a table, but can be hard to interpret. The
dependent and independent variables from the experiment can be listed as ordered pairs
and plotted on a scatter plot. Then, patterns or associations can be determined and
interpreted from the scatter plot.
Example
Students set up an experiment to compare leg muscle fatigue to the ability to balance on
one leg. Two trials of the balancing (one on each leg) were recorded, starting with 0 lunges
being performed. Then the students performed 10 lunges and two more trials were
recorded. With each experiment, the students performed 10 more lunges. Then they
plotted the results on a scatter plot with the total number of lunges performed as the
independent variable and the average balance time as the dependent variable. A negative
linear association can be seen on the scatter plot. So, as the number of lunges performed
© Carnegie Learning
increases, the average time a student can balance on one leg decreases.
Chapter 14 Summary • 789
462351_CL_C3_CH14_pp729-794.indd 789
28/08/13 3:01 PM
Total Number of
Lunges Performed
Trial 1 Right Leg
(seconds)
Trial 2 Left Leg
(seconds)
Average Time
Balanced on
One Leg
(seconds)
0
120
110
115
10
72
75
73.5
20
58
55
56.5
30
47
45
46
40
35
37
36
50
29
26
27.5
60
20
18
19
y
Average Balance Time(s)
120
100
80
60
40
20
0
x
0
10
30
20
40
50
60
© Carnegie Learning
Number of Lunges
790 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 790
28/08/13 3:01 PM
Calculating the Deviation of Each Data Value
from the Mean of a Data Set
Measures of variation or variability describe the spread of the data values. Just as there
are several measures of central tendency, there are also several measures of variation. The
deviation of a data value indicates how far the data value is from the mean. To calculate
the deviation, subtract the mean from the data value.
Example
(78 1 83 1 90 1 85)
The mean is ___________________
​ 
  , 
 ​
or 84. So, the deviation of each data value from the
4
mean is determined by subtracting the mean from the data value, as shown in the table.
Deviation
from the
Mean
78
26
83
21
90
6
85
1
© Carnegie Learning
Test Score
Chapter 14 Summary • 791
462351_CL_C3_CH14_pp729-794.indd 791
28/08/13 3:01 PM
Calculating the Absolute Deviation of Each Data Value from the
Mean of a Data Set
The absolute value of each deviation is called the absolute deviation. To determine the
absolute deviation, calculate the deviation, and then determine the absolute value.
Example
(5 1 10 1 8 1 9)
The mean is ________________
​ 
  , 
 ​
or 8. So, the deviation of each data value from the mean is
4
determined by subtracting the mean from the data value. The absolute value of each
deviation is then calculated.
Calculation of
Deviation from
the Mean
Deviation
from the
Mean
Absolute
Deviation
5
5—8
23
3
10
10—8
2
2
8
8—8
0
0
9
9—8
1
1
© Carnegie Learning
Calls Made
Per Day
792 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 792
28/08/13 3:01 PM
Calculating and Interpreting the Mean Absolute Deviation
for a Data Set
In order to get an idea of the spread of the data values, you can take the absolute value of
each deviation, and then determine the mean of those absolute values.
Example
The number of apples harvested each year for the last 5 years has been 235, 201, 193,
(235 1 201 1 193 1 228 1 243)
   
  , or 220.
 ​
228, and 243. The mean is _____________________________
​ 
5
Apples
Harvested
Per Year
Deviation
from the
Mean
Absolute
Deviation
235
15
15
201
219
19
193
227
27
228
8
8
243
23
23
© Carnegie Learning
(15 1 19 1 27 1 8 1 23)
2 ​ . This means that the
The mean absolute deviation is _______________________
​ 
   
  , or 18​ __
 ​
5
5
annual apple harvest is typically within about 18 apples from a total of 220.
Chapter 14 Summary • 793
462351_CL_C3_CH14_pp729-794.indd 793
28/08/13 3:01 PM
Using a Random Sample to Represent a Whole Population
When dealing with a large amount of data, it is often impractical to analyze all the data
points. A sample is a subset of a larger data set which can be used to determine
characteristics of a whole group.
Example
A state in the U.S. has 67 counties, all with varied populations. You can use the randInt(
function on your graphing calculator to generate a random sample. Then, you can use the
sample to determine average statistics for the entire state.
County
Population
Absolute
Deviation from
the Mean
Adams
91,292
120,152.875
Butler
174,083
37,361.875
Dauphin
251,798
40,353.125
Huntingdon
45,586
165,858.875
Luzerne
319,250
107,805.125
Montgomery
750,097
538,652.125
Potter
18,080
193,364.875
Tioga
41,373
170,071.875
mean of sample population: 211,444.875
median of sample population: 132,687.5
mean of entire population: 181,212.1618
median of entire population: 90,366
mean absolute deviation of entire population: 162,548.439
The mean, median, and mean absolute deviation of the sample population are all slightly
larger than the mean, median, and mean absolute deviation of the entire population.
© Carnegie Learning
mean absolute deviation of sample population: 171,702.5938
However, the mean and mean absolute deviation are quite close. This means the random
sample is probably an accurate representation of the entire population.
794 • Chapter 14 Data Displays and Analysis
462351_CL_C3_CH14_pp729-794.indd 794
28/08/13 3:01 PM