Data Displays and Analysis © Carnegie Learning Whether you're a Panther, or a Knight, or a Tiger, it stands to reason that your school has a mascot and nickname to represent your school. Many times, logos are used on sweatshirts, banners, and even face paint to promote school spirit. 14.1 School Spirit and Scatter Plots Using Scatter Plots to Display and Analyze Two-Variable Relationships........................................... 731 14.2Jump In! The Water’s Fine! Interpreting Patterns in Scatter Plots........................... 741 14.3 How Fast Are Your Nerve Impulses? Connecting Tables and Scatter Plots for Collected Data........................................................ 751 14.4 Picking a Player Calculating and Interpreting the Mean Absolute Deviation.................................................................759 14.5 Texas-Sized Loving Sample Populations................................................................773 729 462351_CL_C3_CH14_pp729-794.indd 729 28/08/13 3:00 PM © Carnegie Learning 730 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 730 28/08/13 3:00 PM SchoolTItle Spiritgoes and Lesson Scatter Plots here Using Scatter Plots to Display Lesson Subtitle and Analyze Two-Variable Relationships Learning Goals Key Terms In this lesson, you will: variable two-variable data set Define the meaning of two-variable data. Collect and record two-variable data. Construct and interpret a scatter plot. Determine if a change in the value of one variable results in a change in the value of the second variable. Identify patterns in a scatter plot. C hances are that your school has a nickname. Perhaps your school may even have a mascot dress in the school colors or in a costume to promote school spirit. This spirit is especially seen during competition between other schools. School names and mascots can range from very common names like the Auburn Tigers in Georgia, to the sometimes rare names like the Banana Slugs of the University of California at Santa Cruz. So, what’s your school mascot? Does your © Carnegie Learning school mascot get your school spirit juices flowing? 14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 731 462351_CL_C3_CH14_pp729-794.indd 731 28/08/13 3:00 PM Problem 1 School Spirit, Sweatpants, Sweatshirts, and Scatter Plots The School Spirit Club plans to sell sweatpants and sweatshirt sets with the school’s logo. The club is determining if there is a way to package sweatshirt and sweatpants sets so that most of the students can buy a set that will fit. 1. Do you think there might be a relationship between the sweatpant size and the sweatshirt size a person would buy? Why or why not? When collecting information about a person or thing, the specific characteristic of the information gathered can be called a variable. Previously, you have seen variables in mathematics refer to a letter or symbol to represent a number. In this case, a variable can refer to any characteristic that can change, or vary. 2. Name a variable that can affect a sweatshirt size. 3. Do you think collecting information about one sweatshirt characteristic is enough to The School Spirit Club decides to collect students’ heights and arm spans. They hope that collecting this information can determine if there is a relationship between sweatshirt size and sweatpant size. Collecting information about two separate characteristics for the © Carnegie Learning determine which shirt sizes should be paired with which pant sizes? same person or thing can be called a two-variable data set. 4. Why is it important to record each student’s height and arm span? 732 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 732 28/08/13 3:00 PM © Carnegie Learning 5. Analyze the results the School Spirit Club collected. Name Height (cm) Arm Span (cm) Donna 141 143 Edward 152 157 Betina 150 151 Thomas 162 159 Aaron 176 172 Julianna 155 153 Amy 156 159 Larry 162 164 Carlos 167 162 Jermont 163 168 Shayla 154 154 Mary 161 157 Latisha 164 167 Blake 172 180 Genifer 154 152 Ella 161 167 Felix 167 168 Ordered Pair Can you determine whether there is a relationship between height and arm span, and sweatshirt size and sweatpant size? Why or why not? 14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 733 462351_CL_C3_CH14_pp729-794.indd 733 28/08/13 3:00 PM One way to show the relationship between two different characteristics is to create a graph that can represent the two variables. As you learned previously, a scatter plot is a graph in the coordinate plane of a collection of two-variable data plotted as ordered pairs. The ordered pairs are formed by using the values of the first characteristic as the x-coordinate, and using the values of the second characteristic as the y-coordinate. 6. Write the ordered pairs for the data in the table. 7. Complete the scatter plot shown by plotting the data points that the School Spirit Club collected. Height vs. Arm Span Arm Span (cm) 180 170 160 150 140 0 0 140 150 160 170 180 Height (cm) 8. What patterns do you notice in the scatter plot? Do you connect the points in this type of graph? 9. Circle the point (150, 151) on the scatter plot. © Carnegie Learning Then, explain what this point means. 734 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 734 28/08/13 3:00 PM 10. Suppose two points on the scatter plot are (162, 164) and (164, 162). a. Grace says that the two points are in different locations, but Jorge disagrees. He thinks they are both the same point. Who is correct? Explain your reasoning. © Carnegie Learning b. Describe the locations of the two points on the scatter plot relative to each other. 14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 735 462351_CL_C3_CH14_pp729-794.indd 735 28/08/13 3:00 PM Problem 2 Video Games and Math Grades Ms. Liu is trying to determine if there is a relationship between the math grade percentage of her students, and the time spent playing video games. Ms. Liu constructed the scatter plot for the number of hours her math students spent playing video games per weeknight (Monday through Thursday), and their grade percentage in her math class. Math Grades and Time Spent Playing Video Games 100 Math Grades (percent) 90 80 70 60 50 0 0 1 2 3 4 5 6 7 8 Time Playing Video Games (hours) 1. Circle the point (3.5, 80) on the scatter plot. Explain the meaning of the point. © Carnegie Learning 2. Describe any patterns you see in Ms. Liu’s scatter plot. 736 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 736 28/08/13 3:00 PM 3. Complete the table with the data from Ms. Liu’s scatter plot. Math Grades (percent) © Carnegie Learning Time spent playing video games (hours) 4. Using the table, identify the student who earned a 76% in math class. How many hours did that student spend playing video games? 14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 737 462351_CL_C3_CH14_pp729-794.indd 737 28/08/13 3:00 PM Problem 3 Creating a Scatter Plot with a Graphing Calculator You can use a graphing calculator to create a scatter plot. Follow the steps to create a scatter plot for the height and arm span data the School Spirit Club collected. Begin by entering the data set into List 1 and List 2. Step 1: Press STAT . The EDIT and 1: EDIT should be highlighted. Press ENTER . To clear data in a list, move the cursor up to L1 and press the Clear button. Enter the data points for the students’ height and arm span. It is critical that the x-coordinates in List 1 and the y-coordinates in List 2 match the ordered pairs from your table. L1 L2 157 161 167 164 180 172 152 154 167 161 168 167 ----L2(18) L1 2:SortA( L3 2 This table shows the last six data entries. The School Spirit Club collected information from 17 students. Step 2: Press Y5 and delete any equations that have been entered. Press 2nd followed by Y5 . Make certain that only Plot 1 is turned on; all other plots are should be turned off. Press ENTER . Use the arrow keys to highlight the scatter plot, and press ENTER . Plot1 Plot2 Plot3 On Off Xlist:L1 Ylist:L2 Mark: + © Carnegie Learning Type: 738 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 738 28/08/13 3:00 PM Step 3: Make certain Xlist: displays L1 and Ylist: displays L2. There are two ways to have the calculator display the graph. a. Press WINDOW and enter the values for Xmin and Xmax. Then, press GRAPH . b. Press ZOOM followed by 9 . 1. Would a student who is 145 centimeters tall be more likely to have an arm span of 150 centimeters or 180 centimeters? Why? 2. Would a student who is 170 centimeters tall be more likely to have an arm span of © Carnegie Learning 150 centimeters or 180 centimeters? Why? 14.1 Using Scatter Plots to Display and Analyze Two-Variable Relationships • 739 462351_CL_C3_CH14_pp729-794.indd 739 28/08/13 3:00 PM Talk the Talk At the beginning of the lesson, the School Spirit Club wanted to see if there was a way to determine if there might be a relationship between the sweatpant size and the sweatshirt size a person could buy together. 1. What conclusions did you reach about the relationship between a student’s height and arm span from the data the School Spirit Club collected? 2. How could your conclusion help the School Spirit Club decide how to package their © Carnegie Learning sets of sweatpants and sweatshirts? Be prepared to share your solutions and methods. 740 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 740 28/08/13 3:00 PM Jump In! The Water’s Fine! Interpreting Patterns in Scatter Plots Learning Goals Key Terms In this lesson, you will: Interpret patterns in a scatter plot. Determine if a pattern in a scatter plot has a linear relationship. Identify potential outliers in a scatter plot. independent variable (explanatory variable) dependent variable (response variable) association linear association cluster positive association negative association outlier F or many East Coast residents visiting sunny Southern California for the first time, a trip to a white sandy beach and warm water is usually on the agenda. However, the ocean temperature in the Pacific Ocean along California’s coast is part of a cold current. This means that the average ocean temperature rarely breaks 70ºF! The world’s oceans and seas fall into one of two categories: cold currents and warm currents. Generally, the warmest bodies of water are found near the equator while many of the cold currents are closer to the earth’s poles. © Carnegie Learning Why does it seem that warm currents are near the equator while cold currents seem to be closer to the North and South Poles? 14.2 Interpreting Patterns in Scatter Plots • 741 462351_CL_C3_CH14_pp729-794.indd 741 28/08/13 3:00 PM Problem 1 Ocean Temperatures As you learned previously, a two-data variable set is a data set in which two separate characteristics are measured for a person or a thing. Sometimes, the two-data variable set is designated as two different variables: the independent variable and the The independent variable can also be called the explanatory variable. The dependent variable can also be called the response variable because this is the variable that responds to what occurs to the explanatory variable. dependent variable. The independent variable is the variable whose value is not determined by another variable. Generally, the independent variable is represented by the x-coordinate. The dependent variable is the variable whose value is determined by an independent variable. Generally, this dependent variable is represented by the y-coordinate. 1. Erica, who is an oceanographer, is measuring the temperature of the ocean at different depths. Her results are listed in the table. Depth (m) 100 200 300 400 500 600 700 800 900 Temperature (ºF) 76 73 70 66 61 56 52 48 43 a. Identify the independent and dependent variables in Erica’s data table. b. Create a scatter plot using the data Erica gathered for the ocean temperatures at different depths. Ocean Temperature at Different Depths 80 72 68 64 © Carnegie Learning Temperature (oF) 76 60 56 52 48 44 40 0 0 100 200 300 400 500 600 700 800 900 1000 Depth of Water (meters) 742 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 742 28/08/13 3:00 PM 2. Explain the meaning of the point (400, 66). 3. What relationship does the scatter plot show between the depth of the ocean water and the temperature of the water? As you have experienced, scatter plots can be great tools to identify patterns in a two-variable data set. Sometimes, these patterns or relationships are called associations. One common pattern that exists in data is when the points on a scatter plot form a linear association. A linear association occurs when the points on the scatter plot seem to form a line. In most cases the points will not form a perfect line, but they are clustered. When data values are clustered, the data values are arranged in such a way that as you look at the graph from left to right, you can imagine a line going through the scatter plot with most of the points being clustered close to the line. 4. Explain how there seems to be a linear association between the depth of the ocean water and the water temperature. If the two variables have a linear association, you can then determine the type of association between two variables. You can typically look for a pattern in the dependent variable on the y-axis as the independent variable on the x-axis increases from left to right. The two variables have a positive association if, as the independent variable increases, the dependent variable also increases. If the dependent variable decreases as the © Carnegie Learning independent variable increases, then the two variables have a negative association. Once you identify the pattern for two variables with a linear relationship, you can state the association between the two variables. 14.2 Interpreting Patterns in Scatter Plots • 743 462351_CL_C3_CH14_pp729-794.indd 743 28/08/13 3:00 PM 5. Describe the type of association that exists between the depth of the ocean water and the water temperature. State the association in terms of the variables. I see! So, with the height and arm span data, there seems to be a positive association because as the height increased, the arm span increased as well. 6. Analyze each scatter plot shown. Then, identify the following for each: ● Identify the independent and dependent variables. ● Determine whether the scatter plots show a linear association. ● If there is a linear relationship, determine whether it has a positive association or a negative association. ● State the association in terms of the variables. Height and Weight of Soccer Players Height © Carnegie Learning Weight a. 744 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 744 28/08/13 3:00 PM Wait Time and Movie Attendance Movie Attendance b. Time Waiting in Line c. Student Grades in Homeroom 6 100 Grade in Art 90 80 70 © Carnegie Learning 60 50 50 60 70 80 Grade in Math 90 100 14.2 Interpreting Patterns in Scatter Plots • 745 462351_CL_C3_CH14_pp729-794.indd 745 28/08/13 3:00 PM d. Degrees in Celsius Temperature in Fahrenheit and Celsius Degrees in Fahrenheit e. Time © Carnegie Learning Height Height of Ball Tossed into Air 746 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 746 28/08/13 3:00 PM Problem 2 Other Patterns in Scatter Plots Another pattern that can occur in a scatter plot is an outlier. An outlier for two-variable data is a point that varies greatly from the overall pattern of the data. 1. The scatter plot shows the fat and calories in 11 different foods. Fat and Calories in Various Foods 650 Calories 550 450 350 250 0 0 10 15 Fat (grams) 20 25 a. Determine the independent and dependent variables in the two-variable data set. b. Does there appear to be a linear association between the fat and calories © Carnegie Learning of the foods? 14.2 Interpreting Patterns in Scatter Plots • 747 462351_CL_C3_CH14_pp729-794.indd 747 28/08/13 3:00 PM 2. Complete a table of values for the points displayed in the scatter plot, beginning with the item with the lowest amount of fat. Fat (g) Calories So, an outlier is like a point that doesn't belong. around any outliers in the scatter plot and identify the outlier in the table. 4. Explain why the point (25, 300) is a potential outlier. © Carnegie Learning 3. Do any of the points appear to vary greatly from the other points? If so, place a star 5. Examine the values in the table. How can you determine that (25, 300) is a possible outlier? 748 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 748 28/08/13 3:00 PM 6. Place your finger on top of the point (25, 300) and examine the scatter plot. What do you notice? Talk the Talk 1. Explain the difference between a positive association and a negative association of a two-variable data set. 2. Explain how you can identify an outlier in a two-variable data set. Do the data need to © Carnegie Learning have a linear association? Be prepared to share your solutions and methods. 14.2 Interpreting Patterns in Scatter Plots • 749 462351_CL_C3_CH14_pp729-794.indd 749 28/08/13 3:00 PM © Carnegie Learning 750 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 750 28/08/13 3:00 PM How Fast are Your Nerve Impulses? Connecting Tables and Scatter Plots for Collected Data Learning Goals In this lesson, you will: Conduct an experiment and collect the data. Connect tables and scatter plots for collected data. Interpret collected data displayed on a scatter plot and in a table. Y ou’ve done it before. Even though your server told you your plate was hot, your hand accidentally touches a side and—ouch! Just as quickly as your skin touches the plate, your muscles quickly move your hand away from the plate thanks to the central nervous system! One of the quickest communication systems resides in your body. The central nervous system is responsible for communicating information between your brain, spinal cord, and other tissues, coordinating movements of the muscles and other body parts. It regulates breathing, your heartbeat, and other bodily functions that you don’t even think © Carnegie Learning about. So, just how fast can the central nervous system work? 14.3 Connecting Tables and Scatter Plots for Collected Data • 751 462351_CL_C3_CH14_pp729-794.indd 751 28/08/13 3:00 PM Problem 1 Human Wrist Chain Your class is going to explore the speed of nerve impulses in the body by performing an experiment that involves a human chain. In this experiment, a group of students forms a circle with each person gently holding the wrist of the person to his or her right. One student begins the chain, and another student ends the chain. Also, one student in the group must be the timekeeper. The group members must keep their eyes closed. When the experiment begins, the timekeeper says, “Go,” and the first student carefully, but quickly squeezes the student’s wrist to their right, then this next student squeezes a wrist, and so on. After the last student’s wrist is squeezed, he or she says “Stop,” and releases the next student’s wrist. The amount of time from when the word go is spoken until the word stop is spoken (the amount of time it takes to complete the chain) is recorded by the timekeeper. 1. Conduct the experiment 10 times, using a different number of students in the chain each time. For each number of students in the chain, run 3 trials of the experiment. Record the data for the experiment in the table shown. Then, calculate the mean time for each row and record the result in the last column of the table. Round your average times to the nearest tenth. Trial 1 Trial 2 Trial 3 Average time (seconds) © Carnegie Learning Chain length (number of students) 752 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 752 28/08/13 3:00 PM 2. Write the ordered pairs from the table with the number of students in the chain as the independent variable and the mean time as the dependent variable. 3. Create a scatter plot of the ordered pairs on the coordinate plane. Human Wrist Chain 10.0 9.0 8.0 Mean Time (sec) 7.0 6.0 5.0 4.0 3.0 2.0 1.0 0.0 0 5 10 15 20 Chain Length (number of students) 25 30 35 4. Does there appear to be a linear association between the chain length and mean time? Explain your reasoning. If so, identify the linear association as either a positive © Carnegie Learning or negative association. 14.3 Connecting Tables and Scatter Plots for Collected Data • 753 462351_CL_C3_CH14_pp729-794.indd 753 28/08/13 3:00 PM 5. State the association between the two variables. 6. On the scatter plot, identify the point representing the longest chain. Then, identify the values of the point in the table. Explain how you identified the point and values. 7. On the scatter plot, identify the point representing the least mean time. Then, identify © Carnegie Learning the values of the point in the table. Explain how you identified the point and values. 754 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 754 28/08/13 3:00 PM Problem 2 Video Games and Math Grades Revisited The scatter plot and table of values represent the mean time spent playing video games and grades for Ms. Liu’s math students. Math Grades and Time Spent Playing Video Games 100 Average Time Spent Playing Video Games Math Grades 0 98 1 100 2 90 60 2.5 90 50 3 76 3 88 3.5 80 4 78 4.5 72 5 68 5 70 5 72 5.5 65 6 70 7 55 Math Grades (percent) 90 80 70 0 © Carnegie Learning 0 1 2 3 4 5 6 7 Mean Time Playing Video Games (hours) 8 14.3 Connecting Tables and Scatter Plots for Collected Data • 755 462351_CL_C3_CH14_pp729-794.indd 755 28/08/13 3:00 PM 1. Place a star around the point on the scatter plot representing the student with the greatest grade percentage. Explain how you identified the point. 2. Identify the student with the greatest grade percentage in the table. Explain how you identified the student. 3. Write the ordered pair for the student with the highest grade. Then, explain the meaning of each of the coordinates. 4. Place a star around the point on the scatter plot representing the student who spent 5. Identify the student with the greatest mean time playing video games in the table. Explain how you identified the student. © Carnegie Learning the most mean time playing video games. Explain how you identified the point. 756 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 756 28/08/13 3:00 PM 6. Write the ordered pair for the student with the greatest mean time playing video games. Then, explain the meaning of each of the coordinates. 7. Write the ordered pairs for two students who earned the same grade in math. What grade did the students receive, and how many mean hours did each student spend playing video games? 8. Write the ordered pairs for three students who had the same mean time playing video games. What grade did these students receive, and on average how many hours did each student spend playing video games? © Carnegie Learning 9. Martha is puzzled by the scatter plot. There are only 15 points on the plot, but there are 20 students in the class. Explain to Martha how all 20 students could be represented in the scatter plot. 14.3 Connecting Tables and Scatter Plots for Collected Data • 757 462351_CL_C3_CH14_pp729-794.indd 757 28/08/13 3:00 PM Talk the Talk 1. Does every data set in a scatter plot always have a linear association? 2. What must be true for the data points in a two-data variable set to have a positive or © Carnegie Learning negative association? Be prepared to share your solutions and methods. 758 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 758 28/08/13 3:00 PM Picking a Player Calculating and Interpreting the Mean Absolute Deviation Learning Goals Key Terms In this lesson, you will: measures of variation or variability Calculate the deviations of each data value from the mean of a data set. Calculate the absolute deviations of each data value from the mean of a data set. deviation absolute deviation mean absolute deviation Calculate and interpret the mean absolute deviation for a data set. F antasy sports has become very popular in the last 25 years. In fantasy sports, participants take the role of manager and coach by drafting players, trading players, and determining who should start in upcoming games. Weekly, and even daily, serious participants scour over player statistics to determine which player might score the most points for the team. These statistics are based on what each player does in a real-life game. Why do you think there is an appeal for common people to want to be a “fantasy” coach and manager? Do you think © Carnegie Learning fantasy sports leagues are popular in other countries too? 14.4 Calculating and Interpreting the Mean Absolute Deviation • 759 462351_CL_C3_CH14_pp729-794.indd 759 28/08/13 3:00 PM Problem 1 On to the Championship—But Who Should Start? Previously, you examined the points scored by three players on the Olive Street Middle School girls’ basketball team. Suppose the team won the game and will advance to the district championship. Coach Harris needs to choose between Tamika and Lynn as starters for the game. In the past six games, Tamika scored 11, 11, 6, 26, 6, and 12 points. In the past six games, Lynn scored 15, 12, 13, 10, 9, and 13 points. The dot plots for each player’s scoring over the past six games played are shown. Number of Points Scored by Tamika X X 5 6 X X X 7 8 X 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 Mean 5 12 Number of Points Scored by Lynn X X 5 6 7 8 X X X X 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 Mean 5 12 1. Based on the dot plots, which player do you think What does it mean for both players to have the same mean? Does it matter who Coach Harris puts in the game? When analyzing a data set, measures of center give you an idea of where the data is centered, or what a typical data value might be. There is another measure that can help you analyze data. Measures of variation or variability describe the spread of the data values. Just as there are several measures of central tendency, there are also several measures of variation. © Carnegie Learning Coach Harris should choose? The deviation of a data value indicates how far that data value is from the mean. To calculate the deviation, subtract the mean from the data value: deviation 5 data value 2 mean 760 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 760 28/08/13 3:00 PM 2. Calculate the deviations for the points scored for each player using the equation. Record your results in the tables. Tamika’s deviation is completed for you. Tamika Oh, I see! 1 1 is 1 less than the mean of 12, and 26 is 14 more than the mean of 12. Lynn Points Scored Describe the Deviation from the Mean Points Scored 11 1 less than the mean 15 11 1 less than the mean 12 6 6 less than the mean 13 26 14 more than the mean 10 6 6 less than the mean 9 12 12 is the mean 13 Describe the Deviation from the Mean 3. What is the meaning if a deviation is less than the mean? Is more than the mean? © Carnegie Learning Is equivalent to the mean? 4. What do you notice about the deviations for each player? 14.4 Calculating and Interpreting the Mean Absolute Deviation • 761 462351_CL_C3_CH14_pp729-794.indd 761 28/08/13 3:00 PM 5. Carly claims the sum of the deviations for a data set will always be 0. Do you agree? Why or why not? The sum of all the distances less than the mean are equivalent to all the distances greater than the mean. Because the mean is the balance point, the sum of data points on either side of the balance point are equivalent to each other. In order to get an idea of the spread of the data values, you can take the absolute value of each deviation, and then determine the mean of those absolute values. The absolute value of each deviation is called the absolute deviation. The mean absolute deviation is the average or mean of the absolute deviations. 6. Determine the absolute deviation for the points scored for each player. Record your answers in the tables. Lynn Tamika 11 11–12 Absolute Deviation 21 1 Calculation of Deviation Points Deviation from the Scored from the Mean Mean Absolute Deviation 15 11 12 6 13 26 10 6 9 12 13 © Carnegie Learning Calculation of Deviation Points Deviation from the Scored from the Mean Mean 762 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 762 28/08/13 3:00 PM 7. Calculate the mean absolute deviation for the points scored by each player. 8. What does the mean absolute deviation tell you about the points scored by each player? 9. If you were Coach Harris, which player would you choose to play in the championship game? Explain how you determined your decision. Problem 2 Determining the Absolute Deviation of Two Data Sets Let’s examine the two data sets and their dot plots. Set 1: 50, 50, 55, 60, 60 X X XXX © Carnegie Learning 20 30 40 50 Set 2: 25, 50, 55, 60, 60 60 Mean 5 55 X XXX X 20 30 40 50 60 Mean 5 50 1. Which data set do you think will have the greater mean absolute deviation? Explain your reasoning. 14.4 Calculating and Interpreting the Mean Absolute Deviation • 763 462351_CL_C3_CH14_pp729-794.indd 763 28/08/13 3:00 PM 2. Calculate the mean absolute deviation for each data set. Data Value Describe the Deviation from the Mean Absolute Deviation Set 2 (mean = 50) Data Value 50 25 50 50 55 55 60 60 60 60 3. Interpret each of the mean absolute deviations: Describe the Deviation from the Mean Absolute Deviation © Carnegie Learning Set 1 (mean = 55) 764 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 764 28/08/13 3:00 PM Problem 3 Using Technology to Calculate Mean Absolute Deviation You can use a graphing calculator to determine the mean absolute deviation of a data set. For this example, you will use a graphing calculator to determine the mean absolute deviation for the exercise data of the students at Sharpe Middle School. Begin by entering the data set into List 1 of the calculator. The steps to enter data values into the graphing calculator are shown. Step 1: Press STAT , EDIT and 1: Edit should be highlighted. Then, press ENTER . Step 2:Enter the data into L1. Remember, it is not necessary to enter the data in numerical order. Press ENTER after you enter each data value. Remember, these data points represent the time in minutes that students exercised during one week. 0 40 60 30 60 10 45 30 300 90 30 120 60 0 20 Step 3:Obtain the value of the mean. ● Press STAT and scroll left to CALC . The 1: 1-Var Stats should be © Carnegie Learning highlighted. ● Press ENTER . The screen should display 1-Var Stats. 14.4 Calculating and Interpreting the Mean Absolute Deviation • 765 462351_CL_C3_CH14_pp729-794.indd 765 28/08/13 3:00 PM ● The default setting is List 1 (L1),so you can press ENTER . You should get the screen shown. ● __ The x represents the mean of the data set. The mean number of minutes the students exercised was approximately 59.67 minutes. Step 4:Calculate the deviations from the mean. ● Return to the lists by pressing STAT , EDIT , 1: Edit, and ENTER . ● Use the arrow keys to highlight L2. ● Type the following formula in L2: 2nd L1 59.67. Press ENTER . In some calculators, L1 is found on the 1 key. ● The deviations for each data value should appear in List 2. Step 5:Calculate the absolute deviations from the mean. ● ● Use the arrows to highlight L3. Type the following formula in L3: 2nd CATALOG .The CATALOG key is sometimes located on the 0 key. The abs( should be the first entry in the catalog. Press ENTER . abs( should appear after the = sign at the bottom of the screen. L2 and ) . Press ENTER . ● Type 2nd ● The absolute deviations for each data value should appear in List 3. © Carnegie Learning ● 766 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 766 28/08/13 3:00 PM Step 6:Calculate the mean absolute deviation. ● Press STAT and scroll to CALC . The 1: 1-Var Stats should be highlighted. ● Press ENTER . The screen should display 1-Var Stats. ● Type 2nd L3 . The L3 button is sometimes located on the 3 key. Press ENTER . In this case, x represents the mean absolute deviation as 44.27 minutes. You can also use a spreadsheet to determine the mean absolute deviation of a data set. Determine the mean absolute deviation for the exercise data of the students at Sharpe Middle School. The data values are shown. 0 minutes 10 minutes 30 minutes 40 minutes 45 minutes 120 minutes 60 minutes 30 minutes 60 minutes 30 minutes 300 minutes 0 minutes 60 minutes 90 minutes 20 minutes Step 1: Open a computer spreadsheet and enter the first value in cell A1. Press ENTER © Carnegie Learning and continue entering the data values, pressing ENTER after each entry. 14.4 Calculating and Interpreting the Mean Absolute Deviation • 767 462351_CL_C3_CH14_pp729-794.indd 767 28/08/13 3:00 PM Step 2: Click in cell A17. On the menu bar select Formulas , then select More Functions , and then Statistical , and finally AVERAGE . Click on AVERAGE. This will open a dialog box that is titled AVERAGE. In the space labeled Number 1, enter A1:A15. This tells the computer to determine the average of the numbers in cells A1 through A15. Click on OK . Cell A17 should now show the value 59.66667. In cell B1 type the following: = A1-$A$17. This tells the computer to take the value in A1 and subtract the mean. The value –59.6667 should appear in cell B1. © Carnegie Learning Step 3:Determine the deviations from the mean. 768 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 768 28/08/13 3:00 PM Step 4: ● Click in cell B1 and drag to highlight cells B2 through B15. ● Make certain the “Home” tab is selected and click on the icon of an arrow that represents filling shaded cells. Highlight “Down.” This will copy the formula you wrote in cell B1 to the remaining cells. ● Click on “Down.” The deviations from the mean should now be in cells B2 © Carnegie Learning through B15. 14.4 Calculating and Interpreting the Mean Absolute Deviation • 769 462351_CL_C3_CH14_pp729-794.indd 769 28/08/13 3:00 PM Step 5: Click in cell C1. You will now take the absolute value of the deviations. ● Type 5 ABS(B1) and press Enter. The absolute value of 259.667 should appear in cell C1. ● Follow Step 4 to find the absolute deviations for the remaining cells in column C. Step 6: Click in cell C17. Follow Step 2 to calculate the mean absolute deviation. The mean absolute deviation for the exercise data is 44.26667. © Carnegie Learning ● 770 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 770 28/08/13 3:00 PM Talk the Talk 1. Use your graphing calculator to complete the table shown. You will determine the deviation from the mean and the absolute deviation of a data set. Height of 6th-grade basketball players in inches: 68, 64, 60, 58, 62, 65, 64, 60, 61, 65 Data Value (L1) Deviation from the Mean (L2) Absolute Deviation (L3) 68 64 60 58 62 65 64 60 © Carnegie Learning 61 65 Be prepared to share your solutions and methods. 14.4 Calculating and Interpreting the Mean Absolute Deviation • 771 462351_CL_C3_CH14_pp729-794.indd 771 28/08/13 3:00 PM © Carnegie Learning 772 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 772 28/08/13 3:00 PM Texas-Sized Loving Sample Populations Learning Goal In this lesson, you will: Determine that a random sample has characteristics that are representative of the whole population. Y ou may not have known this, but the large state of Texas is home to the “smallest” county in the whole country. Loving County covers 677 square miles, but is home to only slightly more than 80 people! The 2000 census reported just 31 total households, with only 13 total people under the age of 18. The school-aged residents attend school outside of Loving, though, because in 1972 the district had to consolidate with a neighboring county after enrollment dropped to 2 pupils! How’s that for a student-teacher ratio?! In many respects the county is far from struggling, though. The per capita income of its residents is one of the highest in the whole nation. In fact, the 2010 census reported it as the only county in the nation without a single resident living below the poverty level. Loving is defined by much more than its small size, though. It was named after the pioneer of the cattle drive. Ranches still exist in the county, but the economy © Carnegie Learning hinges almost entirely on oil. Loving’s storied history includes being the first county in Texas to elect a female sheriff. Edna Dewees, elected sheriff in the 1940’s, never carried a firearm and ended her illustrious career with a whopping two arrests. Every county has interesting people and events in its history. What are some interesting facts about the county you live in? 14.5 Sample Populations • 773 462351_CL_C3_CH14_pp729-794.indd 773 28/08/13 3:00 PM Problem 1 Sample Population In the United States, the Constitution mandates that a census be taken every ten years. A census is an official count or survey of the population, in which detailed demographic information is collected. Populations at the city, state, and county levels are documented annually. Census data is used by the government to determine the number of representatives in government. It can also serve as a guide for spending and taxation. Outside of government, social services can use census data as a guide for their programs, and companies can use census data for marketing purposes. The population for each county in the state of Texas for 2011 is shown in the tables at the end of this lesson. 1. Analyze the Texas county population data for 2011. a. List several characteristics of the data. Recall that a sample is a subset of a larger data set. It is often impractical to analyze all data points when gathering information. Statisticians can reliably determine characteristics of a whole group by analyzing samples of the whole population. © Carnegie Learning b. Does the data contain any outliers? Explain your reasoning. 774 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 774 28/08/13 3:00 PM 2. Why is it important to choose a sample group randomly? 3. Micah and Sergio are studying the Texas county population data. They each randomly chose the populations of the counties shown to conduct their study. Micah Sergio Angelina County . . . 87,669 Wise County . . . 59,833 Wichita County . . . 130,698 Bowie County . . . 92,793 Walker County . . . 68,087 Victoria County . . . 87,545 Coryell County . . . 76,508 Tom Green County . . . 111,823 Taylor County . . . 132,755 Ector County . . . 140,111 Starr County . . . 61,715 Rockwall County . . . 81,290 Grayson County . . . 121,419 Randall County . . . 123,351 Potter County . . . 122,285 Parker County . . . 118,376 Orange County . . . 82,487 Midland County . . . 140,308 Liberty County . . . 76,206 Hunt County . . . 86,531 Comal County . . . 111,963 Each student says that his own sample is more likely to accurately represent the population data. Without performing any calculations, who do you think is correct? © Carnegie Learning Explain your reasoning. 14.5 Sample Populations • 775 462351_CL_C3_CH14_pp729-794.indd 775 28/08/13 3:00 PM 4. Choose a sample of ten counties randomly. List the counties and their populations in the first two columns of the table. County Population (people) Absolute Deviation from the Mean Remember, you can use the randInt( function on your calculator to generate random numbers. 5. Calculate each measure of central tendency for your sample. a. Mean c. Mode © Carnegie Learning b. Median 776 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 776 28/08/13 3:00 PM 6. Calculate each quartile for your sample. a. First quartile b. Third quartile 7. Create a box-and-whisker plot to represent your sample. 8. Determine each county’s absolute deviation from the mean. Write your answers in the third column of the table in Question 4. © Carnegie Learning 9. Calculate the mean absolute deviation of your sample. 14.5 Sample Populations • 777 462351_CL_C3_CH14_pp729-794.indd 777 28/08/13 3:00 PM The mean population of a Texas county in the year 2011 Notice the jagged line on the box-and-whisker plot. This indicates a break on the number line, because the range of values is so large. is 101,081 people, while the median is 18,395. The first quartile is 6866, and the third quartile is 49,783. A box-and-whisker plot of the population data for all counties in Texas is shown. Q1 = 6866 Q2 = 18,395 Q3 = 49,783 min = 94 0 max = 4,180,894. 5000 10,000 15,000 20,000 25,000 30,000 35,000 40,000 45,000 50,000 4,180,864 The average absolute deviation for the whole population is 135,279. 10. Compare your sample statistics to the statistics for the whole state of Texas. Do you think your sample accurately reflects the characteristics of the whole population? © Carnegie Learning Explain your reasoning. 778 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 778 28/08/13 3:00 PM Problem 2 More Samples 1. Choose two more different samples of ten counties randomly. Then, calculate the mean of each sample and use it to calculate the absolute deviations. Write your answers in the table. Sample 2 County Population Sample 3 Absolute Deviation from the Mean County Population Absolute Deviation from the Mean 2. Calculate each statistic in the table on the following page for Samples 2 and 3. Also, © Carnegie Learning record each statistic for Sample 1 from the previous problem in this same table. 14.5 Sample Populations • 779 462351_CL_C3_CH14_pp729-794.indd 779 28/08/13 3:00 PM Mean Median Mode Quartiles Q1 5 Q3 5 Sample 2 Q1 5 Q3 5 Sample 3 Q1 5 Q3 5 © Carnegie Learning Sample 1 Mean Absolute Deviation Population 101,081 18,395 None Q1 5 6866 Q3 5 49,783 135,279 780 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 780 28/08/13 3:00 PM 3. How do the statistics of your three samples compare to the statistics for the entire data set? 4. Use the additional space in the table to record the statistics for the samples from other groups in your class. 5. Calculate the average of each statistic for all of the samples you and your classmates gathered. a. average mean © Carnegie Learning b. average median c. average mode d. average first quartile 14.5 Sample Populations • 781 462351_CL_C3_CH14_pp729-794.indd 781 28/08/13 3:00 PM e. average third quartile f. average mean absolute deviation 6. How do the averages of the statistics compare to the statistics for the entire data set? 7. When conducting a study, do you think it is better to use the mean of one random sample, or to use the average of the means of multiple samples? Explain your reasoning. 8. When conducting a study, do you think it is better to use the mean absolute deviation of one random sample, or to use the average of the mean absolute deviations of © Carnegie Learning multiple samples? Explain your reasoning. Be prepared to share your solutions and methods. 782 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 782 28/08/13 3:00 PM © Carnegie Learning ID # County Name 2011 Population ID # County Name 2011 Population 1 Anderson County 58,308 36 Chambers County 35,552 2 Andrews County 15,445 37 Cherokee County 51,140 3 Angelina County 87,669 38 Childress County 6,984 4 Aransas County 23,374 39 Clay County 10,721 5 Archer County 8,842 40 Cochran County 3,109 6 Armstrong County 1,928 41 Coke County 3,305 7 Atascosa County 45,579 42 Coleman County 8,763 8 Austin County 28,665 43 Collin County 812,226 9 Bailey County 7,247 44 Collingsworth County 3,092 10 Bandera County 20,538 45 Colorado County 20,816 11 Bastrop County 75,115 46 Comal County 111,963 12 Baylor County 3,741 47 Comanche County 13,891 13 Bee County 32,095 48 Concho County 4,081 14 Bell County 315,196 49 Cooke County 38,396 15 Bexar County 1,756,153 50 Coryell County 76,508 16 Blanco County 10,600 51 Cottle County 1,499 17 Borden County 626 52 Crane County 4,383 18 Bosque County 18,306 53 Crockett County 3,722 19 Bowie County 92,793 54 Crosby County 6,092 20 Brazoria County 319,973 55 Culberson County 2,383 21 Brazos County 197,632 56 Dallam County 6,866 22 Brewster County 9,386 57 Dallas County 2,416,014 23 Briscoe County 1,651 58 Dawson County 13,751 24 Brooks County 7,222 59 Deaf Smith County 19,595 25 Brown County 38,186 60 Delta County 5,212 26 Burleson County 17,251 61 Denton County 686,406 27 Burnet County 43,117 62 DeWitt County 20,255 28 Caldwell County 38,442 63 Dickens County 2,402 29 Calhoun County 21,442 64 Dimmit County 10,118 30 Callahan County 13,515 65 Donley County 3,651 31 Cameron County 414,123 66 Duval County 11,669 32 Camp County 12,407 67 Eastland County 18,633 33 Carson County 6,259 68 Ector County 140,111 34 Cass County 30,256 69 Edwards County 1,966 35 Castro County 8,116 70 Ellis County 152,753 14.5 Sample Populations • 783 462351_CL_C3_CH14_pp729-794.indd 783 28/08/13 3:00 PM County Name 2011 Population ID # County Name 2011 Population 71 El Paso County 820,790 106 Hemphill County 3,970 72 Erath County 38,266 107 Henderson County 78,826 73 Falls County 17,944 108 Hidalgo County 797,810 74 Fannin County 33,958 109 Hill County 35,392 75 Fayette County 24,732 110 Hockley County 22,892 76 Fisher County 3,914 111 Hood County 51,670 77 Floyd County 6,394 112 Hopkins County 35,371 78 Foard County 1,341 113 Houston County 23,484 79 Fort Bend County 606,953 114 Howard County 35,122 80 Franklin County 10,551 115 Hudspeth County 3,423 81 Freestone County 19,684 116 Hunt County 86,531 82 Frio County 17,400 117 Hutchinson County 22,132 83 Gaines County 18,003 118 Irion County 1,620 84 Galveston County 295,747 119 Jack County 9,035 85 Garza County 6,562 120 Jackson County 14,032 86 Gillespie County 25,114 121 Jasper County 36,296 87 Glasscock County 1,251 122 Jeff Davis County 2,288 88 Goliad County 7,243 123 Jefferson County 252,802 89 Gonzales County 19,904 124 Jim Hogg County 5,265 90 Gray County 22,755 125 Jim Wells County 41,339 91 Grayson County 121,419 126 Johnson County 152,734 92 Gregg County 123,081 127 Jones County 20,146 93 Grimes County 26,887 128 Karnes County 14,946 94 Guadalupe County 135,757 129 Kaufman County 105,358 95 Hale County 36,498 130 Kendall County 34,781 96 Hall County 3,377 131 Kenedy County 437 97 Hamilton County 8,472 132 Kent County 825 98 Hansford County 5,577 133 Kerr County 49,783 99 Hardeman County 4,176 134 Kimble County 4,632 100 Hardin County 55,246 135 King County 255 101 Harris County 4,180,894 136 Kinney County 3,630 102 Harrison County 66,296 137 Kleberg County 32,196 103 Hartley County 5,971 138 Knox County 3,758 104 Haskell County 5,989 139 Lamar County 50,074 105 Hays County 164,050 140 Lamb County 14,167 © Carnegie Learning ID # 784 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 784 28/08/13 3:00 PM © Carnegie Learning ID # County Name 2011 Population ID # County Name 2011 Population 141 Lampasas County 19,891 176 Newton County 14,454 142 La Salle County 7,001 177 Nolan County 15,269 143 Lavaca County 19,347 178 Nueces County 343,281 144 Lee County 16,666 179 Ochiltree County 10,530 145 Leon County 16,916 180 Oldham County 2,074 146 Liberty County 76,206 181 Orange County 82,487 147 Limestone County 23,634 182 Palo Pinto County 28,115 148 Lipscomb County 3,327 183 Panola County 24,058 149 Live Oak County 11,447 184 Parker County 118,376 150 Llano County 19,181 185 Parmer County 10,332 151 Loving County 94 186 Pecos County 15,716 152 Lubbock County 283,910 187 Polk County 45,725 153 Lynn County 5,888 188 Potter County 122,285 154 McCulloch County 8,290 189 Presidio County 7,761 155 McLennan County 238,564 190 Rains County 11,059 156 McMullen County 690 191 Randall County 123,351 157 Madison County 13,747 192 Reagan County 3,390 158 Marion County 10,507 193 Real County 3,426 159 Martin County 4,934 194 Red River County 12,703 160 Mason County 3,984 195 Reeves County 13,757 161 Matagorda County 36,809 196 Refugio County 7,291 162 Maverick County 55,405 197 Roberts County 816 163 Medina County 46,367 198 Robertson County 16,740 164 Menard County 2,264 199 Rockwall County 81,290 165 Midland County 140,308 200 Runnels County 10,524 166 Milam County 24,699 201 Rusk County 53,759 167 Mills County 4,848 202 Sabine County 10,740 168 Mitchell County 9,426 203 San Augustine County 8,874 169 Montague County 19,710 204 San Jacinto County 26,801 170 Montgomery County 471,734 205 San Patricio County 64,726 171 Moore County 21,954 206 San Saba County 6,023 172 Morris County 12,848 207 Schleicher County 3,309 173 Motley County 1,230 208 Scurry County 16,919 174 Nacogdoches County 65,466 209 Shackelford County 3,318 175 Navarro County 48,054 210 Shelby County 25,772 14.5 Sample Populations • 785 462351_CL_C3_CH14_pp729-794.indd 785 28/08/13 3:00 PM County Name 2011 Population ID # County Name 2011 Population 211 Sherman County 3,040 246 Williamson County 442,782 212 Smith County 213,381 247 Wilson County 43,789 213 Somervell County 8,451 248 Winkler County 7,178 214 Starr County 61,715 249 Wise County 59,833 215 Stephens County 9,548 250 Wood County 42,164 216 Sterling County 1,158 251 Yoakum County 8,005 217 Stonewall County 1,472 252 Young County 18,484 218 Sutton County 4,007 253 Zapata County 14,282 219 Swisher County 7,801 254 Zavala County 11,849 220 Tarrant County 1,849,815 221 Taylor County 132,755 222 Terrell County 960 223 Terry County 12,675 224 Throckmorton County 1,609 225 Titus County 32,596 226 Tom Green County 111,823 227 Travis County 1,063,130 228 Trinity County 14,659 229 Tyler County 21,666 230 Upshur County 39,826 231 Upton County 3,346 232 Uvalde County 26,535 233 Val Verde County 49,106 234 Van Zandt County 52,776 235 Victoria County 87,545 236 Walker County 68,087 237 Waller County 44,013 238 Ward County 10,716 239 Washington County 33,791 240 Webb County 256,496 241 Wharton County 41,314 242 Wheeler County 5,465 243 Wichita County 130,698 244 Wilbarger County 13,404 245 Willacy County 22,095 © Carnegie Learning ID # 786 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 786 28/08/13 3:00 PM Chapter 14 Summary Key Terms variable (14.1) two-variable data set (14.1) independent variable (explanatory variable) (14.2) dependent variable (response variable) (14.2) association (14.2) linear association (14.2) cluster (14.2) positive association (14.2) negative association (14.2) outlier (14.2) measures of variation or variability (14.4) deviation (14.4) absolute deviation (14.4) mean absolute deviation (14.4) Creating and Interpreting Scatter Plots A scatter plot is a graph of a set of ordered pairs. The ordered pairs are formed by using the values of the first characteristic as the first, or x-coordinate, and the values of the second characteristic as the second, or y-coordinate. By analyzing the data points, it can be determined if there is a relationship between the two variables. Example The graph shown represents players’ height on a baseball team and the cap size they wear. Height vs. Baseball Cap Size 8 Baseball Cap Size © Carnegie Learning 7½ 7 6½ 6 0 0 72 74 76 78 80 Height (in.) It appears that as a player’s height becomes greater, the player’s cap size increases. Chapter 14 Summary • 787 462351_CL_C3_CH14_pp729-794.indd 787 28/08/13 3:00 PM In a linear plot, if one variable increases as the other variable increases, then the two variables are said to have a positive association. If one variable increases as the other variable decreases, then the two variables are said to have a negative association. An outlier is a point that varies greatly from the overall pattern of the scatter plot. Example Fifteen students recently took a math test. The table lists the time each student spent studying for the test and the student’s corresponding test score. Minutes Spent 30 Studying Test Score 0 40 10 60 0 120 100 90 60 20 100 80 30 90 80 50 75 65 90 70 100 95 55 80 70 100 85 70 90 Minutes Spent Studying vs. Test Score 100 90 Test Score 80 70 60 50 40 0 0 20 40 60 80 100 120 140 Minutes Spent Studying The data are linear and have a positive association because the test scores increase, in © Carnegie Learning general, as the minutes spent studying increase. There is an outlier at the point (90, 55). 788 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 788 28/08/13 3:01 PM Conducting an Experiment to Display and Interpret Data Using a Scatter Plot Data from an experiment can be collected in a table, but can be hard to interpret. The dependent and independent variables from the experiment can be listed as ordered pairs and plotted on a scatter plot. Then, patterns or associations can be determined and interpreted from the scatter plot. Example Students set up an experiment to compare leg muscle fatigue to the ability to balance on one leg. Two trials of the balancing (one on each leg) were recorded, starting with 0 lunges being performed. Then the students performed 10 lunges and two more trials were recorded. With each experiment, the students performed 10 more lunges. Then they plotted the results on a scatter plot with the total number of lunges performed as the independent variable and the average balance time as the dependent variable. A negative linear association can be seen on the scatter plot. So, as the number of lunges performed © Carnegie Learning increases, the average time a student can balance on one leg decreases. Chapter 14 Summary • 789 462351_CL_C3_CH14_pp729-794.indd 789 28/08/13 3:01 PM Total Number of Lunges Performed Trial 1 Right Leg (seconds) Trial 2 Left Leg (seconds) Average Time Balanced on One Leg (seconds) 0 120 110 115 10 72 75 73.5 20 58 55 56.5 30 47 45 46 40 35 37 36 50 29 26 27.5 60 20 18 19 y Average Balance Time(s) 120 100 80 60 40 20 0 x 0 10 30 20 40 50 60 © Carnegie Learning Number of Lunges 790 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 790 28/08/13 3:01 PM Calculating the Deviation of Each Data Value from the Mean of a Data Set Measures of variation or variability describe the spread of the data values. Just as there are several measures of central tendency, there are also several measures of variation. The deviation of a data value indicates how far the data value is from the mean. To calculate the deviation, subtract the mean from the data value. Example (78 1 83 1 90 1 85) The mean is ___________________ , or 84. So, the deviation of each data value from the 4 mean is determined by subtracting the mean from the data value, as shown in the table. Deviation from the Mean 78 26 83 21 90 6 85 1 © Carnegie Learning Test Score Chapter 14 Summary • 791 462351_CL_C3_CH14_pp729-794.indd 791 28/08/13 3:01 PM Calculating the Absolute Deviation of Each Data Value from the Mean of a Data Set The absolute value of each deviation is called the absolute deviation. To determine the absolute deviation, calculate the deviation, and then determine the absolute value. Example (5 1 10 1 8 1 9) The mean is ________________ , or 8. So, the deviation of each data value from the mean is 4 determined by subtracting the mean from the data value. The absolute value of each deviation is then calculated. Calculation of Deviation from the Mean Deviation from the Mean Absolute Deviation 5 5—8 23 3 10 10—8 2 2 8 8—8 0 0 9 9—8 1 1 © Carnegie Learning Calls Made Per Day 792 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 792 28/08/13 3:01 PM Calculating and Interpreting the Mean Absolute Deviation for a Data Set In order to get an idea of the spread of the data values, you can take the absolute value of each deviation, and then determine the mean of those absolute values. Example The number of apples harvested each year for the last 5 years has been 235, 201, 193, (235 1 201 1 193 1 228 1 243) , or 220. 228, and 243. The mean is _____________________________ 5 Apples Harvested Per Year Deviation from the Mean Absolute Deviation 235 15 15 201 219 19 193 227 27 228 8 8 243 23 23 © Carnegie Learning (15 1 19 1 27 1 8 1 23) 2 . This means that the The mean absolute deviation is _______________________ , or 18 __ 5 5 annual apple harvest is typically within about 18 apples from a total of 220. Chapter 14 Summary • 793 462351_CL_C3_CH14_pp729-794.indd 793 28/08/13 3:01 PM Using a Random Sample to Represent a Whole Population When dealing with a large amount of data, it is often impractical to analyze all the data points. A sample is a subset of a larger data set which can be used to determine characteristics of a whole group. Example A state in the U.S. has 67 counties, all with varied populations. You can use the randInt( function on your graphing calculator to generate a random sample. Then, you can use the sample to determine average statistics for the entire state. County Population Absolute Deviation from the Mean Adams 91,292 120,152.875 Butler 174,083 37,361.875 Dauphin 251,798 40,353.125 Huntingdon 45,586 165,858.875 Luzerne 319,250 107,805.125 Montgomery 750,097 538,652.125 Potter 18,080 193,364.875 Tioga 41,373 170,071.875 mean of sample population: 211,444.875 median of sample population: 132,687.5 mean of entire population: 181,212.1618 median of entire population: 90,366 mean absolute deviation of entire population: 162,548.439 The mean, median, and mean absolute deviation of the sample population are all slightly larger than the mean, median, and mean absolute deviation of the entire population. © Carnegie Learning mean absolute deviation of sample population: 171,702.5938 However, the mean and mean absolute deviation are quite close. This means the random sample is probably an accurate representation of the entire population. 794 • Chapter 14 Data Displays and Analysis 462351_CL_C3_CH14_pp729-794.indd 794 28/08/13 3:01 PM
© Copyright 2024