Chapter 2 Descriptive Statistics

Copyright of the definitions and examples is reserved to Pearson Education, Inc. In order to use this PowerPoint presentation, the required textbook for the class is Fundamentals of Statistics: Informed Decisions Using Data, Michael Sullivan, III, fourth edition.

Los Angeles Mission College Prepared by DW

Ch 2.1 Organizing Qualitative Data

Objective A : Interpretation of a Basic Statistical Graph
Objective B : Construct a Frequency / Relative Frequency Distribution, Bar Graph, Pareto Chart and Pie Chart

Objective A : Interpretation of a Basic Statistical Graph

Example 1 : Identity Theft
Identity fraud occurs when someone else’s personal information is used to open credit card accounts, apply for a job, receive benefits, and so on. The following relative frequency bar graph represents the various types of identity theft based on a study conducted by the Federal Trade Commission.

(a) Approximately what percentage of identity theft was loan fraud (such as applying for a loan in someone else’s name)?
0.05 × 100% = 5%

(b) If there were 10 million cases of identity fraud in 2008, how many were credit card fraud (someone uses someone else’s credit card to make a purchase)?
0.26 × 10 million = 2.6 million

Objective B : Construct a Frequency / Relative Frequency Distribution, Bar Graph, Pareto Chart and Pie Chart

B1. Frequency / Relative Frequency Distribution

• A frequency distribution lists each category of data and its frequency, which is the number of occurrences of that category.
• A relative frequency distribution lists each category of data and its relative frequency, which is the proportion of observations within that category.

Relative frequency = frequency / (sum of all frequencies)

Example 1 : In a national survey conducted by the Centers for Disease Control to determine health-risk behaviors among college students, college students were asked, “How often do you wear a seat belt when driving a car?” The frequencies were as follows:

Response                Frequency
I do not drive a car    249
Never                   118
Rarely                  249
Sometimes               345
Most of the time        716
Always                  3093

(a) Construct a relative frequency distribution.

Response                Frequency (f)    Relative Frequency (f / Σf)
I do not drive a car    249              249 / 4770 = 0.052
Never                   118              118 / 4770 = 0.025
Rarely                  249              249 / 4770 = 0.052
Sometimes               345              345 / 4770 = 0.072
Most of the time        716              716 / 4770 = 0.150
Always                  3093             3093 / 4770 = 0.648
                        Σf = 4770

(b) What percentage of respondents answered “Always”?
64.8%

(c) What percentage of respondents answered “Never” or “Rarely”?
2.5% + 5.2% = 7.7%

(d) Suppose that a representative from the Centers for Disease Control says, “2.5% of the college students in this survey responded that they never wear a seat belt.” Is this a descriptive or inferential statement?
Descriptive statement.

B2. Construct a Bar Graph, a Pareto Chart, or a Pie Chart

• A bar graph is constructed by labeling each category of data on either the horizontal or vertical axis and the frequency or relative frequency of the category on the other axis. Rectangles of equal width are drawn for each category. The height of each rectangle represents the category’s frequency or relative frequency.
• A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.
• A pie chart is a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.

Example 2 : A sample of 40 randomly selected registered voters in Sylmar was asked their political affiliation: Democrat (D), Republican (R), Independent (I). The results of the survey are as follows:

R D R D R R D R D D
D D R R D R D D I D
D R R D D D I D R D
D D I R D R D D D R

(a) Construct a frequency distribution of the data.

Political Party    Frequency (f)
Republican         14
Democrat           23
Independent        3

(b) Construct a relative frequency distribution of the data.

Political Affiliation    Frequency (f)    Relative Frequency (f / Σf)
Republican               14               14 / 40 = 0.350
Democrat                 23               23 / 40 = 0.575
Independent              3                3 / 40 = 0.075
                         Σf = 40

(c) Construct a frequency bar graph.
[Bar graph: Frequency (0 to 25) by Party, bars in the order R, D, I]

(d) Construct a relative frequency bar graph.
[Bar graph: Relative Frequency (0 to 0.7) by Party, bars in the order R, D, I]

(e) Construct a Pareto chart.
[Pareto chart: Frequency (0 to 25) by Party, bars in the order D, R, I]

(f) Construct a pie chart.
[Pie chart: D 58%, R 35%, I 7%]

(g) Use StatCrunch to construct a pie chart.
1) Click the StatCrunch navigation button on the Course Home page.
2) Click StatCrunch website, then click Open StatCrunch.
3) Input the raw data in the Var 1 column.
4) Click Graph → Pie chart → With Data.
5) Click Var 1 under Select column(s):

The pie chart is obtained from StatCrunch.

For more detailed instructions, please download “Q2.1.24” by clicking the StatCrunch Handout navigation button of the course homepage.
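
The frequency and relative frequency distributions for the voter survey in Example 2 can also be checked with a few lines of code. The following is a minimal Python sketch and not part of the original slides: the helper name `relative_frequencies` and the variable names are illustrative assumptions, and only the standard library is used.

```python
from collections import Counter

# Raw survey responses from Example 2
# (D = Democrat, R = Republican, I = Independent)
responses = ("R D R D R R D R D D D D R R D R D D I D "
             "D R R D D D I D R D D D I R D R D D D R").split()


def relative_frequencies(data):
    """Return (frequency, relative frequency) dicts in Pareto order."""
    counts = Counter(data)
    total = sum(counts.values())
    # most_common() lists categories by decreasing frequency (Pareto order)
    ordered = counts.most_common()
    freq = dict(ordered)
    rel = {category: count / total for category, count in ordered}
    return freq, rel


freq, rel = relative_frequencies(responses)
for party in freq:
    print(f"{party}: f = {freq[party]:2d}, f / total = {rel[party]:.3f}")
```

Listing the categories by decreasing frequency gives the bar order used in a Pareto chart, and the relative frequencies are the proportions that determine the sector sizes of a pie chart.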
Ch 2.2 Organizing Quantitative Data : The Popular Displays

Objective A : Histogram
Objective B : Constructing a Stem-and-Leaf Plot
Objective C : Construct Frequency Distributions and Histogram for Continuous Data
Objective D : Time Series Graphs

Objective A : Histogram

A histogram is constructed by drawing rectangles for each class of data. If the discrete data set is small, each number is a class. If the discrete data set is large or the data are continuous, the classes must be created using intervals of numbers. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same and the rectangles touch each other.

Construct Frequency Distribution and Histogram for Discrete Data

Example 1 : The following data represent the number of customers waiting for a table at 6:00 p.m. for 40 consecutive Saturdays at Bobak’s Restaurant:

11  5 11  3  6  8  6  7  4  5
13  9  6  4 14 11 13 10  9  6
 8 10  9  5 10  8  7  3  8  8
 7  8  7  9 10  4  8  6 11  8

(a) Are these data discrete or continuous? Explain.
Discrete, because the data are counts and can take only whole-number values.

(b) Construct a frequency distribution of the data.

Number of customers waiting    Frequency (f)
3                              2
4                              3
5                              3
6                              5
7                              4
8                              8
9                              4
10                             4
11                             4
12                             0
13                             2
14                             1

(c) Construct a relative frequency distribution of the data.

Number of customers waiting    Frequency (f)    Relative Frequency (f / Σf)
3                              2                2 / 40 = 0.050
4                              3                3 / 40 = 0.075
5                              3                3 / 40 = 0.075
6                              5                5 / 40 = 0.125
7                              4                4 / 40 = 0.100
8                              8                8 / 40 = 0.200
9                              4                4 / 40 = 0.100
10                             4                4 / 40 = 0.100
11                             4                4 / 40 = 0.100
12                             0                0 / 40 = 0.000
13                             2                2 / 40 = 0.050
14                             1                1 / 40 = 0.025
                               Σf = 40

(d) What percentage of the Saturdays had 10 or more customers waiting for a table at 6:00 p.m.?
0.100 + 0.100 + 0.000 + 0.050 + 0.025 = 0.275 ≈ 0.3, that is, 27.5% of the Saturdays.

(e) Construct a frequency histogram of the data.
[Frequency histogram: Frequency (0 to 9) versus Number of Customers Waiting (1 to 14)]

Identify the shape of each distribution.
• Uniform distribution
• Bell-shaped curve
• Right-skewed distribution
• Left-skewed distribution

Objective B : Constructing a Stem-and-Leaf Plot

The stem of a data value will consist of the digits to the left of the rightmost digit. The leaf of a data value will be the rightmost digit.

Example 1 : The following data represent the number of miles per gallon achieved on the highway for small cars for the model year 2008.

27 31 28 30 52 25 33 33 29 23 27 37 30 45 24 32
34 35 31 44 42 26 43 35 36 36 54 33 32 35 34 37

Reorder:

23 24 25 26 27 27 28 29 30 30 31 31 32 32 33 33
33 34 34 35 35 35 36 36 37 37 42 43 44 45 52 54

(a) Construct a stem-and-leaf plot.

Stem | Leaf
2    | 3 4 5 6 7 7 8 9
3    | 0 0 1 1 2 2 3 3 3 4 4 5 5 5 6 6 7 7
4    | 2 3 4 5
5    | 2 4

(b) Describe the shape of the distribution.
Slightly skewed to the right.

Objective C : Construct Frequency Distributions and Histogram for Continuous Data

• Classes are categories in which data are grouped.
• The lower class limit is the smallest value within a class.
• The upper class limit is the largest value within a class.
• The class width is the difference between consecutive lower class limits.
• When the classes have to be created, a suitable class width is computed by the following formula.

Class width ≈ (largest data value - smallest data value) / number of classes
Round this value up to the same decimal place as the raw data.

Example 1 : The following data represent the fall 2006 student headcount enrollments for all public community colleges in the state of Illinois.

(a) Find the number of classes.
6

(b) Find the class limits.
Lower class limits: 0, 5,000, 10,000, 15,000, 20,000, 25,000
Upper class limits: 4,999, 9,999, 14,999, 19,999, 24,999, 29,999

(c) Find the class width.
Class width = 5,000 - 0 = 5,000

Example 2 : Uninsured Rates
The following data represent the percentage of people without health insurance for the 50 states and the District of Columbia in 2009. (Ch 2.2 Q36 p. 94) Use a first class with a lower class limit of 4 and a class width of 2.

(a) Construct a frequency distribution.
Put the data in ascending order first (each pair of parentheses groups the values that fall in one class):

(4.2)
(8.6 9.2 9.6 9.6 9.7)
(10.2 10.5 10.6 10.6 10.9 10.9 11.3 11.4 11.6)
(12.3 12.6 13.0 13.3 13.4 13.9)
(14.0 14.3 14.7 14.8 15.5 15.9 15.9)
(16.1 16.1 16.1 16.2 17.8)
(18.1 18.3 18.3 18.4 18.4 18.6 18.7 18.9 19.4 19.6 19.7)
(20.6 21.1 21.2 21.3 21.4)
(22.2)
(25.0)

Class Limit    Frequency (f)
4 - 5.9        1
6 - 7.9        0
8 - 9.9        5
10 - 11.9      9
12 - 13.9      6
14 - 15.9      7
16 - 17.9      5
18 - 19.9      11
20 - 21.9      5
22 - 23.9      1
24 - 25.9      1

(b) Construct a relative frequency distribution.
Class Limit    Frequency (f)    Relative Frequency (f / Σf)
4 - 5.9        1                1 / 51 = 0.0196
6 - 7.9        0                0 / 51 = 0
8 - 9.9        5                5 / 51 = 0.0980
10 - 11.9      9                9 / 51 = 0.1765
12 - 13.9      6                6 / 51 = 0.1176
14 - 15.9      7                7 / 51 = 0.1373
16 - 17.9      5                5 / 51 = 0.0980
18 - 19.9      11               11 / 51 = 0.2157
20 - 21.9      5                5 / 51 = 0.0980
22 - 23.9      1                1 / 51 = 0.0196
24 - 25.9      1                1 / 51 = 0.0196
               Σf = 51

(c) Construct a frequency histogram of the data.

(d) Construct a relative frequency histogram of the data.

(e) Describe the shape of the distribution.
Skewed to the right.

(f) Use StatCrunch to repeat parts (a) to (e) with the first class having a lower class limit of 4 and a class width of 4.

Note: For quantitative data, we can use StatCrunch to construct a frequency histogram first, then from the histogram we obtain the frequency distribution.

Construct a frequency histogram first (part (c)).

Step 1:
1) Log in to StatCrunch, open Data Sets from your textbook, click Chapter 2, then click 2.2.36.
2) Click Graph → Histogram.

Step 2:
1) Click Uninsured Rates under Select Column(s):
2) Choose Frequency under Type:
3) Under Bins: enter 4 for Start at: and enter 4 for Width:
4) Under Display option: check Value above bar.
5) Under Graph Properties: enter Uninsured Rates for X-axis label, Frequency for Y-axis label, and Frequency Histogram for Uninsured Rates for Title.
6) Click Compute!

The frequency histogram is obtained from StatCrunch.

Construct a frequency distribution.
From the frequency histogram, the classes of uninsured rates can be determined. With a class width of 4, the classes and their frequencies are:

Class Limit    Frequency (f)
4 - 7.9        1
8 - 11.9       14
12 - 15.9      13
16 - 19.9      16
20 - 23.9      6
24 - 27.9      1

For more detailed instructions, please download “Q2.R.6” by clicking the StatCrunch Handout navigation button of the course homepage.
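
The class frequencies for the uninsured rates can be verified outside StatCrunch as well. The sketch below is an illustration rather than part of the original slides: the helper `class_frequencies` and its parameter names are hypothetical, and it assumes every data value is at or above the first lower class limit.

```python
# Uninsured rates (percent) for the 50 states and D.C. in 2009,
# copied from the sorted list in Example 2
rates = [
    4.2,
    8.6, 9.2, 9.6, 9.6, 9.7,
    10.2, 10.5, 10.6, 10.6, 10.9, 10.9, 11.3, 11.4, 11.6,
    12.3, 12.6, 13.0, 13.3, 13.4, 13.9,
    14.0, 14.3, 14.7, 14.8, 15.5, 15.9, 15.9,
    16.1, 16.1, 16.1, 16.2, 17.8,
    18.1, 18.3, 18.3, 18.4, 18.4, 18.6, 18.7, 18.9, 19.4, 19.6, 19.7,
    20.6, 21.1, 21.2, 21.3, 21.4,
    22.2,
    25.0,
]


def class_frequencies(data, start, width):
    """Count the values in each class [lower limit, lower limit + width)."""
    n_classes = int((max(data) - start) // width) + 1
    counts = [0] * n_classes
    for x in data:
        # Index of the class whose lower limit is start + index * width
        counts[int((x - start) // width)] += 1
    # Pair each lower class limit with its frequency
    return [(start + i * width, counts[i]) for i in range(n_classes)]


for width in (2, 4):
    print(f"Class width {width}:")
    for lower, f in class_frequencies(rates, start=4, width=width):
        # Data are recorded to one decimal place, so the upper limit
        # is 0.1 below the next lower class limit
        print(f"  {lower} - {lower + width - 0.1:.1f}: {f}")
```

Running the helper with a width of 2 and then a width of 4 shows how merging adjacent classes changes the frequency distribution while the total count stays at 51.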
Example 3 : The largest value of a data set is 125 and the smallest value of the data set is 27. If six classes are to be formed, calculate an appropriate class width.

Class width = Roundup((largest data value - smallest data value) / number of classes)
            = Roundup((125 - 27) / 6)
            = Roundup(16.3333)
            = 17

Objective D : Time Series Graphs

A time series graph represents the values of a variable that have been collected over a specified period of time. The horizontal axis is the time and the vertical axis is the value of the variable. Line segments are drawn connecting consecutive points in time and the corresponding values of the variable.

Example 1 : The following time-series graph shows the annual U.S. motor vehicle production from 1990 through 2008.

(a) Estimate the number of motor vehicles produced in the United States in 1991.
8900 thousand vehicles.

(b) Estimate the number of motor vehicles produced in the United States in 1999.
13000 thousand vehicles.

(c) Use the results from (a) and (b) to estimate the percent increase in the number of motor vehicles produced from 1991 to 1999.
(Amount of increase / Original) × 100 = ((13000 - 8900) / 8900) × 100 ≈ 46.1%

(d) Estimate the percent decrease in the number of motor vehicles produced from 1999 to 2008.
(Amount of decrease / Original) × 100 = ((8800 - 13000) / 13000) × 100 ≈ -32.3%, that is, a decrease of about 32.3%.

Ch 2.3 Graphical Misrepresentations of Data

The most common graphical misrepresentation of data is accomplished through manipulation of the scale of the graph.

Example 1 : Union Membership
The following relative frequency histogram represents the proportion of employed people aged 25 to 64 years old who were members of a union.

(a) Describe how this graph is misleading. What might a reader conclude from the graph?
The vertical axis starts at 0.08 instead of 0. Readers may think the proportion of those employed aged 45 to 54 years who are union members is much higher than for those aged 35 to 44 years.

(b) Redraw the histogram with a starting point of zero on the vertical axis so that it is not misleading.
[Redrawn histogram titled Union Membership: Proportion Employed (0 to 0.18) versus Age (25 to 65)]

Example 2 : Inauguration Cost
The following is a USA Today-type graph. Explain how it is misleading.
The lengths of the bars are not proportional. For example, the bar representing the cost of Clinton’s inauguration should be slightly more than 9 times as long as the one for Carter’s cost and twice as long as the bar representing Reagan’s cost.