OPRE504 Chapter Study Guide
Chaodong Han
Chapter 12: Compare Two Groups

I. Two-Sample t-Test

We assume that the two groups are independent of each other and may have different variances.

1. State Hypotheses:
   H0: $\mu_1 - \mu_2 = \Delta_0$
   Ha: $\mu_1 - \mu_2 \neq \Delta_0$ (two-tailed), $\mu_1 - \mu_2 > \Delta_0$ (one-tail upper), or $\mu_1 - \mu_2 < \Delta_0$ (one-tail lower)
2. Calculate Standard Error of Mean Difference:
   $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$,
   where $s_1$ = standard deviation of Sample 1, $n_1$ = size of Sample 1, $s_2$ = standard deviation of Sample 2, $n_2$ = size of Sample 2.
3. Determine Adjusted Degrees of Freedom:
   $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$
   [Note: $\min(n_1 - 1,\, n_2 - 1) \le df \le n_1 + n_2 - 2$]
4. Determine Critical Value ($t^*$) according to the degrees of freedom and significance level
5. Calculate t-statistic:
   $t = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE(\bar{y}_1 - \bar{y}_2)}$
6. Decision:
   Reject H0 when $t > t^*$ or $t < -t^*$ (that is, $|t| > t^*$) for two-tailed test
   Reject H0 when $t > t^*$ for one-tail upper test
   Reject H0 when $t < -t^*$ for one-tail lower test

Q12.1 [Sharpe 2011, Ch.10, E.25] In an investigation of environmental causes of disease, data were collected on the annual mortality rate (deaths per 100,000) for males in 61 large towns in England and Wales. In addition, the towns are classified into two groups: north and south of Derby. Is there a significant difference in mortality rates between the two regions at the 5% significance level? Here are the summary statistics:

Mortality   Count   Mean      Median   Standard Deviation
North       34      1631.59   1631     138.470
South       27      1388.85   1369     151.114

1. H0: $\mu_1 - \mu_2 = 0$; H1: $\mu_1 - \mu_2 \neq 0$ (two-tailed test), with group 1 = North and group 2 = South
2. $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{138.470^2}{34} + \frac{151.114^2}{27}} = 37.546$
3. $df = \frac{\left(\frac{138.470^2}{34} + \frac{151.114^2}{27}\right)^2}{\frac{1}{33}\left(\frac{138.470^2}{34}\right)^2 + \frac{1}{26}\left(\frac{151.114^2}{27}\right)^2} = 53.49$; alpha = 5%, two-tailed, $t^*$ = 2.005 (estimated) [since $t^*$ = 2.009 at df = 50 and 2.000 at df = 60 in the T-Table]; TINV(.05, 53.49) = 2.0057
4. $t = \frac{(\bar{y}_1 - \bar{y}_2) - 0}{SE(\bar{y}_1 - \bar{y}_2)} = \frac{1631.59 - 1388.85}{37.546} = 6.465$
5. $t > t^*$, reject H0.

In DDXL: Hypothesis Tests, 2 Var t Test.

More exercises: Chapter 12, Exercises 23, 24, and 26

II. Confidence Interval for the Difference Between Two Group Means

Two-Sample t-Interval

We assume that the two groups are independent of each other and may have different variances.
Step 1: Calculate the standard error of the mean difference:
   $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$,
   where $s_1$ = standard deviation of Sample 1, $n_1$ = size of Sample 1, $s_2$ = standard deviation of Sample 2, $n_2$ = size of Sample 2.
Step 2: Calculate adjusted degrees of freedom:
   $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$
Step 3: Find the critical value $t^*$ according to the confidence level and adjusted degrees of freedom (T-Table A-34 in Appendix C)
Step 4: $CI = (\bar{y}_1 - \bar{y}_2) \pm t^* \times SE(\bar{y}_1 - \bar{y}_2)$

Q12.2 [Sharpe 2011, Ch.12, Ex.4, p.386] A chain that specializes in healthy and organic food would like to compare the sales performance of two of its primary stores in the state of Maryland. These stores are both in urban, residential areas with similar demographics. A comparison of weekly sales, randomly sampled over a period of nearly two years for these two stores, yields the following information:

Store #   N   Mean     Standard Deviation   Min      Median   Max
1         9   242170   23937                211225   232901   292381
2         9   235338   29690                187475   232070   287838

a) Create a 95% confidence interval for the difference in mean weekly sales between the two stores.
   $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{23937^2}{9} + \frac{29690^2}{9}} = 12712.53$
   $df = \frac{\left(\frac{23937^2}{9} + \frac{29690^2}{9}\right)^2}{\frac{1}{8}\left(\frac{23937^2}{9}\right)^2 + \frac{1}{8}\left(\frac{29690^2}{9}\right)^2} = 15.3$ (note: 8 < df < 18 - 2)
   $t^*$ = 2.131 (based on the T-Table for df = 15, CI = 95%) [TINV(.05, 15.3) = 2.1315]
   $CI = (\bar{y}_1 - \bar{y}_2) \pm t^* \times SE(\bar{y}_1 - \bar{y}_2)$ = (242,170 - 235,338) ± 2.1315 × 12,712.53 = 6,832 ± 27,096 = (-20,264, 33,928)
   (The interval uses the unrounded TINV value 2.1315; with the table value 2.131 the margin is 27,090.)
b) How do you interpret the CI in context?
   We are 95% confident that mean weekly sales at Store 1 are between $20,264 lower and $33,928 higher than mean weekly sales at Store 2.
c) Can you tell that one store sells more on weekly average than the other store?
   No. Since the CI includes zero, which indicates no difference, we can't tell.
d) Calculate the Margin of Error.
   $ME = t^* \times SE(\bar{y}_1 - \bar{y}_2)$ = 2.1315 × 12,712.53 = $27,096
e) Calculate a 99% confidence interval for the difference in mean weekly sales.
   $t^*$ = 2.947 (based on the T-Table for df = 15, CI = 99%)
   $CI = (\bar{y}_1 - \bar{y}_2) \pm t^* \times SE(\bar{y}_1 - \bar{y}_2)$ = (242,170 - 235,338) ± 2.947 × 12,712.53 = 6,832 ± 37,464 = (-30,632, 44,296)

More exercises: Credit Card Spending, Guided Example, p.365.
Chapter 12, Exercises 20, 22, 23, 39, 49, 50, 51

III. Pooled Samples

Pooled t-Test

We assume that the two groups are independent of each other and have the same variance, at least when the null hypothesis is true.

1. State Hypotheses:
   H0: $\mu_1 - \mu_2 = \Delta_0$
   Ha: $\mu_1 - \mu_2 \neq \Delta_0$ (two-tailed), $\mu_1 - \mu_2 > \Delta_0$ (one-tail upper), or $\mu_1 - \mu_2 < \Delta_0$ (one-tail lower)
2. Calculate Standard Error of Mean Difference:
   $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$,
   where $s_{pooled} = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}}$, $n_1$ = size of Sample 1, $n_2$ = size of Sample 2.
3. Determine Degrees of Freedom:
   $df = n_1 + n_2 - 2$ (a slightly higher df than in the two-sample t-test without equal variances)
4. Determine Critical Value ($t^*$) according to the degrees of freedom and significance level
5. Calculate t-statistic:
   $t = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$
6. Decision:
   Reject H0 when $t > t^*$ or $t < -t^*$ (that is, $|t| > t^*$) for two-tailed test
   Reject H0 when $t > t^*$ for one-tail upper test
   Reject H0 when $t < -t^*$ for one-tail lower test

Q12.3 We want to know whether people offer a different amount for a used camera when buying from a friend than when buying from a stranger. The data from an experiment are as follows. Test the hypothesis at the 5% significance level.

            N   Mean Price   Standard Deviation
Friends     8   $281.88      $18.31
Strangers   7   $211.43      $46.43

1. State Hypotheses: H0: $\mu_1 - \mu_2 = 0$; Ha: $\mu_1 - \mu_2 \neq 0$ (two-tailed), with group 1 = Friends and group 2 = Strangers
2. $s_{pooled} = \sqrt{\frac{(8 - 1)\, 18.31^2 + (7 - 1)\, 46.43^2}{(8 - 1) + (7 - 1)}} = 34.285$
   $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = 34.285 \times \sqrt{\frac{1}{8} + \frac{1}{7}} = 34.285 \times 0.5175 = 17.744$
3. df = 8 + 7 - 2 = 13 and 5% alpha level, $t^*$ = 2.16
4. $t = \frac{(\bar{y}_1 - \bar{y}_2) - 0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)} = \frac{281.88 - 211.43}{17.744} = 3.97$
5. $t > t^*$, reject H0. The data indicate that the average amount offered for a used camera differs between friends and strangers at the 5% significance level.

Pooled Confidence Interval

Assume the two groups are independent and have the same variance when the null hypothesis is true.

1. Calculate Standard Error of Mean Difference:
   $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$,
   where $s_{pooled} = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}}$, $n_1$ = size of Sample 1, $n_2$ = size of Sample 2.
2. Determine Degrees of Freedom:
   $df = n_1 + n_2 - 2$ (a slightly higher df than in the two-sample t-test without equal variances)
3.
Determine Critical Value ($t^*$) according to the degrees of freedom and confidence level, using T-Table A-34
4. $CI = (\bar{y}_1 - \bar{y}_2) \pm t^* \times SE_{pooled}(\bar{y}_1 - \bar{y}_2)$

Q12.4 We want to know whether people offer a different amount for a used camera when buying from a friend than when buying from a stranger. The data from an experiment are as follows. Construct a 95% confidence interval for the difference.

            N   Mean Price   Standard Deviation
Friends     8   $281.88      $18.31
Strangers   7   $211.43      $46.43

1. Find the standard error of the difference:
   $s_{pooled} = \sqrt{\frac{(8 - 1)\, 18.31^2 + (7 - 1)\, 46.43^2}{(8 - 1) + (7 - 1)}} = 34.285$
   $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = 34.285 \times \sqrt{\frac{1}{8} + \frac{1}{7}} = 34.285 \times 0.5175 = 17.744$
2. df = 8 + 7 - 2 = 13
3. $t^*$ = 2.1604
4. $CI = (\bar{y}_1 - \bar{y}_2) \pm t^* \times SE_{pooled}(\bar{y}_1 - \bar{y}_2)$ = (281.88 - 211.43) ± 2.16 × 17.744 = 70.45 ± 38.33 = (32.12, 108.78)

Note: The CI does not include 0, indicating that the mean difference is significantly different from 0. This is consistent with the hypothesis test conducted in Q12.3.

IV. Paired Data

Paired t-Test

Paired data may be used when the two groups are not independent of each other. Examples: a firm's sales in January 2007 and January 2008; a subject's response before and after a treatment in an experiment. Such a test is essentially a one-sample t-test in which the pairwise difference is treated as a single random variable.

1. State Hypotheses:
   H0: μd = Δ0
   Ha: μd ≠ Δ0 (two-tailed test); μd > Δ0 (one-tailed upper test); or μd < Δ0 (one-tailed lower test)
2. Determine Critical Value ($t^*$) according to df (n - 1) and significance level
3. Calculate Standard Error of the Paired Difference:
   $SE(\bar{d}) = \frac{s_d}{\sqrt{n}}$, where $s_d$ is the standard deviation of the pairwise differences and n = number of pairs
4. Calculate t-statistic:
   $t = \frac{\bar{d} - \Delta_0}{SE(\bar{d})} = \frac{\bar{d} - \Delta_0}{s_d / \sqrt{n}}$
5. Decision:
   Reject H0 when $t > t^*$ or $t < -t^*$ (that is, $|t| > t^*$) for two-tailed test
   Reject H0 when $t > t^*$ for one-tail upper test
   Reject H0 when $t < -t^*$ for one-tail lower test

Q12.5 We want to know whether credit card spending changes, on average, from December to January for a market segment.
Our data record the credit card expenditures made in December 2004 and January 2005 by 911 cardholders. The average pairwise difference is $788.18 (December 2004 minus January 2005) and the standard deviation of the differences is $3740.22.

a) Since we generally expect spending to decrease from December to January, test this belief at the 5% significance level.
   1. State Hypotheses: H0: μd = 0; Ha: μd > 0 (one-tailed upper test)
   2. Critical Value: df = 911 - 1 = 910; one-tail alpha = 5%; $t^*$ = 1.646 (use df = 1000 in the T-Table) [TINV(0.10, 910) = 1.6465; TINV is two-tailed, so a one-tail alpha of 5% is entered as 0.10]
   3. $SE(\bar{d}) = \frac{s_d}{\sqrt{n}} = \frac{3740.22}{\sqrt{911}} = 123.919$
   4. $t = \frac{\bar{d} - 0}{SE(\bar{d})} = \frac{788.18}{123.919} = 6.36$
   5. $t > t^*$, reject H0 and conclude that credit card spending decreased, on average, from December 2004 to January 2005.

b) Find a 95% confidence interval for the true mean difference in credit card charges between those two months for all cardholders in this segment.
   1. $t^*$ = 1.962 given df = 910 and CI at 95% [TINV(0.05, 910) = 1.9626]
   2. $ME = t^* \times SE(\bar{d})$ = 1.962 × 123.919 = 243.13
   3. $CI = \bar{d} \pm ME$ = 788.18 ± 243.13 = ($545.05, $1031.31)

More exercises on paired t-tests: Chapter 12, Exercises 53, 55, 56, 57, 58, 63, 64, 66, 67, 68.
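
The worked answers in this guide (Q12.1 through Q12.5) can be rechecked with a few lines of Python. The sketch below is an illustration only and is not part of the original guide: the function names (welch_se_df, pooled_se_df, paired_se_df, t_stat, conf_int) are hypothetical, only the standard math module is used, and the critical values t* are copied from the guide's T-Table or TINV lookups rather than computed.

```python
import math


def welch_se_df(s1, n1, s2, n2):
    """Two-sample (unequal variances): standard error and adjusted df."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df


def pooled_se_df(s1, n1, s2, n2):
    """Pooled (equal variances): standard error and df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    s_pooled = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    se = s_pooled * math.sqrt(1 / n1 + 1 / n2)
    return se, df


def paired_se_df(s_d, n):
    """Paired data: standard error of the mean difference and df = n - 1."""
    return s_d / math.sqrt(n), n - 1


def t_stat(diff, se, delta0=0.0):
    """t = (observed difference - hypothesized difference) / SE."""
    return (diff - delta0) / se


def conf_int(diff, se, t_crit):
    """Interval diff +/- t* x SE; t_crit is supplied by the caller."""
    me = t_crit * se
    return diff - me, diff + me


# Q12.1: guide reports SE = 37.546, df = 53.49, t = 6.465
se, df = welch_se_df(138.470, 34, 151.114, 27)
print("Q12.1", round(se, 3), round(df, 2), round(t_stat(1631.59 - 1388.85, se), 3))

# Q12.2: guide reports SE = 12712.53, df = 15.3; t* = 2.1315 is the guide's TINV value
se, df = welch_se_df(23937, 9, 29690, 9)
print("Q12.2", round(se, 2), round(df, 1), conf_int(242170 - 235338, se, 2.1315))

# Q12.3 and Q12.4: guide reports SE = 17.744, df = 13, t = 3.97; t* = 2.16
se, df = pooled_se_df(18.31, 8, 46.43, 7)
print("Q12.3", round(se, 3), df, round(t_stat(281.88 - 211.43, se), 2))
print("Q12.4", conf_int(281.88 - 211.43, se, 2.16))

# Q12.5: guide reports SE = 123.919, t = 6.36; t* = 1.962 for the 95% interval
se, df = paired_se_df(3740.22, 911)
print("Q12.5", round(se, 3), df, round(t_stat(788.18, se), 2), conf_int(788.18, se, 1.962))
```

Passing t* in as an argument keeps the sketch free of third-party dependencies. The printed values should agree with the guide's figures up to rounding.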