Two sample t-tests and CI, type I and II errors, power of t-tests
Chapter 7

Katharina Henneböhl
Institute for Geoinformatics, University of Muenster

Lecture Introduction to Geostatistics, May 17, 2011 - updated: May 18, 2011

Outline:
- Two sample t-tests (and CI)
- Difference in two means, independent samples
- Pooled standard deviation
- CI for difference in two means, independent samples
- Two sample t-test, independent samples
- CI for difference in two means, paired samples
- Two sample t-test, paired samples
- How large should a sample be?
- Type I and type II errors
- Influencing type I and II errors
- Power of a t-test
- Outlook
- References & further readings

Review chapter 6

- Confidence intervals, population σ unknown
- Two-sided and one-sided confidence intervals
- Two-sided CI ⇔ two-sided one sample t-test
- One-sided CI ⇔ one-sided one sample t-test
- t0 (= t) value ⇔ p value
- t0 (= t) value ⇔ sample mean

Connection CI/t-test: the confidence interval can be seen as the set of acceptable null hypotheses.
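The connection between CI and t-test can be checked numerically. The following is a minimal sketch on a simulated sample; the data, the seed and the grid of candidate values for µ0 are assumptions made for illustration and do not come from the lecture. It tests each candidate µ0 with a two-sided one sample t-test and compares the "not rejected" candidates with the 95% confidence interval.

```r
# Simulated sample (hypothetical data, for illustration only).
set.seed(1)
x <- rnorm(20, mean = 10, sd = 2)
alpha <- 0.05

# Two-sided 95% confidence interval for the mean.
ci <- t.test(x, conf.level = 1 - alpha)$conf.int

# Test a grid of candidate null values mu0 with a two-sided t-test.
mu0_grid <- seq(8, 12, by = 0.05)
p_values <- sapply(mu0_grid, function(mu0) t.test(x, mu = mu0)$p.value)

# A candidate is "acceptable" if the test does not reject it at level alpha.
not_rejected <- p_values >= alpha
inside_ci <- mu0_grid >= ci[1] & mu0_grid <= ci[2]

# TRUE if the acceptable null hypotheses are exactly those inside the CI.
print(all(inside_ci == not_rejected))
```

Under these assumptions the two sets coincide, which is what "set of acceptable null hypotheses" means in practice.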
Some learning goals chapter 7

- From confidence intervals for a single mean to confidence intervals for a difference in means
- Understand the difference between independent and paired samples
- From one sample t-test to two sample t-test
- Test errors
- Understand the concept of power of t-tests

Decision tree - one sample t-test

[Figure]

Overview hypotheses - one sample t-test

1. $H_0: \mu = \mu_0$, $H_A: \mu \neq \mu_0$
2. $H_0: \mu \geq \mu_0$, $H_A: \mu < \mu_0$
   We want to test (be sure) that a threshold is not exceeded. The critical region is $(-\infty, t_{\alpha,df}]$.
3. $H_0: \mu \leq \mu_0$, $H_A: \mu > \mu_0$
   We want to test (be sure) that a threshold is not undershot. The critical region is $[t_{1-\alpha,df}, \infty)$.
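A small sketch of the two one-sided cases, assuming a simulated sample and a hypothetical threshold µ0 = 50 (neither is from the lecture). For $H_A: \mu < \mu_0$ the test rejects when $t_0$ falls at or below $t_{\alpha,df}$; for $H_A: \mu > \mu_0$ it rejects when $t_0$ falls at or above $t_{1-\alpha,df}$. The decisions are compared with the p-values from `t.test`.

```r
# Simulated measurements and a hypothetical threshold (illustration only).
set.seed(2)
x <- rnorm(15, mean = 48, sd = 5)
mu0 <- 50
alpha <- 0.05
df <- length(x) - 1

# Test statistic of the one sample t-test.
t0 <- (mean(x) - mu0) / (sd(x) / sqrt(length(x)))

# H_A: mu < mu0 -> critical region is the lower tail.
reject_less <- t0 <= qt(alpha, df)
# H_A: mu > mu0 -> critical region is the upper tail.
reject_greater <- t0 >= qt(1 - alpha, df)

# The same decisions expressed through p-values from t.test.
p_less <- t.test(x, mu = mu0, alternative = "less")$p.value
p_greater <- t.test(x, mu = mu0, alternative = "greater")$p.value

print(c(t0 = t0, p_less = p_less, p_greater = p_greater))
print(c(reject_less = reject_less, reject_greater = reject_greater))
```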
Difference in two means, independent samples

Suppose we have two pseudo samples $S_1$ and $S_2$ drawn from two different populations 1 and 2. We are interested in the difference of the population means $\mu_1 - \mu_2$.

A reasonable point estimate would be the difference in sample means $m_1 - m_2$.

An appropriate symmetric confidence interval around this estimate would have the form

\[ (m_1 - m_2) \pm z_{\alpha/2} \cdot SE \]

(NOTE: $z$ are the quantiles of the standard normal distribution.)

How do we obtain the standard error $SE$ for $(m_1 - m_2)$?

Standard error for CI difference in two means, $\sigma_1$ and $\sigma_2$ known

We assume that we know the standard deviations of the populations from which samples $S_1$ and $S_2$ were drawn. What is the standard error $SE$ for $m_1 - m_2$?

\[ SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]

where $\sigma_1$ and $n_1$ represent the standard deviation of population 1 and the size of sample $S_1$, and $\sigma_2$ and $n_2$ the standard deviation of population 2 and the size of sample $S_2$.
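As a sketch of how this standard error enters the interval, the helper below computes the z-based confidence interval for known population standard deviations. The function name `ci_diff_known_sigma` and all numbers in the call are hypothetical and chosen only for illustration.

```r
# z-based CI for mu1 - mu2 when sigma1 and sigma2 are known.
# (Hypothetical helper, not part of the lecture material.)
ci_diff_known_sigma <- function(m1, m2, sigma1, sigma2, n1, n2,
                                conf.level = 0.95) {
  se <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)   # standard error of m1 - m2
  z <- qnorm(1 - (1 - conf.level) / 2)        # standard normal quantile
  c(lower = (m1 - m2) - z * se, upper = (m1 - m2) + z * se)
}

# Illustrative values: SE = sqrt(4/25 + 9/36) = sqrt(0.41).
ci <- ci_diff_known_sigma(m1 = 10, m2 = 8, sigma1 = 2, sigma2 = 3,
                          n1 = 25, n2 = 36)
print(ci)
```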
Standard error CI difference in two means, $\sigma_1 = \sigma_2$ known/unknown

Suppose the standard deviations of the two populations are equal, $\sigma_1 = \sigma_2 = \sigma$; then

\[ SE = \sigma \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]

In practice, we need an estimate for $\sigma$:

\[ SE = s_p \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]

How do we obtain $s_p$ based on the sample standard deviations $s_1$ and $s_2$?

Pooled standard deviation $s_p$

Since both populations should have the same unknown standard deviation $\sigma$, it seems appropriate to pool the information from both samples to derive an estimate. We will call this estimate $s_p$ the pooled standard deviation:

\[ s_p = \sqrt{\frac{\sum (X_1 - m_1)^2 + \sum (X_2 - m_2)^2}{(n_1 - 1) + (n_2 - 1)}} \]

where $X_1$ and $X_2$ represent the data values in the respective sample.
The same equation reformulated:

\[ s_p = \sqrt{\frac{(n_1 - 1) \cdot s_1^2 + (n_2 - 1) \cdot s_2^2}{(n_1 - 1) + (n_2 - 1)}} \]

We need the degrees of freedom $df = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$ to specify $t$.

CI for difference in two means, $\sigma_1 = \sigma_2$ unknown

We can now form a (symmetric) confidence interval for the difference of two means, assuming independent samples and equal population standard deviations:

\[ (\mu_1 - \mu_2) \in [(m_1 - m_2) - t_{1-\alpha/2,df} \cdot SE,\ (m_1 - m_2) + t_{1-\alpha/2,df} \cdot SE] \]

with $SE = s_p \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$. The usual interest is whether this interval contains zero.

Example by hand (1)

Example: Consider the two pseudo samples $S_1$ and $S_2$ drawn from two different populations 1 and 2: $S_1 = \{7, 6.4, 6.9, 5.5, 6.5, 6.0\}$ and $S_2 = \{5.1, 4.9, 4.7, 4.6, 5.0\}$. We compute the 95% confidence interval for the difference of the two population means.
$m_1 \approx 6.38$ and $m_2 = 4.86$
$s_1^2 \approx 0.3177$ and $s_2^2 = 0.043$ (both with denominator $n - 1$)
$n_1 = 6$ and $n_2 = 5$
$df = 6 + 5 - 2 = 9$

\[ s_p = \sqrt{\frac{5 \cdot s_1^2 + 4 \cdot s_2^2}{(6 - 1) + (5 - 1)}} = \sqrt{\frac{1.760}{9}} \approx 0.44 \]

Example by hand (2)

Example: Given the pooled standard deviation $s_p$ computed on the previous slide, we obtain the 95% confidence interval as follows:

$SE = s_p \cdot \sqrt{1/6 + 1/5} = 0.44 \cdot \sqrt{1/6 + 1/5} \approx 0.27$

Given $\alpha = 0.05$, $t_{0.025,df=9} \approx -2.26$ and $t_{0.975,df=9} \approx 2.26$.

Given $m_1 \approx 6.38$ and $m_2 = 4.86$, $m_1 - m_2 \approx 1.52$.

\[ (\mu_1 - \mu_2) \in [1.52 - 2.26 \cdot 0.27,\ 1.52 + 2.26 \cdot 0.27] \Rightarrow (\mu_1 - \mu_2) \in [0.91, 2.13] \]

With 95% confidence, the true difference between $\mu_1$ and $\mu_2$ lies within the given interval. Note that the interval does not contain zero.

Example in R (1)

Example: We compute the 95% confidence interval for the difference in Length means between female and male students in studNoEx.rda.

    > x1 = studNoEx[studNoEx$Gender == "f", ]
    > x2 = studNoEx[studNoEx$Gender == "m", ]
    > m1 = mean(x1$Length)
    > m2 = mean(x2$Length)
    > delta = m1 - m2
    > var1 = var(x1$Length)
    > var2 = var(x2$Length)
    > n1 = length(x1$Length)
    > n2 = length(x2$Length)
    > v = ((n1 - 1) * var1 + (n2 - 1) * var2)/(n1 + n2 - 2)
    > sp = sqrt(v)
    > sp
    [1] 7.048812

Here sp is the pooled standard deviation $s_p$, and v the pooled variance.

Example in R (2)

    > se = sp * sqrt(1/n1 + 1/n2)
    > se
    [1] 1.269341
    > df = n1 + n2 - 2
    > t0_025 = qt(0.025, df)
    > t0_025
    [1] -1.978820
    > t0_975 = qt(0.975, df)
    > lower = delta + t0_025 * se
    > upper = delta + t0_975 * se
    > c(lower, upper)
    [1] -16.52085 -11.49725

With 95% confidence, the true difference in Length means lies in [−16.52085, −11.49725]. Note that the interval does not contain zero.

⇒ Why is the interval negative?
Two sample t-test & CI for the difference in means

Example: Alternatively, we could obtain the 95% confidence interval for the difference in Length means between female and male students in studNoEx.rda by a two sample t-test in R.

    > female = x1$Length
    > male = x2$Length
    > t.test(female, male, var.equal = TRUE, conf.level = 0.95)

            Two Sample t-test

    data:  female and male
    t = -11.0365, df = 127, p-value < 2.2e-16
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -16.52085 -11.49725
    sample estimates:
    mean of x mean of y
     169.0294  183.0385

Do you see the connection between a CI for the difference in means and a two sample t-test?

Explanation R output

How do we compute df = 127?
$df = 51 - 1 + 78 - 1 = 127$

How do we compute t = −11.0365?
$t = (m_1 - m_2)/SE = (169.0294 - 183.0385)/1.269341 = -11.03651$

Conclusion from the example: the 95% confidence interval does not contain zero, so at the 5% significance level we conclude that the true difference between the means is not zero.
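The studNoEx.rda data are not reproduced here, so the following sketch uses the small samples $S_1$ and $S_2$ from the by-hand example instead. It computes $s_p$, $SE$, $df$, $t$ and the 95% CI step by step from the raw data and compares them with what `t.test(..., var.equal = TRUE)` reports. Variable names are chosen for illustration only.

```r
# Samples S1 and S2 from the by-hand example.
s1 <- c(7, 6.4, 6.9, 5.5, 6.5, 6.0)
s2 <- c(5.1, 4.9, 4.7, 4.6, 5.0)
n1 <- length(s1)
n2 <- length(s2)

# Pooled standard deviation and standard error of m1 - m2.
sp <- sqrt(((n1 - 1) * var(s1) + (n2 - 1) * var(s2)) / (n1 + n2 - 2))
se <- sp * sqrt(1 / n1 + 1 / n2)

# Degrees of freedom, test statistic and 95% CI by hand.
df <- n1 + n2 - 2
t0 <- (mean(s1) - mean(s2)) / se
ci <- (mean(s1) - mean(s2)) + c(-1, 1) * qt(0.975, df) * se

# The same quantities as reported by the pooled two sample t-test.
res <- t.test(s1, s2, var.equal = TRUE)

print(round(c(sp = sp, se = se, df = df, t0 = t0,
              lower = ci[1], upper = ci[2]), 4))
print(res)
```

Working from the raw data avoids intermediate rounding, so the printed interval can differ in the second decimal from a calculation with rounded inputs.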
Two sample t-test - by hand

Example: We consider the variable Length in the studNoEx.rda dataset and want to test whether the mean Length differs significantly between female and male students, at a significance level $\alpha = 0.05$.

1. $H_0: \mu_1 - \mu_2 = 0$, $H_A: \mu_1 - \mu_2 \neq 0$ (equivalently $H_0: \mu_1 = \mu_2$, $H_A: \mu_1 \neq \mu_2$)
2. $n_1 = 51$, $n_2 = 78$ (number of female and male students)
3. $\alpha = 0.05$
4. $df = 127 \Rightarrow t_{0.025,df} = -1.979$ and $t_{0.975,df} = 1.979$
5. Critical region: any $t_0$ below $t_{0.025,df} = -1.979$ or above $t_{0.975,df} = 1.979$ leads to rejection of $H_0$.
6. $t_0 = (m_1 - m_2)/SE = (169.0294 - 183.0385)/1.269341 = -11.0365$
7. $t_0$ lies in the critical region, so we reject $H_0$ at the 5% significance level: $\mu_1$ and $\mu_2$ differ significantly.

Difference in means - independent samples

[Figure]
Difference in means - paired samples

[Figure]

CI for the difference in two means; paired samples

Paired/matched samples: a single object has been measured twice (usually at two moments, or "before" and "after" treatment).

    obj        1      2      3      4      5
    t1        13.5   15.3    7.5   10.3    8.7
    t2        12.7   15.1    6.6    8.5    8.0
    t1 − t2    0.8    0.2    0.9    1.8    0.7

The average of the differences $t_1 - t_2$ is $m_d = 0.88$, the standard deviation $s_d = 0.58$, and the standard error $SE = s_d/\sqrt{5} \approx 0.26$.

Using $t_{0.025,df=4} = -2.776$, the 95% CI for the average population difference $\Delta$ is
$[0.88 - 2.776 \cdot 0.26,\ 0.88 + 2.776 \cdot 0.26] = [0.15824, 1.60176]$.

Two sample t-test, paired samples - by hand

Example: We consider the sample from the previous slide and want to test whether the average difference of $t_1$ and $t_2$ is zero.
1. $H_0: \Delta = 0$, $H_A: \Delta \neq 0$
2. $n_{1,2} = 5 \Rightarrow$ The differences are the sample, so there is basically only one sample!
3. $\alpha = 0.05$
4. $df = 4 \Rightarrow t_{0.025,df} = -2.776$ and $t_{0.975,df} = 2.776$
5. Critical region: any $t_0$ below $t_{0.025,df} = -2.776$ or above $t_{0.975,df} = 2.776$ leads to rejection of $H_0$.
6. $t_0 = m_d/SE = 0.88/0.26 \approx 3.385$
7. $t_0$ lies in the critical region, so we reject $H_0$ at the 5% significance level: the average difference of the paired samples $\Delta$ differs significantly from zero.

CI for the difference in means; paired or independent samples?

On the previous slide we have seen that the 95% CI for the average population difference $\Delta$ is [0.15824, 1.60176]. This interval does not contain zero.

The 95% CI for the difference in two means for the same two samples, treated as independent, is [-4.111314, 5.871314]. This interval does contain zero and is wider than the 95% CI for $\Delta$.

With both methods, we estimate the difference in population means. Using paired samples, it is sometimes possible to obtain a more precise interval estimate for the difference in two population means (or a lower p-value when testing) because pairing may keep extraneous variables constant.
But only if the pairs make sense!

Paired or independent samples? Example (1) in R using function t.test

    > x1 = c(13.5, 15.3, 7.5, 10.3, 8.7)
    > x2 = c(12.7, 15.1, 6.6, 8.5, 8)
    > t.test(x1, x2, var.equal = TRUE)

            Two Sample t-test

    data:  x1 and x2
    t = 0.4066, df = 8, p-value = 0.695
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -4.111314  5.871314
    sample estimates:
    mean of x mean of y
        11.06     10.18

Here we compare the difference in two population means using independent samples and obtain a 95% CI $(\mu_1 - \mu_2) \in [-4.111314, 5.871314]$. We cannot reject $H_0$.

CI for the difference in means; paired or independent samples?
Example (2) in R using function t.test

    > x1 = c(13.5, 15.3, 7.5, 10.3, 8.7)
    > x2 = c(12.7, 15.1, 6.6, 8.5, 8)
    > t.test(x1, x2, paired = TRUE, var.equal = TRUE)

            Paired t-test

    data:  x1 and x2
    t = 3.3896, df = 4, p-value = 0.02754
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.1591929 1.6008071
    sample estimates:
    mean of the differences
                       0.88

Here we compare the difference in two population means using paired samples and obtain a 95% CI $\Delta \in [0.1591929, 1.6008071]$. We can reject $H_0$.

CI for the difference in means; paired or independent samples? Example (3) in R using function t.test

    > x1 - x2
    [1] 0.8 0.2 0.9 1.8 0.7
    > t.test(x1 - x2)

            One Sample t-test

    data:  x1 - x2
    t = 3.3896, df = 4, p-value = 0.02754
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
     0.1591929 1.6008071
    sample estimates:
    mean of x
         0.88
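As a cross-check, the numbers reported by `t.test` for the paired data can be rebuilt from the differences alone. This is a minimal sketch using the five pairs from the table; the variable names are chosen for illustration.

```r
# Paired measurements from the example table.
t1 <- c(13.5, 15.3, 7.5, 10.3, 8.7)
t2 <- c(12.7, 15.1, 6.6, 8.5, 8.0)

# The differences are the (single) sample.
d <- t1 - t2
n <- length(d)
md <- mean(d)            # mean difference
se <- sd(d) / sqrt(n)    # standard error of the mean difference

# Test statistic, two-sided p-value and 95% CI with df = n - 1.
t0 <- md / se
p_value <- 2 * pt(-abs(t0), df = n - 1)
ci <- md + c(-1, 1) * qt(0.975, df = n - 1) * se

print(round(c(md = md, se = se, t0 = t0, p = p_value,
              lower = ci[1], upper = ci[2]), 4))
```

The small differences from the by-hand interval [0.15824, 1.60176] come from rounding $SE$ to 0.26 in the hand calculation.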
A paired t-test corresponds to a one sample t-test: the differences between the pairs are the sample.

More confidence intervals

We have computed confidence intervals for a single mean and for the difference in two means, using either independent or paired samples. There are more possibilities to obtain confidence intervals, such as:

- Confidence intervals for proportions (percentages); very similar to confidence intervals for means.
- Confidence intervals for difference in proportions (percentages)
- Confidence intervals for population variance (will be addressed in chapter 8)

How large should a sample be?

Given that a 95% confidence interval, e.g. for $\mu$, is obtained by

\[ [m - t_{1-\alpha/2,df} \cdot SE,\ m + t_{1-\alpha/2,df} \cdot SE] \]

and given that $\alpha$ is chosen and $\sigma$ is not under our control, we can only control the width $W$ of the interval by manipulating $n$:

\[ W = 2 t_{1-\alpha/2,df} \cdot SE = 2 t_{1-\alpha/2,df} \cdot \frac{s}{\sqrt{n}} \]

\[ n = \left( \frac{2 t_{1-\alpha/2,df} \cdot s}{W} \right)^2 \]

This equation helps to compute the sample size necessary to obtain a confidence interval with a specific width $W$. Note that $t_{1-\alpha/2,df}$ itself depends on $n$ through $df = n - 1$, so the equation has to be solved iteratively (or with $z_{1-\alpha/2}$ as a first approximation).
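A minimal sketch of such an iterative solution: the hypothetical helper `n_for_width` below increases $n$ until the width $2 t_{1-\alpha/2,n-1} \cdot s/\sqrt{n}$ drops to the target $W$. It assumes that $s$ is a planning value (for instance from a pilot sample) that stays fixed as $n$ grows; the values $W = 1$ and $s = 1$ are illustrative.

```r
# Smallest n for which the t-based CI for a single mean has width <= W,
# given an assumed sample standard deviation s.
# (Hypothetical helper, not part of the lecture material.)
n_for_width <- function(W, s, conf.level = 0.95, n_max = 1e6) {
  alpha <- 1 - conf.level
  for (n in 2:n_max) {
    width <- 2 * qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
    if (width <= W) return(n)
  }
  NA_integer_   # no n up to n_max reaches the target width
}

n_t <- n_for_width(W = 1, s = 1)

# First approximation with the normal quantile instead of t.
n_z <- ceiling((2 * qnorm(0.975) * 1 / 1)^2)

print(c(n_t = n_t, n_z = n_z))
```

The t-based answer is somewhat larger than the z-based approximation because the t quantile exceeds the normal quantile for small $df$.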
Type I and type II errors - basic idea

To illustrate the concept of type I and II errors, we consider the situation of a trial (example adapted from [Wonnacott & Wonnacott (1990)]). The judge should decide between $H_0$, the hypothesis that the accused is innocent, and the alternative $H_A$, that he is guilty.

A type I error occurs if the judge considers an innocent person guilty (reject a true $H_0$).

A type II error occurs if the accused is guilty but set free (do not reject $H_0$ although $H_A$ is true).

To prove guilt beyond reasonable doubt means that $\alpha$ should be very small.
Type I and type II errors - illustration

[Figure]

Type I and Type II errors - definition

The risk of wrongly rejecting a true $H_0$ is the significance level $\alpha$ of the test. The risk of wrongly not rejecting a false $H_0$ is called $\beta$.

    Test result         Truth: H0 true       Truth: H0 false
    Reject H0           Type I error, α      OK, (1−β), power
    Do not reject H0    OK, (1−α)            Type II error, β

We call $(1 - \beta)$ the power of a test. The power of a test is the probability that the null hypothesis is rejected when it is false.

Reducing α and β?
Decreasing the type I error $\alpha$ from ≈ 0.05 to e.g. 0.01...

Reducing α and β?

... will increase the type II error $\beta$. And vice versa.

Reducing α and β?

Only an increase in sample size allows us to reduce $\beta$ without increasing $\alpha$ ≈ 0.05. The "trick": the standard error decreases with larger $n$ ⇒ the sampling distribution has less variance!
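The trade-off can be made concrete with `power.t.test`. The sketch below computes $\beta = 1 - \text{power}$ for a two-sample, two-sided t-test; the effect size (delta = 1, sd = 1) and the sample sizes are assumptions chosen for illustration, and `beta_for` is a hypothetical helper.

```r
# beta = 1 - power of a two-sample, two-sided t-test
# for an assumed true difference delta = 1 and sd = 1.
beta_for <- function(n, sig.level) {
  1 - power.t.test(n = n, delta = 1, sd = 1, sig.level = sig.level)$power
}

beta_05     <- beta_for(n = 20, sig.level = 0.05)  # reference case
beta_01     <- beta_for(n = 20, sig.level = 0.01)  # smaller alpha, same n
beta_05_n40 <- beta_for(n = 40, sig.level = 0.05)  # same alpha, larger n

print(round(c(alpha05_n20 = beta_05,
              alpha01_n20 = beta_01,
              alpha05_n40 = beta_05_n40), 3))
```

Lowering $\alpha$ at fixed $n$ raises $\beta$, while doubling $n$ at fixed $\alpha$ lowers it.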
Compute β for a given alternative - example

Example: For Length, let us compute $\beta$ for $H_0: \mu = \mu_0 = 175$ and the fixed alternative $H_A: \mu = \mu_A = 180$ ($s = 10.3$, $m = 178$, $n = 80$, $SE = 1.152$, $\alpha = 0.05$).

1. The critical value (one-sided test, $z_{0.95} \approx 1.64$) is $m_c = 1.64 \cdot 1.152 + 175 = 176.8893$.
2. We standardize the critical value with regard to $\mu_A = 180$: $Z = (m_c - \mu_A)/SE = (176.8893 - 180)/1.152 \approx -2.70$.
3. Thus, $\Pr(m < 176.8893) = \Pr(Z < -2.70) = 0.0035 = \beta$, and $1 - \beta = 0.9965$ denotes the power (of a test).

Compute β for a given alternative - illustration

[Figure]

Usually a larger difference between $\mu_A$ and $\mu_0$ induces a larger power. But further aspects need to be considered...

How to compute the power function?

Given that $H_0$ is not true, then what is true? Probabilities cannot be computed without assumptions about the population.

Given a fixed $H_A$, we can compute power as in the figure on the previous slide.
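For the fixed alternative of the Length example, the three steps translate directly into R. This sketch follows the normal approximation used in the example, but with the unrounded quantile `qnorm(0.95)` instead of 1.64, so the intermediate values differ slightly in the later decimals.

```r
# Values from the Length example.
mu0 <- 175      # mean under H0
muA <- 180      # mean under the fixed alternative HA
s <- 10.3
n <- 80
alpha <- 0.05

se <- s / sqrt(n)

# Step 1: critical value for the sample mean (one-sided, upper tail).
m_c <- mu0 + qnorm(1 - alpha) * se

# Step 2: standardize the critical value with regard to muA.
z <- (m_c - muA) / se

# Step 3: beta is the probability of not rejecting H0 when HA is true.
beta <- pnorm(z)
power <- 1 - beta

print(round(c(se = se, m_c = m_c, z = z, beta = beta, power = power), 4))
```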
For all possible $H_A$'s, we obtain the power function.

What determines the power?

- The difference between the $H_0$ and $H_A$ means (delta)
- The width of the curves ($SE = \sigma/\sqrt{n}$)
- $\alpha$
- where is $\alpha$? - one-sided or two-sided
- what is $n$? how is $SE$ computed? - type of test: one-sample, two-sample, paired

Compute power function vs. delta, n = 20, s = 1

    > plot((0:20)/10, power.t.test(power = NULL, delta = (0:20)/10,
    +     n = 20)$power, type = "l", xlab = "delta", ylab = "power")

[Figure: power (y-axis, 0.0 to 1.0) plotted against delta (x-axis, 0.0 to 2.0)]

Compute power function vs.
n; delta = 1

> plot(1:50, power.t.test(delta = 1, n = 1:50)$power, type = "l",
+     xlab = "n (sample size)", ylab = "power")

[Figure: power as a function of n; x-axis: n (sample size, 0 to 50), y-axis: power (0 to 1)]

The power concept beyond n

In a testing framework, increasing n will eventually make even a very small difference in means significant, because small differences are detected with large power. This does not mean that the difference found is relevant.

Suppose we are studying the effect of a medication type on health, or of a herbicide type on plant disease. Two large samples (with and without treatment) showed significantly that the group without treatment had 45% success, less than the group with treatment, which had 47% success.

That's OK, but should we now collectively apply the treatment? Do the effects compensate for the costs and side effects?

Significance is not the same as relevance.
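The effect of n on significance can be made concrete with power.t.test. This is a minimal sketch, assuming a small, fixed true difference (delta = 0.05 with sd = 1, values chosen only for illustration, not taken from the lecture). It shows how the power to detect this difference grows with the sample size per group.

```r
# Hypothetical small effect: 5% of one standard deviation
n_per_group <- c(20, 200, 2000, 20000)

# power.t.test accepts a vector for n (two-sample, two-sided by default)
pw <- power.t.test(n = n_per_group, delta = 0.05, sd = 1,
                   sig.level = 0.05)$power

# Power rises towards 1 as n grows, although delta stays tiny
data.frame(n = n_per_group, power = round(pw, 3))
```

Whether a difference of this size matters in practice is a separate question that the test cannot answer.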
Power computation using power.t.test

Description: Compute power of test, or determine parameters to obtain target power.

Details: Exactly one of the parameters n, delta, power, sd, and sig.level must be passed as NULL, and that parameter is determined from the others. Notice that the last two have non-NULL defaults, so NULL must be explicitly passed if you want to compute them.

Compute sample size

> power.t.test(n = NULL, delta = 1, sd = 1, sig.level = 0.05, power = 0.9,
+     type = "two.sample", alternative = "two.sided")

     Two-sample t test power calculation

              n = 22.02110
          delta = 1
             sd = 1
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group
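Since a sample size must be a whole number, the computed n is rounded up in practice. A minimal sketch that reuses the call from the slide; the variable names are illustrative.

```r
res <- power.t.test(n = NULL, delta = 1, sd = 1, sig.level = 0.05,
                    power = 0.9, type = "two.sample",
                    alternative = "two.sided")

# Round up so that the target power is at least reached
n_per_group <- ceiling(res$n)
n_total     <- 2 * n_per_group

c(n_per_group = n_per_group, n_total = n_total)
```

With the n = 22.02110 reported above, this gives 23 observations per group, 46 in total.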
Compute delta (HA)

> power.t.test(n = 20, delta = NULL, sd = 1, sig.level = 0.05,
+     power = 0.9, type = "two.sample", alternative = "two.sided")

     Two-sample t test power calculation

              n = 20
          delta = 1.051970
             sd = 1
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

Compute power

> power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05, power = NULL,
+     type = "two.sample", alternative = "two.sided")

     Two-sample t test power calculation

              n = 20
          delta = 1
             sd = 1
      sig.level = 0.05
          power = 0.8689528
    alternative = two.sided

NOTE: n is number in *each* group
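The power reported by power.t.test can be cross-checked by hand with the noncentral t distribution. This is a sketch under the assumption of equal group sizes and a common standard deviation. It ignores the negligible rejection probability in the opposite tail, which is also what power.t.test does by default (strict = FALSE). The variable names are illustrative.

```r
n     <- 20     # observations per group
delta <- 1      # true difference in means
sd    <- 1      # common standard deviation
alpha <- 0.05   # two-sided significance level

df    <- 2 * (n - 1)                  # degrees of freedom, two-sample test
ncp   <- delta / (sd * sqrt(2 / n))   # noncentrality: delta / SE of the difference
tcrit <- qt(1 - alpha / 2, df)        # upper critical value under H0

# Probability that the test statistic exceeds tcrit when HA is true
power_manual <- pt(tcrit, df, ncp = ncp, lower.tail = FALSE)

# Reference value from power.t.test for comparison
power_ref <- power.t.test(n = n, delta = delta, sd = sd,
                          sig.level = alpha)$power
c(manual = power_manual, power.t.test = power_ref)
```

The noncentrality parameter makes the role of SE explicit: for two independent samples of size n each, the standard error of the difference in means is sd·√(2/n).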
Compute significance level

> power.t.test(n = 20, delta = 1, sd = 1, sig.level = NULL, power = 0.9,
+     type = "two.sample", alternative = "two.sided")

     Two-sample t test power calculation

              n = 20
          delta = 1
             sd = 1
      sig.level = 0.07004584
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

(Note that this is of little operational use; computing sd is of even less operational use.)

Outlook chapter 8

1. F distribution
2. ANOVA - Analysis of Variance
References & further readings

Wonnacott & Wonnacott (1990): Introductory Statistics, 5th edition, Wiley
Edzer Pebesma (2010): Introduction to Geostatistics, http://ifgi.uni-muenster.de/~epebe_01/Geostatistics10/