One-sample signed rank test

• Ranks. For a sample $(D_1, \ldots, D_n)$, the rank of $D_i$ is its position when the sample is sorted in increasing order. Ranks are between 1 and $n$, and smaller values have smaller ranks. When there are ties, the tied values all receive the average of the ranks they occupy.
  – In the sample $(1, 4, 2, 6)$, the ranks of 1, 4, 2, 6 are 1, 3, 2, 4 respectively.
  – In the sample $(1, 4, 2, 2, 2, 2)$, the ranks of 1 and 4 are 1 and 6 respectively, and the rank of each 2 is $(2 + 3 + 4 + 5)/4 = 3.5$.

• For a sample $(D_1, \ldots, D_n)$, let $R_1, \ldots, R_n$ be the ranks of $D_1, \ldots, D_n$ respectively. Then $R_1 + \cdots + R_n = n(n+1)/2$.

• Two-sided one-sample signed rank test. Suppose that $(D_1, \ldots, D_n)$ is a random sample from a distribution whose probability density function (PDF) is symmetric about $\mu$ (so $\mu$ is the median of the distribution of $D_1$). The testing problem is
  \[ H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu \neq \mu_0. \tag{1} \]
  The signed rank test statistic based on $(D_1, \ldots, D_n)$ is computed as follows. Let $R_1, \ldots, R_n$ be the ranks of $|D_1 - \mu_0|, \ldots, |D_n - \mu_0|$ respectively. Let
  \[ R_+ = \sum_{1 \le i \le n:\ D_i - \mu_0 > 0} R_i \quad \text{and} \quad R_- = \sum_{1 \le i \le n:\ D_i - \mu_0 < 0} R_i = \frac{n(n+1)}{2} - R_+. \]
  The signed rank test rejects $H_0$ at level $a$ if
  \[ R_+ \le C \quad \text{or} \quad R_- \le C, \tag{2} \]
  where $C$ is chosen to be as large as possible while keeping $P(R_+ \le C) \le a/2$ under $H_0$.

• The probabilities $P(R_+ = 0), \ldots, P(R_+ = n(n+1)/2)$ do not depend on the distribution of $D_1$ when $\mu = \mu_0$. Let $\mathrm{DSR}(n)$ be the distribution of $R_+$ when $\mu = \mu_0$. The signed rank test rejects $H_0$ in (1) at level $a$ if and only if
  \[ \underbrace{2P(\mathrm{DSR}(n) \le \text{observed } \min(R_+, R_-))}_{\text{p-value}} \le a. \]

• (2) can be rewritten as $\min(R_+, R_-) \le C$. The text uses $\min(R_+, R_-)$ as the signed rank test statistic.

• When $\mu = \mu_0$, $R_- \sim R_+ \sim \mathrm{DSR}(n)$. Since $R_+ + R_- = n(n+1)/2$, the events $\{R_+ \le C\}$ and $\{R_- \le C\}$ cannot both occur when $C < n(n+1)/4$. Thus for $C < n(n+1)/4$,
  \[ P(\min(R_+, R_-) \le C) = P(R_+ \le C) + P(R_- \le C) = 2P(R_+ \le C). \]

• Suppose that the testing problem in (1) is replaced by
  \[ H_0: \mu \le \mu_0 \quad \text{versus} \quad H_1: \mu > \mu_0. \tag{3} \]
  The signed rank test rejects $H_0$ at level $a$ if $R_- \le C$, where $C$ is chosen to be as large as possible while keeping $P(\mathrm{DSR}(n) \le C) \le a$.
  The signed rank test rejects $H_0$ in (3) at level $a$ if and only if
  \[ \underbrace{P(\mathrm{DSR}(n) \le \text{observed } R_-)}_{\text{p-value}} \le a. \]

• Suppose that the testing problem in (1) is replaced by
  \[ H_0: \mu \ge \mu_0 \quad \text{versus} \quad H_1: \mu < \mu_0. \tag{4} \]
  The signed rank test rejects $H_0$ at level $a$ if $R_+ \le C$, where $C$ is chosen to be as large as possible while keeping $P(\mathrm{DSR}(n) \le C) \le a$. The signed rank test rejects $H_0$ in (4) at level $a$ if and only if
  \[ \underbrace{P(\mathrm{DSR}(n) \le \text{observed } R_+)}_{\text{p-value}} \le a. \]

• The distribution $\mathrm{DSR}(n)$.

  Example 1. Suppose that $X_1$, $X_2$ and $X_3$ are IID, and the PDF for $X_1$ is symmetric about 0. Let $R_i$ be the rank of $|X_i|$ and
  \[ R_+ = \sum_{1 \le i \le 3:\ X_i > 0} R_i. \]
  List the possible values for $R_+$ and the corresponding probabilities.

  Let $(i_1, i_2, i_3)$ be a permutation of $(1, 2, 3)$. The 48 equally likely cases (6 permutations times 8 sign patterns) can be summarized as follows.

    Cases                                       Probability     $R_+$
    $0 < X_{i_1} < X_{i_2} < X_{i_3}$           6/48 = 1/8      6
    $0 < -X_{i_1} < X_{i_2} < X_{i_3}$          1/8             5
    $0 < X_{i_1} < -X_{i_2} < X_{i_3}$          1/8             4
    $0 < X_{i_1} < X_{i_2} < -X_{i_3}$          1/8             3
    $0 < -X_{i_1} < -X_{i_2} < X_{i_3}$         1/8             3
    $0 < -X_{i_1} < X_{i_2} < -X_{i_3}$         1/8             2
    $0 < X_{i_1} < -X_{i_2} < -X_{i_3}$         1/8             1
    $0 < -X_{i_1} < -X_{i_2} < -X_{i_3}$        1/8             0

  The possible values for $R_+$ and the corresponding probabilities are listed below.

    Possible $R_+$ values    0     1     2     3     4     5     6
    Probabilities            1/8   1/8   1/8   1/4   1/8   1/8   1/8

• In Example 1, the probabilities can be found using the R command
    dsignrank(c(0:6), 3)
  The output is
    [1] 0.125 0.125 0.125 0.250 0.125 0.125 0.125
  and it gives $P(\mathrm{DSR}(3) = 0), \ldots, P(\mathrm{DSR}(3) = 6)$.

• To find $P(\mathrm{DSR}(3) \le C)$, we can use
    cumsum(dsignrank(c(0:6), 3))
  The output is
    [1] 0.125 0.250 0.375 0.625 0.750 0.875 1.000
  and it gives $P(\mathrm{DSR}(3) \le 0), \ldots, P(\mathrm{DSR}(3) \le 6)$.

• Example 2. Suppose that 4 people participated in a weight-loss program, and their weight losses are 0.9, 1.9, 2.9, and 3.9 kilograms respectively. Suppose that the weight loss distribution has a PDF that is symmetric about its median.
  Based on the signed rank test, can we conclude that the population median for weight loss after participating in the program is
    (a) greater than 1 kilogram at level 0.13?
    (b) less than 1 kilogram at level 0.13?
    (c) different from 1 kilogram at level 0.13?
  The R output after running the command
    cumsum(dsignrank(c(0:10), 4))
  is
    [1] 0.0625 0.1250 0.1875 0.3125 0.4375 0.5625 0.6875 0.8125 0.8750
    [10] 0.9375 1.0000

  Sol. Compute $R_+$ and $R_-$:

    $X_i$                     0.9    1.9    2.9    3.9
    Sign of $X_i - 1$         −      +      +      +
    $|X_i - 1|$               0.1    0.9    1.9    2.9
    Rank of $|X_i - 1|$       1      2      3      4

  From the above table, $R_+ = 2 + 3 + 4 = 9$ and $R_- = 1$.

  (a) Let $H_1$ be the hypothesis that the median for weight loss after participating in the program is greater than 1 and $H_0$ be the hypothesis that the median is equal to 1. The p-value for the signed rank test is $P(\mathrm{DSR}(4) \le R_-) = P(\mathrm{DSR}(4) \le 1) = 0.125 < 0.13$, so we can conclude that the population median for weight loss after participating in the program is greater than 1 kilogram at level 0.13.

  (b) Let $H_1$ be the hypothesis that the median for weight loss after participating in the program is less than 1 and $H_0$ be the hypothesis that the median is equal to 1. The p-value for the signed rank test is $P(\mathrm{DSR}(4) \le R_+) = P(\mathrm{DSR}(4) \le 9) = 0.9375 > 0.13$, so we cannot conclude that the population median for weight loss after participating in the program is less than 1 kilogram at level 0.13.

  (c) Let $H_1$ be the hypothesis that the median for weight loss after participating in the program is different from 1 and $H_0$ be the hypothesis that the median is equal to 1. The p-value for the signed rank test is $2P(\mathrm{DSR}(4) \le \min(9, 1)) = 2 \times 0.125 = 0.25 > 0.13$, so we cannot conclude that the population median for weight loss after participating in the program is different from 1 kilogram at level 0.13.

• Suppose that $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_n)$ are two random samples from distributions with means $\mu_1$ and $\mu_2$, and the two samples are allowed to be dependent.
  Consider the testing problem
  \[ H_0: \mu_1 - \mu_2 \le 0 \quad \text{versus} \quad H_1: \mu_1 > \mu_2 \tag{5} \]
  or the testing problem
  \[ H_0: \mu_1 - \mu_2 = 0 \quad \text{versus} \quad H_1: \mu_1 \neq \mu_2 \tag{6} \]
  based on $(D_1, \ldots, D_n) = (X_1 - Y_1, \ldots, X_n - Y_n)$.
  – We can apply the one-sample t test to $(D_1, \ldots, D_n)$ to test (5) or (6), assuming $(D_1, \ldots, D_n)$ is a random sample from a normal distribution with mean $\mu_1 - \mu_2$.
  – We can apply the signed rank test to $(D_1, \ldots, D_n)$ to test (5) or (6), assuming $(D_1, \ldots, D_n)$ is a random sample from a distribution whose probability density function (PDF) is symmetric about $\mu_1 - \mu_2$. Let $R_i$ be the rank of $|D_i|$ in $(|D_1|, \ldots, |D_n|)$ and
    \[ R_+ = \sum_{D_i > 0} R_i \quad \text{and} \quad R_- = \sum_{D_i < 0} R_i. \]
    ∗ For testing (5), the signed rank test rejects $H_0$ at level $a$ if $R_- \le C$, and the p-value is $P(\mathrm{DSR}(n) \le \text{observed } R_-)$.
    ∗ For testing (6), the signed rank test rejects $H_0$ at level $a$ if $\min(R_+, R_-) \le C$, and the p-value is $2P(\mathrm{DSR}(n) \le \text{observed } \min(R_+, R_-))$.

  Example 3. (Example in the text, Pages 673-676; one observation is removed.) 14 participants are asked to rate two different flavors of chicken: the original flavor and the new flavor. The problem of interest is to test
    $H_0$: both flavors have the same rating versus $H_1$: the new flavor has a higher rating.
  The differences in rating scores (new score − original score) are
    2, -8, 4, 14, 8, 9, -10, 9, 8, 3, 16, -9, -7, 12
  and the level of the test is 0.05.

  Sol. Sort the differences by absolute value:
    2, 3, 4, −7, −8, 8, 8, −9, 9, 9, −10, 12, 14, 16.
  The ranks of the sorted absolute differences are
    1, 2, 3, 4, 6, 6, 6, 9, 9, 9, 11, 12, 13, 14
  and $R_- = 4 + 6 + 9 + 11 = 30$. From Table B.7 in Appendix B in the text, the signed rank test rejects $H_0$ at level 0.05 if $R_- \le C$, where $C = 25$. Since $30 > 25$, we cannot conclude $H_1$ at level 0.05.

  We can also solve this problem using the p-value. The p-value is $P(\mathrm{DSR}(14) \le 30)$. From Table B.7 in Appendix B, $P(\mathrm{DSR}(14) \le 25) \le 0.05 < P(\mathrm{DSR}(14) \le 26)$, so the p-value $P(\mathrm{DSR}(14) \le 30) > 0.05$ and we cannot conclude $H_1$ at level 0.05.
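  As a quick arithmetic check on Example 3, the rank-sum identity $R_+ + R_- = n(n+1)/2$ can be used to recover $R_+$ from $R_-$ and compare it with the direct sum of the ranks attached to the positive differences. The display below is only a worked check using the ranks listed in the solution; it is not part of the text's solution.

```latex
\[
R_+ = \frac{14 \cdot 15}{2} - R_- = 105 - 30 = 75,
\qquad
R_+ = 1 + 2 + 3 + 6 + 6 + 9 + 9 + 12 + 13 + 14 = 75 .
\]
```

  The two values agree, which confirms that the tied ranks were averaged consistently.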
• R code for computing $R_+$ and $R_-$ in Example 3.
    x <- c(2, -8, 4, 14, 8, 9, -10, 9, 8, 3, 16, -9, -7, 12)
    sum(rank(abs(x))[x>0])
    sum(rank(abs(x))[x<0])

• R code for finding the critical value 25 in Example 3.
    prob <- cumsum(dsignrank(c(0:105), 14))
    max(c(0:105)[prob <= 0.05])
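The p-value rules for the three testing problems can be collected into one small function. The following is a minimal sketch rather than a definitive implementation. The name `signed_rank_pvalue` and its arguments are hypothetical (they do not come from the text). The sketch assumes that no observation equals $\mu_0$ exactly, relies on base R's `psignrank(q, n)` for $P(\mathrm{DSR}(n) \le q)$, and caps the two-sided p-value at 1, which is a choice made here and not something stated in the notes.

```r
# Hypothetical helper (not from the text): p-value of the signed rank test
# for H1: mu > mu0 ("greater"), mu < mu0 ("less"), or mu != mu0 ("two.sided").
# Assumes no observation equals mu0 exactly.
signed_rank_pvalue <- function(d, mu0 = 0,
                               alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  z <- d - mu0
  n <- length(z)
  r <- rank(abs(z))              # average ranks are used for ties
  r_plus  <- sum(r[z > 0])       # R+: ranks of positive differences
  r_minus <- sum(r[z < 0])       # R-: ranks of negative differences
  # DSR(n) is integer-valued, so P(DSR(n) <= q) = P(DSR(n) <= floor(q)).
  cdf <- function(q) psignrank(floor(q), n)
  switch(alternative,
         greater   = cdf(r_minus),
         less      = cdf(r_plus),
         two.sided = min(1, 2 * cdf(min(r_plus, r_minus))))
}

# Example 2: weight losses, testing against mu0 = 1
w <- c(0.9, 1.9, 2.9, 3.9)
signed_rank_pvalue(w, 1, "greater")    # 0.125
signed_rank_pvalue(w, 1, "less")       # 0.9375
signed_rank_pvalue(w, 1, "two.sided")  # 0.25
```

Applied to the Example 3 differences with `alternative = "greater"` and `mu0 = 0`, the function returns $P(\mathrm{DSR}(14) \le 30)$, which the table argument in Example 3 shows is greater than 0.05.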
© Copyright 2024