One-sample signed rank test
• Ranks. For a sample (D1 , . . . , Dn ), the rank of Di is its position when
(D1 , . . . , Dn ) is sorted in increasing order. Ranks are between 1 and n, and small values have small
ranks. When there are ties, the average ranks are used.
– In the sample (1, 4, 2, 6), the ranks of 1, 4, 2, 6 are 1, 3, 2, 4 respectively.
– In the sample (1, 4, 2, 2, 2, 2), the ranks of 1 and 4 are 1 and 6 respectively,
and the rank of each 2 is (2 + 3 + 4 + 5)/4 = 3.5.
• For a sample (D1 , . . . , Dn ), let R1 , . . ., Rn be the ranks of D1 , . . ., Dn
respectively, then R1 + · · · + Rn = n(n + 1)/2.
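The rank rules above can be checked in R. This is a minimal sketch using the base function rank(), whose default tie handling (ties.method = "average") matches the averaging rule described here; the variable names d1, d2, r1, r2 are illustrative only.

```r
# Ranks without ties
d1 <- c(1, 4, 2, 6)
r1 <- rank(d1)            # ranks 1, 3, 2, 4

# Ranks with ties: the four 2's share the average of ranks 2, 3, 4, 5
d2 <- c(1, 4, 2, 2, 2, 2)
r2 <- rank(d2)            # ranks 1, 6, 3.5, 3.5, 3.5, 3.5

# The ranks add up to n(n + 1)/2, also when ties are averaged
n <- length(d2)
sum(r2) == n * (n + 1) / 2
```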
• Two-sided one-sample signed rank test. Suppose that (D1 , . . . , Dn ) is a
random sample from a distribution whose probability density function
(PDF) is symmetric about µ (so µ is the median of the distribution of D1 ).
The testing problem is
H0 : µ = µ0 versus H1 : µ ≠ µ0
(1)
The signed rank test statistic based on (D1 , . . . , Dn ) is computed as follows.
Let R1 , . . ., Rn be the ranks of |D1 − µ0 |, . . ., |Dn − µ0 | respectively. Let
R+ = Σ_{1≤i≤n: Di − µ0 > 0} Ri
and
R− = Σ_{1≤i≤n: Di − µ0 < 0} Ri = n(n + 1)/2 − R+ .
The signed rank test rejects H0 at level a if
R+ ≤ C or R− ≤ C,
(2)
where C is chosen to be as large as possible while keeping
P (R+ ≤ C) ≤ a/2 under H0 .
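The choice of C can be sketched in R as follows. The sketch assumes the built-in function psignrank(q, n), which returns the probability that R+ is at most q under H0 for sample size n; the helper name two_sided_critical and the values n = 14, a = 0.05 are chosen only for illustration.

```r
# Largest C with P(R+ <= C) <= a/2 under H0, for sample size n
two_sided_critical <- function(n, a) {
  q <- 0:(n * (n + 1) / 2)               # all possible values of R+
  ok <- q[psignrank(q, n) <= a / 2]      # values of C that keep the level
  if (length(ok) == 0) return(NA)        # no C achieves level a
  max(ok)
}

C <- two_sided_critical(n = 14, a = 0.05)
C
```

If no value of C satisfies the constraint (for example when n is very small), the helper returns NA, meaning the test never rejects at that level.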
• The probabilities P (R+ = 0),. . ., P (R+ = n(n + 1)/2) do not depend on
the distribution of D1 when µ = µ0 . Let DSR (n) be the distribution of
R+ when µ = µ0 . The signed rank test rejects H0 in (1) at level a if and
only if
2P (DSR (n) ≤ observed min(R+ , R− )) ≤ a,
where the left-hand side is the p-value.
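The two-sided p-value can be computed along the following lines. This is a sketch, assuming that no Di equals µ0 and that there are no ties among the |Di − µ0|; the function name signed_rank_two_sided and the data vector are made up for illustration. The cap at 1 is an added safeguard, not part of the formula above.

```r
# Two-sided signed rank p-value for H0: mu = mu0
signed_rank_two_sided <- function(d, mu0) {
  z <- d - mu0
  r <- rank(abs(z))                 # ranks of |D_i - mu0|
  r_plus  <- sum(r[z > 0])          # R+
  r_minus <- sum(r[z < 0])          # R-
  stat <- min(r_plus, r_minus)
  p <- min(1, 2 * psignrank(stat, length(d)))   # capped at 1
  list(r_plus = r_plus, r_minus = r_minus, p_value = p)
}

# Made-up data, for illustration only
res <- signed_rank_two_sided(c(1.2, -0.4, 2.5, 0.9, 3.1), mu0 = 0)
res
```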
• (2) can be re-written as
min(R+ , R− ) ≤ C.
The text uses min(R+ , R− ) as the signed rank test statistic.
• When µ = µ0 , R− ∼ R+ ∼ DSR (n). Since R+ + R− = n(n + 1)/2, the events
R+ ≤ C and R− ≤ C cannot both occur when C < n(n + 1)/4. Thus for C < n(n + 1)/4,
P (min(R+ , R− ) ≤ C) = P (R+ ≤ C) + P (R− ≤ C) = 2P (R+ ≤ C).
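The identity above can be verified numerically. The sketch below uses n = 6 as an arbitrary illustrative sample size and relies on R− = n(n + 1)/2 − R+, so that the event min(R+ , R− ) ≤ C is the same as R+ ≤ C or R+ ≥ n(n + 1)/2 − C.

```r
n <- 6
N <- n * (n + 1) / 2                  # largest possible value of R+
C <- 0:(ceiling(N / 2) - 1)           # all integers C < n(n + 1)/4

# P(R+ <= C) + P(R+ >= N - C), computed from the null distribution
lhs <- psignrank(C, n) + psignrank(N - C - 1, n, lower.tail = FALSE)
rhs <- 2 * psignrank(C, n)
all.equal(lhs, rhs)
```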
• Suppose that the testing problem in (1) is replaced by
H0 : µ ≤ µ0 versus H1 : µ > µ0 .
(3)
The signed rank test rejects H0 at level a if
R− ≤ C,
where C is chosen to be as large as possible while keeping
P (DSR (n) ≤ C) ≤ a.
The signed rank test rejects H0 in (3) at level a if and only if
P (DSR (n) ≤ observed R− ) ≤ a,
where the left-hand side is the p-value.
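The one-sided p-value for (3) can be sketched as follows, again assuming no zeros and no ties. The function name signed_rank_greater and the data are made up for illustration. The cross-check uses the built-in wilcox.test(), which reports R+ as its statistic V and, for small samples without ties, computes an exact p-value.

```r
# p-value for H0: mu <= mu0 versus H1: mu > mu0,
# i.e. P(DSR(n) <= observed R-)
signed_rank_greater <- function(d, mu0) {
  z <- d - mu0
  r_minus <- sum(rank(abs(z))[z < 0])
  psignrank(r_minus, length(d))
}

d <- c(1.2, -0.4, 2.5, 0.9, 3.1)      # made-up data, for illustration only
p <- signed_rank_greater(d, mu0 = 0)

# Cross-check against the built-in test
p_builtin <- wilcox.test(d, mu = 0, alternative = "greater")$p.value
all.equal(p, p_builtin)
```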
• Suppose that the testing problem in (1) is replaced by
H0 : µ ≥ µ0 versus H1 : µ < µ0 .
(4)
The signed rank test rejects H0 at level a if
R+ ≤ C,
where C is chosen to be as large as possible while keeping
P (DSR (n) ≤ C) ≤ a.
The signed rank test rejects H0 in (4) at level a if and only if
P (DSR (n) ≤ observed R+ ) ≤ a,
where the left-hand side is the p-value.
• The distribution DSR (n).
Example 1. Suppose that X1 , X2 and X3 are IID, and the PDF for X1
is symmetric about 0. Let Ri be the rank of |Xi | and
R+ = Σ_{1≤i≤3: Xi > 0} Ri .
List the possible values for R+ and the corresponding probabilities.
Let (i1 , i2 , i3 ) be a permutation of (1, 2, 3). There are 6 permutations and 8 sign
patterns, which gives 48 equally likely cases. Grouped by sign pattern, the cases can be
summarized as follows.
Cases                            Probability    R+
0 < Xi1 < Xi2 < Xi3              6/48 = 1/8     6
0 < −Xi1 < Xi2 < Xi3             1/8            5
0 < Xi1 < −Xi2 < Xi3             1/8            4
0 < Xi1 < Xi2 < −Xi3             1/8            3
0 < −Xi1 < −Xi2 < Xi3            1/8            3
0 < −Xi1 < Xi2 < −Xi3            1/8            2
0 < Xi1 < −Xi2 < −Xi3            1/8            1
0 < −Xi1 < −Xi2 < −Xi3           1/8            0
The possible values for R+ and the corresponding probabilities are listed
below.
Possible R+ values    0      1      2      3      4      5      6
Probabilities         1/8    1/8    1/8    1/4    1/8    1/8    1/8
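The table can be reproduced by brute-force enumeration. The sketch below relies on the fact that, under symmetry about 0, each of the 2^n sign patterns attached to the ranks 1, . . ., n has probability 1/2^n; the variable names are illustrative.

```r
n <- 3
# Each row is one sign pattern: TRUE means the observation
# holding that rank is positive
signs <- expand.grid(rep(list(c(FALSE, TRUE)), n))

# R+ is the sum of the ranks 1..n that carry a positive sign
r_plus <- apply(signs, 1, function(s) sum((1:n)[s]))

# Count the sign patterns for each value of R+ and divide by 2^n
probs <- table(r_plus) / 2^n
probs
```

The same idea works for any small n, although the number of sign patterns doubles with each additional observation.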
• In Example 1, the probabilities can be found using the R command
dsignrank(c(0:6), 3)
The output is
[1] 0.125 0.125 0.125 0.250 0.125 0.125 0.125
and it gives P (DSR (3) = 0), . . ., P (DSR (3) = 6).
• To find P (DSR (3) ≤ C), we can use
cumsum(dsignrank(c(0:6), 3))
The output is
[1] 0.125 0.250 0.375 0.625 0.750 0.875 1.000
and it gives P (DSR (3) ≤ 0), . . ., P (DSR (3) ≤ 6).
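As an alternative to accumulating the probabilities, R also provides psignrank(q, n), which returns P (DSR (n) ≤ q) directly. A short sketch comparing the two approaches:

```r
# Cumulative probabilities P(DSR(3) <= 0), ..., P(DSR(3) <= 6)
cdf_direct <- psignrank(0:6, 3)
cdf_cumsum <- cumsum(dsignrank(0:6, 3))

# Both approaches give the same values
all.equal(cdf_direct, cdf_cumsum)
```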
• Example 2. Suppose that 4 people participated in a weight-loss program, and their
weight losses are 0.9, 1.9, 2.9, and 3.9 kilograms respectively. Suppose that the weight loss
distribution has a PDF that is symmetric about its median. Based on the signed rank test, can we conclude
that the population median for weight loss after participating in the program is
(a) greater than 1 kilogram at level 0.13?
(b) less than 1 kilogram at level 0.13?
(c) different from 1 kilogram at level 0.13?
The R output after running the command cumsum(dsignrank(c(0:10), 4))
is
[1] 0.0625 0.1250 0.1875 0.3125 0.4375 0.5625 0.6875 0.8125 0.8750
[10] 0.9375 1.0000
Sol. Compute R+ and R− :
Xi                    0.9    1.9    2.9    3.9
Sign for Xi − 1       −      +      +      +
|Xi − 1|              0.1    0.9    1.9    2.9
Rank for |Xi − 1|     1      2      3      4
From the above table, R+ = 2 + 3 + 4 = 9 and R− = 1.
(a) Let H1 be the hypothesis that the median for weight loss after participating in the program is greater than 1 and H0 be the hypothesis
that the median is equal to 1. The p-value for the signed rank test is
P (DSR (4) ≤ observed R− ) = P (DSR (4) ≤ 1) = 0.125 < 0.13, so we can conclude that the population median for weight loss after participating in the program is
greater than 1 kilogram at level 0.13.
(b) Let H1 be the hypothesis that the median for weight loss after participating in the program is less than 1 and H0 be the hypothesis
that the median is equal to 1. The p-value for the signed rank test is
P (DSR (4) ≤ observed R+ ) = P (DSR (4) ≤ 9) = 0.9375 > 0.13, so we cannot conclude that the
population median for weight loss after participating in the program
is less than 1 kilogram at level 0.13.
(c) Let H1 be the hypothesis that the median for weight loss after participating in the program is different from 1 and H0 be the hypothesis
that the median is equal to 1. The p-value for the signed rank test is
2P (DSR (4) ≤ min(9, 1)) = 2 × 0.125 = 0.25 > 0.13, so we cannot conclude
that the population median for weight loss after participating in the
program is different from 1 kilogram at level 0.13.
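The three p-values in Example 2 can be reproduced in R. This is a sketch using psignrank() for the null distribution DSR (4), with the built-in wilcox.test() as a cross-check; wilcox.test() reports R+ as its statistic V and uses the exact distribution here because the sample is small and has no ties.

```r
x <- c(0.9, 1.9, 2.9, 3.9)
mu0 <- 1
n <- length(x)

z <- x - mu0
r <- rank(abs(z))
r_plus  <- sum(r[z > 0])    # 9
r_minus <- sum(r[z < 0])    # 1

p_greater   <- psignrank(r_minus, n)                    # (a) 0.125
p_less      <- psignrank(r_plus, n)                     # (b) 0.9375
p_two_sided <- 2 * psignrank(min(r_plus, r_minus), n)   # (c) 0.25

# Cross-check part (a) with the built-in test
p_builtin <- wilcox.test(x, mu = mu0, alternative = "greater")$p.value
```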
• Suppose that (X1 , . . . , Xn ) and (Y1 , . . . , Yn ) are two random samples from
distributions with means µ1 and µ2 respectively, and the two samples are allowed to be
dependent. Consider the testing problem
H0 : µ1 − µ2 ≤ 0 versus H1 : µ1 > µ2
(5)
or the testing problem
H0 : µ1 − µ2 = 0 versus H1 : µ1 ≠ µ2
(6)
based on (D1 , . . . , Dn ) = (X1 − Y1 , . . . , Xn − Yn ).
– We can apply the one sample t test to (D1 , . . . , Dn ) to test (5) or (6)
assuming (D1 , . . . , Dn ) is a random sample from a normal distribution with mean µ1 − µ2 .
– We can apply the signed rank test to (D1 , . . . , Dn ) to test (5) or (6)
assuming (D1 , . . . , Dn ) is a random sample from a distribution whose
probability density function (PDF) is symmetric about µ1 − µ2 .
Let Ri be the rank of |Di | in (|D1 |, . . . , |Dn |), and let
R+ = Σ_{Di > 0} Ri and R− = Σ_{Di < 0} Ri .
∗ For testing (5), the signed rank test rejects H0 if R− ≤ C, and
the p-value is
P (DSR (n) ≤ observed R− ).
∗ For testing (6), the signed rank test rejects H0 at level a if
min(R+ , R− ) ≤ C, and the p-value is
2P (DSR (n) ≤ observed min(R+ , R− )).
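The paired procedure can be sketched in R as follows. The vectors x and y are hypothetical numbers invented for illustration (they are not data from the notes), chosen so that the differences have no zeros and no ties in absolute value.

```r
# Hypothetical paired measurements, for illustration only
x <- c(12.1, 10.4, 13.8, 11.0, 12.9, 14.2)
y <- c(11.5, 10.9, 12.0, 10.8, 11.4, 12.1)

d <- x - y                       # paired differences D_i = X_i - Y_i
n <- length(d)
r <- rank(abs(d))
r_plus  <- sum(r[d > 0])
r_minus <- sum(r[d < 0])

p_greater   <- psignrank(r_minus, n)                    # p-value for (5)
p_two_sided <- 2 * psignrank(min(r_plus, r_minus), n)   # p-value for (6)

# Cross-check (5) with the built-in paired test
p_builtin <- wilcox.test(x, y, paired = TRUE,
                         alternative = "greater")$p.value
```

Working with the differences directly makes it clear that the paired problem reduces to the one-sample test with µ0 = 0.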
Example 3. (Example in the text, pages 673-676; one observation is
removed.) Fourteen participants are asked to rate two different flavors of chicken:
the original flavor and the new flavor. The problem of interest is to test
H0 : both flavors have the same rating versus H1 : the new flavor has a higher rating.
The differences in rating scores (new scores − original scores) are
2, -8, 4, 14, 8, 9, -10, 9, 8, 3, 16, -9, -7, 12
and the level of the test is 0.05.
Sol. Sort the differences by absolute values:
2, 3, 4, −7, −8, 8, 8, −9, 9, 9, −10, 12, 14, 16.
The ranks of the sorted absolute differences are
1, 2, 3, 4, 6, 6, 6, 9, 9, 9, 11, 12, 13, 14
and R− = 4 + 6 + 9 + 11 = 30. From Table B.7 in Appendix B in the
text, the signed rank test rejects H0 at level 0.05 if R− ≤ C, where C is
25. Since 30 > 25, we cannot conclude H1 at level 0.05.
We can also solve this problem using p-value. The p-value is P (DSR (14) ≤
30). From Table B.7 in Appendix B,
P (DSR (14) ≤ 25) ≤ 0.05 < P (DSR (14) ≤ 26),
so the p-value P (DSR (14) ≤ 30) > 0.05 and we cannot conclude H1 at
level 0.05.
• R code for computing R+ and R− in Example 3.
x <- c(2, -8, 4, 14, 8, 9, -10, 9, 8, 3, 16, -9, -7, 12)
sum(rank(abs(x))[x>0])  # R+
sum(rank(abs(x))[x<0])  # R-
• R code for finding the critical value 25 in Example 3.
prob <- cumsum(dsignrank(c(0:105), 14))
max(c(0:105)[prob <= 0.05])
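The p-value approach in Example 3 can also be carried out in R. The sketch below follows the notes in using DSR (14) even though the differences contain ties; with tied data the built-in wilcox.test() would not compute an exact p-value and would fall back on a normal approximation instead. The quantile function qsignrank(p, n) gives the smallest q with P (DSR (n) ≤ q) ≥ p, so the critical value is obtained by stepping back by one when that probability exceeds the level.

```r
x <- c(2, -8, 4, 14, 8, 9, -10, 9, 8, 3, 16, -9, -7, 12)
n <- length(x)

r_minus <- sum(rank(abs(x))[x < 0])   # observed R-, equal to 30
p_value <- psignrank(r_minus, n)      # P(DSR(14) <= 30)
p_value > 0.05                        # TRUE: H0 is not rejected at level 0.05

# Critical value via the quantile function
q <- qsignrank(0.05, n)
C <- if (psignrank(q, n) > 0.05) q - 1 else q
```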