2.3.4 Fisher’s exact test

HELP example: see 2.6.3

SAS

    proc freq data=ds;
       tables x * y / exact;
    run;

or

    proc freq data=ds;
       tables x * y;
       exact fisher / mc n=bnum;
    run;

Note: The former requests only the exact p-value; the latter generates a Monte Carlo p-value, an asymptotically equivalent test based on bnum random tables simulated using the observed margins.

R

    fisher.test(y, x)

or

    fisher.test(ymat)

Note: The fisher.test() command can accept either two class vectors or a matrix with counts (here denoted by ymat). For tables with many rows and/or columns, p-values can be computed by Monte Carlo simulation using the simulate.p.value option.

2.3.5 McNemar’s test

McNemar’s test tests the null hypothesis that the proportions are equal across matched pairs, for example, when two raters assess a population.

SAS

    proc freq data=ds;
       tables x * y / agree;
    run;

R

    mcnemar.test(y, x)

Note: The mcnemar.test() command can accept either two class vectors or a matrix with counts.

2.4 Two sample tests for continuous variables

2.4.1 Student’s t-test

HELP example: see 2.6.4

SAS

    proc ttest data=ds;
       class x;
       var y;
    run;

Note: The variable x takes on two values. The output contains both equal and unequal-variance t-tests, as well as a test of the null hypothesis of equal variance.

R

    t.test(y1, y2)

or

    t.test(y ~ x)

Note: In the first form, the t.test() command compares two vectors (y1 and y2). In the second form, a formula interface (see sections B.4.6 and 3.1.1) supplies a single vector corresponding to the outcome (y) and another vector indicating group membership (x). By default, the two-sample t-test uses an unequal variance assumption. The option var.equal=TRUE can be added to specify an equal variance assumption.
The command var.test() can be used to formally test equality of variances.

2.4.2 Nonparametric tests

HELP example: see 2.6.4

SAS

    proc npar1way data=ds wilcoxon edf median;
       class y;
       var x;
    run;

Note: Many tests can be requested as options to the proc npar1way statement. Here we show a Wilcoxon test, a Kolmogorov–Smirnov test, and a median test, respectively. Exact tests can be generated by using an exact statement with these names, e.g., the exact median statement will generate the exact median test.

R

    wilcox.test(y1, y2)
    ks.test(y1, y2)
    library(coin)
    median_test(y ~ x)

Note: By default, the wilcox.test() function uses a continuity correction in the normal approximation for the p-value. The ks.test() function does not calculate an exact p-value when there are ties. The median test shown will generate an exact p-value with the distribution="exact" option.

2.4.3 Permutation test

HELP example: see 2.6.4

SAS

    proc npar1way data=ds;
       class y;
       var x;
       exact scores=data;
    run;

or

    proc npar1way data=ds;
       class y;
       var x;
       exact scores=data / mc n=bnum;
    run;

Note: Any test described in 2.4.2 can be named in place of scores=data to get an exact test based on those statistics. The mc option generates an empirical p-value (asymptotically equivalent to the exact p-value) based on bnum Monte Carlo replicates.

R

    library(coin)
    oneway_test(y ~ as.factor(x), distribution=approximate(B=bnum))

Note: The oneway_test function in the coin library implements a variety of permutation-based tests (see also the exactRankTests package). The distribution=approximate syntax generates an empirical p-value (asymptotically equivalent to the exact p-value) based on bnum Monte Carlo replicates.
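The idea behind the empirical p-value described above can be sketched in base R without any add-on package. This is only an illustration under stated assumptions: the data are simulated, the helper name perm_test is hypothetical, the test statistic is taken to be the difference in group means, and bnum is set to 2000 arbitrarily. It is not the algorithm used internally by coin or by proc npar1way.

```r
# Minimal sketch of a Monte Carlo two-sample permutation test in base R.
# Assumptions (not from the text): simulated data, the helper name
# perm_test, the difference in group means as the test statistic,
# and bnum = 2000 replicates.
perm_test <- function(y, x, bnum = 2000) {
  x <- as.factor(x)
  stopifnot(nlevels(x) == 2, length(y) == length(x))

  # Test statistic: mean of the first group minus mean of the second
  stat <- function(g) {
    mean(y[g == levels(x)[1]]) - mean(y[g == levels(x)[2]])
  }
  observed <- stat(x)

  # Recompute the statistic after randomly reassigning the group labels
  permuted <- replicate(bnum, stat(sample(x)))

  # Two-sided empirical p-value; the +1 terms count the observed labeling
  # itself, so the estimate can never be exactly zero
  pval <- (sum(abs(permuted) >= abs(observed)) + 1) / (bnum + 1)
  list(statistic = observed, p.value = pval)
}

set.seed(42)                                  # reproducible simulation
x <- rep(c("a", "b"), each = 20)              # group membership
y <- c(rnorm(20, mean = 0), rnorm(20, mean = 1))
res <- perm_test(y, x)
print(res)
```

Increasing bnum reduces the Monte Carlo error of the estimated p-value at the cost of more computation. Adding one to the numerator and denominator is a design choice made for this sketch.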
2.4.4 Logrank test

HELP example: see 2.6.5

See also 5.1.19 (Kaplan–Meier plot) and 4.3.1 (Cox proportional hazards model)

SAS

    proc phreg data=ds;
       model timevar*cens(0) = x;
    run;

or

    proc lifetest data=ds;
       time timevar*cens(0);
       strata x;
    run;

Note: If cens is equal to 0, then proc phreg and proc lifetest treat time as the time of censoring; otherwise it is the time of the event. The default output from proc lifetest includes the logrank and Wilcoxon tests. Other tests, corresponding to different weight functions, can be produced with the test option to the strata statement. These include test=fleming(ρ1, ρ2), a superset of the G-rho family of Fleming and Harrington [23], which simplifies to the G-rho family when ρ2 = 0.

R

    library(survival)
    survdiff(Surv(timevar, cens) ~ x)

Note: Other tests within the G-rho family of Fleming and Harrington [23] are supported by specifying the rho option.

2.5 Further resources

Comprehensive introductions to using SAS to fit common statistical models can be found in [9] and [15]. Similar methods in R are accessibly presented in [95]. Efron and Tibshi-