Lecture 21
1. Bootstrap hypothesis tests: 2-sample t-test
2. Types of missing values
3. Making missing-value datasets: MCAR, MAR, and MNAR
Bootstrap tests
Our bootstrap example (confidence interval for a correlation):
1. draw B samples with replacement from the data,
2. calculate the statistic $r^*_b$ from each bootstrap sample, $b = 1, \ldots, B$,
3. use the histogram of the bootstrap statistics $\{r^*_b\}$ to find a confidence interval or
standard error.
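The three steps above can be sketched in Python (a swapped-in language for illustration; the lecture itself uses SAS). The data values, the helper names `pearson_r` and `bootstrap_r`, and the choice of a percentile interval are assumptions made for this sketch, not taken from the lecture.

```python
import random
import statistics

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def bootstrap_r(x, y, B=2000, seed=1):
    """Return B bootstrap correlations, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < B:
        idx = [rng.randrange(n) for _ in range(n)]
        # Skip the (very rare) resample made of one repeated pair, where r is
        # undefined. This guard is enough here because all values are distinct.
        if len(set(idx)) > 1:
            stats.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
    return stats

# Made-up illustrative data (hypothetical, not from the lecture).
x = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1, 9.0, 10.3]
y = [1.2, 1.9, 3.5, 3.9, 5.6, 5.5, 7.9, 7.7, 9.4, 9.9]

r_star = sorted(bootstrap_r(x, y))            # steps 1 and 2
lo = r_star[int(0.025 * len(r_star))]         # step 3: 2.5th percentile
hi = r_star[int(0.975 * len(r_star)) - 1]     # step 3: 97.5th percentile
se = statistics.stdev(r_star)                 # step 3: bootstrap standard error
print(lo, hi, se)
```

Note that the pairs are resampled together, so each bootstrap sample keeps the dependence between x and y.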
Hypothesis test differs from confidence intervals and standard errors:
test has a null hypothesis H0, p-value for the test is calculated assuming H0.
To bootstrap a test, we need to draw the bootstrap samples from the null
hypothesis distribution.
The two-sample t-test compares two population means $\mu_Y$ and $\mu_Z$
by comparing estimates of these means from independent samples
$y = \{y_1, y_2, \ldots, y_n\}$ from population Y and $z = \{z_1, z_2, \ldots, z_m\}$ from population Z.
The standard assumptions are:
1. The data are simple random samples
2. The two populations are Normal
3. The two populations have the same variance
4. The observations are correctly labeled with their population:
no misclassification.
The null hypothesis is $H_0: \mu_Y = \mu_Z$.
Brain glucose example
Magnetic resonance imaging gives researchers a non-invasive way to measure
chemicals in the brain minute by minute. One study measured blood sugar
(glucose) in the brains of 14 people with diabetes and 14 healthy people.
The TTEST Procedure

Variable        diabetic      N   Lower CL Mean     Mean   Upper CL Mean   Std Dev   Std Err
brain_glucose   0            14          4.6832   5.3214          5.9596    1.1054    0.2954
brain_glucose   1            14          4.1943   4.6857          5.1771    0.8511    0.2275
brain_glucose   Diff (1-2)               -0.131   0.6357          1.4021    0.9865    0.3728

Variable        Method          Variances     DF   t Value   Pr > |t|
brain_glucose   Pooled          Equal         26      1.71     0.1001
brain_glucose   Satterthwaite   Unequal     24.4      1.71     0.1009

Conclusion?
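As an arithmetic check on the output above, the unequal-variance (Satterthwaite) t statistic and its degrees of freedom can be recomputed from the printed group means and standard errors. This is a Python sketch, not part of the lecture's SAS code; the variable names are assumptions. Because the inputs are rounded to four decimals, the results agree with SAS only up to rounding.

```python
import math

# Summary statistics copied from the Proc Ttest output above.
mean_y, se_y, n_y = 5.3214, 0.2954, 14   # diabetic = 0
mean_z, se_z, n_z = 4.6857, 0.2275, 14   # diabetic = 1

# Standard error of the difference without assuming equal variances.
se_diff = math.sqrt(se_y**2 + se_z**2)

# Unequal-variance t statistic.
t_welch = (mean_y - mean_z) / se_diff

# Satterthwaite approximation to the degrees of freedom.
df_welch = se_diff**4 / (se_y**4 / (n_y - 1) + se_z**4 / (n_z - 1))

print(se_diff, t_welch, df_welch)
```

The three numbers land close to the reported Std Err of 0.3728, t Value of 1.71, and DF of 24.4.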
These are small samples that look skewed in opposite directions. Perhaps the
t-test is missing something?
Bootstrap 2-sample t-test
We have two independent samples: $y = \{y_1, y_2, \ldots, y_n\}$ from population Y
and $z = \{z_1, z_2, \ldots, z_m\}$ from population Z.
1. Create two transformed data sets y0 and z0 with equal means to satisfy the null
hypothesis.
y0 and z0 should have the same standard deviations as originals.
2. Bootstrap y0 and z0 and calculate the test statistic from the samples $b = 1, \ldots, B$:
$$t^*_b = \frac{\bar{y} - \bar{z}}{\sqrt{SE(\bar{y})^2 + SE(\bar{z})^2}}$$
This t-statistic does not assume equal population variances.
Use this because we are equalizing means but not SDs.
Make the null hypothesis samples
Let $\bar{y}$ be the mean of sample y, and $\bar{z}$ be the mean of sample z.
Subtract $\bar{y}$ from each observation in sample y:
y0 = $\{y^0_1, y^0_2, \ldots, y^0_n\} = \{(y_1 - \bar{y}), (y_2 - \bar{y}), \ldots, (y_n - \bar{y})\}$
Subtract $\bar{z}$ from each observation in sample z:
z0 = $\{z^0_1, z^0_2, \ldots, z^0_m\} = \{(z_1 - \bar{z}), (z_2 - \bar{z}), \ldots, (z_m - \bar{z})\}$
Easy to show both y0 and z0 have mean = zero; null hypothesis is equal means.
Shifting each sample by a constant does not change the SD.
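Both claims (mean zero after shifting, SD unchanged) are easy to confirm numerically. The following Python sketch uses made-up values, not the brain glucose data, and the helper name `center` is an assumption.

```python
import statistics

def center(sample):
    """Subtract the sample mean from every observation."""
    m = statistics.fmean(sample)
    return [v - m for v in sample]

# Made-up values for illustration only (hypothetical).
y = [3.1, 4.7, 5.0, 6.2, 7.5]
z = [2.0, 2.4, 3.9, 4.1]

y0, z0 = center(y), center(z)

# Both shifted samples have mean zero (up to floating-point error),
# so they satisfy the null hypothesis of equal means.
print(statistics.fmean(y0), statistics.fmean(z0))

# The standard deviations are unchanged by the shift.
print(statistics.stdev(y), statistics.stdev(y0))
print(statistics.stdev(z), statistics.stdev(z0))
```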
Original data: samples y and z.
Shifted data: samples y0 and z0.
Proc Print data = pubh.brain_glucose;

Obs   diabetic   brain_glucose
  1          0             4.5
  2          0             5.3
  3          0             8.2
. . . .
 15          1             5.2
 16          1             5.1
 17          1             5.3
. . . .

From Proc Ttest, we have $\bar{y} = 5.3214$ and $\bar{z} = 4.6857$.
data nh_data;
  set pubh.brain_glucose;
  * shift each group by its own sample mean so both groups have mean zero;
  if diabetic = 0 then
    bg = brain_glucose - 5.3214;
  else if diabetic = 1 then
    bg = brain_glucose - 4.6857;
run;
Draw simple random samples with replacement from the null hypothesis data
(nh_data).
Here’s what we did before:
Proc Surveyselect
  seed = 56672119
  data = nh_data out=boot_samples
  method = urs
  samprate = 1 outhits rep = 2000;
run;
Problems:
Is it possible to get a sample entirely from one group?
What happens to t-statistic if sample sizes for the two groups don’t match
originals?
Solution: Draw bootstrap samples from each group (diabetic / control) separately:
Proc Surveyselect
  seed = 56672119
  data = nh_data out=boot_samples
  method = urs
  samprate = 1 outhits rep = 20;      * start with 20, make sure code works;
  strata diabetic / alloc = proportional;
run;
The strata statement identifies the grouping variable.
alloc = proportional: sample from each group in proportion to its size.
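The difference between resampling the pooled data and resampling within each group can be seen in a short Python sketch (an illustration in a different language, not a translation of Proc Surveyselect). The rows are made-up (group, value) pairs, and the function names are assumptions.

```python
import random

def pooled_resample(rows, rng):
    """Resample all rows together, ignoring group membership."""
    n = len(rows)
    return [rows[rng.randrange(n)] for _ in range(n)]

def stratified_resample(rows, rng):
    """Resample with replacement within each group, keeping group sizes fixed."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row)
    sample = []
    for members in groups.values():
        sample.extend(members[rng.randrange(len(members))] for _ in members)
    return sample

def group0_count(sample):
    """Number of rows in the sample that belong to group 0."""
    return sum(1 for g, _ in sample if g == 0)

rng = random.Random(0)
# Hypothetical data: 14 rows in group 0 and 14 rows in group 1.
rows = [(0, v) for v in range(14)] + [(1, v) for v in range(14)]

pooled_sizes = {group0_count(pooled_resample(rows, rng)) for _ in range(200)}
strat_sizes = {group0_count(stratified_resample(rows, rng)) for _ in range(200)}

print(sorted(pooled_sizes))  # group 0 size varies from resample to resample
print(sorted(strat_sizes))   # group 0 size is always 14
```

With pooled resampling the group sizes change from replicate to replicate, which changes the standard errors in the t-statistic. Stratified resampling keeps both sizes at their original values.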
Calculate $t^*_b$ from each bootstrap sample.
proc sort data=boot_samples;
  by replicate;
run;
ODS listing close;                 * stop writing output;
proc ttest ci=none data=boot_samples;
  by replicate;
  class diabetic;
  var bg;                          * use transformed variable;
  ODS output ttests = tstars;
run;
ODS listing;                       * resume writing output;
proc print data=tstars (obs=10);
run;
Obs   Replicate   Variable   Method          Variances   tValue       DF    Probt
  1           1   bg         Pooled          Equal         1.42       26   0.1666
  2           1   bg         Satterthwaite   Unequal       1.42    25.97   0.1666
  3           2   bg         Pooled          Equal         1.63       26   0.1159
  4           2   bg         Satterthwaite   Unequal       1.63   22.991   0.1175
  5           3   bg         Pooled          Equal        -1.47       26   0.1541
  6           3   bg         Satterthwaite   Unequal      -1.47   25.124   0.1545
  7           4   bg         Pooled          Equal        -0.23       26   0.8189
  8           4   bg         Satterthwaite   Unequal      -0.23    24.27   0.8190
  9           5   bg         Pooled          Equal         1.65       26   0.1114
 10           5   bg         Satterthwaite   Unequal       1.65   23.113   0.1129
Which t-statistic do we want?
Use subsetting IF.
Rename tValue as t_star
data boot_t_stars;
  set tstars;
  if method = "Satterthwaite";                  * keep only the unequal-variance t;
  t_star = tValue;
  observed = 1.71;
  t_star_larger = (abs(t_star) GE observed);    * calculate indicator to get p-value;
  keep replicate t_star t_star_larger;
run;
proc print data=boot_t_stars (obs=10);
run;
Obs   Replicate     t_star   t_star_larger
  1           1    1.42297               0
  2           2    1.62647               0
  3           3   -1.46795               0
  4           4   -0.23128               0
  5           5    1.64802               0
  6           6    0.59854               0
  7           7   -0.65278               0
  8           8    1.24904               0
  9           9    1.80207               1
 10          10    0.00004               0

We have 2000 unequal-variance $t^*_b$ replicates from H0.
Draw a histogram of unequal-variance $t^*_b$ replicates
ODS graphics on;
Proc Univariate noprint data=boot_t_stars;
var t_star;
histogram t_star / cfill=graye03 ;
inset q1 median mean q3 / position=NE noframe;
run;
ODS graphics off;
What should this histogram look like?
Where should the mean be?
The TTEST Procedure

Variable        diabetic      N   Lower CL Mean     Mean   Upper CL Mean   Std Dev   Std Err
brain_glucose   0            14          4.6832   5.3214          5.9596    1.1054    0.2954
brain_glucose   1            14          4.1943   4.6857          5.1771    0.8511    0.2275
brain_glucose   Diff (1-2)               -0.131   0.6357          1.4021    0.9865    0.3728

Variable        Method          Variances     DF   t Value   Pr > |t|
brain_glucose   Pooled          Equal         26      1.71     0.1001
brain_glucose   Satterthwaite   Unequal     24.4      1.71     0.1009

Observed t = 1.71.
How do we get p-value from histogram?
p-value is sum of tail areas where $|t^*_b| \geq 1.71$.
To get this area, find proportion of bootstrap $t^*_b$ with $|t^*_b| \geq 1.71$.
This is the indicator variable t_star_larger.
Proc Freq data=boot_t_stars;
  table t_star_larger;
run;

                                         Cumulative   Cumulative
t_star_larger   Frequency   Percent      Frequency      Percent
-----------------------------------------------------------------
            0        1779     88.95           1779        88.95
            1         221     11.05           2000       100.00

bootstrap approximate two-sided p-value = 221/2000 = 0.1105
Bootstrap approximate two-sided p-value from B replicates:
$$p = \frac{\text{number of } |t^*_b| \geq |t_{obs}|}{B}$$
This is the proportion of test statistics from the H0 distribution that are at
least as extreme as the one we observed.
Bootstrap t-test gives p-value = 221/2000 = 0.1105.
Regular t-test gave p = 0.1009.
In this small sample, with normality in doubt, the bootstrap provides reassurance
that the t-test is not missing a real difference.
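The whole procedure (shift each sample to mean zero, resample within each group, compute the unequal-variance t from each resample, count the replicates at least as extreme as the observed t) can be put together in one Python sketch. It is an illustration under stated assumptions, not the lecture's SAS workflow: the function names `welch_t` and `bootstrap_t_test` and the data values are made up.

```python
import math
import random
import statistics

def welch_t(y, z):
    """Unequal-variance two-sample t statistic."""
    se2 = statistics.variance(y) / len(y) + statistics.variance(z) / len(z)
    return (statistics.fmean(y) - statistics.fmean(z)) / math.sqrt(se2)

def bootstrap_t_test(y, z, B=2000, seed=1):
    """Return (observed t, bootstrap two-sided p-value)."""
    rng = random.Random(seed)
    t_obs = welch_t(y, z)
    # Null-hypothesis samples: same spread as the originals, both with mean zero.
    y0 = [v - statistics.fmean(y) for v in y]
    z0 = [v - statistics.fmean(z) for v in z]
    t_star = []
    while len(t_star) < B:
        yb = rng.choices(y0, k=len(y0))   # resample within group Y
        zb = rng.choices(z0, k=len(z0))   # resample within group Z
        # Skip the rare resample where both groups are constant (zero SE).
        if len(set(yb)) > 1 or len(set(zb)) > 1:
            t_star.append(welch_t(yb, zb))
    p_value = sum(abs(t) >= abs(t_obs) for t in t_star) / B
    return t_obs, p_value

# Made-up data for illustration (hypothetical, not the brain glucose data).
y = [4.1, 5.0, 5.2, 5.9, 6.3, 7.8, 4.7, 5.5]
z = [3.9, 4.2, 4.8, 5.1, 4.4, 5.6, 4.0, 4.9]
t_obs, p_value = bootstrap_t_test(y, z)
print(t_obs, p_value)
```

Resampling each centered group separately corresponds to the stratified draw in Proc Surveyselect, and the final proportion is the same calculation as the Proc Freq table of t_star_larger.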
Missing Data Methods
Multicenter Depression trial (HAMD). This clinical trial randomly assigned 100
patients with major depression to an experimental drug (D) or to placebo (P).
(Source: Dmitrienko et al. (2005).)
Participants completed the Hamilton depression rating scale (HAMD) at baseline
and again at the end of the 9-week treatment. Study outcome was HAMD at end.
Allocation of patients at 5 clinical centers:

drug       center
Frequency |      1 |      2 |      3 |      4 |      5 |  Total
----------+--------+--------+--------+--------+--------+
Drug      |     11 |      7 |     16 |      9 |      7 |     50
----------+--------+--------+--------+--------+--------+
Placebo   |     13 |      7 |     14 |     10 |      6 |     50
----------+--------+--------+--------+--------+--------+
Total           24       14       30       19       13      100
The first 5 observations from the Depression Study data

ID   baseline   final   drug   center
 1         27       4   D           1
 2         27       9   D           1
 3         26       8   D           1
 4         27       5   D           1
 5         36       8   D           1
Model(s) to compare final HAMD between treatments, adjusted for baseline and
center:
We’ll return to these models to analyze this data.
Missing Values
Suppose that some final surveys were missing (not completed).
What happens to these participants’ data in fitting the adjusted model?
What if patients with the worst side-effects to the experimental drug (D) dropped
out and didn’t complete the final survey?
Types of Missing Data
Missing completely at random (MCAR): data are missing independently of both
observed and unobserved data.
Example: a participant flips a coin to decide whether to complete the depression
survey.
Missing at random (MAR): given the observed data, data are missing
independently of unobserved data.
Example: male participants are more likely to refuse to fill out the depression
survey, but it does not depend on the level of their depression.
MCAR implies MAR, but not the other way round. Most methods assume MAR.
We can ignore missing data ( = omit missing observations) if we have MAR or
MCAR.
Missing Not at Random (MNAR): missing observations related to values of
unobserved data.
Example: participants with severe depression were less likely to complete HAMD
form.
Informative missingness: the fact that data is missing contains information about
the response.
Observed data is biased sample. Missing data cannot be ignored.
Cannot distinguish MAR from MNAR without additional information.
SAS default is to omit cases with missing data = ignore missing data.
With MNAR, you get a non-representative sample and biased estimates.
References:
Dmitrienko et al. (2005) Analysis of Clinical Trials Using SAS, Chapter 5
R Little and D Rubin (2002) Statistical Analysis with Missing Data, Second Edition
Plan:
1. Delete observations from HAMD data to make an example of each type of
missing data.
2. Discuss approaches to handling missing data.
3. Compare these approaches on our constructed examples from HAMD.
Make missing completely at random (MCAR) example
MCAR: data are missing independently of both observed and unobserved data.
Example: participant flips a coin to decide whether to complete final survey.
Randomly select 30% of the observations in HAMD, set to missing.
data MCAR;
  set ph6470.hamd2;
  missing = 0;
  if (ranuni(457392) < .3) then do;    * select 30% random sample;
    final = .;
    missing = 1;                       * label missing values;
  end;
run;
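The same deletion rule can be sketched in Python for readers who want to experiment outside SAS. The records below are a hypothetical stand-in for the HAMD data, and the function name `make_mcar` is an assumption. Python's generator does not reproduce the SAS ranuni stream, so the same seed gives a different set of deleted rows.

```python
import random

def make_mcar(records, rate=0.3, seed=457392):
    """Copy the records, setting final to None with probability `rate` for each."""
    rng = random.Random(seed)
    out = []
    for rec in records:
        rec = dict(rec)             # copy so the original data stay intact
        rec["missing"] = 0
        if rng.random() < rate:     # deletion ignores every variable in the data
            rec["final"] = None
            rec["missing"] = 1
        out.append(rec)
    return out

# Hypothetical stand-in for the HAMD data: 100 records with made-up final scores.
data_rng = random.Random(0)
hamd = [{"id": i + 1, "final": data_rng.randint(1, 35)} for i in range(100)]

mcar = make_mcar(hamd)
n_missing = sum(rec["missing"] for rec in mcar)
print(n_missing)
```

Each record is deleted independently of the others, so the number deleted is random and will usually not be exactly 30.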
MCAR example, first 10 observations.

Obs   missing   baseline   final   drug   center
  1         0         27       4   D           1
  2         0         27       9   D           1
  3         0         26       8   D           1
  4         0         27       5   D           1
  5         0         36       8   D           1
  6         0         39      18   D           1
  7         0         25      14   D           1
  8         0         33       8   D           1
  9         0         38       9   D           1
 10         1         39       .   D           1
proc freq data=MCAR;
  tables missing;
run;

                                   Cumulative   Cumulative
missing   Frequency   Percent      Frequency      Percent
      0          67     67.00             67        67.00
      1          33     33.00            100       100.00

What percent are actually missing?
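The table shows 33% missing rather than the nominal 30%. Assuming each of the 100 observations is deleted independently with probability 0.3, as the ranuni cutoff implies, the number deleted X is binomial, and a count of 33 is well within its usual range:

```latex
X \sim \mathrm{Binomial}(n = 100,\ p = 0.3), \qquad
E[X] = np = 30, \qquad
SD(X) = \sqrt{np(1-p)} = \sqrt{21} \approx 4.58
```

The observed 33 is about (33 - 30)/4.58 ≈ 0.65 standard deviations above the expected count.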
Missing at random (MAR) example
Missing at random (MAR): given the observed data, data are missing
independently of unobserved data.
Example: male participants more likely to refuse to fill out final survey,
independent of their level of depression.
Data does not include gender. Missing values related to observed data: only at
centers 1, 2, and 3.
Need to get ≈ 33 missing cases. Centers 1, 2, 3 together have 64/100 patients in
study. What proportion p should be missing?
$p \times 64 = 33$ gives $p = .516$
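The cutoff follows from solving for p with the 64 patients stated in the text:

```latex
p \times 64 = 33 \quad\Longrightarrow\quad p = \frac{33}{64} \approx 0.516
```

For comparison, the center totals in the allocation table for centers 1, 2, and 3 are 24 + 14 + 30 = 68, which would give 33/68 ≈ 0.485. The figure of 64 is taken here as given in the text.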
data MAR;
  set ph6470.hamd2;
  missing = 0;
  if (ranuni(457392) < .516
      and center IN (1, 2, 3))
  then do;
    final = .;
    missing = 1;
  end;
run;
proc freq data=MAR;
  tables missing;
run;

                                   Cumulative   Cumulative
missing   Frequency   Percent      Frequency      Percent
-----------------------------------------------------------
      0          63     63.00             63        63.00
      1          37     37.00            100       100.00
Adjusting the cutoff for the uniform random number gives:
data MAR;
  set ph6470.hamd2;
  missing = 0;
  if (ranuni(457392) < .435
      and center IN (1, 2, 3)) then do;
    final = .;
    missing = 1;
  end;
run;
This produces 34 missing values, nearly the same number as the MCAR example.
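A Python sketch of the MAR rule makes the key property easy to check: whether a value is deleted depends only on an observed variable (center), never on the final score itself. The records are a hypothetical stand-in for the HAMD data, the function name `make_mar` is an assumption, and the random stream differs from SAS ranuni, so the missing count will not match the SAS output.

```python
import random

def make_mar(records, centers=(1, 2, 3), rate=0.435, seed=457392):
    """Set final to None with probability `rate`, but only at the given centers."""
    rng = random.Random(seed)
    out = []
    for rec in records:
        rec = dict(rec)
        rec["missing"] = 0
        u = rng.random()                      # one uniform draw per record
        if u < rate and rec["center"] in centers:
            rec["final"] = None
            rec["missing"] = 1
        out.append(rec)
    return out

# Hypothetical stand-in for the HAMD data (made-up centers and scores).
data_rng = random.Random(0)
hamd = [
    {"id": i + 1, "center": data_rng.randint(1, 5), "final": data_rng.randint(1, 35)}
    for i in range(100)
]

mar = make_mar(hamd)
missing_centers = {rec["center"] for rec in mar if rec["missing"] == 1}
print(sorted(missing_centers))   # only centers 1, 2, 3 can appear here
```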
MAR example, first 10 observations.

Obs   missing   baseline   final   change   drug   center
  1         1         27       .       23   D           1
  2         0         27       9       18   D           1
  3         0         26       8       18   D           1
  4         0         27       5       22   D           1
  5         0         36       8       28   D           1
  6         0         39      18       21   D           1
  7         1         25       .       11   D           1
  8         0         33       8       25   D           1
  9         1         38       .       29   D           1
 10         1         39       .       18   D           1
Missing not at random (MNAR) example
MNAR: missing observations related to values of unobserved data.
Example: participants with most severe depression were less likely to complete
final HAMD survey.
Identify “high” final values.
Randomly select 33 among these to delete, so that we have the same amount of
missing data as the other examples.
How do we identify top 50% of final values?
Proc univariate data=ph6470.hamd2;
  var final;
run;

Quantile      Estimate
100% Max          35.0
99%               34.0
95%               28.0
90%               23.5
75% Q3            19.0
50% Median        14.5
25% Q1             8.0
10%                4.0
5%                 2.0
1%                 1.0
0% Min             1.0
What proportion do we remove? $p \times 50 = 33$ gives $p = .66$
data MNAR;
  set ph6470.hamd2;
  missing = 0;
  if (final GE 14.5
      and ranuni(884739) < .66) then do;
    final = .;
    missing = 1;
  end;
run;
proc freq data=MNAR;
  tables missing;
run;

                                   Cumulative   Cumulative
missing   Frequency   Percent      Frequency      Percent
      0          70     70.00             70        70.00
      1          30     30.00            100       100.00
Trial and error leads to:
data MNAR;
  set ph6470.hamd2;
  missing = 0;
  if (final GE 14.5 and ranuni(884739) < .69) then do;
    final = .;
    missing = 1;
  end;
run;
which gives 33 missing values.
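A Python sketch of the MNAR rule shows why this kind of missingness cannot be ignored: deleting values from the upper half pulls the mean of the remaining (complete-case) values below the mean of the full data. The scores are made up, the function name `make_mnar` is an assumption, and the random stream differs from SAS ranuni.

```python
import random
import statistics

def make_mnar(finals, cutoff, rate=0.69, seed=884739):
    """Replace values at or above `cutoff` with None with probability `rate`."""
    rng = random.Random(seed)
    out = []
    for v in finals:
        u = rng.random()                       # one uniform draw per record
        out.append(None if (v >= cutoff and u < rate) else v)
    return out

# Hypothetical final HAMD scores (made-up, 100 values between 1 and 35).
data_rng = random.Random(0)
finals = [data_rng.randint(1, 35) for _ in range(100)]

cutoff = statistics.median(finals)             # top 50% are eligible for deletion
mnar = make_mnar(finals, cutoff)

observed = [v for v in mnar if v is not None]
full_mean = statistics.fmean(finals)
cc_mean = statistics.fmean(observed)           # complete-case mean
print(full_mean, cc_mean)
```

Here the deletion depends on the value that ends up missing, so the observed values are a biased sample of the full data.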
MNAR example, first 10 observations:

Obs   missing   baseline   final   change   drug   center
  1         0         27       4       23   D           1
  2         0         27       9       18   D           1
  3         0         26       8       18   D           1
  4         0         27       5       22   D           1
  5         0         36       8       28   D           1
  6         1         39       .       21   D           1
  7         0         25      14       11   D           1
  8         0         33       8       25   D           1
  9         0         38       9       29   D           1
 10         1         39       .       18   D           1
Plan:
1. Delete observations from HAMD data to make an example of each type of
missing data: MCAR, MAR, MNAR.
All data sets have about 33% missing data (33, 34, and 33 missing values).
2. Discuss approaches to handling missing data.
3. Compare these approaches on our constructed examples from HAMD.
Results will depend on type of missingness, not amount of missing data.