Modeling Non-Independence in Meta

Meta-Analysis of Correlated
Data
Meta-Analysis of Correlated
Data
Common Forms of Dependence
• Multiple effects per study
– Or per research group!
• Multiple effect sizes using same control
• Phylogenetic non-independence
• Measurements of multiple responses to a
common treatment
• Unknown correlations…
Multiple Sample Points per Study!
Study
Experiment in Study
Hedges D
V Hedges D
Ramos & Pinto 2010
1
4.32
7.23
Ramos & Pinto 2010
2
2.34
6.24
Ramos & Pinto 2010
3
3.89
5.54
Ellner & Vadas 2003
1
-0.54
2.66
Ellner & Vadas 2003
2
-4.54
8.34
Moria & Melian 2008
1
3.44
9.23
Hierarchical Models
• Study-level random effect
• Study-level variation in coefficients
• Covariates at experiment and study level
Hierarchical Models
• Random variation within study (j) and
between studies (i)
Tij ~ N(qij, sij2)
qij ~ N(yj,wj2)
yj ~ N(m,t2)
Study Level Clustering
Hierarchical Partitioning of One
Study
Study Mean
Variation due to w
Variation due to t
Grand Mean
Example: Data Set 1
1
2
3
4
5
6
7
8
...
Group Effect Variance
A
0.2
0.10
A
0.6
0.15
A
0.5
0.05
A
0.1
0.06
B
0.8
0.08
B
0.4
0.05
B
0.9
0.04
C
0.2
0.09
A Two-Step Solution
Tij ~ N(qij, sij2)
qij ~ N(yj,wj2)
yj ~ N(m,t2)
library(plyr)
data1_study <- ddply(data1, .(Group),
function(adf){
mod <- rma(Effect, Variance, data=adf)
cbind(theta_j = coef(mod),
se_theta_j = coef(summary(mod))[1,2],
omega2 = mod$tau2)
})
A Two-Step Solution
Tij ~ N(qij, sij2)
qij ~ N(yj,wj2)
yj ~ N(m,t2)
> data1_study
Group
theta_j se_theta_j
omega2
1
A 0.3312500 0.1369306 0.00000000
2
B 0.7005364 0.1654476 0.02854676
3
C 0.6788453 0.1987595 0.17151248
4
D 0.7836646 0.2677693 0.26470540
5
E 0.8552760 0.1556476 0.14561528
yj
wj
A Two-Step Solution
Tij ~ N(qij, sij2)
qij ~ N(yj,wj2)
yj ~ N(m,t2)
> rma(theta_j, I(se_theta_j^2), data=data1_study)
Random-Effects Model (k = 5; tau^2 estimator: REML)
t2
tau^2 (estimated amount of total heterogeneity): 0.0272 (SE = 0.0414)
...
estimate
se
zval
pval
ci.lb
ci.ub
0.6472
0.1087
5.9545
<.0001
0.4342
0.8603
***
m
Multiple Effects per Research
Group
Solutions to Multiple Hierarchies
• Multiple-Step Meta-analyses
• Multi-level hierarchical model fits
– Better estimator
– Accommodates more complex data structures
– May need to go Bayesian (don't be scared!)
• Model correlation…
Common Forms of Dependence
• Multiple effects per study
– Or per research group!
• Multiple effect sizes using same control
• Phylogenetic non-independence
• Measurements of multiple responses to a
common treatment
• Unknown correlations…
Multiple Effect Sizes with
Common Control
Effect of each treatment calculated using same control!
The Control Keeps Showing Up!
Treatment - Control
D=
JM
SDPooled
1
1
d
s (D) =
+
=
nt
n c 2n t +c
2
• nc and sdc are going to be the same for all
treatments
• Effect sizes will covary
Calculating Covariance
Treatment - Control
JM
D=
SDPooled
d
1
1
=
+
s (D) =
n c 2n t +c
nt
2
Dt1Dt 2
1
cov(DT 1, DT 2 ) = +
n c 2n c +t1+t 2+t 3+...
Formulae available or derivable for all effect sizes
A Mixed Effect Group Model
• Group means, random study effect,
and then everything else is error
Ti ~ N(qim, si2)
where
qim ~ N(mm,t2)
A Mixed Effect Group Model
• Group means, random study effect,
and then everything else is error
Ti ~ MVN(qi, Si)
where
qi ~ MVN (Xim, G2)
What are qi and Si?
qi =
Si=
éq i1 ù
ê ú
êq i2 ú
êë ... úû
Ti ~ MVN(qi, Si)
és 2 i1 0
0ù
ê
ú
2
ê 0 s i2 0 ú
êë 0
0 ... úû
What about the treatment effects?
é m1 ù
ê ú
ê m2 ú
êë ... úû
Xi =
é1 0 0 ù
ê
ú
0
1
0
ê
ú
êë0 0 ... úû
Gi=
ét 2 0 0 ù
ê
ú
2
2)
q
~
MVN
(X
m,
G
0
t
0
i
ê
ú i
êë 0 0 ... úû
m=
What if treatments are correlated?
Ti ~ MVN(qi, Si)
Si =
és 2 i1 si12 ... ù
ê
ú
2
êsi12 s i2 ... ú
êë ...
... ... úû
Why does covariance matter?
s2x-y = s2x + s2y + 2s x,y
• In asking if two treatments differ, cov helps
tighten confidence intervals
• High cov = more weight for a study as
treatments share information
Multiple Treatments
1
2
3
4
5
6
study trt m1i
m2i
sdpi n1i n2i
1
1 7.87 -1.36 4.2593 25 25
1
2 4.35 -1.36 4.2593 22 25
2
1 9.32 0.98 2.8831 38 40
3
1 8.08 1.17 3.1764 50 50
4
1 7.44 0.45 2.9344 30 30
4
2 5.34 0.45 2.9344 30 30
Common
Control!
http://www.metafor-project.org/doku.php/analyses:gleser2009
Calculating the
Variance/Covariance Matrix
[1,]
[2,]
[3,]
[4,]
[5,]
[6,]
[,1]
0.113
0.060
0.000
0.000
0.000
0.000
[,2]
0.060
0.098
0.000
0.000
0.000
0.000
[,3]
0.000
0.000
0.105
0.000
0.000
0.000
http://www.metafor-project.org/doku.php/analyses:gleser2009
[,4]
0.000
0.000
0.000
0.064
0.000
0.000
[,5]
0.000
0.000
0.000
0.000
0.098
0.055
[,6]
0.000
0.000
0.000
0.000
0.055
0.082
Fitting a Model with a VCOV Matrix
> rma.mv(yi ~ factor(trt)-1,
V,
random =~ 1|study,
data=dat)
Comparison to No Correlation
Model
With correlation
estimate
factor(trt)1
factor(trt)2
2.3796
1.5784
se
0.1641
0.2007
zval
pval
ci.lb
ci.ub
14.4984
7.8662
<.0001
<.0001
2.0579
1.1851
2.7013
1.9716
zval
15.7196
7.1405
pval
<.0001
<.0001
ci.lb
2.0797
1.1011
ci.ub
2.6722
1.9343
Without correlation
factor(trt)1
factor(trt)2
estimate
se
2.3759 0.1511
1.5177 0.2125
Common Forms of Dependence
• Multiple effects per study
– Or per research group!
• Multiple effect sizes using same control
• Phylogenetic non-independence
• Measurements of multiple responses to a
common treatment
• Unknown correlations…
Effect Size on Related
Organisms Not Independent
Warming on
Litterfall
Pine Trees
Redwoods
Fir Trees
Oak Trees
Phylogenetic Distances Determines
Covariances for Weights
What about Multiple Studies of
Some Species?
Common Forms of Dependence
• Multiple effects per study
– Or per research group!
• Multiple effect sizes using same control
• Phylogenetic non-independence
• Measurements of multiple responses to a
common treatment
• Unknown correlations…
Common Treatments
Treatment
Response 1
Response 2
Response 3
Common Treatments
CO2
CO2
Assimilation
GS
Stomatal
Conductance
PN
Correlation Between Responses
What does Correlation between
effects mean?
Xi =
Gi=
é1 0 0 ù
ê
ú
0
1
0
ê
ú
êë0 0 ... úû
é t2
1
ê
êrt1t 2
êë ...
m=
é m1 ù
ê ú
ê m2 ú
êë ... úû
rt1t 2 ... ù
ú q ~ MVN (X m, G2)
2
i
t2
... ú i
...
... úû
What Do We Do?
1. Create a 'composite' measure
– Average
– Weighted Average
2. Estimate different coefficients directly
3. Robust Variance Estimation (RVE)
The CO2 Effect Data
1
2
3
4
5
6
7
8
9
10
experiment Paper Measurement Hedges Var
1
121
GS -0.4862 0.3432
1
121
PN 0.9817 0.3735
2
121
GS 0.1535 0.3343
2
121
PN 2.0668 0.5113
3
121
GS 0.0965 0.3337
3
121
PN 2.6101 0.6172
4
121
GS 0.0000 0.2857
4
121
PN 3.6586 0.7638
5
168
GS -1.5271 0.4305
5
168
PN 1.8355 0.4737
Direct Estimation
rma.mv(Hedges ~ Measurement,
Var,
random =~ Measurement|Paper,
data=co2data,
struct="HCS")
r and Different Correlation
Structures
• Different structures for different data
• We do not always know which one is correct!
Estimates of Variance, Covariance
Multivariate Meta-Analysis Model (k = 68; method:
REML)
Variance Components:
outer factor: Paper
(nlvls = 18)
inner factor: Measurement (nlvls = 2)
tau^2.1
tau^2.2
rho
estim
4.5098
3.5799
0.4751
sqrt
2.1236
1.8921
k.lvl
34
34
fixed
no
no
no
level
GS
PN
Disadvantages to Multivariate
Meta-Analysis
1. Difficult to estimate with few studies
2. Additional assumptions of covariance
structure
3. Often little improvement over univariate
meta-analysis
4. Publication bias exacerbated if data not
missing at random
Jackson et al. 2011 Satist. Med.
Robust Variance Estimation
• Essentially, bound weights within a group j
to 1/mean varj and assume a value of r
– Test sensitivity to choice of r
– Correct DF for small sample sizes
• Methods developed by Hedges, Tipton,
and others
• robumeta package in R
robumeta & RVE
library(robumeta)
robu(Hedges ~ Measurement,
data=co2data,
studynum=Paper,
var.eff.size=Var)
RVE: Correlated Effects Model with SmallSample Corrections
Model: Hedges ~ Measurement
Number of studies = 18
Number of outcomes = 68 (min = 2 , mean =
3.78 , median = 4 , max = 10 )
Rho = 0.8
I2 = 85.59992
Tau.Sq = 2.561661
Struct="CS" only so far
Often, Choice of r Matters Little
> sensitivity(co2modRVE)
Type
Variable
1 Estimate intercept
2
- MeasurementPN
3 Std. Err. intercept
4
- MeasurementPN
5
Tau.Sq
-
rho=0
0.00454
1.03149
0.51173
0.61984
2.55334
rho=0.2
0.00457
1.03139
0.51179
0.61990
2.55542
rho=0.4
0.00459
1.03128
0.51185
0.61996
2.55750
rho=0.6
0.00462
1.03118
0.51192
0.62003
2.55958
rho=0.8
0.00464
1.03107
0.51198
0.62009
2.56166
rho=1
0.00467
1.03097
0.51204
0.62015
2.56374
Results May Differ…
Multivariate Meta-Analysis
Model Results:
intrcpt
MeasurementPN
estimate
-0.0503
1.0579
se
0.5221
0.5359
zval
-0.0963
1.9742
pval
0.9233
0.0484
ci.lb
-1.0735
0.0076
ci.ub
0.9730
2.1082
*
Robust Variance Estimation
Model Results:
Estimate StdErr t-value
Sig
1 intercept
2 MeasurementPN
0.00464
1.03107
df P(|t|>) 95% CI.L 95% CI.U
0.512 0.00907 16.7
0.620 1.66278 16.7
0.993
0.115
-1.077
-0.279
1.09
2.34
Other Sources of Unknown
Correlations
• Shared system types
• Shared environmental events
• Labs or investigators
• Re-sampling experiments
• Experiments repeated in a region
• More…
Why Model Correlation instead
of Hierarchy?
• Depends on question
• Analytical difficulty
• Leveraging correlation to aid with missing
data