Structural Equation Modeling, 19:351–371, 2012
Copyright © Taylor & Francis Group, LLC
ISSN: 1070-5511 print/1532-8007 online
DOI: 10.1080/10705511.2012.687661

An Examination of Statistical Power in Multigroup Dynamic Structural Equation Models

John J. Prindle and John J. McArdle
University of Southern California

Correspondence should be addressed to John J. Prindle, Department of Psychology, 3620 McClintock Ave., SGM 501, Los Angeles, CA 90089, USA. E-mail: [email protected]

This study used statistical simulation to calculate differential statistical power in dynamic structural equation models with groups (as in McArdle & Prindle, 2008). Patterns of between-group differences were simulated to provide insight into how model parameters influence power approximations. Chi-square and root mean square error of approximation (RMSEA) power approximation procedures were used to compare the effects of parameter manipulations and how researchers should interpret findings. The chi-square power of perfect fit calls for at least 270 individuals to detect moderate differences, whereas the RMSEA procedure of close fit seems to require as many as 1,450 participants. It is shown that parameters that provide direct input into the change score targeted by the transfer affect power more than indirect pathways. A discussion of differences in approximation values and future research directions follows.

Keywords: DSEM, multigroup, power

The classic definition of statistical power is the ability of a researcher to detect the direction of an effect using a null hypothesis test of significance (see Cohen, 1988). In the typical equation, power = 1 − β, where β is the probability of Type II error we are willing to accept on an a priori basis. Traditionally acceptable Type II error rates have fluctuated from area to area within psychology, with a maximum proposed rate of 20%, corresponding to a minimum power level of 80% (Cohen, 1988). Others still have indicated that higher levels of power are necessary for good psychological studies—in the 95% "equipotent" range (Muller & Benignus, 1992; O'Brien & Muller, 1993). This prior research outlines considerations when planning studies and calculating effective sample sizes given the phenomenon under investigation.

The purpose of this article is to provide a framework for analyzing the effects of dynamic structural equation models (McArdle, 2009) in relation to statistical power. We will use a dynamic structural equation modeling (SEM) model with dual latent change scores (LCSs), recently used to test the nature of transfer from the trained domain to a related domain (McArdle & Prindle, 2008). The use of longitudinal and multivariate data allows researchers to take advantage of cross-trait and within-person variation, but the question of how many actual data points are needed to retain traditional accuracy of testing is still unknown. All prior SEM research indicates that a larger sample size will be needed when the analysis is based on latent variables (Tanaka, 1987). In addition, when a larger number of parameters is being estimated, more data are needed to measure those parameters accurately. The question we address here is practical: How many people are needed for adequate statistical tests using these seemingly advanced models?
STATISTICAL POWER ANALYSIS

The issue of power in multivariate studies presents us with an opportunity to use data more effectively (e.g., Schmitz, Cherny, & Fulker, 1998). One overarching theme that seems relevant here is that good research requires an analysis of power in advance of model analyses (Baguley, 2004; Hoenig & Heisey, 2001). Of course, the history of psychology is littered with ideas and theories that have been less than fruitful because of null results (Maxwell, 2004). By focusing on data that are available, our effect sizes, and thus our initial hypotheses, might be biased because of the number of underpowered studies within psychology. To address this apparent gap in the literature, Thomas (1997) proposed a proactive approach to studies of power. It is important to note that the implementation of power is just as crucial as following proper ethical standards (see McArdle, 2010).

It now seems common to calculate power using a reactive approach, but this actually does little to make a case for adequate sample sizes because power for such tests is based largely on the fitted model p value (Hoenig & Heisey, 2001). Some have proposed research designs based on priors and updated posteriors once results are obtained, instead of recalculating and repackaging a result. Of course, if we have them, we can use priors to determine measures and models, but if the sample size is not explicitly calculated beforehand, then any power analysis after the fact is not fully informative.

An advantage of planning research in advance is that one can work more efficiently. The cost-effectiveness of a study is important to its viability. Given that funding is limited in research, it would benefit researchers to know whether they would receive more bang for their buck from more participants versus more replicate measures (Allison et al., 1997). A more recent aspect of study design has been optimizing study size and duration by reducing structural equation models to their most basic forms (von Oertzen, 2010). In addition, an emphasis on the reliability of measures and on within-person variation in SEM analysis of covariance designs produces strong, testable models based on theory. In experimental design we can see the benefit of streamlining and preparing for studies so that optimal outcomes are more likely on an a priori basis (i.e., significant p values).

To determine SEM power we need the sample size (n), the size of the effect (d), and the Type I error rate (α). These steps are outlined in Cohen (1988) with "rules of thumb" for interpretation of the calculated effect sizes for many basic data analysis models (i.e., contingency tables, analyses of variance [ANOVA], regression, etc.). A simple reorganization of these variables allows us to determine effect size or sample size if power is held constant. The concept of power goes beyond the standard practices of minimum group sample sizes and research rules of thumb; power is a way of determining how many people are needed to find an effect of a given treatment. From this point of view, we place more importance than before on the size of the effect we are looking for (Cohen, 1990). Studies are based on prior research findings, and we use these results to determine how to effectively measure our research question. In this way, proper statistical analysis of the research carried out will provide definitive results on the theory under investigation.
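For a simple two-group mean comparison, the three ingredients just described (n, d, and α) combine into a closed-form approximation. The following is a minimal worked example using the standard normal approximation (a textbook result, not a formula from this article):

$$\text{power} \approx \Phi\!\left(d\sqrt{n/2} - z_{1-\alpha/2}\right).$$

For example, with d = 0.5, n = 64 per group, and two-tailed α = .05, we obtain 0.5·√32 − 1.96 ≈ 0.87, and Φ(0.87) ≈ .81, or roughly 80% power.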
There are many prior research articles attempting to understand how power is calculated in multivariate and longitudinal designs (Duncan, Duncan, & Li, 2003; MacCallum, Browne, & Sugawara, 1996; McQuitty, 2004; Muthén & Curran, 1997; Muthén & Muthén, 2002; Raudenbush & Liu, 2000; Satorra & Saris, 1985; Schmitz et al., 1998; Tanaka, 1987; Yuan & Hayashi, 2003). There are many benefits to using an SEM analysis over traditional statistical methods. The ability to test specific hypotheses within the model, as well as to carry out traditional tests (i.e., ANOVA) within the same structure, makes it a versatile tool. The use of factor models and longitudinal data requires an advancement of methods for power calculation beyond the initial method outlined by Cohen (1988). We can build on this literature by implementing techniques for power calculation in SEM given the fit indexes and issues we encounter there. Using these powerful tools there is ample opportunity to accurately test any hypotheses researchers might have.

FITTING STRUCTURAL EQUATION MODELS

To determine model fit in SEM, a few methods and procedures should be introduced. A basic method of determining model fit is the use of the χ² difference test (Anderson & Gerbing, 1988; Saris & Satorra, 1993; Satorra & Saris, 1985; Tucker & Lewis, 1973). In this method we use nested models to test how certain pathways can be constrained within each sequence of models. Nested models are those that have exactly the same structure but with constraints added or removed, which changes the number of parameters estimated. This changes the degrees of freedom (df) for the model depending on how many constraints are added or removed. The change in χ² model fit is compared to the change in df between those same models. The difference in χ² values of these nested models can then be compared to determine which model has the best fit. Even though the χ² values of the nested models are correlated, the difference in χ² between models is independent of the less constrained model's fit, allowing for a χ² test of significance based on the change in df (Steiger, Shapiro, & Browne, 1985).

The next method of model fit examines the closeness of fit, which is known as the root mean square error of approximation (RMSEA). This measure of fit was introduced by Steiger and Lind (1980) to measure the degree to which the model deviates from close fit, taking into account the number of parameters in the model. Traditionally the RMSEA is presented with a 90% confidence interval as a way of representing the range of close fit. The idea of such an index of fit is that model fit increases as a model becomes increasingly complex; that is, as more parameters are added (Steiger, 1990). To compare nested models properly and fairly, we need to account for differences in model complexity based on the number of parameters each contains.

One advantage of SEM methods is the use of multigroup designs to test training effects. When a multigroup design is used, the model can be constrained in full or in part throughout the model. Selected constraints are commonly used to allow for between-group variations in parameters (Raykov, 1997). With this method of model fitting we can allow groups to differ in key areas that we know to be true (e.g., change score means), while still testing other parameters in our power analyses.
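Both procedures just described can be written compactly (standard formulations, stated here for reference; n denotes the total sample size). The nested-model comparison uses

$$\Delta\chi^2 = \chi^2_{\text{constrained}} - \chi^2_{\text{free}}, \qquad \Delta df = df_{\text{constrained}} - df_{\text{free}},$$

rejecting the equality constraints when Δχ² exceeds the critical value of the central χ² distribution on Δdf. The close-fit index is commonly computed as

$$\text{RMSEA} = \sqrt{\max\!\left(\frac{\chi^2 - df}{df\,(n-1)},\ 0\right)},$$

which penalizes misfit per degree of freedom and thus rewards parsimony.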
Multigroup designs have been used successfully to detect group differences in key parameters in past studies (McArdle & Prindle, 2008; Molenaar, Dolan, & Wicherts, 2009; Qureshi & Compeau, 2009). Provided the measurements are equal between groups and the constructs are identical in meaning, we can compare the scores across the groups for equality.

There are other ways of comparing model fit in SEM programs. In practical use, the findings of how these indexes compare have been less than optimistic. One index might provide different conclusions than other indexes on the same model and data (Fan, Thompson, & Wang, 1999). Others have also shown that discrepancies might arise among standard indexes of model fit (Marsh, Balla, & McDonald, 1988).

POWER APPROXIMATIONS IN STRUCTURAL EQUATION MODELING

Some examples of SEM power analyses have been extensive in analyzing the effects of covariance structure on resultant power. MacCallum, Browne, and Cai (2006) showed how nested models could be compared in SEM programs to yield reasonable power approximations for prospective studies. Several studies present standard methods for examining power in a latent growth model framework (Fan, 2003; Hertzog, Lindenberger, Ghisletta, & von Oertzen, 2006; von Oertzen, Ghisletta, & Lindenberger, 2010). These studies provide a set of guidelines to which we can adapt our problem. In creating a set of nested structural equation models with latent variables explicitly defined, we can test what sample sizes are approximated with a given effect size.

The most tractable method is that of Satorra and Saris (1985). This is a two-step method based on the noncentrality parameter of the likelihood ratio test χ² value. First, a model is fit with parameters fixed at their alternative-hypothesis values to obtain the covariance structure implied when the alternative hypothesis is true. The null hypothesis model is then fit to that structure and the noncentrality parameter is estimated. The df for the test is calculated as the difference between the two hypotheses, and the critical value is the χ² value associated with those degrees of freedom. This is used in the noncentral distribution function to determine power for detecting the effect in the alternative hypothesis. Alternatives to the likelihood ratio test are the modification index and the Wald test. The modification index is an estimate of the change in fit from freeing one parameter in the model. The Wald test uses the squared t value for the parameter.

Another way to compute statistical power in an SEM setting is to use Monte Carlo methods. The most influential method of this type was proposed first by Steiger (1990) and extended by Muthén and Muthén (2002). First, a population model is established with population values used to create sample data sets. Second, the samples are fit to the model and model fits are obtained. Then the alternative model is fit, and the difference in model fits can be calculated and tested, just as in the two-step method of Satorra and Saris (1985). Another simple technique is to record the percentage of successful rejections of the null model across Monte Carlo simulations (Duncan, Duncan, & Li, 2006). These methods average the percentage of successes over many repetitions to minimize random anomalies in the simulation of the data.
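In either approach, the power computation itself reduces to a tail probability of a noncentral χ² distribution (a standard result, stated here for reference): with degrees of freedom d, noncentrality parameter λ estimated from the misfit of the null model, and critical value c_α taken from the central χ² distribution,

$$\text{power} = P\!\left(\chi^2_d(\lambda) > c_\alpha\right), \qquad \text{where } P\!\left(\chi^2_d(0) > c_\alpha\right) = \alpha.$$

Because λ grows in proportion to sample size, the same parameter difference becomes detectable as n increases.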
Earlier methods are described by MacCallum et al. (1996) for models including latent variables. This group proposes that the measure of close fit, the RMSEA, be used to calculate a model's power for a given effect size. The concept of close fit as given by the RMSEA measures how closely the model approximates the data, as opposed to the exact misfit given by the χ² measure. The noncentral distributions of RMSEA values for the null and alternative hypotheses are compared, and the power is the probability that we can detect a difference between these two distributions. Power is a function of the noncentral distribution for the degrees of freedom, d, and the noncentrality parameter λ for the alternative hypothesis. This is compared to the critical value given by a specified level of α. The calculated power is given as the area under the curve beyond the critical value. The null hypothesis values are the calculated RMSEA values for our models from the simulation runs. These are then compared to a small difference in model fits that is specified beforehand. For example, if we want to know the probability that a null and alternative model are significantly different, we can test these values with a null model RMSEA fit = .05 against an alternative model RMSEA fit = .03. Although these specific values are arbitrary, the concept they represent is not.

STUDY FOCUS

We start with a dynamic model of the kind defined in McArdle and Prindle (2008). This is a model of the expected means and covariances across at least two groups—randomly assigned control and experimental groups—based on two or more occasions. The purpose of this Monte Carlo study is to determine sample size recommendations for a basic set of dynamic model issues raised by other researchers.

METHODS

Latent Change Score Model

The use of LCSs in SEM is a technique for obtaining an unobserved change in a measurement (McArdle, 2009; McArdle & Hamagami, 2003; McArdle et al., 1998). These longitudinal analyses allow us to use the observed scores and estimate the change as a latent variable in the model. The use of LCSs in psychological measurement is explicitly outlined in Ferrer and McArdle (2010). The utility of such models to provide insight into unobserved changes between observations gives researchers a strong tool in multivariate analyses.

The first equation defines the relationship of the two observed scores to the latent change score,

$$y_2 = y_1 + \Delta y,$$

where the components are defined as

$$y_1 = \mu_{y_1} + e_{y_1} \quad \text{and} \quad \Delta y = \alpha_y + \beta_y\, y_1 + e_{\Delta y}.$$

The second score is a one-to-one function of both the first score and the unobserved change. From this fixed relationship we can break the two components down further. The first observed score has a mean and variation, both unique to it. The LCS is given a mean and error as well, but it is also related to the first score. The relationship between the first score and the latent change is indicated by the weight βy; the strength of the relationship depends on the estimate of this weight.
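Substituting the components into the first equation makes the model-implied Time 2 moments explicit (a direct derivation from the equations above, using the same notation):

$$y_2 = \alpha_y + (1 + \beta_y)\,y_1 + e_{\Delta y},$$

so that

$$E[y_2] = \alpha_y + (1 + \beta_y)\,\mu_{y_1}, \qquad \mathrm{Var}(y_2) = (1 + \beta_y)^2\,\sigma^2_{y_1} + \sigma^2_{\Delta y}.$$

This is why, in the simulations reported later, the Time 2 means and variances can be fixed at zero and generated entirely by the Time 1 and change score parameters.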
The multigroup representation of the LCS model is shown in Figure 1. The LCS model is used in this study as a group model with constraints held equal across groups. This makes it possible to test for group differences by relaxing between-group constraints on key hypotheses. These nested models—models that share the same structure but differ in parameter constraints—can then be examined for significant changes in model fit (Bagozzi & Yi, 1988). Further work by McArdle and Prindle (2008) presented a method of testing factor dynamics in clinical trials based on nested LCS models, building on work by McArdle (2007).

FIGURE 1 A two-group latent change score structural equation model. The parameters are shown as equal across both groups, indicating that both groups have the same weights for all of the pathways. Nested models will be models that have the same structure but vary which parameters are equal across groups. In this way we can test means, regression weights, and variances individually for group differences.

Dual Latent Change Score Model

We define the dual LCS model as modeling the relationship of two constructs over time with crossed and lagged effects. The model proposed is the final model in the analyses of a randomized clinical trial testing interventions on aging effects, which tested between-group differences in model constraints as shown in Figure 2 (McArdle & Prindle, 2008). In this article, the dual LCS model tests how the previous scores are related to estimated change. The model is fully crossed and lagged, and each pathway is tested for significance of temporal effects. Because it is a dual change score (DCS) model, we can couple the two measured constructs together and determine the interaction of the measures within time and over time. If y is the first measure, we can call x our second variable, and the relationship of x with itself over time is given as

$$x_2 = x_1 + \Delta x.$$

It then follows that the individual pieces of x₂ are

$$x_1 = \mu_{x_1} + e_{x_1} \quad \text{and} \quad \Delta x = \alpha_x + \beta_x\, x_1 + e_{\Delta x}.$$

To couple these models we say that the change scores are now a function of both of the previous scores. The coupled model is given as the system of equations

$$\Delta y = \alpha_y + \beta_y\, y_1 + \gamma_{xy}\, x_1 + e_{\Delta y} \quad \text{and} \quad \Delta x = \alpha_x + \beta_x\, x_1 + \gamma_{yx}\, y_1 + e_{\Delta x}.$$

This addition to the LCS model couples the constructs and creates parameters estimating the effects of x₁ on Δy and of y₁ on Δx, as indicated by the γ symbol.
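For compactness, the coupled system can also be written in matrix form (an equivalent restatement of the two equations above, not an additional model):

$$\begin{pmatrix} \Delta y \\ \Delta x \end{pmatrix} = \begin{pmatrix} \alpha_y \\ \alpha_x \end{pmatrix} + \begin{pmatrix} \beta_y & \gamma_{xy} \\ \gamma_{yx} & \beta_x \end{pmatrix} \begin{pmatrix} y_1 \\ x_1 \end{pmatrix} + \begin{pmatrix} e_{\Delta y} \\ e_{\Delta x} \end{pmatrix},$$

which makes clear that the off-diagonal γ weights are what carry influence across constructs from one occasion to the next.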
FIGURE 2 The dual change score model is represented as a multigroup model, similar to the latent change score model in Figure 1. The parameters are written as identical across groups, which would indicate that there is no difference between the groups in the model being tested.

We use the factor scores to create a score that is the combination of the shared variation in the construct they represent. So for each construct we have three measures that are related to the construct, and our factor structure is rotated to maximize the variance explained by these three measures. With a given factor structure, a score is provided for each person, and the dual LCS model is used with factors as the Time 1 and Time 2 scores in the model.

Monte Carlo Simulation

The Mplus program (Muthén & Muthén, 2010) was used to simulate the data for the power analyses carried out. The program uses a fairly standard Monte Carlo mixture process to generate data and covariance structures for a specified model. The same model specifications are then used to create a predetermined number of simulated data sets. These data sets are then run through the analysis model, which can be the same as or differ from the creation model. In this way the Mplus program allows users to test the misfit of the data to a specific hypothesis when the correct model for analysis is assumed to be known. The Mplus program also allows for specific tests of model constraints and group analyses (the complete scripts are provided in the Appendix).

The group design of our analysis indicates that two groups are measured on two constructs, each indicated by three measures, over two times. Parallel measures over two periods of time and across multiple groups are fit as dual LCS models, as indicated by previous studies with similar data on which this study is based (McArdle & Prindle, 2008). In our model, one group is an experimental trained group and the other a control group. The first measure is a near measure (x), which closely matches the construct in which training occurred. For this measure, the training created a higher change score in the experimental group when compared to the control group. In the untrained far measure we simulate a small gain in both groups, with no detectable differences in mean level between groups. The far (y) construct is an ability to which we hope training effects transfer from the near measure, through mean differences or a stronger association between predictors of the far change in the trained group versus the controls. We examine the effects of training on the far change score through the transfer pathway (τ), as shown in Figure 2. The effect of differential factor reliability is also tested across the x and y factors.

Manipulations of our analyses center on the pathways that lead to the latent change in the far construct. First, we assume that where you start in ability on the far construct will have some effect on how much you change over time. The same holds true for the near factor; we call this the lagged effect. The next effect is the crossed effect: Where one starts on the near measure affects how much the far measure changes, and vice versa for far on near. The last element of directional effects is that of transfer from the near change to the far change (τ). This pathway indicates how change achieved in the near factor affects change in the far factor. The crossed and lagged effects on the near change can reach the far change through the transfer pathway. Figure 2 is one example of the DCS model of transfer as described in the previous sections.

Model Parameters

We have five specific hypotheses about this bivariate LCS model.

1. The effect of the transfer weight in determining power will be more pronounced with larger discrepancies between the groups.
2. The variation of supporting weights (βn, βf, γnf, γfn) will affect the extent to which the transfer weight will be detectable.
3. The direct pathways will affect the power calculation more than the indirect pathways.
4. The effectiveness of competing methods of statistical power analysis is tested.
5. The reliability of the factors will be correlated with the power of the model to detect differential transfer between the groups.

Here we hypothesize that the pathways leading to the transfer path will affect power more than ones that support change but not transfer. Of course, we want to determine how the dynamics of the model affect the power calculation for the test of transfer under a strict sequence of model parameter specifications.
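In practice, each such test amounts to freeing one labeled parameter across groups in the Mplus scripts. The sketch below (reusing the variable and label names from the Appendix) illustrates the convention: parameters that share a label across the group-specific MODEL statements are constrained equal, and giving the experimental group a unique label (here the hypothetical label Emdn, not in the Appendix scripts) relaxes that single constraint.

MODEL G1:
  [dn * 0] (mdn);    ! change score mean; shared label constrains equality across groups
MODEL G2:
  [dn * .1] (Emdn);  ! hypothetical unique label: the mean is freely estimated in Group 2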
The models proposed have some elements that are held constant throughout our analyses. For simplicity, Occasion 1 variables are held constant with a mean of 0.0 and a variance of 1.0. The mean and variance of the Occasion 2 measures are fixed at 0.0 so that the mean and the variance are functions of the Occasion 1 and change score values. The regression values and the covariances are manipulated in the models, with the control group's values provided at the top of the tables if they differ from those of the experimental group. All other values not specified in the body of a table can be found in its note for full parameter specification. The RMSEA and χ² statistical power values are based on the difference in degrees of freedom between the invariant model and the relaxed model.

RESULTS

The examination of statistical power in dynamic structural equation models presented here starts with one-variable (V = 1) examples over occasions, building up to models testing interactions of groups and variables over occasions. The first LCS models examine the effects of group differences in a simple training experiment. The control group is held to no change in parameters, and the experimental group is incremented on the mean, variance, and regression weight to the change score. Statistical power is calculated at each step to see how these increments in between- and within-group variation affect a researcher's ability to detect group differences with a fixed sample size.

The move to DCS models utilizes the same restrictions on two sets of LCS models. The added strength of these models is the use of crossed and lagged regressions, which allow for tests of whether variables affect one another over time. With the DCS there is first a model created with means and covariances at Time 1 and among the LCSs. Then the transfer model is presented, in which we assume the training has an explicit transfer from the near ability to the far ability. Not only does the previous time affect the change in the untrained ability, but so does the change in the trained ability. Power estimates for incremental changes in the pathways between groups are given.

Latent Change Score Group Mean Difference Testing

The model of change on a single measure tests the ability to detect a difference in the latent change between groups. The model parameters and power estimates for those parameters are given in Table 1. The first thing to note is the effect of small changes in the mean: The control group is held at 0, and moving the experimental group's mean further from 0 creates large amounts of power. The variance of the first five models is held at a negligible level, so the mean of the groups is the only model difference. Moving to incrementing the variance of the change, we can see that increasing this parameter decreases the power to detect mean differences. This is largely because we are creating noise around the mean and spreading the data so that there is overlap between the groups. The distributions increase in their overlap as the variance increases, and it becomes harder to find a significant difference in the means.
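The variance effect can be seen directly in the standardized group difference (a standard two-group effect size, shown here for intuition rather than a quantity computed in this study): with change score means αE and αC and change score variances σ²E and σ²C,

$$d = \frac{\alpha_E - \alpha_C}{\sqrt{(\sigma^2_E + \sigma^2_C)/2}},$$

so power rises with the mean difference in the numerator and falls as the change score variances in the denominator grow.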
TABLE 1
Latent Change Score Fixed Control Group

            αn      σ²      Power (RMSEA)   Power (χ²)
Control     0.30    0.50
Model 1     0.01    0.10        0.725          0.781
Model 2     0.10    0.10        0.351          0.515
Model 3     0.20    0.10        0.062          0.243
Model 4     0.50    0.10        0.427          0.571
Model 5     0.70    0.10        0.972          0.966
Model 6     0.01    0.20        0.647          0.715
Model 7     0.10    0.20        0.287          0.456
Model 8     0.20    0.20        0.051          0.222
Model 9     0.50    0.20        0.406          0.528
Model 10    0.70    0.20        0.955          0.945
Model 11    0.01    0.50        0.258          0.570
Model 12    0.10    0.50        0.091          0.356
Model 13    0.20    0.50        0.027          0.195
Model 14    0.50    0.50        0.140          0.443
Model 15    0.70    0.50        0.756          0.869
Model 16    0.01    0.70        0.234          0.505
Model 17    0.10    0.70        0.087          0.320
Model 18    0.20    0.70        0.025          0.188
Model 19    0.50    0.70        0.154          0.408
Model 20    0.70    0.70        0.705          0.818
Model 21    0.01    1.00        0.261          0.437
Model 22    0.10    1.00        0.103          0.285
Model 23    0.20    1.00        0.028          0.182
Model 24    0.50    1.00        0.171          0.371
Model 25    0.70    1.00        0.686          0.751

Note. N = 100 per group (100 replications) with Time 1 variance and mean held equal across groups (αn1 = 0; σ²n1 = 1). The alpha and sigma values for the experimental group change in each iteration. The beta is equal for both groups throughout all models. RMSEA = root mean square error of approximation.

Finally, testing the effect of the Time 1 score on the latent change, we find that the stronger the relationship in the experimental group, the higher the power. The control group is held to no effect of Time 1 score on the latent change. Allowing the groups to have equal effects for the regression weight does not affect power levels; they are the same as if the regression were fixed at 0 for both groups. With sample sizes of n = 100 per group, and the mean change and latent variance about equal, we have at least 88% power across the different calculations of the power estimates (Model 9). This level only increases if the regression from Time 1 is significantly different from zero for the experimental group (Models 10–15). The sample size per group was set at n = 100, yielding high levels of power for relatively small effect sizes.

Table 1 also tests the effect of different change score variances between groups on the ability to detect mean differences. The control group is held at a mean change of 0.3 and a variance of 0.5, and the experimental group is incremented in both parameters. The larger the difference between the two means, the higher the power, as indicated by both close and exact fit measures. The RMSEA measure shows more loss in power when the variances of the latent change are closer, whereas the χ² estimate of power shows drops in power for detecting mean differences when the variances are further apart. As noted before, the RMSEA measure of close fit allows us to set a range of acceptable misfit, essentially saying the model is good enough when choosing between close alternatives. This means we would need to see larger differences in close fit between nested models to have the same level of power as with the χ² measure of exact fit.

Dual Change Score Model Group Mean Difference Testing

Here we couple two LCS models together (V = 2) to allow for crossing of measures from Time 1 to the opposing LCS.
The two change scores are coupled with crossed pathways from Time 1 to the LCSs and by the covariances at Time 1 and between the LCSs. When the group parameters are constrained to be fully invariant, we find there is more misfit with the model in terms of the χ² value. The RMSEA estimate is also affected by this misfit. The first element to be tested in this framework is the covariance at Time 1. The control group was held constant with σnf = 0.1, and the value for the experimental group was incremented. With increasing differences in covariation, we find that there is a drop in power if we constrain the groups to have equal parameters and test only for differences in the mean of one latent change. There is a drop in power as the covariance grows when using the RMSEA for power estimates. The estimates using the relaxed-constraints RMSEA and χ² both provide high estimates of power for determining group differences in the LCS. Of course, the ability of the RMSEA index to act as a substitute for model power depends on where in the range the null and alternative values fall. For χ² calculations, the difference between the nested models and the df between the models determine whether there is power to detect significant differences. Adding in crossed effects for the model does not change the patterns of power estimates. The patterns hold when equal crossed effects are imposed on the DCS model.

Moving to the covariance of the latent change scores, there is a small effect when incrementing the change covariance. For the measure of close fit, as the covariance between the changes increases, the power to detect a mean change in the measure with differential group mean changes shifts only slightly. The model fits for this type of nested model analysis become important here. For the χ² misfit estimate there is little deviation in the ability to detect exact fit between the models. When the mean is relaxed between the groups, the gain in fit is always substantial compared to the equal-means constraint. The RMSEA close-fit power results again fall below the χ² statistical power values, indicating that the gain in close fit is not worth the degree of freedom forgone. In other words, the added complexity of saying the groups have two identifiable means will take more people to detect when using the RMSEA measure of fit.

Testing Transfer in the Dual Change Score Model

The DCS model allows us to create a dynamic effect of one measure affecting the other through the change scores. The major pathways relating measures to one another are incremented in the DCS model with transfer, the lagged effects, and then the crossed effects.
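With transfer included, the far change equation acquires a term in the near change (our algebraic reading of the model diagrammed in Figure 2, written in the article's near/far notation):

$$\Delta f = \alpha_f + \beta_f\, f_1 + \gamma_{nf}\, n_1 + \tau_{nf}\, \Delta n + e_{\Delta f},$$

so that effects feeding the near change (βn, γfn) can reach the far change only indirectly, through τnf, whereas βf and γnf enter Δf directly.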
Then, the covariance at the initial time is examined.

TABLE 2
Power Approximations for a Dual Change Score Group Transfer Model

            τnf    βn     βf     γfn    γnf    σn1f1   Power (RMSEA),         Power (χ²),
                                                       Invariant Parameters   Relax Mean
Control     0.05   0.1    0.1    0.1    0.1    0.1
Model 1     0.1    0.1    0.1    0.1    0.1    0.1          0.001                 0.187
Model 2     0.3    0.1    0.1    0.1    0.1    0.1          0.224                 0.678
Model 3     0.5    0.1    0.1    0.1    0.1    0.1          0.900                 0.980
Model 4     0.3    0.2    0.1    0.1    0.1    0.1          0.248                 0.687
Model 5     0.3    0.3    0.1    0.1    0.1    0.1          0.258                 0.697
Model 6     0.3    0.4    0.1    0.1    0.1    0.1          0.269                 0.707
Model 7     0.3    0.1    0.2    0.1    0.1    0.1          0.355                 0.771
Model 8     0.3    0.1    0.3    0.1    0.1    0.1          0.486                 0.840
Model 9     0.3    0.1    0.4    0.1    0.1    0.1          0.579                 0.886
Model 10    0.3    0.1    0.1    0.2    0.1    0.1          0.238                 0.686
Model 11    0.3    0.1    0.1    0.3    0.1    0.1          0.249                 0.694
Model 12    0.3    0.1    0.1    0.4    0.1    0.1          0.256                 0.703
Model 13    0.3    0.1    0.1    0.1    0.2    0.1          0.348                 0.767
Model 14    0.3    0.1    0.1    0.1    0.3    0.1          0.459                 0.833
Model 15    0.3    0.1    0.1    0.1    0.4    0.1          0.555                 0.878
Model 16    0.3    0.1    0.1    0.1    0.1    0.2          0.236                 0.681
Model 17    0.3    0.1    0.1    0.1    0.1    0.3          0.236                 0.684
Model 18    0.3    0.1    0.1    0.1    0.1    0.4          0.239                 0.686

Note. N = 100 per group with 100 replications. Time 1 near and far means = 0.0 and variances = 1.0; Time 1 covariance fixed at 0.1. Experimental group near change mean αn = 0.5 (all other change means = 0.0); σ²Δn = 0.5, σ²Δf = 0.2. The invariant-parameters root mean square error of approximation (RMSEA) power estimate is listed first (df = 13/12), followed by the χ² measure of power.

The initial set of models in Table 2 shows a typical gain in power for an increasing between-group difference in transfer. The results show no substantial increase in power when incrementing the pathways leading to the near change, βn and γfn. However, incrementing the regression weights to the far change, βf and γnf, yields great improvements in power. The graphs in Figure 3 show the changes in power obtained by varying the individual parameters within the DCS model. As can be seen in this representation, transfer is found more reliably when the difference between the control and experimental groups is greater and when parameters affect the change in the far variable (where the transfer is directed).

FIGURE 3 (a) Root mean square error of approximation (RMSEA) power estimates with varying model parameters; (b) chi-square power estimates with varying model parameters. Each set of points indicates one parameter manipulation of the dual change score model outlined in Table 2. The first series is increasing values of transfer, the second is increasing values of βn, series 3 is increasing values of βf, series 4 is increasing values of γfn, series 5 is increasing values of γnf, and series 6 is increasing values of σn1f1.

Figure 4 shows how the power estimates from the RMSEA and χ² compare when plotted together for each manipulated parameter.

FIGURE 4 (a) Power estimates varying the transfer (τ) parameter (the gray line is the χ² estimate, the black line the root mean square error of approximation [RMSEA] estimate); (b) power estimates varying the βn parameter; (c) power estimates varying the βf parameter; (d) power estimates varying the γfn parameter; (e) power estimates varying the γnf parameter; (f) power estimates varying the σn1f1 parameter. Parameters identified in Figure 2 are varied in these graphs systematically.

Examining the effect of the Time 1 covariance on statistical power yielded little gain in power as this value was increased—less than incrementing the regressions to the near change. Again, we note that there are major differences in the power levels between using the χ² measure of fit and the RMSEA.

In Table 3 the results of the variation in the reliability, or communality, of the common factors are presented by χ² and RMSEA approximations. When the reliability of either factor is held constant, the power to detect differences in transfer decreases with decrements in the opposing factor's reliability. If the reliability of x is held at .5, then the power to detect the same differential transfer between groups decreases as the reliability of y decreases. The difference between the χ² power and the RMSEA power is very well distinguished here. The difference in transfer across groups is masked markedly by the increasing unreliability of the factors. It is interesting to note that estimated power stays relatively high when the x (near) factor is varied; the y (far) factor reliability creates more drastic changes in power approximations when its values are manipulated in the simulations.
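Following the note to Table 3, reliability here is the ratio of factor variance to total indicator variance, manipulated through the uniquenesses. In common factor notation (our formalization of the table note; λ is a factor loading, σ²F the factor variance, and θ the uniqueness):

$$\text{reliability} = \frac{\lambda^2 \sigma^2_F}{\lambda^2 \sigma^2_F + \theta},$$

so lowering reliability for a fixed factor variance means raising θ, adding measurement noise without changing the structural parameters.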
TABLE 3
Power Approximations for Variations in Factor Reliability

Chi-square power approximation

                          Y Factor Reliability
X Factor Reliability    1       0.66    0.5     0.33    0.2
1                       0.873   0.825   0.762   0.685   0.492
0.66                    0.537   0.526   0.532   0.355   0.231
0.5                     0.383   0.398   0.417   0.354   0.329
0.33                    0.281   0.311   0.358   0.230   0.314
0.2                     0.193   0.267   0.314   0.251   0.315

RMSEA power approximation

                          Y Factor Reliability
X Factor Reliability    1       0.66    0.5     0.33    0.2
1                       0.061   0.045   0.038   0.032   0.023
0.66                    0.028   0.022   0.021   0.015   0.012
0.5                     0.021   0.016   0.016   0.014   0.014
0.33                    0.017   0.014   0.014   0.011   0.013
0.2                     0.013   0.012   0.013   0.011   0.014

Note. N = 100 per group with 100 replications. Power approximations are based on the model specifications from Model 3 of the dual change score transfer model in Table 2. The factor reliabilities are adjusted through the uniquenesses so that the ratio of factor variance over total variance is as indicated. RMSEA = root mean square error of approximation.

DISCUSSION

We have identified key differences between the means and covariances over two different groups. Naturally, some parameters are more important than others; thus, the findings provide insight into key modeling issues. Obviously, there is an increase in statistical power as the between-groups parameters differ, but the within-groups models are key as well. In the LCS models, the ability to detect mean differences in the LCS increased as the experimental group parameters were incremented away from those of the control group. In the DCS model, the mean of the LCS on the near measure and the transfer pathway were both examined for the ability to detect group differences. Statistical power was affected by both the mean and the variance of the change: Larger variation in the change scores reduces power; conversely, larger differences in the mean change score lead to increases in power. Larger differences in the regression weights showed small drops in power levels. The pathways that lead directly to the latent change in the far measure show greater effects on statistical power (βf and γnf). The indirect pathways have an almost negligible effect on levels of power (βn, γfn, and σn1f1).
The parameters that more directly affect the expectations of the far latent change variation have a higher impact on power calculations. The important quantity in the LCS models was the ratio of the variance of the change (σ²Δn) to the mean of the change (αn): The higher this ratio, the lower the power when compared to a model with a lower ratio. When the move is made to testing transfer in the DCS model, the expected variances of the change scores become important in determining power. The transfer weight is one of the parameters in the expectation of the variance of the change. By increasing the transfer weight or the other direct regression weights, the proportion of residual variance relative to total expected variance decreases, increasing the proportion of predicted variance. The variation in reliability shows that the factor of interest needs to have fairly high reliability for reasonable power, even with known group differences. Using measures that do not reliably give consistent results will lead to losses in the ability to detect true group differences.

The size of the effect is important in this set of calculations. Traditionally a standardized effect is used: the mean difference divided by the standard deviation. The size of the effect we observe has to do with how many standard deviations away from the population mean the hypothesized treatment effect moves the score. For our given paradigm, this is given as

$$d = \frac{\bar{x}_{\text{obs}} - \bar{x}_{\text{pop}}}{s}.$$

The d value is the observed difference between the treated mean and the known population value, divided by the standard deviation. The standard deviation has been proposed to be pooled in case the variances of the experimental and control groups are not equal, as Cohen (1988) assumed (Hartung, Knapp, & Sinha, 2008). The sample size is related to the standard error of this difference: The larger the sample size, the lower the standard error. It might be useful to note that Cohen proposed that effects be broken down into small, medium, and large (i.e., numerically, 0.2, 0.5, and 0.8 effect sizes). The particulars of whether one should consider an effect small or large can only be based on the specific field of study and its relevant history. Rules of thumb are only applicable in this framework when specific values of the impact of treatment and the variability of effectiveness are known. The literature within a given field of study will inform the researcher as to what should be considered large and small effects. Thus, researchers should not take values as rules of thumb, but should take an in-depth look at how their specific hypotheses fit with the current literature. Doing so can also create a cohesive narrative in the progression of the field of study (Cohen, 1990).
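As a minimal worked illustration (with hypothetical numbers, not values from the simulations): if the treated group mean is 0.5, the control mean is 0.3, and the two group standard deviations are 0.9 and 1.1, then

$$s_{\text{pooled}} = \sqrt{\frac{0.9^2 + 1.1^2}{2}} \approx 1.00, \qquad d = \frac{0.5 - 0.3}{1.00} = 0.20,$$

a small effect in Cohen's (1988) terms.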
The importance of power in research, from a practical point of view, is determining the "minimum standards" a researcher must maintain to reliably find a "statistically significant" effect. Guessing and using rules of thumb for research is not scientifically sound, and this type of uninformed research leads to underwhelming results (Cohen, 1990; Maxwell, 2004). There is such an emphasis on minimizing Type I error in our research that issues with Type II error are often considered less important. By focusing on the ability to reliably find an effect, with accuracy, we can hope to become more effective researchers.

We used two different methods of determining how well the model fits the data. The χ² measures the exact fit of the data to the model, whereas the RMSEA is a measure of close fit, and thus it takes more drastic changes in fit to reject a less parsimonious model. The idea of model parsimony in SEM has been shown to be subjective, based on the theory of the problem being addressed, but it should be aided by some measures of fit, as indicated earlier in the section on fitting structural equation models (McDonald & Marsh, 1990; Raykov & Marcoulides, 1999). The purpose of model parsimony is to provide theory with the simplest model possible, because it is the one built with the most restrictions (Popper, 1959). Using the more relaxed fit of the RMSEA allows researchers more leeway in determining whether a more parsimonious model is suitable (MacCallum et al., 2006). Because the RMSEA measure gets at close fit, if a constrained model fits well enough, we can pick the constrained model. A relaxed model might be close enough in fit that the extra parameters do not add enough to warrant the added complexity.

With the results of the DCS models, we can make some recommendations of sample size goals for this class of experimental training research. Based on previous research, a model with minimal crossed and lagged weights and moderate covariation at Time 1 will closely resemble Models 16 to 18 in Table 2. As indicated in Table 2, there are 100 participants simulated per group, for a total sample size of 200. Increasing the sample size to 270 will bring the power to 80% based on the χ² measure of fit, whereas a sample size of 1,450 is necessary for the RMSEA calculation of model fit. The RMSEA measure indicates that minor differences in model constraints across groups require a larger sample size. Because the RMSEA takes into account model complexity, minor changes in fit might indicate that relaxing a group constraint is not needed; the change in RMSEA is not significantly different in this case.

The use of multiple methods of determining power shows how many interpretations of the best fitting model can be made. Whereas the exact fit method shows us that we can make more distinct divisions of good versus better fitting models, the RMSEA method provides more stringent requirements for proposing a better fitting alternative model. This translates into research designs needing more participants when a more stringent standard is set. Alternatively, one can minimize the problem and use power equivalence to optimize studies with a minimal set of parameters compared to the problem proposed (von Oertzen, 2010). That study is notable for its potential importance in future research: minimizing large growth models into simple power-equivalent models provides insight into the management of measurement reliability and project costs. The power equivalence aspect of SEM provides good stimulus for future research on power analysis in SEM.

To further analyze the merits of the DCS model in clinical trial data analysis, future analyses should include effects of missingness. As reported by Hamagami and McArdle (2001), the stability of the change score approach is not without bias. Selecting cases to study under varying circumstances of missingness will provide insight into the projections of approximate power that this study reports.
REFERENCES

Allison, D. B., Allison, R., Faith, M. S., Paultre, F., & Pi-Sunyer, F. X. (1997). Power and money: Designing statistically powerful studies while minimizing financial costs. Psychological Methods, 2, 20–33.
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103, 411–423.
Bagozzi, R. P., & Yi, Y. (1988). On the evaluation of structural equation models. Journal of the Academy of Marketing Science, 16, 74–94.
Baguley, T. (2004). Understanding statistical power in the context of applied research. Applied Ergonomics, 35, 73–80.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45, 1304–1312.
Duncan, T. E., Duncan, S. C., & Li, F. (2003). Power analysis models and methods: A latent variable framework for power estimation and analyses. In Z. Sloboda & W. Bukowski (Eds.), Handbook of drug abuse prevention (pp. 609–626). New York, NY: Kluwer Academic/Plenum.
Fan, X. (2003). Power of latent growth modeling for detecting group differences in linear growth trajectory parameters. Structural Equation Modeling, 10, 380–400.
Fan, X., Thompson, B., & Wang, L. (1999). The effects of sample size, estimation methods, and model specification on SEM fit indices. Structural Equation Modeling, 6, 56–83.
Ferrer, E., & McArdle, J. J. (2010). Longitudinal modeling of developmental changes in psychological research. Current Directions in Psychological Science, 19, 149–154.
Hamagami, F., & McArdle, J. J. (2001). Advanced studies of individual differences linear dynamic models for longitudinal data analysis. In G. A. Marcoulides & R. E. Schumacker (Eds.), New developments and techniques in structural equation modeling (pp. 203–246). Mahwah, NJ: Erlbaum.
Hartung, J., Knapp, G., & Sinha, B. K. (2008). Statistical meta-analysis with applications. New York, NY: Wiley.
Hertzog, C., Lindenberger, U., Ghisletta, P., & von Oertzen, T. (2006). On the power of multivariate latent growth curve models to detect correlated change. Psychological Methods, 11, 244–252.
Hoenig, J. M., & Heisey, D. M. (2001). The abuse of power. The American Statistician, 55, 1–6.
MacCallum, R. C., Browne, M. W., & Cai, L. (2006). Testing differences between nested covariance structure models: Power analysis and null hypotheses. Psychological Methods, 11, 19–35.
MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1, 130–149.
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness of fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391–410.
Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: Causes, consequences, and remedies. Psychological Methods, 9, 147–163.
McArdle, J. J. (2007). Five steps in the structural factor analysis of longitudinal data. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: Historical developments and future directions (pp. 99–130). Mahwah, NJ: Erlbaum.
McArdle, J. J. (2009). Latent variable modeling of differences and changes with longitudinal data. Annual Review of Psychology, 60, 577–605.
McArdle, J. J. (2010). Some ethical issues in factor analysis. In A. T. Panter & S. K. Sterba (Eds.), Handbook of ethics in quantitative methodology (pp. 313–339). New York, NY: Taylor & Francis.
McArdle, J. J., & Hamagami, F. (2003). Structural equation models for evaluating dynamic concepts within longitudinal twin analyses. Behavior Genetics, 33, 137–159.
McArdle, J. J., Prescott, C. A., Hamagami, F., & Horn, J. L. (1998). A contemporary method for developmental-genetic analyses of age changes in intellectual abilities. Developmental Neuropsychology, 14, 69–114.
McArdle, J. J., & Prindle, J. J. (2008). A latent change score analysis of a randomized clinical trial in reasoning training. Psychology and Aging, 23, 702–719.
McDonald, R. P., & Marsh, H. W. (1990). Choosing a multivariate model: Noncentrality and goodness of fit. Psychological Bulletin, 107, 247–255.
McQuitty, S. (2004). Statistical power and structural equation models in business research. Journal of Business Research, 57, 175–183.
Molenaar, D., Dolan, C. V., & Wicherts, J. M. (2009). The power to detect sex differences in IQ test scores using multi-group covariance and means structure analyses. Intelligence, 37, 396–404.
Muller, K. E., & Benignus, V. A. (1992). Increasing scientific power with statistical power. Neurotoxicology and Teratology, 14, 211–219.
Muthén, B., & Curran, P. (1997). General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods, 2, 371–402.
Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9, 599–620.
Muthén, L. K., & Muthén, B. O. (2010). Mplus 6 [Computer software]. Los Angeles, CA: Muthén & Muthén.
O'Brien, R. G., & Muller, K. E. (1993). Unified power analysis for t-tests through multivariate hypotheses. In L. K. Edwards (Ed.), Applied analysis of variance in behavioral science (pp. 297–344). New York, NY: Marcel Dekker.
Popper, K. R. (1959). The logic of scientific discovery. London, UK: Hutchinson.
Qureshi, I., & Compeau, D. R. (2009). Assessing between-group differences in information systems research: A comparison of covariance- and component-based SEM. MIS Quarterly, 33, 197–214.
Raudenbush, S. W., & Liu, X. (2000). Statistical power and optimal design for multisite randomized trials. Psychological Methods, 5, 199–213.
Raykov, T. (1997). Equivalent structural equation models and group equality constraints. Multivariate Behavioral Research, 32, 95–104.
Raykov, T., & Marcoulides, G. A. (1999). On desirability of parsimony in structural equation model selection. Structural Equation Modeling, 6, 292–300.
Saris, W. E., & Satorra, A. (1993). Power evaluations in structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 181–204). Newbury Park, CA: Sage.
Satorra, A., & Saris, W. E. (1985). Power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83–90.
Schmitz, S., Cherny, S. S., & Fulker, D. W. (1998). Increase in power through multivariate analyses. Behavior Genetics, 28, 357–363.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173–180.
Steiger, J. H., & Lind, J. C. (1980, May). Statistically-based tests for the number of common factors. Paper presented at the annual meeting of the Psychometric Society, Iowa City, IA.
Steiger, J. H., Shapiro, A., & Browne, M. W. (1985). On the multivariate asymptotic distribution of sequential chi-square statistics. Psychometrika, 50, 253–264.
Tanaka, J. S. (1987). "How big is big enough?": Sample size and goodness of fit in structural equation models with latent variables. Child Development, 58, 134–146.
Thomas, L. (1997). Retrospective power analysis. Conservation Biology, 11, 276–280.
Tucker, L. R., & Lewis, C. (1973). The reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.
von Oertzen, T. (2010). Power equivalence in structural equation modeling. British Journal of Mathematical and Statistical Psychology, 63, 257–272.
von Oertzen, T., Ghisletta, P., & Lindenberger, U. (2010). Simulating statistical power in latent growth curve modeling: A strategy for evaluating age-based changes in cognitive resources. In M. Crocker & J. Siekmann (Eds.), Resource adaptive cognitive processes (pp. 95–117). Heidelberg, Germany: Springer-Verlag.
Yuan, K.-H., & Hayashi, K. (2003). Bootstrap approach to inference and power analysis based on three statistics for covariance structure models. British Journal of Mathematical and Statistical Psychology, 56, 93–110.

APPENDIX
COMPUTER PROGRAM SCRIPTS

Latent Change Score Simulation in Mplus 6.0

TITLE: MONTE CARLO POWER ANALYSIS;
MONTECARLO:
  NAMES ARE n1 n2;
  NGROUPS = 2;
  NOBS = 50 50; !Sample sizes for each group
  NREPS = 100;
  SEED = 04072010;
  RESULTS = output1.dat;
ANALYSIS:
  ITERATIONS = 20000;
MODEL POPULATION:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 @ 0;
  n1 @ 1; n2 @ 0; dn @ .001;
  [n1 @ 0]; [n2 @ 0]; [dn @ 0];
MODEL POPULATION-G1:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 @ 0;
  n1 @ 1; n2 @ 0; dn @ .001;
  [n1 @ 0]; [n2 @ 0]; [dn @ 0];
MODEL POPULATION-G2:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 @ 0;
  n1 @ 1; n2 @ 0; dn @ .01;
  [n1 @ 0]; [n2 @ 0]; [dn @ .1];
MODEL:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 * 0 (bn);
  n1 * 1 (vn1); n2 @ 0; dn * .001 (rdn);
  [n1 * 0] (mn); [n2 * 0] (mn); [dn * 0] (mdn);
MODEL G1:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 * 0 (bn);
  n1 * 1 (vn1); n2 @ 0; dn * .001 (rdn);
  [n1 * 0] (mn); [n2 * 0] (mn); [dn * 0] (mdn);
MODEL G2:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 * 0 (Ebn);
  n1 * 1 (vn1); n2 @ 0; dn * .01 (Erdn);
  [n1 * 0] (mn); [n2 * 0] (mn); [dn * .1] (mdn);

Dual Change Score Simulation in Mplus 6.0
TITLE: MONTE CARLO POWER RCT;
MONTECARLO:
  NAMES ARE n1 n2 f1 f2;
  NGROUPS = 2;
  NOBS = 100 100; !Sample sizes for each group
  NREPS = 100;
  SEED = 04072010;
  RESULTS = output1.dat;
ANALYSIS:
  ITERATIONS = 20000;
MODEL POPULATION:
  n2 ON n1 @ 1; !NEAR measure
  dn BY n2 @ 1;
  dn ON n1 @ .1;
  f2 ON f1 @ 1; !FAR measure
  df BY f2 @ 1;
  df ON f1 @ .1;
  n1 @ 1; n2 @ 0; dn @ .2;
  [n1 @ 0]; [n2 @ 0]; [dn @ 0];
  f1 @ 1; f2 @ 0; df @ .2;
  [f1 @ 0]; [f2 @ 0]; [df @ 0];
  n1 WITH f1 @ .1;
  dn WITH df @ 0;
  dn ON f1 @ .1;
  df ON n1 @ .1;
MODEL POPULATION-G1:
  n2 ON n1 @ 1; !NEAR measure
  dn BY n2 @ 1;
  dn ON n1 @ .1;
  f2 ON f1 @ 1; !FAR measure
  df BY f2 @ 1;
  df ON f1 @ .1;
  n1 @ 1; n2 @ 0; dn @ .2;
  [n1 @ 0]; [n2 @ 0]; [dn @ 0];
  f1 @ 1; f2 @ 0; df @ .2;
  [f1 @ 0]; [f2 @ 0]; [df @ 0];
  n1 WITH f1 @ .1;
  dn WITH df @ 0;
  dn ON f1 @ .1;
  df ON n1 @ .1;
MODEL POPULATION-G2:
  n2 ON n1 @ 1; !NEAR measure
  dn BY n2 @ 1;
  dn ON n1 @ .1;
  f2 ON f1 @ 1; !FAR measure
  df BY f2 @ 1;
  df ON f1 @ .1;
  n1 @ 1; n2 @ 0; dn @ .5;
  [n1 @ 0]; [n2 @ 0]; [dn @ .5];
  f1 @ 1; f2 @ 0; df @ .2;
  [f1 @ 0]; [f2 @ 0]; [df @ 0];
  n1 WITH f1 @ .1;
  dn WITH df @ 0;
  dn ON f1 @ .1;
  df ON n1 @ .1;
MODEL:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 * .1 (bn);
  f2 ON f1 @ 1;
  df BY f2 @ 1;
  df ON f1 * .1 (bf);
  n1 * 1 (vn1); n2 @ 0; dn * .2 (rdn);
  [n1 * 0] (mn); [n2 * 0] (mn); [dn * 0] (mdn);
  f1 * 1 (vf1); f2 @ 0; df * .2 (rdf);
  [f1 * 0] (mf); [f2 * 0] (mf); [df * 0] (mdf);
  n1 WITH f1 * .1 (c1);
  dn WITH df @ 0 (cd);
  dn ON f1 * .1 (gf);
  df ON n1 * .1 (gn);
MODEL G1:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 * .1 (bn);
  f2 ON f1 @ 1;
  df BY f2 @ 1;
  df ON f1 * .1 (bf);
  n1 * 1 (vn1); n2 @ 0; dn * .2 (rdn);
  [n1 * 0] (mn); [n2 * 0] (mn); [dn * 0] (mdn);
  f1 * 1 (vf1); f2 @ 0; df * .2 (rdf);
  [f1 * 0] (mf); [f2 * 0] (mf); [df * 0] (mdf);
  n1 WITH f1 * .1 (c1);
  dn WITH df @ 0 (cd);
  dn ON f1 * .1 (gf);
  df ON n1 * .1 (gn);
MODEL G2:
  n2 ON n1 @ 1;
  dn BY n2 @ 1;
  dn ON n1 * .1 (bn);
  f2 ON f1 @ 1;
  df BY f2 @ 1;
  df ON f1 * .1 (bf);
  n1 * 1 (vn1); n2 @ 0; dn * .5 (Erdn);
  [n1 * 0] (mn); [n2 * 0] (mn); [dn * .5] (mdn);
  f1 * 1 (vf1); f2 @ 0; df * .2 (Erdf);
  [f1 * 0] (mf); [f2 * 0] (mf); [df * 0] (mdf);
  n1 WITH f1 * .1 (Ec1);
  dn WITH df @ 0 (cd);
  dn ON f1 * .1 (gf);
  df ON n1 * .1 (gn);