Document 264377

Finite-Sample
Properties
GMM
of
Some
Alternative
Estimators
Lars Peter HANSEN
Departmentof Economics,Universityof Chicago,Chicago, IL 60637
John HEATON
KelloggGraduateSchool of Management,NorthwesternUniversity,Evanston,IL 60208
Amir YARON
G.S.I.A.,Carnegie MellonUniversity,Pittsburgh,PA 15213
We investigatethe small-samplepropertiesof three alternativegeneralizedmethodof moments
(GMM)estimatorsof asset-pricingmodels.The estimatorsthatwe considerincludeones in which
the weightingmatrixis iteratedto convergenceandones in whichthe weightingmatrixis changed
with eachchoiceof the parameters.
Particularattentionis devotedto assessingthe performanceof
the asymptotictheoryfor makinginferencesbaseddirectlyon the deterioration
of GMMcriterion
functions.
KEY WORDS: Asset pricing;Generalizedmethodof moments;MonteCarlo.
The purposeof this article is to investigatethe small- Kocherlakota(1990a), all of our experimentscome from
sample properties of generalized method of moments single-consumereconomies with power utility functions.
(GMM)estimatorsappliedto asset-pricingmodels.Oural- Withinthe confinesof these economies,there is still conternativeasymptoticallyefficientestimatorsincludeones in siderableflexibility in the experimentaldesign. Some of
which the weighting matrix is estimatedusing an initial our experimentaleconomies are calibratedto annualtime
(consistent)estimatorof theparametervector,ones in which series data presuminga centuryof data.Othereconomies
the weightingmatrixis iteratedto convergence,andones in are calibratedto monthlypostwardata.In the experiments
whichthe weightingmatrixis changedfor everyhypotheti- calibratedto annualdataandseveralof the experimentscalcal parametervalue.The last of these threeapproacheshas ibratedto
monthlydata,the momentconditionsare nonlinnot been used very muchin the empiricalasset-pricinglit- ear in at least one of the
parametersof interest.Moreover,
erature,but it has the attractionof being insensitiveto how some of the specificationsintroducetime nonseparabilities
the momentconditionsare scaled.In addition,we studythe in the consumer
preferencesthat are motivatedby either
advantagesto basing statisticalinferencesdirectly on the local durabilityor habit persistence.In the other expericriterionfunctionratherthanon quadraticapproximations ments
calibratedto monthlydata, the momentconditions
to it.
are,
by
design,linearin the parameterof interest.Some of
We addressthe followingissues:
these setupsare specialcases of the classicalsimultaneous1. How does the procedurefor constructingthe weight- equationsmodel. For other setups, the observeddata are
ing matrixaffect the small-samplebehaviorof the GMM modeledas being time averaged,introducinga movingavcriterionfunction?
erage structurein the disturbanceterms.
2. How do the confidenceregions of parameterestimaThe articleis organizedas follows. Section 1 describes
tors constructedusingthe GMMcriterionfunctionperform the alternativeestimatorswe study and the relatedeconorelative to confidenceregions based on the usually con- metric literature.Section 2 specifies the Monte Carlo enstructedstandarderrors?
vironmentswe use. Section 3 gives an overview of the
3. Is the small-sampleoverrejectionoften foundin stud- calculationsincludinga descriptionof how inferencesare
ies of GMMestimatorsreducedwhenusing an estimatorin madebaseddirectlyon the shapeof the criterionfunctions.
which the weightingmatrixis continuouslyaltered?
Section 4 then presentsthe resultsof the MonteCarloex4. How are the small-samplebiases of the GMM esti- perimentsusing a lognormalmodel calibratedto monthly
matorsaffectedby the choice of procedurefor constructing data.Section5 presentsthe resultscalibratedto annualand
the weightingmatrix?
monthlydatausing a Markov-chainapproximation.
Finally,
in
our
Section
6.
remarks
are
concluding
Because there has been an extensive body of empirical workinvestigatingthe consumption-based
intertemporal
model
GMM
estimation
(CAPM)
using
capitalasset-pricing
@ 1996 American Statistical Association
methods,we use suchmodelsas laboratoriesfor ourMonte
Journal of Business & Economic Statistics
Carloexperiments.As in the work of Tauchen(1986) and
July 1996, Vol. 14, No. 3
262
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
1. ALTERNATIVE
ESTIMATORS
AND
RELATED
LITERATURE
sion.] An advantageof this estimatorrelativeto the previous two is thatit is invariantto how the momentconditions
are scaledeven whenparameter-dependent
scale factorsare
introduced.A simpleexampleof a continuous-updating
estimatoris a minimumchi-squaredestimatorused for restrictedmultinomialmodels in which the efficientdistance
matrixis constructedfrom the probabilitiesimpliedby the
underlyingparametersand hence is parameterdependent.
The threeGMMestimatorshave antecedentsin the classical simultaneous-equations
literature.Considerestimating
One of the goals of our study is to comparethe finitesample propertiesof three alternativeGMM estimators,
each of which uses a given collection of moment conditions in an asymptoticallyefficientmanner.Write the moment conditionsas
E[(p(Xt,
j)] = 0,
263
(1)
where / is the k-dimensionalparametervector of interest.
In (1) the function (o has n _>k coordinates.We assume
that {1/'7
~t1 (p(Xt, 3)} convergesin distributionto
a normallydistributedrandomvector with mean 0 and covariancematrixV(3).
Let VT(0) denote (an infeasible)consistentestimatorof
this covariancematrix. This latter estimatoris typically
made operationalby substitutinga consistentestimatorfor
0/, denoted{b1}. An efficientGMMestimatorof the parameter vectorp is thenconstructedby choosingthe parameter
vector b that minimizes
a single equation, say yt =
3'xt
+ ut, where
/ is the pa-
rameterof interest.Let zt denote the vector of predetermined variablesat time t that by definitionare orthogonal to ut. One way to estimate/3 is to use two-stageleast
squares,which is our two-step estimatorunder the additionalrestrictionsthatthe disturbancetermis conditionally
homoscedasticand serially uncorrelated.In this case the
iterativeestimatorconvergesafter two steps and hence is
the two-stepestimator.It is well knownthat the two-stage
least squaresestimatoris not invariantto normalization.In
fact Hillier(1990) criticizedthe two-stageleast squaresestimatorby arguingthat the object that is identifiedis the
I
T1 T
-[1
T
direction[1,-0'] butnot its magnitude.Hillierthenshowed
(Xt,b)
"[
P-(Xt,
that the conventionaltwo-stageleast squaresestimatorof
[VT(bl)]L
L t=l
t=l1
directionis distortedby its dependenceon normalization.
The first two GMM estimatorsthat we considerdiffer in
As an alternative,Sargan(1958)suggestedaninstrumentalthe way in which this is accomplished.
variables-typeestimatorthatminimizes
Two-StepEstimator The first estimator,called the twostep estimator,uses anidentitymatrixto weightthe moment
Tt=
t1
S t=1(Yt- b'xt)z
=l ztz1
x
conditionsso that bI is chosen to minimize
(5)
T b'xt)zt
(Yt-
b)1 (2)
Z1
:
1
1
t=(5)
1E
# t=1(Yt- b'xt)2
T
t=l
t=l1
Let bl denotethe estimatorobtainedby minimizing(2).
Iterative Estimator. The second estimator continues
from the two-step estimatorby reestimatingthe matrix
V(/) using V(bT-1) and constructinga new estimatorbiT.
This is repeateduntil b- convergesor until j attainssome
large value. Let bO denotethis estimator.
ContinuousUpdatingEstimator Insteadof taking the
weightingmatrixas given in each step of the GMM estimation,we also consideran estimatorin which the covariance matrixis continuouslyalteredas b is changedin the
minimization.Formallylet b, be the minimizerof
(
t=ll
,
"1pXi,?[
1
(
t=l
(Xt,
b)
.
(4)
Allowing the weighting matrixto vary with b clearly alters the shape of the criterionfunctionthat is minimized.
Although the first-orderconditionsfor this minimization
problem have an extra term relative to problems with
fixed weightingmatrices,this termdoes not distortthe limiting distributionfor the estimator.[See Pakes and Pollard
(1989, pp. 1044-1046) for a more formal discussion and
provision of sufficientconditionsthat justify this conclu-
by choice of b. Underthe additionalrestrictionsimposedon
the disturbanceterm discussedin the previousparagraph,
this is our continuous-updating
estimator.Notice thatif we
ignorethe denominatortermin (5) andminimize,the solution is the two-stageleast squaresestimator.By including
the denominatorterm, Sarganshowed that, for an appropriate choice of zt, the solution is the (limited information) quasi-maximumlikelihood estimator(using a Gaussian likelihood),which as an estimatorof directionis invariantto normalization.
The estimationenvironmentsthat we study are more
complicatedthan the one just described.Sometimes the
moment conditionsare not linear in the parameters,and
the disturbancetermsare often conditionallyheteroscedastic and/or seriallycorrelated.As a consequence,the twostep and iterativeestimatorsno longer coincide and the
estimatorcan no longerbe interpreted
continuous-updating
as a quasi-likelihoodestimator.The estimationmethodsremainlimitedinformation,however,in thatthe momentconditionsused aretypicallynot sufficientto fully characterize
the time series evolutionof the endogenousvariables.
The second goal of our analysisis to comparethe reliabilityof confidenceregionscomputedusing quadraticapproximationsto criterionfunctionsto ones based directly
on the deteriorationof the originalcriterionfunctions.The
formerapproachis more commonlyused in the empirical
264
Journalof Business & EconomicStatistics,July 1996
asset-pricingliteraturepartiallybecauseit is easier to im- the finite-samplepropertiesof the estimatorsdescribedin
mechanismsareconstructed
plement.The latterapproachexploits the chi-squaredfea- Section 1. The data-generating
ture of the appropriatelyscaled criterionfunctions.From to be consistentwith a representativeagent consumptionthe vantagepoint of hypothesistesting, the plausibilityof basedcapitalasset-pricingmodel (CCAPM).Estimatorsof
an observeddeteriorationof the criterionfunctioncaused the parametersof the representativeagent'sutilityfunction
by imposingparameterrestrictionscanbe assessedby using are consideredalong with tests of the overidentifyingconthe appropriatechi-squareddistribution.Using this same ditions impliedby the model. We use the CCAPMas the
insight, confidenceregions can be computedby using the basis of our experimentsbecauseGMMhas been used exappropriatechi-squareddistributionto prespecifysome in- tensivelyin studyingthis model(e.g., see DunnandSinglecrementin the criterionfunction and inferringthe set of ton 1986;Eichenbaumand Hansen 1990; Epsteinand Zin
parametervalues that imply no more than that increment. 1991;FersonandConstantinides1991;HansenandSingleSuch confidenceregions can have unusualshapes and, in ton 1982). Furthermore,the CCAPMforms the basis for
fact, may not even be connected.One of the key questions two studies of the finite-samplepropertiesof GMM conof this investigationis whetheror not they lead to more ductedby Tauchen(1986) and Kocherlakota(1990a).
reliablestatisticalinferences.
Preferencesand EulerEquations. In the model the repOne of our reasons for studying the performanceof resentative
consumeris assumedto have preferencesover
criterion-function-based
inferencecomes from the workof
consumptiongiven by
Magdalinos(1994). Within the confines of the classical
simultaneous-equations
paradigm,Magdalinosstudiedthe
> 0, (6)
of
alternative
tests of instrumentadmissibil1
performance
U- =1E E6t
As
a
result
of
his
recommended
ity.
analysis,Magdalinos
alteringthe weighting matrix to embody the restrictions where ct is consumptionat date t. The parameter0 capas is done in the continuous-updating
method.In addition, tures some time nonseparabilityin preferences. Examhe found that test statistics are better behavedusing the ples of models with time nonseparabilityin preferences
limited-information
maximumlikelihoodestimatorthanthe can be found in the work of Abel (1990), Constantinides
least
two-stage
squaresestimator.Recall that the former (1990), Detempleand Zapatero(1991), Dunn and Singleestimatorcoincides with our continuous-updating
estima- ton (1986), Eichenbaumand Hansen (1990), Gallantand
tor and the latter to our two-step and iteratedestimators Tauchen(1989), Heaton(1993, 1995),Novales (1990), Ryin the classicalsimultaneous-equations
estimationenviron- der andHeal (1973), and Sundaresan(1989).If 0 > 0, conment consideredby Magdalinos.
sumptionis durableor substitutableover time. If 0 = 0,
the
Anotherreason is Nelson and Startz's(1990) criticism
preferencesof the consumeraretime-additive.If 0 < 0,
of the use of instrumental-variables
methods for study- consumptionis complementaryover time and the prefering consumption-basedasset-pricingmodels. These au- ences of the representativeconsumerexhibit habitpersisthors were concernedaboutthe behaviorof instrumental- tence.
We considerestimatorsof the parameters6,-y,and 0, as
variablesestimatorswhen the instrumentsare poorly corwell
as tests of the modelbasedon implicationsof the Eurelated with the endogenous variables.Their arguments
ler
equations.Let must = (ct + Oct-l)--, which can be
were based on analogiesto results derivedformallyfor t
statisticsandoveridentifyingrestrictionstests in the classi- interpretedas the indirectmarginalutilityfor consumption
cal simultaneous-equations
setting. The questionof inter- "services"as measuredby st = ct + Oct-,. Similarly,let
est to us is the extent to which criterion-function-basedmuct (ct + Oct-1)-' +-06E[(ct+l + Oct)F-YTt], where Ft
inferenceand continuousupdatingcan help overcomethe gives the informationset at time t. The Eulerequationfor a
concerns of Nelson and Startz. Furthermore,Stock and representativeagent'sportfolioallocationdecisionis given
Wright(1995) provideda theoreticalrationalefor consid- by
criterionfunctioninsteadof
ering the continuous-updating
muct = E(6muct+lRt+1_ Ft),
(7)
other GMM implementationsin situationsin which the
where Rt+i is a gross returnon an asset from t to t + 1.
model is poorlyidentified.
The finite-samplepropertiesof the two-stepanditerative Because aggregateconsumptionis growingover time, we
The (normalized)
GMMestimatorsin an asset-pricingsettinghavebeen stud- divided(7) by must to inducestationarity.
ied previouslyby Tauchen(1986), Kocherlakota(1990a), Eulerequationthat we consideris then given by
and Ferson and Foerster (1994). These investigatorsdid
muct+i
muct = F(
(8)
not studythe propertiesof the continuous-updating
estimaL1I'
must
must
tor, nor did they study the behaviorof criterion-functionRemovingconditionalexpectationsfrom(8) resultsin the
basedconfidenceregions.Furthermore,
Tauchen(1986)and
Kocherlakota(1990a) consideredonly the case of time- Euler-equationerror
separablepreferencesfor the representativeconsumer.
t+2(,O) = muc
muc+l Rt+l,
(9)
must
2.
MONTE CARLO ENVIRONMENT
must
wheremuc _=(ct + Oct_1)-r + 69(ct+1 + Oct)-Y.Note that
We considerseveralMonteCarloenvironmentsto assess. E(qt+2[It) = 0. Furthermore,notice that, when 9 is not
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
Table1. MaximumLikelihoodEstimatorof MonthlyLawof Motion
No timeavg.
Order Unrestricted
of C(L) log-likelihood Log-likelihood -y
1
2
3
3,214.9
3,218.1
3,220.6
3,237.3
3,238.8
4.55
4.21
Timeaveraged
Log-likelihood y
3,230.3
4.99
0, 4t+2 has a first-ordermovingaverage[MA(1)]structure.
265
and covariancematrix I and where p is the mean of Yt.
Furthermore,B(L) is a matrix of polynomialsin the lag
operator.To use this assumptionaboutthe dynamicsof consumptionandreturnsalongwith the Eulerequation(7), we
assumethatthe preferencesof the representativeagent are
time additive.In this case the Euler-equation
errorfor each
returncan be writtenas
-- log(ct+1/ct)
+ log Rt+1 -
= ?7t+l,
(12)
By choosing instruments, zt, in Yt, unconditional moment
whereE(rqt+l ht) = 0 and n is a constant(e.g., see Hansen
and Singleton1983).
The relation(12) implies a set of restrictionson the law
E[pt+2 (6, , 0)] = E[ztqt+2(61Y,7)] = 0.
(10) of motion (11). To
impose these restrictions,we consider
several
finite-order
of B(L) and use the
parameterizations
Finally,notice that bt+2 can be expressedin termsof con- methods described Hansen and
(1991). These
by
Sargent
sumptionratiosandreturns,whichwe taketo be stationary
are of the form
parameterizations
processes.
One unpleasantfeature of these moment conditionsis
C(L)
(L)(13)
B(L) that they can always be made to be satisfiedin a degener'
a(L)
ate fashion.Supposethat y = 0 and 60 = -1; then clearly
where C(L) is a 3 x 3 matrix of polynomialsin the lag
muc0 = 0 and the momentconditionsare trivially satisfied. Without imposing additionalconstraintson the pa- operatoranda(L) is a 1 x 1 polynomialin the lag operator.
rametervectors,this degeneracyin the momentconditions We restrict the polynomiala(L) to be second order and
createsproblemsfor the two-stepand iterativeestimators. consideredseveraldifferentordersof C(L).
To estimate the constrainedlaw of motion, we used
Of course,whentime separabilityis imposed(0 = 0), these
datafrom 1959.2to 1992.12.Aggregateconsumpmonthly
problematicparametervaluescannotbe reached.When0 is tion is
seasonally
adjustedreal aggregateconsumptionof
permittedto be differentfrom 0, Eichenbaumand Hansen nondurables services
for the UnitedStatestakenfrom
plus
(1990) were led to dividethe momentconditionsby 1 + 60
CITIBASE.
These
data
were
convertedto a per capitameaso that the two-step and iterativeestimatorsnot be driven
sure
total
U.S.
by
dividing
by
populationfor each month,
to the degeneratevalues.We will do likewise.An attractive
obtained
from
CITIBASE.
The
equity returnis the valueestimatoris its insensipropertyof the continuous-updating
return
from
the
Center
for Researchin Security
weighted
scale factorsandhenceto the
tivity to parameter-dependent
Prices
and
the
bond
return
is the Fama-Blissrisk(CRSP),
momenttransformationused by Eichenbaumand Hansen
free
return
from
CRSP.
Each
of
these
returnseries was
(1990). Moreover,the criterionof the continuous-updating
converted
into
a
real
return
the
using
implicit
pricedeflator
estimatordoes not necessarilytendto 0 in the vicinity y = 0
for
nondurables
and
services
from
CITIBASE.
and 60 = -1. Althoughthe estimatedmean for {(Pt+2} beFor simplicitywe removed the sample mean from the
comes smallin the neighborhoodof theseparametervalues,
vector
Yt so thatthe constantsin (11) and(12) did not have
so does the estimatedasymptoticcovariancematrix,and
to
be
estimated.
As a result, the only preferenceparamethe criterionfunctionfor the continuous-updated
estimator
ter
to
be
estimated
is 7. The results of estimatingthe law
plays off this tension.
of
motion
exact maximumlikelihoodfor differ(11)
using
We build severalMonteCarloenvironmentsto simulate
ent
orders
of
the
polynomial
C(L) are given in Table 1.
returnsand consumptiongrowth that are consistentwith
The
column
labeled
"Unrestricted
log-likelihood"reports
(8). These are used to assess the finite-sampleproperties
the
in
the
case
of
unrestricted
estimationof
log-likelihood
of the estimatorsof Section 1 basedon the momentcondithe
in
the
The
columns
labeled
polynomials
lag
operator.
tions (10).
"no time avg."reportthe log-likelihoodand the estimated
conditionsare given by
2.1 Lognormal Model
Time-AdditiveModel, No Time Averaging. In our first
Monte Carlo environment we model consumption growth
and returns directly by assuming that they are jointly lognormally distributed as in the work of Hansen and Singleton
(1983). Let Y(t) = [log(ct/ct-1)log Rflog Rf]', where ct is
aggregate consumption at time t, R• is the gross return on a
stock index at time t, and R[ is the gross return on a bond
at time t. We assumethat
Yt = l + B(L)Et,
(11)
where Et is a normally distributed three-dimensional random vector that is independent over time and has zero mean
value of 7 under the restrictions implied by (12). In searching for the maximized log-likelihood, larger values of the
log-likelihood were found for values of 7 larger than 50.
The results reported in Table 1 for the constrained models
correspond to local maxima. We used the local maximizers
for our simulations because they result in plausible values
for 7y.Notice that there is substantial improvement in the
log-likelihood in moving from a first- to a second-order
polynomial for C(L) in the unrestricted case. This indicates that more than a first-order polynomial is needed for
C(L). There is little improvement in going to a third-order
polynomial in the unrestricted case.
For both the second- and third-order polynomial cases,
there is great deterioration in the log-likelihood when the
266
Journalof Business & EconomicStatistics,July 1996
model restrictionsare imposed.This is consistentwith the
resultsreportedby HansenandSingleton(1983).Moreover,
thereis little improvementin the log-likelihoodin moving
from a second-orderto a third-orderC(L). For this reason
we usedthe pointestimatesfromtherestrictedmodelwitha
second-orderC(L) to conductourMonteCarloexperiments
for the lognormalmodel with no time averaging.
In assessing the finite-samplepropertiesof GMM estimatorsin this case, we constructed500 MonteCarlosamples, each with a sample size of 400. A sample size of
400 approximatesthe size of availablemonthlyconsumption data.We constructedmomentconditionsbasedon the
Euler-equationerrorsfor both the bond and the stock reerrorwe used
ForeachEuler-equation
turnssimultaneously.
one periodlagged(log)bondand(log) stockreturnsandone
periodlagged(log) consumptiongrowthas instruments.We
did not include constantsas instrumentsbecause the data
are simulatedunderthe assumptionthat they have a zero
mean.
Notice thatin this case the Euler-equationerror
t+
n
7=1
h=l
(t-1+(T/n)+(h/n)
(19)
is predictableat time t. However,E(r t• lt_-1) = 0 so
that instrumentscan be chosen from the informationset
at time t - 1. An instrumental-variables
estimatorof this
modelmustaccountfor the MA(1)structureof the moment
condition.
We estimatedthe log-linearlaw of motion (13) for consumptionandasset returnsunderthe restrictionimpliedby
(18). Becausethe consumptiondataare an arithmeticaverage of consumptionexpendituresover a period,in applying
(18) to actual consumptiondata we assume that geometric averagesand arithmeticaveragesare approximatelythe
same. This assumptionwas made by Grossman,Melino,
ModelWithTimeAveraging. As a further and Shiller(1987), Hall (1988), andHansenand
Time-Additive
Singleton
data-generatingmechanism,we also consideran example (1996).Note also thatthe returnsin (18) aretime averaged.
in which the decision intervalof the representativeagent For
simplicity,in estimatingthe law of motion(13), we use
is much smallerthanthe intervalof the data.Supposethat the
the model
monthlyCRSP series directly.Furthermore,
therearen decisionperiodswithineachobservationperiod. with time
averagingimposes a weaker set of restrictions
For example,if the representativeagent'sdecisioninterval thandoes the modelthattakesno accountof time
averaging
is a week and dataare observedmonthly,then n wouldbe becausewe do not
imposethe restrictionon the first-order
agent'sutilityfunction autocorrelation
approximately4. The representative
of the Euler-equationerrorof the stock reat time t is given by
turn.In the limit case of continuousdecision making,the
of the errorshouldbe .25, as disfirst-orderautocorrelation
.
U
cussed
et
al.
Grossman
(14)
(1987) and Hall (1988).
by
-1
=O(6n)h (Ct+h/n)lWe consider the case of a third-orderpolynomialfor
If we maintainthe assumptionthatconsumptionandreturns C(L) and a second-orderpolynomialfor a(L). The results
error of this estimationarereportedin Table1 in the columnslaarejointlylognormallydistributed,the Euler-equation
this
error
can
be beled "Timeaveraged."Notice thatthe log-likelihoodfuncis
again given by (12). Moreover,
rt+l
tion improvessomewhatcomparedto the case in whichtime
as
decomposed
averagingis ignored.The model,however,is still substann
at odds with the data. The estimatedvalue of 7 is
tially
(15)
7lt+l = E (t+h/n,
largeras well.
slightly
h=1
We used this model to create 500 Monte Carlo draws,
where
each with a samplesize of 400. As in the case of no time
averaging,we studiedestimatorsbased on the Eulerequa(16)
(t+h/n
E(77t+lITFt+h/n) E(7rt+l• Ft+(h-1)/n).
tions for both returns.In this case the instrumentswere
In this environment,we presumethatobservedconsump(log) stock returns,(log) bondreturns,and (log) consumption does not correspondto the actual point-in-timecon- tion
growth,all laggedtwo periods.
sumptionof the representativeagent but insteadis an averageof actualconsumptionover one unit of time. Specifi- 2.2 Discrete-StateModels
cally,supposethatobservedconsumption,c?, is a geometric
In our second set of MonteCarloenvironmentswe folaverageof actualconsumption,
low Tauchen(1986) andKocherlakota(1990a)andconsider
a Markov-chainmodel for aggregateconsumptionanddiv1
(17) idend growth.Aggregateconsumptionis assumedto repc=
Ct-1+h/n
resentthe endowmentof the representativeconsumer,and
and similarlyfor the observedreturn,R?. Averaging(12) dividendsrepresentthe cash flow from holding stock. We
form one-periodstock returnsandthe returnsto holdinga
over time impliesthat
one-period(real)discountbond.Each of these returnscan
-1Ylog(cal/c ) + log Rl1
be representedas functionsof the stateof the Markovchain.
Constructionof these returnsin the case of time-additive
was describedby Kocherlakota(1990a)andTauchen
=
(18) utility
I4 + - t
+(/t-n)+(h/n)+(h/n).
(1986).
7=1 h=l
j
(
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
and Yaron(1994). GMMestimatesand test statisticswere
computedfor 500 replicationsof a sample size of 100.
This sample size correspondsapproximatelyto the length
of most annualdatasets.
Table2. PreferenceSettings forDiscreteState-Space Model
6to(ct+ect-)1'1
t=-o
1 --
-
Preferencecase
6
TS1
TS2
TNS1
TNS2
TNS3
.97
1.139
.97
.97
.97
7y
1.3
13.7
1.3
1.3
1.3
U=E
8
0
0
1/3
-1/3
-2/3
AnnualModel. To calibratethe first Markovchain,we
used the methoddescribedby Tauchenand Hussey (1991)
to approximatea first-ordervectorautoregression(VAR)for
consumptionand dividendgrowth.The parametersof the
VARaretakenfromKocherlakota(1990a)andare givenby
In 1t
.0 4
InAt
.021 +
]
.1 7
.414
.017 .161
In'ti
+
In/At-1
(20)
where?t is the gross growthrate of real annualdividends
on the Standard& Poor 500, where At is the gross growth
rate of U.S. per capitareal annualconsumption,andwhere
E(t
.01400 .00177
.00177 .00120
267
(21)
MonthlyModel. We repeatedsome of the experiments
using a law of motion calibratedto postwarmonthlydata.
As in the constructionof the Markov chain for the annual model, we startedwith a first-orderVAR for consumptionanddividendgrowth.The consumptiondataused
in estimatingthe VAR were aggregateU.S. expenditures
on nondurablesand services describedin Subsection2.1.
We constructeddividendsimplied by the monthlyCRSP
value-weightedportfolioreturn.Thesedividendswere convertedto real dividendsusingthe implicitpricedeflatorfor
monthlynondurablesand services taken from CITIBASE.
This dividendseries is highly seasonalbecauseof the regular dividendpayoutpolicies of most companies.To avoid
modeling this seasonality,we let ?t = log(dt/dt-12)/12,
and we assumedthat?t representsthe one-perioddividend
growthof the model.The series {log(dr/dt-12)} appearsto
be stationary.
The parametersof the VAR estimatedusing these data
are given by
In
t
AtJ
LIn
_[
.0012
[ .0019J
Furthermore,Et is assumedto be normallydistributedand
-.1768
.1941
uncorrelatedover time. The Markovchain for [?t;At]' is
+
(22)
IlnAt-1
t-1 ]
In
-.2150
.0267
chosento have 16 states.
In simulatingdata from this model we chose several
values of the preferenceparametersof the representative where
consumer.These are presentedin Table2. Preferenceset.0001
=
x 10-3.
(23)
ting TS1 was used by Tauchen(1986) and preferencesetE(ete') [.1438
.0001
.0001
.0.0145
145
ting TS2 was used by Kocherlakota(1990a).As shownby
Kocherlakota(1990a), these latter parameters,along with
the Markov-chainmodel of endowments,imply first and As in the case of the annual Markov-chainmodel, we
second momentsfor asset returnsthat mimic their sample approximatedthe VAR of (22) and (23) with a 16-state
The largevalueof 6 in TS2 is not inconsistent Markovchain using the methodsof Tauchenand Hussey
counterparts.
with the existence of an equilibriumin the model because (1991). The MonteCarlodataconsistedof 500 replications
of a samplesize of 400. We focusedexclusivelyon the more
of the large value of y (Kocherlakota1990b).
preferenceconfiguration(TS1), adjusting6 for
We restrictour attentionto "moderate"
valuesof 6 and'y "moderate"
the
shorter
interval.(Moreprecisely,we used the
sampling
in our examinationof time-nonseparable
preferences,and twelfth root of .97 in
place of .97 for 6.) In addition,we
we considera rangeof valuesof 0. ParametersettingTNSI
Monte
Carlo
data using nonseparablespecificagenerated
introducesa modest degreeof durabilityby letting 0 = 1
tion
TNS
and
with 6 adjustedappropriately.
TNS2,
again
1
and TNS2 introduces habit persistence with 0 = -?. TNS3
results in a more extremeamountof habit persistenceby
3. OVERVIEW
OF MONTECARLORESULTS
setting 0 = - . The asymmetric (in magnitude) across the
3.1 DescriptiveStatistics
specificationsof 8 is guidedin partby the a priorinotion
that there should only be a limited amountof durability
In describingthe resultsof the variousMonteCarloexin the goods classifiedas "nondurable"
in NationalIncome perimentsin Sections 3 and 4, we focus most of our disandProductAccountsandby empiricalevidencefor a sub- cussionon the following calculations:
stantialdegreeof habitpersistencereportedby Fersonand
Constantinides(1991).
Table3. MomentConditions
In implementingthe estimators,the Euler equationsfor
Numberof moment
the stock and bond returnsare multipliedby instrumental
conditions
Momentset
Instruments,zt
variablesto constructmomentconditions.The two instruM1
RR, At, 1
8
ment sets are listed in Table 3. Monte Carlo results for
M2
4
At,1
other momentconditionswere given by Hansen,Heaton,
268
Journalof Business & EconomicStatistics,July 1996
1. We foundthe minimumvalueof the criterionfunction.
Call this JT. The limitingdistributionof TJTis chi-squared
with degrees of freedomequal to the numberof moment
conditionsminusthe numberof parametersestimated.We
used this limiting distributionto test the overidentifying
momentconditions.
2. We evaluatedthe criterionfunctionat the trueparametervector.Callthis J'. The limitingdistributionof TJT' is
chi-squaredwith degreesof freedomequalto the numberof
momentconditions.Withthis limitingdistribution,we characterizethe familyof parametervectorsthatlook plausible
from the standpointof the momentconditions.Stock and
Wright(1995) advocatedthe use of this type of inference
when it is suspectedthat the parametersare poorlyidentified. Here we examinewhetherthe limitingdistributionof
for
JT%providesa reasonablesmall-sampleapproximation
inference.
this
conducting
3. We found the minimumvalue of the criterionfunction when7 is constrainedto be its truevalue.Call this Ji.
Because y is the only parameterestimatedin the lognormal model,in this case J/ coincideswith J'. The limiting
distribution of T(J
- JT) is chi-squared with 1 df. This
limitingdistributionallowsus to constructa confidenceregion for y based on the incrementsof the criterionfunction from its unconstrainedminimum.By evaluatingT(J?/
-
JT)
we determined whether the true value of y is in the
correspondingchi-squaredcriticalvalueandthe fractionof
the actualcomputedstatisticsthatare abovethatvalue (depicted on the y axis). Thus the 45-degreeline (depictedas
referencefor assessingthe qualityof
...) is the appropriate
the limitingdistribution.FollowingDavidsonand McKinnon (1994),these plots are referredto as p-valueplots, and
we presentthe results for the interval[0, .5] becausethis
boundsthe regionof probabilityvaluesused in most applications.Althoughwe use probabilityvaluesas our basis of
comparison,confidenceintervalsat alternativesignificance
levels can be assessedby simplysubtractingthe probability
valuesfrom 1.
The figures are organizedas follows. For each Monte
Carlo setup we first consider a single figure with four
graphstitled as follows: "(a)Minimized","(b)True","(c)
and "(d)Wald,"correspondingto
Constrained-Minimized"
the statistics TJT,TJ•, T(JT? - JT), and T(yT - 7)2
(UT)2, respectively.To providea formalstatisticalmeasure
of the distancebetweenthe empiricaldistributionsandtheir
theoreticalcounterparts,on each figurea bandaroundthe
45-degreeline is plottedusing dottedlines. This bandis a
90%confidenceregionbasedon the Kolmogorov-Smirnov
Test. This states that the probabilitythat the maximaldifferencebetweenthe empiricaldistributionandthe theoretical one will lie within those lines is 90%. Maximaldifferenceswithinthese bandsare not statisticallysignificant
at the 10%significancelevel. Althoughwe presentresults
confidence
for the interval[0, .5], the Kolmogorov-Smirnov
between
the
on
the
is
based
calculating supremum
region
line
over
the
and
the
distribution
region
45-degree
empirical
resultingintervalfor alternativeconfidencelevels.
4. We constructedthe morestandardconfidenceintervals
for 7 based on a quadraticapproximationto the criterion
function.In particular,let 7YTbe an estimatorof 7yand
a
be the estimatedasymptoticstandarderrorof the estimator [0, 1].
In each graph,the dashedline gives the MonteCarlorewhich has an asymptotic
/(T2, 7)2
W7TWe study T(QT
for the two-step estimator,the dot-dashline for the
sults
chi-squareddistributionwith 1 df. Notice thatthis objectis iterated
estimator,and the solid line for the continuousjust the Waldstatisticfor the hypothesisthatthe truevalue
of the parameteris y. In constructingthis statisticfor the updatingestimator.Forthe minimizedcriterionfunctionrethereis a necessaryorderingbetweenthe continuousestimator,the standarderrorsinclude sults,
continuous-updating
Whenthe iterativeestia term that reflectsthe derivativeof the GMM weighting updatingandthe iterativeestimator.
value
of
the
criterion
functioncan also
the
mator
converges,
matrixwith respectto the parameters.
estimator.Because
be obtainedby the continuous-updating
Our Monte Carlo calculationsare greatly simplifiedby
estimatorminimizesits criterion,
the continuous-updating
our knowledgeof the true parametervector.In empirical
must
be smallerthanthe criterionfor
minimized
value
this
work,the corresponding
computationswouldbe morecom- the iterativeestimator.As a result, the
plot for the miniplicated. For instance,to constructa confidenceinterval mized criterionof the
estimatormust
continuous-updating
for y based on the originalcriterionfunction,a researcher
lie below the plot for the iterativeestimatorunless the itwould have to characterizenumericallythe hypothetical
There is no natural orerative estimator fails to
values of this parameterthat are consistent with a prespecifled deterioration in the criterion while concentrating out all
of the other parameters. When there are very few remaining components in the parametervector (in our examples 0,
1, or 2), this concentration is tractable. This approach may
become very difficult, however, when the parameter vector
is large.
In reporting our Monte Carlo results we use one of the
graphical methods advocated by Davidson and McKinnon
(1994). For each Monte Carlo setup we computed the empirical distributions of the statistics and compared them to
the corresponding chi-squared distributions. The results are
plotted on a set of figures constructed as follows. For each
probability value (depicted on the x axis), we computed the
converge.
dering between the results for the two-step estimator and
the continuous-updating estimator or between the two-step
estimator and the iterative estimator.
To complement our p-value plots, we also provide some
results summarizing the performance of the implied parameter estimators. The finite-sample properties of the point estimates are of interest in their own right and in some cases
provide additional insights into the behavior of the p-value
plots.
3.2 Numerical Search Routines
The two-step and iterative estimators are given by
the minimizers of the objective function (2), and the
Hansen, Heaton, and Yaron: Finite-Sample Properties of Some Alternative GMM Estimators
0.5
0.4
0.4
0.3
0.2
ering an estimator of the form
(b) True
(a) Minimized
0.5
..
} {
VT(b)
T(Xt,b)PT(Xt,b)'
t=1
/
0.3
/
269
T
+1
0.2
,
t=2
0.1
0.1 /
00
0.1
0.2
0.3
0.4
0.5
00
0.1
0.5
0.4
0.4
0.3
0.3 /
0.2
0.2
0.1
0.1
o.
o?
0.1
Figure 1.
0.2
0.3
0.4
0.2
0.3
0.4
0.5
0
.. .
0.1
0.2
0.3
0.4
0.5
Criterion Functions, Monthly Lognormal Model, No
Time Averaging: ---.,
Iterative;-- -, Two-Step;
-
, Continuous-
Updating.
continuous-updating estimator is given by the minimizer
of the objective function (4). For some of our experiments,
the estimators are given by solutions to linear equations. In
the other cases we used the numerical optimization routines
fminu.m and fmins.m, which are part of the "Optimization
Toolbox" for use with MATLAB. The routine fminu.m implements a quasi-Newton method, which is dependent on
an initial setting for the parameters. To check whether the
results were sensitive to initialization, we considered several different starting values that included the true parameter vector. When this gradient method failed to converge
or resulted in unusual estimates, we also used the routine
fmins.m, which is a simplex search method. Details on these
MATLAB programs can be found in the MATLAB Optimization Toolbox manual. In Section 4 we show that the
continuous-updating criterion can make numerical search
for the minimizer difficult. As a further check on our numerical results, when we obtained extreme parameter estimates we also examined the continuous-updating criterion
over a grid of the parameters. This gave us additional assurance that the estimated parameters were indeed minimizers
of the criterion.
When
(Xt, b) - 1/T ZT,
•T(Xt,b).
this estimator was not positive definite, we used an estimator proposed by Durbin (1960) (see also Eichenbaum,
Hansen, and Singleton 1988). Durbin's estimator is obtained
by first approximating the MA(1) model with a finite-order
autoregression. The residuals from this autoregression are
used to approximate the innovations. Then the parameters
of the MA(1) model are estimated by running a regression
of the original time series onto a one-period lag of the
"approximate" innovations. Finally, an estimate of VT(b)
is formed using the estimated MA coefficients and sample
covariance matrix for the residuals. This procedure has the
advantage that the finite-order MA structure of {ot } is imposed and the estimator is positive semidefinite by construction. It does, however, rely on the choice of a finite-order
autoregression to use in the approximation. In implementing the estimator we ran a 12th-order autoregression in the
initial stage. Although this covariance matrix estimator was
used in searching for the parameter estimates, it was not
needed at any of the converged parameter values.
(a) Minimized
/
0.5
0.4
(b) True
0.5
,
0.41
0.3
'
/
/
0.2
0.3
/ , -."
0.2
/
0.1
0.1
O
0
0.1
0.2
0.3
0.4
0.5
C0
0
0.1
-
0.3
0.4
0.5
0.5
'
0.4
0.2
(d) Wald
(c) Constrained-Minimized
0.5
0.4
/
0.3
"
/
0.3'
0.2
.
0.2
0.1
4.
(24)
where pT(Xt, b) =
/
0
+ POT
(Xt,b)OPT(Xt-l,b)']},
0.5
(d) Wald
(c) Constrained-Minimized
0.5
00
[(pT(Xt-1, b)(pT(Xt,b)'
0.1
MONTE CARLO RESULTS, LOGNORMALMODEL
For the case of time-averaged data, {pt } has an MA(1)
structure, as we discussed in Section 2.1. To account for
this, the estimator of VT(b) was computed by first consid-
0
0.1
0.2
0.3
0.4
0.5
0
0.1
0.2
0.3
0.4
0.5
Figure2. CriterionFunctions,MonthlyLognormalModel,TimeAveraging:---.., Iterative;
- - -, Two-Step;--• , Continuous-Updating.
Journalof Business & EconomicStatistics,July 1996
270
Table4. Propertiesof Estimatorsof y, LognormalModel
confidence intervals built from the Wald criteria, but the
Continuous-updating Iterative
Two-step distortion is substantially smaller than with the other two
Property
estimators. Finally, the coverage rates of the confidence reA. No timeaveraging,true7 = 4.55
gions implied by the true and constrained-minimized crite3.72
1.64
1.73
Median
ria for the continuous-updating estimator accord well with
Mean
1.81
2.06
7,171.17
the asymptotic distribution and are clearly better than the
4.47
1.81
2.06
Truncatedmean*
-.23
10%quantile
-.20
-3.67
coverage rates for the other estimators.
18.75
4.00
4.33
90% quantile
These Monte Carlo results for the lognormal model support the following remedy for the concerns raised by NelB. Timeaveraging,true7 = 4.99
son and Startz (1990). From the standpoint of hypothe3.48
1.74
1.75
Median
sis testing and confidence-interval construction, use of the
-518.14
1.88
2.04
Mean
criterion is much more reliable than
continuous-updating
3.18
1.88
2.04
Truncatedmean*
the
other
methods
we
study. The tests of the overidentifying
-10.48
-.29
10%quantile
-.66
restrictions based on the continuous-updating estimator do
90% quantile
11.52
4.22
4.85
* Estimateswithabsolutevaluesgreaterthan100 were excludedfromthe computation
of the not reject too often and in fact are quite well approximated
truncatedmeans.
by the limiting distribution. Although confidence intervals
based on the Wald criteria can be badly distorted, particCriterion Functions. Figures 1 and 2 report the prop- ularly for the two-step and iterative estimators, confidence
erties of the criterion functions for the two Monte Carlo
regions constructed from the continuous-updating criteria
experiments. Figure 1 is for the case of no time averaging have coverage probabilities that are close to the ones imof the data, and Figure 2 is for the case of time averaging.
plied by the asymptotic theory. This occurs for confidence
Figures 1(d) and 2(d) report the results using the Wald (ap- sets based on both the constrained-minimized criterion in
proximate quadratic) criteria. The results for the Wald and panel (c) and the true criterion in panel (b). The latter result
constrained-minimized criteria are identical for the iterative
supports the recommendation of Stock and Wright (1995)
and two-step estimators. This occurs because the model is to base confidence intervals on the level of the continuouslinear in the parameters and the weighting matrix is fixed
updating criterion.
in constructing (J'r - JT). For the continuous-updating
Parameter Estimates. Table 4 reports summaries of meaestimator, the results for the Wald and the constrainedminimized criteria are different due to the dependence of sures of central tendency for the three estimators of - along
the weighting matrix on the hypothetical parameter values. with 10% and 90% quantiles. The medians for the two-step
Notice that the small-sample distributions of the mini- and iterative estimators are considerably lower than the true
mized criterion functions for the iterative and two-step es- value of -, whereas the median bias for the continuoustimators are greatly distorted. The small-sample tests of the updating estimator is much smaller. The distribution for the
overidentifying restrictions based on the minimized crite- continuous-updating estimator, however, is also more disrion values are too large, leading to overrejections of the persed, as evidenced by the larger increment between the
model when using these estimators. The minimized crite- 10% and 90% quantiles. Moreover, the Monte Carlo samrion function for the continuous-updatingestimator is much ple means for the continuous-updating estimator are much
better behaved, and the small-sample distribution is very more severely distorted than they are for the other two esclose to being X2 for both Monte Carlo experiments. Tests timators. The enormous sample means for the continuousof the overidentifying restrictions of the models using the updating estimator occur because in the case of no time
minimized value of the criterion function of the continuous- averaging and of time averaging there were 23 and 31 samupdating estimator have the correct size for the model with- ples, respectively, in which the estimates are, in absolute
out time averaging and similarly for the model with time value, larger than 100. When these are removed from the
averaging for probability values less than about .1. Even Monte Carlo samples, the sample means of the continuousfor probability values greater than .1, the distribution of the updating estimator are closer to the true values than are the
minimized criterion for the continuous-updating estimator means for the other two estimators.
Recall that the analog to the two-step and iterative estiis not greatly distorted.
The finite-sample coverage probabilities of the three mator in the classical simultaneous-equations model is twoways of constructing confidence regions for 7 are depicted stage least squares and that the analog to the continuousin subplots (b), (c), and (d) in Figures 1 and 2. Recall that updating estimator is limited-information (quasi) maximum
the constrained-minimized and Wald criteria coincide when likelihood. It is known from the literature that there are
they are based on the two-step and iterative estimators but settings in which the two-stage least squares estimator has
differ when the continuous-updating estimator is used. The finite moments, but the limited-information maximum likesmall-sample coverage probabilities are greatly distorted lihood estimator does not (e.g., see Mariano and Sawa 1972;
for the intervals constructed with the iterative and two-step Sawa 1969). In light of these theoretical results and our
estimators in all cases. In particular,they do not contain the Monte Carlo findings, the continuous-updating estimator
true parametervalues as often as is to be expected from the is not an attractive alternative to the other estimators we
limiting distribution. In the case of the continuous-updating consider if our bases of comparison are the (untruncated)
estimator, for low p values coverage rates are too small for moment properties [or even relative squared errors as in
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
(b) Iterative
(a) Continuous-Updating
1.5
1.5
0.5
0.5
-1.5
-1
0
Angle
1
.5
-1.5
(c) Two Step
2
1.5
-1
0
Angle
1
271
be imposed for identification,we restrictattentionto the
interval[-ir/2, ir/2]. In smoothingthe histogram,we used
Gaussiankernel with a bandwidthof .1. The value of the
densityestimateis plottedat each of the samplepoints using a circle. The shapeof the smootheddistributionalong
with the massof the plottedcirclesprovidesevidenceabout
the small-sampledistributionof the parameterestimators.
SC Notice that the primarymodes of the continuous-updating
angle estimatorare very close to the trueparametervalues,
but the modes of the other two angle estimatorsare distorted.Moreover,the densityestimatesfor the modalangle
are largerfor the continuous-updating
method.The Monte
Carlo distributionsfor the continuous-updating
angle estimatoralso, however,have secondarymodes near -r/2,
estimatesof 7ywiththe
correspondingto large-in-magnitude
wrong sign.
0.5
CriterionFunction. The criterion
Continuous-Updating
function
for
the
estimatorcan somecontinuous-updating
0
1
-1.5 -1
1.5
timesleadto extremeoutliersfor the minimizingvalueof y.
Angle
This occursin the two Monte Carloexperimentsfor some
of EstimatedAngle Impliedby (1,
Figure3. Smoothed Distribution
of the trials, as we discussedpreviously.To see why this
MonthlyLognormalModel,No TimeAveraging.
"YT),
can occur, supposefor simplicitythat there is a single return
under consideration,no time averaging,and several
the work of Zellner(1978)].On the otherhand,Anderson,
instruments.
The momentconditionsare constructedusing
Kunitomo,and Sawa (1982) advocateduse of the limited
informationestimatorover the two-stageleast squaresesti(26)
b(Xt, g) - [log Rt+l - g log(ct+1/ct)]zt,
matorbecause,amongotherthings,the medianbias of the
formerestimatoris smaller.We also find less distortionin
the mediansfor the continuous-updating
estimatorin our where zt is a vector of instrumentsand E[I(Xt,7y)] = 0
[see Eq. (12)].Becausethe momentconditionsare linearin
experiments.
the criteriafor the iteratedand two-stepestimatorsare
g,
As we noted previously,one attractiveattributeof the
estimatoris its invarianceto ad hoc quadraticin g. In contrast,the criterionfor the continuouscontinuous-updating
of the momentcon- updatingestimatorconvergesas g gets large. To see this,
(parameterdependent)transformations
ditions. For example supposethat we reparameterize(12) observe that for a large value of g the sample average
of b(Xt,g) is approximatelyg times the sample mean of
as
log(ct+l/ct)zt and the samplecovarianceis approximately
+
(25)
=
yolog(ct+l/ct) y11log
Rt+l rt+1,
(b) Iterative
(a) Continuous-Updating
2
assumingthat = 0. In (25) the parametersyo and yi are 2
not uniquelyidentifiedby the momentconditions,whereas
1.5
in (12) allows identifi- 1.5
their ratio is. The parameterization
cation of yo by setting -y = 1. The continuous-updating
estimatorof y in (12) is invariantto how this identification
is achievedvia restrictionson (25). Notice howeverthatthe
0.5
B
0.51
two-step and iterativeestimatorsare sensitive to the chosen normalization.Hillier (1990) achievedidentificationin
1
1
1.5
-1.5 -1
0
0
1.5
a differentmannerby makingthe directionfrom the origin -1.5 -1
Angle
Angle
definedby (Y70,
71) in two-dimensionalspace the object of
(c) Two Step
interest.Althoughthis angle is identified,the lengthof the
ray from the origin along which the true value of (Y70,
71)
1.5
is not. Hillier's defense of limited informationmaximum
likelihoodover conventionaltwo-stageleast squaresis that
the formeris a betterestimatorof direction.
To see whether such a conclusion might well extend
0.5
to comparisonsbetween the continuous-updating
estimator and the other two estimatorswe consider,we report
smootheddistributionsof the estimated"direction"in FigAngle
ures 3 and 4. We measuredirectionby the angle (as measuredin radians)betweenthe horizontalaxis and the point
of EstimatedAngle Impliedby (1,
Figure4. Smoothed Distribution
(1, 7). Becausethereis still a sign normalizationthatmust "/r), Monthly Lognormal Model, Time Averaging.
272
Journalof Business & EconomicStatistics,July 1996
Criterion
with"Large"
Estimateof Gamma
120
100
80
0 60
S40 0
20
-20
-15
-10
-5
0
5
10
15
20
10
15
20
g
AverageCriterion
80
S60
" 40020
0
-20
-15
-10
-5
0
5
g
Figure5. CriterionFunctionforContinuous-Updating
Estimator,MonthlyLognormalModel,No TimeAveraging.
times the sample covarianceof log(ct+l/ct)zt. There- the parametervectoris of largedimension,however,implefore, for large g, the criterionfunction is approximately mentingthe continuous-updating
estimatormay sometimes
a quadraticform that is (1/T) times the chi-squaredtest be difficult.
statisticfor the null hypothesisthat
Estimationof y and 6. The Monte Carloenvironments
(27) of the lognormaland Markov-chainmodels differ in the
E[log(ct+1/ct)zt] = 0.
As a result it is possible for the minimizedcriterionfor assumedlaw of motion for consumptionand returnsand
the continuous-updating
criterionto be minimizedat a very in the numberof estimatedparameters.To allow a more
direct comparisonwith the Markov-chainresults, we now
large value of g.,
To furtherillustratethis potentialproblem,the upperplot considera MonteCarloexperimentwith the log-linearlaw
in Figure5 is of the criterionfunctionfor the continuous- of motion,whereboth y and 6 are estimated.This second
updating estimator for a Monte Carlo draw in which parameterallows us to check whetherthe favorableperestimatorreportedin
the value of g that minimizes the criterion function is formanceof the continuous-updating
2
and
is
to
the
1
The
Figures
lower
in
unique
5
the
single-parameter
setup.
673,750.4.
plot Figure gives
average
To implement this case, we assume that 6 = .971/12 and
value of the criterionfunction over the Monte Carlo experiments.These results are for the case of no time aver- that the meanof the (log of) consumptiongrowthis equal
aging. Notice that in the upperplot the criterionfunction to its samplemean.These parametervalues,alongwith the
approachesits lowest value as g becomeslargein absolute model for B(L) in (13), imply a set of restrictionson the
value.Even in the lower plot the criterionfunctionasymp- unconditionalmeanof Yt given in (11). MonteCarlosamtotes to a local minimumfor largenegativevaluesof g. The ples of Yt were drawnunderthese restrictions.To estimate
numericalsearchused to implementthe estimatorcouldbe 6 and y we used momentconditionsMl of Table3.
The resultsof this MonteCarloexperimentaredisplayed
complicatedby the flatsectionsof the criterionfunctionand
the searchroutinecould end up spuriouslysearchingin the in Figure 6. As in Figures 1 and 2, rejectionrates of the
directionof very largevaluesof g. This is particularlytrue overidentifyingrestrictionsin the case of the continuousfor a gradient-based
method(suchas fminu.min MATLAB) updatingestimatorare similar to those predictedby the
in which the routineattemptsto set the gradientof the cri- asymptoticdistributions,at least for probabilityvalues of
terion functionto 0. When the parametervector is of low less than.2. Moreover,as before,the small-sampledistribudimension,this problemcan easily be assessedby gridding tions of the minimizedcriteriaof the two-stepanditerative
the parametervectorandevaluatingthe criterionfunctionat estimatorsaregreatlydistorted.In this case,however,a test
the gridpoints.This is whatwe did to makesurethatlarge basedon the minimizedcriterionof the two-stepestimator
estimatesof g were not due to numericalproblems.When would underrejectthe restrictionsof the model. Coverage
92
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
(a) Minimized
-
0.5
(b) True
0.5
.
0.4
0.4
0.3
/
.0.3
0.2
0.1
0.2
-
0.1
:
00
0.1
0.2
0.3
0.4
00
0.5
0.1
(c) Constrained-Minimized
0.5
.
0.4
0.5
0.3
0.3
0.2
.
0.1
0.1
0
0.1
0.2
0.3
0.4
o00?
0.5
.
?"
0
0.1
0.2
0.3
0.4
0.5
Figure 6. CriterionFunctions,TimeAdditiveModel,6 and -7 Estimated:I---. , Iterative;- - -, Two-Step;
, Continuous-Updating.
probabilitiesof the truevaluesof both6 and-ybasedon the
"true"criterionaccordwell withthe asymptoticdistribution
estimator.The coverwhen using the continuous-updating
age probabilitiesfor the othertwo estimators,however,are
substantiallyat odds with the theory.
0.5
0.5
0.4
Results for y = 1.3 and 6 = .97. We start by discussing the results obtainedusing the more "moderate"
values of the preferenceparametersTS1 that were used
by Tauchen(1986). Figure 7 displays the results for the
first set of eight momentconditionsMl. Note from panel
(b) that the GMMcriterionfunctionsevaluatedat the true
parametervalues are largerthan predictedby the asymp(b) True
0.5
0.5
0.4
0.4
0.3
0.3
0.4
-
0.3
First we considerthe results for time-separablepreferences (0 = 0) using the Markov-chainmodel calibratedto
annualdata. For these runs VT(b)is computedas a simple covariancematrixestimator.The resultingp-valueplots
are depictedin Figures 7-9 for differentcombinationsof
the preferencesettingsand momentsets given in Tables2
and 3. Featuresof the Monte Carlo distributionsfor the
point estimatesof -yare reportedin Table5.
(a) Minimized
(b) True
(a) Minimized
Referring now to panel (c), the coverage rates for
the true-constrained
criterionare somewhattoo small for
the continuous-updatingestimator.The coverage probabilities are much more seriously distorted,however, for
the two-step and iterativeestimatorsjust as in Figures 1
and 2. Moreover,confidenceintervalsbased on the trueestimator
constrainedcriterionfor the continuous-updating
are a little better behavedthan those based on the Wald
statisticfor all threeestimators.To summarize,the results
displayedin Figure6 are largelyconsistentwith the results
in which-yis the only parameterestimated.
5. MONTECARLORESULTS,
MARKOV-CHAIN
MODELS
5.1 Time-SeparablePreferences,AnnualData
0.5
0.4
0
0.3
(d) Wald
0.4!
0.2
0.2
273
0.3
.-
0.2
0.2
0.2
0'
,
0.1
00
0.1
0.2
0.3
0.4
0.5
00
0
0.1
.
0.4
-
0.3
..'
0.2
0.1
0.1
0
0.1
0.2
0.3
0.4
0.5
-
0.2
0.3
0.4
0.5
0
0
0.1
0.2
0.3
0.4
0.5
0.4
0.5
(d) Wald
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
/
'
-
0.1
00 " 0.1
1.3, 8 = O, Moment Conditions MI:--.-.,
, Continuous-Updating.
0.1
'
(c) Constrained- Minimized
0.2
0.3
0.4
Iterative;- - -, Two-Step;
.
'.
/0.1-/..:.
0
0.5
Model,6 = .97,
Figure7 CriterionFunctions,AnnualMarkov-Chain
7=
0
0.5
0.5
0.2
0
0.4
0.4
.'
0.3
0.3
(d) Wald
(c) Constrained- Minimized
0.5
0.2
.
.,I
0.1
0.1
0.1
0.2
,
0.1
0.2
0.3
0.4
0.5
0
0.1
0.2
0.3
Model,6 = .97
Figure8 CriterionFunctionsAnnualMarkov-Chain
7=
-
1.3,
= O, Moment Conditions M2: ----.,
, Continuous-Updating.
Iterative;- - -, Two-Step;
Journalof Business & EconomicStatistics,July 1996
274
(a) Minimized
0.5
(b) True
0.5
0.4
0.4
.
0.3
0.3
0.2
0.2
/
0.1
0.1
0'
0
0.1 0.2 0.3 0.4 0.5
00
,
/
0.1 0.2 0.3
- Minimized
(c) Constrained
0.5
0.3
/ A'
0.2
,
0.4
/.
0.3
0.5
(d) Wald
0.5
0.4
0.4
/
0.2
.
0.1
0.1
00
0.1
0.2 0.3 0.4 0.5
0"
0
0.1 0.2 0.3
0.4
0.5
Model, S =
Figure 9. CriterionFunctions,Annual Markov-Chain
1.139, -y = 13.7, = 0, Moment ConditionsM2: I---., iterative;
- - -, Two-Step;
, Continuous-Updating.
totic theory.Althoughthe asymptoticapproximationsare
betterwith the continuous-updating
estimator,at least for
smallerprobabilityvalues,the criterionevaluatedat the true
parametersis still too large. On the other hand,the minimizedcriteriafunctionsaremuchbetterbehaved,especially
for the continuous-updating
estimatorwith probabilityvalues less than.15 [see Fig. 7(a)].Confidenceintervalsbased
Table5. Propertiesof Estimatorsof 9y,
Markov-Chain
Model,AnnualData
Property
Continuous-updating
Iterative
Two-step
A. TS1-M1, true -y = 1.3, 6 = .97
Mean
Truncatedmean*
Median
10%quantile
90% quantile
1.90
1.90
1.15
0.61
5.76
.85
.85
.70
.13
1.60
1.08
1.08
.78
-1.25
4.09
1.35
1.35
1.10
-1.35
3.51
1.35
1.35
1.25
-2.71
5.59
B. TS1-M2,true y = 1.3, 6 = .97
Mean
Truncatedmean*
Median
10%quantile
90%quantile
1.36
1.53
1.17
.61
4.10
Results for y -- 13.7 and 6 = 1.139.
C. TS2-M2,truey = 13.7,6 = 1.139
Mean
Truncatedmean*
Median
10%quantile
90% quantile
6.29
14.80
14.14
8.96
22.68
13.68
13.68
13.10
9.01
19.40
on criteria-function
behaviorperformedpoorly in this setting, althoughthey performedbetter for the continuousupdatingestimatorthanfor the othertwo estimators.Moreconfidenceintervalsfor
over, the criterion-function-based
the continuous-updating
estimatorprovedto be more reliable than the confidenceintervalsbased on the Wald statistic.
Figure 8 reportsplots for the set of four momentconditions M2. The reductionin momentconditions(relative
to M ) leads to an improvementin the underlyingcentral
limit approximationsdepictedin panel (b). This is especriterionfunction
cially true for the continuous-updating
exceptat largeprobabilityvalues.The minimizedcriterionfunctionvaluesused to test the overidentifyingrestrictions
behaveas predictedby the asymptotictheoryfor all three
estimators.The criterion-function-based
confidenceintervals work quite well for the continuous-updating
estimator
but not for confidenceintervalsbasedon the Waldstatistic
[comparepanels(c) and (d)].
In summary,asymptotictheory gives a poor guide for
statisticalinferencein the case of momentconditionsM1.
Presumablythis occursbecauseof the manymomentconditionsrelativeto the samplesize. For the smallermoment
conditionset M2, the asymptoticapproximations
workconsiderablybetter.Forbothmomentconditionsthe asymptotic
distributionsfor the overidentifyingrestrictionstests and
criterion-function-based
inferenceare more reliablewhen
basedon the continuous-updating
estimator.
Recall that the minimized value of the continuousupdatingcriterionis alwaysless thanor equalto thatof the
limitingcriterionof the iterativeestimator.Tauchen(1986)
reportedcasesin whichthe minimizedcriterionfunctionfor
the iterativeestimatorled to underrejection
of the overidenIn
restrictions.
these
the
cases
of the
tifying
underrejection
restrictions
more
was
when
usoveridentifying
pronounced
the
criterion
function.
See
Hansen
ing
continuous-updating
et al. (1994) for an analysisof one of these cases.
Of course,assessingthe reliabilityof the asymptotictheory as appliedto the differentparameterestimatorsis a
differentquestionthanassessingthe performanceof the parameterestimatorsthemselves.In regardto this latterquestion, the resultsin the firstportionof Table5 showthatthe
estimatorhas considerablyless bias in
continuous-updating
the medianthanthe othertwo estimatorsin the case of M1.
The iterativeestimator,however,has much less dispersion
in this case as measuredby the widthbetweenthe .10 and
.90 quantiles.In the case of M2, thereis little medianbias
for all three estimators.On the otherhand,the dispersion
in the continuous-updating
estimatoris smallerthanthatof
the othertwo estimators.
14.10
14.10
13.43
9.85
19.54
* Estimateswithabsolutevaluesgreaterthan100 wereexcludedfromthe computation
of the
truncatedmeans.
Next we con-
sider resultsusing the preferencespecificationconsidered
by Kocherlakota(1990a) (TS2 in Table 1). We only consider the performanceof GMM estimatorsobtainedusing
momentconditionsM2. Ourresultsare displayedin Figure
9 and Table 5. With this change in parameterconfiguration, the resultsfor the M2 momentconditionsare similar
to those in Figure 8. In regardto the parameterestimates,
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
(a) Minimized
0.5
275
(a) Minimized
(b) True
0.5
0.4
0.4
0.3
0.3
0.2
0
0
0.1
0.2
0.3
0.4 0.5
- Minimized
(c) Constrained
0.5
0.4
0.5
0.1
0.2 0.3
0.4 0.5
7
0.1 0.2
= 1.3, 0 = 0, Moment Conditions Ml: ---.
Two-Step;-
0
0
0.1
0.2 0.3 0.4 0.5
0.3
0
0
0.1 0.2 0.3
0.4
0.5
, Iterative;- --,
, Continuous-Updating.
again the median bias is small relative to the dispersion
for all three estimators.The continuous-updating
estimator displayssomewhatmore dispersionthanthe other two
estimators.
5.2 Time-SeparablePreferences, MonthlyData
To explore the extent to which the limiting distribution
providesa betterguidefor inferencefor largersamplesizes
(with less extremedatapoints),we redid some of our calculationsusing simulationscalibratedto monthlydata as
described in Subsection2.2. We focused exclusively on
the more "moderate"preferenceconfigurationadjusting6
accordingly.In this case we only looked at estimatorsconstructedusing momentconditionsMl andM2. We areparticularlyinterestedin momentset M1 becauseof its common use in practicewhen analyzingpostwardata.Ourresults are reportedin Figures10 and 11 andTable6. Notice
r
0.4
0.3
0.2
0.2
0.1
0.1
0
0.1
0.2 0.3 0.4 0.5
0
0.1
0.2 0.3
0.4 0.5
Figure 11. CriterionFunctions,MonthlyMarkov-Chain
Model,6 =
y = 1.3,
- - -, Two-Step;
.971/12,
= 0, Moment Conditions M2:.--I, Continuous-Updating.
, Iterative;
have distributionsthatare muchmore concentratedaround
the trueparametervalue thanthe distributionsfor the twostep estimator(againsee Table6). The performanceof the
two-stepestimatorcould potentiallybe improvedby using
a differentweightingmatrixin the first step. For example,
the residualsfromnonlineartwo-stageleast squaresapplied
to eachEulerequationcouldbe usedto estimatethe asymptotic covariancematrixof the momentconditions.Another
possibilityis to use the covariancematrixof the prices of
the "synthetic"securitiesimplicitin the use of instrumental
variables.See HansenandJagannathan
(1993)for a discussion of this weightingmatrix.
Table6. Propertiesof Estimatorsofy, Markov-Chain
Model,
Monthly Data: True-y = 1.3, 6 = .971/12
Property
Continuous-updating
Iterative
Two-step
1.25
1.25
1.18
.96
1.61
1.57
1.57
1.17
-2.12
4.63
1.38
1.38
1.27
1.00
1.89
1.84
1.84
1.27
-2.95
6.50
A. TS1-M1
Mean
Truncatedmean*
Median
10%quantile
90% quantile
updating estimates are very close to one another when M2
1.37
1.37
1.28
1.02
1.85
B. TS1-M2
is used. This is reflectedin quantilesreportedin Table6 as
well as in the "Minimized" and "Constrained-Minimized" Mean
mean*
graphs.Presumably,the reasonfor this is that the weight- Truncated
Median
ing matrix tends to be a relatively "flat" function of the 10%quantile
90% quantile
parameters.
In regard to the parameter estimates of -y, both the
continuous-updating estimator and the iterative estimators
(d) Wald
0.3
that all of the asymptotic approximations are consistently
reliablefor the continuous-updating
estimationmethod.In
sharpcontrast,large-sampleinferencesfor the two-stepestimatorare of particularlypoor qualitywith the exception
of the overidentifyingrestrictionstest usingM2. Moreover,
it is of note thatthe iterativeestimatesandthe continuous-
0.4 0.5
0.5
0.4
Model,6 =
Figure 10. CriterionFunctions,MonthlyMarkov-Chain
.971/12,
0.1
0.5
0.5
0
0
0.2
.
- Minimized
(c) Constrained
(d) Wald
0.1
0.1
0
0
0.4
0.2
.
/
0.3
0.3
,
0.2
0.1 0.2
0.4
/
0.3
0
0
0.3
0.1
0.1 /
0.1
.
0.2
0.2
/
0.4
."
.0.4
0.3
(b) True
0.5
0.5
1.38
1.38
1.27
1.00
1.89
* Estimateswithabsolutevaluesgreaterthan100 wereexcludedfromthe computation
of the
truncatedmeans.
276
Journalof Business & EconomicStatistics,July 1996
separablepreferenceswhenthe restriction0 = 0 is imposed
in the estimation,the estimatorVT(b) of the asymptoticcovariancematrixaccommodatesan MA(1) structurein the
0.4
0.4
//
Euler-equationerrors.As in Section 3, we used the VT(b)
estimatorgiven in (24) exceptwhenit was not positivedef0.3
0.3
inite, in whichcase we shiftedto Durbin's(1990)estimator
0.2
0.2
/.
with a fourth-orderautoregression.
Figure12 presentsthep-valueplotsfor the criterionfunc0.1
0.1
tions for the differentsettingsfor 0. The p-valueplots of
Figures 1, 2, 6, and 7-11 consideredresultsfor fixedpref00 '0 0.1 0.2 0.3 0.4 0.5
0 0.1 0.2 0.3 0.4 0.5
erence parametersand the three differentestimators.The
same MonteCarlodatawere used for the experimentsfor
Constrained
- Minimized
Wald
each
estimatorin these plots.
0.5
0.5
.
In contrastto the time-separablecase (the solid line in
0.4
0.4
Fig. 8), the distributionof the minimizedcriteria imply
.4
small-sampleoverrejectionof the moment conditionsfor
0.3
0.3
each of the settingsfor 0. Furthermore,even when evaluated at the true parameters,the criterionfunctionsare not
0.2
0.2
distributedas a chi-square.This occursfor all four settings
0.1
0.1
of 0 including0 = 0 (time-separable
preferences).Evidently
0
0
...............
the estimatorof the asymptoticcovariancematrixof the
0
0
0 0.1 0.2 0.3 0.4 0.5
0 0.1 0.2 0.3 0.4 0.5 momentconditions,which assumesan MA(1) structurefor
the errors,causes small-sampledistortionof the GMMcriModel,6 =
Figure 12. CriterionFunctions,AnnualMarkov-Chain
terion function.Notice that the distributionsof the Wald
y = 1.3, Time Nonseparable, Moment Conditions M2: ---.,
.971/2,
statistics are very far from being chi-squared.Consistent
,
;---,
,=
=
=
O
0;-with the resultsreportedin Section 4 and Subsections5.1
These particularMonte Carlo experimentsare ones in
which the unconditionalmomentrestrictionsprovidemuch
Table7. Propertiesof Estimatorsof y and 6, Markov-Chain
moreidentifyinginformationaboutthe powerparameter-y
Model,AnnualData:Continuous-Updating
Estimator,
thananyof the otherexperimentswe reporton. Not only are
Truey = 1.3, 6 = .97
estimatesmore accuratethan the estimatesobtainedfrom
Property
7
0
the MonteCarloexperimentscalibratedto annualdata,but
A. TNS1-M2,true6 =
also the estimatesfrom the lognormalMonteCarloexperimentsreportedin Section3 thathavethe samesamplesize.
Mean
4.56
5.99 x 105
.83
Truncatedmean*
4.56
Presumablythe mainreasonfor this latterdisparityis that
1.71
Median
.52
the Markov-chainmodelsare calibratedto dividendbehav10%quantile
.64
.10
ior ratherthanreturnbehavior.As is well knownfrom the
90% quantile
10.83
1.25
empiricalasset-pricingliterature,the dividendcalibrations
implyreturnsthatareless volatilethanhistoricaltime series
B. TS1-M2,true0 = 0
becauseof some fundamentalmodel misspecification.
Mean
-.34
5.88
Minimized
0.5
True
0.5
5.3 Time-NonseparablePreferences,AnnualData
As we discussed in Subsection 5.1, the continuousupdatingestimatorgenerallyprovidesmore reliableinference in the case of time-separablepreferenceswhen data
are generatedfrom the annualMarkov-chainmodel. Even
for that estimator,however,it is only when momentconditions M2 are used that the distributionsof the criteria
TJT
(minimized), TJ(
(true) and T(J? -
JT)
(con-
strained-minimized)accord well with the corresponding
chi-squareddistributions.For thesereasonswe considerresults with time-nonseparable
preferencesusing only moment conditionsM2 and only for the continuous-updating
estimator.To constructour Monte Carlodatasetswe used
the threetime-nonseparable
settingsof theparameterslisted
in Table 2 as TNS1, TNS2, and TNS3 (0 =
0 = - ,1
0 = - ). We also used data generated with 0 = 0 but still
estimatedwith the parameter0. Unlike the case of time-
Truncatedmean*
Median
10%quantile
90% quantile
3.67
.73
.00
10.92
-.34
-.12
-.92
.18
C. TNS2-M2, true 6 = --
Mean
Truncatedmean*
Median
10%quantile
90% quantile
3.05
2.69
.97
.01
8.28
D. TNS3-M2,true0 = .41
Mean
2.16
Truncatedmean*
1.22
Median
.04
10%quantile
5.79
90% quantile
-.46
-.46
-.42
-.93
-.01
.07
-.64
-.69
-.94
-.40
* Estimateswithabsolutevaluesgreaterthan100 wereexcludedfromthe computation
of the
truncatedmeans.
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
Minimized
0.5
.
0.4
True
0.5
0.4.
.2
0.3
0.3
0.2
0.2
r
0.1
0.1
o
0
0.1
0.2
0.3
0.4
0.5
0
0
0.1
Constrained- Minimized
0.3
0.4
0.5
Wald
0.5
0.5
0.4
0.4
0.3
0.2
0.3
,
277
equationerrorsis close to being singularat this value of 0.
Once again we used the estimatorof V(b) given by (24).
When Durbin's(1960) estimatorwas necessarywe used a
twelfth-orderautoregression.
Figure 13 reportsthe resultsfor criterionfunctionsusing
moment conditionsMl. Under Ml, the minimizedcriterion performsreasonablywell for all three settings of 0.
Tests of the overidentifyingrestrictionsof the model have
the correct small-samplesize in this case. Notice further
that the criterionT(Jd - JT) (constrained-minimized)
is
close to being chi-squareddistributedbut that the distribution of the Wald statistic is very far from chi-squared.
Hence, we continueto find that inferencesbased directly
on criterionfunctions are much more reliablethan those
based on quadraticapproximationsto the criterionfuncTable 8. Propertiesof Estimatorsof 7yand 6, Markov-Chain
Model,
0.2
0.2
0.1..
0.1
C
Monthly Data: Continuous-Updating Estimator, True 7 = 1.3, 6 = .971/12
Property
0
0.1
0.2
0.3
0.4
0.5
0
0=
;-,
3'
0
A. TNS1-M1, true 0 = 1
0.1
0.2
0.3
0.4
0.5
Model,6=
Figure 13. CriterionFunctions,MonthlyMarkov-Chain
.971/12,
7
= 1.3, Time Nonseparable, Moment Conditions Ml: ---.
= 3.
= 0;-,
and 5.2, confidenceintervalsfor 7 constructedusing the
criterionfunctionperformmuchbetterthanthose basedon
the Waldstatistic.
Table 7 reportsstatisticssummarizingthe propertiesof
the estimatorsof -y and 0. The estimatorof -y does not in
generalperformas well as in the time-separablepreference
case (see Table5 for the comparison).The medianfor 7T is
substantiallybelow the truevalue of -yin the case of 0 = 0
1 and above for 0 = 1.
and 0 Furthermore, the disof
is
of
the
estimators
7 considerablylargerthan
persion
is
when time separability correctlyimposed,at least in part
due to having to estimate an additionalparameterand to
accommodateMA(1) termsin the estimatorof the asymptotic covariancematrixV(b). Regardingthe estimatorsof
0, thereis substantialdispersionfor each case as evidenced
by the 10% and 90% quantiles.Notice furtherthat there
is some medianbias in the estimatorsof 0 for the cases
of 0 = g,0, and - (panelsA, B, and C of Table 7). In
summary,the annualdatado not permitsimultaneousestimation of 0 and 7 with any reasonableprecision,at least
for momentconditionM2.
5.4 Time-Nonseparable Preferences, Monthly Data
Using the monthlyMarkov-chainmodel, we also exam-
ined the time-nonseparable model for 0 = g, 0, and
Mean
Truncatedmean*
Median
10%quantile
90% quantile
13.56
2.90
1.27
.79
7.35
.35
.35
.34
.22
.48
B. TS1-M1,true0 = 0
16.68
4.18
1.29
.38
13.46
-.01
-.01
.01
-.23
.19
C. TNS2-M1,true6 = -1
6.03
5.82
1.24
.00
17.95
-.42
-.42
-.35
-.98
.09
Mean
Truncatedmean*
Median
10%quantile
90%quantile
D. TNS1-M2,true = 32
153.96
11.09
1.45
.48
410.29
.66
.66
.37
.16
1.16
E. TNS1-M2,true6 = 0
Mean
62.51
17.89
Truncatedmean*
Median
1.40
10%quantile
.03
142.79
90%quantile
-.14
-.14
.04
-.66
.26
Mean
Truncatedmean*
Median
10%quantile
90% quantile
Mean
Truncatedmean*
Median
10%quantile
90% quantile
1
andfor momentconditionsM2. We furtherconsideredmomentconditionsMl1becausetheseconditionsareoftenused
in practice and because the continuous-updating
estimareasonablesmall-samplepropertiesunder
torsdemonstrated
time-separablepreferences.We did not considerthe case
of 0 = - with the monthly model because the Markovchain model implies that the covariance matrix of the Euler-
F TNS2-M2,true = -•
Mean
Truncatedmean*
Median
10%quantile
90% quantile
12.98
12.78
.35
.01
40.99
-.40
-.40
-.58
-.91
.17
* Estimateswithabsolutevaluesgreaterthan100 wereexcludedfromthe computation
of the
truncatedmeans.
Journalof Business & EconomicStatistics,July 1996
278
True
Minimized
0.5
0.5
0.4
0.4-
0.3
0.3
0.2
0.2
r
0.1
0.1
00
0.2
0.1
0.3
0.4
0.5
0
0
0.1
Constrained- Minimized
0.5
0.4
0.4
'
.
o0
0.4
0.5
0.4
0.5
0.3
0.2
.2
0.2
0.1
0.3
Wald
0.5 1
0.3
0.2
...
0.1
0.1
.
0.2
0.3
0.4
0.5
o0
0.1
0.2
0.3
Model,6 =
Figure 14. CriterionFunctions,MonthlyMarkov-Chain
.971/12, y = 1.3, Time Nonseparable, Moment Conditions M2: ---.
1
0= = "3'.
0O=
',
"
,
' 0;-,
tions. Finally,we reportsummarystatisticsof the central
tendencyof the estimatorsof -yand0 in Table8, panelsA,
B, andC. Lackof priorknowledgeof the parameter6 again
causes the estimatorof -yto be much less precise as measuredby the distancebetweenthe 90%and 10%quantiles.
(Forinstance,comparethe firstcolumnof Table6 to panel
B of Table8.)
Figure 14 presentsthe p-valueplots for momentconditions M2. In this case the distributionof the minimizedcriterionfunctionsandthe constrained-minimized
criteriaare
not chi-squared.The model'soveridentifyingconditionsare
rejectedtoo often for all of the parametersettings,andconfidenceintervalsfor the parameter-yhave the wrong coverage probabilities.With momentconditionsM2, allowing
for time nonseparabilityresultsin a substantialnumberof
very largeestimatesof -yas reflectedby the size of the 90%
quantiles(see Table8, panelsC, D, and E). It appearsthat
"theadditionof returnsas instruments(the differencebetween momentconditionsMl and M2) improvesthe performanceof the estimatorand the quality of the central
limit approximations.
6.
CONCLUDINGREMARKS
In this article we examinedthe finite-sampleproperties of three alternativeGMM estimatorsthat differin the
way in which the momentconditionsare weighted.Particular attentionwas paid to both the performanceof tests
of overidentifyingrestrictionsandto comparingalternative
ways of constructingconfidencesets. In documentingfinitesampleproperties,we used severaldifferentspecifications
of the CCAPM.The experimentsdifferedsubstantiallyin
the amountof sample informationthat there is about the
parametersof interest.Although the experimentsdo not
uniformlysupportthe conclusionthatone estimatordominates the others,some interestingpatternsemerged.
1. Continuousupdatingin conjunctionwith criterionfunction-basedinferenceoften performedbetterthanother
methodsfor annualdata;however,the large-sampleapproximationsare still not very reliable.
2. In monthlydata the centrallimit approximationsfor
estimationmethodappliedin conthe continuous-updating
methodof inferjunctionwith the criterion-function-based
ence performedwell in most of our experiments,including
ones in whichthe parametersareestimatedvery accurately
and ones in which thereis a substantialamountof dispersion in the estimates.
3. Confidenceintervalsconstructedusing quadraticapproximationsto the criterion function performed very
poorly in manyof our experiments.
4. The continuous-updating
estimatortypicallyhad less
medianbias thanthe otherestimators,but the MonteCarlo
sampledistributionsfor this estimatorsometimeshadmuch
fattertails.
5. The tests for overidentifyingrestrictionsare, by construction,more conservativewhen the weightingmatrixis
continuouslyupdated,andin manycases this led to a more
reliabletest statistic.
inferOur reasonfor exploringcriterion-function-based
ences and continuousupdatingis to assess some simple
ways of making GMM inferences more reliable. Moreover, when continuousupdatingis used in conjunction
with criterion-function-based
inferences,the large-sample
approximationsbecome invariantto parameter-dependent
transformationsof the momentconditions.In this article
we have made no attemptto explorethe ramificationsfor
powerof the resultingstatisticaltests. Moreover,from the
standpointof obtainingpoint estimates,we see no particular advantageto using continuousupdatingwhen minimizing GMMcriterionfunctions.For example,continuous
updatingcan indeedfatten the tails of the distributionsof
the estimators.In this sensecontinuousupdatingsometimes
inheritsthe defects of maximumlikelihoodestimatorsrelative to two-stageleast squaresestimatorsin the classical
environment.
simultaneous-equations
OurMonteCarloexperimentsfor monthlydatawere sufficiently successful to convince us to reexaminesome of
the empiricalevidence for the CCAPM.In most tests of
the CCAPM,the model'soveridentifyingconditionsarerejected (e.g., see Hansenand Singleton 1982). Because the
two-stepor iterativeestimatoris typicallyused in practice,
one potentialexplanationfor these rejectionscould be the
tendencyof these estimatorsto result in overrejectionof
the model in small samples.To assess this possibilitywe
estimatedthe time-separableand time-nonseparable
modestimator.We used the
els using the continuous-updating
consumptionand returndata describedin Subsection2.1
along with momentconditionsMl given in Table3.
Estimationof the time-separablemodel resultedin point
estimatesof 6 and 7 of .25 and 720.65, respectively.This
Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators
279
Time SeparablePreferences,Delta Estimated
65
60
.2
55
S50
245
(.
400
2
4
6
8
12
10
Gamma
14
16
18
20
18
20
Time NonseparablePreferences,Delta andThetaEstimated
40
35
C:
230
1
0 25
M 20
1510
0
2
4
6
8
10
12
14
16
Gamma
Figure 15. GMMCriterionFunctions,MonthlyData, -yConstrained,MomentConditionsM1.
is an example in which the tail behaviorof the criterion
resultsin largeestimatedvalueof 7. The minimizedGMM
criterionwas 5.94 with an impliedp value of .43; hence,
estimatorimplies
it appearsthat the continuous-updating
that the model is not at odds with the data. The estimate
parameters,however,are very far frombeing economically
plausible.As we found in severalof our Monte Carlo exestimator,extreme
periments,with the continuous-updating
point estimates of the parametersare possible. In those
cases, however,there typically was little deteriorationin
the criterionfunctionwhen evaluatednearthe trueparameter values, so in practiceit is importantto evaluatethe
criterionfunctionat plausiblevalues of the parameters.In
this case we restricted-7to the range[0, 20] andestimated6
for each hypotheticalvalue of 7. The resultingminimized
criterionas a function of 'yis plotted in the top panel of
Figure 15. Notice that for this range of y the minimized
criterionfunction is well above 30, where the implied p
value is essentially0. As a result, once a plausibleset of
parametersis considered,the model is still rejectedwhen
estimator.
using the continuous-updating
In estimationof the time-nonseparable
model, the point
estimatesof the parameterswerealso quiteimplausiblewith
estimatesof 6, y-, and 0 of 1.20, 267.96, and .32, respectively.The bottompanelof Figure 15 presentsthe criterion
estimatorwith -yrefunctionfor the continuous-updating
=
strictedto the range[0, 201.At -y 20 the criterionreaches
a minimumof 13.55 with an impliedp value of .035. As
a result,even at this extremevalue for -ythe model is still
substantiallyat odds with the data.
estimator
In summary,althoughthe continuous-updating
does not save the CCAPM,the experimentsthat we have
presentedprovideevidencethatit shouldbe of use in many
GMMestimationenvironments.
ACKNOWLEDGMENTS
"Wethank Renee Adams, Wen Fang Liu, Alexander
Monge,MarcRoston,GeorgeTauchen,and an anonymous
refereefor helpfulcommentsandMarcRostonfor assisting
with the computations.Financialsupportfromthe National
Science Foundation(Hansenand Heaton) and the Sloan
Foundation(HeatonandYaron)is gratefullyacknowledged.
Additionalcomputerresourceswere providedby the Social
Sciences and PublicPolicy ComputingCenter.Partof this
researchwas conductedwhile Heatonwas a NationalFellow at the HooverInstitution.
[ReceivedJanuary1995. RevisedJuly1995.]
REFERENCES
Abel, A. (1990), "AssetPricesUnderHabitFormationand CatchingUp
Withthe Joneses,"A.E.R.Papersand Proceedings,80, 38-42.
Anderson,T. W., Kunitomo,N., and Sawa,T. (1982), "Evaluationof the
DistributionFunctionof the LimitedInformationMaximumLikelihood
Econometrica,50, 1009-1028.
Estimator,"
G. (1990), "HabitFormation:A Resolutionof the Equity
Constantinides,
PremiumPuzzle,"Journalof PoliticalEconomy,98, 519-543.
Davidson,R., and McKinnon,J. G. (1994), "GraphicalMethodsfor Investigatingthe Size andPowerof HypothesisTests,"DiscussionPaper
903, QueensUniversity,Dept.of Economics.
Detemple,J., andZapatero,F. (1991),"AssetPricesin anExchangeEconEconometrica,59, 1633-1657.
omy WithHabitFormation,"
280
Journalof Business & EconomicStatistics,July 1996
Dunn,K., andSingleton,K. (1986),"ModelingtheTermStructureof Interest RatesUnderNonseparable
UtilityandDurabilityof Goods,"Journal
of Financial Economics, 17, 27-55.
Durbin,J. (1960), "TheFitting of Time Series Models,"Reviewof the
International Statistical Institute, 28, 233-244.
Eichenbaum,M., and Hansen,L. (1990), "EstimatingModels With IntertemporalSubstitutionUsing AggregateTime Series Data,"Journal
of Business & Economic Statistics, 8, 53-69.
ences andTimeAggregation,"
Econometrica,
61, 353-385.
(1995), "AnEmpiricalInvestigationof Asset PricingWith Time
Preferences,"
Econometrica,63, 681-717.
Nonseparable
Hillier,G. (1990), "Onthe Normalizationof StructuralEquations:Propertiesof DirectionEstimators,"
Econometrica,58, 1181-1194.
N. R. (1990a),"OnTestsof Representative
ConsumerAsset
Kocherlakota,
S-
Pricing Models," Journal of Monetary Economics, 26, 285-304.
- (1990b),"ANote on the 'Discount'Factorin GrowthEconomies,"
Eichenbaum,M., Hansen,L., and Singleton,K. (1988), "A Time Series
Journal of Monetary Economics, 25, 43-48.
andLeisure
Analysisof a Representative
AgentModelof Consumption
Magdalinos,A. M. (1994), "TestingInstrumentAdmissibility:Some ReChoice Under Uncertainty" Quarterly Journal of Economics, 103, 51finedAsymptoticResults,"Econometrica,62, 373-405.
78.
Mariano,R. S., andSawa,T. (1972),"TheExactFinite-SampleDistribuRiskAversion,andTemporal
Epstein,L., andZin,S. (1991),"Substitution,
tion of the Limited-Information
MaximumLikelihoodEstimatorin the
Behaviorof ConsumptionandAsset Returns:An EmpiricalAnalysis,"
Caseof TwoIncludedEndogenousVariables,"
Journalof theAmerican
Journal Political
263-286.
of
Economy, 199,
Statistical Association, 67, 159-163.
G. (1991),"HabitFormationandDurabil- Nelson,C. R., andStartz,R.
Ferson,W.,andConstantinides,
of theInstrumental
(1990),"TheDistribution
ity in AggregateConsumption:
EmpiricalTests,"Journalof Financial
VariablesEstimatorandIts t-RatioWhentheInstrument
Is a PoorOne,"
Economics, 29, 199-240.
Ferson,W.,andFoerster,S. (1994),"FiniteSamplePropertiesof the GeneralizedMethodof Momentsin Testsof ConditionalAssetPricingMod-
Journal of Business, 63, 125-164.
Novales,A. (1990),"SolvingNonlinearRationalExpectationsModels:A
StochasticEquilibrium
Modelof InterestRates,"Econometrica,
58, 93Journal
els,"
of Financial Economics, 36, 29-55.
111.
Estimationof Pakes, A., and
Gallant,A., and Tauchen,G. (1989), "Seminonparametric
Pollard,D. (1989), "Simulationand the Asymptoticsof
ConditionallyConstrainedHeterogeneousProcesses,"Econometrica, OptimizationEstimators,"
Econometrica,57, 1027-1057.
57, 1091-1120.
H. E., Jr.,andHeal,G. M. (1973),"OptimalGrowthWithIntertemRyder,
S.
Grossman, J., Melino, A., and Shiller, R. (1987), "Estimatingthe
porally Dependent Preferences," Review of Economic Studies, 40, 1-31.
Continuous-Time
Consumption-Based
Asset-PricingModel,"Journalof
Sargan,J. (1958), "TheEstimationof EconomicRelationshipsUsing InBusiness & Economic Statistics, 5, 315-327.
strumentalVariables,"
Econometrica,26, 393-415.
Substitutionin Consumption,"
Hall, R. (1988),"Intertemporal
Journalof
Sawa, T. (1969), "The Exact SamplingDistributionof OrdinaryLeast
Political Economy, 96, 339-357.
Journalof theAmerSquaresandTwo-StageLeastSquaresEstimator,"
Hansen,L., Heaton,J., and Yaron,A. (1994), "FiniteSampleProperties
ican Statistical Association, 64, 923-937.
of Some AlternativeGMMEstimators,"
WorkingPaper3728-94-EFA,
MassachusettsInstituteof Technology,SloanSchoolof Management. Stock, J., and Wright, J. (1995), "Asymptoticsfor GMM Estimators
WithWeakInstruments,"
unpublishedmanuscript,HarvardUniversity,
R. (1993), "AssessingSpecificationErrors
Hansen,L., and Jagannathan,
School.
Kennedy
in StochasticDiscountFactorModels,"unpublishedmanuscript.
Sundaresan,S. (1989), "Intertemporally
DependentPreferencesand the
Hansen,L., and Sargent,T. (1991), "ExactLinearRationalExpectations
of
and
Wealth,"
Review of Financial Studies, 2,
Volatility
Consumption
Models," in Rational Expectations Econometrics, eds. L. Hansen and T.
73-89.
Boulder:
Westview
45-75.
Press,
Sargent,
pp.
Tauchen,G. (1986), "StatisticalPropertiesof GeneralizedMethod-ofInstrumental
Hansen,L., andSingleton,K. (1982), "Generalized
Variable
MomentsEstimatorsof Structural
Parameters
ObtainedFromFinancial
Estimationof NonlinearRationalExpectationsModels,"Econometrica,
Market Data," Journal of Business & Economic Statistics, 4, 397-425.
50, 1269-1286.
Methodsfor ObRiskAversion,andtheTemporal Tauchen,G., andHussey,R. (1991),"Quadrature-Based
(1983),"StochasticConsumption,
taining.ApproximateSolutionsto NonlinearAsset Pricing Models,"
Behavior of Asset Returns,"Journal of Political Economy, 91, 249-265.
Econometrica, 59, 371-396.
(1996),"EfficientEstimationof LinearAssetPricingModelsWith
A. (1978), "Estimationof Functionsof PopulationMeans and
Zellner,
Moving-Average Errors,"Journal of Business & Economic Statistics, 14,
CoefficientsIncludingStructuralCoefficients;A Minimum
Regression
53-68.
Journalof Econometrics,8, 127ExpectedLoss (MELO)Approach,"
PreferHeaton,J. (1993), "TheInteractionBetweenTime-Nonseparable
158.