Following the Trend: Tracking GDP when Long-Run Growth is Uncertain∗

Juan Antolin-Diaz (Fulcrum Asset Management)
Thomas Drechsel (LSE, CFM)
Ivan Petrella (Birkbeck & CEPR)

This draft: November 20, 2014. First draft: October 24, 2014.

Abstract

Using a Bayesian dynamic factor model that allows for changes in both the long-run growth rate of output and the volatility of business cycles, we document a significant decline in long-run growth in the United States and in other advanced economies. Our evidence supports the view that this slowdown started prior to the Great Recession. When applied to real-time data, the proposed model is capable of detecting shifts in long-run growth in a timely and reliable manner. Furthermore, taking into account the variation in long-run growth improves the short-run forecasts and “nowcasts” of US GDP typically produced using this class of models.

Keywords: Dynamic Factor Models; Bayesian Methods; Mixed Frequencies; Real-time Forecasting; Long-run Growth. JEL Classification Numbers: C53, C38, C11, E37.

∗ Antolin-Diaz: Macroeconomic Research Department, Fulcrum Asset Management, Marble Arch House, 66 Seymour Street, London W1H 5BT, UK; E-Mail: [email protected]. Drechsel: Department of Economics and Centre for Macroeconomics, London School of Economics, Houghton Street, London, WC2A 2AE, UK; E-Mail: [email protected]. Petrella: Birkbeck, University of London, Malet Street, London WC1E 7HX, UK; E-Mail: [email protected]. The views expressed in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of Fulcrum Asset Management. We thank Gavyn Davies, Wouter den Haan, Ed Hoyle, Ron Smith and the seminar participants at ECARES for useful comments and suggestions. Alberto D’Onofrio provided excellent research assistance.

1 Introduction

“The global recovery has been disappointing ...
Year after year we have had to explain from mid-year on why the global growth rate has been lower than predicted as little as two quarters back.” Stanley Fischer, August 2014.

The slow pace of the recovery from the 2007-2009 recession has recently prompted questions about whether the long-run growth rate of the US economy (and of other industrialised economies) is lower now than it has been on average over the past few decades. Indeed, for the last five years, forecasts of US and global real GDP growth have been persistently biased upwards. As a means of illustration, Figure 1 shows the GDP growth projections made by the Federal Open Market Committee (FOMC) at the end of each year for the current year and the subsequent three. As the figure shows, actual GDP growth has averaged close to 2% over that period. At the end of each year, however, the FOMC expected the economy to accelerate substantially, only to downgrade the forecast back to 2% throughout the course of the year.1

What accounts for the systematic upward bias in growth forecasts? The present paper starts from the observation that there is substantial evidence of a decline in the mean of US real GDP growth. For instance, the structural break test of Bai and Perron (1998) suggests that a significant break in mean GDP growth probably occurred in the early part of the 2000’s, and that this break is larger than the one generally accepted to have occurred in the early 1970’s (see also Luo and Startz, 2014). This evidence complements the recent work of Fernald (2014), who describes a slowdown in US labor and total factor productivity prior to the Great Recession, and concludes that most of the shortfall of actual output from the pre-recession trend is the result

Footnote 1: The FOMC was not alone. An analysis of consensus forecasts of private sector economists, using data from Consensus Economics Inc., reveals the same pattern.
Figure 1: Year-end FOMC projections for US Real GDP Growth and Actual Outturn, 2009-2014. [Chart showing the actual outturn together with projections made in Nov-09, Nov-10, Nov-11, Dec-12 and Dec-13, over 2009-2016.] Note: Calculated as the midpoint of the central tendency of the FOMC participants’ individual projections. Both the projections and the actual outturn are expressed as percentage growth over the same quarter of the previous year. Source: Federal Reserve Board of Governors.

of overly optimistic projections of potential growth. Interestingly, a sequential application of the Bai and Perron (1998) test using real-time data reveals that while the break is estimated to have occurred in the early 2000’s, it would not have been detected at conventional significance levels until the summer of 2014, which possibly explains the delay in incorporating the slowdown into growth forecasts. Orphanides (2003) emphasized that real-time misperceptions about the long-run growth of the economy can play a large role in monetary policy mistakes. We therefore argue that the possibility of significant instabilities in the mean (and the variance) of real output calls for a robust framework capable of producing timely assessments of short- and long-run GDP growth. To this end, we introduce two novel features into an otherwise standard dynamic factor model (DFM) of real activity data: a time-varying long-run growth component for GDP, and stochastic volatility (SV) in the innovations to both factors and idiosyncratic components. While many econometric models have assumed long-run growth to be constant, and well approximated by the sample average of about 3%, the literature on economic growth reveals considerable uncertainty about the issue (see, e.g. Fernald and Jones, 2014).
While it is by now widely accepted that the post-1973 sample is associated with a slowdown in productivity and therefore potential growth (see Nordhaus, 2004 for a retrospective), there is a range of views on the current prospects for the US economy after the booming years often associated with the IT revolution in the late 1990’s (Oliner and Sichel, 2000). In recent years a rather pessimistic view of long-run growth has emerged (Gordon, 2014, Summers, 2014), contrasting with the more benign view of the health of the economy prevailing before the Financial Crisis (see Jorgenson et al., 2006). The introduction of a time-varying mean for GDP growth allows us to take the uncertainty about long-run growth seriously. As emphasised by Cogley (2005), if one thinks that the long-run growth rate is not constant, it is optimal to give more weight to recent data than to data from the distant past when estimating current long-run growth. By taking a Bayesian approach, we can combine our prior beliefs about the rate at which past information should be discounted with the information contained in the data. The inclusion of SV is instead motivated by the desire to capture the changes in the uncertainty surrounding GDP forecasts in the sample under investigation. Specifically, this is important to capture the substantial changes in the volatility of output that have taken place during the postwar period, most famously the “Great Moderation” first reported by Kim and Nelson (1999a) and McConnell and Perez-Quiros (2000). Moreover, the recent literature on economic uncertainty finds substantial time variation and counter-cyclicality in aggregate uncertainty (see e.g. Bloom, 2014, and Jurado et al., 2014). Since the seminal contributions of Evans (2005) and Giannone et al.
(2006), the DFM has become one of the most important tools to produce real-time tracking estimates (also known as “nowcasts”) of GDP, for which data are available only quarterly and with considerable delay, by exploiting the ability of the Kalman filter to incorporate the information content provided by more timely indicators.2 The DFM literature has generally operated under the assumption of no structural instabilities, either by assuming that the data are stationary to begin with, or that they have been differenced appropriately to reach stationarity.3 In the case of GDP, it is usually assumed that its quarterly growth rate has constant mean and variance. Important exceptions in the literature that allow for structural instabilities within the DFM include Del Negro and Otrok (2008), who model time variation in factor loadings and volatilities,4 and Marcellino et al. (2013), who show that the addition of SV improves the performance of the model for short-term forecasting of euro area GDP. Acknowledging the importance of allowing for time variation in the means of the variables, Stock and Watson (2012) pre-filter their dataset in order to remove any low-frequency trends from the resulting growth rates using a biweight local mean. In his comment on their paper, Sims (2012) suggested explicitly modeling, rather than filtering out, these long-run trends, and emphasized the importance of evolving volatilities for describing and understanding macroeconomic data. We see the present paper as extending the DFM literature, and in particular its application to nowcasting GDP, in the direction suggested by Sims.

Footnote 2: An extensive survey of the nowcasting literature is provided by Banbura et al. (2012), who also demonstrate, in a real-time context, the good out-of-sample performance of DFM nowcasts.

Footnote 3: If evidence of structural breaks is present, the usual strategy is to select an estimation window over which the series are believed to be stationary. For example, Banbura et al.
(2012) start their estimation sample in 1983 to account for the reduction in volatility associated with the “Great Moderation” of the mid-1980’s.

Footnote 4: The model of Del Negro and Otrok (2008) also includes time-varying factor loadings, but the means of the observable variables are still treated as constant, i.e. they are stationary in the first moment but not in the second.

Because in-sample results obtained with revised data often underestimate the uncertainty faced by policymakers in real time, we conduct a careful out-of-sample evaluation of the performance of our model using vintages of US data from January 2000 to September 2014. We document how, by the summer of 2011 (well before the break test became conclusive), the model would have concluded that a significant decline in long-run growth was behind the slow recovery. Furthermore, we evaluate the forecasts of our model against a benchmark DFM without time-varying long-run growth or SV, and we show that the addition of these features improves both density and point forecasts of US GDP growth. Failing to account for the lower level of the long-run growth component of GDP, a standard DFM assuming a stable mean would have persistently predicted a recovery stronger than the one witnessed in recent years. Finally, we apply our model to the other G7 economies and review the drivers of secular fluctuations in GDP growth. Our estimates conform to the well-accepted narrative on the evolution of demographic and productivity trends in the advanced economies. We show that while in the US long-run variations in productivity have been, until recently, masked by offsetting movements in labor force growth, in other countries productivity and labor force trends have historically moved in the same direction, resulting in large changes in long-run output growth. We also show that productivity growth in the recent decade has been substantially below historical norms.
Therefore, having a model that appropriately accounts for the uncertainty in the time variation of the long-run component of GDP is perhaps even more important at the current juncture. The remainder of this paper is organized as follows. Section 2 presents preliminary evidence of a slowdown in long-run US GDP growth. Section 3 discusses the implications of time-varying long-run growth and volatility for DFMs and presents our model. Section 4 applies the model to US data and documents the decline in long-run growth. The implications for tracking GDP are discussed in the context of our real-time out-of-sample evaluation exercise. Section 5 applies our model to the other G7 countries and decomposes the changes in long-run growth into their underlying drivers. Section 6 concludes.

2 Preliminary Evidence of a Decline in Long-Run GDP Growth

The possibility that there are changes in long-run GDP growth is the source of a long-standing controversy, and has important implications for econometric modelling. In essence, the question is whether GDP growth is best described as fluctuating around a constant mean, or whether the mean growth rate is itself subject to stochastic variation. Nelson and Plosser (1982) first modelled the (log) level of real GDP as containing a random walk with drift. This implies that after first-differencing, the resulting growth rate is stationary, and therefore exhibits a constant mean. Since their seminal contribution, this assumption has been embedded in many econometric models. After the slowdown in productivity and GDP growth of the 1970’s became apparent, many papers such as Clark (1987) modeled the drift term in the stochastic trend as an additional random walk. This second approach implies that the level of GDP is integrated of order two.
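To make the I(2) view concrete, the drift-as-random-walk specification can be simulated directly. The following sketch uses our own illustrative parameter values (not estimates from the paper): the drift of log GDP is itself a random walk, so the growth rate has a slowly moving mean and the log level is integrated of order two.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 240  # quarters

# Drift term (long-run growth) modeled as an additional random walk,
# in the spirit of Clark (1987): mu_t = mu_{t-1} + v_t
mu = 0.8 + np.cumsum(rng.normal(0.0, 0.05, T))

# Observed quarterly growth fluctuates around the drifting mean
g = mu + rng.normal(0.0, 0.6, T)

# The log level of GDP cumulates growth, so it is I(2): differencing once
# leaves the I(1) growth series g; only the second difference is stationary
log_gdp = np.cumsum(g)
```

Differencing `log_gdp` once recovers `g` exactly, and only the second difference is stationary, which is the precise sense in which the growth rate "can drift without bound" under this assumption.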
This assumption is also embedded in the local linear trend model of Harvey (1985), in the popular Hodrick-Prescott filter (Hodrick and Prescott, 1997), and in Stock and Watson (2012)’s practice of removing a local biweight mean from the growth rates before applying the standard DFM framework. The I(2) assumption is, however, not without controversy, since it implies that the growth rate of output can drift without bound. Consequently, many applied papers (see, e.g. Perron and Wada, 2009) have argued that the changes in trend growth occurring in the 1970’s reflect one large break rather than a continuous sequence of very small breaks, and have modeled the growth rate of GDP as a stationary process with at most one deterministic break in the mean around 1973. As preliminary evidence of a slowdown in GDP growth, we test for multiple breaks in the mean of GDP growth over the postwar sample period using the Bai and Perron (1998) methodology.5 We find evidence in favor of at least one break at the 5% level. The most likely break is in the second quarter of 2000, while the second most likely break, which is not significant, is estimated to have occurred in the second quarter of 1973.

Figure 2: Real-Time Result of the Bai-Perron Test: 2000-2014. [Chart showing the recursively computed test statistic over the 2000-2014 vintages.] Note: The solid blue line is the test statistic obtained from recursively re-applying the Bai and Perron (1998) test in real time as new National Accounts vintages are published. The dotted line plots the 5% critical value of the test, while the dashed line plots the 10% critical value.

These results are similar to those of Luo and Startz (2014), who use Bayesian model averaging to calculate the posterior probability of a single break and find the most likely break date in 2006:Q1. They note that if the sample were restricted to

Footnote 5: See Appendix A for the full results of the test and some further discussion.
exclude the decade of the 2000’s, a break date around 1973:Q1 would be the most likely. This is also in line with the analysis of the labour productivity series by Fernald (2014), who finds evidence for three breaks in the mean growth rate of productivity: first, a productivity slowdown in 1973:Q2; second, a speedup around 1995:Q3; and finally a second slowdown in the early 2000’s.6 Given the somewhat different results obtained with different samples and methods, the precise number and timing of breaks remains unclear to us. However, it is a fair conclusion from the results presented above that there is substantial evidence for at least one break in the mean of GDP growth in the postwar sample, most likely in the first half of the decade of the 2000’s. Most importantly, the early 2000’s break detected by the Bai and Perron (1998) test only became significant at standard levels with the recent vintages of National Accounts data, as displayed in Figure 2. If the test is correct and the break happened at the beginning of the decade, this means that the break was not detected until almost fifteen years later. This highlights the importance of an econometric framework capable of detecting changes in long-run growth in a timely and accurate manner.

3 Econometric Framework

DFMs in the spirit of Geweke (1977), Stock and Watson (2002) and Forni et al. (2009) have become a workhorse of empirical macroeconomics. This popularity owes to their theoretical appeal (see e.g. Sargent and Sims, 1977 or Giannone et al., 2006), as well as to their ability to parsimoniously model very large datasets. DFMs capture the idea that a small number of unobserved factors drives the comovement of a possibly

Footnote 6: The results of Fernald (2014) vary slightly depending on whether the income or expenditure measures of GDP are used. In the first case, the breaks are detected in 1973:Q2, 1997:Q2 and 2003:Q1; in the second, 1973:Q2, 1995:Q3 and 2006:Q1.
large number of macroeconomic time series, each of which may be contaminated by measurement error or other sources of idiosyncratic variation. Giannone et al. (2008) and Banbura et al. (2012) have applied the DFM framework to the problem of nowcasting GDP, that is, obtaining early estimates of quarterly GDP growth by exploiting more timely monthly indicators and the factor structure of the data. The implications of the presence of instabilities in the mean of GDP growth for the estimation of DFMs are not straightforward. Intuitively, if the instabilities are shared by many of the series in the panel, they will be captured by the factors, and therefore the factor-based forecasts will be accurate. If, however, they are not common, they will be absorbed by the idiosyncratic component of GDP, while the factors themselves will still be estimated consistently (see Doz et al., 2012). Given the maintained assumption of stationarity of the factors and idiosyncratic components, and given that the common factors drive most of the persistence in the series, forecasts of the idiosyncratic component will revert to zero relatively quickly, likely producing biased forecasts and nowcasts of GDP. In addition, it is likely that the presence of instabilities in the variance will lead to poor density forecasts. In this section we propose to make the DFM framework robust to both of these instabilities. While, as noted by Luo and Startz (2014), the available data cannot conclusively settle the question of whether these instabilities take the form of discrete, large breaks or a continuous sequence of small shocks, specifications with discrete breaks have very important practical drawbacks. First, until recently the presence of a single significant break in 1973 was usually taken for granted. This highlights the uncertainty about the number of true breaks and their exact timing.
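The mechanics of break testing can be illustrated with a stylized single-break sup-F search, in the spirit of (but much simpler than) the multiple-break Bai-Perron procedure; all parameter values below are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def sup_f_break(x, trim=0.15):
    """Search for a single break in the mean of x via the largest F statistic.

    For each admissible break date, compare the residual sum of squares of a
    constant-mean model with that of a model whose mean shifts at the break.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    ssr0 = np.sum((x - x.mean()) ** 2)  # restricted model: a single mean
    best_f, best_tau = -np.inf, None
    for tau in range(int(trim * T), int((1 - trim) * T)):
        a, b = x[:tau], x[tau:]
        ssr1 = np.sum((a - a.mean()) ** 2) + np.sum((b - b.mean()) ** 2)
        f = (ssr0 - ssr1) / (ssr1 / (T - 2))  # F statistic for the mean shift
        if f > best_f:
            best_f, best_tau = f, tau
    return best_f, best_tau

# Simulated growth rate that slows from 3% to 2% halfway through the sample
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(3.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
stat, tau = sup_f_break(x)  # tau should land near observation 100
```

In practice the sup-F statistic has a non-standard distribution and must be compared against simulated critical values, and a mean shift that is small relative to cyclical volatility takes many observations to become significant, which is exactly the real-time detection problem discussed in the text.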
However, once a test is completed and a break introduced into a model, the researcher is essentially asserting that the break(s) occurred with unit probability at the particular point determined by the test, thereby ignoring the uncertainty surrounding the test itself. Second, the fact that breaks have occurred in the past should make the researcher wary that they might happen in the future. It follows that models with deterministic breaks underestimate the uncertainty around growth forecasts, which should increase with the forecast horizon. Finally, as the real-time test results of Figure 2 make evident, structural break tests may only detect a break when it is too late. Therefore, while we remain agnostic about the ultimate form of structural change in the GDP process, we specify the long-run growth rate of GDP as a random walk. Our motivation is similar to that of Primiceri (2005). While in principle it is unrealistic to conceive that GDP growth could wander in an unbounded way, as long as the variance of the process is small and the drift is considered to be operating for a finite period of time, the assumption is innocuous. Moreover, modeling the trend as a random walk is more robust to misspecification when the actual process is indeed characterized by discrete breaks, whereas structural break tests might not be robust to the true process being a random walk.7 Finally, the random walk assumption also has the desirable feature that, unlike in stationary models, confidence bands around GDP forecasts increase with the forecast horizon, reflecting uncertainty about the possibility of future breaks or drifts in trend growth.

3.1 The Model

Let y_t be an (n × 1) vector of observable macroeconomic time series, and let f_t denote a (k × 1) vector of latent common factors. It is assumed that n >> k, so that the number of observables is much larger than the number of factors.
Footnote 7: We demonstrate this point with the use of Monte Carlo simulations, showing that a random walk trend ‘learns’ quickly about a large break once it has occurred. On the other hand, the random walk does not detect a drift when there is not one, despite the presence of a large cyclical component. See Appendix B for the full results of these simulations.

Setting k = 1 and ordering GDP growth first (so that GDP growth is referred to as y_{1,t}), we have:8

  y_{1,t} = α_{1,t} + f_t + u_{1,t},   (1)
  y_{i,t} = α_i + λ_i f_t + u_{i,t},   i = 2, ..., n,   (2)

where u_{i,t} is an idiosyncratic component specific to the i-th series and λ_i is its loading on the common factor.9 Since the intercept α_{1,t} is time-dependent in equation (1), we allow the mean growth rate of GDP to vary. We choose to do so only for GDP, the variable of primary interest, to keep the model as parsimonious as possible.10 If some other variable in the panel were at the center of the analysis, or there were suspicion of changes in its mean, an extension to include additional time-varying intercepts would be straightforward. In fact, for theoretical reasons it might be desirable to impose that the drift in long-run GDP growth is shared by other series such as consumption, a possibility that we consider in the robustness section. The laws of motion for the factor and idiosyncratic components are, respectively,

  Φ(L) f_t = ε_t,   (3)
  ρ_i(L) u_{i,t} = η_{i,t},   (4)

Footnote 8: In our application to US real activity data, the use of one factor is appropriate, and we thus focus on the case of k = 1 for expositional clarity in this section.

Footnote 9: The loading for GDP is normalised to unity. This serves as an identifying restriction in our estimation algorithm. Bai and Wang (2012) discuss minimal identifying assumptions for DFMs.

Footnote 10: The alternative approach of including a time-varying intercept for all indicators (see, e.g. Creal et al., 2010 or Fleischman and Roberts, 2011) implies that the number of state variables increases with the number of observables.
This not only imposes an increasing computational burden, but in our view compromises the parsimonious structure of the DFM framework, in which the number of degrees of freedom does not decrease as more variables are added. It is also possible that allowing for time variation in a large number of coefficients would improve in-sample fit at the cost of a loss of efficiency in out-of-sample forecasting. For the same reason we do not allow for time variation in the autoregressive dynamics of factors and idiosyncratic components, given the limited evidence on changes in the duration of business cycles (see e.g. Ahmed et al., 2004).

Φ(L) and ρ_i(L) denote polynomials in the lag operator of order p and q, respectively. Both (3) and (4) are covariance stationary processes. The disturbances are distributed as ε_t ~ iid N(0, σ²_{ε,t}) and η_{i,t} ~ iid N(0, σ²_{η_i,t}), where the SV is captured by the time variation in σ²_{ε,t} and σ²_{η_i,t}.11 The innovations to the idiosyncratic components, η_{i,t}, are cross-sectionally orthogonal and are assumed to be uncorrelated with the common factor at all leads and lags. Finally, the dynamics of the model’s time-varying parameters are specified to follow driftless random walks:

  α_{1,t} = α_{1,t−1} + v_{α,t},   v_{α,t} ~ iid N(0, ς_{α,1}),   (5)
  log σ_{ε,t} = log σ_{ε,t−1} + v_{ε,t},   v_{ε,t} ~ iid N(0, ς_ε),   (6)
  log σ_{η_i,t} = log σ_{η_i,t−1} + v_{η_i,t},   v_{η_i,t} ~ iid N(0, ς_{η,i}),   (7)

where ς_{α,1}, ς_ε and ς_{η,i} are scalars.12 Note that in the standard DFM, it is assumed that ε_t and η_{i,t} are iid. Moreover, both the factor VAR in equation (3) and the idiosyncratic components (4) are usually assumed to be stationary, so by implication the elements of y_t are assumed to be stationary (i.e. the original data have been differenced appropriately to achieve stationarity). In equations (1)-(7) we have relaxed these assumptions to allow for SV and a stochastic trend in the mean of GDP.
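Equations (1)-(7) can be simulated directly. The following sketch uses our own illustrative parameter values (not estimates from the paper), with k = 1, AR(1) dynamics (p = q = 1) and a single indicator besides GDP, to show the mechanics of the time-varying mean and the stochastic volatility:

```python
import numpy as np

rng = np.random.default_rng(42)
T, phi, rho, lam = 600, 0.8, 0.5, 1.2  # illustrative values

alpha1 = np.empty(T)       # time-varying long-run growth of GDP, eq. (5)
log_sig_eps = np.empty(T)  # log volatility of the factor innovation, eq. (6)
f = np.empty(T)            # common factor, eq. (3)
u1 = np.empty(T)           # idiosyncratic component of GDP, eq. (4)

alpha1[0], log_sig_eps[0], f[0], u1[0] = 0.7, 0.0, 0.0, 0.0
for t in range(1, T):
    alpha1[t] = alpha1[t - 1] + rng.normal(0, 0.02)           # random-walk trend
    log_sig_eps[t] = log_sig_eps[t - 1] + rng.normal(0, 0.03)  # SV random walk
    f[t] = phi * f[t - 1] + rng.normal(0, np.exp(log_sig_eps[t]))
    u1[t] = rho * u1[t - 1] + rng.normal(0, 0.3)

y1 = alpha1 + f + u1                         # GDP growth, eq. (1)
y2 = 0.1 + lam * f + rng.normal(0, 0.3, T)   # a second indicator, eq. (2)
```

Because both series load on the same factor, they comove strongly, while only `y1` inherits the slow drift in its mean; this is exactly the asymmetry the model exploits when separating long-run growth from the business cycle.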
Our model nests the specifications that have been proposed previously in the nowcasting literature: we obtain the DFM with SV of Marcellino et al. (2013) if we shut down time variation in the mean of GDP, i.e. set ς_{α,1} = 0, and if we further shut down the SV, i.e. set ς_{α,1} = ς_ε = ς_{η,i} = 0, we obtain the specification of Banbura and Modugno (2014) and Banbura et al. (2012).

Footnote 11: Once SV is included in the factors, it must be included in all idiosyncratic components as well. In fact, the Kalman filter estimates of the state vector will depend on the signal-to-noise ratios, σ_{ε,t}/σ_{η_i,t}. If the numerator is allowed to drift over time but the denominator is kept constant, we might be introducing into the model spurious time variation in the signal-to-noise ratios, implying changes in the precision with which the idiosyncratic components can be distinguished from the common factors.

Footnote 12: For the case of more than one factor, following Primiceri (2005), the covariance matrix of f_t, denoted by Σ_{ε,t}, can be factorised without loss of generality as A_t Σ_{ε,t} A_t′ = Ω_t Ω_t′, where A_t is a lower triangular matrix with ones on the diagonal and covariances a_{ij,t} in the lower off-diagonal elements, and Ω_t is a diagonal matrix of standard deviations σ_{ε_i,t}. Furthermore, for k > 1, Q_ε would be an unrestricted (k × k) matrix.

3.2 Dealing with Mixed Frequencies and Missing Data

Tracking activity in real time requires a model that can efficiently incorporate information from series measured at different frequencies. In particular, it must include both the growth rate of GDP, which is measured at quarterly frequency, and more timely monthly indicators of real activity.
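The standard device for linking the two frequencies approximates the quarterly log-level by the three-month average of monthly log-levels; the quarterly growth rate is then an exact weighted mean of five monthly growth rates with triangle weights (1/3, 2/3, 1, 2/3, 1/3). A quick numerical check of this identity (simulated data, our own construction):

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.cumsum(rng.normal(0.2, 1.0, 120))  # monthly log-levels
d = np.diff(x)                            # monthly growth: d[s] = x[s+1] - x[s]

t = 60  # any month with at least five lags of monthly growth available

# Quarterly growth: change in the 3-month average of log-levels over a quarter
q_growth = (x[t] + x[t - 1] + x[t - 2]) / 3 - (x[t - 3] + x[t - 4] + x[t - 5]) / 3

# Weighted mean of the five monthly growth rates with the triangle weights
weights = np.array([1 / 3, 2 / 3, 1.0, 2 / 3, 1 / 3])
monthly = np.array([d[t - 1], d[t - 2], d[t - 3], d[t - 4], d[t - 5]])
mm = weights @ monthly
# q_growth and mm agree exactly under the log-average approximation
```

The two quantities coincide by construction, which is why the aggregation constraint can be imposed deterministically in the measurement equation.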
Therefore, the model is specified at the monthly frequency and, following Mariano and Murasawa (2003), the (observed) quarterly growth rate can be related to the (unobserved) monthly growth rate and its lags using a weighted mean:

  y^q_{1,t} = (1/3) y^m_{1,t} + (2/3) y^m_{1,t−1} + y^m_{1,t−2} + (2/3) y^m_{1,t−3} + (1/3) y^m_{1,t−4},   (8)

where only every third observation of y^q_{1,t} is actually observed. Substituting (1) into (8) yields a representation in which the quarterly variable depends on the factor and its lags. The presence of mixed frequencies is thus reduced to a problem of missing data in a monthly model. Besides mixed frequencies, additional sources of missing data in the panel include: the “ragged edge” at the end of the sample, which stems from the non-synchronicity of data releases; missing data at the beginning of the sample, since some data series have been created or collected more recently than others; and missing observations due to outliers and data collection errors. Below we present a Bayesian estimation method that exploits the state space representation of the DFM, so the latent factors, the parameters, and the missing data points are jointly estimated using the Kalman filter (see Durbin and Koopman (2012) for a textbook treatment).

3.3 State Space Representation and Estimation

The model features autocorrelated idiosyncratic components (see equation (4)). In order to cast it in state-space form, Banbura and Modugno (2014) suggest including these components as additional elements of the state vector. This solution has the undesirable feature that the number of state variables increases with the number of observables, leading to a loss of computational efficiency. We avoid this shortcoming by redefining the system for the monthly indicators in terms of quasi-differences (see e.g. Kim and Nelson 1999b, pp. 198-199, and Bai and Wang 2012). Specifically, defining ȳ_{i,t} ≡ ρ_i(L) y_{i,t} for i = 2, ..., n and ỹ_t = [y_{1,t}, ȳ_{2,t}, ...
, ȳ_{n,t}]′, the model can be compactly written in the following state-space representation:

  ỹ_t = H X_t + η̃_t,   (9)
  X_t = F X_{t−1} + e_t,   (10)

where the state vector stacks together the time-varying intercept, the factors, and the idiosyncratic component of GDP, as well as their lags required by equation (8). To be precise, X_t = [α_{1,t}, ..., α_{1,t−4}, f_t, ..., f_{t−m_p}, u_{1,t}, ..., u_{1,t−m_q}]′, where m_p = max(p, 4) and m_q = max(q, 4). Therefore the measurement errors, η̃_t′ = [0, η̄_t′] with η̄_t = [η_{2,t}, ..., η_{n,t}]′ ~ N(0, R_t), and the transition errors, e_t ~ N(0, Q_t), are not serially correlated. The system matrices H, F, R_t and Q_t depend on the hyperparameters of the DFM: λ, Φ, ρ, σ_{ε,t}, σ_{η_i,t}, ς_{α_1}, ς_ε, ς_η. Appendix C provides a detailed description of the state space system. The model is estimated with Bayesian methods, simulating the posterior distribution of parameters and factors using a Markov Chain Monte Carlo (MCMC) algorithm. Specifically, we extend the Gibbs-sampler algorithm for DFMs proposed by Bai and Wang (2012) to include mixed frequencies, the time-varying intercept, and SV.13 The stochastic volatilities are sampled using the approximation of Kim et al. (1998), which is considerably faster than the alternative Metropolis-Hastings algorithm of Jacquier et al. (2002). The sampling algorithm consists of the following six steps:14

Step 0: Initialize the model parameters at arbitrary starting values (λ⁰, Φ⁰, ρ⁰, σ⁰_{ε,t}, σ⁰_{η_i,t}, ς⁰_{α_1}, ς⁰_ε, ς⁰_η). Set j = 1.

Step 1: Draw the GDP trend growth as well as the latent factors conditional on the model parameters, i.e. obtain a draw of (α^j_{1,t}, f^j_t) from p(f_t, α_{1,t} | λ^{j−1}, Φ^{j−1}, ρ^{j−1}, σ^{j−1}_{ε,t}, σ^{j−1}_{η_i,t}, ς^{j−1}_{α_1}, ς^{j−1}_ε, ς^{j−1}_η; y).

Step 2: Draw the variance of the trend component, i.e. obtain a draw of ς^j_{α_1} from p(ς_{α_1} | λ^{j−1}, Φ^{j−1}, ρ^{j−1}, σ^{j−1}_{ε,t}, σ^{j−1}_{η_i,t}, ς^{j−1}_ε, ς^{j−1}_η, α^j_{1,t}, f^j_t; y).

Step 3: Draw the coefficients and volatilities of the factor VAR, i.e.
obtain a draw of (Φ^j, σ^j_{ε,t}, ς^j_ε) from p(Φ, σ_{ε,t}, ς_ε | λ^{j−1}, ρ^{j−1}, σ^{j−1}_{η_i,t}, ς^j_{α_1}, ς^{j−1}_η, α^j_{1,t}, f^j_t; y).

Step 4: Draw the factor loadings,15 i.e. obtain a draw of λ^j from p(λ | Φ^j, ρ^{j−1}, σ^j_{ε,t}, σ^{j−1}_{η_i,t}, ς^j_{α_1}, ς^j_ε, ς^{j−1}_η, α^j_{1,t}, f^j_t; y).

Step 5: Draw the coefficients and volatilities for the processes of the idiosyncratic components, i.e. obtain a draw of (ρ^j, σ^j_{η_i,t}, ς^j_η) from p(ρ, σ_{η_i,t}, ς_η | λ^j, Φ^j, σ^j_{ε,t}, ς^j_{α_1}, ς^j_ε, α^j_{1,t}, f^j_t; y).

Step 6: Increment j by 1 and iterate Steps 1 to 5 until convergence.

Footnote 13: Simulation algorithms in which the Kalman filter is used over thousands of replications frequently produce a singular covariance matrix due to the accumulation of rounding errors. Bai and Wang (2012) propose a modification of the well-known Carter and Kohn (1994) algorithm to prevent this problem, which improves computational efficiency and numerical robustness.

Footnote 14: The conditional posteriors p(·|·) are stated for a given prior, and our choice of priors is discussed in Section 4.1. More details on the Gibbs sampler can be found in Appendix C.

Footnote 15: Note that in this step we discard α_i for i ≠ 1 from equation (2), since the data y will be demeaned. This is equivalent to explicitly drawing α_i in Step 4 using a flat prior.

4 Evidence for US Data

4.1 Priors and Model Settings

In order to facilitate comparison with the existing literature, which has generally approached the estimation of DFMs from a classical perspective, we wish to impose as little prior information as possible. For that reason, in our baseline results we use uninformative priors for the factor loadings and the autoregressive coefficients of factors and idiosyncratic components. The variances of the innovations to the time-varying parameters, namely ς_{α,1}, ς_ε and ς_{η,i} in equations (5)-(7), are however difficult to identify from the information contained in the likelihood function alone.
As the literature on Bayesian VARs documents, attempts to use non-informative priors for these parameters will in many cases produce relatively high posterior estimates, i.e. a relatively large amount of time variation. While this tends to improve the in-sample fit of the model, it is also likely to worsen out-of-sample forecast performance. We therefore use priors that shrink these variances towards zero (i.e. towards the benchmark model, which excludes time-varying long-run GDP growth and SV), setting an inverse-gamma prior with one degree of freedom and scale 0.001 for $\varsigma_{\alpha_1}$, and one degree of freedom and scale 0.0001 for $\varsigma_{\varepsilon}$ and $\varsigma_{\eta}$. This choice is consistent with our desire to maximize out-of-sample forecast performance and avoid overfitting, also evident in our decision to allow for time variation only in those coefficients where we consider it strictly necessary. In our empirical application we set the number of lags in the polynomials $\Phi(L)$ and $\rho(L)$ to $p = 2$ and $q = 2$ respectively, in the spirit of Stock and Watson (1989). The model can easily be extended to include more lags in both the transition and measurement equations (i.e. to allow the factors to load on some variables with a lag). In the latter case, it is again sensible to avoid overfitting by choosing priors that shrink the additional lag coefficients towards zero (see e.g. D'Agostino et al., 2012, and Luciani and Ricci, 2014).

4.2 Data

A number of studies on DFMs, including Giannone et al. (2005), Banbura et al. (2012), Alvarez et al. (2012) and Banbura and Modugno (2014), highlight that the inclusion of nominal or financial variables, the consideration of disaggregated series beyond the main headline indicators, or the use of more than one factor does not meaningfully improve the precision of real GDP forecasts.
We follow them in focusing on a medium-sized panel of real activity data, including only the series for each economic category at the highest level of aggregation, and set the number of factors to k = 1.16 Our criterion for data selection is similar to the one proposed by Banbura et al. (2012), who suggest including the headline series that are followed closely by financial market participants.17 Our panel of 26 data series, shown in Table 1, covers all the main “hard” indicators in production, income, employment, sales, construction and international trade, as well as the most important “soft” or survey indicators of consumer and business confidence. Survey indicators prove to be very valuable for nowcasting, for three reasons. First, they are timely, as they provide the earliest signals at a stage in the quarter where little information about current activity is available and therefore uncertainty is highest. Second, they are relatively accurate and display a high degree of persistence, helping to identify the persistence of the common factor.18 And third, they are stationary by construction, which is informative for the extraction of the trend component in GDP.

Since we are interested in covering a long sample in order to study the fluctuations in long-run growth, we start our panel in January 1960. Here we take full advantage of the Kalman filter's ability to deal with missing observations at any point in the sample, and we are able to incorporate series such as the Markit Manufacturing PMI, which is closely watched by the market but starts as late as 2007.

16 The single factor can in this case be interpreted as a coincident indicator of economic activity (see e.g. Stock and Watson, 1989, and Mariano and Murasawa, 2003). Relative to the latter studies, which include just four and five indicators respectively, the conclusion of the literature is that adding additional indicators, in particular surveys, does improve the precision of GDP forecasts (Banbura et al., 2010).

17 In practice, we consider that a variable is widely followed by markets when survey forecasts of economists are available on Bloomberg prior to the release.

Table 1: Data series used in empirical analysis

                                          Freq.  Start Date  Transformation  Publ. Lag
Hard Indicators
Real GDP                                   Q     Q1:1960     % QoQ Ann.       26
Industrial Production                      M     Jan 60      % MoM            15
New Orders of Capital Goods                M     Mar 68      % MoM            25
Light Weight Vehicle Sales                 M     Feb 67      % MoM             1
Real Personal Consumption Exp.             M     Jan 60      % MoM            27
Real Personal Income less Trans. Paym.     M     Jan 60      % MoM            27
Real Retail Sales & Food Services          M     Jan 60      % MoM            15
Real Exports of Goods                      M     Feb 68      % MoM            35
Real Imports of Goods                      M     Feb 69      % MoM            35
Building Permits                           M     Feb 60      % MoM            19
Housing Starts                             M     Jan 60      % MoM            26
New Home Sales                             M     Feb 63      % MoM            26
Payroll Empl. (Establishment Survey)       M     Jan 60      % MoM             5
Civilian Empl. (Household Survey)          M     Jan 60      % MoM             5
Unemployed                                 M     Jan 60      % MoM             5
Initial Claims for Unempl. Insurance       M     Jan 60      % MoM             4

Soft Indicators
Markit Manufacturing PMI                   M     May 07      —                -7
ISM Manufacturing PMI                      M     Jan 60      —                 1
ISM Non-manufacturing PMI                  M     Jul 97      —                 3
Conf. Board: Consumer Confidence           M     Feb 68      Diff 12 M.       -5
U. of Michigan: Consumer Sentiment         M     May 60      Diff 12 M.      -15
Richmond Fed Mfg Survey                    M     Nov 93      —                -5
Philadelphia Fed Business Outlook          M     May 68      —                 0
Chicago PMI                                M     Feb 67      —                 0
NFIB: Small Business Optimism Index        M     Oct 75      Diff 12 M.       15
Empire State Manufacturing Survey          M     Jul 01      —               -15

Notes: The second column refers to the sampling frequency of the data, which can be quarterly (Q) or monthly (M). % QoQ Ann. refers to the quarter-on-quarter annualized growth rate; % MoM refers to (yt − yt−1)/yt−1; Diff 12 M. refers to yt − yt−12; — denotes no transformation. The last column shows the average publication lag, i.e. the number of days elapsed from the end of the period that the data point refers to until its publication by the statistical agency (negative lags indicate release before the end of the reference period). All series were obtained from the Haver Analytics database.
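The Kalman filter's treatment of missing data is mechanically simple: when an observation is absent, the update step is skipped and the prediction is carried forward with inflated uncertainty. A minimal univariate sketch under a local-level model (a drastic simplification of the paper's multivariate state space; the function and parameter names are ours):

```python
import numpy as np

def kalman_filter_missing(y, q, r, a0=0.0, p0=1e6):
    """Filter y_t = a_t + e_t, a_t = a_{t-1} + w_t, with var(e) = r and
    var(w) = q. NaN entries in y are treated as missing: the update step
    is skipped and only the prediction step is applied."""
    att = np.empty(len(y))
    ptt = np.empty(len(y))
    a, p = a0, p0
    for t, yt in enumerate(y):
        p = p + q                    # prediction step
        if not np.isnan(yt):         # update only when y_t is observed
            k = p / (p + r)
            a = a + k * (yt - a)
            p = (1.0 - k) * p
        att[t], ptt[t] = a, p
    return att, ptt

# A series with a gap in the middle, like an indicator that is
# temporarily unavailable (or one that starts late, padded with NaNs):
att, ptt = kalman_filter_missing(np.array([1.0, 1.0, np.nan, np.nan, 1.0]),
                                 q=0.1, r=1.0)
```

The filtered estimate stays finite throughout, while its variance grows over the gap and shrinks again once data resume, which is exactly the behavior that lets the model carry late-starting series such as the Markit PMI.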
4.3 In-Sample Results

We estimate the model with 7,000 replications of the Gibbs-sampler algorithm, of which the first 2,000 are discarded as burn-in draws and the remaining ones are kept for inference.19 Panel (a) of Figure 3 plots the median and the 68% and 90% posterior credible intervals of the long-run growth rate, together with, for comparison, the well-known estimate of potential growth produced by the Congressional Budget Office (CBO). Several features of our estimate of long-run growth are worth noting. An initial slowdown is visible around the late 1960's, close to the 1973 “productivity slowdown” (Nordhaus, 2004). The acceleration of the late 1990's and early 2000's associated with the productivity boom in the IT sector (Oliner and Sichel, 2000) is also clearly visible. Thus, until the middle of the 2000's, our estimate conforms well to the generally accepted narrative about fluctuations in potential growth. It must be noted, however, that according to our estimates, until the most recent part of the sample the historical average of 3.15% is always contained within the 90% credible interval. This is consistent with the fact that the break tests do not find significant breaks in 1973 or the mid-1990's.

18 A few of these surveys, in particular consumer confidence, appear to be better aligned with the rest of the variables after a 12-month difference transformation, a feature that is consistent with these indicators sometimes being regarded as leading rather than coincident.

19 Thanks to the efficient state-space representation discussed above, the improvements in the simulation smoother proposed by Bai and Wang (2012), and other computational improvements we implemented, the estimation is very fast. Convergence is achieved after only 1,500 iterations, which take less than 20 minutes in MATLAB on a standard Intel 3.6 GHz computer with 16GB of DDR3 RAM.
Finally, from its peak of about 3.25% in late 1998 to its level of 2.25% as of June 2014, the median estimate of the trend has declined by a full percentage point, a more substantial decline than the one observed after the original “productivity slowdown” of the 1970's. Moreover, the decline appears to have happened gradually since the start of the 2000's, with the reductions in trend growth clustered around two episodes: one around the middle of the decade, but before the Financial Crisis, and a second after the recession, at the beginning of the subsequent recovery.

Our estimate of long-run growth and the CBO's capture similar but not identical concepts. The CBO measures the growth rate of potential output (i.e. the level of output that could be obtained if all resources were used fully), whereas our estimate, similar to Beveridge and Nelson (1981), measures the component of the growth rate that is expected to be permanent. Moreover, the CBO estimate is constructed using the so-called “production function approach”,20 which is radically different from the DFM methodology. It is nevertheless interesting that, despite employing different statistical methods, the two approaches produce qualitatively similar results, with the CBO estimate displaying a more marked cyclical pattern but remaining for most of the sample within the 90% posterior credible interval of our estimate. The CBO's estimate also displays two slowdowns, although the timing of the second coincides with the recession. The CBO's estimate was significantly below ours immediately after the recession, reaching an unprecedented low level of about 1.25% in 2010, and has remained at the lower bound of our posterior estimate since then.21

20 Essentially, the production function approach calculates, using statistical filters, the trend components of the supply inputs to a neoclassical production function (the capital stock, total factor productivity, and the total amount of hours) and then aggregates them to obtain an estimate of the trend level of output. See CBO (2001).

Figure 3: US long-run growth estimate: 1960-2014 (% Annualised Growth Rate)
(a) Estimated long-run growth vs CBO estimate of potential growth
(b) Filtered estimates of long-run growth vs SPF survey
Note: Panel (a) plots the posterior median (solid red), together with the 68% and 90% (dashed blue) posterior credible intervals of long-run GDP growth. The black line is the CBO's estimate of potential growth. Shaded areas represent NBER recessions. In Panel (b), the solid gray line is the filtered estimate of the long-run GDP growth rate, $\hat{\alpha}_{1,t|t}$, using the vintage of National Accounts available as of mid-2014. The blue diamonds represent the real-time mean forecast from the Livingston Survey of Professional Forecasters of the average GDP growth rate for the subsequent 10 years.

Panel (b) of Figure 3 seems to indicate that the decline in trend growth happened gradually during the 2000's. It should be noted that the posterior estimates, $\hat{\alpha}_{t|T}$, are outputs of a Kalman smoother recursion, i.e. they are conditioned on the entire sample, so it is possible that our choice of modeling long-run GDP growth as a random walk is hard-wiring into our results the conclusion that the decline happened gradually. In experiments with simulated data, presented in Appendix B, we show that if the random walk assumption is wrong, and changes in trend growth instead occur in the form of discrete breaks, additional insights can be obtained from looking at the filtered estimates, $\hat{\alpha}_{1,t|t}$. Panel (b) displays the filtered estimate of long-run growth. The results broadly agree with the smoothed estimates, being consistent with a relatively gradual decline. The model's estimate declines from about 3.5% in the early 2000's to about 2.25% as of the middle of 2014.
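The distinction between the smoothed estimate $\hat{\alpha}_{t|T}$ and the filtered estimate $\hat{\alpha}_{t|t}$ can be reproduced in a toy setting. The sketch below runs a Kalman filter and a Rauch-Tung-Striebel backward pass on a univariate local-level model (our own simplification; the paper's smoother operates on the full multivariate state space):

```python
import numpy as np

def local_level_filter_smoother(y, q, r, a0=0.0, p0=1e6):
    """Kalman filter plus RTS smoother for y_t = a_t + e_t,
    a_t = a_{t-1} + w_t. Returns filtered a_{t|t} (real-time) and
    smoothed a_{t|T} (full-sample) estimates of the trend."""
    n = len(y)
    att = np.empty(n); ptt = np.empty(n)   # filtered mean / variance
    apr = np.empty(n); ppr = np.empty(n)   # one-step-ahead predictions
    a, p = a0, p0
    for t in range(n):
        p = p + q
        apr[t], ppr[t] = a, p              # a_{t|t-1}, P_{t|t-1}
        k = p / (p + r)
        a = a + k * (y[t] - a)
        p = (1.0 - k) * p
        att[t], ptt[t] = a, p
    asm = att.copy()                       # backward (RTS) pass
    for t in range(n - 2, -1, -1):
        g = ptt[t] / ppr[t + 1]
        asm[t] = att[t] + g * (asm[t + 1] - apr[t + 1])
    return att, asm

# A trend that breaks mid-sample: the smoothed path adjusts before the
# break (it sees the future data), the filtered path only afterwards.
y = np.concatenate([np.zeros(30), np.ones(30)])
att, asm = local_level_filter_smoother(y, q=0.01, r=1.0)
```

At the end of the sample the two estimates coincide by construction, which is why the real-time exercise of Section 4.4 focuses on the filtered estimates.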
Larger declines again appear to cluster around two dates: first around the mid-2000's, and second in the summer of 2011. As an additional external benchmark, Panel (b) also includes the real-time median forecast of average real GDP growth over the next ten years from the Livingston Survey of Professional Forecasters (SPF). It is noticeable that the SPF was substantially more pessimistic during the 1990's, and did not incorporate the substantial acceleration in trend growth due to the ‘New Economy’ until the end of the decade. From 2005 to about 2010, the two estimates are remarkably similar, showing a deceleration to about 2.75% as the productivity gains of the IT boom faded. This matches the narrative of Fernald (2014). Since then, the SPF forecast has remained relatively stable, whereas our model's estimate has declined by a further half percentage point. In essence, as of mid-2014, the SPF forecast had not incorporated the second decline in long-run growth identified by our model.

Panel (a) of Figure 4 presents the estimates of the SV of the common factor.22 The “Great Moderation” is clearly visible, with the average volatility pre-1985 being about twice the average of the post-1985 sample.

21 This is due to the fact that the CBO's method embeds an Okun's-law relationship, by which growth is above potential when the unemployment rate is declining. The strong decline in unemployment over the last few years has therefore led their estimate of the output gap to close partially. On the contrary, an output-gap-type measure, which can be obtained as a by-product of our model, remains large five years after the end of the recession, since the economy has not grown above trend for a sustained period. This is consistent with the narrative that attributes the decline in unemployment to declining labor force participation rather than to cyclical improvement (see, e.g., Erceg and Levin, 2013).
Notwithstanding the large increase in volatility during the Great Recession, our estimate of the common factor's volatility since then remains consistent with the “Great Moderation” still being in place, confirming the early evidence reported by Gadea-Rivas et al. (2014). It is clear from the figure that volatility tends to spike during recessions, a finding that brings our estimates close to the recent findings of Jurado et al. (2014) and Bloom (2014) on business-cycle uncertainty. It is interesting to note that while in our model the innovations to the level of the common factor and to its volatility are uncorrelated, the fact that increases in volatility are observed during recessions indicates a negative correlation between the first and second moments, implying negative skewness in the distribution of the common factor.23

Panels (b)-(j) of Figure 4 plot the posterior distribution of the standard deviation of the idiosyncratic components of selected variables. Several features are worth noting. On the one hand, there appear to be common trends in the idiosyncratic volatilities that are not captured by the volatility of the common factor. For instance, a reduction in volatility from the mid-1980s is clearly visible for many series, consistent with the “Great Moderation”. On the other hand, the volatility of some series, such as personal income and those related to the housing market, has increased in recent decades. These developments can be linked to specific events, such as the frequent changes in the tax code or the housing boom and bust in the 2000's.

22 To be precise, this is the square root of $\mathrm{var}(f_t) = \sigma_{\varepsilon,t}^2 (1 - \phi_2)/[(1 + \phi_2)((1 - \phi_2)^2 - \phi_1^2)]$. The standard deviations of the idiosyncratic components are calculated in an analogous way.

23 We believe a more explicit model of this feature is an important priority for future research.
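The unconditional-variance formula in footnote 22 can be checked numerically: holding $\sigma_{\varepsilon,t}$ fixed at a constant value (under SV the expression is applied period by period), the closed form must equal the fixed point of the companion-form Lyapunov equation. A sketch (parameter values illustrative):

```python
import numpy as np

def ar2_variance_closed_form(phi1, phi2, sigma2):
    """Unconditional variance of a stationary AR(2) with innovation
    variance sigma2, matching the expression in footnote 22."""
    return sigma2 * (1 - phi2) / ((1 + phi2) * ((1 - phi2) ** 2 - phi1 ** 2))

def ar2_variance_lyapunov(phi1, phi2, sigma2, iters=500):
    """Same quantity via iterating the companion-form Lyapunov equation
    P = A P A' + Q to its fixed point."""
    A = np.array([[phi1, phi2], [1.0, 0.0]])
    Q = np.array([[sigma2, 0.0], [0.0, 0.0]])
    P = np.zeros((2, 2))
    for _ in range(iters):
        P = A @ P @ A.T + Q
    return P[0, 0]

v_cf = ar2_variance_closed_form(0.5, 0.2, 1.0)
v_ly = ar2_variance_lyapunov(0.5, 0.2, 1.0)
```

The two agree to machine precision for any stationary $(\phi_1, \phi_2)$; note that for a persistent factor the unconditional variance exceeds the innovation variance.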
Moreover, as with the volatility of the common factor, many of the idiosyncratic volatilities display sharp increases during recessions.24

To sum up, we have presented evidence that a significant decline in long-run US GDP growth occurred over the last decade. Our results are consistent with a relatively gradual decline between the early 2000's and the beginning of the recovery from the Great Recession. Both smoothed and filtered estimates appear to show that the largest declines cluster around two episodes: first, a slowdown from the elevated levels of growth at the turn of the century, before the recession and consistent with the narrative of Fernald (2014) on the fading of the IT productivity boom; and second, a continuation of the slowdown after the recession, with further declines detected around 2010-2011. The recent decline is larger in magnitude than the fluctuations in trend growth of the early 1970's and the late 1990's.

4.4 Real-Time Evidence of Changes in Long-Run Growth

The smoothed and filtered estimates of Figure 3 indicate that a slowdown occurred during the 2000's. As is well known, however, macroeconomic time series are revised (sometimes heavily) over time, and in many cases these revisions contain valuable information that was not available at the initial release. It is therefore possible that our results are only apparent in the current vintage of data, and that our model would not have been able to detect the slowdown as it happened. To address this concern, we reconstruct our dataset at each point in time, using vintages of data available from the Federal Reserve Bank of St. Louis ALFRED database. Our aim is to replicate as closely as possible the situation of a forecaster who would

24 Our estimates suggest that there is a factor structure within the volatilities. This issue has been investigated in the context of VARs with SV by de Wind and Gambetti (2014), but to the best of our knowledge it has not been applied to DFMs.
have applied our model in real time. Our evaluation sample starts on 11 January 2000 and ends on 22 September 2014. This is the longest possible sample for which we are able to include the entire panel of Table 1 using fully real-time data.25 Full details about the construction of the database are available in Appendix E. From the first date of the evaluation sample, we proceed by re-estimating the model each day on which new data are released.

25 In a few cases new indicators were developed after January 2000. For example, the Markit Manufacturing PMI survey is currently one of the most timely and widely followed indicators, but it started being conducted in 2007. In those cases, we sequentially apply the selection criterion of Banbura et al. (2012) and append to the panel, in real time, the vintages of the new indicators as soon as Bloomberg surveys of forecasters are available. In the example of the PMI, surveys appear in Bloomberg from mid-2012. By implication, the number of indicators in our data panel grows when new indicators appear.

Figure 4: Stochastic Volatility of Common Factor and Selected Idiosyncratic Components
(a) Common Factor; (b) GDP; (c) Industrial Production; (d) Housing Starts; (e) Consumption; (f) Income; (g) Retail Sales; (h) Employment; (i) ISM Manufacturing; (j) Consumer Confidence
Note: Each panel presents the median (red) and the 68th (solid blue) and 90th (dashed blue) percentiles of the posterior credible intervals of the volatility of (a) the common factor and (b)-(j) the idiosyncratic components of selected variables (see footnote 22). Shaded areas represent NBER recessions. Similar charts for other variables are available upon request.

Figure 5: Long-Run GDP Growth Estimates in Real Time
(Time-varying mean with 68th and 90th percentiles; constant mean estimated in real time)
Note: The shaded areas represent the 68th and 90th percentiles, together with the median, of the posterior distribution of the current value of long-run GDP growth, re-estimated daily from January 2000 to September 2014 using the vintage of data available at each point in time. The recursive mean is the contemporaneous estimate of the historical average GDP growth rate with the start of the sample fixed at Q1:1960.

Figure 5 presents the model's real-time assessment of the posterior distribution of long-run growth at each point in time. The real-time recursive mean of GDP growth is also plotted; this is equivalent to the estimate of long-run growth from a model with no time-varying intercept for GDP. From the start of the sample in 2000 to around mid-2005, the median estimate of long-run growth is somewhat above the historical mean, fluctuating between 3.5% and 3.75%. As discussed in the previous section, this period of high trend growth is usually associated with the effects of the productivity boom in the IT and related sectors of the late 1990's.
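The "recursive mean" benchmark of Figure 5 is simply an expanding-window average with the sample start held fixed; a one-line sketch (the numbers are illustrative, not the actual data):

```python
import numpy as np

def recursive_mean(x):
    """Expanding-window mean: the 'constant mean estimated in real time'
    of Figure 5, with the start of the sample held fixed."""
    return np.cumsum(x) / np.arange(1, len(x) + 1)

# A long stretch of 3% growth followed by a shift down to 1%:
x = np.array([3.0, 3.0, 3.0, 1.0, 1.0])
recursive_mean(x)  # -> array([3. , 3. , 3. , 2.5, 2.2])
```

The recursive mean dilutes a level shift across the whole history, which is why it adjusts far more slowly than the time-varying intercept when trend growth changes.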
Between mid-2005 and end-2007 the growth rate starts declining, reaching 3% by the time the Great Recession starts, although a 90% credible interval for long-run growth still contains the historical average. The estimate then stays quite stable, not being meaningfully affected by the events of 2008-2009. From the summer of 2010, however, the estimate declines further, and after a large drop in July 2011 the median estimate fluctuates between 2.25% and 2.5%. It is clear from Figure 5 that by July 2011 there was sufficient evidence to conclude that a significant decline in long-run growth had occurred.

Figure 6 looks in more detail at the specific information that, in real time, led the model to reassess its estimate of long-run growth. While data since the middle of 2010 had already started shifting the estimate down, the publication by the Bureau of Economic Analysis of the annual revisions to the National Accounts on July 29th, 2011 was critical for the model's assessment. The vintages after that date show not only that the contraction in GDP during the recession was much larger than previously estimated but also, more importantly, that the average growth rate of the subsequent recovery was revised down from 2.8% to about 2.4%. In short, by the end of July 2011 (well before structural break tests arrived at the same conclusion), enough evidence had accumulated about the weakness of the recovery to shift the model's estimate of long-run growth down by almost half a percentage point.

Figure 6: Impact of July 2011 National Account Revisions
(a) Evolution of the Annual 2011 GDP Forecast (annual forecast vs. the counterfactual excluding changes in trend)
(b) National Accounts Revisions around July 2011 (June 2011 vintage vs. July 2011 vintage)
Note: Panel (a) shows the evolution of the annual GDP growth forecast for 2011 produced by the model. The light dashed line represents the counterfactual forecast that would result from shutting down fluctuations in long-run growth. The vertical line marks the release of the National Accounts revisions on 29th July 2011. Panel (b) shows the level of real GDP (in bn USD) before and after the revisions.

4.5 Implications for Tracking GDP in Real Time

The standard DFM with constant long-run growth and constant volatility has been applied very successfully to produce current-quarter nowcasts of GDP (see Banbura et al., 2010 for a survey). As we discussed in Section 3.1, declines in trend growth are likely to lead to biased forecasts even in the short term; on the other hand, it is possible that the addition of time-varying components increases estimation uncertainty, leading to worse overall forecast performance. In this section we evaluate whether the proposed model improves current- and next-quarter forecasts of US GDP growth in real time.

Using our real-time database of US vintages, we proceed by re-estimating the following three models each day on which new data are released: a benchmark with constant long-run GDP growth and constant volatility (Model 0, similar to Banbura and Modugno, 2014), a version with constant long-run growth but with stochastic volatility (Model 1, similar to Marcellino et al., 2013), and the baseline model put forward in this paper, with both time variation in the long-run growth of GDP and SV (Model 2). Allowing for an intermediate benchmark with only SV lets us evaluate how much of the improvement can be attributed to the long-run variation in GDP as opposed to the SV. This is especially relevant for the evaluation of density nowcasts.
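A standard way to test differences in forecast accuracy between such models is the Diebold-Mariano statistic with the Harvey et al. (1997) small-sample correction; a sketch of that statistic (our own implementation, using a rectangular-kernel long-run variance with h−1 autocovariances):

```python
import numpy as np

def dm_hln_stat(e1, e2, h=1, power=2):
    """Diebold-Mariano statistic with the Harvey-Leybourne-Newbold (1997)
    small-sample correction, comparing losses |e1|^power vs |e2|^power
    for h-step-ahead forecast errors e1, e2. The corrected statistic is
    compared against a Student-t distribution with T-1 degrees of freedom.
    Positive values indicate the second forecast is more accurate."""
    d = np.abs(e1) ** power - np.abs(e2) ** power   # loss differential
    T = len(d)
    dbar = d.mean()
    # long-run variance of dbar (rectangular kernel, h-1 autocovariances)
    gamma = [np.sum((d[k:] - dbar) * (d[:T - k] - dbar)) / T for k in range(h)]
    lrv = gamma[0] + 2 * sum(gamma[1:])
    dm = dbar / np.sqrt(lrv / T)
    correction = np.sqrt((T + 1 - 2 * h + h * (h - 1) / T) / T)
    return correction * dm

# Model A's errors are uniformly larger than Model B's:
e1 = np.array([1.0, 2.0, 1.5, 2.5, 1.0, 2.0])
e2 = np.array([0.5, 1.0, 0.5, 1.0, 0.5, 1.0])
stat = dm_hln_stat(e1, e2)
```

As the paper notes, with nested models, expanding-window estimation and revised data this should be read as a rough gauge rather than an exact test.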
In particular, we evaluate point and density forecast accuracy relative to the initial (“Advance”) release of GDP, which is published between 25 and 30 days after the end of the reference quarter.26 When comparing the three different models, we test the significance of any improvement of Models 1 and 2 relative to Model 0. This raises some important econometric complications, given that (i) the three models are nested, (ii) the forecasts are produced using an expanding window, and (iii) the data used are subject to revision. These three issues imply that commonly used test statistics for forecast accuracy, such as those proposed by Diebold and Mariano (1995) and Giacomini and White (2006), will have a non-standard limiting distribution. Rather than reporting no test at all, however, we follow the “pragmatic approach” of Faust and Wright (2013) and Groen et al. (2013), who build on Monte Carlo results in Clark and McCracken (2012). Their results indicate that the Harvey et al. (1997) small-sample correction of the Diebold and Mariano (1995) statistic delivers a correctly sized test of the null hypothesis of equal finite-sample forecast precision for both nested and non-nested models, including cases with expanding-window model updating. Overall, the results of the tests should be interpreted more as a rough gauge of the significance of the improvement than as a definitive answer to the question.

We compute various point and density forecast accuracy measures at different moments in the release calendar, to assess how the arrival of information improves the performance of the model. In particular, we produce forecasts starting 180 days before the end of the reference quarter and on every subsequent day up to 25 days after its end, when the GDP figure for the quarter is usually released. This means that we evaluate forecasts of the next quarter, the current quarter (nowcast), and the previous quarter (backcast). We consider two different samples for the evaluation: the full sample (2000:Q1-2014:Q2) and the sample covering the recovery since the Great Recession (2009:Q2-2014:Q2).

26 We have explored the alternative of evaluating the forecasts against subsequent releases, or against the latest available vintages. The relative performance of the three models is broadly unchanged, but all models do better at forecasting the initial release. If the objective is to improve the performance of the model relative to the first official release, then ideally an explicit model of the revision process would be desirable. The results are available upon request.

4.5.1 Point Forecast Evaluation

Figure 7 shows the results of evaluating the posterior mean as a point forecast. We use two criteria: the root mean squared error (RMSE) and the mean absolute error (MAE). As expected, both of these decline as the quarter advances and more information from monthly indicators becomes available (see e.g. Banbura et al., 2012). Both the RMSE and the MAE of Model 2 are lower than those of Model 0 starting 30 days before the end of the reference quarter, while Model 1 is somewhat worse overall. Although our gauge of significance indicates that these differences are not significant at the 10% level for the overall sample, the improvement in performance is much clearer in the recovery sample. In fact, the inclusion of the time-varying long-run component of GDP helps anchor GDP predictions at a level consistent with the weak recovery experienced in the past few years, and produces nowcasts that are ‘significantly’ superior to those of the reference model from around 30 days before the end of the reference quarter. In essence, ignoring the variation in long-run GDP growth would have resulted in forecasts that were on average around 1 percentage point too optimistic from 2009 to 2014.
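Both point-forecast criteria are straightforward; for completeness (illustrative numbers, not the paper's results):

```python
import numpy as np

def rmse(forecasts, actuals):
    """Root mean squared error of a sequence of point forecasts."""
    err = np.asarray(forecasts) - np.asarray(actuals)
    return np.sqrt(np.mean(err ** 2))

def mae(forecasts, actuals):
    """Mean absolute error of a sequence of point forecasts."""
    return np.mean(np.abs(np.asarray(forecasts) - np.asarray(actuals)))

# A persistently optimistic forecaster: always 2 points too high.
f = np.array([3.0, 3.0]); a = np.array([1.0, 1.0])
```

A systematic bias, like the roughly 1-percentage-point optimism discussed above, inflates both measures by the full size of the bias, since neither criterion nets out errors of opposite sign.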
Figure 7: Point Forecast Accuracy Evaluation
(a) Root Mean Square Error; (b) Mean Absolute Error. Each panel compares Model 0, Model 1 and Model 2 over the full sample (2000:Q1-2014:Q2) and the recovery sample (2009:Q2-2014:Q2), across forecast horizons. [Figure omitted.]
Note: The horizontal axis indicates the forecast horizon, expressed as the number of days to the end of the reference quarter. Thus, from the point of view of the forecaster, forecasts produced 180 to 90 days before the end of a given quarter are forecasts of the next quarter; forecasts produced 90 to 0 days before are nowcasts of the current quarter; and forecasts produced 0 to 25 days after the end of the quarter are backcasts of the last quarter. The boxes below each panel display, with a vertical tick mark, a gauge of statistical significance at the 10% level of any difference with Model 0, for each forecast horizon, as explained in the main text.

4.5.2 Density Forecast Evaluation

Density forecasts can be used to assess the ability of a model to predict unusual developments, such as the likelihood of a recession or a strong recovery given current information. The adoption of a Bayesian framework allows us to produce density forecasts from the DFM that consistently incorporate both filtering and estimation uncertainty. Figure 8 reports the probability integral transforms (PITs) for the three models, calculated with the nowcast of the last day of the quarter. Diebold et al. (1998) highlight that well-calibrated densities are associated with uniformly distributed PITs. Figure 8 suggests that the inclusion of SV is essential for obtaining well-calibrated densities, whereas the inclusion of the trend helps obtain a more appropriate representation of the right-hand side of the distribution.

Figure 8: Probability Integral Transform (PITs)
(a) Model 0; (b) Model 1; (c) Model 2. [Figure omitted.]
Note: The figure displays the cdf of the Probability Integral Transforms (PITs) evaluated on the last day of the reference quarter.

There are several measures available for density forecast evaluation. The (average) log score, i.e. the logarithm of the predictive density evaluated at the realization, is one of the most popular, rewarding the model that assigns the highest probability to the realized events. Gneiting and Raftery (2007), however, caution against using the log score, emphasizing that it does not appropriately reward values from the predictive density that are close but not equal to the realization, and that it is very sensitive to outliers. They therefore propose the use of the (average) continuous rank probability score (CRPS) in order to address these drawbacks of the log score.

Figure 9: Density Forecast Accuracy Evaluation
(a) Log Probability Score; (b) Continuous Rank Probability Score. Each panel compares Model 0, Model 1 and Model 2 over the full sample (2000:Q1-2014:Q2) and the recovery sample (2009:Q2-2014:Q2), across forecast horizons. [Figure omitted.]
Note: See the note to Figure 7 for the definition of the horizontal axis and of the significance gauges below each panel.

Figure 9 shows that by both measures our model outperforms its counterparts. Interestingly, the comparison of Model 1 and Model 2 suggests that failing to properly account for the long-run growth component might give a misrepresentation of the GDP densities, resulting in poorer density forecasts.27

In sum, the results indicate that a model that allows for time-varying long-run GDP growth (together with SV) produces short-run forecasts that are, on average over the full evaluation sample, either similar to or better than those of the benchmark model. The performance tends to improve substantially in the sub-sample covering the recovery from the Great Recession, coinciding with the significant downward revision of the model's assessment of long-run growth. In other words, the addition of time-varying long-run growth does not worsen the forecasting performance when long-run growth is relatively stable, but provides a substantial improvement when the trend is shifting. It therefore provides a robust and timely methodology to track GDP when long-run growth is uncertain.
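For concreteness, both evaluation tools used in this section can be computed directly from Monte Carlo draws of a predictive density. The following sketch is our own illustration (not the code used in the paper) of the standard sample-based formulas for the PIT and for the CRPS of Gneiting and Raftery (2007):

```python
import numpy as np

def pit(draws, realization):
    """Probability integral transform: share of predictive draws
    at or below the realized value."""
    return np.mean(np.asarray(draws) <= realization)

def crps(draws, realization):
    """Sample-based CRPS: E|X - y| - 0.5 * E|X - X'|, where X, X' are
    independent draws from the predictive density and y the realization."""
    draws = np.asarray(draws, dtype=float)
    term1 = np.mean(np.abs(draws - realization))
    term2 = 0.5 * np.mean(np.abs(draws[:, None] - draws[None, :]))
    return term1 - term2

# Illustration with a synthetic predictive density for GDP growth
rng = np.random.default_rng(0)
draws = rng.normal(2.0, 1.0, size=2000)
pit_val = pit(draws, 2.0)    # roughly 0.5 for a density centred on the realization
crps_val = crps(draws, 2.0)  # in the units of the forecast variable; lower is better
```

A well-calibrated model produces PITs that are approximately uniform across the evaluation sample, which is what the cdfs in Figure 8 are checking against the 45-degree line.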
4.6 Robustness

4.6.1 Alternative Priors and Model Settings

Recall that in our main results we use non-informative priors for the loadings and serial correlation coefficients, and a conservative prior for the time variation in the long-run growth of GDP and the stochastic volatilities. The non-informative priors were motivated by our desire to make our estimates more comparable to the existing literature, but here we also consider a "Minnesota"-style prior on the autoregressive coefficients of the factor,28 as well as shrinking the coefficients of the serial correlation towards zero. The motivation for these priors is to express a preference for a more parsimonious model in which the factors capture the bulk of the persistence of the series and the idiosyncratic components are close to iid, that is, closer to true measurement error. These alternative priors do not meaningfully affect the posterior estimates of our main objects of interest.29

As for the prior on the amount of variation in long-run growth, our choice of a conservative prior was motivated by our desire to minimize overfitting, especially given that the GDP series, being observed only at the quarterly frequency, has one third of the observations of the monthly variables. Other researchers might, however, believe that the permanent component of GDP growth is more variable, so we re-estimated the model using a larger prior for the variance of its time variation (i.e. ς_{α,1} = 0.01). As a consequence, the long-run component displays more pronounced cyclical variation, but it is estimated less accurately. Interestingly, the increase in the variance and cyclicality of the long-run growth component brings it closer to the CBO estimate.

4.6.2 Common Long-Run Components

In Section 3.1 we argue that to ensure unbiased forecasts of GDP, and to retain a parsimonious model, it is sufficient to include time variation in the long-run component of GDP growth only. Furthermore, keeping the number of time-varying coefficients to a minimum ensures that in-sample fit is not improved at the cost of out-of-sample forecasting performance. Nevertheless, for theoretical reasons, it might be desirable to impose that the drift in long-run GDP growth is shared by other series, such as consumption and real income.30 Given that a large strand of the literature has considered that consumption and income might share a common trend, we re-estimate the model allowing for the possibility that the GDP long-run growth component also loads on the consumption and income series.31 The resulting posterior estimate of the long-run component shares the broad movements of the baseline specification.32 In fact, the uncertainty around it is somewhat reduced, a natural consequence of incorporating information from many series as opposed to only one.

27 We also assess how the three models fare when different areas of their predictive densities are emphasized in the forecast evaluation. We follow Groen et al. (2013) and compute weighted averages of the Gneiting and Raftery (2007) quantile scores (QS) that are based on quantile forecasts corresponding to the predictive densities from the different models. Our results, available in Appendix D, indicate that while there is an improvement in density nowcasting for the entire distribution, the largest improvement comes from the right tail.
28 To be precise, we center the prior on the first lag around 0.9 and the subsequent lags at zero.
29 There is some evidence that the use of these priors might at times improve the convergence of the algorithm. Moreover, in Section 5 we apply the model to the other G7 economies. We find that for some countries where fewer monthly indicators are available, shrinking the serial correlations of the idiosyncratic components towards zero helps obtain a common factor that is persistent.
Interestingly, the movements in the trend appear to be more pronounced in this specification, and the entire decline in trend growth is estimated to have happened before the Great Recession. We see the above robustness check as an encouragement to use our methodology more broadly as a tool to track macroeconomic data in ways that are informed by theoretical considerations. We further explore this idea in the next section.

5 Long-Run Growth Fluctuations in the G7

So far we have focused our discussion on the US economy, where long-run growth had been remarkably stable until the turn of the century, when the decline documented in this paper started occurring. The analysis of data from other industrialized economies indicates that the postwar stability of the US long-run growth rate is the exception rather than the rule.

30 To confirm the theoretical argument that both consumption and income are part of GDP and should therefore have similar long-run properties, we apply the Bai and Perron methodology to these series. Perhaps not surprisingly, the likely dates of the breaks in these series coincide with the ones we found in the GDP series, 1973 and 2000.
31 For instance, Cochrane (1994), Harvey and Stock (1988) and Cogley (2005) have argued that incorporating information about consumption is informative about the permanent component of GDP. Importantly, consumption of durable goods should be excluded, given that the ratio of their price index to the GDP deflator exhibits a downward trend, and therefore the chained quantity index grows faster than overall GDP. Following Whelan (2003), for this section we construct a Fisher index of nondurables and services and use its growth rate as an observable variable in the panel.
32 See Appendix D for the results.
Figure 10 plots the results of re-estimating our model with data for each of the other G7 economies.33 We plot the posterior estimate of long-run growth and, for comparison, the OECD's estimate of potential growth.34 Large fluctuations in trend growth are apparent for most of the countries: during the 1960s, Germany, France and Italy were growing by 4% on a sustained basis, while Japan was growing by about 7%. These high growth rates were a consequence of the need to rebuild the capital stock after the destruction of World War II, and were bound to end as the continental European and Japanese economies converged towards US levels of output per capita. The UK and Canada display a more stable profile, similar to that of the US, although the slowdown of the 1970's is more clearly visible in the Canadian data. Again, similarly to the US, an acceleration in the late 1990's and a subsequent slowdown in the mid-2000's is observed. It is interesting to note that when the Bai and Perron (1998) test is applied to the other G7 economies, breaks are detected for most countries, and once again the breaks are clustered around two episodes, the early 1970's and the early 2000's. An exception is the UK, where no break is detected but our model estimates a substantial decline in trend growth, an issue that has been extensively discussed in UK policy circles. Our real-time application to the US suggests that it is possible that our model is detecting the decline in long-run growth earlier than the break test.

Figure 10: Posterior Estimate of the Long-Run GDP Growth Rate for the Other G7 Economies
(a) United Kingdom; (b) Canada; (c) Japan; (d) Germany; (e) France; (f) Italy. [Figure omitted.]
Notes: Each panel presents the median (red), the 68th (solid blue) and the 90th (dashed blue) percentiles of the posterior credible intervals of the long-run GDP growth rate. The green diamonds represent the OECD's estimate of potential output growth.

By applying a simple accounting identity, we can take a first careful step towards giving a more structural interpretation of the estimated fluctuations in long-run growth. By this identity, the growth rate of output is equal to the sum of the growth rates of labor productivity, hours per capita and population. This exercise reveals, as expected, that secular movements in the growth rate of productivity are behind the bulk of the changes in long-run growth. Sometimes, however, other factors can play an important role. In Japan and continental Europe, population growth slowed down over the 1960-2014 sample, making a negative contribution to overall growth in the last decade in Germany and Japan, and a contribution close to zero in France and Italy. In the US, the productivity slowdown of the early 1970s was partially masked by increases in hours per capita (mainly a result of rising female labor force participation) and population growth during the period 1973-2005, resulting in a broadly stable long-run growth rate of about 3% that has come to be regarded as a stylized fact of the US economy. However, as pointed out by Gordon (2014), in recent years the growth rates of productivity, hours per capita and population have all slowed down persistently, leading to a decline in long-run GDP growth significant enough to be detected by structural break tests. It is worth noting that the weakness of productivity in recent years is not confined to the US economy.

33 Details on the specific data series used are available in Appendix E.
34 Like the CBO's estimate for the US, the OECD measure is calculated using the production function approach.
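In terms of log differences, the accounting identity invoked above can be written as follows (the symbols are our own shorthand, not notation introduced in the paper):

```latex
\Delta \ln Y_t \;=\;
\underbrace{\Delta \ln \!\left( Y_t / H_t \right)}_{\text{labor productivity}}
\;+\;
\underbrace{\Delta \ln \!\left( H_t / N_t \right)}_{\text{hours per capita}}
\;+\;
\underbrace{\Delta \ln N_t}_{\text{population}} ,
```

where $Y_t$ denotes real output, $H_t$ aggregate hours worked and $N_t$ population. The identity holds exactly, since the terms on the right-hand side telescope to $\Delta \ln Y_t$.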
Figure 11 illustrates this point by plotting the five-year centered moving average of the growth in real output per hour worked. A marked slowdown is visible in all countries, and all of the areas are experiencing productivity growth below 1%, an unprecedented phenomenon in the postwar period. The case of the UK, where measured productivity growth has been slightly negative since the Financial Crisis, is particularly striking. The coincidence in the timing of the current productivity slowdown across countries suggests a common driving factor, and a more 'structural' interpretation of the decline in productivity growth remains an interesting open question, which we leave for further research. Nevertheless, the possibility that trends in productivity growth are currently substantially different from what they were historically highlights the need to take uncertainty about long-run growth seriously at the present juncture.

Figure 11: Labor Productivity Growth, % 5-Year Centered Moving Average
(Series: CA, JP, Europe, UK, US.) [Figure omitted.]
Note: Labor productivity is calculated as the ratio of total real output to aggregate hours worked. 'Europe' refers to the GDP-weighted average of Germany, France and Italy. Sources: Conference Board Total Economy Database, IMF World Economic Outlook.

6 Concluding Remarks

The sluggish recovery of the US economy in the aftermath of the Great Recession raises the question of whether the long-run growth rate of GDP is now lower than it has been on average over the postwar period. Given that taking into account changes in long-run growth is of crucial importance for tracking GDP, we extend a standard dynamic factor model used for nowcasting GDP to allow for both changes in long-run GDP growth and stochastic volatility. Estimating the model with Bayesian methods, we provide compelling evidence that the long-run growth of US GDP displays a gradual decline after the turn of the century, moving from a peak of 3.5% to 2.25% in 2014.
In a real-time out-of-sample evaluation exercise using vintages of US data, we show that our specification of the model outperforms the standard version in both point and density forecasts/nowcasts. The improvements mainly derive from the inclusion of a time-varying long-run GDP growth rate and are most substantial for the recovery period. In essence, the addition of time-varying long-run growth does not worsen the forecasting performance when the long-run growth rate is relatively stable, but provides a substantial improvement when it changes over time.

Finally, applying the model to the other G7 economies enables us to take a careful step towards investigating the drivers behind long-run changes in GDP growth. We conjecture that the decline in long-run growth is consistent with the narrative of Fernald (2014), who attributes the phenomenon to a slowdown in productivity. More work on 'structural' interpretations of our results is a promising avenue for further research.

References

Ahmed, S., Levin, A., and Wilson, B. A. (2004). Recent U.S. Macroeconomic Stability: Good Policies, Good Practices, or Good Luck? The Review of Economics and Statistics, 86(3):824-832.

Alvarez, R., Camacho, M., and Perez-Quiros, G. (2012). Finite sample performance of small versus large scale dynamic factor models. CEPR Discussion Papers 8867, C.E.P.R. Discussion Papers.

Bai, J. and Perron, P. (1998). Estimating and testing linear models with multiple structural changes. Econometrica, 66(1):47-68.

Bai, J. and Wang, P. (2012). Identification and estimation of dynamic factor models. Working Paper.

Banbura, M., Giannone, D., Modugno, M., and Reichlin, L. (2012). Now-casting and the real-time data flow. Working Papers ECARES 2012-026, ULB - Universite Libre de Bruxelles.

Banbura, M., Giannone, D., and Reichlin, L. (2010). Nowcasting. CEPR Discussion Papers 7883, C.E.P.R. Discussion Papers.

Banbura, M. and Modugno, M. (2014).
Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. Journal of Applied Econometrics, 29(1):133-160.

Beveridge, S. and Nelson, C. R. (1981). A new approach to decomposition of economic time series into permanent and transitory components with particular attention to measurement of the business cycle. Journal of Monetary Economics, 7(2):151-174.

Bloom, N. (2014). Fluctuations in Uncertainty. Journal of Economic Perspectives, 28(2):153-76.

Carter, C. K. and Kohn, R. (1994). On Gibbs Sampling for State Space Models. Biometrika, 81(3):541-553.

CBO (2001). CBO's Method for Estimating Potential Output: An Update. Working papers, The Congress of the United States, Congressional Budget Office.

Clark, P. K. (1987). The Cyclical Component of U.S. Economic Activity. The Quarterly Journal of Economics, 102(4):797-814.

Clark, T. E. and McCracken, M. W. (2012). In-sample tests of predictive ability: A new approach. Journal of Econometrics, 170(1):1-14.

Cochrane, J. H. (1994). Permanent and Transitory Components of GNP and Stock Prices. The Quarterly Journal of Economics, 109(1):241-65.

Cogley, T. (2005). How fast can the new economy grow? A Bayesian analysis of the evolution of trend growth. Journal of Macroeconomics, 27(2):179-207.

Cogley, T. and Sargent, T. J. (2005). Drift and Volatilities: Monetary Policies and Outcomes in the Post WWII U.S. Review of Economic Dynamics, 8(2):262-302.

Creal, D., Koopman, S. J., and Zivot, E. (2010). Extracting a robust US business cycle using a time-varying multivariate model-based bandpass filter. Technical Report 4.

D'Agostino, A., Giannone, D., and Lenza, M. (2012). The Bayesian dynamic factor model. Technical report, ULB - Universite Libre de Bruxelles.

de Wind, J. and Gambetti, L. (2014). Reduced-rank time-varying vector autoregressions. CPB Discussion Paper 270, CPB Netherlands Bureau for Economic Policy Analysis.

Del Negro, M. and Otrok, C. (2008).
Dynamic factor models with time-varying parameters: measuring changes in international business cycles.

Diebold, F. X., Gunther, T. A., and Tay, A. S. (1998). Evaluating Density Forecasts with Applications to Financial Risk Management. International Economic Review, 39(4):863-83.

Diebold, F. X. and Mariano, R. S. (1995). Comparing Predictive Accuracy. Journal of Business & Economic Statistics, 13(3):253-63.

Doz, C., Giannone, D., and Reichlin, L. (2012). A Quasi-Maximum Likelihood Approach for Large, Approximate Dynamic Factor Models. The Review of Economics and Statistics, 94(4):1014-1024.

Durbin, J. and Koopman, S. J. (2012). Time Series Analysis by State Space Methods: Second Edition. Oxford University Press.

Erceg, C. and Levin, A. (2013). Labor Force Participation and Monetary Policy in the Wake of the Great Recession. CEPR Discussion Papers 9668, C.E.P.R. Discussion Papers.

Evans, M. D. D. (2005). Where Are We Now? Real-Time Estimates of the Macroeconomy. International Journal of Central Banking, 1(2).

Faust, J. and Wright, J. H. (2013). Forecasting inflation. In Elliott, G. and Timmermann, A., editors, Handbook of Economic Forecasting, volume 2.

Fernald, J. (2014). Productivity and potential output before, during, and after the great recession. NBER Macroeconomics Annual 2014, 29.

Fernald, J. G. and Jones, C. I. (2014). The Future of US Economic Growth. American Economic Review, 104(5):44-49.

Fleischman, C. A. and Roberts, J. M. (2011). From many series, one cycle: improved estimates of the business cycle from a multivariate unobserved components model. Finance and Economics Discussion Series 2011-46, Board of Governors of the Federal Reserve System (U.S.).

Forni, M., Giannone, D., Lippi, M., and Reichlin, L. (2009). Opening The Black Box: Structural Factor Models With Large Cross Sections. Econometric Theory, 25(05):1319-1347.

Gadea-Rivas, M. D., Gomez-Loscos, A., and Perez-Quiros, G. (2014). The Two Greatest. Great Recession vs. Great Moderation.
Working Paper Series 1423, Bank of Spain.

Geweke, J. (1977). The dynamic factor analysis of economic time series. In Latent Variables in Socio-Economic Models. North-Holland.

Giacomini, R. and White, H. (2006). Tests of Conditional Predictive Ability. Econometrica, 74(6):1545-1578.

Giannone, D., Reichlin, L., and Sala, L. (2006). VARs, common factors and the empirical validation of equilibrium business cycle models. Journal of Econometrics, 132(1):257-279.

Giannone, D., Reichlin, L., and Small, D. (2005). Nowcasting GDP and inflation: the real-time informational content of macroeconomic data releases. Technical report.

Giannone, D., Reichlin, L., and Small, D. (2008). Nowcasting: The real-time informational content of macroeconomic data. Journal of Monetary Economics, 55(4):665-676.

Gneiting, T. and Raftery, A. E. (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102:359-378.

Gneiting, T. and Ranjan, R. (2011). Comparing Density Forecasts Using Threshold- and Quantile-Weighted Scoring Rules. Journal of Business & Economic Statistics, 29(3):411-422.

Gordon, R. J. (2014). The Demise of U.S. Economic Growth: Restatement, Rebuttal, and Reflections. NBER Working Papers 19895, National Bureau of Economic Research, Inc.

Groen, J. J. J., Paap, R., and Ravazzolo, F. (2013). Real-Time Inflation Forecasting in a Changing World. Journal of Business & Economic Statistics, 31(1):29-44.

Harvey, A. C. (1985). Trends and Cycles in Macroeconomic Time Series. Journal of Business & Economic Statistics, 3(3):216-27.

Harvey, A. C. and Stock, J. H. (1988). Continuous time autoregressive models with common stochastic trends. Journal of Economic Dynamics and Control, 12(2-3):365-384.

Harvey, D., Leybourne, S., and Newbold, P. (1997). Testing the equality of prediction mean squared errors. International Journal of Forecasting, 13(2):281-291.

Hodrick, R. J. and Prescott, E. C. (1997). Postwar U.S.
Business Cycles: An Empirical Investigation. Journal of Money, Credit and Banking, 29(1):1-16.

Jacquier, E., Polson, N. G., and Rossi, P. E. (2002). Bayesian Analysis of Stochastic Volatility Models. Journal of Business & Economic Statistics, 20(1):69-87.

Jorgenson, D. W., Ho, M. S., and Stiroh, K. J. (2006). Potential Growth of the U.S. Economy: Will the Productivity Resurgence Continue? Business Economics, 41(1):7-16.

Jurado, K., Ludvigson, S. C., and Ng, S. (2014). Measuring uncertainty. American Economic Review, Forthcoming.

Kim, C.-J. and Nelson, C. R. (1999a). Has The U.S. Economy Become More Stable? A Bayesian Approach Based On A Markov-Switching Model Of The Business Cycle. The Review of Economics and Statistics, 81(4):608-616.

Kim, C.-J. and Nelson, C. R. (1999b). State-Space Models with Regime Switching: Classical and Gibbs-Sampling Approaches with Applications, volume 1 of MIT Press Books. The MIT Press.

Kim, S., Shephard, N., and Chib, S. (1998). Stochastic Volatility: Likelihood Inference and Comparison with ARCH Models. Review of Economic Studies, 65(3):361-93.

Luciani, M. and Ricci, L. (2014). Nowcasting Norway. International Journal of Central Banking, Forthcoming.

Luo, S. and Startz, R. (2014). Is it one break or ongoing permanent shocks that explains U.S. real GDP? Journal of Monetary Economics, 66:155-163.

Marcellino, M., Porqueddu, M., and Venditti, F. (2013). Short-term GDP forecasting with a mixed frequency dynamic factor model with stochastic volatility. CEPR Discussion Papers 9334.

Mariano, R. S. and Murasawa, Y. (2003). A new coincident index of business cycles based on monthly and quarterly series. Journal of Applied Econometrics, 18(4):427-443.

McConnell, M. M. and Perez-Quiros, G. (2000). Output fluctuations in the United States: What has changed since the early 1980's? American Economic Review, 90(5):1464-1476.

Nelson, C. R. and Plosser, C. I. (1982).
Trends and random walks in macroeconomic time series: Some evidence and implications. Journal of Monetary Economics, 10(2):139-162.

Nordhaus, W. (2004). Retrospective on the 1970s Productivity Slowdown. NBER Working Papers, National Bureau of Economic Research, Inc.

Oliner, S. D. and Sichel, D. E. (2000). The Resurgence of Growth in the Late 1990s: Is Information Technology the Story? Journal of Economic Perspectives, 14(4):3-22.

Orphanides, A. (2003). The quest for prosperity without inflation. Journal of Monetary Economics, 50(3):633-663.

Perron, P. and Wada, T. (2009). Let's take a break: trends and cycles in US real GDP. Journal of Monetary Economics, 56:749-765.

Primiceri, G. E. (2005). Time varying structural vector autoregressions and monetary policy. Review of Economic Studies, 72(3):821-852.

Sargent, T. J. and Sims, C. A. (1977). Business cycle modeling without pretending to have too much a priori economic theory. Technical report.

Sims, C. A. (2012). Comment on Stock and Watson (2012).

Stock, J. and Watson, M. (2012). Disentangling the channels of the 2007-09 recession. Brookings Papers on Economic Activity, Spring:81-156.

Stock, J. H. and Watson, M. W. (1989). New Indexes of Coincident and Leading Economic Indicators. In NBER Macroeconomics Annual 1989, Volume 4, NBER Chapters, pages 351-409. National Bureau of Economic Research, Inc.

Stock, J. H. and Watson, M. W. (2002). Forecasting Using Principal Components From a Large Number of Predictors. Journal of the American Statistical Association, 97:1167-1179.

Summers, L. (2014). Secular stagnation. IMF Economic Forum: Policy Responses to Crises, speech delivered at the IMF Annual Research Conference, November 8th.

Whelan, K. (2003). A Two-Sector Approach to Modeling U.S. NIPA Data. Journal of Money, Credit and Banking, 35(4):627-56.

A Full Results of Bai and Perron Tests

In this section of the Appendix we report the full results of the Bai and Perron (1998) tests for multiple breaks in the mean.
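The single-break version of the sup-F statistic can be illustrated with a minimal sketch: for every admissible break date, compute the Chow-type F statistic for a shift in the mean and take the maximum. This is our own stripped-down illustration, not the Bai and Perron (1998) procedure, which additionally handles multiple breaks, trimming choices and robust variance estimation:

```python
import numpy as np

def sup_f_mean_break(y, trim=0.15):
    """Sup-F test for a single break in the mean of y.
    Returns the maximal F statistic and the most likely break index.
    Minimal illustration: homoskedastic errors, no HAC correction."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    ssr0 = np.sum((y - y.mean()) ** 2)              # restricted SSR: no break
    lo, hi = int(trim * T), int((1 - trim) * T)     # trim candidate break dates
    best_f, best_tau = -np.inf, None
    for tau in range(lo, hi):
        ssr1 = np.sum((y[:tau] - y[:tau].mean()) ** 2) \
             + np.sum((y[tau:] - y[tau:].mean()) ** 2)
        f = (ssr0 - ssr1) / (ssr1 / (T - 2))        # 1 restriction, 2 means estimated
        if f > best_f:
            best_f, best_tau = f, tau
    return best_f, best_tau

# Example: the mean shifts from 3.5 to 2.0 halfway through the sample
rng = np.random.default_rng(1)
y = np.concatenate([3.5 + rng.normal(size=100), 2.0 + rng.normal(size=100)])
stat, tau = sup_f_mean_break(y)
```

Applied to quarterly GDP growth, the maximizing date plays the role of the break dates reported in square brackets in the tables below.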
Table A.1 reports the results for US real GDP growth. We first sequentially apply the SupFT(k) test of the null hypothesis of no break against the alternative of k = 1, 2, or 3 breaks. Secondly, the SupFT(k+1|k) test evaluates the null of k breaks against the alternative of k + 1 breaks. Finally, the Udmax statistic tests the null of no break against the alternative of an unknown number of breaks. The null of no break is rejected against the alternative in each of the three tests, at the 5% level for the case of one break and at the 10% level for two and three breaks. However, we cannot reject the null of only one break against two breaks, or the null of only two against three breaks. The final test confirms the conclusion that there is evidence in favour of at least one break. The conclusions are almost identical whether we use our baseline sample starting in 1960:Q1 or a longer one starting in 1947:Q1. The break dates are clustered around the beginning of 2000 for the most likely break, and around 1973 for the second (though not significant) break date.35 The mean growth estimate would be roughly 3.6% prior to the 2000 break and 1.75% after it. The latter number is probably distorted by the extreme events of the 2008-2009 Great Recession, so we note that the median growth rate is 2.25%.

35 If we allow for a maximum of 5 breaks (with a minimum size of roughly 40 quarters) the results would be qualitatively very similar for the longer sample, whereas for the baseline 1960-2014 sample a significant single break would be identified in 2006.

Table A.1: Tests for structural breaks in the mean of GDP growth

                        1960-2014                              1947-2014
SupFT(k)
  k=1        8.626** [2000:Q2]                      8.582** [2000:Q1]
  k=2        5.925* [1973:Q2; 2000:Q2]              4.294 [1968:Q1; 2000:Q1]
  k=3        4.513* [1973:Q1; 1984:Q1; 2000:Q2]     4.407* [1968:Q4; 1982:Q3; 2000:Q1]
SupFT(k|k-1)
  k=2        2.565                                  1.109
  k=3        0.430                                  2.398
Udmax        8.626**                                8.582**

Note: Bai and Perron (1998) methodology.
Dates in square brackets are the most likely break date(s) for each of the specifications.

Table A.2: Tests for structural breaks in the mean of GDP growth of Advanced Economies

             USA        UK        Canada     Germany    France     Italy      Japan
SupFT(1)     8.626**    1.305     13.002***  13.314***  59.289***  22.215***  78.478***
SupFT(2)     5.925*     3.304      7.845**    8.408**   32.984***  16.417***  50.761***
SupFT(3)     4.513*     3.229      4.157      6.917***  21.970***  12.217***  40.895***
SupFT(2|1)   2.565      -          1.508      4.235      5.686      6.577     24.422***
SupFT(3|2)   0.430      -          0.303      0.070      0.555      1.895      0.105
Udmax        8.626**    3.304     13.002***  13.314***  59.289***  22.215***  78.478***
Breaks       [1973:2]   [1982:4]  [1974:1]   [1973:1]   [1973:4]   [1974:1]   [1973:1]
             [2000:2]   [2003:3]  [2003:3]   [2001:1]   [1992:1]   [2001:1]   [1990:3]

Notes: The table reports the two most likely break dates, even if these are not statistically significant.

B Simulation Results

Figure B.1: Simulation Results I. Data-generating process (DGP) with one discrete break in the trend.
(a) True vs. Estimated Trend (Filtered); (b) True vs. Estimated Trend (Smoothed); (c) True vs. Estimated Factor; (d) True vs. Estimated Volatilities of u_t. [Figure omitted.]
Note: The DGP features a discrete break in the trend of GDP growth occurring in the middle of the sample, as well as stochastic volatility. The sample size is n = 26 and T = 600, which mimics our US data set. The estimation procedure is the fully specified model as defined by equations (1)-(7) in the text. We carry out a Monte Carlo simulation with 100 draws from the DGP. Panel (a) presents the trend component as estimated by the Kalman filter, plotted against the actual trend. The corresponding figure for the smoothed estimate is given in panel (b). In both panels, the posterior median (black) as well as the 68% (solid) and 90% (dashed) posterior bands are shown in blue/purple.
Panel (c) displays the factor generated by the DGP (red) and its smoothed estimate (blue) for one draw. Panel (d) provides evidence on the accuracy of the estimation of the SV of the idiosyncratic terms, by plotting the volatilities from the DGP against the estimates for all 26 variables. Both are normalised by subtracting the average volatility.

Figure B.2: Simulation Results II. Data-generating process (DGP) with two discrete breaks in the trend.
(a) True vs. Estimated Trend (Filtered); (b) True vs. Estimated Trend (Smoothed). [Figure omitted.]
Note: The simulation setup is equivalent to the one in Figure B.1 but features two discrete breaks in the trend, at 1/3 and 2/3 of the sample. Again, we show the filtered as well as the smoothed trend median estimates and the corresponding 68% and 90% posterior bands. Panels (c) and (d) are omitted as they are very similar to those in Figure B.1.

Figure B.3: Simulation Results III. Data-generating process (DGP) without trend and without SV.
(a) True vs. Estimated Trend (Filtered); (b) True vs. Estimated Trend (Smoothed); (c) True vs. Estimated Factor; (d) True vs. Estimated Volatility of Factor. [Figure omitted.]
Note: The DGP is the baseline model without a trend in GDP growth and without stochastic volatility ("Model 0"). The estimation procedure is the fully specified model, as explained in the description of Figure B.1. Again, we plot the filtered and smoothed median estimates of the trend with 68% and 90% posterior bands in panels (a) and (b). Panel (c) presents a comparison of the estimated factor and its DGP counterpart for one Monte Carlo draw. Panel (d) is similar to (b), but for the volatility of the common factor.
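A data-generating process of the kind underlying Figure B.1 — a single discrete break in trend growth combined with stochastic volatility in the common factor — can be simulated along the following lines. The parameter values are our own illustrative choices, not the exact Monte Carlo design used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 600

# Trend of GDP growth: one discrete break in the middle of the sample
trend = np.where(np.arange(T) < T // 2, 0.5, -0.5)

# Common factor: stationary AR(2) with stochastic volatility
# (log-volatility follows a random walk)
log_vol = np.cumsum(0.05 * rng.normal(size=T))
f = np.zeros(T)
for t in range(2, T):
    f[t] = 1.2 * f[t - 1] - 0.3 * f[t - 2] + np.exp(log_vol[t]) * rng.normal()

# Simulated GDP growth: trend + unit loading on the factor + idiosyncratic noise
gdp = trend + 1.0 * f + 0.2 * rng.normal(size=T)
```

Feeding such simulated panels through the full estimation procedure is what allows the smoothed trend estimates to be compared against a known truth, as in panels (a) and (b) of the simulation figures.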
Figure B.4: Simulation Results IV. Data-generating process (DGP) without trend and with a discrete break in factor volatility.
(a) True vs. Estimated Trend (Smoothed); (b) True vs. Estimated Volatility of Factor. [Figure omitted.]
Note: The DGP does not feature any changes in the trend of GDP growth, but one discrete break in the volatility of the common factor. As in Figures B.1-B.3, the estimation procedure is based on the fully specified model. Panel (a) displays the smoothed posterior median estimate of the trend component of GDP growth, with 68% and 90% posterior bands shown as solid and dashed blue lines, respectively. Panel (b) displays the posterior median estimate of the volatility of the common factor (black), with the corresponding bands.

C Details on Estimation Procedure

C.1 Construction of the State Space System

Recall that in our main specification we choose the order of the polynomials in equations (3) and (4) to be p = 2 and q = 2, respectively. Let the vector $\tilde{y}_t$ be defined as

\[
\tilde{y}_t =
\begin{pmatrix}
y_{1,t} \\
y_{2,t} - \rho_{2,1} y_{2,t-1} - \rho_{2,2} y_{2,t-2} - \bar{\alpha}_2 \\
\vdots \\
y_{n,t} - \rho_{n,1} y_{n,t-1} - \rho_{n,2} y_{n,t-2} - \bar{\alpha}_n
\end{pmatrix},
\]

where $\bar{\alpha}_i = \alpha_i (1 - \rho_{i,1} - \rho_{i,2})$, so that the system is written in terms of the quasi-differences of the monthly indicators. Given this re-defined vector of observables, we cast our model into the following state space form:

\[
\tilde{y}_t = H X_t + \tilde{\eta}_t, \qquad \tilde{\eta}_t \overset{iid}{\sim} N(0, \tilde{R}_t),
\]
\[
X_t = F X_{t-1} + e_t, \qquad e_t \overset{iid}{\sim} N(0, Q_t),
\]

where the state vector is defined as $X_t = \left( \alpha_{1,t}, \ldots, \alpha_{1,t-4}, f_t, \ldots, f_{t-4}, u_{1,t}, \ldots, u_{1,t-4} \right)'$.

In the above state space system, after setting $\lambda_1 = 1$ for identification, the matrices of parameters $H$ and $F$ are constructed as follows:

\[
H =
\begin{pmatrix}
H_q & H_q & H_q \\
0_{(n-1)\times 5} & H_m & 0_{(n-1)\times 5}
\end{pmatrix},
\qquad
H_q = \begin{pmatrix} \tfrac{1}{3} & \tfrac{2}{3} & 1 & \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix},
\]
\[
H_m =
\begin{pmatrix}
\lambda_2 & -\lambda_2 \rho_{2,1} & -\lambda_2 \rho_{2,2} & 0 & 0 \\
\lambda_3 & -\lambda_3 \rho_{3,1} & -\lambda_3 \rho_{3,2} & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
\lambda_n & -\lambda_n \rho_{n,1} & -\lambda_n \rho_{n,2} & 0 & 0
\end{pmatrix},
\]
\[
F =
\begin{pmatrix}
F_1 & 0_{5\times 5} & 0_{5\times 5} \\
0_{5\times 5} & F_2 & 0_{5\times 5} \\
0_{5\times 5} & 0_{5\times 5} & F_3
\end{pmatrix},
\]
\[
F_1 = \begin{pmatrix} 1 & 0_{1\times 4} \\ I_4 & 0_{4\times 1} \end{pmatrix},
\qquad
F_2 = \begin{pmatrix} \phi_1 & \phi_2 & 0_{1\times 3} \\ I_4 & & 0_{4\times 1} \end{pmatrix},
\qquad
F_3 = \begin{pmatrix} \rho_{1,1} & \rho_{1,2} & 0_{1\times 3} \\ I_4 & & 0_{4\times 1} \end{pmatrix}.
\]

Furthermore, the error terms are defined as

\[
\tilde{\eta}_t = \begin{pmatrix} 0 & \eta_{2,t} & \ldots & \eta_{n,t} \end{pmatrix}',
\qquad
e_t = \begin{pmatrix} v_{\alpha,t} & 0_{1\times 4} & \varepsilon_t & 0_{1\times 4} & \eta_{1,t} & 0_{1\times 4} \end{pmatrix}',
\]

with covariance matrices

\[
\tilde{R}_t = \begin{pmatrix} 0 & 0_{1\times(n-1)} \\ 0_{(n-1)\times 1} & R_t \end{pmatrix},
\qquad
R_t = \mathrm{diag}\left( \sigma^2_{\eta_2,t}, \sigma^2_{\eta_3,t}, \ldots, \sigma^2_{\eta_n,t} \right),
\]
\[
Q_t = \mathrm{diag}\left( \varsigma_{\alpha,1}, 0_{1\times 4}, \sigma^2_{\varepsilon,t}, 0_{1\times 4}, \sigma^2_{\eta_1,t}, 0_{1\times 4} \right).
\]

C.2 Details of the Gibbs Sampler

Let $\theta \equiv \{\lambda, \Phi, \rho, \varsigma_{\alpha_1}, \varsigma_{\varepsilon}, \varsigma_{\eta}\}$ be a vector that collects the underlying parameters. The model is estimated using a Markov Chain Monte Carlo (MCMC) Gibbs sampling algorithm in which conditional draws of the latent variables, $\{\alpha_{1,t}, f_t\}_{t=1}^T$, the parameters, $\theta$, and the stochastic volatilities, $\{\sigma_{\varepsilon,t}, \sigma_{\eta_i,t}\}_{t=1}^T$, are obtained sequentially. The algorithm has a block structure composed of the following steps.

C.2.0 Initialization

The model parameters are initialized at arbitrary starting values $\theta^0$, and so are the sequences for the stochastic volatilities, $\{\sigma^0_{\varepsilon,t}, \sigma^0_{\eta_i,t}\}_{t=1}^T$. Set $j = 1$.

C.2.1 Draw latent variables conditional on model parameters and SVs

Obtain a draw $\{\alpha^j_{1,t}, f^j_t\}_{t=1}^T$ from $p(\{\alpha_{1,t}, f_t\}_{t=1}^T \mid \theta^{j-1}, \{\sigma^{j-1}_{\varepsilon,t}, \sigma^{j-1}_{\eta_i,t}\}_{t=1}^T, y)$.

This step of the algorithm uses the state space representation described above (Appendix C.1), and produces a draw of the entire state vector $X_t$ by means of a forward-filtering backward-smoothing algorithm (see Carter and Kohn 1994 or Kim and Nelson 1999b). In particular, we adapt the algorithm proposed by Bai and Wang (2012), which is robust to numerical inaccuracies, and extend it to the case with mixed frequencies and missing data following Mariano and Murasawa (2003), as explained in Section 3.2. Like Bai and Wang (2012), we initialise the Kalman filter step from a normal distribution whose moments are independent of the model parameters, in particular $X_0 \sim N(0, 10^4)$.

C.2.2 Draw the variance of the time-varying GDP growth component
Obtain a draw $\varsigma^j_{\alpha_1}$ from $p(\varsigma_{\alpha_1} \mid \{\alpha^j_{1t}\}_{t=1}^T)$.

Taking the sample $\{\alpha^j_{1,t}\}_{t=1}^T$ drawn in the previous step as given, and positing an inverse-gamma prior $p(\varsigma_{\alpha_1}) \sim IG(S_{\alpha_1}, v_{\alpha_1})$, the conditional posterior of $\varsigma_{\alpha_1}$ is also drawn from an inverse-gamma distribution. As discussed in Section 4.1, we choose the scale $S_{\alpha_1} = 10^{-3}$ and degrees of freedom $v_{\alpha_1} = 1$ for our baseline specification.

C.2.3 Draw the autoregressive parameters of the factor VAR

Obtain a draw $\Phi^j$ from $p(\Phi \mid \{f^{j-1}_t, \sigma^{j-1}_{\varepsilon,t}\}_{t=1}^T)$.

Taking the sequences of the common factor $\{f^{j-1}_t\}_{t=1}^T$ and its stochastic volatility $\{\sigma^{j-1}_{\varepsilon,t}\}_{t=1}^T$ from previous steps as given, and positing a non-informative prior, the corresponding conditional posterior is drawn from the normal distribution (see, e.g., Kim and Nelson 1999b). In the more general case of more than one factor, this step would be equivalent to drawing the coefficients of a Bayesian VAR. Like Kim and Nelson (1999b), or Cogley and Sargent (2005), we reject draws which imply autoregressive coefficients in the explosive region.

C.2.4 Draw the factor loadings

Obtain a draw of $\lambda^j$ from $p(\lambda \mid \rho^{j-1}, \{f^{j-1}_t, \sigma^{j-1}_{\eta_i,t}\}_{t=1}^T, y)$.

Conditional on the draw of the common factor $\{f^{j-1}_t\}_{t=1}^T$, the measurement equations reduce to $n$ independent linear regressions with heteroskedastic and serially correlated residuals. By conditioning on $\rho^{j-1}$ and $\sigma^{j-1}_{\eta_i,t}$, the loadings can be estimated using GLS and non-informative priors. When necessary, we apply restrictions on the loadings using the formulas provided by Bai and Wang (2012).

C.2.5 Draw the serial correlation coefficients of the idiosyncratic components

Obtain a draw of $\rho^j$ from $p(\rho \mid \lambda^{j-1}, \{f^{j-1}_t, \sigma^{j-1}_{\eta_i,t}\}_{t=1}^T, y)$.

Taking the sequence of the common factor $\{f^{j-1}_t\}_{t=1}^T$ and the loadings drawn in previous steps as given, the idiosyncratic components can be obtained as $u_{i,t} = y_{i,t} - \lambda^{j-1}_i f^{j-1}_t$. Given a sequence for the stochastic volatility of the $i$th component, $\{\sigma^{j-1}_{\eta_i,t}\}_{t=1}^T$, the residual is standardized to obtain an autoregression with homoskedastic residuals, whose conditional posterior can be drawn from the normal distribution as in step C.2.3.

C.2.6 Draw the stochastic volatilities

Obtain a draw of $\{\sigma^j_{\varepsilon,t}\}_{t=1}^T$ and $\{\sigma^j_{\eta_i,t}\}_{t=1}^T$ from $p(\{\sigma_{\varepsilon,t}\}_{t=1}^T \mid \Phi^{j-1}, \{f^{j-1}_t\}_{t=1}^T)$ and from $p(\{\sigma_{\eta_i,t}\}_{t=1}^T \mid \lambda^{j-1}, \rho^{j-1}, \{f^{j-1}_t\}_{t=1}^T, y)$, respectively.

Finally, we draw the stochastic volatilities of the innovations to the factor and the idiosyncratic components independently, using the algorithm proposed by Kim et al. (1998), which uses a mixture of normal random variables to approximate the elements of the log-variance. This is a more efficient alternative to the exact Metropolis-Hastings algorithm previously proposed by Jacquier et al. (2002). For the general case in which there is more than one factor, the volatilities of the factor VAR can be drawn jointly, see Primiceri (2005).

Increase $j$ by 1, go to Step C.2.1 and iterate until convergence is achieved.

D Additional Figures

We assess how the three models fare when different areas of their predictive densities are emphasized in the forecast evaluation. To do so, we follow Groen et al. (2013) and compute weighted averages of Gneiting and Raftery (2007) quantile scores (QS) based on quantile forecasts that correspond to the predictive densities of the different models (Figure D.1).[36] Our results indicate that while there is an improvement in density nowcasting for the entire distribution, the largest improvement comes from the right tail. For the full sample, Model 1 is very close to Model 0, suggesting that being able to identify the location of the distribution is key to the improvement in performance.
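The quantile score underlying this evaluation is the standard "pinball" loss of a quantile forecast, and emphasizing an area of the predictive density amounts to weighting the scores across quantile levels. A minimal sketch (function names are ours; the specific weighting schemes follow Gneiting and Ranjan, 2011):

```python
import numpy as np

def quantile_score(y, q_forecast, tau):
    """Pinball loss of a tau-quantile forecast q_forecast for outcomes y."""
    y, q_forecast = np.asarray(y, float), np.asarray(q_forecast, float)
    return np.mean((tau - (y < q_forecast)) * (y - q_forecast))

def weighted_qs(y, quantile_forecasts, taus, weights):
    """Weighted average of quantile scores over a grid of quantile levels.

    Putting more weight on high quantile levels emphasizes the right tail;
    uniform weights over a fine grid approximate the CRPS
    (Gneiting and Ranjan, 2011).
    """
    scores = [quantile_score(y, q, tau) for q, tau in zip(quantile_forecasts, taus)]
    return np.average(scores, weights=weights)
```

A lower weighted score indicates a more accurate predictive density in the emphasized region, which is how the left-, center- and right-tail comparisons in Figure D.1 are constructed.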
In order to appreciate the importance of the improvement in the density forecasts, and in particular in the right side of the distribution, we calculated a recursive estimate of the likelihood of a 'strong recovery', defined as the probability of an average growth rate of GDP (over the present and next three quarters) above the historical average. Model 0 and Model 2 produce very similar probabilities up until 2011 when, thanks to the downward revision of long-run GDP growth, Model 2 starts to deliver lower probability estimates consistent with the observed weak recovery. The Brier score for Model 2 is 0.186, whereas the score for Model 0 is 0.2236, with the difference statistically significant at the 1% level (Model 1 is essentially identical to Model 0).[37]

[36] As Gneiting and Ranjan (2011) show, integrating the QS over the quantile spectrum gives the CRPS.
[37] The results are available upon request.

Figure D.1: Density Forecast Accuracy Evaluation: Quantile Score Statistics
Panels: (a) Left, (b) Center, (c) Right of the predictive distribution; top row: Full Sample, 2000:Q1-2014:Q2; bottom row: Recovery Sample, 2009:Q2-2014:Q2. Each panel plots the quantile scores of Models 0, 1 and 2 against the forecast horizon, spanning the forecast, nowcast and backcast periods.
Note: The horizontal axis indicates the forecast horizon, expressed as the number of days to the end of the reference quarter. Thus, from the point of view of the forecaster, forecasts produced 180 to 90 days before the end of a given quarter are forecasts of the next quarter; forecasts produced 90 to 0 days before are nowcasts of the current quarter; and forecasts produced 0 to 25 days after the end of the quarter are backcasts of the last quarter. The boxes below each panel display, with a vertical tick mark, a gauge of statistical significance at the 10% level of any difference with Model 0, for each forecast horizon, as explained in the main text.

Figure D.2: Impact of Alternative Assumptions on Posterior Estimate of Long-Run Growth
(a) Less Conservative Prior Variance; (b) Common Trend with Consumption and Income
Note: Panel (a) shows the posterior estimate of long-run GDP growth obtained under a less conservative prior for the variance of the trend innovations. Panel (b) shows the posterior estimate obtained when consumption and income are assumed to share a common long-run trend with GDP.
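The Brier score used in the comparison of the 'strong recovery' probabilities above is the mean squared distance between predicted probabilities and realized 0/1 outcomes. A minimal sketch with illustrative data (the probabilities below are hypothetical, not the paper's):

```python
import numpy as np

def brier_score(prob, outcome):
    """Mean squared error of probability forecasts for a binary event."""
    prob, outcome = np.asarray(prob, float), np.asarray(outcome, float)
    return np.mean((prob - outcome) ** 2)

# Hypothetical event probabilities from two models against 0/1 outcomes.
outcome  = np.array([1, 0, 0, 1, 0])
p_model0 = np.array([0.6, 0.5, 0.4, 0.6, 0.5])
p_model2 = np.array([0.7, 0.3, 0.2, 0.7, 0.3])
print(brier_score(p_model0, outcome), brier_score(p_model2, outcome))
```

A lower score indicates better-calibrated probability forecasts, which is the sense in which Model 2's 0.186 improves on Model 0's 0.2236.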
Figure D.3: Posterior Estimates for the G7 Economies: 1960-2014 (% Annualised Growth Rate)
Panels: (a) United Kingdom, (b) Canada, (c) Japan, (d) Germany, (e) France, (f) Italy.
Note: The top chart in each panel presents the mean estimate of the components of the monthly GDP equation. The middle and bottom chart in each panel present the posterior medians (red), 68% (solid blue) and 90% (dashed blue) posterior bands for the trend GDP growth and the stochastic volatility of the common factor, respectively.

E Details on the Construction of the Data Base

E.1 US (Vintage) Data Base

For our US real-time forecasting evaluation, we consider data vintages since 11 January 2000 capturing the real activity variables listed in the text. For each vintage, the start of the sample is set to January 1960, appending missing observations to any series which starts after that date. All time series are obtained from one of these sources: (1) Archival Federal Reserve Economic Data (ALFRED), (2) Bloomberg, (3) Haver Analytics. Table E.1 provides details on each series, including the variable code corresponding to the different sources.
For several series, in particular Retail Sales, New Orders, Imports and Exports, only vintages in nominal terms are available, but series for the appropriate deflators are available from Haver, and these are not subject to revisions. We therefore deflate them using, respectively, the CPI, the PPI for Capital Equipment, and the Imports and Exports price indices. Additionally, on several occasions the series for New Orders, Personal Consumption, Vehicle Sales and Retail Sales are subject to methodological changes and part of their history gets discontinued. In this case, given our interest in using long samples for all series, we use older vintages to splice the growth rates back to the earliest possible date.

For soft variables, real-time data is not as readily available. The literature on real-time forecasting has generally assumed that these series are unrevised, and has therefore used the latest available vintage. However, while the underlying survey responses are indeed not revised, the seasonal adjustment procedures applied to them do lead to important differences between the series as it was available at the time and the latest vintage. For this reason we use seasonally unadjusted data and re-apply the Census X-12 procedure in real time to obtain a real-time seasonally adjusted version of the surveys. We follow the same procedure for the initial unemployment claims series. We then use Bloomberg to obtain the exact date on which each monthly data point was first published.

Table E.1: Detailed Description of Data Series

Hard Indicators | Freq. | Start Date | Vintage Start | Transformation | Publ. Lag | Data Code
Real Gross Domestic Product | Q | Q1:1960 | Dec 91 | % QoQ Ann. | 30 | GDPC1(F)
Real Industrial Production | M | Jan 60 | Jan 27 | % MoM | 15 | INDPRO(F)
Real Manufacturers' New Orders: Nondefense Capital Goods Excluding Aircraft | M | Mar 68 | Mar 97 | % MoM | 25 | NEWORDER(F)(1), PPICPE(F)
Real Light Weight Vehicle Sales | M | Feb 67 | Mar 97 | % MoM | 1 | ALTSALES(F)(2), TLVAR(H)
Real Personal Consumption Expenditures | M | Jan 60 | Nov 79 | % MoM | 30 | PCEC96(F)
Real Personal Income less Transfer Payments | M | Jan 60 | Dec 79 | % MoM | 30 | DSPIC96(F)
Real Retail Sales and Food Services | M | Jan 60 | Jun 01 | % MoM | 15 | RETAIL(F), CPIAUCSL(F), RRSFS(F)(3)
Real Exports of Goods | M | Feb 68 | Jan 97 | % MoM | 35 | BOPGEXP(F)(4), C111CPX(H), TMXA(H)
Real Imports of Goods | M | Feb 69 | Jan 97 | % MoM | 35 | BOPGIMP(F)(4), C111CP(H), TMMCA(H)
Building Permits | M | Feb 60 | Aug 99 | % MoM | 0 | PERMIT(F)
Housing Starts | M | Jan 60 | Jul 70 | % MoM | 0 | HOUST(F)
New Home Sales | M | Feb 63 | Jul 99 | % MoM | 0 | HSN1F(F)
Total Nonfarm Payroll Employment (Establishment Survey) | M | Jan 60 | May 55 | % MoM | 0 | PAYEMS(F)
Civilian Employment (Household Survey) | M | Jan 60 | Feb 61 | % MoM | 0 | CE16OV(F)
Unemployed | M | Jan 60 | Feb 61 | % MoM | 0 | UNEMPLOY(F)
Initial Claims for Unempl. Insurance | M | Jan 60 | Jan 00* | % MoM | 0 | LICM(H)

Soft Indicators | Freq. | Start Date | Vintage Start | Transformation | Publ. Lag | Data Code
Markit Manufacturing PMI | M | May 07 | Jan 00* | - | 0 | S111VPMM(H)(5), H111VPMM(H)
ISM Manufacturing PMI | M | Jan 60 | Jan 00* | - | 0 | NAPMCN(H)
ISM Non-manufacturing PMI | M | Jul 97 | Jan 00* | - | 0 | NMFBAI(H), NMFNI(H), NMFEI(H), NMFVDI(H)(6)
Conference Board: Consumer Confidence | M | Feb 68 | Jan 00* | Diff 12 M. | 0 | CCIN(H)
University of Michigan: Consumer Sentiment | M | May 60 | Jan 00* | Diff 12 M. | 0 | CSENT(H)(5), CONSSENT(F), Index(B)
Richmond Fed Manufacturing Survey | M | Nov 93 | Jan 00* | - | 0 | RIMSXN(H), RIMNXN(H), RIMLXN(H)(6)
Philadelphia Fed Business Outlook | M | May 68 | Jan 00* | - | 0 | BOCNOIN(H), BOCNONN(H), BOCSHNN(H), BOCDTIN(H), BOCNENN(H)(6)
Chicago PMI | M | Feb 67 | Jan 00* | - | 0 | PMCXPD(H), PMCXNO(H), PMCXI(H), PMCXVD(H)(6)
NFIB: Small Business Optimism Index | M | Oct 75 | Jan 00* | Diff 12 M. | 0 | NFIBBN(H)
Empire State Manufacturing Survey | M | Jul 01 | Jan 00* | - | 0 | EMNHN(H), EMSHN(H), EMDHN(H), EMDSN(H), EMESN(H)(6)

Notes: (B) = Bloomberg; (F) = FRED; (H) = Haver; (1) deflated using the PPI for capital equipment; (2) for historical data not available in ALFRED we used data from Haver; (3) using the deflated nominal series up to May 2001 and the real series afterwards; (4) nominal series from ALFRED and price indices from Haver; for historical data not available in ALFRED we used data from Haver; (5) preliminary series considered; (6) NSA subcomponents needed to compute the SA headline index. * Denotes seasonally unadjusted series which have been seasonally adjusted in real time.

E.2 Data Base for Other G7 Economies

Table E.2: Canada

Series | Freq. | Start Date | Transformation
Real Gross Domestic Product | Q | Jun-1960 | % QoQ Ann.
Industrial Production: Manuf., Mining, Util. | M | Jan-1960 | % MoM
Manufacturing New Orders | M | Feb-1960 | % MoM
Manufacturing Turnover | M | Feb-1960 | % MoM
New Passenger Car Sales | M | Jan-1960 | % MoM
Real Retail Sales | M | Feb-1970 | % MoM
Construction: Dwellings Started | M | Feb-1960 | % MoM
Residential Building Permits Auth. | M | Jan-1960 | % MoM
Real Exports | M | Jan-1960 | % MoM
Real Imports | M | Jan-1960 | % MoM
Unemployment Ins.: Initial and Renewal Claims | M | Jan-1960 | % MoM
Employment: Industrial Aggr. excl. Unclassified | M | Feb-1991 | % MoM
Employment: Both Sexes, 15 Years and Over | M | Feb-1960 | % MoM
Unemployment: Both Sexes, 15 Years and Over | M | Feb-1960 | % MoM
Consumer Confidence Indicator | M | Jan-1981 | Diff 12 M.
Ivey Purchasing Managers Index | M | Jan-2001 | Level
ISM Manufacturing PMI | M | Jan-1960 | Level
University of Michigan: Consumer Sentiment | M | May-1960 | Diff 12 M.

Notes: The second column refers to the sampling frequency of the data, which can be quarterly (Q) or monthly (M). % QoQ Ann. refers to the quarter-on-quarter annualized growth rate, % MoM refers to $(y_t - y_{t-1})/y_{t-1}$, while Diff 12 M. refers to $y_t - y_{t-12}$. All series were obtained from the Haver Analytics database.
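The transformations defined in the table notes can be sketched as follows. This is a minimal sketch: the % MoM and Diff 12 M. definitions come directly from the notes, while the compounding convention in `qoq_ann` is our assumption, since the notes only name the annualized quarter-on-quarter rate:

```python
import numpy as np

def mom(y):
    """% MoM: (y_t - y_{t-1}) / y_{t-1}, as defined in the table notes."""
    y = np.asarray(y, float)
    return (y[1:] - y[:-1]) / y[:-1]

def diff_12m(y):
    """Diff 12 M.: y_t - y_{t-12}, as defined in the table notes."""
    y = np.asarray(y, float)
    return y[12:] - y[:-12]

def qoq_ann(y):
    """% QoQ Ann.: one common convention, (y_t / y_{t-1})**4 - 1,
    for quarterly data (assumed; the notes do not spell this out)."""
    y = np.asarray(y, float)
    return (y[1:] / y[:-1]) ** 4 - 1
```

Level series ("Level" in the tables) enter the model untransformed.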
Table E.3: Germany

Series | Freq. | Start Date | Transformation
Real Gross Domestic Product | Q | Jun-1960 | % QoQ Ann.
Mfg Survey: Production: Future Tendency | M | Jan-1960 | Level
Ifo Demand vs. Prev. Month: Manufact. | M | Jan-1961 | Level
Ifo Business Expectations: All Sectors | M | Jan-1991 | Level
Markit Manufacturing PMI | M | Apr-1996 | Level
Markit Services PMI | M | Jun-1997 | Level
Industrial Production | M | Jan-1960 | % MoM
Manufacturing Turnover | M | Feb-1960 | % MoM
Manufacturing Orders | M | Jan-1960 | % MoM
New Truck Registrations | M | Feb-1963 | % MoM
Total Unemployed | M | Feb-1962 | % MoM
Total Domestic Employment | M | Feb-1981 | % MoM
Job Vacancies | M | Feb-1960 | % MoM
Retail Sales Volume excluding Motor Vehicles | M | Jan-1960 | % MoM
Wholesale Vol. excl. Motor Veh. and Motorcycles | M | Feb-1994 | % MoM
Real Exports of Goods | M | Feb-1970 | % MoM
Real Imports of Goods | M | Feb-1970 | % MoM

Notes: The second column refers to the sampling frequency of the data, which can be quarterly (Q) or monthly (M). % QoQ Ann. refers to the quarter-on-quarter annualized growth rate, % MoM refers to $(y_t - y_{t-1})/y_{t-1}$, while Diff 12 M. refers to $y_t - y_{t-12}$. All series were obtained from the Haver Analytics database.

Table E.4: Japan

Series | Freq. | Start Date | Transformation
Real Gross Domestic Product | Q | Jun-1960 | % QoQ Ann.
TANKAN All Industries: Actual Business Cond. | Q | Sep-1974 | Diff 1 M.
Markit Manufacturing PMI | M | Oct-2001 | Level
Small Business Sales Forecast | M | Dec-1974 | Level
Small/Medium Business Survey | M | Apr-1976 | Level
Consumer Confidence Index | M | Mar-1973 | Level
Inventory to Sales Ratio | M | Jan-1978 | Level
Industrial Production: Mining and Manufact. | M | Jan-1960 | % MoM
Electric Power Consumed by Large Users | M | Feb-1960 | % MoM
New Motor Vehicle Registration: Trucks, Total | M | Feb-1965 | Diff 1 M.
New Motor Vehicle Reg: Passenger Cars | M | May-1968 | % MoM
Real Retail Sales | M | Feb-1960 | % MoM
Real Department Store Sales | M | Feb-1970 | % MoM
Real Wholesale Sales: Total | M | Aug-1978 | % MoM
Tertiary Industry Activity Index | M | Feb-1988 | % MoM
Labor Force Survey: Total Unemployed | M | Jan-1960 | % MoM
Overtime Hours / Total Hours (manufact.) | M | Feb-1990 | % MoM
New Job Offers excl. New Graduates | M | Feb-1963 | % MoM
Ratio of New Job Openings to Applications | M | Feb-1963 | % MoM
Ratio of Active Job Openings and Active Job Appl. | M | Feb-1963 | % MoM
Building Starts, Floor Area: Total | M | Feb-1965 | % MoM
Housing Starts: New Construction | M | Feb-1960 | % MoM
Real Exports | M | Feb-1960 | % MoM
Real Imports | M | Feb-1960 | % MoM

Notes: The second column refers to the sampling frequency of the data, which can be quarterly (Q) or monthly (M). % QoQ Ann. refers to the quarter-on-quarter annualized growth rate, % MoM refers to $(y_t - y_{t-1})/y_{t-1}$, Diff 1 M. refers to $y_t - y_{t-1}$, while Diff 12 M. refers to $y_t - y_{t-12}$. All series were obtained from the Haver Analytics database.

Table E.5: United Kingdom

Series | Freq. | Start Date | Transformation
Real Gross Domestic Product | Q | Mar-1960 | % QoQ Ann.
Dist. Trades: Total Vol. of Sales | M | Jul-1983 | Level
Dist. Trades: Retail Vol. of Sales | M | Jul-1983 | Level
CBI Industrial Trends: Vol. of Output Next 3 M. | M | Feb-1975 | Level
BoE Agents' Survey: Cons. Services Turnover | M | Jul-1997 | Level
Markit Manufacturing PMI | M | Jan-1992 | Level
Markit Services PMI | M | Jul-1996 | Level
Markit Construction PMI | M | Apr-1997 | Level
GfK Consumer Confidence Barometer | M | Jan-1975 | Diff 12 M.
Industrial Production: Manufacturing | M | Jan-1960 | % MoM
Passenger Car Registrations | M | Jan-1960 | % MoM
Retail Sales Volume: All Retail incl. Autom. Fuel | M | Jan-1960 | % MoM
Index of Services: Total Service Industries | M | Feb-1997 | % MoM
Registered Unemployment | M | Feb-1960 | % MoM
Job Vacancies | M | Feb-1960 | % MoM
LFS: Unemployed: Aged 16 and Over | M | Mar-1971 | % MoM
LFS: Employment: Aged 16 and Over | M | Mar-1971 | % MoM
Mortgage Loans Approved: All Lenders | M | May-1993 | % MoM
Real Exports | M | Feb-1961 | % MoM
Real Imports | M | Feb-1961 | % MoM

Notes: The second column refers to the sampling frequency of the data, which can be quarterly (Q) or monthly (M). % QoQ Ann. refers to the quarter-on-quarter annualized growth rate, % MoM refers to $(y_t - y_{t-1})/y_{t-1}$, while Diff 12 M. refers to $y_t - y_{t-12}$. All series were obtained from the Haver Analytics database.

Table E.6: France

Series | Freq. | Start Date | Transformation
Real Gross Domestic Product | Q | Jun-1960 | % QoQ Ann.
Industrial Production | M | Feb-1960 | % MoM
Total Commercial Vehicle Registrations | M | Feb-1975 | % MoM
Household Consumption Exp.: Durable Goods | M | Feb-1980 | % MoM
Real Retail Sales | M | Feb-1975 | % MoM
Passenger Cars | M | Feb-1960 | % MoM
Job Vacancies | M | Feb-1989 | % MoM
Registered Unemployment | M | Feb-1960 | % MoM
Housing Permits | M | Feb-1960 | % MoM
Housing Starts | M | Feb-1974 | % MoM
Volume of Imports | M | Jan-1960 | % MoM
Volume of Exports | M | Jan-1960 | % MoM
Business Survey: Personal Prod. Expect. | M | Jun-1962 | Level
Business Survey: Recent Output Changes | M | Jan-1966 | Level
Household Survey: Household Conf. Indicator | M | Oct-1973 | Diff 12 M.
BdF Bus. Survey: Production vs. Last M., Ind. | M | Jan-1976 | Level
BdF Bus. Survey: Production Forecast, Ind. | M | Jan-1976 | Level
BdF Bus. Survey: Total Orders vs. Last M., Ind. | M | Jan-1981 | Level
BdF Bus. Survey: Activity vs. Last M., Services | M | Oct-2002 | Level
BdF Bus. Survey: Activity Forecast, Services | M | Oct-2002 | Level
Markit Manufacturing PMI | M | Apr-1998 | Level
Markit Services PMI | M | May-1998 | Level

Notes: The second column refers to the sampling frequency of the data, which can be quarterly (Q) or monthly (M). % QoQ Ann. refers to the quarter-on-quarter annualized growth rate, % MoM refers to $(y_t - y_{t-1})/y_{t-1}$, while Diff 12 M. refers to $y_t - y_{t-12}$. All series were obtained from the Haver Analytics database.

Table E.7: Italy

Series | Freq. | Start Date | Transformation
Real Gross Domestic Product | Q | Jun-1960 | % QoQ Ann.
Markit Manufacturing PMI | M | Jun-1997 | Level
Markit Services PMI: Business Activity | M | Jan-1998 | Level
Production Future Tendency | M | Jan-1962 | Level
ISTAT Services Survey: Orders, Next 3 M. | M | Jan-2003 | Level
ISTAT Retail Trade Confidence Indicator | M | Jan-1990 | Level
Industrial Production | M | Jan-1960 | % MoM
Real Exports | M | Jan-1960 | % MoM
Real Imports | M | Jan-1960 | % MoM
Real Retail Sales | M | Feb-1990 | % MoM
Passenger Car Registrations | M | Jan-1960 | % MoM
Employed | M | Feb-2004 | % MoM
Unemployed | M | Feb-1983 | % MoM

Notes: The second column refers to the sampling frequency of the data, which can be quarterly (Q) or monthly (M). % QoQ Ann. refers to the quarter-on-quarter annualized growth rate, % MoM refers to $(y_t - y_{t-1})/y_{t-1}$, while Diff 12 M. refers to $y_t - y_{t-12}$. All series were obtained from the Haver Analytics database.