
ECO 2901
EMPIRICAL INDUSTRIAL ORGANIZATION
Lecture 3: Intro to Dynamic Models In Empirical IO (II)
Victor Aguirregabiria (University of Toronto)
Toronto. Winter 2015
Likelihood
Under CI-1 to CI-3, we have that:
$$\log \Pr(\text{Data}) = \sum_{i=1}^{n} \sum_{t=1}^{T} \sum_{j=0}^{J} 1\{a_{it} = j\}\, \log P(j \mid x_{it}) \;+\; \sum_{i=1}^{n} \sum_{t=1}^{T-1} \log f_x(x_{i,t+1} \mid x_{it}, a_{it}) \;+\; \sum_{i=1}^{n} \log p(x_{i1})$$
i
The probabilities $P(j \mid x_{it})$ for $j = 0, 1, ..., J$ are denoted Conditional Choice Probabilities (CCPs):

$$P(j \mid x) \equiv \Pr(a_{it} = j \mid x_{it} = x) = \int 1\{a(x, \varepsilon) = j\}\, f_\varepsilon(d\varepsilon)$$
Components of the Likelihood function
Each of the three components of the full likelihood function can be considered as a likelihood function for a different component of the data:

$$l(\theta) = l_{Choice}(\theta) + l_{Trans}(\theta_f) + l_{Initial}(\theta)$$

with

$$l_{Choice}(\theta) = \sum_{i=1}^{n} \sum_{t=1}^{T} \log P(a_{it} \mid x_{it}; \theta), \qquad l_{Trans}(\theta_f) = \sum_{i=1}^{n} \sum_{t=1}^{T-1} \log f_x(x_{i,t+1} \mid x_{it}, a_{it}; \theta_f), \qquad l_{Initial}(\theta) = \sum_{i=1}^{n} \log p(x_{i1} \mid \theta)$$
In stationary models, obtaining $p(x_{i1} \mid \theta)$ requires computing the ergodic distribution of the state variables (including the endogenous ones). Most applications do not do this, and consider instead the conditional likelihood function:

$$l(\theta \mid x_1) = l_{Choice}(\theta) + l_{Trans}(\theta_f)$$
Conditional ML Estimation
Typically, the parameters $\theta_f$ can be identified/estimated from the transition data, i.e., from the likelihood $l_{Trans}(\theta_f)$.
Then, a common estimation approach is the following.
In a first step, estimate $\theta_f$ as:

$$\hat{\theta}_f = \arg\max_{\theta_f} \; l_{Trans}(\theta_f)$$
Given $\hat{\theta}_f$, in a second step, estimate $(\theta_U, \theta_\varepsilon, \beta)$ as:

$$(\hat{\theta}_U, \hat{\theta}_\varepsilon, \hat{\beta}) = \arg\max_{\theta_U, \theta_\varepsilon, \beta} \; l_{Choice}(\theta_U, \theta_\varepsilon, \beta; \hat{\theta}_f)$$
Unless stated otherwise, this is the approach that we will always
consider.
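As an illustration of the first step, here is a minimal sketch in Python. It assumes a discrete state space and a nonparametric specification of $\theta_f$, in which case the ML estimator of the transition probabilities reduces to sample frequencies; the function name and data layout are my own conventions, not from the lecture.

```python
import numpy as np

def estimate_transition(a, x, x_next, n_choices, n_states):
    """Frequency (ML) estimator of the transition matrices F_x(j):
    for each choice j, an M x M matrix of Pr(x' | x, a = j).
    a, x, x_next: integer arrays of choices, states, next-period states."""
    F = np.zeros((n_choices, n_states, n_states))
    for j in range(n_choices):
        counts = np.zeros((n_states, n_states))
        mask = (a == j)
        np.add.at(counts, (x[mask], x_next[mask]), 1.0)   # count (x, x') pairs
        rows = counts.sum(axis=1, keepdims=True)
        # Rows never visited under choice j get a uniform distribution
        # (an arbitrary convention for cells the data cannot identify).
        F[j] = np.divide(counts, rows,
                         out=np.full_like(counts, 1.0 / n_states),
                         where=rows > 0)
    return F
```

With a parametric $f_x(\cdot; \theta_f)$, one would instead maximize $l_{Trans}(\theta_f)$ numerically; the logic of the two-step approach is unchanged.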
Estimation and Solution of the DP Problem
We have that:

$$l_{Choice}(\theta) = \sum_{i=1}^{n} \sum_{t=1}^{T} \sum_{j=0}^{J} 1\{a_{it} = j\}\, \log P(j \mid x_{it}; \theta)$$

where:

$$P(j \mid x, \theta) \equiv \Pr(a_{it} = j \mid x_{it} = x, \theta) = \int 1\{a(x, \varepsilon, \theta) = j\}\, f_\varepsilon(d\varepsilon; \theta_\varepsilon)$$
In contrast to static (and single-agent) decision models, where the optimal decision rule has a known closed-form expression, in DP decision models the functional form of $a(x, \varepsilon, \theta)$ is unknown unless we solve the DP problem.

The DP problem cannot be solved generically for the infinite possible values of $\theta$. Solution methods provide $a(x, \varepsilon, \theta)$ for a single value of $\theta$.
Estimation and Solution of the DP Problem (2)
Solving the DP problem for each trial value of θ in our search for the
MLE is computationally costly.
The literature on the estimation of Dynamic Discrete Choice structural models has been partly motivated by the goal of reducing this computational cost.
This typically requires additional assumptions / structure.
Additive separability of unobservables (AS)
The payoff function is:

$$U(a_t, s_t; \theta_U) = u(a_t, x_t; \theta_U) + \varepsilon_t(a_t)$$

where $\varepsilon_t = \{\varepsilon_t(0), \varepsilon_t(1), ..., \varepsilon_t(J)\}$ is a vector of continuous random variables with support on the real line and a continuously differentiable density. $\varepsilon_t$ has zero mean and is independent of $x_t$.
Discrete observables (DO)
The vector of observable state variables has a discrete and finite support:

$$x_{it} \in X = \{x^{(1)}, x^{(2)}, ..., x^{(M)}\}$$

The CI + AS + DO assumptions imply a substantial reduction in the dimension of the state space of the DP problem, and therefore in computation time.
Integrated Value function and Bellman equation
Under the AS + CI assumptions:

$$V(x_t, \varepsilon_t) = \max_{a \in A} \left[ u(a, x_t) + \varepsilon_t(a) + \beta \sum_{x_{t+1}} \int V(x_{t+1}, \varepsilon_{t+1})\, f_\varepsilon(d\varepsilon_{t+1})\, f_x(x_{t+1} \mid x_t, a) \right]$$

Define the integrated value function:

$$V^\sigma(x_t) \equiv \int V(x_t, \varepsilon_t)\, f_\varepsilon(d\varepsilon_t)$$

Given this definition, we have the integrated Bellman equation:

$$V^\sigma(x_t) = \int \max_{a \in A} \left\{ u(a, x_t) + \varepsilon_t(a) + \beta \sum_{x_{t+1}} V^\sigma(x_{t+1})\, f_x(x_{t+1} \mid x_t, a) \right\} f_\varepsilon(d\varepsilon_t)$$
Integrated Value function and Bellman equation
The integrated value function and Bellman equation have some
interesting properties.
[1] The integrated Bellman equation is a contraction mapping; therefore $V^\sigma$ is unique and can be computed by successive iterations on the integrated Bellman equation.
[2] Given $V^\sigma(x_t)$, we can obtain the optimal decision rule (we do not need $V(x_t, \varepsilon_t)$):

$$a(x_t, \varepsilon_t) = \arg\max_{a \in A} \left[ u(a, x_t) + \varepsilon_t(a) + \beta \sum_{x_{t+1}} V^\sigma(x_{t+1})\, f_x(x_{t+1} \mid x_t, a) \right]$$
[3] The function $V^\sigma$ can be described as a vector in the Euclidean space of dimension $M$ (instead of an infinite-dimensional space of real-valued functions).
Integrated Value function and Bellman equation
Define the conditional choice value functions:

$$v(a, x_t) \equiv u(a, x_t) + \beta \sum_{x_{t+1}} V^\sigma(x_{t+1})\, f_x(x_{t+1} \mid x_t, a)$$
Using this definition, the optimal decision rule is:

$$a(x_t, \varepsilon_t) = \arg\max_{a \in A} \left[ v(a, x_t) + \varepsilon_t(a) \right]$$

And the integrated Bellman equation:

$$V^\sigma(x_t) = \int \max_{a \in A} \left[ v(a, x_t) + \varepsilon_t(a) \right] f_\varepsilon(d\varepsilon_t)$$
The RHS "integral of a max" is what McFadden (1978) defined as the "Social Surplus" in the context of Random Utility Models (RUM).
Integrated Value function and Bellman equation
[4] For different well-known discrete choice models, such as Binary Probit and Logit, Multinomial Logit, or Nested Logit, the Social Surplus function has a closed-form expression in terms of the $v(a, x_t)$'s. For instance, for the Multinomial Logit (MNL):

$$V^\sigma(x_t) = \ln \left( \exp\{v(0, x_t)\} + ... + \exp\{v(J, x_t)\} \right)$$
In vector form:

$$\mathbf{V} = \ln \left( \exp\{\mathbf{u}(0) + \beta\, F_x(0)\, \mathbf{V}\} + ... + \exp\{\mathbf{u}(J) + \beta\, F_x(J)\, \mathbf{V}\} \right)$$

where $\mathbf{u}(0), ..., \mathbf{u}(J)$ are $M \times 1$ vectors of profits, and $F_x(0), ..., F_x(J)$ are $M \times M$ transition probability matrices of $x$.
Value function iteration algorithm
Given this equation, the vector $\mathbf{V}$ can be obtained by successive approximations (iterations) in the Bellman equation.

Let $\mathbf{V}^0$ be an arbitrary initial value for the vector $\mathbf{V}$. For instance, $\mathbf{V}^0$ could be an $M \times 1$ vector of zeroes.
Then, at iteration $k \geq 1$ we obtain:

$$\mathbf{V}^k = \Gamma(\mathbf{V}^{k-1})$$

where $\Gamma(\cdot)$ is the function on the RHS of the Bellman equation, i.e.,

$$\mathbf{V}^k = \ln \left( \exp\{\mathbf{u}(0) + \beta\, F_x(0)\, \mathbf{V}^{k-1}\} + ... + \exp\{\mathbf{u}(J) + \beta\, F_x(J)\, \mathbf{V}^{k-1}\} \right)$$
Since the Bellman equation is a contraction mapping, this algorithm always converges (regardless of the initial $\mathbf{V}^0$), and it converges to the unique fixed point.
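A minimal sketch of this algorithm in Python, assuming the logit closed form above. The array conventions are mine: u is a (J+1) × M array of flow payoffs and F a (J+1) × M × M array of transition matrices.

```python
import numpy as np
from scipy.special import logsumexp

def solve_integrated_bellman(u, F, beta, tol=1e-10, max_iter=5000):
    """Successive approximations on the logit integrated Bellman equation
    V = ln( sum_j exp{ u(j) + beta F_x(j) V } ), a contraction of modulus beta.
    u: (J+1, M) flow payoffs; F: (J+1, M, M) transition matrices."""
    V = np.zeros(u.shape[1])            # arbitrary starting value V^0
    for _ in range(max_iter):
        v = u + beta * (F @ V)          # (J+1, M) choice-specific values
        V_new = logsumexp(v, axis=0)    # log-sum-exp across choices
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new
```

Because $\Gamma$ is a contraction of modulus $\beta$, the approximation error shrinks geometrically, so convergence is slower the closer $\beta$ is to 1.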
McFadden’s Conditional Logit
Consider the static Random Utility Model (RUM):

$$a_i = \arg\max_{j \in \{0,1,...,J\}} \left[ u(j, x_i) + \varepsilon_i(j) \right]$$

where we have data on $\{a_i, x_i\}$.
McFadden's Conditional Logit is a particular type of RUM with:

$$a_i = \arg\max_{j \in \{0,1,...,J\}} \left[ z(j, x_i)\, \theta + \varepsilon_i(j) \right]$$

where the $z(j, x_i)$ are functions known to the researcher, and $\{\varepsilon_i(j) : j = 0, 1, ..., J\}$ are i.i.d. extreme value (type 1) distributed.
Then:

$$P(j \mid x_i, \theta) = \frac{\exp\{z(j, x_i)\, \theta\}}{\sum_{k=0}^{J} \exp\{z(k, x_i)\, \theta\}}$$
McFadden's Conditional Logit (2)
The log-likelihood function of this CLogit is:

$$l(\theta) = \sum_{i=1}^{n} \left[ \sum_{j=0}^{J} 1\{a_i = j\}\, z(j, x_i)\, \theta - \ln \sum_{k=0}^{J} \exp\{z(k, x_i)\, \theta\} \right]$$

And the likelihood equations:

$$\frac{\partial l(\theta)}{\partial \theta} = \sum_{i=1}^{n} \frac{\partial l_i(\theta)}{\partial \theta} \quad \text{with:} \quad \frac{\partial l_i(\theta)}{\partial \theta} = \sum_{j=0}^{J} z(j, x_i)' \left[ 1\{a_i = j\} - P(j \mid x_i, \theta) \right]$$
A nice feature of the CLogit is that this likelihood function is globally concave, and therefore standard gradient search methods easily deliver the MLE. For instance, BHHH iterations:

$$\hat{\theta}_{k+1} = \hat{\theta}_k + \left( \sum_{i=1}^{n} \frac{\partial l_i(\hat{\theta}_k)}{\partial \theta} \frac{\partial l_i(\hat{\theta}_k)}{\partial \theta'} \right)^{-1} \left( \sum_{i=1}^{n} \frac{\partial l_i(\hat{\theta}_k)}{\partial \theta} \right)$$
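A sketch of the per-observation scores and one BHHH step for the static CLogit, under a hypothetical data layout: z is an n × (J+1) × K array of alternative-specific covariates and a an n-vector of observed choices.

```python
import numpy as np

def clogit_scores(theta, z, a):
    """Scores dl_i/dtheta of the CLogit log-likelihood, one row per i.
    z: (n, J+1, K); a: (n,) observed choices; returns an (n, K) array."""
    v = z @ theta                                # (n, J+1) indices z(j, x_i) theta
    v -= v.max(axis=1, keepdims=True)            # stabilize the exponentials
    P = np.exp(v)
    P /= P.sum(axis=1, keepdims=True)            # CCPs P(j | x_i, theta)
    d = (np.arange(z.shape[1]) == a[:, None]).astype(float)   # 1{a_i = j}
    return np.einsum('njk,nj->nk', z, d - P)     # sum_j z(j, x_i)[1{a_i=j} - P]

def bhhh_step(theta, z, a):
    """One BHHH iteration: theta + (sum_i s_i s_i')^{-1} (sum_i s_i)."""
    s = clogit_scores(theta, z, a)
    return theta + np.linalg.solve(s.T @ s, s.sum(axis=0))
```

Global concavity means that iterating bhhh_step (possibly with a step-length adjustment) converges to the unique MLE from any starting value.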
DP - McFadden’s Conditional Logit
Consider the DP-CLogit:

$$a_{it} = \arg\max_{j \in \{0,1,...,J\}} \left[ v(j, x_{it}, \theta) + \varepsilon_{it}(j) \right]$$

where

$$v(j, x_{it}, \theta) \equiv z(j, x_{it})\, \theta + \beta\, f_x(j, x_{it})\, \mathbf{V}(\theta)$$

with $f_x(j, x_{it})$ denoting the $1 \times M$ row of $F_x(j)$ for state $x_{it}$, and the vector $\mathbf{V}(\theta)$ solving:

$$\mathbf{V}(\theta) = \ln \left( \exp\{\mathbf{z}(0)\, \theta + \beta\, F_x(0)\, \mathbf{V}(\theta)\} + ... + \exp\{\mathbf{z}(J)\, \theta + \beta\, F_x(J)\, \mathbf{V}(\theta)\} \right)$$
The CCPs are:

$$P(j \mid x_{it}, \theta) = \frac{\exp\{z(j, x_{it})\, \theta + \beta\, f_x(j, x_{it})\, \mathbf{V}(\theta)\}}{\sum_{k=0}^{J} \exp\{z(k, x_{it})\, \theta + \beta\, f_x(k, x_{it})\, \mathbf{V}(\theta)\}}$$
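Given a solved value vector $\mathbf{V}(\theta)$, e.g., from the value function iteration sketch above, the CCPs follow in one vectorized step. A sketch under the same hypothetical array conventions, with z stored here as a (J+1) × M × K array of state-level regressors:

```python
import numpy as np

def dp_ccp(theta, z, F, beta, V):
    """CCPs of the DP-CLogit given the solved value vector V(theta).
    z: (J+1, M, K); F: (J+1, M, M); V: (M,); returns a (J+1, M) array."""
    v = z @ theta + beta * (F @ V)            # v(j, x, theta) for every (j, x)
    v -= v.max(axis=0, keepdims=True)         # stabilize the exponentials
    P = np.exp(v)
    return P / P.sum(axis=0, keepdims=True)   # P(j | x, theta)
```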
DP - McFadden’s Conditional Logit (2)
The log-likelihood function of the DP-CLogit is:

$$l(\theta) = \sum_{i,t} \left[ \sum_{j=0}^{J} 1\{a_{it} = j\}\, v(j, x_{it}, \theta) - \ln \sum_{k=0}^{J} \exp\{v(k, x_{it}, \theta)\} \right]$$

And the likelihood equations:

$$\frac{\partial l(\theta)}{\partial \theta} = \sum_{i,t} \frac{\partial l_{it}(\theta)}{\partial \theta} \quad \text{with:}$$

$$\frac{\partial l_{it}(\theta)}{\partial \theta'} = \sum_{j=0}^{J} \left[ z(j, x_{it}) + \beta\, f_x(j, x_{it})\, \frac{\partial \mathbf{V}(\theta)}{\partial \theta'} \right] \left[ 1\{a_{it} = j\} - P(j \mid x_{it}, \theta) \right]$$

and

$$\frac{\partial \mathbf{V}(\theta)}{\partial \theta'} = \left[ I - \beta \sum_{j=0}^{J} P(j, \theta) * F_x(j) \right]^{-1} \left[ \sum_{j=0}^{J} P(j, \theta) * \mathbf{z}(j) \right]$$

where $P(j, \theta)$ is the $M \times 1$ vector of CCPs for choice $j$, and $*$ denotes multiplying each row of $F_x(j)$ (or $\mathbf{z}(j)$) by the corresponding element of $P(j, \theta)$.
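A sketch of this Jacobian, under the same array conventions as before; the row-weighting operation $*$ is implemented with einsum.

```python
import numpy as np

def value_jacobian(P, z, F, beta):
    """dV(theta)/dtheta' (M x K) from the implicit-function formula
    [I - beta sum_j P(j)*F_x(j)]^{-1} [sum_j P(j)*z(j)].
    P: (J+1, M) CCPs; z: (J+1, M, K); F: (J+1, M, M)."""
    M = F.shape[1]
    # Weight row m of F_x(j) and z(j) by P(j | x(m)), then sum over j.
    A = np.eye(M) - beta * np.einsum('jm,jmn->mn', P, F)
    B = np.einsum('jm,jmk->mk', P, z)
    return np.linalg.solve(A, B)      # solve A (dV/dtheta') = B
```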
Nested Fixed Point Algorithm (NFXP)
The NFXP algorithm (Rust, 1987) is an iterative gradient search method to obtain the MLE of the structural parameters. This algorithm nests:

(1) a BHHH method (outer algorithm), which searches for a root of the likelihood equations;

(2) with a value function iteration method (inner algorithm), which solves the DP problem for each trial value of the structural parameters.
Nested Fixed Point Algorithm (2)
The algorithm is initialized with an arbitrary vector $\hat{\theta}_0$.

Outer Algorithm: the BHHH iteration is defined as:

$$\hat{\theta}_{k+1} = \hat{\theta}_k + \left( \sum_{i,t} \frac{\partial l_{it}(\hat{\theta}_k)}{\partial \theta} \frac{\partial l_{it}(\hat{\theta}_k)}{\partial \theta'} \right)^{-1} \left( \sum_{i,t} \frac{\partial l_{it}(\hat{\theta}_k)}{\partial \theta} \right)$$

Inner Algorithm: value function iterations to solve the DP problem given $\hat{\theta}_k$, to calculate $\mathbf{V}(\hat{\theta}_k)$ and the corresponding $\partial l_{it}(\hat{\theta}_k) / \partial \theta$.
NFXP with CLOGIT (1)
We start with an arbitrary initial guess $\hat{\theta}_0$.

Then, we obtain the vector $\mathbf{V}(\hat{\theta}_0)$ by using value function iterations:

$$\mathbf{V}^k = \ln \left( \exp\{\mathbf{z}(0)\, \hat{\theta}_0 + \beta\, F_x(0)\, \mathbf{V}^{k-1}\} + ... + \exp\{\mathbf{z}(J)\, \hat{\theta}_0 + \beta\, F_x(J)\, \mathbf{V}^{k-1}\} \right)$$

until convergence.

Given $\mathbf{V}(\hat{\theta}_0)$, we calculate the CCPs:

$$P(j \mid x, \hat{\theta}_0) = \frac{\exp\{z(j, x)\, \hat{\theta}_0 + \beta\, f_x(j, x)\, \mathbf{V}(\hat{\theta}_0)\}}{\sum_{k=0}^{J} \exp\{z(k, x)\, \hat{\theta}_0 + \beta\, f_x(k, x)\, \mathbf{V}(\hat{\theta}_0)\}}$$
NFXP with CLOGIT (2)
And given the CCPs $P(j \mid x, \hat{\theta}_0)$, we can make a BHHH iteration to obtain a new value of $\theta$:

$$\hat{\theta}_1 = \hat{\theta}_0 + \left( \sum_{i,t} \frac{\partial l_{it}(\hat{\theta}_0)}{\partial \theta} \frac{\partial l_{it}(\hat{\theta}_0)}{\partial \theta'} \right)^{-1} \left( \sum_{i,t} \frac{\partial l_{it}(\hat{\theta}_0)}{\partial \theta} \right)$$

where

$$\frac{\partial l_{it}(\hat{\theta}_0)}{\partial \theta'} = \sum_{j=0}^{J} \left[ z(j, x_{it}) + \beta\, f_x(j, x_{it})\, \frac{\partial \mathbf{V}(\hat{\theta}_0)}{\partial \theta'} \right] \left[ 1\{a_{it} = j\} - P(j \mid x_{it}, \hat{\theta}_0) \right]$$

and

$$\frac{\partial \mathbf{V}(\hat{\theta}_0)}{\partial \theta'} = \left[ I - \beta \sum_{j=0}^{J} P(j, \hat{\theta}_0) * F_x(j) \right]^{-1} \left[ \sum_{j=0}^{J} P(j, \hat{\theta}_0) * \mathbf{z}(j) \right]$$
NFXP with CLOGIT (3)

If $\hat{\theta}_1 - \hat{\theta}_0$ satisfies a convergence criterion, then $\hat{\theta}_1$ is the MLE. Otherwise, we apply the same steps as before, but now to $\hat{\theta}_1$:

- Value function iterations to obtain $\mathbf{V}(\hat{\theta}_1)$, and the corresponding $P(j \mid x, \hat{\theta}_1)$ and $\partial \mathbf{V}(\hat{\theta}_1) / \partial \theta'$;

- A BHHH iteration to obtain $\hat{\theta}_2$.

And so on until convergence, as in the sketch below.
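Putting the pieces together, a minimal end-to-end sketch of the NFXP loop, reusing the hypothetical solve_integrated_bellman, dp_ccp, and value_jacobian functions sketched earlier; a and x are the pooled panel of choices and discrete states coded 0, ..., M − 1.

```python
import numpy as np

def nfxp(theta0, a, x, z, F, beta, tol=1e-6, max_iter=100):
    """NFXP sketch: outer BHHH search, inner value function iterations.
    a, x: (N,) pooled observations; z: (J+1, M, K); F: (J+1, M, M)."""
    theta = np.asarray(theta0, dtype=float)
    n_choices = F.shape[0]
    for _ in range(max_iter):
        # Inner algorithm: solve the DP at the current trial value of theta.
        V = solve_integrated_bellman(z @ theta, F, beta)
        P = dp_ccp(theta, z, F, beta, V)              # (J+1, M) CCPs
        dV = value_jacobian(P, z, F, beta)            # (M, K)
        # Scores: sum_j [z(j,x) + beta f_x(j,x) dV]'[1{a=j} - P(j|x)].
        W = z + beta * (F @ dV)                       # (J+1, M, K)
        resid = (np.arange(n_choices)[:, None] == a).astype(float) - P[:, x]
        s = np.einsum('jnk,jn->nk', W[:, x, :], resid)   # (N, K) scores
        # Outer algorithm: one BHHH step.
        step = np.linalg.solve(s.T @ s, s.sum(axis=0))
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta
```

Since the DP-CLogit log-likelihood is not globally concave in general (see the next slide), trying several starting values $\hat{\theta}_0$ is prudent.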
NFXP: Advantages and limitations
The main advantages of the NFXP algorithm are its conceptual simplicity and, most importantly, that it provides the MLE, which is the asymptotically most efficient estimator under the assumptions of the model.

The main limitation of this algorithm is its computational cost. In particular, the DP problem must be solved for each trial value of the structural parameters.

Note: Even for the DP-CLogit, the log-likelihood function is not globally concave in general.
EXERCISES
[1] In your favorite programming language, write code for the implementation of the NFXP in the Entry-Exit model with logit errors.

[2] Given a choice for the "true" $\theta$, write code to generate a simulated sample from the "true" model. Use this sample to estimate $\theta$.

[3] In the general DP-CLogit, show that in general the log-likelihood is not globally concave. Obtain sufficient conditions on $z(j, x)$ and/or $f_x(j, x_{it})$ that imply global concavity.
Hotz-Miller Estimator
* Main idea: to estimate $\theta$ consistently, we do not have to solve the DP problem even once.