EC501 Econometric Methods and Applications 4. Large Sample Methods Marcus Chambers

Department of Economics
University of Essex
31 October 2013
Outline
1. Review
2. Large sample concepts
3. The method of maximum likelihood
4. Large sample hypothesis tests
Reference: Greene, chapter 14 and Appendix D.
Review
CLRM: $y = X\beta + \epsilon$.
The OLS estimator, $b = (X'X)^{-1}X'y$, is BLUE.
Hypothesis tests: t- and F-statistics.
Exact t- and F-distributions rely on assumption of normality.
We shall attempt to see what happens when we relax the
assumption of normality in due course.
To do this we shall use large sample (asymptotic) methods.
Large sample concepts
If we wish to relax some of the assumptions of the CLRM
then exact finite sample results are typically not available.
For example, if we relax the normality assumption,
t-statistics no longer have t-distributions, F-statistics no
longer have F-distributions etc.
Hence the critical values from these distributions are not
correct, and incorrect inferences may be drawn from the
tests.
We therefore use large sample methods to find out the
properties of estimators and test statistics as n → ∞.
For large enough n we treat the asymptotic results as
holding approximately.
Consider a sequence of numbers indexed by n, e.g.
$$\{x_n = e^{-n}\} = \left\{ \frac{1}{e}, \frac{1}{e^2}, \frac{1}{e^3}, \ldots, \frac{1}{e^n}, \ldots \right\}.$$
We can define the limit of this sequence as n → ∞:
$$\lim_{n\to\infty} x_n = \lim_{n\to\infty} e^{-n} = 0.$$
The sequence {xn } is said to converge to zero.
What happens if the elements are random variables?
The sequence of random variables $\{x_n\}$ is said to
converge in probability to a constant c if
$$\lim_{n\to\infty} \Pr(|x_n - c| > \epsilon) = 0 \quad \text{for every } \epsilon > 0.$$
This is written
$$x_n \xrightarrow{p} c \quad \text{or} \quad \operatorname{plim} x_n = c.$$
In words: for any positive number $\epsilon$, however small, the
probability that the distance between $x_n$ and c is larger than
$\epsilon$ converges to zero as n gets larger and larger.
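The definition can be checked by simulation. The sketch below is illustrative only: it assumes $x_n$ is the mean of n Exponential(1) draws (so c = 1) and fixes a single tolerance eps = 0.1, whereas the definition requires the limit to hold for every positive tolerance. The names `exceed_prob` and `probs` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
c = 1.0        # mean of the Exponential(1) draws (assumed example)
eps = 0.1      # one fixed tolerance; the definition covers every eps > 0
reps = 2000    # Monte Carlo replications per sample size

def exceed_prob(n):
    """Estimate Pr(|xbar_n - c| > eps) from `reps` simulated samples of size n."""
    xbar = rng.exponential(scale=c, size=(reps, n)).mean(axis=1)
    return np.mean(np.abs(xbar - c) > eps)

probs = {n: exceed_prob(n) for n in (10, 100, 1000)}
for n, p in probs.items():
    print(f"n = {n:5d}: estimated Pr(|xbar_n - c| > eps) = {p:.3f}")
```

The estimated probabilities should shrink towards zero as n grows, which is what plim $x_n$ = c asserts.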
If plim b = β then b is a consistent estimator of β.
Consistency is a large sample version of unbiasedness.
A useful property of the plim operator is:
Slutsky’s Theorem
If g(·) is a continuous function and plim $x_n = c$, then
$$\operatorname{plim} g(x_n) = g(\operatorname{plim} x_n) = g(c)$$
(see D-12 on p.1113 of Greene).
This is not a property shared by the expectations operator:
in general, $E[g(x)] \neq g[E(x)]$ for a random variable x.
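A small numerical illustration of the contrast, under assumptions that are not in the slides: x is N(2, 1) and g(x) = x². Then g(E[x]) = 4 while E[g(x)] = μ² + σ² = 5, yet g applied to the sample mean still settles near g(μ).

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 2.0, 1.0          # assumed example values
g = np.square                 # a continuous function g(x) = x^2

x = rng.normal(mu, sigma, size=200_000)
g_of_mean = g(x.mean())       # g(xbar_n): near g(mu) = 4 when n is large
mean_of_g = g(x).mean()       # sample analogue of E[g(x)] = mu^2 + sigma^2 = 5

print(f"g(xbar_n)        = {g_of_mean:.3f}")
print(f"mean of g(x_i)   = {mean_of_g:.3f}")
```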
Another useful result is:
Chebychev’s Lemma
If $x_n$ is a random sequence such that
$$\lim_{n\to\infty} E(x_n) = c \quad \text{and} \quad \lim_{n\to\infty} \operatorname{var}(x_n) = 0,$$
then plim $x_n = c$ (see D-1 on p.1107 of Greene).
This enables us to establish consistency by examining the
limiting properties of the expectation and variance.
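As a worked example of the lemma, consider the sample mean $\bar{x}_n$ of n observations. The assumptions here (independent draws with common mean μ and finite variance σ²) are chosen for illustration and are not taken from the slides.

```latex
\[
E(\bar{x}_n) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \mu,
\qquad
\operatorname{var}(\bar{x}_n) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{var}(x_i) = \frac{\sigma^2}{n}.
\]
\[
\lim_{n\to\infty} E(\bar{x}_n) = \mu
\quad\text{and}\quad
\lim_{n\to\infty} \operatorname{var}(\bar{x}_n) = 0
\quad\Longrightarrow\quad
\operatorname{plim} \bar{x}_n = \mu .
\]
```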
Figure: convergence to a constant θ, illustrated by the
variance of the distribution becoming smaller as n
increases.
We are also interested in the distribution of random
variables.
Suppose Fn (·), the distribution function of xn , converges to
a distribution function F(·) as n → ∞.
Then F(·) is the limiting distribution of xn , and if x is a
random variable having distribution function F(·), then xn is
said to converge in distribution to x.
For example, if $x \sim N(0, \sigma^2)$ and $x_n$ converges in
distribution to x, then we write
$$x_n \xrightarrow{d} x \sim N(0, \sigma^2) \quad \text{or} \quad x_n \xrightarrow{d} N(0, \sigma^2).$$
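Convergence in distribution can be visualised by comparing distribution functions. The sketch below assumes a specific example that is not in the slides: $z_n = \sqrt{n}(\bar{x}_n - 1)$ for Exponential(1) data, whose limiting distribution is N(0, 1) by the central limit theorem. It measures the largest gap between the simulated $F_n$ and the limit F on a small grid; `max_cdf_gap` is a hypothetical helper.

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def normal_cdf(t):
    """Standard normal distribution function F(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def max_cdf_gap(n, reps=20_000):
    """Largest |F_n(t) - F(t)| over a grid, with F_n estimated by simulation."""
    # Exponential(1) has mean 1 and variance 1, so z_n should approach N(0, 1).
    xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
    z = math.sqrt(n) * (xbar - 1.0)
    grid = np.linspace(-2.5, 2.5, 11)
    f_n = np.array([(z <= t).mean() for t in grid])
    f = np.array([normal_cdf(t) for t in grid])
    return float(np.max(np.abs(f_n - f)))

gaps = {n: max_cdf_gap(n) for n in (2, 20, 200)}
for n, gap in gaps.items():
    print(f"n = {n:3d}: max |F_n - F| on grid = {gap:.3f}")
```

The gap for n = 200 should be much smaller than for n = 2, although some simulation noise remains at any finite number of replications.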
A useful result concerning convergence in distribution is:
Cramér’s Theorem
If $A_n$ is a matrix sequence such that plim $A_n = A$, and $b_n$ is a
vector sequence such that $b_n \xrightarrow{d} b \sim N(0, Q)$, then
$$A_n b_n \xrightarrow{d} Ab \sim N(0, AQA').$$
This is useful in studying the limiting distribution of the OLS
estimator.
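The covariance part of the theorem can be checked numerically. Everything in this sketch is an assumed illustration: the matrices A and Q, the use of scaled uniform draws to build a non-normal $b_n$, and the construction of $A_n$ as A plus noise that vanishes at rate $1/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[1.0, 2.0], [0.0, 1.0]])   # assumed probability limit of A_n
Q = np.array([[1.0, 0.5], [0.5, 2.0]])   # assumed limiting covariance of b_n
chol = np.linalg.cholesky(Q)
n, reps = 200, 5_000

draws = np.empty((reps, 2))
for r in range(reps):
    # Uniform(-sqrt(3), sqrt(3)) has mean 0 and variance 1; mixing by chol gives covariance Q.
    u = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, 2)) @ chol.T
    b_n = np.sqrt(n) * u.mean(axis=0)                  # approximately N(0, Q) for large n
    A_n = A + rng.normal(size=(2, 2)) / np.sqrt(n)     # plim A_n = A
    draws[r] = A_n @ b_n

sample_cov = np.cov(draws, rowvar=False)   # simulated covariance of A_n b_n
target_cov = A @ Q @ A.T                   # covariance predicted by the theorem
print(np.round(sample_cov, 2))
print(target_cov)
```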
The method of maximum likelihood
Consider the classical model
$$y_i = x_i'\beta + \epsilon_i, \quad i = 1, \ldots, n; \quad x_i \text{ nonrandom}; \quad \epsilon_i \sim NID(0, \sigma^2). \tag{1}$$
NB: ‘NID(0, σ²)’ means ‘normally and independently
distributed with mean zero and variance σ²’.
The normality assumption means the pdf for $\epsilon_i$ is
$$f(\epsilon_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\epsilon_i^2}{2\sigma^2}\right), \quad -\infty < \epsilon_i < \infty. \tag{2}$$
The independence of the $\epsilon_i$ means the joint pdf for the
n × 1 vector $\epsilon$ is:
$$f(\epsilon) = \prod_{i=1}^{n} f(\epsilon_i) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left(-\frac{\sum_{i=1}^{n} \epsilon_i^2}{2\sigma^2}\right). \tag{3}$$
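Because $y = X\beta + \epsilon$ and $\sum \epsilon_i^2 = \epsilon'\epsilon = (y - X\beta)'(y - X\beta)$ we
obtain the joint pdf for y:
$$f(y) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left(-\frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}\right). \tag{4}$$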
This is a function of y for given β and σ 2 .
But in econometrics we need to estimate β and σ 2 for
given y.
The method of maximum likelihood takes the probability
density in (4) and chooses the values of β and σ 2 which
are most likely to have given the observed y.
When regarded as a function of β and σ 2 for given y the
function in (4) is called the likelihood function L(β, σ 2 ; y):
$$L(\beta, \sigma^2; y) = \left(2\pi\sigma^2\right)^{-n/2} \exp\left(-\frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}\right). \tag{5}$$
We need to maximise (5) with respect to β and σ².
It is easiest to take logs:
$$\ln L = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{S(\beta)}{2\sigma^2}, \tag{6}$$
where $S(\beta) = (y - X\beta)'(y - X\beta)$ is the familiar sum of
squares function.
The first-order conditions for the maximisation are:
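$$\frac{\partial \ln L}{\partial \beta} = -\frac{\partial S(\beta)}{\partial \beta} \cdot \frac{1}{2\sigma^2} = 0 \;\Rightarrow\; \frac{\partial S(\hat\beta_{ML})}{\partial \beta} = 0; \tag{7}$$
$$\frac{\partial \ln L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{S(\beta)}{2\sigma^4} = 0 \;\Rightarrow\; \hat\sigma^2_{ML} = \frac{S(\hat\beta_{ML})}{n}. \tag{8}$$
Clearly, from (7), $\hat\beta_{ML} = b$, the OLS estimator:
$$\hat\beta_{ML} = (X'X)^{-1}X'y, \tag{9}$$
$$\hat\sigma^2_{ML} = \frac{(y - X\hat\beta_{ML})'(y - X\hat\beta_{ML})}{n} \neq s^2. \tag{10}$$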
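The two estimators can be computed directly on simulated data. This is a minimal sketch with assumed values for n, the number of regressors k and the true parameters. It takes $s^2 = e'e/(n - k)$ as the usual unbiased variance estimator, a definition the slides use but do not restate, and it also confirms that both first-order conditions hold at the MLE.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 50, 3                                           # assumed sample size and regressor count
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 0.5, -2.0])                 # assumed true coefficients
y = X @ beta_true + rng.normal(scale=1.5, size=n)

beta_ml = np.linalg.solve(X.T @ X, X.T @ y)            # equation (9): same as OLS b
resid = y - X @ beta_ml
S = resid @ resid                                      # S(beta) evaluated at the MLE
sigma2_ml = S / n                                      # equation (10): divides by n
s2 = S / (n - k)                                       # unbiased estimator: divides by n - k

# Scores from (7) and (8) evaluated at the MLE; both should be numerically zero.
score_beta = X.T @ resid / sigma2_ml
score_sigma2 = -n / (2 * sigma2_ml) + S / (2 * sigma2_ml**2)

print(f"sigma2_ml = {sigma2_ml:.4f}, s2 = {s2:.4f}")
```

Since n > n − k, the ML variance estimate is always smaller than $s^2$, and the two coincide only in the limit as n grows.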
Consider a general estimation problem where we want to
estimate an m × 1 parameter vector θ whose true value is
θ0 .
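Let
$$g(\theta) = \frac{\partial \ln L(\theta)}{\partial \theta} \;\; (m \times 1), \qquad H(\theta) = \frac{\partial^2 \ln L(\theta)}{\partial \theta \, \partial \theta'} \;\; (m \times m).$$
Note that the maximum likelihood estimator (MLE) $\hat\theta$
satisfies $g(\hat\theta) = 0$.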
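The MLE $\hat\theta$ has the following properties:
1. Consistency: plim $\hat\theta = \theta_0$.
2. Asymptotic normality: $\hat\theta \overset{a}{\sim} N\left(\theta_0, I(\theta_0)^{-1}\right)$, where
$I(\theta) = -E[H(\theta)]$ is the information matrix.
3. Asymptotic efficiency: $\hat\theta$ achieves the Cramér-Rao lower
bound for consistent estimators, meaning that
$\operatorname{var}(\hat\theta) = I(\theta_0)^{-1}$
(in general, for an estimator $\tilde\theta$, $\operatorname{var}(\tilde\theta) \geq I(\theta_0)^{-1}$).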
Although we have concentrated on the CLRM, maximum
likelihood can be applied in a wide variety of problems,
provided we can write down the density function!
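As an illustration outside the CLRM, the sketch below assumes an Exponential model with rate θ (an example not in the slides). There $\ln L = n \ln\theta - \theta \sum x_i$, so the MLE is $1/\bar{x}$ and $I(\theta) = n/\theta^2$. The simulation compares the Monte Carlo variance of the MLE with $I(\theta_0)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(5)
theta0 = 2.0                  # assumed true rate
n, reps = 400, 5_000          # assumed sample size and number of replications

x = rng.exponential(scale=1.0 / theta0, size=(reps, n))
theta_hat = 1.0 / x.mean(axis=1)       # MLE of the rate in each replication

info = n / theta0**2                   # I(theta0) = -E[H(theta0)] = n / theta0^2
asy_var = 1.0 / info                   # asymptotic variance I(theta0)^{-1}
mc_var = theta_hat.var(ddof=1)         # simulated variance of the MLE

print(f"mean of theta_hat = {theta_hat.mean():.4f}")
print(f"simulated variance = {mc_var:.5f}, I(theta0)^-1 = {asy_var:.5f}")
```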
Large sample hypothesis tests
Sometimes we want to test hypotheses involving
nonlinear restrictions of the form
$$H_0: c(\theta) = 0 \quad \text{against} \quad H_1: c(\theta) \neq 0 \tag{11}$$
where θ is an m × 1 vector of parameters and the function
$c: \mathbb{R}^m \to \mathbb{R}^J$ ($J \leq m$), i.e. c(θ) is J × 1.
Linear restrictions are a special case: c(θ) = Rθ − q.
Let $\hat\theta$ be the unrestricted MLE and $\hat\theta_R$ be the restricted
MLE:
$$\hat\theta = \arg\max_{\theta} L(\theta); \qquad \hat\theta_R = \arg\max_{\theta} L(\theta) \;\; \text{s.t.} \;\; c(\theta) = 0,$$
where L(θ) is the likelihood function.
There are three large sample tests based on L(θ):
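Wald Test: Based on unrestricted estimator $\hat\theta$:
$$W = c(\hat\theta)' \left[ C(\hat\theta)\, I(\hat\theta)^{-1}\, C(\hat\theta)' \right]^{-1} c(\hat\theta) \xrightarrow{d} \chi^2_J \tag{12}$$
under H0 as n → ∞, where
$$C(\theta) = \frac{\partial c(\theta)}{\partial \theta'} \;\; (J \times m)$$
and I(θ) denotes the information matrix
$$I(\theta) = -E\left[\frac{\partial^2 \ln L(\theta)}{\partial \theta \, \partial \theta'}\right] \;\; (m \times m).$$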
Easy to use when restrictions are difficult to impose on the
model.
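A sketch of the Wald statistic for one nonlinear restriction in the CLRM, with all specifics assumed for illustration: the restriction is $\beta_1 \beta_2 = 1$ (J = 1), and the data are simulated so that H0 is true. Because the restriction does not involve σ² and the CLRM information matrix is block diagonal between β and σ², only the β block $\hat\sigma^2_{ML}(X'X)^{-1}$ of $I(\hat\theta)^{-1}$ is needed.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, 0.5])          # beta_1 * beta_2 = 1, so H0 holds (assumed)
y = X @ beta_true + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)          # unrestricted MLE of beta
resid = y - X @ b
sigma2_ml = resid @ resid / n
avar_b = sigma2_ml * np.linalg.inv(X.T @ X)    # beta block of I(theta_hat)^{-1}

c = np.array([b[1] * b[2] - 1.0])              # c(theta_hat), J x 1
C = np.array([[0.0, b[2], b[1]]])              # C(theta_hat) = dc/dbeta', J x 3
W = float(c @ np.linalg.solve(C @ avar_b @ C.T, c))

crit_5pct = 3.841                              # 5% critical value of chi-squared(1)
print(f"W = {W:.3f}; reject H0 at 5%: {W > crit_5pct}")
```

Only the unrestricted fit is used, so the restriction never has to be imposed during estimation.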
Likelihood Ratio Test: Based on both $\hat\theta$ and $\hat\theta_R$.
Let
$$\lambda = \frac{L(\hat\theta_R)}{L(\hat\theta)}.$$
Then LR = −2 ln λ, i.e.
$$LR = 2\left[\ln L(\hat\theta) - \ln L(\hat\theta_R)\right] \xrightarrow{d} \chi^2_J \tag{13}$$
under H0 as n → ∞.
It can be of interest to compare $\hat\theta$ and $\hat\theta_R$.
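A minimal LR example, assuming an Exponential model with rate θ and the simple hypothesis H0: θ = 2 (neither is from the slides). With a fully specified null the restricted MLE is just the hypothesised value, and J = 1.

```python
import numpy as np

rng = np.random.default_rng(7)
theta_null = 2.0                                        # hypothesised rate under H0 (assumed)
x = rng.exponential(scale=1.0 / theta_null, size=300)   # data generated under H0

def loglik(theta, data):
    """Exponential log-likelihood: n ln(theta) - theta * sum(x)."""
    return data.size * np.log(theta) - theta * data.sum()

theta_hat = 1.0 / x.mean()        # unrestricted MLE
theta_r = theta_null              # restricted MLE under H0
LR = 2.0 * (loglik(theta_hat, x) - loglik(theta_r, x))

print(f"theta_hat = {theta_hat:.3f}, LR = {LR:.3f}")
```

Both the restricted and the unrestricted log-likelihoods enter the statistic, which is why the LR test requires estimation under both hypotheses.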
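Lagrange Multiplier Test: Based on $\hat\theta_R$:
$$LM = g(\hat\theta_R)' \left[ I(\hat\theta_R) \right]^{-1} g(\hat\theta_R) \xrightarrow{d} \chi^2_J \tag{14}$$
under H0 as n → ∞, where
$$g(\theta) = \frac{\partial \ln L(\theta)}{\partial \theta}.$$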
Easy to use when it is difficult to estimate the unrestricted
model.
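A minimal LM example under assumptions that are not in the slides: an Exponential model with rate θ and H0: θ = 2. The score is $g(\theta) = n/\theta - \sum x_i$ and the information is $I(\theta) = n/\theta^2$, so the statistic needs only the restricted value of θ.

```python
import numpy as np

rng = np.random.default_rng(9)
theta_null = 2.0                                        # hypothesised rate under H0 (assumed)
x = rng.exponential(scale=1.0 / theta_null, size=300)   # data generated under H0
n = x.size

def score(theta):
    """g(theta) = d ln L / d theta for ln L = n ln(theta) - theta * sum(x)."""
    return n / theta - x.sum()

def information(theta):
    """I(theta) = -E[H(theta)] = n / theta^2."""
    return n / theta**2

theta_r = theta_null                                    # restricted MLE under H0
LM = score(theta_r) ** 2 / information(theta_r)         # scalar version of (14)

print(f"score at theta_r = {score(theta_r):.3f}, LM = {LM:.3f}")
```

The statistic asks whether the score is far from zero at the restricted estimate; at the unrestricted MLE the score is exactly zero.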
Figure: the three tests (W, LR and LM) depicted graphically.
Which test should be used, and when?
A key consideration is how easy it is to estimate under H0
and H1 .
The tests are, however, asymptotically equivalent under
the null hypothesis.
But note that different inferences can be drawn from testing
the same hypothesis with the different tests.
In the CLRM, for example, it can be shown that
LM ≤ LR ≤ W.
In such circumstances, with a common χ²(J) critical value, if LM rejects H0 then so will LR and
W, while if W does not reject H0 then neither will LR nor LM.
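The ordering can be verified numerically. The sketch relies on the standard closed forms for linear restrictions in the CLRM with normal errors, which the slides do not derive and which are stated here as assumptions: W = n(SSR_R − SSR_U)/SSR_U, LR = n ln(SSR_R/SSR_U) and LM = n(SSR_R − SSR_U)/SSR_R, where SSR_R and SSR_U are the restricted and unrestricted residual sums of squares. The data and the restriction (last two coefficients equal to zero, J = 2) are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 0.5, 0.2, 0.0]) + rng.normal(size=n)   # assumed true model

def ssr(Xmat, yvec):
    """Residual sum of squares from an OLS fit of yvec on Xmat."""
    coef, *_ = np.linalg.lstsq(Xmat, yvec, rcond=None)
    e = yvec - Xmat @ coef
    return e @ e

ssr_u = ssr(X, y)             # unrestricted model
ssr_r = ssr(X[:, :2], y)      # restricted model: last two coefficients set to zero

W = n * (ssr_r - ssr_u) / ssr_u
LR = n * np.log(ssr_r / ssr_u)
LM = n * (ssr_r - ssr_u) / ssr_r

print(f"LM = {LM:.3f} <= LR = {LR:.3f} <= W = {W:.3f}")
```

Writing x = (SSR_R − SSR_U)/SSR_U ≥ 0, the three statistics are n·x/(1 + x), n·ln(1 + x) and n·x, so the ordering follows from x/(1 + x) ≤ ln(1 + x) ≤ x.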
Summary
large sample convergence concepts
maximum likelihood estimation
large sample hypothesis tests
Next week:
OLS in large samples
instrumental variables estimation