Chapter 2: Large sample theory
Part IV
Florian Pelgrin
HEC
September-December, 2010
1. Introduction
Under certain conditions, the OLS estimator is BLUE, or one can derive its exact sampling distribution (e.g., in the Gaussian linear model).
In a more general framework (without the Gauss-Markov assumptions or the normality assumption), it is not always possible to find (best linear) unbiased estimators or to derive the exact sampling distribution.
What can be done?
Solutions:
1. One may settle for estimators that are (weakly, strongly) consistent, meaning that as the sample size goes to infinity, the distribution of the estimator collapses to the true parameter value:
$$\hat{\beta}_{n,OLS} \xrightarrow[n\to\infty]{?} \beta_0$$
where $\beta_0$ is the true unknown parameter vector.
Consistency is then the "counterpart" of unbiasedness for large samples.
2. Asymptotic or large sample theory tells us about the distribution of the OLS estimator when the sample is sufficiently large:
$$\sqrt{n}\left(\hat{\beta}_{n,OLS} - \beta_0\right) \xrightarrow[n\to\infty]{?} \mathcal{N}(\cdot,\cdot)$$
To obtain these results, one applies:
- The Law of Large Numbers (LLN);
- The Central Limit Theorem (CLT).

Using the law of large numbers (and suitable assumptions) shows the consistency property:
$$\hat{\beta}_{n,OLS} \xrightarrow[n\to\infty]{?} \beta_0$$

Using the central limit theorem (and suitable assumptions) shows the large sample distribution:
$$\sqrt{n}\left(\hat{\beta}_{n,OLS} - \beta_0\right) \xrightarrow[n\to\infty]{?} \mathcal{N}(\cdot,\cdot)$$
The law of large numbers and the central limit theorem invoke different modes of convergence. Indeed, there is no single, all-purpose definition of convergence for random variables (vectors, matrices), unlike for sequences of real numbers.
Among others:
- Almost sure convergence (a.s.)
- Convergence in probability (p)
- Convergence in $L^2$, in mean square, or in quadratic mean (m.s. or $L^2$)
- Convergence in distribution (d)

Each convergence notion provides an essential foundation for certain results and requires suitable assumptions.
Some relations exist between these different modes:
almost sure $\Rightarrow$ probability $\Rightarrow$ distribution, and quadratic mean $\Rightarrow$ probability $\Rightarrow$ distribution.
All in all, different laws of large numbers and central limit theorems exist, i.e., different sets of conditions that apply to different kinds of economic and financial data (e.g., time series versus cross-section data).
Outline
1. Introduction
2. Modes of convergence
   - Almost sure convergence
   - Convergence in probability
   - Convergence in mean square
   - Convergence in distribution
   - Some handy theorems
3. Law of large numbers
   - Overview
   - Law of large numbers in the i.i.d. case
4. Consistency of the OLS estimator
5. Central limit theorems
   - Overview
   - Univariate central limit theorem with i.i.d. observations
   - Multivariate central limit theorem with i.i.d. observations
6. Large sample distribution of the OLS estimator
7. Extension: Delta method
   - Example
   - Univariate Delta method
   - Multivariate Delta method
8. Summary
2. Modes of convergence

We are mainly concerned with four modes of convergence:
1. Almost sure convergence
2. Convergence in probability
3. Convergence in quadratic mean
4. Convergence in distribution.
2.1. Almost sure convergence

Definition (Almost sure convergence)
Let $X_1, X_2, \dots, X_n$ be a sequence of (real-valued) random variables and let $X$ be a stochastic or non-stochastic variable. $X_n$ converges almost surely to $X$ if, for every $\varepsilon > 0$:
$$P\left(\lim_{n\to\infty} |X_n - X| < \varepsilon\right) = 1.$$
It is written:
$$X_n \xrightarrow{a.s.} X.$$
Remark: For the sake of simplicity, we assume that $X$ is a degenerate random variable, i.e., a real number, say $c$.
Definition
A point estimator $\hat{\theta}_n$ of $\theta_0$ is strongly consistent if:
$$\hat{\theta}_n \xrightarrow{a.s.} \theta_0.$$
2.2. Convergence in probability
Definition
Let $\{X_i\}$, $i = 1, \dots, n$, be a sequence of real-valued random variables. $X_n$ converges in probability to $c$, written $X_n \xrightarrow{p} c$ or $\operatorname{plim} X_n = c$, if there exists $c$ such that for all $\varepsilon > 0$:
$$\lim_{n\to\infty} P\left(|X_n - c| > \varepsilon\right) = 0$$

Remarks:
1. $X_n$ is very likely to be close to $c$ for large $n$, but what about the location of the remaining small probability mass which is not close to $c$?
2. Convergence in probability allows more erratic behavior in the converging sequence than almost sure convergence.
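Illustration (a minimal NumPy sketch, not part of the original slides; the uniform distribution, tolerance, and sample sizes are arbitrary choices): the Monte Carlo estimate of $P(|\bar{X}_n - 0.5| > \varepsilon)$ shrinks toward zero as $n$ grows, exactly as the definition requires.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05            # tolerance epsilon (arbitrary)
reps = 2000           # Monte Carlo replications per sample size
for n in [10, 100, 1000, 10000]:
    # sample means of n i.i.d. U[0,1] draws; plim Xbar_n = 0.5
    xbar = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
    prob = np.mean(np.abs(xbar - 0.5) > eps)
    print(f"n = {n:6d}   P(|Xbar_n - 0.5| > {eps}) ~ {prob:.4f}")
```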
Definition
A point estimator $\hat{\theta}_n$ of $\theta_0$ is (weakly) consistent if:
$$\hat{\theta}_n \xrightarrow{p} \theta_0.$$
2.3. Convergence in mean square

Definition
Let $\{X_i\}$, $i = 1, \dots, n$, be a sequence of real-valued random variables such that $E|X_n|^2 < \infty$. $X_n$ converges in mean square to $c$, written $X_n \xrightarrow{m.s.} c$, if there exists a real number $c$ such that:
$$E\left[|X_n - c|^2\right] \xrightarrow[n\to\infty]{} 0.$$
2.4. Convergence in distribution

Definition
Let $X_1, \dots, X_n$ be a sequence of random variables and let $X$ be another random variable. Let $F_n$ and $F$ denote the cumulative distribution functions of $X_n$ and $X$, respectively. $X_n$ converges in distribution to $X$ if:
$$\lim_{n\to\infty} F_n(t) = F(t)$$
for all $t$ at which $F$ is continuous.
Convergence in distribution is written:
$$X_n \xrightarrow{d} X \quad \text{or} \quad X_n \xrightarrow{l} X \quad \text{or} \quad X_n \overset{a}{\sim} X.$$
2.5. Some handy theorems
Theorem (Continuous mapping theorem)
Let $g : \mathbb{R}^m \to \mathbb{R}^p$ ($m, p \in \mathbb{N}$) be a multivariate function. Let $\{X_i\}$, $i = 1, \dots, n$, denote any sequence of random $m \times 1$ vectors such that $X_n$ converges almost surely to $c$. If $g$ is continuous at $c$, then:
$$g(X_n) \xrightarrow{a.s.} g(c).$$

Example: Suppose that $X_1, \dots, X_n$ are i.i.d. $\mathcal{P}(\lambda)$. Then:
$$\bar{X}_n \xrightarrow{p} \lambda \quad \text{(WLLN)}$$
and (using the continuous mapping theorem):
$$\exp(-\bar{X}_n) \xrightarrow{p} \exp(-\lambda).$$
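A quick numerical check of this Poisson example (my sketch, not from the original slides; the rate $\lambda = 2$ is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 2.0  # hypothetical Poisson rate, chosen for illustration
for n in [10, 1_000, 1_000_000]:
    xbar = rng.poisson(lam, size=n).mean()  # Xbar_n -> lam (WLLN)
    print(f"n = {n:8d}   exp(-Xbar_n) = {np.exp(-xbar):.5f}   "
          f"exp(-lam) = {np.exp(-lam):.5f}")
```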
Theorem (Slutsky)
Let $X_n$ and $Y_n$ be two sequences of random variables.
1. If $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, then $X_n + Y_n \xrightarrow{p} X + Y$.
2. If $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, then $X_n Y_n \xrightarrow{p} XY$.
3. If $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} c$, then $X_n / Y_n \xrightarrow{p} X/c$ (for $c \neq 0$).

Remark: This also holds for sequences of random matrices; the last statement then reads: if $X_n \xrightarrow{p} \Omega$, then $X_n^{-1} \xrightarrow{p} \Omega^{-1}$ (provided $\Omega^{-1}$ exists).
Theorem (Slutsky!!!)
Let $X_1, \dots, X_n$ and $Y_1, \dots, Y_n$ be two sequences of random variables, and let $X$ and $c$ be a random variable and a constant, respectively. If
$$X_n \xrightarrow{d} X \quad \text{and} \quad Y_n \xrightarrow{p} c$$
then:
$$X_n Y_n \xrightarrow{d} cX$$
$$X_n + Y_n \xrightarrow{d} X + c$$
$$X_n / Y_n \xrightarrow{d} X/c \quad \text{for } c \neq 0.$$
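A classic use of this theorem is the studentized mean: the CLT handles $\sqrt{n}(\bar{X}_n - \mu)$, while Slutsky lets the estimated standard deviation stand in for the true one. A minimal simulation sketch (my illustration; the Exponential(1) population, with mean 1 and standard deviation 1, is an arbitrary non-normal choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 20_000
# Exponential(1) population: mean 1, sd 1, clearly non-normal
x = rng.exponential(1.0, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)           # sample sd: s -> 1 in probability
t = np.sqrt(n) * (xbar - 1.0) / s   # CLT for the numerator + Slutsky for s
print("mean ~ 0:", round(t.mean(), 3), "  variance ~ 1:", round(t.var(), 3))
```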
3. Law of large numbers
3.1. Overview

The law of large numbers tells you that sample moments converge in probability (weak law of large numbers), almost surely (strong law of large numbers), or in $L^p$ ($L^p$ law of large numbers) to the corresponding population moments:
$$\bar{X}_n^r \equiv \frac{1}{n}\sum_{i=1}^{n} X_i^r \xrightarrow[n\to\infty]{} E\left[\bar{X}_n^r\right].$$
..."the probability that the sample moment of order $r$ gets close to the population moment of order $r$ can be made as high as you like by taking a sufficiently large sample"...

Example: The proportion of heads in a large number of (independent) tosses of a fair coin is expected to be close to 1/2.
[Figure: sample mean path for $X_i \sim U[-0.5, 0.5]$, $i = 1, \dots, 1000$, illustrating the convergence of $\bar{X}_n$ toward the population mean 0.]
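A short NumPy/matplotlib sketch (my reconstruction, not the original figure code) that reproduces this kind of plot:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
x = rng.uniform(-0.5, 0.5, size=1000)
running_mean = np.cumsum(x) / np.arange(1, 1001)   # Xbar_n for n = 1,...,1000

plt.plot(running_mean, lw=1)
plt.axhline(0.0, color="k", ls="--")               # population mean E(X_i) = 0
plt.xlabel("n")
plt.ylabel("sample mean")
plt.title("LLN: running mean of U[-0.5, 0.5] draws")
plt.show()
```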
General form of a law of large numbers:
Suppose restrictions on the dependence, the distribution, and the moments of a sequence of random variables $\{Z_i\}$; then:
$$\bar{Z}_n^r - \bar{m}_n^r \xrightarrow{a.s.} 0$$
where $\bar{Z}_n^r \equiv n^{-1}\sum_{i=1}^{n} Z_i^r$ and $\bar{m}_n^r \equiv E\left[\bar{Z}_n^r\right]$.

Generally, four cases:
1. Independent and identically distributed observations;
2. Independent and heterogeneously distributed observations;
3. Dependent and identically distributed observations;
4. Dependent and heterogeneously distributed observations.
3.2. Law of large numbers in the i.i.d. case

Theorem (Khinchine)
If $\{Z_i\}$, $i = 1, \dots, n$, is a sequence of independently and identically distributed random variables with finite mean $E(Z_i) = \mu_0$ ($< \infty$), then:
$$\bar{Z}_n = \frac{1}{n}\sum_{i=1}^{n} Z_i \xrightarrow{p} \mu_0.$$
Theorem (Kolmogorov)
If $\{Z_i\}$, $i = 1, \dots, n$, is a sequence of independently and identically distributed random variables such that:
$$E(Z_i) = \mu_0 < +\infty, \qquad E(|Z_i|) < +\infty$$
then:
$$\bar{Z}_n = \frac{1}{n}\sum_{i=1}^{n} Z_i \xrightarrow{a.s.} \mu_0.$$
4. Consistency of the OLS estimator

Theorem (Consistency in the i.i.d. case)
Under suitable regularity conditions, the OLS estimator of $\beta_0$ in the multiple linear regression model
$$y_i = x_i'\beta_0 + u_i$$
satisfies:
$$\hat{\beta}_{n,OLS} \xrightarrow{p} \beta_0.$$
Proof: The ordinary least squares estimator is given by:
$$\hat{\beta}_{n,OLS} = \left(\sum_{i=1}^{n} x_i x_i'\right)^{-1} \left(\sum_{i=1}^{n} x_i y_i\right).$$

STEP 1: Replace $y_i$ by $x_i'\beta_0 + u_i$ and expand:
$$\hat{\beta}_{n,OLS} = \left(\sum_{i=1}^{n} x_i x_i'\right)^{-1} \left(\sum_{i=1}^{n} x_i (x_i'\beta_0 + u_i)\right) = \beta_0 + \left(\sum_{i=1}^{n} x_i x_i'\right)^{-1} \sum_{i=1}^{n} x_i u_i.$$

STEP 2: Sample mean representation (multiply and divide by $n$):
$$\hat{\beta}_{n,OLS} = \beta_0 + \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1} \left(\frac{1}{n}\sum_{i=1}^{n} x_i u_i\right).$$
STEP 3: Use the weak law of large numbers (the conditions hold!):
$$\frac{1}{n}\sum_{i=1}^{n} x_i x_i' \xrightarrow{p} E[x_i x_i'], \qquad \frac{1}{n}\sum_{i=1}^{n} x_i u_i \xrightarrow{p} E[x_i u_i].$$

STEP 4: Use the Slutsky theorem:
$$\hat{\beta}_{n,OLS} \xrightarrow{p} \beta_0 + E^{-1}[x_i x_i']\, E[x_i u_i] = \beta_0$$
since $E^{-1}[x_i x_i']$ exists and $E(x_i u_i) = 0_{k\times 1}$ (an implication of the zero conditional mean condition).
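To make the result concrete, a small simulation sketch (not part of the original slides; the design is hypothetical: Gaussian regressors, Student-t errors, and an arbitrarily chosen $\beta_0$):

```python
import numpy as np

rng = np.random.default_rng(4)
beta0 = np.array([1.0, 2.0, -0.5])   # hypothetical "true" parameter vector
for n in [100, 10_000, 1_000_000]:
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 regressors
    u = rng.standard_t(df=5, size=n)     # non-normal errors with E(x_i u_i) = 0
    y = X @ beta0 + u
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS
    print(f"n = {n:8d}   beta_hat = {np.round(beta_hat, 4)}")
```

As $n$ grows, the printed estimates settle on $\beta_0$, which is exactly the statement $\hat{\beta}_{n,OLS} \xrightarrow{p} \beta_0$.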
Inconsistency of the OLS estimator
- Failure of the zero conditional mean assumption, $E(u \mid X) = 0_{n\times 1}$, causes bias.
- Failure of $E(x_i u_i) = 0_{k\times 1}$ causes inconsistency.
- Consistency only requires zero correlation between $u$ and $X$, which is implied by (and weaker than) the conditional mean independence assumption.
5. Central limit theorems
5.1. Overview

One needs more than consistency to do inference: the sampling distribution of the OLS estimator.
- On the one hand, consistency (and thus the use of laws of large numbers) yields only a degenerate, point-mass limiting distribution.
- On the other hand, the exact sampling distribution of the OLS estimator was obtained under the (conditional) normality of $u$ ($u \sim \mathcal{N}(\cdot,\cdot)$ or $u \mid X \sim \mathcal{N}(\cdot,\cdot)$).
- In practice, many outcomes (under study) are not (conditionally) normal!
In this respect, large sample theory tells us that the OLS estimator is approximately normally distributed. In doing so, one applies central limit theorems:
..."sample moments are asymptotically normally distributed (after re-normalizing) and the asymptotic variance-covariance matrix is given by the variance of the underlying random variable"... or ..."appropriately normalized sample moments are approximately normally distributed in large enough samples"...
[Figure: histogram of normalized sample means for $X_i \sim U[-0.5, 0.5]$, $i = 1, \dots, 1000$, against the standard normal density.]
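A NumPy/matplotlib sketch (my reconstruction, not the original figure code) that reproduces this kind of histogram:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
n, reps = 1000, 10_000
x = rng.uniform(-0.5, 0.5, size=(reps, n))
sigma = np.sqrt(1.0 / 12.0)                  # sd of U[-0.5, 0.5]
z = np.sqrt(n) * x.mean(axis=1) / sigma      # normalized means (mean of X_i is 0)

plt.hist(z, bins=60, density=True, alpha=0.6)
grid = np.linspace(-4, 4, 200)
plt.plot(grid, np.exp(-grid**2 / 2) / np.sqrt(2 * np.pi))  # N(0,1) density
plt.title("CLT: normalized sample means vs. N(0,1)")
plt.show()
```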
General form of a central limit theorem:
Suppose restrictions on the dependence, the distribution, and the moments of a sequence of random variables $\{Z_i\}$; then:
$$\frac{\bar{Z}_n - \bar{m}_n}{\bar{\sigma}_n / \sqrt{n}} = \sqrt{n}\,\frac{\bar{Z}_n - \bar{m}_n}{\bar{\sigma}_n} \xrightarrow{a.d.} \mathcal{N}(0, 1)$$
where $\bar{Z}_n \equiv n^{-1}\sum_{i=1}^{n} Z_i$, $\bar{m}_n \equiv E\left[\bar{Z}_n\right]$, and $\bar{\sigma}_n^2 \equiv V\left(\sqrt{n}\,\bar{Z}_n\right)$.
5.2. Univariate central limit theorem with i.i.d. observations

Theorem (Lindeberg-Lévy)
Let $\{Z_i\}$ denote a sequence of independent and identically distributed real random variables, with mean $\mu_0 = E(Z_i)$ and variance $\sigma_0^2 = V(Z_i) < \infty$. If $\sigma_0^2 \neq 0$, then:
$$\sqrt{n}\left(\bar{Z}_n - \bar{\mu}_n\right)/\bar{\sigma}_n = \sqrt{n}\left(\bar{Z}_n - \mu_0\right)/\sigma_0 = n^{-1/2}\sum_{i=1}^{n} (Z_i - \mu_0)/\sigma_0 \xrightarrow{a.d.} \mathcal{N}(0, 1)$$
where $\bar{Z}_n = \frac{1}{n}\sum_{i=1}^{n} Z_i$, $\bar{\mu}_n = \frac{1}{n}\sum_{i=1}^{n} \mu_0 = \mu_0$, and $\bar{\sigma}_n = \sigma_0$.

Application: Let $X_1, \dots, X_n$ denote a sequence of independent and identically distributed random variables with $X_i \sim \mathcal{B}(p)$. Then:
$$\sqrt{n}\left(\bar{X}_n - p\right) \xrightarrow{a.d.} \mathcal{N}(0, p(1-p)).$$
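A one-line numerical check of this Bernoulli application (my sketch; $p = 0.3$ and the sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
p, n, reps = 0.3, 2_000, 20_000
xbar = rng.binomial(n, p, size=reps) / n    # Xbar_n via Binomial(n, p) counts
z = np.sqrt(n) * (xbar - p)                 # normalized sample proportions
print("empirical variance:", round(z.var(), 4), "  theory p(1-p):", p * (1 - p))
```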
Remarks:
1. If we compare the conditions of the Lindeberg-Lévy theorem with those of the law of large numbers for independent and identically distributed observations, only one additional requirement is imposed: $\sigma_0^2 = V(Z_i) < \infty$. This implies that $E|Z_i| < \infty$.
2. The central limit theorem requires virtually no assumptions (other than independence and finite variances) to end up with normality: normality is inherited from sums of "small" independent disturbances with finite variance.
3. The central limit theorem is "stronger" than the law of large numbers: conclusions can be inferred regarding the speed of convergence ($\sqrt{n}$) and the asymptotic behavior of the distribution.
4. There is no general result to check how good the approximation is (but see, e.g., the Berry-Esséen inequality).
5. The central limit theorem does not assert that the sample mean tends to normality. It is the transformation of the sample mean that has this property!!!
5.3. Multivariate central limit theorem with i.i.d. observations

Theorem
Let $Z_1, Z_2, \dots, Z_n$ be independent and identically distributed random vectors of dimension $k$, $Z_i = (Z_{1i}, Z_{2i}, \dots, Z_{ki})^t$, with mean vector $\mu_0 = (\mu_{1,0}, \dots, \mu_{k,0})^t$ and a positive definite variance-covariance matrix $\Sigma_0$. Let
$$\bar{Z}_n = \begin{pmatrix} \bar{Z}_{1,n} \\ \vdots \\ \bar{Z}_{k,n} \end{pmatrix}$$
where $\bar{Z}_{j,n} = n^{-1}\sum_{i=1}^{n} Z_{ji}$ (with $j = 1, \dots, k$). Then:
$$\sqrt{n}\left(\bar{Z}_n - \mu_0\right) \xrightarrow{a.d.} \mathcal{N}(0, \Sigma_0).$$
6. Large sample distribution of the OLS estimator

Theorem
Consider the multiple linear regression model:
$$y_i = x_i'\beta_0 + u_i$$
with assumptions H1-H5. Then, under suitable regularity conditions, the large sample distribution of the OLS estimator in the i.i.d. case is given by:
$$\sqrt{n}\left(\hat{\beta}_{n,OLS} - \beta_0\right) \xrightarrow{d} \mathcal{N}\left(0_{k\times 1},\, \sigma_0^2\, E^{-1}(x_i x_i')\right).$$
Proof: The ordinary least squares estimator is:
$$\hat{\beta}_{n,OLS} = \left(\sum_{i=1}^{n} x_i x_i'\right)^{-1} \left(\sum_{i=1}^{n} x_i y_i\right).$$

STEP 1: Proceed as in the consistency proof:
$$\hat{\beta}_{n,OLS} = \beta_0 + \left(\sum_{i=1}^{n} x_i x_i'\right)^{-1} \sum_{i=1}^{n} x_i u_i.$$

STEP 2: Normalize the vector $(\hat{\beta}_{n,OLS} - \beta_0)$:
$$\sqrt{n}\left(\hat{\beta}_{n,OLS} - \beta_0\right) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1} \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n} x_i u_i\right).$$
STEP 3: Apply the weak law of large numbers and the central limit theorem.
Using the WLLN (and the Slutsky theorem):
$$\left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1} \xrightarrow{p} E^{-1}[x_i x_i'].$$
Using the CLT:
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} x_i u_i \xrightarrow{d} \mathcal{N}\left(0_{k\times 1},\, V(x_i u_i)\right)$$
where
$$V(x_i u_i) = E\left(V(x_i u_i \mid x_i)\right) + V\left(E(x_i u_i \mid x_i)\right) = E\left(x_i V(u_i \mid x_i) x_i'\right) + 0 = \sigma_0^2\, E(x_i x_i').$$
STEP 4: The Slutsky theorem (convergence in distribution) implies that:
$$\left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1} \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n} x_i u_i\right) \xrightarrow{d} A^{-1} Z$$
with $A = E[x_i x_i']$ and $Z \sim \mathcal{N}(0_{k\times 1}, \sigma_0^2 A)$, from which it follows that:
$$\sqrt{n}\left(\hat{\beta}_{n,OLS} - \beta_0\right) \xrightarrow{d} \mathcal{N}\left(A^{-1} 0_{k\times 1},\, \sigma_0^2 A^{-1} A A^{-1}\right) = \mathcal{N}\left(0_{k\times 1},\, \sigma_0^2 A^{-1}\right).$$
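A simulation sketch of this theorem (not part of the original slides; the design is hypothetical and chosen so that $\sigma_0^2 E^{-1}(x_i x_i') = I_2$, making the target covariance easy to read off):

```python
import numpy as np

rng = np.random.default_rng(7)
beta0 = np.array([1.0, 2.0])         # hypothetical "true" parameters
n, reps = 400, 5_000
draws = np.empty((reps, 2))
for r in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])   # E(x_i x_i') = I_2
    u = rng.uniform(-1, 1, size=n) * np.sqrt(3.0)           # variance 1, non-normal
    y = X @ beta0 + u
    draws[r] = np.linalg.solve(X.T @ X, X.T @ y)            # OLS
z = np.sqrt(n) * (draws - beta0)
print("empirical covariance of sqrt(n)(beta_hat - beta0):")
print(np.round(np.cov(z.T), 3))
# Theory here: sigma0^2 * E^{-1}(x_i x_i') = 1 * I_2
```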
Remarks:
1. $\sigma_0^2 A^{-1}$ is unknown!
2. A consistent estimator of $\sigma_0^2$ is:
$$\hat{\sigma}_{n,OLS}^2 = (n-k)^{-1}\sum_{i=1}^{n} \hat{u}_i^2 \quad \text{or} \quad \tilde{\sigma}_{n,OLS}^2 = n^{-1}\sum_{i=1}^{n} \hat{u}_i^2.$$
3. The sample analog of $A^{-1}$ is:
$$\left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1} = n\,(X'X)^{-1}.$$
4. A consistent estimator of the asymptotic variance-covariance matrix of $\hat{\beta}_{n,OLS}$ is:
$$\hat{V}_{asy} = \hat{\sigma}_{n,OLS}^2\,(X'X)^{-1}.$$
Definition
A consistent estimator $\hat{\theta}_n$ of $\theta_0$ is said to be asymptotically normally distributed (asymptotically normal) if:
$$\sqrt{n}\left(\hat{\theta}_n - \theta_0\right) \xrightarrow{a.d.} \mathcal{N}(0, \Sigma_0).$$
Equivalently, $\hat{\theta}_n$ is asymptotically normal if:
$$\hat{\theta}_n \overset{a}{\sim} \mathcal{N}\left(\theta_0,\, n^{-1}\Sigma_0\right)$$
with asymptotic variance estimated by $V_{asy}(\hat{\theta}_n) \equiv \operatorname{avar}(\hat{\theta}_n) = n^{-1}\hat{\Sigma}_n$.
7. Extension: The Delta method
7.1. Example

Consider the following generalized learning curve (Berndt, 1992):
$$C_t = C_1\, N_t^{\alpha_c / R}\, Y_t^{(1-R)/R}\, \exp(u_t), \qquad t = 1, \dots, T$$
where $C_t$, $N_t$, $Y_t$, and $u_t$ denote respectively the real unit cost at time $t$, the cumulative production up to time $t$, the production in period $t$, and an i.i.d. $(0, \sigma^2)$ error term. The two structural parameters are $\alpha_c$ (the learning curve parameter) and $R$ (the returns to scale parameter).

The log-linear model writes:
$$\log(C_t) = \log(C_1) + \frac{\alpha_c}{R}\log(N_t) + \frac{1-R}{R}\log(Y_t) + u_t = x_t'\beta + u_t \quad \text{(reduced-form equation)}$$
where $\beta_0 = \log C_1$, $\beta_1 = \alpha_c / R$, $\beta_2 = (1-R)/R$, and $x_t = (1, \log N_t, \log Y_t)'$.
Starting from the asymptotic distribution of the estimator of $\beta$ (the reduced-form model), can we back out the asymptotic distribution of the structural parameters $(\alpha_c, R)'$?

Three ingredients:
1. The asymptotic distribution of the estimator of $\beta$ has to be known...
2. The structural parameters (or the parameters of interest) must be some functions of $\beta$.
   Example: The learning curve parameters may be recovered using:
   $$\alpha_c = \frac{\beta_1}{1+\beta_2} = g_1(\beta), \qquad R = \frac{1}{1+\beta_2} = g_2(\beta).$$
3. Regularity condition(s)...
7.2. Univariate Delta method

Proposition
Let $Z_1, \dots, Z_n$ be a sequence of independent and identically distributed real random variables, with mean $E(Z_i) = \mu_0$ and variance $V(Z_i) = \sigma_0^2 < \infty$. If $\sigma_0^2 \neq 0$ and $g$ is a continuously differentiable function (from $\mathbb{R}$ to $\mathbb{R}$) with $g'(\mu_0) \neq 0$, then:
$$\sqrt{n}\left(g(\bar{Z}_n) - g(\mu_0)\right) \xrightarrow{a.d.} \mathcal{N}\left(0,\, \sigma_0^2 \left(\frac{dg}{dz}(\mu_0)\right)^2\right).$$
Example: Let $X_1, \dots, X_n$ be i.i.d. $\mathcal{B}(p)$ random variables. By the central limit theorem:
$$\sqrt{n}\left(\bar{X}_n - p\right) \xrightarrow{d} \mathcal{N}(0, p(1-p)).$$
Find the limiting distribution of $g(\bar{X}_n) = \dfrac{\bar{X}_n}{1 - \bar{X}_n}$.
Take $g(s) = \dfrac{s}{1-s}$ so that $g'(s) = \dfrac{1}{(1-s)^2}$. We have:
$$\sqrt{n}\left(g(\bar{X}_n) - g(p)\right) \xrightarrow{d} \mathcal{N}\left(0,\, \frac{p}{(1-p)^3}\right).$$
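A quick simulation check of this odds-ratio example (my sketch; $p = 0.3$ and the sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
p, n, reps = 0.3, 5_000, 20_000
xbar = rng.binomial(n, p, size=reps) / n       # Xbar_n for B(p) draws
g = xbar / (1.0 - xbar)                        # g(s) = s / (1 - s)
z = np.sqrt(n) * (g - p / (1.0 - p))
print("empirical variance:", round(z.var(), 4),
      "  theory p/(1-p)^3:", round(p / (1.0 - p) ** 3, 4))
```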
7.3. Multivariate Delta method

Proposition
Suppose that the conditions of the multivariate central limit theorem for independent and identically distributed random vectors hold, i.e.:
$$\sqrt{n}\left(\bar{Z}_n - \mu_0\right) \xrightarrow{a.d.} \mathcal{N}(0, \Sigma_0).$$
If $g$ is a continuously differentiable function from $\mathbb{R}^k$ to $\mathbb{R}^p$, then:
$$\sqrt{n}\left(g(\bar{Z}_n) - g(\mu_0)\right) \xrightarrow{a.d.} \mathcal{N}\left(0,\, \frac{\partial g}{\partial z^t}(\mu_0)\,\Sigma_0\,\frac{\partial g^t}{\partial z}(\mu_0)\right)$$
where $\frac{\partial g}{\partial z^t}$ denotes the $p \times k$ Jacobian matrix of $g$.
Example: The generalized learning curve (Berndt, 1992).
Using the reduced-form model:
$$\sqrt{T}\left(\hat{\beta}_{OLS} - \beta\right) \xrightarrow{a.d.} \mathcal{N}\left(0,\, \sigma^2\left(\frac{X'X}{T}\right)^{-1}\right)$$
or (using a consistent estimator of $\sigma^2$):
$$\hat{\beta}_{OLS} \overset{a}{\sim} \mathcal{N}\left(\beta,\, \hat{\sigma}^2 (X'X)^{-1}\right).$$
Therefore:
$$\begin{pmatrix} \hat{\alpha}_c \\ \hat{R} \end{pmatrix} \overset{a}{\sim} \mathcal{N}\left(g(\beta),\, \frac{\partial g(\hat{\beta})}{\partial \beta^t}\,\hat{\sigma}^2 (X'X)^{-1}\,\frac{\partial g^t(\hat{\beta})}{\partial \beta}\right)$$
where $g(\beta) = (\alpha_c, R)'$ and:
$$\frac{\partial g(\hat{\beta})}{\partial \beta^t} = \begin{pmatrix} \frac{\partial g_1(\hat{\beta})}{\partial \beta_0} & \frac{\partial g_1(\hat{\beta})}{\partial \beta_1} & \frac{\partial g_1(\hat{\beta})}{\partial \beta_2} \\ \frac{\partial g_2(\hat{\beta})}{\partial \beta_0} & \frac{\partial g_2(\hat{\beta})}{\partial \beta_1} & \frac{\partial g_2(\hat{\beta})}{\partial \beta_2} \end{pmatrix} = \begin{pmatrix} 0 & \frac{1}{1+\hat{\beta}_2} & \frac{-\hat{\beta}_1}{(1+\hat{\beta}_2)^2} \\ 0 & 0 & \frac{-1}{(1+\hat{\beta}_2)^2} \end{pmatrix}.$$
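An end-to-end sketch of this example on simulated data (not from the original slides; the structural values, regressor design, and error scale are all hypothetical choices made for illustration):

```python
import numpy as np

rng = np.random.default_rng(9)
T = 5_000
alpha_c, R = -0.2, 1.1                       # hypothetical structural values
beta = np.array([1.0, alpha_c / R, (1.0 - R) / R])    # (log C1, beta1, beta2)
X = np.column_stack([np.ones(T), rng.normal(2, 1, T), rng.normal(1, 1, T)])
logC = X @ beta + 0.1 * rng.normal(size=T)   # reduced-form equation

b = np.linalg.solve(X.T @ X, X.T @ logC)     # OLS on the reduced form
resid = logC - X @ b
sig2 = resid @ resid / (T - 3)
V_b = sig2 * np.linalg.inv(X.T @ X)          # estimated avar of beta_hat

g = np.array([b[1] / (1 + b[2]), 1.0 / (1 + b[2])])   # (alpha_c_hat, R_hat)
J = np.array([[0.0, 1.0 / (1 + b[2]), -b[1] / (1 + b[2]) ** 2],
              [0.0, 0.0,              -1.0 / (1 + b[2]) ** 2]])
V_g = J @ V_b @ J.T                          # delta-method variance
print("structural estimates (alpha_c, R):", np.round(g, 4))
print("delta-method standard errors:   ", np.round(np.sqrt(np.diag(V_g)), 4))
```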
8. Summary: Key concepts
- How can we define the convergence of a sequence of real random variables? Are the different modes equivalent?
- What is the interpretation of a weak (strong) law of large numbers? Why do we use it?
- What does consistency mean?
- Show the weak consistency of the ordinary least squares estimator of $\beta_0$ (and $\sigma_0^2$) with i.i.d. observations.
- What is the interpretation of central limit theorems?
- Derive the asymptotic distribution of the ordinary least squares estimator of $\beta_0$ with i.i.d. observations.
- What is the delta method? Why is it useful?