An Overview of the Classical Regression Model

Assumptions of the Classical Regression Model
Assume we want to investigate the general relationship:
  yt = g(x1t, x2t, …, xKt | β1, β2, …, βK, σ²)
To use linear regression techniques for parameter estimation we need the following assumptions:
A1- g(·) is linear in the parameter vector: E(yt) = Σ(i=1 to K) xitβi
A2- The xit are non-stochastic variables
A3- The et, t = 1, 2, …, T, are independently and identically distributed:
  et = yt − Σ(i=1 to K) xitβi
  E(et) = E[yt − xtβ] = E[yt − E(yt)] = 0

Assumptions of the Classical Regression Model (cont.)
A4- Homoskedastic error term: E[(e − E(e))(e − E(e))'] = E(ee') = σ²IT → V(et) = σ², t = 1, …, T
A5- The determinant of (X'X) is nonzero; the K exogenous variables are not perfectly collinear
  If A5 fails → the parameters cannot be estimated
  T ≥ K: there must be at least as many observations as there are parameters to estimate
  |X'X| ≠ 0 but "very small" → "collinearity problem" → the β's can be estimated but the estimator is imprecise. Why? Var(βS) = σ²(X'X)⁻¹

Assumptions of the Classical Regression Model (cont.)
If we want to use Maximum Likelihood techniques to obtain parameter estimates we also need:
A6- Normality assumption: et ~ N(0, σ²), t = 1, …, T

Overview of the Classical Regression Model
Although the model is linear in terms of the parameters, we can allow for a nonlinear relationship between the exogenous and dependent variables via the use of alternative functional forms or variable construction (Stewart, Ch. 6; Greene, Section 7.3, 124-130)
Example with 4 RHS variables: let xt = [1  zt  ln(zt)  (wt²zt)]  (1 x 4)
  yt = [1  zt  ln(zt)  (wt²zt)]β + et
  yt is a linear function of the βi's
  yt is a nonlinear function of the explanatory variables (e.g., marginal effects)

Overview of the Classical Regression Model
  yt = [1  zt  ln(zt)  (wt²zt)]β + et
Marginal effects can be represented as:
  ∂yt/∂zt = β2 + β3/zt + β4wt²
  ∂yt/∂wt = 2β4wtzt
Note the nonlinear marginal effects with respect to the exogenous variables

Overview of the Classical Regression Model
  Yt = β1 + β2(1/Xt) + et    (with β1 > 0, β2 < 0)
  E[Yt] = β1 + β2/Xt, which approaches the asymptote β1 as Xt grows and equals zero at Xt = −β2/β1
  dYt/dXt = −β2/Xt²
  ηyx = elasticity of Y wrt X = [dYt/dXt]·Xt/Yt = [−β2/Xt²]·Xt/Yt = −β2/(XtYt)

Overview of the Classical Regression Model
  Yt = aXt^β1·exp(et)
  ln(Yt) = β0 + β1ln(Xt) + et,   β0 = ln(a), 0 < β1 < 1, a = exp(β0)
  Elasticity of Y wrt X ≡ dlnYt/dlnXt = β1
  E[ln Yt] = β0 + β1ln Xt
  E(Yt) = aXt^β1·exp(σ²/2), since E[exp(et)] = exp(σ²/2) (which holds when et ~ N(0, σ²))
Sometimes referred to as the Cobb-Douglas functional form when there is more than one exogenous variable

Classical Regression Model Summary
  xt is (1 x K), β is (K x 1)
  yt = E(yt) + et = β0 + Σ(k=2 to K) βk xkt + et = xtβ + et
where xt is non-stochastic and the xt are not all identical
  E(et) = 0,  E(yt) = xtβ   (the conditional mean)
  Var(et) = Var(yt) = σ²
  Cov(et, es) = Cov(yt, ys) = 0 for t ≠ s
  et ~ (0, σ²) and yt ~ (xtβ, σ²)
If et, yt are normally distributed: et ~ N(0, σ²) and yt ~ N(xtβ, σ²)

Example: Food Expenditures, 40 Households
Weekly food expenditure and income data
[Scatter plot: Food Expenditures vs. Household Income]

Example: Food Expenditures, 40 Households
Suppose we have two different estimators to obtain estimates of β0, β1:
  yt = β0⁽⁰⁾ + β1⁽⁰⁾xt
  yt = β0⁽¹⁾ + β1⁽¹⁾xt
Which estimator is "preferred"?
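Before turning to estimation, here is a minimal sketch revisiting the earlier functional-form example (linear in the parameters, nonlinear in the variables). The data, seed, and "true" coefficients below are made up for illustration and are not the course data set.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200

# Hypothetical exogenous variables (assumed values, for illustration only)
z = rng.uniform(1.0, 10.0, T)
w = rng.uniform(0.5, 3.0, T)

# Model linear in the parameters but nonlinear in the variables:
# y_t = b1 + b2*z_t + b3*ln(z_t) + b4*(w_t^2 * z_t) + e_t
X = np.column_stack([np.ones(T), z, np.log(z), w**2 * z])
beta_true = np.array([2.0, 0.5, 1.5, -0.2])      # assumed "true" coefficients
y = X @ beta_true + rng.normal(0.0, 1.0, T)      # e_t ~ N(0, sigma^2)

# OLS still applies because the model is linear in beta
beta_s, *_ = np.linalg.lstsq(X, y, rcond=None)

# Marginal effects evaluated at the sample means (nonlinear in z and w)
z_bar, w_bar = z.mean(), w.mean()
dy_dz = beta_s[1] + beta_s[2] / z_bar + beta_s[3] * w_bar**2
dy_dw = 2 * beta_s[3] * w_bar * z_bar
print(beta_s, dy_dz, dy_dw)
```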
Our Classical Regression Model
Given the above data and theoretical model we need to obtain parameter estimates to place the mean expenditure line in expenditure/income (X) space
We would expect the position of the line to be in the middle of the data points
What do we mean by "middle," given that some et's are positive and others negative?
We need to develop some rule to locate this line

Our Classical Regression Model
Least Squares Estimation rule: choose the line (defined by the coefficients) such that the sum of the squares of the vertical distances from each point to the line (the SSE) is as small as possible
The above line refers to E(yt)
  SSE = Σt et² = Σt (yt − β0 − β1xt)²
Graphically, in the above scatter plot, the vertical distances from each point to the line representing the linear relationship are called residuals or regression errors

Our Classical Regression Model
  yt = E(yt) + et = β0 + β1xt + et
Let β̂0, β̂1 be an initial "guess" of the intercept and slope coefficients
  êt = yt − ŷt = yt − β̂0 − β̂1xt
êt is the initial error term "guess"; ŷt = β̂0 + β̂1xt is the initial conditional mean "guess"

Our Classical Regression Model
  yt = ŷt + êt = β̂0 + β̂1xt + êt
[Figure: scatter of (Xt, yt) with the fitted conditional mean ŷt = β̂0 + β̂1Xt, intercept β̂0, and residuals ê3 = y3 − ŷ3 and ê4 = y4 − ŷ4]
  S ≡ SSE = Σ(t=1 to T) (yt − ŷt)² = Σ(t=1 to T) êt²

Our Classical Regression Model
Note that the SSE can be obtained via the following:
  SSE = e'e, where e = [e1, …, eT]' is (T x 1), e' is (1 x T), and SSE is (1 x 1)

Our Classical Regression Model
Naïve model: yt = μ̂ + et*
[Figure: the same scatter showing residuals measured from the fitted line, êt = yt − ŷt with ŷt = β̂0 + β̂1Xt, and from the sample mean, et* = yt − μ̂]
Note: ê'ê = SSE ≤ SSE* = e*'e*

Our Classical Regression Model
Under the least squares estimation rule we want to choose the values of β0 and β1 that minimize the error sum of squares (SSE)
Can we be assured that whatever values of β0 and β1 we choose do indeed minimize the SSE?

Our Classical Regression Model
  yt = β1 + β2xt + et
  SSE = Σ(t=1 to T) (yt − ŷt)²

Our Classical Regression Model
Let's look at the FOC for the minimization of the sum of squared errors (SSE) as a means of obtaining estimates of β1, β2:
  βS = [β1S, β2S]' = (X'X)⁻¹X'Y
  dimensions: (X'X) is (2 x 2) from (2 x T)(T x 2); X'Y is (2 x 1) from (2 x T)(T x 1); βS is (2 x 1)
  SSE = Σ(t=1 to T) êt² = ê'ê   (1 x 1), where ê = y − XβS
The subscript S denotes an estimated value

Our Classical Regression Model
Can we be assured that the SSE function is convex, not concave, wrt the β's?
  SSE = Y'Y − 2β'X'Y + β'X'Xβ
  ∂SSE/∂β = −2X'Y + 2X'Xβ
The matrix of second derivatives of SSE with respect to β1 and β2 can be shown to be:
  H_SSE = ∂²SSE/∂β∂β' = 2X'X = [2T   2Σx2t; 2Σx2t   2Σx2t²]
H_SSE must be positive definite for convexity
To be positive definite, every principal minor of H_SSE must be positive

Our Classical Regression Model
  H_SSE = 2X'X = [2T   2Σx2t; 2Σx2t   2Σx2t²]
The two diagonal elements must be positive, and
  |H_SSE| = 4TΣx2t² − 4(Σx2t)² = 4[TΣx2t² − (Σx2t)²]
|H_SSE| is positive unless all values of x2t are the same → H_SSE is positive definite → SSE is convex wrt the β's

Our Classical Regression Model
For our 40-household Food Expenditure data:
  X = [1 25.83; 1 34.31; …; 1 115.46]    Y = [9.46; 10.56; …; 48.71]
  βS = (X'X)⁻¹X'Y = [7.3832; 0.2323]

Our Classical Regression Model
[Figure: Food Expenditures vs. Household Income with fitted line; intercept 7.3832, slope dYt/dIt = 0.2323]

Sampling Properties of Estimated Coefficients
"True" relation: Y = Xβ + e
Use the random variable Y to generate an estimate of the unknown coefficients:
  βS = (X'X)⁻¹X'Y
βS is a random variable with a distribution; βS will vary from sample to sample
What is E(βS)?
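Before turning to the sampling properties, here is a minimal numerical sketch of the least squares calculations above. The income values and "true" line are assumptions for illustration, not the actual 40-household sample; the sketch computes βS = (X'X)⁻¹X'Y and the SSE, and checks that H_SSE = 2X'X is positive definite.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 40

# Hypothetical income and food-expenditure data (NOT the actual 40-household sample)
income = rng.uniform(10, 130, T)
X = np.column_stack([np.ones(T), income])        # (T x 2): intercept and income
y = 7.0 + 0.25 * income + rng.normal(0, 6.5, T)  # assumed "true" line plus noise

# Least squares estimator: beta_S = (X'X)^-1 X'Y
XtX = X.T @ X
beta_s = np.linalg.solve(XtX, X.T @ y)

# Residuals and error sum of squares: SSE = e'e
e_hat = y - X @ beta_s
SSE = e_hat @ e_hat

# Convexity check: H_SSE = 2X'X should be positive definite
H = 2 * XtX
eigenvalues = np.linalg.eigvalsh(H)
print(beta_s, SSE, (eigenvalues > 0).all())
```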
Sampling Properties of Estimated Coefficients
Properties of the Least Squares Estimator
Does E(βS) = β (i.e., is βS unbiased)? With Y = Xβ + e:
  E(βS) = E[(X'X)⁻¹X'Y]
        = E[(X'X)⁻¹X'(Xβ + e)]
        = E[(X'X)⁻¹X'Xβ + (X'X)⁻¹X'e]
        = E[Iβ] + E[(X'X)⁻¹X'e]
        = β + (X'X)⁻¹X'E(e) = β    since E(e) = 0
→ βS is an unbiased estimate of the true unknown value β

Sampling Properties of Estimated Coefficients
Properties of the Least Squares Estimator
What is the covariance matrix, Σβ, of the estimated parameters?
  Σβ = σ²(X'X)⁻¹   (K x K), here K = 2
What is a reasonable estimate of σ²?
  σ² ≡ variance of et = E[(et − E(et))²] = E(et²), with E(et) = 0
Up to this point σ² has been assumed known
  êt = yt − ŷt = yt − β1S − x2tβ2S
  σ̂S² = Σ(t=1 to T) êt² / T    due to the iid assumption

Sampling Properties of Estimated Coefficients
Is this an unbiased estimator of σ²?
Standardize the SSE by the number of parameters in the regression model:
  σU² = Σ(t=1 to T) êt² / (T − K) = ê'ê / (T − K)
Given the above:
  σU² = (y − XβS)'(y − XβS) / (T − K)
      = [y'y − y'XβS − βS'X'y + βS'X'XβS] / (T − K)
      = [y'y − y'XβS − βS'X'y + y'X(X'X)⁻¹X'XβS] / (T − K)
      = [y'y − y'XβS − βS'X'y + y'XβS] / (T − K)
      = [y'y − βS'X'y] / (T − K)

Sampling Properties of Estimated Coefficients
In contrast to our least squares estimate of β, which is a linear form of y, the above is a quadratic form of the observable random vector y
This implies that σU² is a random variable and that our estimate of σ² will vary from sample to sample
We have derived E(σU²); let's now evaluate the variance of the random variable σU²
We showed in a previous handout that eS'eS = e'(IT − X(X'X)⁻¹X')e = e'Me, where M is an idempotent matrix and e is the true unknown CRM error
Before we examine the variance of σU², let's talk about the PDF of e'Me/σ²

Sampling Properties of Estimated Coefficients
Let's assume that e ~ N(0, σ²IT)
I will show a little later that βl = βS = (X'X)⁻¹X'Y ~ N(β, σ²(X'X)⁻¹), where βl is the maximum likelihood estimator of the unknown CRM coefficients assuming normality
Given this assumption, let's look at e'Me/σ²
The numerator in the above is a quadratic form involving the normal random vector e
On page 52 of JHGLL, and in Section A.19, the distributional characteristics of quadratic forms of normal RVs are discussed

Sampling Properties of Estimated Coefficients
The implication of this discussion is that, with e ~ N(0, σ²IT) and M idempotent, e'Me/σ² is distributed χ² with DF equal to the rank of M, where the rank of an idempotent matrix equals its trace:
  tr(M) = tr(IT − X(X'X)⁻¹X')
        = tr(IT) − tr[X(X'X)⁻¹X']
        = tr(IT) − tr[X'X(X'X)⁻¹]    using tr(ABC) = tr(CAB)
        = tr(IT) − tr(IK) = T − K = rank of M
  → (T − K)σU²/σ² = eS'eS/σ² = e'Me/σ² ~ χ²(T−K),  where (T − K)/σ² is a constant

Sampling Properties of Estimated Coefficients
We can use the above result to find the variance of σU²
A characteristic of a RV that is distributed χ² is that its variance is equal to twice its DF:
  var[(T − K)σU²/σ²] = 2(T − K)
  [(T − K)²/σ⁴]·var(σU²) = 2(T − K)
  var(σU²) = 2σ⁴/(T − K)
Note that in order to say something about the variance of our estimate of the error-term variance under the CRM we needed the additional normality assumption

Sampling Properties of Estimated Coefficients
Given the normality assumption of the error term, βS = βl ~ N(β, σ²(X'X)⁻¹)
I would like to now show that the random vector βS (= βl) is independent of the random variable σU² (pp. 29-30 of JHGLL)
Since σU² = eS'eS/(T − K), βl and σU² will be independent if eS (= el) and βS (= βl) are independent
Given the above assumptions, both el and βl are normal random vectors
To show that they are independent it is sufficient to show that the matrix containing the covariances between the elements of el and βl is zero
This (T x K) covariance matrix can be represented by E[el(βl − β)']
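Before completing the independence argument, the sampling properties derived above can be illustrated with a small Monte Carlo sketch. The "true" β, σ², regressor values, and seed are assumptions for illustration: over repeated samples the averages of βS and σU² should be close to the true values, and the sample variance of σU² close to 2σ⁴/(T − K).

```python
import numpy as np

rng = np.random.default_rng(2)
T, K, sigma2 = 40, 2, 9.0
beta_true = np.array([5.0, 0.3])                # assumed "true" parameters

x = rng.uniform(10, 130, T)
X = np.column_stack([np.ones(T), x])            # fixed (non-stochastic) regressors
XtX_inv = np.linalg.inv(X.T @ X)

n_reps = 5000
betas = np.empty((n_reps, K))
sigma2_u = np.empty(n_reps)
for r in range(n_reps):
    e = rng.normal(0.0, np.sqrt(sigma2), T)     # e ~ N(0, sigma^2 I_T)
    y = X @ beta_true + e
    b = XtX_inv @ X.T @ y                       # beta_S = (X'X)^-1 X'y
    resid = y - X @ b
    betas[r] = b
    sigma2_u[r] = resid @ resid / (T - K)       # sigma_U^2 = e'e/(T-K)

# Averages should be near the true values (unbiasedness),
# and var(sigma_U^2) should be near 2*sigma^4/(T-K)
print(betas.mean(axis=0), sigma2_u.mean(), sigma2_u.var(), 2 * sigma2**2 / (T - K))
```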
Sampling Properties of Estimated Coefficients
Previously we showed that eS = (IT − X(X'X)⁻¹X')e [= el]
We also know that:
  βS (= βl) = (X'X)⁻¹X'y = (X'X)⁻¹X'(Xβ + e) = β + (X'X)⁻¹X'e    (β is the true unknown value)
  → βl − β = (X'X)⁻¹X'e
With E[ee'] = σ²IT, the covariance matrix can be shown to be:
  E[el(βl − β)'] = E[(IT − X(X'X)⁻¹X') ee' X(X'X)⁻¹]
                 = (IT − X(X'X)⁻¹X') E[ee'] X(X'X)⁻¹
                 = σ²(IT − X(X'X)⁻¹X') X(X'X)⁻¹
                 = σ²[X(X'X)⁻¹ − X(X'X)⁻¹] = 0(T x K)

Sampling Properties of Estimated Coefficients
The above results show that βl and σU² are independent
For a more theoretical treatment refer to Section 2.5.9, bottom of page 52, in JHGLL

Food Expenditure Model Results Summary
  FOODEXP = 7.3832 + 0.2323 INC
            (4.008)  (0.055)    standard errors
  K = 2, T = 40
  SSE = e'e = 1780.4
  σU² = 1780.4/(40 − 2) = 46.853
  (X'X)⁻¹ = [0.342922   −0.0045548; −0.0045548   6.525442e-005]
  ΣβS = σU²(X'X)⁻¹ = [16.0669   −0.2134; −0.2134   0.0030]
  16.0669½ = 4.008

Our Classical Regression Model
In summary, with K regressors, T observations, and our linear model:
The random variable Y is composed of a nonstochastic conditional mean and an unobservable error term: Y = Xβ + e
  βS = (X'X)⁻¹X'Y
βS is a linear function of the observable random vector Y
βS is a random vector with a sampling distribution
βS is unbiased
βS covariance matrix ≡ Σβ = σ²(X'X)⁻¹ → βS ~ (β, σ²(X'X)⁻¹)
Finite sample properties: JHGLL, 198-209
This implies that with e ~ (0T, σ²IT), Y ~ (Xβ, σ²IT)

Our Classical Regression Model
βS was obtained without knowing the distribution of et
Let's compare the above estimate of β with estimates obtained from other linear and unbiased estimators (β*)
  βS = AY, where A = (X'X)⁻¹X'
  β* = CY, where C is a (K x T) matrix that is not a function of Y or the unknown parameters (A is an example of such a matrix)
By assumption, E(βS) = E(β*) = β
We are interested in finding the Best Linear Unbiased Estimator (BLUE) of the true, unknown parameter vector β

Our Classical Regression Model
Is βS BLUE (i.e., minimum variance compared to β*)?
Gauss-Markov Theorem: given the CRM assumptions A1-A5, βS is BLUE
With multiple β's, βS is better than any other linear unbiased estimator β* if:
  Var(a'βS) ≤ Var(a'β*), where a is any (K x 1) constant vector, so a'β is a linear combination of the β's
  → Var(a'β*) − Var(a'βS) ≥ 0
  ΣβS, Σβ* are (K x K) → a'(Σβ*)a − a'(ΣβS)a = a'(Σβ* − ΣβS)a ≥ 0 for all a   (1 x 1), for βS to be best
To determine the above, I need to know the characteristics of definite matrices

Our Classical Regression Model
Is βS BLUE (i.e., minimum variance compared to β*)?
  a'(Σβ* − ΣβS)a ≥ 0 for all a   (1 x 1), for βS to be best
I want to show that this holds for all a, i.e., that (Σβ* − ΣβS) is positive semi-definite
This is a characteristic of a positive semi-definite matrix (JHGLL, p. 960): "A symmetric matrix D is positive semi-definite iff C′DC ≥ 0 for all C"
Let D be Σβ* − ΣβS, which is symmetric
The above shows that βS has the "smallest" variance among all linear unbiased estimators of β → βS is the Best Linear Unbiased Estimator of β, the true unknown parameter vector

Our Classical Regression Model
How well does the estimated equation explain the variance of the dependent variable?
Let's first talk about the variation of the dependent variable:
  Y = Ŷ + ê = XβS + ê    (the part explained by the model plus the unexplained part)
  Y'Y = (XβS + ê)'(XβS + ê) = βS'X'XβS + ê'XβS + βS'X'ê + ê'ê = βS'X'XβS + 2βS'X'ê + ê'ê    (sum of squares of the yt's, 1 x 1)

Our Classical Regression Model
Note that ê = [IT − X(X'X)⁻¹X']Y, so
  Y'Y = βS'X'XβS + 2βS'X'ê + ê'ê
      = βS'X'XβS + 2βS'X'[IT − X(X'X)⁻¹X']Y + ê'ê
and the middle term is 0 given that X'[IT − X(X'X)⁻¹X'] = (X' − X') = 0
  → Y'Y = βS'X'XβS + ê'ê = Ŷ'Ŷ + ê'ê,  where Ŷ = XβS    (sum of squares of the yt's)
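A short sketch tying the estimation results together. The data are hypothetical, not the 40-household food expenditure sample, so the numbers will not match the results summary above; the sketch computes σU², ΣβS = σU²(X'X)⁻¹, and the standard errors, and verifies the decomposition Y'Y = βS'X'XβS + ê'ê numerically.

```python
import numpy as np

def ols_summary(X, y):
    """OLS coefficients, sigma_U^2, coefficient covariance matrix, and standard errors."""
    T, K = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_s = XtX_inv @ X.T @ y             # beta_S = (X'X)^-1 X'y
    e_hat = y - X @ beta_s
    sse = e_hat @ e_hat                    # SSE = e'e
    sigma2_u = sse / (T - K)               # unbiased estimator of sigma^2
    cov_beta = sigma2_u * XtX_inv          # Sigma_beta_S = sigma_U^2 (X'X)^-1
    std_err = np.sqrt(np.diag(cov_beta))
    # Check the decomposition Y'Y = beta_S'X'X beta_S + e'e
    assert np.isclose(y @ y, beta_s @ (X.T @ X) @ beta_s + sse)
    return beta_s, sigma2_u, cov_beta, std_err

# Hypothetical example (not the 40-household food expenditure data)
rng = np.random.default_rng(3)
income = rng.uniform(10, 130, 40)
X = np.column_stack([np.ones(40), income])
y = 7.4 + 0.23 * income + rng.normal(0, 6.8, 40)
print(ols_summary(X, y))
```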
Let's use the above, but within the framework of deviations from the mean of Y (i.e., our naïve model)

Our Classical Regression Model
We can represent the total variation about the mean of the data via:
  Yt = Ŷt + eSt = xtβS + eSt
Subtract the mean from both sides:
  Yt − Ȳ = (Ŷt − Ȳ) + (Yt − Ŷt)
  Total Variation = Explained by Exogenous Variables + Unexplained Component

Our Classical Regression Model
[Figure: scatter of (Xt, Yt) with the fitted line Ŷt = β̂0 + β̂1Xt and the sample mean Ȳ, showing êt = Yt − Ŷt, Ŷt − Ȳ, and Yt − Ȳ = (Ŷt − Ȳ) + (Yt − Ŷt)]

Our Classical Regression Model
Total variation about the mean:
  Yt − Ȳ = (Ŷt − Ȳ) + (Yt − Ŷt)
  total variation = explained + unexplained
If our goal is to have an accurate prediction of Y, we would like the component explained by our exogenous variables, Ŷt − Ȳ, to be large relative to the error component, Yt − Ŷt
A large unexplained/unpredictable component would mean our prediction could be "way off"

Our Classical Regression Model
Note that:
  Σ(t=1 to T)(Yt − Ȳ)² = Y'Y − TȲ²,   given that 2ȲΣ(t=1 to T)Yt = 2TȲ²
  but Y'Y = Ŷ'Ŷ + ê'ê
  → Y'Y − TȲ² = (Ŷ'Ŷ − TȲ²) + ê'ê
  Total Sum of Squares (TSS) = Explained Sum of Squares (RSS) + Error Sum of Squares (SSE)
TSS: a measure of the total variation of Yt about its mean
RSS: the portion of the total variation in Yt about the sample mean explained by the RHS variables, X
SSE: the portion of the total variation in Yt about the mean not explained by the RHS variables

Our Classical Regression Model
In scalar notation, the above decomposition of deviations from the sample mean can be represented as:
  Σ(t=1 to T)(Yt − Ȳ)² = Σ(t=1 to T)(Ŷt − Ȳ)² + Σ(t=1 to T)êt²
  TSS = RSS + SSE

Our Classical Regression Model
How well does the estimated equation explain the variance of the dependent variable?
R² (the Coefficient of Determination) is the proportion of the total variation about the mean explained by the model:
  R² = RSS/TSS = (Ŷ'Ŷ − TȲ²)/(Y'Y − TȲ²)
But because TSS = RSS + SSE:
  R² = 1 − SSE/TSS = 1 − ê'ê/(Y'Y − TȲ²)
Note: the β's that minimize the SSE thereby maximize the R² value

Our Classical Regression Model
Calculation of R² (Greene: 31-38)
Use the above formulas or the following:
  M0 ≡ [IT − (1/T)ii'],  where i ≡ a column vector of 1's
  the diagonals of M0 are (1 − 1/T); the off-diagonals of M0 are −(1/T)
M0 is a T x T idempotent matrix that transforms any variable into deviations from sample means
  TSS = Y'M0Y = βS'X'M0XβS + eS'eS    (RSS + SSE)
  R² = RSS/TSS = βS'X'M0XβS / (Y'M0Y)

Our Classical Regression Model
  0 ≤ R² ≤ 1 (when an intercept is present)
  R² = 0 → the regression is a horizontal line; all elements of β are zero except the constant term → the predicted value of yt equals the sample mean
  R² = 1 → all residuals are 0; perfect fit
R² will never decrease when another variable is added to a regression (Greene, p. 34)

Our Classical Regression Model
Adjusted R², R̄², controls for DF (i.e., the number of regressors in the model):
  R̄² = 1 − [SSE/(T − K)] / [TSS/(T − 1)]
      = 1 − sU² / [TSS/(T − 1)]
      = 1 − (1 − R²)(T − 1)/(T − K)
When K > 1, adjusted R² < R²

Our Classical Regression Model
Adjusted R² may decline when a variable is added to the model:
  R̄² = 1 − [SSE/(T − K)] / [TSS/(T − 1)]
Whether R̄² rises or falls depends on whether the contribution of the new variable to model fit (as represented by the SSE) offsets the loss due to the correction for DF
When dropping a variable from a model: if its |t-ratio| < 1.0, the Adj. R² will increase; when its |t-ratio| > 1.0, dropping the variable from the regression → the Adj. R² will decrease (Greene, p. 35)
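A small sketch of the fit measures above, again with hypothetical data rather than the course sample: it computes R² and adjusted R² from the SSE and TSS, and illustrates that adding an irrelevant regressor never lowers R² but can lower R̄².

```python
import numpy as np

def r_squared(X, y):
    """R^2 and adjusted R^2 for an OLS regression of y on X (X includes a constant)."""
    T, K = X.shape
    beta_s = np.linalg.solve(X.T @ X, X.T @ y)
    e_hat = y - X @ beta_s
    sse = e_hat @ e_hat
    tss = ((y - y.mean()) ** 2).sum()          # equivalently y'M0y with M0 = I - (1/T)ii'
    r2 = 1.0 - sse / tss
    r2_adj = 1.0 - (1.0 - r2) * (T - 1) / (T - K)
    return r2, r2_adj

# Hypothetical data: adding an irrelevant regressor never lowers R^2,
# but it can lower the adjusted R^2
rng = np.random.default_rng(4)
T = 40
x1 = rng.uniform(10, 130, T)
y = 7.4 + 0.23 * x1 + rng.normal(0, 6.8, T)
X_small = np.column_stack([np.ones(T), x1])
X_big = np.column_stack([X_small, rng.normal(size=T)])   # irrelevant variable z
print(r_squared(X_small, y))
print(r_squared(X_big, y))
```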
Our Classical Regression Model
Change in R² from adding a variable (Greene: 28-31, 34)
R²X,z is the coefficient of determination in the regression of y on X and an additional variable, z
R²X is the coefficient of determination when regressing Y on X alone
r*yz is the partial correlation between y and z after accounting for the effects of X (see Greene: 28-31 for details concerning the calculation of r*yz)
The R² from adding a variable z to the regression is:
  R²X,z = R²X + (1 − R²X)·r*yz²
where (1 − R²X) is the % of variation in Y left unexplained after X, so the change in R² from adding z is (1 − R²X)·r*yz²

Food Expenditure Model Results Summary
  FOODEXP = 7.3832 + 0.2323 INC
            (4.008)  (0.055)    standard errors
  K = 2, T = 40
  TSS = 2607.0,  SSE = e'e = 1780.4,  RSS = 826.6
  σU² = 1780.4/(40 − 2) = 46.853
  R² = 826.6/2607.0 = 0.317
  R̄² = 1 − [1780.4/(40 − 2)] / [2607.0/(40 − 1)] = 0.299
  (X'X)⁻¹ = [0.342922   −0.0045548; −0.0045548   6.525442e-005]
  ΣβS = σU²(X'X)⁻¹ = [16.0669   −0.2134; −0.2134   0.0030]

Prediction Under The Classical Model

Prediction Under The Classical Model
Assume we have the CRM, which satisfies assumptions A1-A5
  βS = (X'X)⁻¹X'Y;  βS is the BLUE of β
We attempt to anticipate new/unknown values of Y0 given known explanatory variables X0 (JHGLL: 209-211)
Remember our assumption on the error variance: E(ee') = σ²IT
This assumption simplifies prediction; after our GLS lectures we will revisit this
Prediction variance: V(e0|X, X0) = σ²IT0 + σ²X0(X'X)⁻¹X0'

Prediction Under The Classical Model
With 2 parameters (one an intercept), and x0 denoting the value of the non-constant regressor at the forecast point:
  var(ê0) = σ²[1 + 1/T + (x0 − x̄)² / Σ(t=1 to T)(xt − x̄)²]
The 2nd and 3rd terms become progressively smaller as we collect more information
The 1st term is constant → no matter how much data one has, one can never predict with certainty
The farther the forecast point is from the center of the data, the greater the degree of uncertainty

Prediction Under The Classical Model
  V(e0|X, X0) = σ²IT0 + σ²X0(X'X)⁻¹X0'
Prediction variability is due to:
  the equation error term: σ²IT0
  variability in estimating the unknown parameters: σ²X0(X'X)⁻¹X0' = X0 Var(βS) X0'

Prediction Under The Classical Model
Let's assume that e ~ N(0, σ²IT) → the following distribution of the prediction error:
  (Ŷ0 − Y0) / [V̂(e0|X, X0)]½ ~ t(T−K)
With 2 parameters:
  V̂(e0) = σU²[1 + 1/T + (x0 − x̄)² / Σ(t=1 to T)(xt − x̄)²]
This implies the following forecast interval:
  Ŷ0 ± t(α/2, T−K)·[V̂(e0|X, X0)]½
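A minimal sketch of prediction under the CRM, with hypothetical data rather than the food expenditure sample: it computes the point forecast Ŷ0 = X0βS, the estimated prediction variance σU²(1 + X0(X'X)⁻¹X0'), and the t-based forecast interval, using the scipy t quantile for t(α/2, T−K).

```python
import numpy as np
from scipy import stats

def forecast_interval(X, y, x0, alpha=0.05):
    """Point prediction and (1 - alpha) forecast interval at a new observation x0."""
    T, K = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_s = XtX_inv @ X.T @ y
    e_hat = y - X @ beta_s
    sigma2_u = e_hat @ e_hat / (T - K)                 # sigma_U^2
    y0_hat = x0 @ beta_s
    # V(e0) = sigma^2 (1 + x0 (X'X)^-1 x0'): error term plus parameter uncertainty
    var_e0 = sigma2_u * (1.0 + x0 @ XtX_inv @ x0)
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=T - K)
    half_width = t_crit * np.sqrt(var_e0)
    return y0_hat, (y0_hat - half_width, y0_hat + half_width)

# Hypothetical example (not the actual food expenditure data)
rng = np.random.default_rng(5)
income = rng.uniform(10, 130, 40)
X = np.column_stack([np.ones(40), income])
y = 7.4 + 0.23 * income + rng.normal(0, 6.8, 40)
print(forecast_interval(X, y, np.array([1.0, 75.0])))  # forecast at income = 75
```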
© Copyright 2024