Sample Selection Regression Models (Ch. 17)

Until now we have always assumed that a random sample is available.
Now we cover cases in which no random sample is available.
There are two distinct cases:
- the sample was collected/selected according to the value of y
- the sample is selected by the behaviour of the population under
consideration (self-selection)
We focus on the second case.
Microeconometrics
Michael Gerfin
Fall 2008

Examples
Family wealth function
Effect of a pension plan on wealth accumulation:
$y = \beta_0 + \beta_1\,\mathit{plan} + \beta_2 x + u$
The sample only contains people with wealth below 100'000
→ Selection on the basis of y
Wage function
Estimation of a wage function for the working-age population
But wages are only observed for workers
→ y is only observable for a subsample that is defined by another
variable (working)
→ Self-selection: the decision to work depends on the wage
When Can Sample Selection Be Ignored?
Simply put: if selection is based on an exogenous right-hand-side variable,
it does not affect the consistency of OLS.
However, the precision of the estimates decreases (standard errors get
larger) because fewer observations are used.
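A short derivation, added here as a sketch, shows why selection on $x$ is harmless while selection on $y$ is not. The selection indicator $s$ and the cutoff $c$ are notation introduced for this illustration only; the model is $y = \beta_0 + \beta_1 x + u$ with $E(u \mid x) = 0$.

```latex
% Selection on x: s is a deterministic function of x, so conditioning on it adds nothing.
\begin{align*}
s = 1[x > c] \;&\Rightarrow\; E(u \mid x, s = 1) = E(u \mid x) = 0 \\
               &\Rightarrow\; E(y \mid x, s = 1) = \beta_0 + \beta_1 x
\end{align*}
% Selection on y: the conditioning event restricts u, and the restriction depends on x.
\begin{align*}
s = 1[y < c] \;&\Rightarrow\; E(u \mid x, s = 1) = E(u \mid x,\; u < c - \beta_0 - \beta_1 x) \neq 0 \text{ in general}
\end{align*}
```

In the first case the regression function in the selected sample is the same as in the population, so OLS remains consistent. In the second case the conditional mean of the error varies with $x$, which is what biases OLS.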
Example
set obs 10000
g x = uniform()
g u = invnorm(uniform())
g y = 1 + x + u
. reg y x if x > 0.25

      Source |       SS       df       MS              Number of obs =    7508
-------------+------------------------------           F(  1,  7506) =  282.79
       Model |  289.307451     1  289.307451           Prob > F      =  0.0000
    Residual |  7679.02682  7506   1.0230518           R-squared     =  0.0363
-------------+------------------------------           Adj R-squared =  0.0362
       Total |  7968.33427  7507  1.06145388           Root MSE      =  1.0115

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |    .911556   .0542066    16.82   0.000     .8052959    1.017816
       _cons |   1.051498   .0361101    29.12   0.000     .9807125    1.122284
------------------------------------------------------------------------------
. reg y x if x > 0.50

      Source |       SS       df       MS              Number of obs =    5069
-------------+------------------------------           F(  1,  5067) =   95.72
       Model |  97.2936697     1  97.2936697           Prob > F      =  0.0000
    Residual |   5150.4905  5067   1.0164773           R-squared     =  0.0185
-------------+------------------------------           Adj R-squared =  0.0183
       Total |  5247.78416  5068  1.03547438           Root MSE      =  1.0082

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9652273   .0986589     9.78   0.000     .7718133    1.158641
       _cons |    1.00676   .0755266    13.33   0.000     .8586954    1.154825
------------------------------------------------------------------------------

. reg y x if x > 0.75

      Source |       SS       df       MS              Number of obs =    2583
-------------+------------------------------           F(  1,  2581) =   11.09
       Model |  10.8707227     1  10.8707227           Prob > F      =  0.0009
    Residual |  2530.00844  2581  .980243489           R-squared     =  0.0043
-------------+------------------------------           Adj R-squared =  0.0039
       Total |  2540.87917  2582  .984074038           Root MSE      =  .99007

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .8980481   .2696729     3.33   0.001     .3692508    1.426845
       _cons |   1.069994   .2364745     4.52   0.000     .6062953    1.533693
------------------------------------------------------------------------------
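The growth of the standard errors across the three regressions can be checked with a back-of-the-envelope calculation. This is only a sketch: it uses the usual homoskedastic OLS formula and the population variance of a uniform variable truncated to $(c, 1)$, so it matches the reported standard errors only approximately.

```latex
\[
\operatorname{se}(\hat\beta_1) \approx \frac{\hat\sigma}{\sqrt{n \cdot \operatorname{Var}(x \mid x > c)}},
\qquad
\operatorname{Var}(x \mid x > c) = \frac{(1-c)^2}{12} \quad \text{for } x \sim U(0,1)
\]
\begin{align*}
c = 0.25:&\quad 1.0115 \,/\, \sqrt{7508 \cdot 0.5625/12} \approx 0.054 \\
c = 0.50:&\quad 1.0082 \,/\, \sqrt{5069 \cdot 0.25/12} \approx 0.098 \\
c = 0.75:&\quad 0.99007 \,/\, \sqrt{2583 \cdot 0.0625/12} \approx 0.270
\end{align*}
```

The slope estimates stay close to the true value of 1 in all three regressions, while the standard error rises roughly fivefold because both the sample size and the variation in x shrink.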
Self Selection
Sample selection is not the result of the sample design but of decisions
made by members of the population (self-selection)
Exogenous explanatory variables
Classic example: labour force participation and wages
We want to know $E(w_i \mid x_i)$ for a person randomly drawn from the
population ($w$: wage)
$w$ is only observed for working people.
Model of labour supply:
The decision to work is based on the difference between the market wage $w_i$
and the reservation wage $w_i^r$
Assume that
$w_i = \exp(x_{i1}\beta_1 + u_{i1})$
$w_i^r = \exp(x_{i2}\beta_2 + \gamma_2 a_i + u_{i2})$
with $(u_{i1}, u_{i2})$ independent of $(x_{i1}, x_{i2}, a_i)$. $x_{i1}$ contains productivity
characteristics and $x_{i2}$ contains characteristics that determine the marginal
utility of leisure and income (the two sets may overlap). Taking logs of the wage equation gives
$\log w_i = x_{i1}\beta_1 + u_{i1}$
But the wage is only observed if $w_i > w_i^r$, i.e. if
$\log w_i - \log w_i^r = x_{i1}\beta_1 - x_{i2}\beta_2 - \gamma_2 a_i + u_{i1} - u_{i2} \equiv x_i\delta_2 + v_{i2} > 0$
Problem: $w_i^r$ is not observed and depends on $x_{i2}$ and $u_{i2}$
→ we need another estimation procedure
Notation: drop the subscript $i$; $y_1 \equiv \log w$ and $y_2$ is a binary indicator for working
(1) $y_1 = x_1\beta_1 + u_1$
(2) $y_2 = 1[x\delta_2 + v_2 > 0]$
(2) is a probit if $v_2$ is normally distributed
Assumptions 17.1: (a) $(x, y_2)$ are always observed, $y_1$ is only observed
if $y_2 = 1$; (b) $(u_1, v_2)$ is independent of $x$ with zero mean; (c)
$v_2 \sim N(0,1)$; and (d) $E(u_1 \mid v_2) = \gamma_1 v_2$
(a) describes the selection process; (b) is a strong exogeneity
assumption; (c) is necessary to derive a conditional expectation given
the selected sample; and (d) requires linearity of the regression of $u_1$ on $v_2$.
(d) always holds if $(u_1, v_2)$ is bivariate normal (but it is not necessary to
assume that $u_1$ is normally distributed).
This model is also called the Type II Tobit model. It is important to recognise that
we are not dealing with a corner solution for $y_1$
→ $y_1$ must not be set to zero for estimation (it is missing)
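Assumption (d) can be made concrete in the bivariate normal case. The symbols $\sigma_1^2 = \operatorname{Var}(u_1)$, $\sigma_{12} = \operatorname{Cov}(u_1, v_2)$ and $\rho = \operatorname{Corr}(u_1, v_2)$ are introduced here for illustration and are not part of the assumptions themselves.

```latex
% Conditional mean of a bivariate normal, using Var(v_2) = 1 from assumption (c)
\[
E(u_1 \mid v_2) = \frac{\operatorname{Cov}(u_1, v_2)}{\operatorname{Var}(v_2)}\, v_2 = \sigma_{12}\, v_2
\quad\Rightarrow\quad
\gamma_1 = \sigma_{12} = \rho\,\sigma_1
\]
```

In this case $\gamma_1 = 0$ exactly when the errors of the outcome equation and the selection equation are uncorrelated.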
Estimation of Selection Model
Let $(y_1, y_2, x, u_1, v_2)$ denote a random draw from the population. Given
the selection rule we can hope to estimate
$E(y_1 \mid x, y_2 = 1)$ and $P(y_2 = 1 \mid x)$
How does $E(y_1 \mid x, y_2 = 1)$ depend on $\beta_1$? First,
(3) $E(y_1 \mid x, v_2) = x_1\beta_1 + E(u_1 \mid x, v_2) = x_1\beta_1 + E(u_1 \mid v_2) = x_1\beta_1 + \gamma_1 v_2$
where the second equality follows because $(u_1, v_2)$ is independent of $x$
If $\gamma_1 = 0$ → no selection problem!
What if $\gamma_1 \neq 0$? Using iterated expectations on (3) gives
$E(y_1 \mid x, y_2) = E[E(y_1 \mid x, v_2) \mid x, y_2] = x_1\beta_1 + \gamma_1 E(v_2 \mid x, y_2) = x_1\beta_1 + \gamma_1 h(x, y_2)$
where $h(x, y_2) = E(v_2 \mid x, y_2)$
If we knew $h(x, y_2)$, we could estimate $\beta_1$ and $\gamma_1$ from the regression of
$y_1$ on $x_1$ and $h(x, y_2)$ (in the selected sample).
In the selected sample $y_2 = 1$ → we only have to find $h(x, 1)$.
$h(x, 1) = E(v_2 \mid v_2 > -x\delta_2) = \lambda(x\delta_2)$, where $\lambda(\cdot) = \phi(\cdot)/\Phi(\cdot)$ is the inverse Mills ratio
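The truncated mean of a standard normal variable, which gives the inverse Mills ratio, can be derived in a few lines. The constant $c$ stands in for $x\delta_2$ and is notation used only in this sketch.

```latex
% For v ~ N(0,1): phi'(v) = -v phi(v), so v phi(v) integrates to -phi(v).
\begin{align*}
E(v \mid v > -c)
  &= \frac{\int_{-c}^{\infty} v\,\phi(v)\,dv}{P(v > -c)}
   = \frac{\big[-\phi(v)\big]_{-c}^{\infty}}{1 - \Phi(-c)} \\
  &= \frac{\phi(-c)}{\Phi(c)}
   = \frac{\phi(c)}{\Phi(c)}
   = \lambda(c)
\end{align*}
```

The last steps use the symmetry of the standard normal distribution, $\phi(-c) = \phi(c)$ and $1 - \Phi(-c) = \Phi(c)$.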
This implies
(4) $E(y_1 \mid x, y_2 = 1) = x_1\beta_1 + \gamma_1\lambda(x\delta_2)$
From (4) it is obvious that OLS of $y_1$ on $x_1$ in the selected sample omits
the term $\lambda(x\delta_2)$ → omitted variable bias
(4) also shows a way to consistently estimate $\beta_1$.
Heckman (1979) has shown that $\beta_1$ and $\gamma_1$ can be consistently
estimated in the selected sample by regressing $y_1$ on $x_1$ and $\lambda(x\delta_2)$.
But $\delta_2$ is unknown and must be estimated in a first step (using probit).
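The omitted variable bias can be written out with the standard linear projection argument. The projection coefficient $\pi$ and the projection error $r$ are notation introduced for this sketch.

```latex
% Linear projection of the omitted term on x_1 in the selected population
\[
\lambda(x\delta_2) = x_1\pi + r, \qquad E(x_1' r \mid y_2 = 1) = 0
\]
% Probability limit of OLS of y_1 on x_1 alone in the selected sample
\[
\operatorname{plim}\,\hat\beta_1^{\,OLS} = \beta_1 + \gamma_1 \pi
\]
```

OLS on the selected sample is therefore consistent for a given coefficient only if $\gamma_1 = 0$ or the corresponding element of $\pi$ is zero.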
Heckman Estimator
Step 1: Estimate the probit model
(5) $P(y_{i2} = 1 \mid x_i) = \Phi(x_i\delta_2)$
using all observations. Obtain $\hat\lambda_{i2} \equiv \lambda(x_i\hat\delta_2)$
Step 2: Estimate $\hat\beta_1$ and $\hat\gamma_1$ using OLS in the selected sample
(6) $y_{i1} = x_{i1}\beta_1 + \gamma_1\hat\lambda_{i2} + u_i$
This estimator is consistent and asymptotically normally distributed
Simple test for selection bias:
under $H_0$ (no selection bias), $\gamma_1 = 0$ in (6) → t-test for $\gamma_1$.
IMPORTANT: this test is only valid if the model is correctly specified
(distributional assumptions)
If $\gamma_1 \neq 0$ the standard errors of $\hat\beta_1$ must be corrected
- for heteroskedasticity
- because $\delta_2$ has been estimated in the first step
Stata does this for you if you use the command heckman
Data generation for selection problem
set obs 10000
g x = uniform()
g z = uniform()
matrix c = (4, 1 \ 1, 1)              /* covariance matrix of (u1, v2) */
drawnorm u1 v2, n(10000) cov(c)       /* correlated error terms */
g y1 = 1 + x + u1
g y2star = 0.5 + 0.5*x + 0.5*z + v2
g y2 = y2star>0.6                     /* selection indicator */
replace y1 = . if y2==0               /* set y1 to missing if not selected */
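The population parameters implied by this design can be worked out directly from the covariance matrix and the selection rule. The calculation below is a sketch added for orientation; it assumes that $(u_1, v_2)$ is bivariate normal, as generated by drawnorm.

```latex
\begin{align*}
\operatorname{Var}(u_1) &= 4, \qquad \operatorname{Var}(v_2) = 1, \qquad \operatorname{Cov}(u_1, v_2) = 1 \\
\rho &= \frac{1}{\sqrt{4}\cdot\sqrt{1}} = 0.5, \qquad
\gamma_1 = \frac{\operatorname{Cov}(u_1, v_2)}{\operatorname{Var}(v_2)} = 1 \\
y_2 &= 1[\,0.5 + 0.5x + 0.5z + v_2 > 0.6\,] = 1[\,-0.1 + 0.5x + 0.5z + v_2 > 0\,] \\
E(y_1 \mid x, z, y_2 = 1) &= 1 + x + \lambda(-0.1 + 0.5x + 0.5z)
\end{align*}
```

The true probit coefficients are thus $(-0.1, 0.5, 0.5)$ and the true coefficient on $\lambda$ is 1. Because $\lambda(\cdot)$ is decreasing in its argument and $\gamma_1 > 0$, one would expect OLS of y1 on x in the selected sample to understate the slope of 1.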
OLS

. reg y1 x

      Source |       SS       df       MS              Number of obs =    6531
  .
  .
------------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .7547057   .0804123     9.39   0.000     .5970712    .9123402
       _cons |   1.676609   .0482139    34.77   0.000     1.582094    1.771124
------------------------------------------------------------------------------
Heckman Two Step Estimator

. heckman y1 x, select(y2 = x z) twostep

Heckman selection model -- two-step estimates   Number of obs      =     10000
(regression model with sample selection)        Censored obs       =      3469
                                                Uncensored obs     =      6531

                                                Wald chi2(2)       =    215.55
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1           |
           x |   1.076284   .1234465     8.72   0.000     .8343331    1.318235
       _cons |   .8734129   .2295979     3.80   0.000     .4234092    1.323416
-------------+----------------------------------------------------------------
y2           |
           x |   .5332074   .0451385    11.81   0.000     .4447375    .6216772
           z |   .5002837   .0452514    11.06   0.000     .4115926    .5889748
       _cons |  -.1151823   .0340753    -3.38   0.001    -.1819686   -.0483959
-------------+----------------------------------------------------------------
mills        |
      lambda |   1.146936   .3192996     3.59   0.000     .5211204    1.772752
-------------+----------------------------------------------------------------
         rho |    0.55818
       sigma |  2.0547643
      lambda |  1.1469362   .3192996
------------------------------------------------------------------------------
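The two steps behind this output can also be carried out by hand. The following is a minimal sketch, not the procedure used for the output above: it regenerates data with the same design, and the seed and the variable names xd and lambda_hat are illustrative assumptions, so the numbers will differ. It also assumes a Stata version that provides runiform() (the newer name for uniform()).

```stata
* Sketch: Heckman two-step by hand on simulated data (seed and names are illustrative)
clear
set seed 12345
set obs 10000
generate x = runiform()
generate z = runiform()
matrix c = (4, 1 \ 1, 1)                 // covariance matrix of (u1, v2)
drawnorm u1 v2, cov(c)                   // correlated error terms
generate y2 = (0.5 + 0.5*x + 0.5*z + v2 > 0.6)
generate y1 = 1 + x + u1 if y2 == 1      // y1 is missing when not selected

* Step 1: probit on all observations, then the inverse Mills ratio
probit y2 x z
predict xd, xb                           // linear index x*delta2-hat
generate lambda_hat = normalden(xd) / normal(xd)

* Step 2: OLS in the selected sample, adding lambda_hat as a regressor
regress y1 x lambda_hat if y2 == 1

* Built-in command for comparison
heckman y1 x, select(y2 = x z) twostep
```

The point estimates from the manual second step should coincide with those from heckman, but the standard errors reported by regress are not valid because they ignore the first-step estimation of $\delta_2$ and the heteroskedasticity of the error term.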
Theoretically, it is not necessary that $x_1$ is a strict subset of $x$
→ $\beta_1$ is identified even if $x = x_1$ (because $\lambda$ is a nonlinear function of $x\delta_2$)
However, in practice $\lambda$ is often almost a linear function of $x$ over the range of the data
→ severe multicollinearity → very imprecise estimates
⇒ Strong recommendation: you should have at least one element in $x$
that is not in $x_1$ (exclusion restriction)
[Figure: Relation between xβ and λ. Horizontal axis: xb, ticks from -4 to 4; vertical axis: lambda, ticks from 0 to 3.]
Simulation continued

. heckman y1 x z, select(y2 = x z) twostep

Heckman selection model -- two-step estimates   Number of obs      =     10000
(regression model with sample selection)        Censored obs       =      3469
                                                Uncensored obs     =      6531

                                                Wald chi2(4)       =    333.21
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1           |
           x |   .4166927   .9797537     0.43   0.671    -1.503589    2.336975
           z |   -.623502   .9195849    -0.68   0.498    -2.425855    1.178851
       _cons |   2.828842   2.889762     0.98   0.328    -2.834987     8.49267
-------------+----------------------------------------------------------------
y2           |
           x |   .5332074   .0451385    11.81   0.000     .4447375    .6216772
           z |   .5002837   .0452514    11.06   0.000     .4115926    .5889748
       _cons |  -.1151823   .0340753    -3.38   0.001    -.1819686   -.0483959
-------------+----------------------------------------------------------------
mills        |
      lambda |  -1.171361   3.428838    -0.34   0.733    -7.891761    5.549039
-------------+----------------------------------------------------------------
         rho |   -0.56807
       sigma |  2.0620111
      lambda | -1.1713608   3.428838
------------------------------------------------------------------------------
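One way to see why the estimates become so imprecise without an exclusion restriction is to check how closely the estimated $\lambda$ is a linear function of x and z in the selected sample. The following is a minimal sketch that regenerates data with the same design as the simulation; the seed and the names xd and lambda_hat are illustrative assumptions, so it does not reproduce the output shown here.

```stata
* Sketch: how collinear is lambda-hat with (x, z)? (seed and names are illustrative)
clear
set seed 12345
set obs 10000
generate x = runiform()
generate z = runiform()
matrix c = (4, 1 \ 1, 1)                 // covariance matrix of (u1, v2)
drawnorm u1 v2, cov(c)
generate y2 = (0.5 + 0.5*x + 0.5*z + v2 > 0.6)

* First step: probit index and inverse Mills ratio
probit y2 x z
predict xd, xb
generate lambda_hat = normalden(xd) / normal(xd)

* Auxiliary regression in the selected sample
regress lambda_hat x z if y2 == 1
display "R-squared of lambda_hat on (x, z): " e(r2)
display "Implied variance inflation factor:  " 1/(1 - e(r2))
```

An R-squared close to 1 in the auxiliary regression means that $\hat\lambda$ adds almost no independent variation once x and z are both in the outcome equation, which is the multicollinearity problem described above.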
Generate Table 17.1

                         OLS                Heckman 2 step
wage equation
  educ                    0.107 (7.60)**     0.109 (7.03)**
  exper                   0.042 (3.15)**     0.044 (2.70)**
  expersq                -0.001 (2.06)*     -0.001 (1.96)
  mills:lambda                               0.032 (0.24)
  Constant               -0.522 (2.63)**    -0.578 (1.90)
selection equation
  inlf:educ                                  0.131 (5.18)**
  inlf:exper                                 0.123 (6.59)**
  inlf:expersq                              -0.002 (3.15)**
  inlf:age                                  -0.053 (6.23)**
  inlf:kidslt6                              -0.868 (7.33)**
  inlf:kidsge6                               0.036 (0.83)
  inlf:nwifeinc                             -0.012 (2.48)*
  inlf:Constant                              0.270 (0.53)
lambda                                       .032 (0.24)
sigma                                        .663
Observations              428                753
R-squared                 0.16
Absolute value of t statistics in parentheses
* significant at 5%; ** significant at 1%
Second example: data from the Swiss expenditure survey 1998

. reg lnwage edu* age* foreign city

      Source |       SS       df       MS              Number of obs =    3414
-------------+------------------------------           F(  6,  3407) =  108.52
       Model |  123.740716     6  20.6234526           Prob > F      =  0.0000
    Residual |  647.467676  3407   .19004041           R-squared     =  0.1605
-------------+------------------------------           Adj R-squared =  0.1590
       Total |  771.208392  3413  .225962025           Root MSE      =  .43594

------------------------------------------------------------------------------
      lnwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       edu_l |  -.2510206   .0249717   -10.05   0.000    -.2999816   -.2020595
       edu_h |   .2751949   .0171604    16.04   0.000     .2415491    .3088407
         age |   .0448688   .0058037     7.73   0.000     .0334897     .056248
        age2 |  -.0049732   .0007318    -6.80   0.000    -.0064081   -.0035384
     foreign |  -.0843714   .0222799    -3.79   0.000    -.1280547   -.0406882
        city |   .0520015   .0166524     3.12   0.002     .0193518    .0846511
       _cons |   2.293195   .1104736    20.76   0.000     2.076594    2.509796
------------------------------------------------------------------------------
Heckman selection model -- two-step estimates   Number of obs      =      4800
(regression model with sample selection)        Censored obs       =      1386
                                                Uncensored obs     =      3414

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
lnwage       |
       edu_l |  -.2454693    .025113    -9.77   0.000    -.2946899   -.1962487
       edu_h |   .2690675   .0174234    15.44   0.000     .2349183    .3032168
         age |   .0464142   .0058536     7.93   0.000     .0349413     .057887
        age2 |  -.0051244   .0007356    -6.97   0.000    -.0065662   -.0036827
     foreign |  -.0847045   .0222888    -3.80   0.000    -.1283897   -.0410193
        city |    .047618   .0167895     2.84   0.005     .0147112    .0805248
       _cons |   2.284004   .1106135    20.65   0.000     2.067206    2.500803
-------------+----------------------------------------------------------------
ilf          |
       edu_l |  -.2140455   .0631543    -3.39   0.001    -.3378256   -.0902654
       edu_h |   .2711794   .0531317     5.10   0.000     .1670431    .3753157
         age |   .1491991   .0189459     7.87   0.000     .1120657    .1863324
        age2 |  -.0208317   .0023369    -8.91   0.000    -.0254119   -.0162514
     foreign |   .0905475   .0649259     1.39   0.163    -.0367049       .2178
        city |   .0058761   .0455646     0.13   0.897    -.0834288     .095181
       inc_0 |   -.239319   .0354921    -6.74   0.000    -.3088821   -.1697558
        kids |   -.480615   .0231543   -20.76   0.000    -.5259967   -.4352334
     married |  -.4891778   .0878587    -5.57   0.000    -.6613777   -.3169779
       _cons |  -.7185278   .3602989    -1.99   0.046    -1.424701    -.012355
-------------+----------------------------------------------------------------
mills        |
      lambda |   -.056072   .0271142    -2.07   0.039    -.1092149    -.002929
-------------+----------------------------------------------------------------
         rho |   -0.12842
       sigma |  .43663547
      lambda | -.05607195   .0271142
------------------------------------------------------------------------------
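The last three rows of the output are linked by a simple identity. Judging from the printed values, the two-step output reports rho as the ratio of the coefficient on lambda to sigma; the check below is a sketch based only on the numbers shown.

```latex
\[
\hat\rho = \frac{\hat\lambda}{\hat\sigma} = \frac{-0.05607195}{0.43663547} \approx -0.1284
\]
```

Under bivariate normality the coefficient on $\lambda$ estimates $\gamma_1 = \rho\,\sigma_1$, so its negative sign points to a (weak) negative correlation between the errors of the wage equation and the participation equation in these data.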
Predictions after estimation of selection models
Selection models are often used to predict the dependent variable for
the observations that are not in the selected subsample
Example: expected wage of nonworkers
Correct prediction:
$\hat E(y_{i1} \mid x_i) = x_{i1}\hat\beta_1$
and NOT
$\hat E(y_{i1} \mid x_i, y_{i2} = 1) = x_{i1}\hat\beta_1 + \hat\gamma_1\hat\lambda_{i2} \neq \hat E(y_{i1} \mid x_i)$
Stata
heckman lnlohn ...                    /* selection model for ln(wage) */
predict lnlohn_pred, e(.,.)           /* prediction of ln(wage) */
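The difference between the possible predictions can be made explicit. The expression for the non-selected group is not stated above; it follows from the same truncated-normal argument used for the selected sample and is included here as a sketch.

```latex
\begin{align*}
E(y_1 \mid x) &= x_1\beta_1 \\
E(y_1 \mid x, y_2 = 1) &= x_1\beta_1 + \gamma_1\,\frac{\phi(x\delta_2)}{\Phi(x\delta_2)} \\
E(y_1 \mid x, y_2 = 0) &= x_1\beta_1 - \gamma_1\,\frac{\phi(x\delta_2)}{1 - \Phi(x\delta_2)}
\end{align*}
% Consistency check: weighting by P(y_2 = 1 | x) and P(y_2 = 0 | x) recovers the first line
\[
\Phi(x\delta_2)\,E(y_1 \mid x, y_2 = 1) + \big(1 - \Phi(x\delta_2)\big)\,E(y_1 \mid x, y_2 = 0) = x_1\beta_1
\]
```

The first line answers the question of what a randomly drawn person with characteristics $x$ would earn, which is why it is the appropriate prediction for the wage offer of nonworkers.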