Chapters 14 and 15 Overview: Panel Data and Instrumental Variables

Pooled Cross Section
Panel Data
Instrumental Variables
Jeff Borowitz
Georgia State University
Pooled Cross Sections
Have data for multiple units (cross section)
Have data at multiple periods of time (time series)
Use them all together: a pooled cross section
Pooled Cross Section: Example
House prices in Boston before and after an incinerator is put near some houses
Dummy: nearinc = 1 if house is near incinerator (once incinerator built)
Real dollar price: rprice
Model with incinerator:
rprice = β0 + β1 nearinc + u
Result: β̂1 = −30688 ± 5827
Problem: houses near the incinerator were generally in a worse part of town
Same model run on data from before the incinerator was rumored:
β̂1 = −18824 ± 4744
Difference-in-differences estimate: −30688 − (−18824) = −11864
Difference-In-Differences
Model
y = β0 + δ0 d2 + β1 dT + δ1 d2 · dT + . . . + u
d2: a dummy for the period after treatment has happened
dT: a dummy for the treatment group
δ1 is the difference-in-differences coefficient
                 Control      Treatment            Treatment − Control
Before           β0           β0 + β1              β1
After            β0 + δ0      β0 + δ0 + β1 + δ1    β1 + δ1
After − Before   δ0           δ0 + δ1              δ1
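The difference-in-differences model above can be checked numerically. The following is a minimal sketch on simulated data, assuming made-up parameter values (β0 = 10, δ0 = 1, β1 = −2, δ1 = 3) and a model with no additional controls. It shows that the OLS coefficient on the interaction d2 · dT equals the difference of the four cell means.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Hypothetical simulated data: dT marks the treatment group, d2 the after period.
dT = rng.integers(0, 2, n)
d2 = rng.integers(0, 2, n)
beta0, delta0, beta1, delta1 = 10.0, 1.0, -2.0, 3.0  # assumed true values
y = beta0 + delta0 * d2 + beta1 * dT + delta1 * d2 * dT + rng.normal(0, 1, n)

# OLS with an intercept, both dummies, and their interaction.
X = np.column_stack([np.ones(n), d2, dT, d2 * dT])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
did_ols = coef[3]  # coefficient on d2 * dT

# The same quantity from the 2x2 table of cell means.
def cell_mean(treated, after):
    return y[(dT == treated) & (d2 == after)].mean()

did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

print(f"DiD from OLS interaction: {did_ols:.4f}")
print(f"DiD from cell means:      {did_means:.4f}")
```

Because the regression is saturated in the two dummies, the two numbers agree up to floating-point error; adding controls would break the exact equality but not the interpretation of δ1.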
Difference-in-Differences (figure)
Natural Experiments
Difference-in-differences is a powerful tool for natural experiments
Often a policy affects only some units, and only some of the time
Effect of state changes in maternity leave policies on maternal employment
Effect of the Massachusetts universal healthcare requirement on uninsurance rates
Effect of the Mariel boatlift on the Miami labor market
Card and Krueger
Minimum wage changed on April 1, 1992 from $4.25 to $5.05 in New Jersey but not Pennsylvania
Card and Krueger surveyed fast food restaurants on both sides of the border about how many workers they employed
Regressed:
empl = β0 + δ0 after + β1 NJ + δ1 after · NJ + X γ + u
Found no evidence that the minimum wage increase reduced employment
Clustering
It turns out that Card and Krueger made an error
The usual OLS standard errors assume cov(ui, uj) = 0 for i ≠ j
But two nearby fast food restaurants in NJ might have similar demand shocks, so cov(ui, uj) > 0
Turns out this is a big problem!
Bertrand, Duflo, and Mullainathan
Take data on average state female wages from the real CPS
Apply a fake treatment to some states at different times (there is no real effect)
Find a statistically significant effect at the 5% level 45% of the time!
Solution: use “clustered” standard errors, available in Stata/R
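The over-rejection problem above can be illustrated with a small placebo simulation. This is a simplified sketch, not a replication of the Bertrand, Duflo, and Mullainathan exercise: it assumes 40 hypothetical groups of 25 observations, a common shock within each group, and a fake treatment assigned to half the groups. The function `ols_with_se` is an illustrative helper that computes the conventional and the cluster-robust (sandwich) standard errors by hand, without the small-sample corrections that statistical packages typically add.

```python
import numpy as np

def ols_with_se(X, y, groups):
    """OLS coefficients with conventional and cluster-robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Conventional: sigma^2 (X'X)^{-1}, valid only if errors are uncorrelated.
    sigma2 = resid @ resid / (n - k)
    se_conv = np.sqrt(np.diag(sigma2 * XtX_inv))

    # Cluster-robust: (X'X)^{-1} [sum_g X_g' u_g u_g' X_g] (X'X)^{-1}.
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        idx = groups == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    se_clu = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_conv, se_clu

rng = np.random.default_rng(1)
G, m, reps = 40, 25, 300          # assumed design: groups, obs per group, replications
groups = np.repeat(np.arange(G), m)
reject_conv = 0
reject_clu = 0

for _ in range(reps):
    treated = rng.permutation(G) < G // 2        # fake treatment for half the groups
    d = treated[groups].astype(float)
    shock = rng.normal(0, 1, G)[groups]          # common shock within each group
    y = shock + rng.normal(0, 1, G * m)          # the treatment has no true effect
    X = np.column_stack([np.ones(G * m), d])
    beta, se_conv, se_clu = ols_with_se(X, y, groups)
    reject_conv += int(abs(beta[1] / se_conv[1]) > 1.96)
    reject_clu += int(abs(beta[1] / se_clu[1]) > 1.96)

rate_conv = reject_conv / reps
rate_clu = reject_clu / reps
print(f"Rejection rate, conventional SEs: {rate_conv:.2f}")
print(f"Rejection rate, clustered SEs:    {rate_clu:.2f}")
```

With within-group correlation this strong, the conventional test rejects a true null far more often than 5% of the time, while the clustered test stays close to its nominal size.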
Panel Data
Have multiple observations for the same unit over time
Generally, we think important determinants of y are unobserved
But what if they are unobserved and fixed over time?
yit = β1 xit + ai + uit
Average all observations for an individual:
ȳi = β1 x̄i + ai + ūi
Now subtract the average from each individual observation, and ai cancels out!
yit − ȳi = β1 (xit − x̄i) + uit − ūi
Fixed Effects Estimation
Applying OLS to the demeaned equation gives the within estimator, or fixed effects estimator
yit − ȳi = β1 (xit − x̄i) + uit − ūi
We could extend to the situation with controls as:
ẏit = β1 ẋit + β2 żit + . . . + u̇it
where ẏit = yit − ȳi
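The within transformation can be sketched in a few lines. The example below uses simulated data under assumed values (500 individuals, 5 periods, β1 = 2, and an unobserved effect ai that is positively correlated with xit); `demean_by_group` is an illustrative helper, not a library function. It compares pooled OLS, which is biased here, with the fixed effects estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 500, 5                       # assumed panel dimensions
ids = np.repeat(np.arange(N), T)    # individual index for each row

a = rng.normal(0, 1, N)                           # unobserved fixed effect a_i
x = 0.8 * a[ids] + rng.normal(0, 1, N * T)        # x_it is correlated with a_i
beta1 = 2.0                                       # assumed true slope
y = beta1 * x + a[ids] + rng.normal(0, 1, N * T)

def demean_by_group(v, ids, n_groups):
    """Subtract each individual's time average from that individual's observations."""
    sums = np.bincount(ids, weights=v, minlength=n_groups)
    counts = np.bincount(ids, minlength=n_groups)
    return v - (sums / counts)[ids]

# Fixed effects (within) estimator: OLS on the demeaned data.
x_dot = demean_by_group(x, ids, N)
y_dot = demean_by_group(y, ids, N)
beta_fe = (x_dot @ y_dot) / (x_dot @ x_dot)

# Pooled OLS slope, which ignores a_i.
xc, yc = x - x.mean(), y - y.mean()
beta_pooled = (xc @ yc) / (xc @ xc)

print(f"Pooled OLS slope:    {beta_pooled:.3f}")
print(f"Fixed effects slope: {beta_fe:.3f}")
```

In this setup pooled OLS overstates β1 because ai is left in the error term and is correlated with xit, while the within estimator recovers a value close to the assumed 2.0.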
Fixed Effects: Discussion
Fixed effects is the true workhorse model of modern applied econometric analysis!
Why?
It can control for unobservables that are correlated with our independent variables
It makes no assumptions about their nature or form: just that they are fixed over time and enter linearly
It’s easy to do in Stata/R
Fixed Effects: Drawbacks
You can’t estimate the effects of variables that don’t vary within individuals over time (e.g. education)
If there are in fact no fixed effects, you throw away a lot of your variation for nothing, and get bigger standard errors
Measurement error problems can be exacerbated, because demeaning removes much of the true variation in x
Random Effects
Another way to think of the model:
yit = β1 xit + ai + uit
Composite error term:
vit = ai + uit
If ai is uncorrelated with xit, fixed effects throws away useful variation.
Can estimate using GLS: there is a simple data transformation
Problem: ai is probably correlated with xit
If it weren’t, we wouldn’t be worried about unobservables!
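The "simple data transformation" can be written out as follows. This is the standard textbook quasi-demeaning form, stated here under assumptions the slide does not spell out: a balanced panel with T periods, with σ_a² and σ_u² denoting the variances of ai and uit.

```latex
\[
y_{it} - \theta \bar{y}_i = \beta_1 \left( x_{it} - \theta \bar{x}_i \right) + \left( v_{it} - \theta \bar{v}_i \right),
\qquad
\theta = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T \sigma_a^2}}
\]
```

When θ = 0 this is pooled OLS, and when θ = 1 it is the fixed effects transformation, so random effects sits between the two.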
Example: log(wage) estimation
Independent variable   Pooled OLS      Random Effects   Fixed Effects
educ                    .091 (.005)     .092 (.011)      −
black                  −.139 (.024)    −.139 (.048)      −
hispan                  .016 (.021)     .022 (.043)      −
exper                   .067 (.014)     .106 (.015)      −
married                 .108 (.016)     .064 (.017)      .047 (.018)
union                   .182 (.017)     .106 (.018)      .080 (.019)
Standard errors in parentheses; − means no fixed effects estimate is reported
The coefficients on married and union shrink from pooled OLS to fixed effects, which suggests cov(x, a) > 0
Panel Data and Other Structures
Differencing out unobserved effects can be useful in other settings
For example, within-twin/within-family differencing
E.g. Black, Devereux, and Salvanes
Use Norwegian registry data on birthweight/family
Run fixed effects regressions for outcomes
Instrumental Variables
A big problem that we have is that our variable of interest is correlated with unobservables
A proxy would work, but only if it’s a good one
Instrumental variables are like the opposite of a proxy: a proxy should be correlated with the unobservable, while an instrument must be uncorrelated with it
Consider:
y = β0 + β1 x + u
where we worry that cov(x, u) ≠ 0
An instrumental variable is a variable z where:
cov(z, u) = 0
cov(z, x) ≠ 0
These conditions are, respectively, exogeneity and relevance
We can test relevance by looking at the “first stage”, which is a regression of x on z
Instrumental Variables: Estimation
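Before, we would have estimated β̂1 with:
\[
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})}
\]
Now we do:
\[
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})}
\]
Standard errors are bigger: the variance is divided by the squared correlation of x and z, ρ²x,z
\[
\frac{\sigma^2}{n \sigma_x^2} \;\Rightarrow\; \frac{\sigma^2}{n \sigma_x^2 \rho_{x,z}^2}
\]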
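The two estimators above can be compared on simulated data. The sketch below assumes a made-up data-generating process (β0 = 1, β1 = 2, a first-stage slope of 0.5, and an x that depends on the error u), so every number in it is illustrative. It computes the OLS slope, the IV slope, the first-stage slope from a regression of x on z, and the correlation ρx,z that scales the IV variance.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000

z = rng.normal(0, 1, n)                        # instrument: independent of u
u = rng.normal(0, 1, n)                        # structural error
x = 0.5 * z + 0.8 * u + rng.normal(0, 1, n)    # endogenous: cov(x, u) != 0
beta0, beta1 = 1.0, 2.0                        # assumed true parameters
y = beta0 + beta1 * x + u

def dev(v):
    """Deviations from the sample mean."""
    return v - v.mean()

# OLS slope: sum (x - xbar)(y - ybar) / sum (x - xbar)(x - xbar)
beta_ols = (dev(x) @ dev(y)) / (dev(x) @ dev(x))

# IV slope: sum (z - zbar)(y - ybar) / sum (z - zbar)(x - xbar)
beta_iv = (dev(z) @ dev(y)) / (dev(z) @ dev(x))

# First stage: slope from regressing x on z (a check on relevance).
first_stage_slope = (dev(z) @ dev(x)) / (dev(z) @ dev(z))

# Correlation of x and z, which governs how much larger the IV variance is.
rho_xz = np.corrcoef(x, z)[0, 1]

print(f"OLS slope:         {beta_ols:.3f}")
print(f"IV slope:          {beta_iv:.3f}")
print(f"First-stage slope: {first_stage_slope:.3f}")
print(f"corr(x, z):        {rho_xz:.3f}")
```

OLS is pulled away from the assumed β1 = 2 by the correlation between x and u, while the IV estimate is close to it; the price is a variance inflated by 1/ρ²x,z.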
Instrumental Variables: Example
Consider the wage regression with ability omitted:
log(wage) = β0 + β1 educ + β2 abil + e
An instrument here needs:
To be uncorrelated with ability (and with other unobserved determinants of wages)
To be correlated with education
A couple of reasonable choices:
Quarter of birth (Angrist and Krueger 1991)
Distance to nearest college (Card 1995)
Further example: draft lottery number as an instrument for the effect of veteran status on wages (Angrist 1990)
Instrumental Variables: Further Examples
Solution to the measurement error problem
Assume you know x measures the true x∗ with noise
This causes classical errors in variables: it biases the coefficient on x toward zero
But what if you have another measure of the true x∗, z?
If z is uncorrelated with x − x∗ (it has different measurement error) then it could be a valid instrument
Twinsburg, OH twins festival (Ashenfelter and Krueger 1994)
Use one twin’s report of the other twin’s education as an instrument for that twin’s own reported education
Find little ability bias in the returns to schooling
Find schooling is often measured with some error
European settler mortality rates as an instrument for institution quality (Acemoglu, Johnson, and Robinson 2001)
Want to explain why equatorial/African countries have lower GDP
European colonists set up different institutions depending on their mortality rate
Using mortality as an instrument for institution quality, there is no difference in income between African/equatorial countries and the rest
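The measurement error case above can be sketched numerically. The simulation below assumes a true regressor x∗ with unit variance, two noisy measures x and z whose errors are independent of each other and of everything else, and a true slope of 1; all of these values are illustrative assumptions. OLS on the noisy x is attenuated toward zero, and using z as an instrument for x removes the attenuation.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000

x_star = rng.normal(0, 1, n)             # true regressor (unobserved in practice)
x = x_star + rng.normal(0, 1, n)         # first noisy measure of x_star
z = x_star + rng.normal(0, 1, n)         # second measure with independent error
beta1 = 1.0                              # assumed true slope
y = beta1 * x_star + rng.normal(0, 1, n)

def dev(v):
    """Deviations from the sample mean."""
    return v - v.mean()

# OLS of y on the noisy x: attenuated by var(x_star) / var(x), here about 1/2.
beta_ols = (dev(x) @ dev(y)) / (dev(x) @ dev(x))

# IV using the second measure z as an instrument for x.
beta_iv = (dev(z) @ dev(y)) / (dev(z) @ dev(x))

print(f"OLS slope on noisy x: {beta_ols:.3f}")
print(f"IV slope using z:     {beta_iv:.3f}")
```

The instrument works here only because the two measurement errors are independent; if both measures shared the same error, z would be correlated with x − x∗ and the IV estimate would be biased as well.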
Problems with Instrumental Variables
We can’t ever directly check the correlation between z and u, because u is unobserved
Low correlation between z and x (a weak instrument) magnifies the bias from any correlation between z and u
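The weak-instrument point can be made precise with the standard large-sample expressions for the simple model y = β0 + β1 x + u. Here σ_u and σ_x denote the standard deviations of u and x, notation assumed for this sketch.

```latex
\[
\operatorname{plim} \hat{\beta}_{1,\mathrm{IV}} = \beta_1 + \frac{\operatorname{corr}(z, u)}{\operatorname{corr}(z, x)} \cdot \frac{\sigma_u}{\sigma_x},
\qquad
\operatorname{plim} \hat{\beta}_{1,\mathrm{OLS}} = \beta_1 + \operatorname{corr}(x, u) \cdot \frac{\sigma_u}{\sigma_x}
\]
```

If corr(z, x) is small, even a small corr(z, u) is divided by a number near zero, so IV can end up more inconsistent than OLS.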
Conclusions
Together, instrumental variables and panel methods underlie a huge amount of progress in empirical economics in the last 20 years
“Taking the con out of econometrics”
The main messages:
Think hard about where bias will come from in your models
Combinations of OLS, IV, proxies, and fixed effects should tell a
coherent story about what your