Pooled Cross Section | Panel Data | Instrumental Variables

Chapter 14 and 15 Overview: Panel Data and Instrumental Variables
Jeff Borowitz
Georgia State University

Pooled Cross Sections
- Have data for multiple units (cross section)
- Have data at multiple periods of time (time series)
- Use all together: a pooled cross section

Pooled Cross Section: Example
- House prices in Boston before and after an incinerator is put near some houses
- Dummy: nearinc = 1 if house is near incinerator (once incinerator built)
- Real dollar price: rprice
- Model with incinerator: \( rprice = \beta_0 + \beta_1\, nearinc + u \)
- Result: \( \hat\beta_1 = -30688 \pm 5827 \)
- Problem: houses near the incinerator were generally in a worse part of town
- Model run on data from before the incinerator was rumored: \( \hat\beta_1 = -18824 \pm 4744 \)
- Difference-in-differences estimator: \( -30688 - (-18824) \approx -11863 \) (last digit reflects rounding of the two coefficients)

Difference-in-Differences Model
\[ y = \beta_0 + \delta_0\, d2 + \beta_1\, dT + \delta_2\, dT \cdot d2 + \dots + u \]
- d2: a dummy for the period after treatment has happened
- dT: a dummy for whether the unit is in the treatment group
- \( \delta_2 \) is the difference-in-differences estimator

|                | Control                  | Treatment                                     | Treatment − Control      |
|----------------|--------------------------|-----------------------------------------------|--------------------------|
| Before         | \( \beta_0 \)            | \( \beta_0 + \beta_1 \)                       | \( \beta_1 \)            |
| After          | \( \beta_0 + \delta_0 \) | \( \beta_0 + \delta_0 + \beta_1 + \delta_2 \) | \( \beta_1 + \delta_2 \) |
| After − Before | \( \delta_0 \)           | \( \delta_0 + \delta_2 \)                     | \( \delta_2 \)           |

Difference-in-Differences
[Figure: difference-in-differences illustration]

Natural Experiments
- Difference in differences is a powerful tool for natural experiments
- Often a policy affects some units only some of the time
  - Effect of state changes in maternity leave policies on maternal employment
  - Effect of the Massachusetts universal healthcare requirement on uninsurance rates
  - Mariel boatlift

Card and Krueger
- Minimum wage changed on April 1, 1992 from $4.25 to $5.05 in New Jersey but not Pennsylvania
- Card and Krueger surveyed fast food restaurants on either side of the border about how many workers they employed
- Regressed:
\[ empl = \beta_0 + \delta_0\, after + \beta_1\, NJ + \delta_2\, after \cdot NJ + X\gamma + u \]
- Found no effect of the minimum wage on employment

Clustering
- It turns out that Card and Krueger made an error
- OLS assumes that \( \mathrm{cov}(u_i, u_j) = 0 \) for \( i \neq j \) to get standard errors
- But two fast food restaurants nearby in NJ might have similar demand shocks, so \( \mathrm{cov}(u_i, u_j) > 0 \)
- Turns out this is a big problem! Bertrand, Duflo, and Mullainathan:
  - Take data on average state female wages from the real CPS
  - Apply a fake treatment to some states at different times (there is no real effect)
  - Find a statistically significant effect at the 5% level 45% of the time!
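The over-rejection described above can be illustrated with a small placebo simulation. This is a stylized sketch, not a replication of Bertrand, Duflo, and Mullainathan: it assumes a common shock shared by all units in a group (rather than serial correlation in state-level CPS data), and the function name, group sizes, and shock variances are all hypothetical choices made for illustration.

```python
import random
from statistics import fmean


def placebo_rejection_rate(n_clusters=20, n_per_cluster=20, n_reps=500,
                           shock_sd=1.0, seed=0):
    """Share of placebo samples where a naive OLS t-test rejects at 5%.

    The true treatment effect is zero in every sample.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        # Fake treatment: a random half of the clusters, with no real effect.
        treated_ids = set(rng.sample(range(n_clusters), n_clusters // 2))
        treated, control = [], []
        for g in range(n_clusters):
            shock = rng.gauss(0.0, shock_sd)  # shock shared within the cluster
            ys = [shock + rng.gauss(0.0, 1.0) for _ in range(n_per_cluster)]
            (treated if g in treated_ids else control).extend(ys)
        # OLS of y on a single dummy is a difference in group means.
        m1, m0 = fmean(treated), fmean(control)
        n1, n0 = len(treated), len(control)
        ssr = (sum((y - m1) ** 2 for y in treated)
               + sum((y - m0) ** 2 for y in control))
        s2 = ssr / (n1 + n0 - 2)
        # Conventional standard error, which assumes cov(u_i, u_j) = 0.
        se = (s2 * (1 / n1 + 1 / n0)) ** 0.5
        if abs((m1 - m0) / se) > 1.96:
            rejections += 1
    return rejections / n_reps


rate = placebo_rejection_rate()
print(f"naive 5% test rejects in {rate:.0%} of placebo samples")
```

Setting `shock_sd=0.0` removes the within-group correlation, in which case the rejection rate should fall back to roughly the nominal 5%.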
Solution: use "clustered" standard errors, available in Stata/R.

Panel Data
- Have multiple observations for the same unit over time
- Generally, we think important stuff is unobserved
- But what if it's unobserved and fixed?
\[ y_{it} = \beta_1 x_{it} + a_i + u_{it} \]
- Average all observations for an individual:
\[ \bar{y}_i = \beta_1 \bar{x}_i + a_i + \bar{u}_i \]
- Now subtract the average from each individual observation and \( a_i \) cancels out!
\[ y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + u_{it} - \bar{u}_i \]

Fixed Effects Estimation
- This is called the within estimator or fixed effects estimator:
\[ y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + u_{it} - \bar{u}_i \]
- We could extend to the situation with controls as:
\[ \dot{y}_{it} = \beta_1 \dot{x}_{it} + \beta_2 \dot{z}_{it} + \dots + \dot{u}_{it} \]
  where \( \dot{y}_{it} = y_{it} - \bar{y}_i \)

Fixed Effects: Discussion
- Fixed effects is the true workhorse model of modern applied econometric analysis! Why?
  - It can control for unobservables which are correlated with our independent variables
  - It makes no assumptions on their nature or form: just that they are fixed over time and enter linearly
  - It's easy to do in Stata/R

Fixed Effects: Drawbacks
- You can't estimate the effects of variables that don't vary within individuals over time (e.g. education)
- If there are in fact no unobserved fixed effects, you throw away a lot of your variation and get bigger standard errors
- Measurement error problems can be exacerbated

Random Effects
- Another way to think of the model:
\[ y_{it} = \beta_1 x_{it} + a_i + u_{it} \]
- Composite error term:
\[ v_{it} = a_i + u_{it} \]
- If \( a_i \) is uncorrelated with \( x \), fixed effects is throwing away variation.
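The within transformation from the fixed effects slides can be sketched on simulated data. This is a minimal illustration under assumed values: the data-generating process (true \( \beta_1 = 1 \), \( a_i \) entering \( x_{it} \) one-for-one) and the helper names `ols_slope`, `simulate_panel`, and `within_transform` are hypothetical and not taken from the slides.

```python
import random
from statistics import fmean


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    x_bar, y_bar = fmean(x), fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_panel(n_units=500, n_periods=5, beta1=1.0, seed=0):
    """Return {unit: [(x_it, y_it), ...]} where a_i is correlated with x_it."""
    rng = random.Random(seed)
    panel = {}
    for i in range(n_units):
        a_i = rng.gauss(0.0, 1.0)  # unobserved, fixed over time
        obs = []
        for _ in range(n_periods):
            x_it = a_i + rng.gauss(0.0, 1.0)
            y_it = beta1 * x_it + a_i + rng.gauss(0.0, 1.0)
            obs.append((x_it, y_it))
        panel[i] = obs
    return panel


def within_transform(panel):
    """Subtract each unit's time average from its observations."""
    x_dot, y_dot = [], []
    for obs in panel.values():
        x_bar_i = fmean([x for x, _ in obs])
        y_bar_i = fmean([y for _, y in obs])
        x_dot.extend(x - x_bar_i for x, _ in obs)
        y_dot.extend(y - y_bar_i for _, y in obs)
    return x_dot, y_dot


panel = simulate_panel()
x_all = [x for obs in panel.values() for x, _ in obs]
y_all = [y for obs in panel.values() for _, y in obs]

pooled = ols_slope(x_all, y_all)                     # ignores a_i
fixed_effects = ols_slope(*within_transform(panel))  # a_i differenced out
print(f"pooled OLS: {pooled:.2f}, fixed effects: {fixed_effects:.2f}")
```

Under these assumptions pooled OLS should overstate the slope, since \( a_i \) is positively correlated with \( x_{it} \), while the within estimate should sit close to the true value of 1.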
Random effects can be estimated using GLS: there is a simple data transformation.
Problem: \( a_i \) was probably not uncorrelated with \( x_{it} \). If it were, we wouldn't be worried about unobservables!

Example: log(wage) estimation

| Independent variable | Pooled OLS   | Random Effects | Fixed Effects |
|----------------------|--------------|----------------|---------------|
| educ                 | .091 (.005)  | .092 (.011)    | -             |
| black                | −.139 (.024) | −.139 (.048)   | -             |
| hispan               | .016 (.021)  | .022 (.043)    | -             |
| exper                | .067 (.014)  | .106 (.015)    | -             |
| married              | .108 (.016)  | .064 (.017)    | .047 (.018)   |
| union                | .182 (.017)  | .106 (.018)    | .080 (.019)   |

- Standard errors in parentheses
- Shrinking coefficients on married and union suggest \( \mathrm{cov}(x, a) > 0 \)

Panel Data and Other Structures
- Differencing out unobserved effects can be useful in other places
- For example, within-twin or within-family differencing
- E.g. Black, Devereux, and Salvanes
  - Use Norwegian registry data on birthweight/family
  - Fixed effects regressions for outcomes

Instrumental Variables
- A big problem that we have is that our variable of interest is correlated with unobservables
- A proxy would work, but only if it's a good one
- Instrumental variables are like the opposite of a proxy
- Consider:
\[ y = \beta_0 + \beta_1 x + u \]
  where we worry that \( \mathrm{cov}(x, u) \neq 0 \)
- An instrumental variable is a variable \( z \) where:
\[ \mathrm{cov}(z, u) = 0, \qquad \mathrm{cov}(z, x) \neq 0 \]
- These conditions are, respectively, exogeneity and relevance
- We can test relevance by looking at the "first stage", which is a regression of \( x \) on \( z \)

Instrumental Variables: Estimation
- Before, we would have estimated \( \hat\beta_1 \) with:
\[ \hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})} \]
- Now we do:
\[ \hat\beta_1 = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})} \]
- Standard errors are bigger: the variance is divided by the squared correlation of \( x \) and \( z \), \( \rho_{x,z}^2 \)
\[ \frac{\sigma^2}{n \sigma_x^2} \;\Rightarrow\; \frac{\sigma^2}{n \sigma_x^2 \rho_{x,z}^2} \]

Instrumental Variables: Example
- Consider the wage regression with ability omitted:
\[ \log(wage) = \beta_0 + \beta_1\, educ + \beta_2\, abil + e \]
- An instrument here needs:
  - To be uncorrelated with ability (and with other unobserved determinants of wages)
  - To be correlated with education
- A couple of reasonable choices:
  - Quarter of birth (Angrist and Krueger 1991)
  - Distance to nearest college (Card 1995)
- Further example: draft number as an instrument for the effect of veteran status on wages (Angrist 1990)

Instrumental Variables: Further Examples
- Solution to the measurement error problem
  - Assume you know \( x \) measures the true \( x^* \) with noise
  - This causes classical errors in variables: it biases the coefficient on \( x \) toward zero
  - But what if you have another measure of the true \( x^* \), \( z \)?
  - If its measurement error is uncorrelated with \( x - x^* \) (it has different measurement error), then it could be a valid instrument
- Twinsburg, OH twins festival (Ashenfelter and Krueger 1994)
  - Use one twin's report of the other twin's education as an instrument
  - Find not a huge amount of ability bias in returns to schooling
  - Find schooling is often measured with some error
- European mortality rates as an instrument for institution quality (Acemoglu, Johnson, and Robinson 2001)
  - Want to explain why equatorial/African countries have lower GDP
  - European colonists set up different institutions depending on their mortality rate
  - Using mortality as an instrument for institution quality, there is no difference in income between African/equatorial countries and the rest

Problems with Instrumental Variables
- We can't ever check the correlation between \( z \) and \( u \)
- Low correlation between \( z \) and \( x \) (a weak instrument) can cause bias
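The OLS and IV formulas from the estimation slide can be compared on simulated data. This is a sketch under assumed values, not an analysis of any of the studies cited: the data-generating process (true \( \beta_1 = 0.5 \), an omitted "ability" term that raises both \( x \) and \( y \), an instrument \( z \) that is independent of ability) and the function names are hypothetical.

```python
import random
from statistics import fmean


def cov(a, b):
    """Sample covariance (dividing by n; the scaling cancels in the ratios)."""
    a_bar, b_bar = fmean(a), fmean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)


def simulate(n=5000, beta1=0.5, first_stage=1.0, seed=0):
    """Draw (z, x, y) where x is endogenous because ability is omitted."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]     # instrument
    abil = [rng.gauss(0.0, 1.0) for _ in range(n)]  # unobserved
    x = [first_stage * zi + ai + rng.gauss(0.0, 1.0)
         for zi, ai in zip(z, abil)]
    y = [beta1 * xi + ai + rng.gauss(0.0, 1.0)
         for xi, ai in zip(x, abil)]
    return z, x, y


z, x, y = simulate()

ols_estimate = cov(x, y) / cov(x, x)      # biased: cov(x, u) != 0
iv_estimate = cov(z, y) / cov(z, x)       # uses only variation in x driven by z
first_stage_slope = cov(z, x) / cov(z, z)  # relevance check: regress x on z

print(f"OLS: {ols_estimate:.2f}, IV: {iv_estimate:.2f}, "
      f"first stage: {first_stage_slope:.2f}")
```

Lowering the assumed `first_stage` parameter toward zero weakens the instrument, which shrinks the denominator of the IV ratio and makes the estimate much less stable.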
Conclusions
- Together, instrumental variables and panel methods underlie a huge amount of progress in empirical economics in the last 20 years
  - "Taking the con out of econometrics"
- The main messages:
  - Think hard about where bias will come from in your models
  - Combinations of OLS, IV, proxies, and fixed effects should tell a coherent story about what your [...]
© Copyright 2024