The Causal Effects of Childcare and Parental Investment: Short and Long-Term Evidence

Jorge Luis García∗

April 14, 2015

Abstract

I jointly estimate the causal effects of childcare and parental investment on the production of cognitive skills, using experimental data from a large-scale, randomized, high-quality early education program in the US. To do so, I rely on two sources of variation: random treatment assignment at birth and the interaction of treatment assignment and being a twin. The first provides access to childcare. The second reduces parental investment when parents have twins instead of a singleton, arguably due to additional resource constraints. This allows me to decompose the total treatment effect of the program on cognitive skills, as measured by an IQ test at age 3: 73% corresponds to childcare, 47% to parental investment, and -20% to a residual. My evidence on skill production and parental choices suggests that childcare and parental investment are substitutes; both have long-term causal effects on schooling and behavior outcomes.

∗ The University of Chicago, Department of Economics. e-mail: [email protected].

Attributes acquired before age 18 determine around 50% of the variability in outcomes such as lifetime earnings (Keane and Wolpin, 1997; Cunha et al., 2005; Huggett et al., 2011). This variability is a fundamental component of life-cycle and cross-sectional inequality. High-quality center-based childcare is a means to narrow gaps in these outcomes and, therefore, promote social mobility. To this end, the current federal administration spends 14.5 billion dollars on early education programs and initiatives, mainly Head Start, Early Head Start, the Child Care and Development Fund, and the Individuals with Disabilities Education Act (IDEA) (see Elango et al., 2015). Yet the US lags behind other nations, ranking 28th among OECD countries in early educational program enrollment (OECD, 2012).
Moreover, the European Union had as a policy goal “to provide childcare by 2010 to at least 90% of children between three years old and the mandatory school age” (see Jaumotte, 2003).

The evidence on the positive effects of early education both on early-life skills and later-life outcomes is vast (see Elango et al. (2015) for an extensive, recent survey). Much of this evidence is built on evaluations of different high-quality early education programs which have center-based childcare as their main component. However, the joint study of center-based childcare and parental investment is absent in the literature. This is surprising because successful early education programs have strong parenting components.1

In this paper, I (i) investigate the causal effects and relative importance of center-based childcare and parental investment on child skill development; (ii) find that high-quality childcare and parental investment are (Edgeworth) production substitutes and that parents use these inputs as substitutes; (iii) link findings on early-life childcare and parental investment experiences to several later-life schooling and behavior outcomes at age 18: school absenteeism, therapy and counseling attendance, need for tutoring in reading and math, taking of the SAT, teen parenthood, tobacco use, and idleness.

1 Moreover, (i) a vast literature documents the importance of parenting and scaffolding on child skill development (see Heckman and Mosso (2014) for a theoretical analysis and a survey of the literature); (ii) parental responses mediate a considerable proportion of the treatment effects of successful early education programs (see García and Heckman, 2015); and (iii) parental investment is the main driver of social mobility in economic models of human capital transmission (see Becker and Tomes, 1979, 1986). Thus, a comprehensive approach to remediation and pre-distribution needs to consider both center-based childcare and parental investment jointly.
I start by writing a simple parental decision model to argue that it is reasonable to consider two basic inputs in the production of child skills. Analogous to Becker (1965), different combinations of market goods and time allocations produce the basic inputs. As expected, this eases identification to a great extent: two exclusion restrictions suffice to identify the effects of center-based childcare and parental investment on child skills through the estimation of a simple production function. Moreover, when I decompose the effect of high-quality early education, I find that the residual is minor and not statistically significant once I account for the effects of these two inputs.

My identification relies on experimental and quasi-experimental variation coming from a large-scale, nation-wide, randomized early education program, the Infant Health and Development Program (IHDP). The program began in 1985, when all of its participants were born, and took place in eight different states in the US. Prematurely-born, low-weight children (≤ 37 weeks, ≤ 2,500 grams) were randomly assigned to receive treatment over a period of three years, from birth to age 3.2 The treatment consisted of (i) frequent home visits from ages 0 to 3; (ii) center-based childcare from ages 1 to 3; and (iii) parenting group sessions. Both the control and treatment groups received frequent medical, developmental, and social assessments during the treatment period. Hence, I have rich longitudinal sets of measurements on parental and child investment and skill dimensions.

My baseline identification strategy requires two instruments. The first I use is random assignment to treatment in IHDP. On average, children in the treatment group attended 27 (s.d. 13.2) weekly hours of center-based care from ages 1 to 3, while the control group attended 5 (s.d. 10.8). My second instrument is the interaction between treatment assignment and “being a twin”.
I argue that it operates through a reduction in parental investment when parents have twins instead of a single child. Importantly, twins were assigned either to treatment or control as a pair. The instrument satisfies the exclusion restriction because assignment to treatment was at the moment of birth. Thus, conditional on treatment status, “being a twin” was random.

2 Some potential consequences of low birth-weight are detailed in Appendix E.

The interaction instrument has two appealing features: (i) a transparent economic interpretation and (ii) a low raw correlation with the other instrument, random assignment to treatment (0.37). With respect to the first feature, “being a twin” implies that parents face additional resource constraints. Conditional on treatment assignment and “being a twin”, the interaction of these two variables, which I argue is a source of exogenous variation, accounts for a decrease of 2/5 of a standard deviation in my preferred measure of parental investment. The second feature is fundamental to deal with two endogenous variables using two instruments. Roughly speaking, a rigorous identification strategy requires not only two instruments but also that these instruments act independently on each of the endogenous variables. In this case, random assignment to treatment has very powerful rank on center-based childcare and powerful rank on parental investment. The interaction of treatment and “being a twin” has moderate rank on center-based childcare and powerful rank on parental investment. Moreover, the instruments appear to operate through different economic mechanisms and have low correlation in my data. This is the reason why even conservative tests support my identification strategy (see Cragg and Donald, 1993; Kleibergen and Paap, 2006).
To the best of my knowledge, my paper is the first to use this source of variation either when studying child skill development or in the program evaluation literature.3

My baseline estimation is straightforward. I estimate a linear production function of cognitive skills at age 3, as measured by the Stanford-Binet IQ test. I consider two inputs and measure them as cumulative from ages 0 to 3: (i) average weekly hours spent in center-based childcare in each year; and (ii) average parental investment, as measured by a factor summarizing extensive questionnaires assessing time and material investment in children. In my preferred estimations, I control for a wide set of child, mother, father, and household characteristics, as well as state fixed effects.

3 This paper is actually the first to make use of the full twins sample when assessing IHDP. All prior studies make use of the “primary analysis group”. The primary analysis group neglects one of the twins in each twin pair. At no point during the treatment was the staff aware of which twin in each twin pair was designated to the primary analysis group (see Gross et al., 1997, Chapter 5). In my data, every time I observe information for one twin I also observe the same information for his or her sibling, and “being a twin” does not affect item non-response.

Using my baseline estimates, I assess the relative importance of each of my inputs of interest. To do that, I decompose the total treatment effect of IHDP on cognitive skills by totally differentiating the production function. I identify and estimate the components due to (i) center-based childcare, (ii) parental investment, and (iii) a residual due to other observed and unobserved characteristics. This decomposition is a function of the so-called first and second stage coefficients of my baseline estimation. The total treatment effect is 8.6 points, in a distribution with mean 100 and standard deviation 15.
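As a rough illustration of the baseline estimation just described, the following sketch runs two-stage least squares with two endogenous inputs and two instruments on simulated data. Everything in it is an assumption made for illustration: the data-generating process, the coefficient values, and the function names `ols` and `two_sls` are hypothetical and are not taken from IHDP or from the estimation code behind this paper. Twins are simulated as independent observations rather than as pairs, and only a constant and the twin indicator serve as controls.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients from regressing y (1-D or 2-D) on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_sls(y, endog, instruments, controls):
    """Two-stage least squares with several endogenous regressors.

    Returns (second_stage, first_stage). Rows of first_stage follow the
    column order [instruments, controls]; its columns follow endog.
    """
    Z = np.column_stack([instruments, controls])
    first_stage = ols(endog, Z)      # one first stage per input
    endog_hat = Z @ first_stage      # projections of the inputs
    X_hat = np.column_stack([endog_hat, controls])
    return ols(y, X_hat), first_stage


# Simulated data; all numbers below are hypothetical.
rng = np.random.default_rng(0)
n = 200_000
treat = rng.binomial(1, 0.4, n).astype(float)
twin = rng.binomial(1, 0.2, n).astype(float)
u = rng.normal(size=n)  # unobserved trait that makes both inputs endogenous

childcare = 5 + 22 * treat + 1.0 * treat * twin + 2 * u + rng.normal(size=n)
parental = (0.3 * treat - 0.5 * treat * twin - 0.1 * twin
            + 0.5 * u + rng.normal(size=n))
iq = 85 + 0.25 * childcare + 6.0 * parental + 2 * u + rng.normal(size=n)

controls = np.column_stack([np.ones(n), twin])
instruments = np.column_stack([treat, treat * twin])
endog = np.column_stack([childcare, parental])

beta, first_stage = two_sls(iq, endog, instruments, controls)
beta_c, beta_p = beta[0], beta[1]

# Part of the treatment effect (for singletons) running through each input:
# second-stage coefficient times the first-stage coefficient on `treat`.
via_childcare = beta_c * first_stage[0, 0]
via_parental = beta_p * first_stage[0, 1]
print(f"beta_c={beta_c:.3f}, beta_p={beta_p:.3f}")
print(f"via childcare={via_childcare:.2f}, via parental investment={via_parental:.2f}")
```

In this simulated setting, multiplying a second-stage coefficient by the corresponding first-stage coefficient on treatment gives the part of the treatment effect that runs through that input, which is one way to read the statement that the decomposition is a function of first and second stage coefficients.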
The decomposition indicates that 6.3 (73%) points correspond to the total hours spent in center-based childcare, 4.0 (47%) to parental investment, and −1.7 (−20%) to other observed and unobserved characteristics.

My baseline estimation does not allow for input complementarity or substitutability. An additional structural assumption allows me to account for this. I let the interaction of center-based childcare and parental investment enter linearly into the production function. To account for input endogeneity, I obtain the so-called projections by estimating the first stages for each input. Then, I multiply them together to obtain the projection for the interaction of the two inputs. To estimate the effects of the two inputs and their interaction, I estimate a linear regression of the Stanford-Binet IQ test at age 3 on the three projections and a rich set of controls. This helps me find that center-based childcare and parental investment are (Edgeworth) substitutes. The decomposition results change as follows. Of the total treatment effect of high-quality early education (8.6 points), 6.3 (73%) corresponds to center-based childcare, 3.0 (35%) to parental investment, and −0.7 (−8%) to a residual.

To link my early-life findings to later-life outcomes, I use age 18 measures on schooling and behavior; adult follow-ups are not available yet. Estimating a production function of these outcomes with center-based childcare and parental investment is beyond the scope of this paper. Thus, I directly estimate the so-called “reduced-form” regression of the various age 18 outcomes on my two instruments. The two instruments have clear and significant relationships with my inputs of interest: treatment with center-based childcare and the interaction of treatment and “being a twin” with parental investment.
Thus, I interpret the coefficients on these two instruments in the “reduced-form” regression as the long-term link of the inputs of interest with later-life outcomes. The estimates show a very clear pattern across all the age 18 outcomes: (i) the coefficients on center-based childcare and parental investment always go in opposite directions; (ii) for outcomes such as smoking or teen parenthood the coefficients on treatment are negative and the coefficients on the interaction are positive. For outcomes such as idleness or “in therapy” the opposite is true. The coefficients associated with parental investment are larger and more precisely estimated.

The findings on later-life outcomes are consistent with the early-life findings. In fact, they confirm the economic rationale of my two instruments. The first operates through boosting the use of center-based childcare since it is free of cost. Given the high quality of the program, the effects are positive. The second instrument operates through a resource constraint on the parents, which makes them invest less in children. This has negative long-term consequences, as expected. Thus, this evidence shows how both early-life investment through center-based childcare and parenting have long-term tangible consequences on relevant outcomes.

This paper relates to different instances of the literature studying early skill development.4 It is consistent with theoretical and empirical evidence establishing the importance of early childhood environments on child skill formation and development (see Karoly et al., 1998, 2006; Knudsen et al., 2006; Heckman, 2008; Currie and Almond, 2011). Specifically, it relates to recent studies on the effects of childcare on short and long-term outcomes (see Gormley Jr et al., 2005; Loeb et al., 2007; Magnuson et al., 2007; Baker et al., 2008; Havnes and Mogstad, 2011; Berlinski et al., 2009). More importantly, it relates to papers aiming not only to estimate treatment effects of high-quality early education programs but also to understand the mechanisms through which these operate.

4 Importantly, I take the idea of estimating a production function depending on multiple inputs and using the estimates to decompose the total treatment effects of an early education program from García et al. (2015).

Papers typically use one of two main approaches to analyze the mechanisms through which the effects of childcare operate. Heckman et al. (2013); García and Heckman (2015) decompose treatment effects of high-quality early education programs through variations of Laspeyres decompositions.5 Brilli (2013); Bernal (2008); Del Boca et al. (2014) estimate dynamic structural models considering parental time allocation and investment decisions. My paper lies between these two approaches. I base my understanding of mechanisms on a decomposition, as in Heckman et al. (2013); García and Heckman (2015), but I am able to assess causality in a clearer way. I base my decomposition on an economic model, which relates my work to Brilli (2013); Bernal (2008); Del Boca et al. (2014). Admittedly, my approach simplifies relevant components such as entry-exit parental labor supply decisions or parental uncertainty about the child’s skills. However, the sources of my identification are more transparent.

My work also relates to different approaches intending to identify the production function of child skills. They span from the estimation of simple linear production functions of cognitive skills as in Bernal and Keane (2011) to complex estimation of dynamic production functions of multiple skills as in Cunha and Heckman (2008); Cunha et al. (2010). My identification strategy is much more similar to that of Bernal and Keane (2011). I differ from them because the quality of my data enables me to measure parental investment and estimate its effect on skill development.
They have to make a set of assumptions on parental time constraints and the demand for market goods devoted to child development due to data constraints and cannot assess the effect of parental investment. We both estimate the effect of childcare on skill development. They find a negative effect of non-center-based childcare and no effect of center-based childcare. They do not explore complementarity or substitutability of the inputs in skill production, nor do they intend to study the economic mechanisms driving their results or link their estimates to long-term outcomes.

5 Heckman et al. (2013) use a statistical decomposition to analyze how early-life skills mediate later-life outcomes. García and Heckman (2015) use a similar methodology and find that parental responses together with early-life skills mediate later-life outcomes.

Another vast set of papers does intend to estimate substitution or complementarity patterns in skill production. They use a variety of inputs and estimation techniques, and it is hard to make a clear comparison with my findings (see Todd and Wolpin (2003); Cunha et al. (2006); Cunha and Heckman (2008) for theoretical and empirical discussions and surveys). However, to the best of my knowledge, my paper is the first to provide causal, transparent evidence on substitutability between center-based childcare and parental investment.

The rest of the paper proceeds as follows. In Section 1, I model the parental investment problem and explain how it eases my identification strategy. Section 2 describes the data I use, and Section 3 delineates my identification strategy. In Section 4, I present and discuss the results, and in Section 5, I give some final comments.

1 Production Inputs and Child Skills

I present a simple parental choice and child skill development model. This helps me argue why it is reasonable to model the production of skills as dependent on two inputs. I consider a single utility-maximizing parent with a single child.
Her utility depends both on her consumption and on her child’s unidimensional skill. It is simple to extend the model to add, for example, the cases of two parents and multiple children. That is not the objective in this section. The objective is to understand how it is possible to think of a reduced set of inputs that summarize multiple parental decisions.

The idea that it is possible to produce goods through different time and market goods allocations is not new in economics (see Becker, 1965). I extend that idea and develop a model in which skill production depends on two basic inputs. The first is the service the child receives by being taken to center-based childcare. The second is direct parental investment. Both consist of combinations of market goods and parental time allocation. The first requires the parent to spend time to take the child to center-based childcare and pay for the explicit service, which she buys in the market. The second is a composite of parent-child activities, which requires time and market goods (e.g., books and puzzles).

These two basic inputs are what build child skill. Given that they are generally produced through various market goods and time allocations, it is hard to think of additional inputs in the skill production. In Section 3, I present an indirect test considering as few as two inputs, and in Section 4, I present the results on this indirect test. They indicate that with the two inputs, I account for most of the variation in the production of cognitive skills.6

I denote the two inputs by $G_c$ and $G_p$ and write them as

\[
G_j = f_j(x_j, t_j) \tag{1}
\]

for $j = c, p$, where $x_j$ is a vector of material goods and $t_j$ is a vector of time allocations. $f_j$ maps $x_j$ and $t_j$ into the scalar input $G_j$. Becker (1965) argues that basic goods, not purchased or sold in the marketplace but instead produced by consumers using both market purchased goods and time, are the arguments of the utility function.
Analogously, I argue that basic inputs produce child skill, which I write as

\[
A = g(G_c, G_p) \tag{2}
\]

where $g(\cdot, \cdot)$ is strictly concave in both of its arguments.

6 A third basic input may be the time children spend with family or acquaintances. It is easy to tackle this by arguing that the two inputs are measured relative to the time children spend with their relatives.

The parent maximizes utility, which depends on her child’s skill and consumption, $\omega$. Like the basic inputs, consumption need not be purchased or sold in the marketplace and, therefore, may be produced by the combination of market goods and different time allocations, $x_\omega, t_\omega$. Thus, parents maximize

\[
U(\omega, g(G_c, G_p)) \tag{3}
\]

subject to a budget constraint

\[
h(\omega, G_c, G_p) = H \tag{4}
\]

where $h(\cdot)$ is an expenditure function and $H$ represents total household resources. The material goods and time constraints are

\[
\begin{aligned}
\sum_{j \in \{c,p\}} p_j x_j + p_\omega x_\omega &= T_s s \\
\sum_{j \in \{c,p\}} t_j + t_\omega &= T - T_s
\end{aligned} \tag{5}
\]

where $p_j, p_\omega$ are market prices, $T_s$ is parental time spent at work, $s$ is the wage, and $T$ is the total time of the parent. To ease interpretation, for adequate scalars $a_j, b_j, a_\omega, b_\omega$, it is possible to write $t_j, x_j, t_\omega, x_\omega$ as

\[
\begin{aligned}
t_j &\equiv a_j G_j, & t_\omega &\equiv a_\omega \omega \\
x_j &\equiv b_j G_j, & x_\omega &\equiv b_\omega \omega
\end{aligned} \tag{6}
\]

where the scalars are the units of either material goods or time per unit of $G_j$, $\omega$ and represent the technological restriction in (1). Instead of maximizing (3) subject to (5) and (6) by choosing $t_j, x_j, t_\omega, x_\omega$, it is straightforward to restate the problem as

\[
\max_{\omega, G_c, G_p} U(\omega, g(G_c, G_p)) \tag{7}
\]

subject to

\[
\pi_c G_c + \pi_p G_p + \pi_\omega \omega = T_s s \tag{8}
\]

where $\pi_c, \pi_p, \pi_\omega$ are the full prices of $G_j$ and $\omega$ and are given by $p_j b_j + a_j t_j$ and $p_\omega b_\omega + a_\omega t_\omega$, respectively.

In the context of skill production, thinking of basic inputs is appealing from both theoretical and applied perspectives. On one hand, there are few cases in which it is intuitive to think of concrete material goods or time allocations as being the direct inputs of the production function in (2).
It is more intuitive to think of inputs as being mixtures of time and material goods. For example, a basic input is a composite of activities in which the child and the mother spend time together reading a book, playing, etc. Put differently, it is more intuitive to think of the input as the activities and not as books or the time by themselves. On the other hand, it is easier to find measures of mixtures of time and material goods than disaggregated measures of material goods and time. Empirically, this eases identification because endogeneity makes it difficult to tackle the presence of multiple inputs.

To illustrate how this framework helps, suppose identification exploits exogenous variation in the price of a basic input, $\pi_j$. This may either be a price shock to $p_j$ or a price shock to $s$, since $x_j$ and $t_j$ are endogenous. Consider a shock to $p_j$. This shock automatically shifts the price of the basic input, $\pi_j$, and it induces a behavioral response on the optimal decision of each of the two basic inputs and parental consumption. By the technological restrictions in (6), these responses imply changes in the optimal levels of material goods and time allocations, $x_j, t_j, x_\omega, t_\omega$. Importantly, the structure of the model is such that changes in the basic inputs imply changes in material goods and time allocations. These changes are embedded in the model. Thus, identifying the effect of the basic input $G_j$ on child skill does not require separately identifying the effects of $x_j, t_j, x_\omega, t_\omega$, which would require more sources of exogenous variation.

This model makes it possible to decompose the effects exogenous price shocks have on child skill. To derive this, I assume that (3) is strictly concave in both of its arguments. The solution to the parental problem is the following system of continuous and differentiable Marshallian demands:

\[
\begin{aligned}
G_c &= G_c(\pi_c, \pi_p, \pi_\omega, s, T) \\
G_p &= G_p(\pi_c, \pi_p, \pi_\omega, s, T) \\
\omega &= \omega(\pi_c, \pi_p, \pi_\omega, s, T).
\end{aligned} \tag{9}
\]
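To make the demand system in (9) concrete, the sketch below solves the restated parental problem in closed form under functional forms that are assumptions of this example and not part of the model above: log utility, $U = \ln \omega + \theta \ln A$, and Cobb-Douglas skill production, $A = G_c^{\alpha} G_p^{\beta}$ with $\alpha + \beta < 1$. Income is treated as a fixed number rather than as $T_s s$, and all parameter values and function names are hypothetical. The sketch then checks numerically that the effect of the childcare price on skill equals the sum of the chain-rule terms through each input.

```python
THETA, ALPHA, BETA = 1.0, 0.4, 0.3  # hypothetical preference and technology parameters


def demands(pi_c, pi_p, pi_w, income):
    """Closed-form demands (G_c, G_p, omega) for
    U = ln(omega) + THETA * ln(A), with A = G_c**ALPHA * G_p**BETA.
    """
    denom = 1.0 + THETA * (ALPHA + BETA)
    g_c = THETA * ALPHA * income / (denom * pi_c)
    g_p = THETA * BETA * income / (denom * pi_p)
    omega = income / (denom * pi_w)
    return g_c, g_p, omega


def skill(g_c, g_p):
    """Cobb-Douglas skill production, strictly concave when ALPHA + BETA < 1."""
    return g_c**ALPHA * g_p**BETA


def skill_at_prices(pi_c, pi_p=1.0, pi_w=1.0, income=100.0):
    """Skill level reached when the parent chooses inputs optimally."""
    g_c, g_p, _ = demands(pi_c, pi_p, pi_w, income)
    return skill(g_c, g_p)


pi_c, pi_p, pi_w, income = 2.0, 1.0, 1.0, 100.0
g_c, g_p, omega = demands(pi_c, pi_p, pi_w, income)
spending = pi_c * g_c + pi_p * g_p + pi_w * omega  # budget constraint check

# Total effect of the childcare price on skill, by central finite difference.
h = 1e-5
total_effect = (skill_at_prices(pi_c + h) - skill_at_prices(pi_c - h)) / (2 * h)

# Chain-rule pieces: (dA/dG_j) * (dG_j/dpi_c) for each input.
a = skill(g_c, g_p)
through_childcare = (ALPHA * a / g_c) * (-g_c / pi_c)
through_parental = (BETA * a / g_p) * 0.0  # no cross-price effect under these forms

print(total_effect, through_childcare + through_parental)
```

Under these particular functional forms the demand for $G_p$ does not respond to $\pi_c$, so the whole price effect runs through $G_c$; a richer specification (for example, a CES production function) would be needed to generate a nonzero term through $G_p$.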
An exogenous shock is, for example, free provision of childcare. This decreases $\pi_c$. Importantly, both production inputs are functions of $\pi_c$. Thus, the total effect of the price change on child skill is the following:

\[
\underbrace{\frac{dA}{d\pi_c}}_{\text{Total Price Effect}}
= \underbrace{\frac{\partial A}{\partial G_c} \frac{dG_c}{d\pi_c}}_{\text{Effect through } G_c}
+ \underbrace{\frac{\partial A}{\partial G_p} \frac{dG_p}{d\pi_c}}_{\text{Effect through } G_p}.
\tag{10}
\]

The decomposition does not make any assumption about complementarity or substitutability of $G_c$ and $G_p$ in skill production. In the case in which the inputs are neither complements nor substitutes, the level of each of the components of (10) is not reinforced or lessened by a production scale effect. The price change only induces the “standard” price and income effects. If the inputs are complements or substitutes, there is a scale effect. I allow for this in an extension of my baseline estimation and find that allowing for complementarity or substitutability is relevant.

I use this decomposition to understand the mechanisms driving the total effect of free provision of early education. This enables me to quantify the relative importance of center-based childcare and parental investment in child skill development. In practice, it is impossible to perfectly fit (2) using two inputs. Thus, there is a difference between the left and the right-hand sides of (10). I ascribe this difference to other observed and unobserved characteristics.

Before discussing how I identify and estimate the elements of the decomposition, I describe my data in the following section.

2 Data

Only the Head Start Impact Study (HSIS), a randomized controlled trial designed to evaluate Head Start, has a larger scale than the Infant Health and Development Program (IHDP). I use the data from IHDP.7 The initial sample was 1,090 individuals, including 105 twin pairs. The program targeted prematurely-born, low birth-weight individuals (≤ 37 weeks, ≤ 2,500 grams).
It began in 1985 and took place at eight different sites in eight different states in the US: Arkansas, Connecticut, Massachusetts, Florida, Pennsylvania, New York, Texas, and Washington (see Figure 1). The sites were facilities of major university campuses, which had hospitals in which the children were born.

7 According to the director of the office administering the program, “IHDP is the first multisite Randomized Controlled Trial designed to evaluate the efficiency of combining early childhood development and family support services [...]” (Gross et al., 1997). This design perfectly suits the needs of my research question. IHDP was an early education program combining center-based childcare and home visiting, which directly relates to parental investment. Thus, it intensively assessed and measured my two inputs of interest, center-based childcare and parental investment.

Figure 1: IHDP Sites Location

Note: this map highlights the states in which there was an IHDP site: Washington (The University of Washington School of Medicine, Seattle), Texas (University of Texas Health Sciences Center at Dallas), Arkansas (University of Arkansas for Medical Sciences, Little Rock), Florida (University of Miami School of Medicine, Miami), Pennsylvania (University of Pennsylvania School of Medicine, Philadelphia), Connecticut (Yale University School of Medicine, New Haven), New York (Albert Einstein School of Medicine, New York), and Massachusetts (Harvard Medical School, Boston).

2.1 Experimental Design

Across the eight sites, 4,551 mothers gave birth to either low-weight single children or twins (≤ 2,500 grams) between January 7, 1985 and October 9, 1985. The children of 3,249 mothers were excluded due to (i) residence (more than approximately 45 minutes’ driving distance); (ii) gestational age (> 37 weeks); or (iii) immediate hospital discharge. Additionally, the children of 61 mothers were excluded due to various severe illnesses or neurological deficits.
The sample left consisted of 1,302 mothers. Of these, 274 refused to participate before being randomized. Of the remaining 1,028, 43 withdrew after the randomization. 985 mothers remained; 105 were mothers of twins. Children were stratified by site and by birth-weight group (“low-low birth-weight” and “high-low birth-weight”) and randomized to either treatment or control status. The per-stratum probability of being in the treatment group was 1/3. The randomization was at the family level. Therefore, twins were jointly assigned to treatment or control as a pair. However, children from higher-order multiple births were not eligible, and twin pairs in which one infant was ineligible disqualified both twins from participation. In total, 420 individuals were assigned as treated and 670 as controls. Table 1 details the distribution of treatment assignment, and further information on randomization is in Appendix B.

Table 1: Treatment Assignment Distribution by State and Sibling Status, IHDP

                     Singletons              Twins                  All
State              Control Treatment    Control Treatment    Control Treatment
Arkansas                68        43         24        10         92        53
Connecticut             57        42         18         8         75        50
Florida                 52        38          8        12         60        50
Massachusetts           85        41         16         8        101        49
New York                84        40         16        12        100        52
Pennsylvania            46        43         14        10         60        53
Texas                   80        43         16        12         96        55
Washington              74        44         12        14         86        58
All                    546       334        124        86        670       420

Note: this table shows the distribution of treatment assignment by state and sibling status in the Infant Health and Development Program. “Singletons” refers to the count of single births and “Twins” to the count of siblings born as a pair from the same pregnancy. By design, there were no triplets or greater order multiple births in the program. The program began in 1985, when all of its participants were born, and took place in eight different states in the US. Prematurely-born, low-weight children (≤ 37 weeks, ≤ 2,500 grams) were assigned to receive treatment over a period of three years, from birth to age 3. Twins were assigned to treatment or control jointly.
Thus, they belonged to either the treatment or the control group as a pair.

The treatment consisted of (i) frequent home visits from ages 0 to 3; (ii) center-based childcare from ages 1 to 3; and (iii) parenting group sessions from (children) ages 1 to 3. Unlike programs like the Carolina ABC Project (ABC), the Perry Preschool Program (Perry), or Head Start, IHDP had no eligibility requirements based on socio-economic status. Yet, the fact that families self-selected their children to be born in those locations and that the infants were born prematurely with low birth-weight makes the sample particular. Table 2 characterizes household-level and individual characteristics. It offers some statistics from the Panel Study of Income Dynamics (PSID) to provide a notion of comparability between the IHDP sample and nation-wide characteristics of the population. IHDP individuals appear more disadvantaged. For example, welfare take-up is much greater and women give birth at younger ages in the IHDP sample, relative to the nationally representative cross-section of individuals born in 1985. The reason may be a strong correlation between adverse birth-weight conditions and socio-economic status.
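The assignment rules described in this subsection (stratification by site and birth-weight group, a per-stratum treatment share of 1/3, and randomization at the family level so that twins share a status) can be sketched as follows. This is a minimal illustration under assumptions: the function `assign_families`, the dictionary keys, and the sample of families are all hypothetical, and a fixed share of each stratum is treated by shuffling, which need not match the procedure actually used in IHDP (documented in Appendix B).

```python
import random
from collections import defaultdict


def assign_families(families, treat_share=1 / 3, seed=0):
    """Assign whole families to treatment within (site, birth-weight group) strata.

    `families` is a list of dicts with the hypothetical keys
    'family_id', 'site', 'bw_group', and 'children'.
    Returns {family_id: 'treatment' or 'control'}.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for fam in families:
        strata[(fam["site"], fam["bw_group"])].append(fam["family_id"])

    assignment = {}
    for stratum in sorted(strata):
        ids = strata[stratum][:]
        rng.shuffle(ids)
        n_treated = round(len(ids) * treat_share)
        for position, family_id in enumerate(ids):
            assignment[family_id] = "treatment" if position < n_treated else "control"
    return assignment


# Hypothetical sample: 2 sites x 2 birth-weight groups x 30 families,
# with every fifth family having twins.
families = []
for site in ("site_A", "site_B"):
    for bw_group in ("low-low", "high-low"):
        for k in range(30):
            fid = f"{site}-{bw_group}-{k}"
            children = [f"{fid}-1", f"{fid}-2"] if k % 5 == 0 else [f"{fid}-1"]
            families.append({"family_id": fid, "site": site,
                             "bw_group": bw_group, "children": children})

assignment = assign_families(families)

# Children inherit their family's status, so twins always share it.
child_status = {child: assignment[fam["family_id"]]
                for fam in families for child in fam["children"]}
```

Because the unit of randomization is the family, the twin indicator cannot differ in treatment status within a pair, which is the property the interaction instrument relies on.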
16 17 2.7273 0.5002 33.1104 0.5136 0.5136 25.1232 0.3584 0.4992 0.5136 0.2864 0.2096 0.9376 0.0176 1.1408 1.0192 93.8000 53.1824 6666.6381 Mother Black Age Works in Birth-year Married in Birth-year Father Black Household Welfare in Birth-year Moved in Birth-year Siblings at Birth Older than 63 Dependents at Birth Adult Dependents at Birth Employed Adults at Birth Economy Employment % Median Income in 1000s (2014 USD) Government Expenditure per Capita 93.8110 52.9575 6559.4673 0.3838 0.2872 0.9034 0.0209 1.1775 1.0496 0.5614 0.5561 24.6475 0.2950 0.4386 0.2089 0.4883 1817.3051 N/A 33.0183 0.5561 1.6710 8.4621 1475.5385 0.4869 0.5976 1.0327 0.1432 1.0584 0.7624 0.4969 0.4975 5.7358 0.4567 0.4969 2.4574 0.4975 0.4070 0.5005 438.1410 0.0109 -0.2249 -107.1707 0.0974 0.0776 -0.0342 0.0033 0.0367 0.0304 0.0478 0.0425 -0.4757 -0.0634 -0.0606 -0.0921 0.0425 0.0185 0.0019 16.0612 0.9615 0.8917 0.8578 0.2385 0.2275 0.8789 0.7917 0.6743 0.6321 0.2771 0.2927 0.8514 0.8514 0.8550 0.8706 0.2725 0.6459 0.9780 0.6514 1.3801 0.1669 5.0859 0.4985 0.3912 2.0048 8.3652 2716.2209 N/A N/A N/A N/A N/A N/A 93.3971 50.5623 6715.5542 2.0443 0.0287 26.2701 0.5404 0.8115 0.0189 N/A 0.5696 0.4952 3495.0749 620.3717 0.0328 0.1781 N/A 0.1994 0.3997 US Population born in 1985 Note 1 (general): this tables provides characteristics of the individuals in IHDP and the US population born in 1985. The program began in 1985, when all of its participants were born, and took place in eight different states in the US. Prematurely-born, low-weight children (≤ 37 weeks, ≤ 2500 grams) were assigned to receive treatment over a period of three years, from birth to age 3. Twins were assigned to treatment or control jointly. Thus, they either belonged to the treatment or control groups as a couple. There were no triplets or greater order multiple births from a single pregnancy. Note 2 (variable definitions): all variables are in the year the child was born. 
Twin, male, low birth-weight, black, works, and married are indicators. N/A: not available or does not apply.
Note 3 (IHDP details): I use the sample for which all variables in the table are non-missing (625 controls and 383 treated). Describing each variable according to item-specific respondents does not change the descriptions significantly.
Note 4 (US Population details): individual, mother, father, and household characteristics come from the Panel Study of Income Dynamics (PSID). I use cross-sectional weights for year 1986 (210 observations). Data on the proportion of twins come from the National Vital Statistics System; I take them from Martin et al. (2012). Economy characteristics come from the Current Population Survey (CPS) and the Bureau of Labor Statistics (BLS) and are not weighted by state population. Government expenditure per capita is at the state level.
Note 5 (estimations): I calculate the t-statistics through a simple regression with the variable on the left-hand side and a constant and a treatment indicator on the right-hand side. They are stratified by site and birth-weight groups and clustered by sibling status. p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws.
1.7384 8.6149 1530.6615 0.4524 0.5182 1.0923 0.1316 1.0754 0.7046 0.5002 0.5002 6.0692 0.4799 0.5004 0.3929 0.5002 460.6490 0.1904 0.4864 1801.2439
Individual Twin Male Birth-weight (grams) Low Birth-weight Gestational Age Black
Infant Health Development Program (IHDP) Control Treatment Treatment - Control Mean SD Mean SD Mean Difference Pr(|T| > |t|)
Table 2: Baseline Characteristics by Treatment Status and Comparison to the US Population

2.2 Treatment (see footnote 8)
One of the main concerns of the program administrators was the homogeneity of treatment across the eight sites. Their objective was to deliver a homogeneous treatment to ease evaluation.
The centralized structure of the program helped keep track of this homogeneity. Some differences are reported in Gross et al. (1997). There was a central office, the National Service Office, in charge of the administration of the program, staff training and hiring, monitoring of across-site treatment homogeneity, and program evaluation (see footnote 9). This office was based at Stanford University. To avoid conflicts of interest, an independent office was in charge of developing the curriculum of the program and monitoring its correct execution. This was the Program Development Office, based in North Carolina. Several experts with experience developing curricula for other projects, such as the Carolina ABC Project and Project CARE, were involved in IHDP from this office (see Gross et al., 1997, Chapters 26 and 27). These two offices had daily communication with the centers, received data, and met multiple times a year. One of the main objectives was to align the treatment across sites as much as possible as it evolved. In the following three subsections I describe the three main components of treatment: (i) home visits; (ii) center-based childcare; and (iii) parent group meetings.
8 The information in this subsection is largely based on Chapters 2 and 3 of Gross et al. (1997).
9 More details on the administration of the program are in Appendix D.

2.2.1 Home Visits
Home visits began as soon as the child or children (in the case of twins) were discharged from the hospital. They were weekly for ages 0 to 1 and bimonthly from ages 1 to 3. The home visitors were trained by the Program Development Office, the centralized committee in charge of developing the curricula of the program. This office supervised, developed, and executed the hiring and training of all the home visitors months before the intervention started. All visitors were college graduates with previous experience in home visiting.
Their goal was twofold: (i) implement a child curriculum and (ii) teach parents problem-solving skills, focusing on situations specific to children but also advising on general issues of life. The curriculum for the children was Early Partners (see Sparling et al., 1991), complemented with the Partners for Learning Kit (see Sparling et al., 1984). The curriculum consisted of a series of learning activities designed to boost child development. The kit complemented these with toys and playing materials such as memory cards. The home visitor instructed the parent on how to use the materials and decided jointly with him or her on the type of learning activities and toys used. This interaction was fundamental because it helped the home visitor learn whether there were any particular issues in the parent-child interaction that could be assessed and solved as part of the parents' problem-solving skill component of the visit. The interaction between the visitor and the parent was intensive. In sum, the parent was encouraged to be analytic about her or his problems. The objective of the home visitor was to engage with the parent, learn about his or her problems, talk about different solving strategies, and encourage the parent to solve issues independently. To this end, problem analysis and solution planning were the main drivers of this part of the treatment. Figure 2 shows two of the cards home visitors often used in the visit. They are an accurate visual description of the treatment parents received.

Figure 2: Example Materials of Home Visitation Treatment Component, IHDP
(a) STOP Card
(b) THINK Card: 1. What is the problem? 2. What do I want? 3. What can I do? 4. What will happen if? 5. What is my decision?
Note: this figure shows the STOP and THINK cards. These were two of the cards used in the component of home visitation directed to parents in IHDP. They were also provided as magnets to each of the treated families.

2.2.2 Center-based Childcare
When the children were a year old, they began center-based childcare. This continued to age 3 and was suspended only on national holidays and for one or two weeks during the summer and during December. Children were granted access to center-based childcare 5 days a week, for at least four and at most nine hours a day. The teaching staff continued with the curriculum implemented at home from ages 0 to 1, Partners for Learning. Across sites, 131 different activities of the curriculum were introduced to children over the two years in the program, on average. For children ages 1 to 2, there was one adult for every three children, in classes of six children. For children ages 2 to 3, the ratio was one teacher for every four children and class sizes increased to eight. In each classroom there was one teacher and one assistant teacher. Teachers were required to have an undergraduate degree in early childhood or a related field and two years of experience with children younger than three years old. Assistants were required to have at least a high-school degree and one year of experience with infants or young children. All the staff were intensively trained in North Carolina by the Program Development Office prior to the beginning of the treatment. Center-based childcare had two additional elements: (i) free transportation and (ii) nutrition. Free transportation was provided by each center. The mean use across centers was 80%, with a minimum of 60% and a maximum of 98%. Nutrition was provided by the center as well. Children who arrived early received breakfast. All children received lunch and two snacks.

2.2.3 Parent Group Meetings
This component was minor and was held every other month in each site. It was intended to provide family support. Parents shared concerns about their children's and their own problems. They were given information on the current health and developmental status of their children.

2.3 Variables
Figure 3: Stanford-Binet IQ Test Score at Age 3 by Treatment Status, IHDP
[Density plot. Horizontal axis: Test Score. Vertical axis: Density. Series: Control, Treatment.]
Note: this figure plots the density of the Stanford-Binet IQ Test Score at Age 3 by treatment status. The design of this test enables me to assess knowledge, quantitative reasoning, visual-spatial processing, working memory, and fluid reasoning. I standardize it to have the mean and standard deviation of the US population, 100 and 15 respectively. The raw mean difference between the treatment and control distributions is 8.6, with a standard error of 1.3 (stratified by site and birth-weight and clustered by family). The initial sample size was 1,090; 670 individuals belonged to control and 420 to treatment.

The data arising from IHDP consist of a very rich set of longitudinal measures with follow-ups up to age 18 (see footnote 10). I have a rich set of controls available. Individual-level characteristics are indicators of “being a twin” and “being a male”, gestational age in weeks, and birth-weight. Mother-level characteristics are race, age, marital status, and working status. I also use state fixed effects. My dependent variable of interest measures cognitive skills at age 3 using the Stanford-Binet IQ test. Its design enables me to assess knowledge, quantitative reasoning, visual-spatial processing, working memory, and fluid reasoning. I standardize it to have the mean and standard deviation of the US population, 100 and 15, respectively. My estimates indicate that the total effect of IHDP treatment on this IQ measure is 3/5 of a standard deviation. Figure 3 plots the distribution of this variable by treatment status.
10 For details on the construction and exact definition of each variable see Table A.1.

My two independent variables of interest are two cognitive skills production function inputs, accumulated center-based childcare and parental investment.
I measure the first as the average weekly hours in center-based childcare, per year, reported by the mother from ages 1 to 3. I observe yearly measures at ages 1.5, 2, 2.5, and 3. I measure the second as the average of 78 indicator items of the Home Observation for Measurement of the Environment (HOME score) at ages 1 and 3. This score aims to measure the quality and quantity of parental investment. I select the items at my discretion, using two criteria: (i) reflection of parental investment behavior; (ii) precision in the construction of the factor. By the second criterion I mean that I use items that not only reflect parental investment but also belong to common sub-scales, as the HOME score is divided into sub-scales aiming to measure different qualities of the child's environment. Given that the items are designed to measure those sub-scales, dropping items that belong to a scale from which I would keep only very few items increases the precision of the factor. To ease interpretation and comparison, I convert the two inputs to 25 quantiles of the overall sample distribution. My main identification strategy requires at least two instruments or exclusion restrictions. I use exactly two: random assignment to treatment and the interaction of treatment and “being a twin”. Table 1 shows raw information for these two variables. Interestingly, they have a relatively low raw correlation (0.37). As I discuss in Section 3, this low correlation and their strong economic and statistical correlation with the two inputs described above are fundamental for my identification strategy. Finally, I use age 18 outcomes to provide long-term evidence. I group these in two categories: (i) schooling and (ii) behavior. All of them are binary variables. Unfortunately, non-response is considerably high in some of them. However, as I document in Appendix A, there are no significant differences in item non-response across treatment and control groups.
This is important for my evidence on later-life outcomes not to depend on response pattern differences across these groups, especially given that treatment assignment generates my sources of exogenous variation. Furthermore, estimates accounting for item non-response through inverse probability weighting schemes do not change significantly. They are available upon request.

3 Identification
3.1 Skill Production
3.1.1 Inputs Linearity
My first objective is to identify the effects of center-based childcare and parental investment on cognitive skills at age 3. In Section 1, I explain why it is reasonable to postulate a production function depending on only two inputs. In sum, households make market goods and time allocation decisions to produce these inputs. Then, these inputs themselves produce cognitive skill. The production of cognitive skills at age 3, thus, depends on the accumulated inputs from ages 0 to 3. In my preferred estimations I control for a wide set of individual, family, and household characteristics as well as state fixed effects (control set 3 in Table 3). I interpret these controls as technology shifters and not as inputs. In what follows, I omit them to ease notation.

Table 3: Control Sets for Different Specifications
ID   Control Set
1    Individual: Twin, Male, Birth-weight, Gestational Age, Black, Hispanic
2    Control Set 1
     Mother: Education, Age, Marital Status
     Household: Welfare, Times Moved, Siblings, Under 18 Dependents, Age > 63 Dependents, Age > 18, ≤ 63 Dependents, Number of Adults Employed
3    Control Set 2
     State: Fixed Effects
Note: this table details the controls I use in the different specifications I present results for. Other combinations of controls do not alter the results in terms of magnitude, although the specifications with more controls are, in general, more precise. A detailed description of each variable is in Table A.1.

To identify the effects of interest, I make a functional form assumption on (2): my measure of cognitive skills for individual $i$ in the set of individuals in the sample, $I$, relates linearly to the accumulated inputs, center-based childcare, $G_{c,i}$, and parental investment, $G_{g,i}$. That is,

$A_i = \alpha_0 + \alpha_1 G_{c,i} + \alpha_2 G_{g,i} + \varepsilon_i$ (11)

where $\varepsilon_i$ is an unobserved component. $\alpha_1, \alpha_2$ are the estimands. They represent the marginal effects of $G_{c,i}, G_{g,i}$ on $A_i$. The unobserved component may be individual, family, or household specific. It is unknown to the econometrician and, potentially, not mean independent of either of the inputs. For example, if it represents unobserved “natural ability” of the child and parents are aware of it, parental decisions on the inputs depend on its level. This implies that the standard assumption needed to estimate $\alpha_1, \alpha_2$ through standard methods such as OLS, $E[\varepsilon_i | G_{c,i}, G_{g,i}] = 0$, fails. When this happens, the inputs are said to be endogenous. A common approach to tackle this issue is to “instrument” the endogenous variables of interest, in this case the inputs. Take $G_{c,i}$ as an example. A source of exogenous variation, $Z_{c,i}$, is a valid instrument if it moves $G_{c,i}$ independently of $\varepsilon_i$. Formally, $Z_{c,i}$ is a valid instrument if $E[\varepsilon_i | Z_{c,i}] = 0$ and $Cov(Z_{c,i}, G_{c,i}) \neq 0$. The first condition is the “exclusion restriction” and the second condition is the “rank condition”. Importantly, the instrument needs to satisfy these two conditions “conditional on the set of controls used in the estimation”. To clarify this, let $\tilde{Z}_{c,i}$ be the residual of a regression of $Z_{c,i}$ on the relevant set of controls. Then, it is possible to restate the “exclusion restriction” and “rank condition” as follows: $E[\varepsilon_i | \tilde{Z}_{c,i}] = 0$ and $Cov(\tilde{Z}_{c,i}, G_{c,i}) \neq 0$. The two inputs in (11) are both likely to be endogenous because they arise from a joint parental decision problem. Thus, identification requires at least two instruments.
Informally, at least two exogenous sources of variation are necessary to separate the effects of the two inputs on the dependent variable of interest, cognitive skill. This does not mean that the instruments need to be input-specific. However, they need to have enough independent variation between each other so that the mechanisms through which they affect cognitive skill are isolated and the relevant coefficients are identified. Once two instruments are available and valid, estimating (11) is straightforward. Let $Z_i$ be the vector of instruments for $i \in I$. The first step is to estimate the coefficient vectors $\theta_c, \theta_g$ in

$G_{c,i} = Z_i \theta_c + \eta_{c,i}$
$G_{g,i} = Z_i \theta_g + \eta_{g,i}$ (12)

where $E[\eta_{j,i} | Z_i] = 0$ for $j = c, g$. $\eta_{c,i}, \eta_{g,i}, \varepsilon_i$ may have a joint distribution. The system in (12) is often called the “first stage”. The second step is to use the first stage estimates, $\hat{\theta}_c, \hat{\theta}_g$, to obtain the linear projections of the inputs:

$\hat{G}_{c,i} := Z_i \hat{\theta}_c$
$\hat{G}_{g,i} := Z_i \hat{\theta}_g$. (13)

By construction, $E[\varepsilon_i | \hat{G}_{j,i}] = 0$ for $j = c, g$ because $E[\varepsilon_i | Z_i] = 0$. This justifies the validity of the third and final step. Substitute $\hat{G}_{j,i}$ for $G_{j,i}$ for $j = c, g$ in (11) and estimate the coefficients by a simple method such as OLS. Let these estimators be $\hat{\alpha}_k^{2SLS}$ for $k = 0, 1, 2$. It is easy to show that the estimates from this procedure are consistent. That is, $plim \, \hat{\alpha}_k^{2SLS} = \alpha_k$ for $k = 0, 1, 2$ if $E[\varepsilon_i | Z_i] = 0$. This last procedure is often called the “second stage”. Explicitly, it consists of estimating the coefficients in

$A_i = \alpha_0^{2SLS} + \alpha_1^{2SLS} \hat{G}_{c,i} + \alpha_2^{2SLS} \hat{G}_{g,i} + \hat{\varepsilon}_i$. (14)

3.1.2 Allowing for Complementarity or Substitutability
The production function in (11) is additively separable in inputs. Therefore, it does not allow for the inputs to be either substitutes or complements.
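
The three estimation steps of Subsection 3.1.1 (first stage, linear projections, second stage) can be sketched in a few lines. This is only an illustration on simulated data, not the IHDP sample: the sample size, the coefficient values, the variable `ability`, and the helper names `ols` and `two_stage_least_squares` are all assumptions made for the example. The twin indicator enters both stages as a control, and the two instruments are a treatment indicator and its interaction with the twin indicator.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients from regressing y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(A, G, Z, W):
    """2SLS of outcome A on endogenous inputs G.

    Z holds the excluded instruments and W the exogenous controls
    (including a constant). Returns coefficients ordered as [W, G].
    """
    exog = np.column_stack([W, Z])
    G_hat = exog @ ols(exog, G)                  # first stage (12) and projections (13)
    return ols(np.column_stack([W, G_hat]), A)   # second stage (14)

# Simulated data with hypothetical parameter values (NOT the IHDP sample).
rng = np.random.default_rng(0)
n = 20_000
treat = rng.binomial(1, 0.4, n)
twin = rng.binomial(1, 0.2, n)
ability = rng.normal(size=n)   # unobserved; makes both inputs endogenous

G_c = 10 * treat + 2 * treat * twin + ability + rng.normal(size=n)
G_g = 2 * treat - 3 * treat * twin + ability + rng.normal(size=n)
A = 80 + 0.5 * G_c + 1.0 * G_g + 3 * ability + rng.normal(size=n)

W = np.column_stack([np.ones(n), twin])      # constant and twin indicator
Z = np.column_stack([treat, treat * twin])   # the two instruments
G = np.column_stack([G_c, G_g])

beta_2sls = two_stage_least_squares(A, G, Z, W)   # [const, twin, alpha_1, alpha_2]
beta_ols = ols(np.column_stack([W, G]), A)        # ignores endogeneity
print("2SLS alpha_1, alpha_2:", beta_2sls[2:])
print("OLS  alpha_1, alpha_2:", beta_ols[2:])
```

In this simulation the true marginal effects are 0.5 and 1.0 by construction, so the two printed lines show how far each estimator lands from them when an unobserved component drives both inputs and the outcome. Standard errors from the plain second-stage regression are not valid; the sketch covers point estimates only.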
The simplest way to allow for substitutability or complementarity is to let an interaction of the inputs enter linearly, as in the following function:

$A_i = \alpha_0 + \alpha_1 G_{c,i} + \alpha_2 G_{g,i} + \alpha_3 G_{c,i} G_{g,i} + \varepsilon_i$ (15)

where the notation is analogous to that of Subsection 3.1.1 (see footnote 11). There are two possibilities to identify the coefficients in (15). The first is to find an instrument for $G_{c,i} G_{g,i}$. This proves difficult because independent sources of variation for three highly correlated variables are necessary. It is hard to come up with an economic story of an instrument satisfying this. The second option is to exploit the functional form or parametric assumption on (15). I follow this approach. To do that, I follow the first two estimation steps in Subsection 3.1.1. Once I obtain $\hat{G}_{c,i}, \hat{G}_{g,i}$, I multiply them together to find $\widehat{G_{c,i} G_{g,i}}$. Then, I follow the third estimation step in Subsection 3.1.1. As before, it is easy to show that $plim \, \hat{\alpha}_k^{2SLS} = \alpha_k$ for $k = 0, 1, 2, 3$ if $E[\varepsilon_i | Z_i] = 0$. This illustrates the usefulness of parametric assumptions in the identification of models. This comes at a cost as, admittedly, I identify $\alpha_3$ by functional form (see footnote 12). Explicitly, I estimate the coefficients in

$A_i = \alpha_0^{2SLS} + \alpha_1^{2SLS} \hat{G}_{c,i} + \alpha_2^{2SLS} \hat{G}_{g,i} + \alpha_3^{2SLS} \widehat{G_{c,i} G_{g,i}} + \hat{\varepsilon}_i$. (16)

11 In a similar specification, Løken et al. (2012) estimate the degree of diminishing marginal returns of parental income on different child outcomes. They allow the square of parental income to enter linearly into a specification which has the non-squared term as well. My specification allows me to find evidence of (Edgeworth) substitutability between inputs. Their specification allows them to find evidence on diminishing marginal returns of parental income on different child outcomes.
12 See Flinn and Heckman (1982) for a discussion on parameters that can be identified by functional form and parameters that can be identified non-parametrically. Also, see Keane et al.
(2011) for a discussion of parametric and non-parametric approaches and their relation to identification of models.

3.2 Decomposing the Effects of Each Input
My second objective is to identify and quantify the relative importance of each input of interest in the production of cognitive skill. Specifically, I want to assess the portions of the treatment effect of a policy corresponding to each of the inputs. The policy I analyze is random assignment to an early education program, IHDP. Among other components, this policy made center-based childcare freely available. Thus, it exogenously drove its price, $\pi_{c,i}$, to zero. The price has an individual index because it was randomly driven to zero only for some of the individuals (see Section 2). In terms of the notation I introduce in Subsection 3.1.1, $\pi_{c,i}$ is one of the components of $Z_i$. As I argue in Section 1, I am able to decompose the total effect of the policy by totally differentiating the production function. The decomposition is, therefore:

$\underbrace{\frac{dA_i}{d\pi_{c,i}}}_{\text{Total Price Effect}} = \underbrace{\frac{\partial A_i}{\partial G_{c,i}} \frac{dG_{c,i}}{d\pi_{c,i}}}_{\text{Effect through } G_{c,i}} + \underbrace{\frac{\partial A_i}{\partial G_{p,i}} \frac{dG_{p,i}}{d\pi_{c,i}}}_{\text{Effect through } G_{p,i}} + \text{residual}_i$ (17)

where $G_{p,i}$ denotes parental investment ($G_{g,i}$ in (11) to (16)). In practice, this decomposition is not exact. Omitted or unmeasured variables may cause discrepancies between the left-hand and right-hand sides of (17). The left-hand side is the total treatment effect of the policy. That is, it is the change in cognitive ability the policy causes. The right-hand side decomposes this total treatment effect into the components corresponding to center-based childcare and parental investment. The discrepancy term, which is due to other observed and unobserved variables, is an indirect test of the specification of the production of $A_i$. The larger the residual (in absolute value), the greater the misspecification. The components of (17) are straightforward to estimate.
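
As a worked illustration, (17) can be specialized by differentiating the two production functions directly. The expressions below follow from (11) and (15) by calculus alone; the share symbols $s_c$, $s_p$, $s_r$ are hypothetical labels introduced only for this sketch, and $G_{p,i}$ is the parental investment input written as $G_{g,i}$ in (11) and (15).

```latex
% Linear specification (11): the marginal effects are constants.
\frac{dA_i}{d\pi_{c,i}}
  = \alpha_1 \frac{dG_{c,i}}{d\pi_{c,i}}
  + \alpha_2 \frac{dG_{p,i}}{d\pi_{c,i}}
  + \text{residual}_i

% Interaction specification (15): each marginal effect depends on the other input.
\frac{dA_i}{d\pi_{c,i}}
  = \left(\alpha_1 + \alpha_3 G_{p,i}\right) \frac{dG_{c,i}}{d\pi_{c,i}}
  + \left(\alpha_2 + \alpha_3 G_{c,i}\right) \frac{dG_{p,i}}{d\pi_{c,i}}
  + \text{residual}_i

% Dividing each term by the total effect gives shares that add up to one.
% With the shares reported in the abstract:
s_c + s_p + s_r = 0.73 + 0.47 - 0.20 = 1
```

Under (15) the marginal effects vary across individuals, so a single set of shares requires choosing a point at which to evaluate them; which point is used is not specified in this section.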
The left-hand side is the mean difference between the individuals receiving the random price drop in $\pi_c$ and the individuals not receiving it. For the right-hand side, the components $\frac{\partial A_i}{\partial G_{c,i}}, \frac{\partial A_i}{\partial G_{p,i}}$ are the marginal effects of the inputs on $A_i$ and are functions of the second stage coefficients. That is, they are functions of the coefficients in (14) when I do not allow for input complementarity or substitutability and of those in (16) when I allow for it. Similarly, $\frac{dG_{c,i}}{d\pi_c}, \frac{dG_{p,i}}{d\pi_c}$ are functions of the first stage coefficients. These come directly from (12).

3.3 Long-term Impacts
Assessing the long-term effects of both center-based childcare and parental investment is economically interesting and policy relevant. As I show in Section 4, these are the two inputs through which a policy providing early education has a positive total treatment effect on cognitive skills. It is important, however, to understand whether the effects of these inputs persist over the life-cycle. One possibility is to estimate the production function of long-term outcomes using a methodology similar to the one I develop in the last few paragraphs. One immediate need is to extend the set of inputs through the life-cycle up to whenever the long-term outcome is measured. My main limitation is that I do not have enough longitudinal data to construct inputs up to my latest span of observation. Thus, I propose the following method to assess this. As I show in Section 4, the two instruments I use in my empirical strategy are, roughly, input-specific. That is, one instrument exogenously shifts $G_{c,i}$ and the other exogenously shifts $G_{g,i}$. Their “cross effect” on each input is low relative to the “direct effect”. To clarify this, let $\pi_{c,i}, \pi_{g,i}$ denote the two instruments. $\pi_{c,i}$ generates much more exogenous variation in $G_{c,i}$ than in $G_{g,i}$. Conversely, $\pi_{g,i}$ generates much more exogenous variation in $G_{g,i}$ than in $G_{c,i}$.
I use this to identify the long-term effects of center-based childcare and parental investment. Let $Y_i$ denote a long-term outcome for $i \in I$. I estimate the coefficients in the following equation:

$Y_i = \gamma_0 + \gamma_1 \pi_{c,i} + \gamma_2 \pi_{g,i} + \xi_i$. (18)

In Section 4, I provide an economic interpretation of the link between $\pi_{c,i}, \pi_{g,i}$ and $G_{c,i}, G_{g,i}$. The fact that $\pi_{c,i}, \pi_{g,i}$ are relatively input-specific allows me to interpret $\gamma_1, \gamma_2$ as indirect estimates of the long-term effects of center-based childcare and parental investment. Importantly, $\pi_{c,i}, \pi_{g,i}$ are sources of exogenous variation. Thus, $E[\xi_i | \pi_{c,i}, \pi_{g,i}] = 0$ is a plausible assumption. This makes the estimation of $\gamma_1, \gamma_2$ through a simple technique such as OLS consistent.

3.4 Inference
All of my inference is based on the non-parametric bootstrap. By this I mean that for each estimate I present, I re-sample with replacement 1,000 times and calculate the estimate again with each re-sample. My standard errors are simply the standard deviation of the estimates across these 1,000 draws. All of my tests are two-tailed and test the null hypothesis that a coefficient equals zero. Therefore, to construct p-values, I do the following. I demean the 1,000 estimates to center them around zero. I compare their absolute values with the absolute value of the estimate from the actual sample. The p-value is the proportion of times the absolute value of a demeaned “re-sampled” estimate is larger than the absolute value of the estimate from the actual sample. My re-sampling emulates the randomization process from which my data originate (see Section 2). Thus, I stratify by site and a low-low birth-weight indicator (< 2,000 grams) and I cluster at the family level. As in the randomization process, I assign twins to either treatment or control as a pair in each re-sample.
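
The procedure described in Subsection 3.4 can be sketched as follows, for the simplest estimate (a treatment-minus-control mean difference). The data are simulated and every name and number in the block is an assumption made for the illustration: `simulate_strata`, the four strata, the family counts, and the simulated effect of 8 points are not taken from IHDP. Each stratum stands for one cell of site by low-low birth-weight indicator, and each family is a list of children who share a treatment status, so twins are always re-sampled together.

```python
import random
import statistics

def mean_difference(children):
    """Treatment-minus-control difference in mean outcomes."""
    treated = [y for t, y in children if t == 1]
    control = [y for t, y in children if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)

def resample(strata, rng):
    """Draw families with replacement within each stratum (clustered, stratified)."""
    sample = []
    for families in strata.values():
        for family in rng.choices(families, k=len(families)):
            sample.extend(family)   # siblings enter or leave the re-sample together
    return sample

def bootstrap_inference(strata, draws=1000, seed=0):
    """Return (estimate, bootstrap standard error, two-tailed bootstrap p-value)."""
    rng = random.Random(seed)
    full = [c for families in strata.values() for family in families for c in family]
    estimate = mean_difference(full)
    boot = [mean_difference(resample(strata, rng)) for _ in range(draws)]
    std_error = statistics.stdev(boot)
    center = statistics.fmean(boot)
    # Demean the re-sampled estimates and compare absolute values with the estimate.
    p_value = sum(abs(b - center) > abs(estimate) for b in boot) / draws
    return estimate, std_error, p_value

def simulate_strata(rng, n_strata=4, families_per_stratum=50):
    """Hypothetical data: alternating treatment status, a few jointly assigned twins."""
    strata = {}
    for s in range(n_strata):
        families = []
        for f in range(families_per_stratum):
            treated = f % 2
            size = 2 if f % 10 < 2 else 1   # twin pairs in both arms
            families.append(
                [(treated, 90 + 8 * treated + rng.gauss(0, 5)) for _ in range(size)]
            )
        strata[s] = families
    return strata

estimate, std_error, p_value = bootstrap_inference(simulate_strata(random.Random(1)))
print(estimate, std_error, p_value)
```

The same three functions apply to any scalar estimate: replacing `mean_difference` with a function that returns, say, a regression coefficient leaves the re-sampling and p-value logic unchanged.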

4 Results
4.1 Justifying and Interpreting the Validity of the Instruments
The two different specifications I consider for the production function of cognitive skills require at least two sources of exogenous variation, as I explain in Section 3. In that section I also explain the two requirements they need to satisfy: (i) exclusion restriction and (ii) rank condition.

4.1.1 Exclusion Restriction
My two instruments are (i) random assignment to IHDP and (ii) the interaction of random assignment and “being a twin” (see Section 2 for details on the experimental design and the sample). For these two variables to be valid exclusion restrictions, I need them to relate to my outcome of interest, a measure of cognitive skills, only through the inputs I instrument for: center-based childcare and parental investment. Using the same notation as in Section 3, I need the instruments to satisfy the following condition: $E[\varepsilon_i | \pi_{c,i}, \pi_{g,i}] = 0$. By definition, this condition is impossible to test: $\varepsilon_i$ is unobserved by the econometrician. However, some placebo tests are useful. A necessary condition for the instruments to be valid is that they have no significant relationship with baseline variables. If they do, the instrument is likely to relate directly to the outcome of interest and, therefore, to violate the exclusion restriction. For example, it is likely to “pick up” unobserved family, household, or state characteristics that directly affect the outcome of interest. In that case, it would relate to the outcome independently of the inputs. I test whether the two instruments relate to a wide set of individual, family, and state characteristics as follows. I estimate the coefficients in the following regression:

$X_i = \lambda_0 + \lambda_1 1[\text{Treatment}_i] + \lambda_2 1[\text{Twin}_i] + \lambda_3 1[\text{Treatment}_i * \text{Twin}_i] + \psi_i$ (19)

where $1[\cdot]$ is an indicator function, $X_i$ is a baseline variable, and $\psi_i$ is mean independent of all the right-hand side variables in (19).
It is important to include a twin indicator in (19) for the economic interpretation of why the interaction works as an instrument to make sense. I discuss this in Subsection 4.1.3. Rejecting the null hypothesis $H_0: \lambda_1 = \lambda_3 = 0$ provides evidence against the exclusion restriction, i.e. against assuming that $E[\varepsilon_i | \pi_{c,i}, \pi_{g,i}] = 0$. Figure 4 presents the test for various baseline characteristics. The results overwhelmingly fail to reject the null hypothesis in most cases. No relationship between the baseline variables and the instruments is apparent; statistically, this relationship does not exist.

Figure 4: Placebo Tests, the Instruments and Pretreatment Variables
[Five coefficient plots. Vertical axis: Regression Coefficients. Series: Treatment, Twin, Treatment*Twin, with markers for p < .10, p < .05, p < .01 and +/- s.e. bars.]
(a) Individual: Male, Birth-weight, Gestational Age, Black, Hispanic
(b) Mother: Mother's Education, Mother's Age, Mother Married
(c) Household: In Welfare, Times Moved, Siblings, Young Dependents, Old Dependents, Adult Dependents, Employed Adults
(d) Economy: Employment, Median Income, State Expenditure
(e) States: Arkansas, Connecticut, Florida, Massachusetts, New York, Pennsylvania, Texas, Washington
Note 1 (general): these figures present placebo tests on the instruments I use in my identification strategy. I regress each of the variables labeled on the x-axis of each figure on a treatment indicator, a twin indicator, and the interaction of these two variables.
The objective is to test whether the treatment indicator and/or the interaction, my instruments, correlate with baseline variables and, therefore, reflect unobserved individual, family, or state trends.
Note 2 (data): all variables are standardized for graphic display. Results are not sensitive to this. Details on each variable are in Table A.1.
Note 3 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

My interpretation of why the instruments satisfy the exclusion restriction is the following. First, treatment was successfully assigned at random (see Table 2 for further details). For the interaction of treatment and “being a twin”, treatment was assigned at birth without pre-screening on the child's status as a single child or twin. Thus, conditional on being assigned to treatment, being a twin was random.

4.1.2 Rank Condition
Usually, economists study problems in which there is a single endogenous variable. In that case, the first stage is not a system as in (12), and testing the rank condition is easy through a standard F-test over coefficients analogous to $\theta_c$ or $\theta_g$ in the first stage. The F-statistic has a well-defined distribution in that case. A usual rule of thumb for an “acceptable rank condition” is an F-statistic ≥ 10. It is based on simulations indicating that the relative bias of 2SLS estimators, such as the one I use, with respect to OLS is minimal if this holds (see Stock and Yogo, 2005). In the case of multiple instruments, the F-test does not provide enough information on the independence with which the instruments shift the inputs (see Section 3 for a more formal discussion of this). Two tests with well-defined distributions are available for the case of multiple instruments.
One is the Cragg-Donald test and the other is the Kleibergen and Paap test (see Cragg and Donald, 1993; Kleibergen and Paap, 2006). The first assumes that the unobserved components in both the first and second stages of the estimations are i.i.d. The second generalizes to cases of heteroskedasticity and serial correlation. In particular, Kleibergen and Paap (2006) discuss how the value of the statistic and the critical values of its distribution change in the presence of heteroskedasticity or serial correlation. The first test is convenient because critical values are available in Stock and Yogo (2005). For the second test, there are no critical values available yet. Thus, I display the Cragg-Donald test in all my estimations and base the test of the rank condition on it. In all of my estimations the Kleibergen-Paap statistic is almost identical to the Cragg-Donald statistic. Hence, I base my conclusions on the Cragg-Donald statistic.

Table 4: First Stage Estimates, Center-based Childcare
                                                       (1)           (2)           (3)
Treatment                                              11.568***     11.6723***    11.6518***
                                                       (0.0000)      (0.0000)      (0.0000)
Treatment * Twin                                       2.7628***     2.1304*       2.3112*
                                                       (0.0080)      (0.0580)      (0.0540)
Twin                                                   -1.0524       -0.5909       -0.6768
                                                       (0.2150)      (0.5400)      (0.4640)
Child Characteristics                                  X             X             X
Mother Characteristics                                               X             X
Household Characteristics                                            X             X
State Fixed Effects                                                                X
N                                                      985           945           945
Treated                                                380           357           357
Twins                                                  194           180           180
Treated Twins                                          80            72            72
R2                                                     0.4385        0.4601        0.4751
Rank Tests
Cragg-Donald Stat                                      4.0941        6.3822        7.5396
Critical Value for 15%-10% Maximum Relative Bias       4.58-7.03     4.58-7.03     4.58-7.03
F-Statistic (exclude instruments)                      369.9536      349.4054      356.0704
AP F-Statistic (exclude residual instruments)          492.9785      378.7205      414.4478
Note 1 (general): this table displays the first stage results of treatment and the interaction of treatment and “being a twin” for center-based childcare. Center-based childcare is the average weekly hours across ages 1 to 3, transformed to 25 quantiles. I describe the controls in detail in Table 3 and Table A.1.
The rank tests are described in Subsection 4.1.2. They jointly test the rank of center-based childcare and parental investment, as they are the two endogenous inputs in my specifications. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

Table 5: First Stage Estimates, Parental Investment

                                                    (1)          (2)          (3)
Treatment                                           1.4218***    1.8535***    1.8962***
                                                    (0.0010)     (0.0000)     (0.0000)
Treatment * Twin                                    -2.6305**    -3.047***    -3.2726***
                                                    (0.0300)     (0.0060)     (0.0040)
Twin                                                -0.2237      -0.1477      -0.0038
                                                    (0.7740)     (0.8520)     (0.9930)
N                                                   985          945          945
Treated                                             380          357          357
Twins                                               194          180          180
Treated Twins                                       80           72           72
R2                                                  0.274        0.4427       0.4626
Rank Tests
Cragg-Donald Stat                                   4.0941       6.3822       7.5396
Critical Value for 15%-10% Maximum Relative Bias    4.58-7.03    4.58-7.03    4.58-7.03
F-Statistic (exclude instruments)                   6.0511       11.7206      12.866
AP F-Statistic (exclude residual instruments)       8.0634       12.704       14.9754
Child Characteristics, Mother Characteristics, Household Characteristics, State Fixed Effects: X X X X X X X X

Note 1 (general): this table displays the first stage results of treatment and the interaction of treatment and “being a twin” for parental investment. Parental investment is measured as a scale of an extensive set of measures on parental investment and home environment at ages 1 and 3. Then, it is transformed to 25 quantiles. I describe in detail the controls in Table 3 and Table A.1. The rank tests are described in Subsection 4.1.2. They jointly test the rank of center-based childcare and parental investment, as they are the two endogenous inputs in my specifications. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws.
They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

Table 4 presents the first stage estimates for center-based childcare and Table 5 does the same for parental investment. The Cragg-Donald tests are satisfactory in the specifications with the wider sets of controls. Actually, the more controls I add, the stronger the evidence based on this test. This happens because the test “assigns a reward” to the statistic when the number of controls increases without affecting the rank of the instruments. In specifications (2) and (3), I reject that the maximal relative bias of 2SLS with respect to OLS is greater than 15%; in specification (1), the statistic (4.09) falls just below the 15% critical value (4.58). Complementary tests, namely standard F-tests and AP F-tests, are satisfactory as well.13

13 The AP F-test is similar to the F-test. However, it “residualizes” through a linear regression the non-relevant input to each first stage. For example, in the first stage of center-based childcare it partials out parental investment from the instruments. The AP F-test, however, does not have a well-defined distribution (see Sanderson and Windmeijer, 2013).

4.1.3 Economic Interpretation

Subsection 4.1.1 and Subsection 4.1.2 provide evidence on the validity of the instruments. Economic interpretation, however, helps both to understand why the instruments work and to interpret further the early- and later-life outcomes. Random assignment to IHDP made center-based childcare freely available. Not surprisingly, assignment to treatment implies an increase of 11.65 out of 25 quantiles in the distribution of average weekly hours in childcare across ages 1 to 3 (see Table 4). Intensive treatment on parents was also part of IHDP. Thus, random assignment to treatment also has a noticeable treatment effect on parental investment. This effect, however, is much lower: 1.9 quantiles out of 25 in the distribution of a scale summarizing parental investment from ages 1 to 3 (see Table 5).

The interaction between treatment assignment and “being a twin” operates through a different economic mechanism than assignment to treatment. As discussed before, conditional on being assigned to treatment, “being a twin” is completely random. This random event generates (i) a relatively small increase in the use of center-based childcare, and (ii) a noticeable and significant decrease in parental investment. I interpret this as a constraint on parental resources, which makes parents invest less in children and substitute away from investment by sending children to center-based childcare for more hours. Actually, my data allow me to break the accumulated inputs I use in my main estimations into measures across ages 1 to 3. The pattern of the two instruments over the course of childhood is very stable and favors this interpretation (see Figure 5).

Figure 5: First Stage Coefficients broken into Age 1 to 3 Components. Panel (a) Center-based Childcare: regression coefficients for average hours per week in childcare at ages 1.5, 2, 2.5, and 3. Panel (b) Parental Investment: regression coefficients for HOME score sub-scales at age 1 (Organization, Response, Restriction) and age 3 (Language stimulation, Learning stimulation, Modeling). Each panel plots the coefficients on Treatment, Twin, and Treatment*Twin, with markers for p < .10, p < .05, and p < .01 and bars for +/- s.e.

Note 1 (general): this figure breaks the components of center-based childcare and parental investment into different age-particular categories, measured in deciles. The coefficients plotted are analogous to the first-stage coefficients in Table 4 and Table 5, respectively. Table A.1 provides details on variable construction. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws.
They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

This evidence shows that the economic mechanisms through which the instruments work are different. Actually, the instruments have a raw correlation of 0.37. Given that they both have independent power to shift the inputs, tests on the rank condition are satisfied.

4.2 Skill Production

4.2.1 Inputs Linearity

Table 6: Second Stage Estimates, Linear Inputs

                                        (1)          (2)          (3)
Center-based Childcare                  0.4862**     0.4857**     0.5421***
                                        (0.0330)     (0.0240)     (0.0060)
Parental Investment                     2.7807*      2.5267*      2.1105*
                                        (0.0720)     (0.0520)     (0.0500)
Twin                                    2.8966       3.3817       2.5307
                                        (0.2770)     (0.2000)     (0.2620)
N                                       985          945          945
Treated                                 380          357          357
Twins                                   194          180          180
Treated Twins                           80           72           72
F-Statistic (exclude all variables)     45.39        28.4269      25.7729
R2                                      0.2006       0.2717       0.3899
Child Characteristics, Mother Characteristics, Household Characteristics, State Fixed Effects: X X X X X X X X

Note 1 (general): this table displays the second stage results using treatment and the interaction of treatment and “being a twin” as instruments for center-based childcare and parental investment. Center-based childcare is the average weekly hours across ages 1 to 3, transformed to 25 quantiles. Parental investment is measured as a scale of an extensive set of measures on parental investment and home environment at ages 1 and 3. Then, it is transformed to 25 quantiles. I describe in detail the controls in Table 3 and Table A.1. The corresponding first stage estimates are in Table 4 and Table 5. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

Table 6 displays the “second-stage” estimates in my first strategy.
In this case, center-based childcare and parental investment enter linearly into the production function of skills, which I measure through the Stanford-Binet IQ Test at age 3. This specification does not allow for substitutability or complementarity across inputs. I display three sets of results. The number of controls increases from left to right, i.e. from (1) to (3). The coefficients are stable across the three columns. They become more precise as the number of controls increases, which is not surprising as 2SLS estimations are known to be relatively inefficient.

I standardize the Stanford-Binet IQ Test score with respect to the US population. Specifically, it has the same mean and standard deviation in my sample as it has in the US population, 100 and 15 respectively. The coefficients in (3), my preferred specification, mean the following. An increase of one out of 25 quantiles in average weekly hours in center-based childcare from ages 1 to 3 increases the Stanford-Binet IQ score by half a point. The mean quantile of average weekly hours in center-based childcare in the sample is 11: 18 in the treatment group and 6 in the control group. The effect, therefore, is pronounced. For example, treatment implies an increase of 2/5 of a standard deviation in the distribution of IQ. I discuss this further below.

The measure of parental investment from ages 0 to 3 is a scale of several parenting and home environment measures, which I convert to 25 quantiles. The mean of this converted measure in the sample is 12.7. An increase of one quantile in this measure increases the Stanford-Binet IQ test score by 2.1 points. Again, this is a pronounced effect. The estimate of this effect is less precise than that of center-based childcare. This is expected. Almost all children in the sample went to the same center-based childcare, IHDP, which had a homogeneous quality. Thus, the measure of this input is not likely to have measurement error.
The measure of parental investment is based on extensive questionnaires, and a single scale summarizing it is likely to be noisy.

4.2.2 Allowing for Complementarity or Substitutability

Table 7 displays the results allowing for complementarity or substitutability. Details on how I estimate this are in Subsection 3.1.2. The estimates change to a great extent when compared to the additively separable case.

Table 7: Second Stage Estimates, Allowing for Complementarity or Substitutability

                                                  (1)          (2)          (3)
Center-based Childcare                            1.2089**     0.9381***    0.9719***
                                                  (0.0180)     (0.0050)     (0.0040)
Parental Investment                               4.0410*      3.2688**     2.7632**
                                                  (0.0510)     (0.0380)     (0.0260)
Center-based Childcare × Parental Investment      -0.060*      -0.0378*     -0.0356**
                                                  (0.0520)     (0.0630)     (0.0480)
Twin                                              3.1218       3.3996       2.4468
                                                  (0.3590)     (0.2110)     (0.3130)
N                                                 985          945          945
Treated                                           380          357          357
Twins                                             194          180          180
Treated Twins                                     80           72           72
F-Statistic (exclude all variables)               46.7535      33.2002      26.7067
R2                                                0.3015       0.4055       0.4307
Child Characteristics, Mother Characteristics, Household Characteristics, State Fixed Effects: X X X X X X X X

Note 1 (general): this table displays the second stage results using treatment and the interaction of treatment and “being a twin” as instruments for center-based childcare and parental investment. Given that the interaction of these two inputs is endogenous as well, I multiply the projections from the first stages of the inputs to obtain a projection for the second stage, instead of considering a third instrument. Center-based childcare is the average weekly hours across ages 1 to 3, transformed to 25 quantiles. Parental investment is measured as a scale of an extensive set of measures on parental investment and home environment at ages 1 and 3. Then, it is transformed to 25 quantiles. I describe in detail the controls in Table 3 and Table A.1. The corresponding first stage estimates are in Table 4 and Table 5.
Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

First, I find that the two inputs are (Edgeworth) substitutes: the coefficient on the interaction is negative. This coefficient is the cross-partial of center-based childcare and parental investment on cognitive skills. This is an interesting finding. IHDP, a high-quality early education program, boosts both center-based childcare take-up and parental investment. The production of cognitive skills uses these inputs as substitutes.

Interestingly, I am able to display some evidence that parents perceive these two inputs as substitutes as well. To see this, recall that in Subsection 4.1.3 I interpret the interaction of treatment and “being a twin” as exogenously constraining parental resources. Parents, actually, use more center-based childcare and decrease parental investment when they receive this exogenous shift. These results hold when controlling for the widest set of controls in Table 3. When parents receive this shift, they increase the use of center-based childcare by 2.3 out of 25 quantiles and decrease parental investment by 3.3 out of 25 quantiles. Figure 6 displays this result.

Figure 6: Center-based Childcare and Parenting, Evidence on Substitution. The figure plots regression coefficients on Treatment, Twin, and Treatment*Twin for the accumulated inputs, ages 0 to 3 (Childcare and Parenting), with markers for p < .10, p < .05, and p < .01 and bars for +/- s.e.

Note 1 (general): the coefficients plotted are analogous to the first-stage coefficients in Table 4 and Table 5, respectively. Table A.1 provides details on variable construction.
I transform the two variables to 25 quantiles. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

The coefficients on center-based childcare and parental investment, representing the effects of these two inputs on cognitive skills, are much bigger in this case. This is mechanical. The specification in Subsection 4.2.1 omits the interaction between the two inputs. Thus, the coefficients on the inputs capture part of the substitutability between the two inputs and are smaller. When the interaction term is accounted for, the effect of these two inputs is much larger. In the case of center-based childcare, it almost doubles, and in the case of parental investment, it increases by roughly 30%.

4.3 Decomposing the Total Effect of Treatment

IHDP consisted of intensive intervention on both children and parents. This is not exceptional in the context of early education programs, as they usually combine center-based and home visit programs (see Elango et al., 2015). It is relevant to identify and quantify the relative importance of each component in the total effect of treatment. I explain how to do that in Subsection 3.2. The results based on the specification not allowing for complementarity or substitutability are in Table 8; the results allowing for these are in Table 9.
Table 8: Total Treatment Effect Decomposition, Linear Inputs

                     Childcare     Parental Investment     Residual     Total
Treat, Control 1     5.6241**      3.9537*                 -0.8029      8.7750***
                     (0.0330)      (0.0720)                (0.6440)     (0.0000)
Treat, Control 2     5.6687**      4.6833**                -1.7764      8.5756***
                     (0.0230)      (0.0350)                (0.3090)     (0.0000)
Treat, Control 3     6.3161***     4.0020**                -1.7425      8.5756***
                     (0.0060)      (0.0430)                (0.3030)     (0.0000)

Note 1 (general): this table decomposes the total treatment effect of IHDP on cognitive skills at age 3, as measured by the Stanford-Binet IQ score at age 3. This has in-sample mean and standard deviation 100 and 15, respectively. The decomposition is based on the results in Table 6. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

Table 9: Total Treatment Effect Decomposition, Allowing for Complementarity or Substitutability

                     Childcare     Parental Investment     Residual     Total
Treat, Control 1     5.4608**      3.0508                  0.2634       8.7750***
                     (0.0400)      (0.1600)                (0.8680)     (0.0000)
Treat, Control 2     5.5307**      3.6294*                 -0.5844      8.5756***
                     (0.0300)      (0.0630)                (0.7330)     (0.0000)
Treat, Control 3     6.2805***     3.0157*                 -0.7206      8.5756***
                     (0.0100)      (0.0680)                (0.6700)     (0.0000)

Note 1 (general): this table decomposes the total treatment effect of IHDP on cognitive skills at age 3, as measured by the Stanford-Binet IQ score at age 3. This has in-sample mean and standard deviation 100 and 15, respectively. The decomposition is based on the results in Table 7. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.
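The decomposition can be checked against the reported point estimates. The sketch below is a reconstruction, not the procedure of Subsection 3.2. It assumes that, in the linear specification, each input's component equals its second-stage coefficient (Table 6) times the first-stage coefficient of treatment on that input (Table 4 and Table 5), and that the residual is the total effect minus the two components. The function names and dictionary keys are illustrative. For the specification with an interaction, the sketch only converts the reported Table 9 components into shares.

```python
def decompose(total, beta_childcare, beta_parental, fs_childcare, fs_parental):
    """Split a total treatment effect into two input components and a residual.

    Assumption: component = second-stage coefficient x first-stage treatment coefficient.
    """
    childcare = beta_childcare * fs_childcare
    parental = beta_parental * fs_parental
    residual = total - childcare - parental
    return {"childcare": childcare, "parental": parental, "residual": residual}


def shares(components, total):
    """Express each component as a fraction of the total treatment effect."""
    return {name: value / total for name, value in components.items()}


TOTAL = 8.5756  # total effect, widest set of controls (Tables 8 and 9)

# Linear specification, column (3): coefficients from Tables 4, 5, and 6.
linear = decompose(
    total=TOTAL,
    beta_childcare=0.5421,   # Table 6, center-based childcare
    beta_parental=2.1105,    # Table 6, parental investment
    fs_childcare=11.6518,    # Table 4, treatment
    fs_parental=1.8962,      # Table 5, treatment
)
linear_shares = shares(linear, TOTAL)

# Specification with an interaction: components as reported in Table 9.
table9 = {"childcare": 6.2805, "parental": 3.0157, "residual": -0.7206}
table9_shares = shares(table9, TOTAL)

for label, result in [("linear", linear_shares), ("interaction", table9_shares)]:
    print(label, {name: round(100 * value, 1) for name, value in result.items()})
```

Under this assumption the products match the “Treat, Control 3” row of Table 8 up to rounding, and the Table 9 shares correspond to the 73%, 35%, and -8% figures discussed in the text.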
The results decompose the total treatment effect on Stanford-Binet IQ at age 3 into the components corresponding to center-based childcare, parental investment, and a residual. The results indicate that the absolute magnitude of the unobserved residual is relatively small compared to the absolute magnitude of the other two components. This is an indirect test of the assumption that the production function depends on two inputs only. The results for both decomposition exercises are similar. This is not surprising: in the first specification, where no interaction is allowed, the two inputs absorb part of the interaction term.

In the specification allowing for complementarity or substitutability (Table 9, widest set of controls), center-based childcare accounts for 73% of the total treatment effect and parental investment for 35%. These shares sum to more than 100% because the residual is negative (−8%). These results are relevant. They state that parental investment, not only center-based childcare, is fundamental in boosting IQ in early education programs. Actually, the most successful early education programs, such as the Carolina ABC Program or the Perry Preschool program, had a strong parental component (see Elango et al., 2015).

4.4 Long-Term Effects

Figure 7: Long-Term Effects of Center-based Childcare and Parental Investment, Reduced-form Evidence. Panel (a) Cognitive Skills: regression coefficients for IQ at ages 3, 5, 8, and 18. Panel (b) Schooling at Age 18: Absent Last Year, In Therapy, In Counseling, Reading Tutor, Math Tutor. Panel (c) Behavior at Age 18: Took SAT, Teen Parent, Smokes Tobacco, Idle. Each panel plots the coefficients on Treatment, Twin, and Treatment*Twin, with markers for p < .10, p < .05, and p < .01 and bars for +/- s.e.
Note 1 (general): this figure displays reduced-form estimates of my two instruments, treatment and the interaction of treatment and “being a twin”, on long-term outcomes. Table A.1 provides details on variable construction. Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

Figure 7 presents evidence on long-term outcomes. As I explain in Subsection 3.3, I am not able to estimate a production function for these outcomes, mainly due to data limitations. The evidence displayed before, however, helps to construct an economic interpretation of the instruments. Therefore, the “reduced-form” estimates of the instruments on these outcomes inform on the longer-term effects of center-based childcare and parenting. All these estimates control for the widest set of controls in Table 3.

To interpret the results, recall that the main effect of the treatment indicator is to increase the hours of center-based childcare. The main effect of the interaction of treatment and “being a twin” is to decrease parental investment, due to an exogenous decrease in parental resources. The signs of the effects are as expected. For “positive outcomes” such as IQ or “took SAT”, treatment has a positive causal effect and the interaction term a negative effect. For “negative outcomes” such as school absenteeism, need of school tutors, teen parenthood, or tobacco use, treatment reduces the outcome and the interaction increases it.

The effect of treatment in the long term is much lower and noisier in general. Nonetheless, there are some relevant, precise impacts: treatment lowers school absenteeism, reading tutor use, and tobacco use by between 3 and 5 percentage points.
Interestingly, all these outcomes are measures of behaviors related to non-cognitive skills.14 Thus, this evidence is consistent with previous findings in Heckman et al. (2013) and García and Heckman (2015): non-cognitive skills mediate longer-term outcomes to a greater extent than cognitive skills.

14 Often, these behaviors form the basis for measuring non-cognitive skills (see Almlund et al., 2011; Kautz et al., 2014).

The interaction between treatment and “being a twin”, which lowers parental investment, has much larger and more precise effects. It lowers IQ to a great extent up to age 18. It increases the use of therapy, counseling, reading and math tutors, teen parenthood, and tobacco use. It decreases “taking the SAT”. The effects on IQ range between -4 and -7 points; the in-sample distributions of all these tests have mean 100 and standard deviation 15. The effects on all schooling and behavior outcomes range between 8 and 25 percentage points, in absolute value.

I interpret the fact that parental behavior has a more persistent effect on longer-term outcomes as follows. Random assignment to treatment implied an intensive program from ages 0 to 3. The interaction between treatment and “being a twin” represents a permanent exogenous decrease in parental resources. The effects, thus, are much stronger and more precise than the effects of the three-year intervention.

5 Final Comments

Provision of center-based childcare is usually proposed as a solution for remediation and pre-distribution in order to promote social mobility (see Karoly et al., 2006; Heckman, 2013). Actually, in recent speeches, President Obama has gone as far as proposing universal childcare (see Condon, 2015). Much of the evidence supporting claims in favor of early education is based on a few programs in the US or programs in Europe (see Elango et al., 2015).
Importantly, this evidence does not assess the relative importance, substitutability, or short- and long-term effects of the distinct components of successful early education programs. My paper contributes to the literature and the debate by expanding the current knowledge in these three respects.

Firstly, I provide causal evidence on the components of early education programs and show that their success is not based exclusively on center-based childcare. Parental investment is a highly relevant component of that success.

Secondly, I provide two pieces of evidence that center-based childcare and parental investment are substitutes. This is highly relevant: when parents face resource constraints, they substitute away from investment by sending children to center-based childcare more often. Thus, the quality of center-based childcare becomes a highly relevant policy matter. Policy makers should consider that constrained parents use center-based childcare as a substitute for parental investment.

Importantly, this has short- and long-term consequences for diverse outcomes. This is my third contribution. Both center-based childcare and parenting have long-term effects. The effects of the latter are more persistent and precise than the effects of the former. This is not surprising, since parental investment keeps going after children “graduate” from center-based childcare.

References

Almlund, M., A. L. Duckworth, J. Heckman, and T. Kautz (2011). Personality Psychology and Economics. Handbook of the Economics of Education 4 (1).
Almond, D., K. Y. Chay, and D. S. Lee (2004). The Costs of Low Birth-weight. National Bureau of Economic Research Working Paper.
Almond, D. and J. Currie (2011). Killing me Softly: The Fetal Origins Hypothesis. The Journal of Economic Perspectives 25 (3), 153.
Baker, M., J. Gruber, and K. Milligan (2008). Universal Childcare, Maternal Labor Supply, and Family Well-being. Journal of Political Economy 116 (4), 709–45.
Becker, G. S. (1965). A Theory of the Allocation of Time. The Economic Journal 75 (299), 493–517.
Becker, G. S. and N. Tomes (1979). An Equilibrium Theory of the Distribution of Income and Intergenerational Mobility. Journal of Political Economy 87 (6), 1153–1189.
Becker, G. S. and N. Tomes (1986). Human Capital and the Rise and Fall of Families. Journal of Labor Economics 4 (3, Part 2), S1–S39.
Behrman, J. R. and M. R. Rosenzweig (2004). Returns to Birthweight. Review of Economics and Statistics 86 (2), 586–601.
Berlinski, S., S. Galiani, and P. Gertler (2009). The Effect of Pre-primary Education on Primary School Performance. Journal of Public Economics 93 (1), 219–234.
Bernal, R. (2008). The Effect of Maternal Employment and Child Care on Children’s Cognitive Development. International Economic Review 49 (4), 1173–1209.
Bernal, R. and M. P. Keane (2011). Child Care Choices and Children’s Cognitive Achievement: the Case of Single Mothers. Journal of Labor Economics 29 (3), 459–512.
Black, S. E., P. J. Devereux, and K. Salvanes (2005). From the Cradle to the Labor Market? The Effect of Birth-weight on Adult Outcomes.
Brilli, Y. (2013). Mother or Market Care? A Structural Estimation of Childcare Impacts on Child Development. Collegio Carlo Alberto Unpublished Manuscript.
Condon, S. (2015). Obama: We had universal childcare in the 1940’s, so let’s do it in 2015.
Cragg, J. G. and S. G. Donald (1993). Testing Identifiability and Specification in Instrumental Variable Models. Econometric Theory 9 (02), 222–240.
Cunha, F., J. Heckman, and S. Navarro (2005). Separating Uncertainty from Heterogeneity in Life-cycle Earnings. Oxford Economic Papers 57 (2), 191–261.
Cunha, F. and J. J. Heckman (2008). Formulating, Identifying and Estimating the Technology of Cognitive and Non-cognitive Skill Formation. Journal of Human Resources 43 (4), 738–782.
Cunha, F., J. J. Heckman, L. Lochner, and D. V. Masterov (2006). Interpreting the Evidence on Life-cycle Skill Formation.
Handbook of the Economics of Education 1, 697–812.
Cunha, F., J. J. Heckman, and S. M. Schennach (2010). Estimating the Technology of Cognitive and Non-cognitive Skill Formation. Econometrica 78 (3), 883–931.
Currie, J. and D. Almond (2011). Human Capital Development before Age Five. Handbook of Labor Economics 4, 1315–1486.
Del Boca, D., C. Flinn, and M. Wiswall (2014). Household Choices and Child Development. The Review of Economic Studies 81 (1), 137–185.
Elango, S., A. Hojman, J. L. García, and J. J. Heckman (2015). Early Education Programs and Skill Development in the US. The University of Chicago Unpublished Manuscript.
Flinn, C. and J. Heckman (1982). New Methods for Analyzing Structural Models of Labor Force Dynamics. Journal of Econometrics 18 (1), 115–168.
García, J. L. and J. J. Heckman (2015). Parenting, Skills, and Social Mobility. The University of Chicago Unpublished Manuscript.
García, J. L., J. Shea, and A. Hojman (2015). The Production System of Cognitive Skills. The University of Chicago Unpublished Manuscript.
Gormley Jr, W. T., T. Gayer, D. Phillips, and B. Dawson (2005). The Effects of Universal Pre-K on Cognitive Development. Developmental Psychology 41 (6), 872.
Gross, R. T., D. Spiker, and C. W. Haynes (1997). Helping Low-birth Weight, Premature Babies: the Infant Health and Development Program. Stanford University Press.
Hack, M., D. J. Flannery, M. Schluchter, L. Cartar, E. Borawski, and N. Klein (2002). Outcomes in Young Adulthood for Very-low-birth-weight Infants. New England Journal of Medicine 346 (3), 149–157.
Havnes, T. and M. Mogstad (2011). No Child Left Behind: Subsidized Childcare and Children’s Long-run Outcomes. American Economic Journal: Economic Policy, 97–129.
Heckman, J. J. (2008). Schools, Skills, and Synapses. Economic Inquiry 46 (3), 289–324.
Heckman, J. J. (2013). Giving Kids a Fair Chance. MIT Press.
Heckman, J. J. and S. Mosso (2014). The Economics of Human Development and Social Mobility.
Annual Review of Economics 6 (1), 689–733.
Heckman, J. J., R. Pinto, and P. A. Savelyev (2013). Understanding the Mechanisms Through Which an Influential Early Childhood Program Boosted Adult Outcomes. American Economic Review 103 (6), 1–35.
Huggett, M., G. Ventura, and G. Yaron (2011). Sources of Lifetime Inequality. American Economic Review 101, 2923–2954.
Jaumotte, F. (2003). Female Labour Force Participation: Past Trends and Main Determinants in OECD Countries. OECD Working Paper.
Karoly, L. A., P. W. Greenwood, S. S. Everingham, J. Houbé, and M. R. Kilburn (1998). Investing in our Children: What we know and don’t know about the Costs and Benefits of Early Childhood Interventions. Rand Corporation.
Karoly, L. A., M. R. Kilburn, and J. S. Cannon (2006). Early Childhood Interventions: Proven Results, Future Promise. Rand Corporation.
Kautz, T., J. J. Heckman, R. Diris, B. Ter Weel, and L. Borghans (2014). Fostering and Measuring Skills: Improving Cognitive and Non-cognitive Skills to Promote Lifetime Success. Technical report.
Keane, M. P., P. E. Todd, and K. I. Wolpin (2011). The Structural Estimation of Behavioral Models: Discrete Choice Dynamic Programming Methods and Applications. Handbook of Labor Economics 4, 331–461.
Keane, M. P. and K. I. Wolpin (1997). The Career Decisions of Young Men. Journal of Political Economy 105 (3), 473–522.
Kleibergen, F. and R. Paap (2006). Generalized Reduced Rank Tests using the Singular Value Decomposition. Journal of Econometrics 133 (1), 97–126.
Knudsen, E. I., J. J. Heckman, J. L. Cameron, and J. P. Shonkoff (2006). Economic, Neurobiological, and Behavioral Perspectives on Building America’s Future Workforce. Proceedings of the National Academy of Sciences 103 (27), 10155–10162.
Loeb, S., M. Bridges, D. Bassok, B. Fuller, and R. W. Rumberger (2007). How much is too much? The Influence of Preschool Centers on Children’s Social and Cognitive Development. Economics of Education Review 26 (1), 52–66.
Løken, K. V., M.
Mogstad, and M. Wiswall (2012). What Linear Estimators Miss: The Effects of Family Income on Child Outcomes. American Economic Journal: Applied Economics 4 (2), 1–35.
Magnuson, K. A., C. Ruhm, and J. Waldfogel (2007). Does Pre-kindergarten Improve School Preparation and Performance? Economics of Education Review 26 (1), 33–51.
Martin, J. A., B. E. Hamilton, M. J. Osterman, National Center for Health Statistics (US), et al. (2012). Three Decades of Twin Births in the United States, 1980-2009. US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics.
Matte, T. D., M. Bresnahan, M. D. Begg, and E. Susser (2001). Influence of Variation in Birth-weight within Normal Range and within Sibships on IQ at Age 7 years: Cohort Study. BMJ 323 (7308), 310–314.
OECD (2012). OECD Country Note.
Sanderson, E. and F. Windmeijer (2013). A Weak Instrument F-test in linear IV Models with Multiple Endogenous Variables. Technical report.
Sparling, J., I. Lewis, and C. Ramey (1984). Partners for learning. Kaplan, Lewisville, NC.
Sparling, J., I. Lewis, C. T. Ramey, B. H. Wasik, D. M. Bryant, and L. M. LaVange (1991). Partners a Curriculum to Help Premature, Low Birth-weight Infants Get off to a Good Start. Topics in Early Childhood Special Education 11 (1), 36–55.
Stock, J. H. and M. Yogo (2005). Testing for Weak Instruments in Linear IV Regression. Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg.
Strauss, R. S. (2000). Adult Functional Outcome of those Born Small for Gestational Age: Twenty-six–year Follow-up of the 1970 British Birth Cohort. Journal of the American Medical Association 283 (5), 625–632.
Todd, P. E. and K. I. Wolpin (2003). On the Specification and Estimation of the Production Function for Cognitive Achievement. The Economic Journal 113 (485), F3–F33.

A Data

The following two subsections contain variable descriptions.
Table A.1 describes in detail each variable in the paper. Table A.2 and Figure A.1 analyze item non-response in each of these variables.

A.1 Construction

Table A.1: Variable Details and Construction

| ID | Variable | Description |
|----|----------|-------------|
| | Baseline | |
| 1 | Birth-weight | Indicates the birth-weight in pounds of the child |
| 2 | Black indicator | Indicates whether the child is black |
| 3 | Hispanic indicator | Indicates whether the child is Hispanic |
| 4 | Gestational age | Indicates the gestational age of the child in weeks |
| 5 | Light low birth-weight indicator | Indicates whether the child had a low birth-weight (takes the value 1 when birth-weight < 4.42 pounds) |
| 6 | Male indicator | Indicates whether the child is a male |
| 7 | Treatment group indicator | Indicates whether the child is in the treatment group |
| 8 | Twin indicator | Indicates whether the child has a twin sibling in IHDP |
| 9 | IHDP twin cluster number (for bootstrap) | Indicates the child’s IHDP cluster number (for bootstrap) |
| 10 | Interaction between treatment and twin variables | Indicates whether the child is in the treatment group and is a twin |
| | Mother | |
| 11 | Mother married at age 0 | Indicates whether the mother is married when child is 0 years of age |
| 12 | Mother’s age at age 0 | Indicates the mother’s age, when child is 0 years of age |
| 13 | Mother’s years of education at age 1 | Indicates the mother’s years of education, when child is 1 year of age |
| | State | |
| 14 | AR indicator | Indicates that the child was born in Arkansas |
| 15 | CT indicator | Indicates that the child was born in Connecticut |
| 16 | FL indicator | Indicates that the child was born in Florida |
| 17 | MA indicator | Indicates that the child was born in Massachusetts |
| 18 | NY indicator | Indicates that the child was born in New York |
| 19 | PA indicator | Indicates that the child was born in Pennsylvania |
| 20 | TX indicator | Indicates that the child was born in Texas |
| 21 | WA indicator | Indicates that the child was born in Washington |
| | Home Environment | |
| 22 | Language stimulation, age 3 | Indicates the score the home obtained for language stimulation of the child at age 3 |
| 23 | Learning stimulation, age 3 | Indicates the score the home obtained for learning stimulation of the child at age 3 |
| 24 | Modeling, age 3 | Indicates the score the home obtained for modeling for the child at age 3 |
| 25 | Number of children dependent age 0-18 | Indicates the number of dependent children (less than 18) in the household, when child is 40 weeks of age |
| 26 | Number of dependent adults (between 18 and 63) at individual’s 40 weeks of age | Indicates the number of dependent adults under 63 years old in the household, when child is 40 weeks of age |
| 27 | Number of dependent old adults (above 63) at individual’s 40 weeks of age | Indicates the number of dependent adults post-retirement age (over 63 years old) in the household, when child is 40 weeks of age |
| 28 | Number of employed adults (over 18 years-old), non-child, age 40 weeks | Indicates the number of employed adults over 18 years old in the household, when child is 40 weeks of age |
| 29 | Number of siblings at home at age 40 weeks | Indicates the number of siblings in the household at 40 weeks of age |
| 30 | Number of times moved by 4 months of age | Indicates the number of times the child had moved by 4 months of age |
| 31 | Organization, age 1 | Indicates the score the home obtained for the organization of the physical environment at child’s age 1 |
| 32 | Response, age 1 | Indicates the score the parents obtained for their level of responsiveness to the child at age 1 |
| 33 | Restriction, age 1 | Indicates the score the parents obtained for their level of restrictions imposed on the child at age 1 |
| 34 | Welfare received at 4 months of age | Indicates the welfare received in the household, when child is 4 months of age |
| | Skill Production Inputs (Age 1–3) | |
| 35 | Average hours per week in child care, percentiles | Indicates the child’s average hours per week in child care by percentile |
| 36 | HOME Inventory, Selected Items | Selected Items in the Home, Ages 1 and 3 |
| | Schooling (Age 18) | |
| 37 | Counseling, age 18 | Indicates whether the child receives counseling at 18 years old |
| 38 | Math tutor, age 18 | Indicates whether the child works with a math tutor at 18 years old |
| 39 | Number of days absent, age 18 | Indicates the number of days absent from school over the last year at 18 years old |
| 40 | Reading tutor, age 18 | Indicates whether the child works with a reading tutor at 18 years old |
| 41 | Therapy, age 18 | Indicates whether the child receives therapy at 18 years old |
| | Behavior (Age 18) | |
| 42 | Took SAT | Indicates whether the child has taken the SAT or ACT at 18 years old |
| 43 | Teen pregnancy | Indicates whether the subject became pregnant as a teenager |
| 44 | Cigarettes | Indicates the number of cigarettes smoked per day |
| 45 | Idle | Indicates whether the subject was idle |
| | Cognitive Skills | |
| 46 | IQ Age 3 | Stanford–Binet IQ Test |
| 47 | IQ Age 5 | Wechsler Preschool and Primary Scale of Intelligence (WPPSI) for Children at age 5 |
| 48 | IQ Age 8 | Wechsler Intelligence Scale for Children (WISC) at age 8 |
| 49 | IQ Age 18 | Peabody Picture Vocabulary Test (PPVT) at age 18 |
| | Economy | |
| 50 | Average state employment rate, 1985-1988 | Indicates the average state employment rate, during 1985-1988 |
| 51 | Average state government expenditure per capita, 1985-1988, 2014 USD | Indicates the average state government expenditure per capita in thousand 2014 USD, during 1985-1988 |
| 52 | Average state median income, 1985-1988, thousands, 2014 USD | Indicates the average state median income in thousands 2014 USD, during 1985-1988 |

Note: this table details the origin and construction of each variable I use in the paper.

A.2 Item Non-response

Figure A.1: Item Non-Response
[Figure: proportion of non-response (vertical axis) by variable ID (horizontal axis).]

Note: this figure shows the proportion of individuals for whom I do not have information in each of the variables in Table A.1. The first column of Table A.1 gives the identifier of each variable, and I use this identifier to label the plot. The initial sample size is 1,090.
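The item non-response proportions plotted in Figure A.1 are simple shares of missing values. A minimal sketch of the computation follows, assuming missing answers are coded as `None`; the variable names and the toy records are hypothetical and are not taken from the IHDP data.

```python
from typing import Dict, Optional, Sequence


def nonresponse_share(values: Sequence[Optional[float]]) -> float:
    """Proportion of individuals with no recorded value (None) for one variable."""
    if not values:
        raise ValueError("variable has no observations")
    return sum(v is None for v in values) / len(values)


def response_share_by_group(
    values: Sequence[Optional[float]], treated: Sequence[int]
) -> Dict[int, float]:
    """Mean of the response-availability indicator for control (0) and treatment (1)."""
    shares = {}
    for group in (0, 1):
        indicators = [v is not None for v, t in zip(values, treated) if t == group]
        shares[group] = sum(indicators) / len(indicators)
    return shares


# Hypothetical toy records: four children, two variables.
treated = [1, 0, 1, 0]
toy = {
    "iq_age3": [98.0, None, 101.0, 87.0],
    "took_sat": [1.0, None, None, 0.0],
}

# One non-response share per variable, as plotted in a figure like A.1.
shares = {name: nonresponse_share(column) for name, column in toy.items()}

# Response availability by treatment status for a single variable.
by_group = response_share_by_group(toy["iq_age3"], treated)
```

The second function returns group means of a response-availability indicator, which is the kind of quantity a treatment-control comparison of non-response would start from.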
Table A.2: Item Non-response by Variable

| Variable | Control Mean | Control SD | Treatment Mean | Treatment SD | Treatment - Control Mean Difference | Pr(\|t\| = 0) |
|----------|--------------|------------|----------------|--------------|-------------------------------------|---------------|
| Baseline | | | | | | |
| Twin | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Male | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Birth-weight | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Gestational Age | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Black | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Mother | | | | | | |
| Black | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Age | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Works in Birth-year | 0.9403 | 0.2371 | 0.9381 | 0.2413 | 0.0077 | 0.2266 |
| Married in Birth-year | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Father | | | | | | |
| Black | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Household | | | | | | |
| Welfare in Birth-year | 0.9478 | 0.2227 | 0.9405 | 0.2369 | 0.0000 | 0.9615 |
| Moved in Birth-year | 0.9403 | 0.2371 | 0.9310 | 0.2538 | 0.0000 | 0.9193 |
| Siblings at Birth | 0.9806 | 0.1380 | 0.9667 | 0.1797 | 0.0000 | 1.0000 |
| Dependents Aged 63+ at Birth | 0.9821 | 0.1327 | 0.9690 | 0.1734 | 0.0000 | 1.0000 |
| Adult Dependents at Birth | 0.9821 | 0.1327 | 0.9690 | 0.1734 | 0.0000 | 1.0000 |
| Employed Adults at Birth | 0.9821 | 0.1327 | 0.9690 | 0.1734 | 0.0000 | 1.0000 |
| Skill Production Inputs (Age 1–3) | | | | | | |
| Avg. Weekly Hours in Childcare | 0.9448 | 0.2286 | 0.9429 | 0.2324 | -0.0002 | 0.9917 |
| Parental Investment | 0.9269 | 0.2606 | 0.9310 | 0.2538 | 0.0031 | 0.8321 |
| Schooling (Age 18) | | | | | | |
| Days Absent in Last Year | 0.5343 | 0.4992 | 0.5690 | 0.4958 | 0.0211 | 0.5257 |
| Therapy | 0.5597 | 0.4968 | 0.5881 | 0.4928 | 0.0170 | 0.6064 |
| Counseling | 0.5597 | 0.4968 | 0.5810 | 0.4940 | 0.0088 | 0.8183 |
| Reading Tutor | 0.5522 | 0.4976 | 0.5857 | 0.4932 | 0.0227 | 0.4945 |
| Math Tutor | 0.5537 | 0.4975 | 0.5881 | 0.4928 | 0.0236 | 0.4606 |
| Behavior (Age 18) | | | | | | |
| Took SAT | 0.6075 | 0.4887 | 0.6310 | 0.4831 | 0.0184 | 0.6055 |
| Teen Parent | 0.6045 | 0.4893 | 0.6333 | 0.4825 | 0.0240 | 0.5000 |
| Smokes Tobacco | 0.6119 | 0.4877 | 0.6381 | 0.4811 | 0.0221 | 0.5339 |
| Idle | 0.6328 | 0.4824 | 0.6667 | 0.4720 | 0.0286 | 0.4110 |
| Cognitive Skills | | | | | | |
| Stanford-Binet IQ, Age 3 | 0.9239 | 0.2654 | 0.9214 | 0.2694 | 0.0020 | 0.9193 |
| WPPSI, Age 5 | 0.8104 | 0.3922 | 0.8286 | 0.3773 | 0.0199 | 0.4376 |
| WISC, Age 8 | 0.8806 | 0.3245 | 0.8952 | 0.3066 | 0.0212 | 0.2963 |
| PPVT, Age 18 | 0.5463 | 0.4982 | 0.5833 | 0.4936 | 0.0299 | 0.3697 |
| Economy | | | | | | |
| Employment | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Median Income | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |
| Government Expenditure | 1.0000 | 0.0000 | 1.0000 | 0.0000 | NA | |

Note 1 (general): this table provides details on item non-response. For each variable, I create an indicator of response availability. Then, I regress it on a treatment indicator and Control Set 3 in Table 3.
Note 2 (inference): p-values are bootstrap, non-parametric, i.e. they represent the proportion of non-rejections of the null hypothesis in 1,000 draws. They are stratified by site and by a low-low birth-weight indicator (< 2,000 grams), emulating the random assignment of treatment.

B Additional Details on Randomization

Table B.1: The Screening Process, IHDP

| Category | Families | Share of Screened |
|----------|----------|-------------------|
| Total Screened | 4,551 | 100% |
| Initial Exclusions | 2,790 | 61% |
| Residence outside of catchment area | 1,524 | |
| Gestational age > 37 weeks | 604 | |
| Hospital discharge outside of recruitment period | 431 | |
| Sibling of an eligible twin | 140 | |
| Multiple birth > two, or sibling of ineligible twin | 53 | |
| Erroneously coded by sites as eligible | 19 | |
| Other (e.g. family moving, site quota met, enrolled in another program) | 19 | |
| Infant Exclusions | 294 | 6% |
| Death | 233 | |
| Chromosome multiple anomaly syndrome | 25 | |
| Recipient of oxygen for > 90 days | 19 | |
| Extended hospitalization for ≥ 60 days | 9 | |
| Neural tube defect | 3 | |
| Severe neurological abnormality | 3 | |
| Severe sensory defect | 2 | |
| Maternal Exclusions | 165 | 4% |
| Unable to participate in the program in English | 108 | |
| Maternal report of drug/alcohol abuse | 51 | |
| Maternal report of psychiatric hospitalization | 6 | |
| Total Exclusions | 3,249 | 71% |
| Total Eligible | 1,302 | 29% |

Note: this table describes the screening process at the moment of birth in the Infant Health and Development Program. The unit of observation is the family. Thus, each observation represents either a single child or a twin pair. Source: adapted from Gross et al. (1997).
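As a quick consistency check on Table B.1, the three exclusion categories add up to the reported total exclusions, and the remainder of the screened families matches the number reported as eligible. The arithmetic below uses only the counts in the table:

```latex
\begin{align*}
2{,}790 + 294 + 165 &= 3{,}249, \\
4{,}551 - 3{,}249 &= 1{,}302, \\
\frac{3{,}249}{4{,}551} &\approx 0.714 \approx 71\%, \qquad
\frac{1{,}302}{4{,}551} \approx 0.286 \approx 29\%.
\end{align*}
```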
Table B.2: The Enrollment Process, IHDP

| Category | Families | Share of Eligible |
|----------|----------|-------------------|
| Total Eligible | 1,302 | 100% |
| Family Refused Consent | 274 | 21% |
| Do not want day care | 73 | |
| Not interested, no specific reason given | 48 | |
| Program not needed, no special needs perceived | 33 | |
| Too busy, program too much work | 28 | |
| Do not want to participate in research | 21 | |
| Family pressure not to participate | 13 | |
| Have other source of medical follow-up, do not want to duplicate | 12 | |
| Three years too long a commitment | 10 | |
| Too far to travel | 10 | |
| Program perceived as an invasion of privacy | 11 | |
| Could not be located to request consent | 8 | |
| Want to continue in other program | 5 | |
| Other | 2 | |
| Group Assignment Could Not be Presented | 26 | 2% |
| Family Rejected Treatment Group Assignment | 17 | 1.3% |
| Total Not Enrolled | 317 | 24.3% |
| Total Enrolled | 985 | 75.7% |

Note: this table describes the enrollment process at the moment of birth in the Infant Health and Development Program. The unit of observation is the family. Thus, each observation represents either a single child or a twin pair. Source: adapted from Gross et al. (1997).

C Material Used in the Childcare Centers and Home Visits

Figure C.1: Problem Solving Planning Sheet Example 1, IHDP
Note: this figure shows an example of the problem solving planning sheet filled during home visits in the Infant Health and Development Program. Source: adapted from Gross et al. (1997).

Figure C.2: Problem Solving Planning Sheet Example 2, IHDP
Note: this figure shows an example of the problem solving planning sheet filled during home visits in the Infant Health and Development Program. Source: adapted from Gross et al. (1997).

D Administrative Organization of IHDP

Figure D.1 displays the structure of the Infant Health and Development Program (IHDP), which provides the data I use in this paper.15

Figure D.1: The Administrative Structure of IHDP
Note: this figure displays the administrative structure of the Infant Health and Development Program. Source: adapted from Gross et al. (1997).
• The Robert Wood Johnson Foundation: the major funder, which appointed the directors of the National Study Office and the Program Development Office. It also appointed the National Advisory Committee. Other funders were the Bureau of Maternal and Child Health and Resources Development, US Public Health Service, National Institute of Child Health and Human Development, the Pew Charitable Trusts, and the Center for the Study of Families and Children at Stanford University.
• National Advisory Committee: oversight body of the study. Reviewed, on a competitive basis, the applications of all universities and/or hospitals that applied to be a treatment site. Tracked the progress of the program and provided advice and support on major issues regarding study design, program implementation, and ethical and social policy considerations. Location: Stanford University. Director: Ruth T. Gross.
• National Study Office: in charge of formulating the research plan and protocol, randomly assigning subjects, implementing the evaluation, and managing the data.
• Program Development Office: development of the curriculum materials, and initial and ongoing training of the intervention staff at the eight participating sites. Monitored and thoroughly documented the delivery of the program to ensure homogeneity across sites. Location: University of North Carolina, Chapel Hill. Director: Craig T. Ramey.

15 The information in this section is largely based on Gross et al. (1997, Chapter 26), which provides much more extensive details.

E Consequences of Low Birth-weight

Low birth-weight results in high initial medical costs and a high risk of premature death. Additionally, researchers have consistently found lower educational attainment, poorer self-reported health status, and reduced employment and earnings among adults born with low birth-weight, relative to their normal-weight counterparts (Almond et al., 2004). The main findings in the literature agree on the effects in the short and medium term.
However, it has been difficult to track the long-term effects or their consequences during adulthood. Hack et al. (2002) find an educational disadvantage associated with very low birth weight that persists into early adulthood, including lower educational attainment; they also report lower rates of pregnancy and even lower use of alcohol and drugs than among those with normal birth weight. Strauss (2000) studies the 1970 British birth cohort, which was followed up 5, 10, 16, and 26 years after birth. The main results suggest that at 5, 10, and 16 years of age, those born small for gestational age showed deficits in academic achievement compared with those born at normal birth weight. At age 26, there is no difference in years of education, employment, hours of work per week, marital status, or satisfaction with life. Matte et al. (2001) examine the relation between birth weight and measured intelligence at age 7 among children within the normal range of birth weight and among siblings. The main finding is that IQ at age 7 is linearly related to birth weight among children of normal birth weight. Black et al. (2005) evaluate the long-term effects of low birth weight on the labor market. The main results suggest sizeable effects on long-term outcomes, such as labor market outcomes, for children from poorer families. In this sense, Behrman and Rosenzweig (2004) find evidence that increasing birth weight raises adult schooling attainment and adult height for babies at most levels of birth weight, but has no effect on adult body mass. Furthermore, they argue that investing in increasing weight among low-birth-weight babies has significant labor market payoffs. Understanding birth weight as a consequence of in-utero health is key for developing hypotheses on its effects throughout life (Almond and Currie, 2011).
Research on this subject offers a wide range of opportunities: there is a strong need to strengthen the academic evidence on the relationship between low birth weight and its long-run consequences.