From Neurons to Neighborhoods: The Science of Early Childhood Development

B

Defining and Estimating Causal Effects

What is a causal effect? Many discussions of causal inference and research design neglect to confront this issue. However, a theory that has come to dominate modern thinking in statistics about cause begins with this fundamental question. Pioneered by Rubin (1976) and Rosenbaum and Rubin (1983) and elaborated by Holland (1986), this theory has come to be known as the Rubin-Rosenbaum-Holland (RRH) theory of causal inference, although its roots can be traced to much earlier work on experimentation (e.g., Fisher, 1918; Cochran, 1965). To describe this theory, the simplest case will suffice: we have a causal variable (the treatment) with two possible values (experimental and control). For clarity, let us consider a case in which a child will receive either the new experimental approach to day care (call it E) or the currently available approach (call it C) and the outcome will be a measure of self-regulation. If the child were to receive E, the child would experience the outcome under E. However, if the child were to receive C, that same child would experience the outcome under C. The causal effect of the experimental treatment (relative to the control) is defined by a comparison between how that child would fare under E versus C. For example, we might define this treatment effect for that child as

Treatment effect = (Outcome under E) minus (Outcome under C),

that is, the difference between the outcome a child would receive if assigned to treatment E and the outcome that same child would receive if assigned to treatment C.¹ These two outcomes are called potential outcomes. Children are viewed, then, as having potential outcomes, only some of which will ever be realized.

Several conclusions follow from this definition. First, the causal effect is defined uniquely for each child. The impact of the treatment can thus vary from child to child. Modern thinking about cause thus rejects the conventional assumption that a new treatment adds a constant effect for every child. This assumption, never realistic to scientists or practitioners, was historically made to simplify statistical analysis.

Second, the causal effect cannot be observed. If a given child is assigned to E, we will observe the outcome under E but not the outcome under C for that child. But if the child is assigned to C, we will observe the outcome under C but not the outcome under E. Holland (1986) refers to the fact that only one of two potential outcomes can be observed as the fundamental problem of causal inference.

Third, although a given child will ultimately receive only one treatment, say, treatment E, it must be reasonable at least to imagine a scenario in which that child could have received C. And similarly, even though another child received C, it must be reasonable to imagine a scenario in which that child had received E. If it is not possible to conceive of each child's response under each treatment, then it is not possible to define a causal effect. There must, then, be a road not taken that could have been taken, for each child. Thus, both the outcome under E and the outcome under C must exist in principle even if both cannot be observed in practice. Therefore, in current thinking about cause in statistical science, a fixed attribute of a child (say sex or ethnic background) cannot typically be a cause.
We cannot realistically imagine how a girl would have responded if she had been a boy or how a black child would have responded if that child had been white. Epidemiologists refer to such attributes as fixed markers (Kraemer et al., 1997), unchangeable attributes that are statistically related to an outcome but do not cause the outcome.

¹ The causal effect could also be defined as the ratio Y_i(E)/Y_i(C), where Y_i(E) and Y_i(C) denote child i's outcomes under E and C, depending on the scale of Y, but we limit this discussion to causal effects as differences for simplicity.

This theory of causation provides new insights into why randomized experiments are valuable. It also provides a framework for how to think about the problem of causal inference when randomized experiments are not possible. According to the RRH theory, the problem of causal inference is a problem of missing data. If both potential outcomes were observable, the causal effect could be directly calculated for each participant. But one of the potential outcomes is inevitably missing. If the data were missing completely at random, we could compute an unbiased estimate of the average causal effect for any subgroup. A randomized experiment ensures just that: the missing datum is missing completely at random, which permits unbiased estimation of the average treatment effect.

Suppose, by contrast, that E or C could be selected by each child's parents. Suppose further that more-advantaged parents tended to choose E while less-advantaged parents tended to choose C. Then the potential outcomes would be nonrandomly missing. The outcome under E would come to be observed more often for advantaged than for disadvantaged children. Selection bias is thus a problem of nonrandomly missing data.

Even more insidiously, suppose that some parents had previous knowledge about how well their child is likely to fare under the new day care program. For example, one parent might know that, without the new program, her child will be cared for by the paternal grandmother, who is known to be a master teacher of young children. Thus, this parent decides not to participate in the new day care program, knowing that the child will probably do better without it. Other parents, who know their families do not include talented teachers with time to care for their child, choose the new program. Such information is rarely available to researchers, yet it produces nonrandomly missing data.

We view the probability of assignment to E to be the propensity to receive the experimental treatment or simply “the propensity score” (Rosenbaum and Rubin, 1983). Under random assignment to treatments, the propensity score is independent of the potential outcomes. In the hypothetical case above, by contrast, family advantage is related both to the propensity score and to the potential outcomes. This creates a correlation between the propensity and the potential outcomes.
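
The missing-data view of selection bias can be illustrated with a small simulation. What follows is a minimal sketch and not part of the original text: the outcome scale, the true average effect of 3 points, the 80/20 choice probabilities, and every function and variable name are hypothetical assumptions chosen for illustration. Because the data are simulated, both potential outcomes are known for every child, which is never the case in real data.

```python
import random
from statistics import mean


def simulate_children(n, seed=0):
    """Create hypothetical children, each with BOTH potential outcomes."""
    rng = random.Random(seed)
    children = []
    for _ in range(n):
        advantage = rng.gauss(0.0, 1.0)            # family advantage (assumed)
        y_c = 50.0 + 5.0 * advantage + rng.gauss(0.0, 2.0)  # outcome under C
        effect = 3.0 + rng.gauss(0.0, 1.0)         # effect varies by child
        children.append(
            {"advantage": advantage, "y_c": y_c, "y_e": y_c + effect}
        )
    return children


def observed_difference(children, assigned_to_e):
    """Mean observed outcome in E minus mean observed outcome in C.

    Only one potential outcome per child is used, as in real data.
    """
    y_e = [c["y_e"] for c, e in zip(children, assigned_to_e) if e]
    y_c = [c["y_c"] for c, e in zip(children, assigned_to_e) if not e]
    return mean(y_e) - mean(y_c)


children = simulate_children(20000)

# Knowable only in a simulation: the average of the child-level effects.
true_ate = mean(c["y_e"] - c["y_c"] for c in children)

rng = random.Random(1)

# Randomized experiment: a coin flip, unrelated to the potential outcomes.
randomized = [rng.random() < 0.5 for _ in children]

# Parental choice: more-advantaged families choose E more often (assumed
# probabilities 0.8 versus 0.2).
self_selected = [
    rng.random() < (0.8 if c["advantage"] > 0 else 0.2) for c in children
]

randomized_estimate = observed_difference(children, randomized)
self_selected_estimate = observed_difference(children, self_selected)

print(f"true average effect:      {true_ate:.2f}")
print(f"randomized estimate:      {randomized_estimate:.2f}")
print(f"parental-choice estimate: {self_selected_estimate:.2f}")
```

Under these assumptions the randomized comparison should land close to the true average effect, while the parental-choice comparison overstates it, because advantaged children (who would have scored higher under either treatment) are over-represented in E.
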
Now suppose that it is impossible to conduct a randomized experiment but it is possible to determine exactly how family circumstances translate into propensity, that is, how families get selected into the treatment. We could then implement a statistical procedure:

1. For every possible participant, predict the propensity of being in the experimental group.
2. Divide all sample members into subgroups having the same propensity.
3. Within each subgroup, compute the mean difference between those in E and C as the average treatment effect for that group.²
4. Average these treatment effects across all subgroups to estimate the overall average treatment effect.

² In a variant of this procedure devised by Robins, Greenland, and Hu (1999), sample weights are computed that are inversely proportional to the propensity of receiving the treatment actually received. Experimental and control groups are then compared with respect to their weighted means. This procedure down-weights persons with the strongest propensity to receive the treatment they received and eliminates bias in estimating treatment effects when the propensity is accurately predicted. The method has especially useful applications when the treatments are time-varying.
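
The subgroup procedure above, and the weighting variant described in the footnote, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the data are simulated, the propensity is taken as known exactly and limited to three values (0.2, 0.5, 0.8), subgroup effects are averaged in proportion to subgroup size, and the weighting function is a simple normalized inverse-propensity comparison rather than the full method of Robins, Greenland, and Hu. All names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def simulate(n, seed=0):
    """Simulate children whose propensity depends on family advantage."""
    rng = random.Random(seed)
    propensity_by_advantage = {0: 0.2, 1: 0.5, 2: 0.8}  # assumed known
    rows = []
    for _ in range(n):
        advantage = rng.choice([0, 1, 2])
        p = propensity_by_advantage[advantage]
        in_e = rng.random() < p
        y_c = 50.0 + 4.0 * advantage + rng.gauss(0.0, 2.0)
        y_e = y_c + 3.0 + advantage   # effects of 3, 4, 5; average near 4
        # Only the outcome under the treatment received is recorded.
        rows.append({"propensity": p, "in_e": in_e, "y": y_e if in_e else y_c})
    return rows


def stratified_estimate(rows):
    """Average the within-subgroup E-minus-C differences over subgroups."""
    strata = defaultdict(lambda: {True: [], False: []})
    for r in rows:
        strata[r["propensity"]][r["in_e"]].append(r["y"])
    total = len(rows)
    estimate = 0.0
    for groups in strata.values():
        size = len(groups[True]) + len(groups[False])
        effect = mean(groups[True]) - mean(groups[False])
        estimate += effect * size / total   # weight by subgroup size
    return estimate


def weighted_mean(pairs):
    """Mean of outcomes in (weight, outcome) pairs, normalized by weight."""
    return sum(w * y for w, y in pairs) / sum(w for w, _ in pairs)


def ipw_estimate(rows):
    """Compare weighted means; weight = 1 / P(treatment actually received)."""
    e = [(1.0 / r["propensity"], r["y"]) for r in rows if r["in_e"]]
    c = [(1.0 / (1.0 - r["propensity"]), r["y"]) for r in rows if not r["in_e"]]
    return weighted_mean(e) - weighted_mean(c)


rows = simulate(30000)
naive = (mean(r["y"] for r in rows if r["in_e"])
         - mean(r["y"] for r in rows if not r["in_e"]))
stratified = stratified_estimate(rows)
weighted = ipw_estimate(rows)

print(f"naive difference:    {naive:.2f}")
print(f"stratified estimate: {stratified:.2f}")
print(f"weighted estimate:   {weighted:.2f}")
```

In this simulated setting both adjusted estimates should fall near the assumed average effect of 4, while the naive difference is inflated by the over-representation of advantaged children in E. With real data the propensity must itself be estimated, and the result is only as good as that estimate.
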

The estimate resulting from this subgroup procedure will be an unbiased estimate of the average treatment effect. Every comparison between those in E and those in C involves subsets of children having identical propensities to experience E. Therefore, the potential outcomes of the children compared cannot be associated with their propensities, and the estimates of the treatment effect will be unbiased. This procedure also makes it easy to estimate separate treatment effects for each subgroup.

When children are matched on propensity scores, the validity of the causal estimate depends strongly on the investigator's knowledge of the factors that affect the propensity to experience E versus C. More specifically, if some unknown characteristic of the child predicts the propensity to be in E versus C, and if that characteristic also is associated with the potential outcomes, then the estimate of the treatment effect based on propensity score matching will be biased. The assumption that no such confounding variable exists is a strong assumption. It is the responsibility of the investigator to collect the relevant background data and to provide sound arguments based on theory and data analysis that the relevant predictors of propensity have been controlled. Even then, doubts will remain in the minds of some readers. In contrast, all possible predictors of propensity are controlled in a randomized experiment, including those that would have escaped the attention of the most thoughtful investigator. Rosenbaum (1995) describes procedures for examining the sensitivity of causal inferences to lack of knowledge about propensity when randomization is impossible.

Perhaps the most common strategy for approximating unbiased causal inference in nonexperimental settings is the use of statistical adjustments.
In early childhood research, it is very common to use linear models (regression, analysis of variance, structural equation models) to adjust estimates of treatment impact for covariates related to the outcome. These covariates must be pretreatment characteristics of the child or the setting, and the aim is to include all confounders in the set of covariates controlled. By statistically “holding constant” the confounders in assessing treatment impact, one aims to approximate a randomized experiment. Under some assumptions, this strategy will work. In particular, if the propensity score (the probability of receiving treatment E) is a linear function of the covariates used in the model, then this adjustment strategy will provide an unbiased estimate of the treatment effect. Aside from the possible fragility of this assumption, this strategy is limited in that only a relatively small set of covariates may be included in the model. In the propensity score matching procedure described earlier, it is possible (and advisable) to use as many covariates as one can obtain in the analysis that predicts propensity.
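
The linear adjustment strategy can be sketched on simulated data in which the stated condition holds by construction. This is a hedged illustration, not a recommended analysis: it assumes a single pretreatment covariate, a propensity that is exactly linear in that covariate (0.2 + 0.6x), a constant treatment effect of 3 points (a simplification, since the text stresses that effects vary across children), and a baseline outcome that is deliberately not linear in the covariate. All names are hypothetical. The treatment coefficient is obtained by removing the covariate's linear contribution from both treatment and outcome and then regressing one set of residuals on the other (the Frisch-Waugh-Lovell result), which yields the same coefficient as fitting the two-predictor regression directly.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def adjusted_effect(treatment, covariate, outcome):
    """Treatment coefficient from regressing outcome on treatment + covariate."""
    t_res = residuals(covariate, treatment)
    y_res = residuals(covariate, outcome)
    slope, _ = fit_line(t_res, y_res)
    return slope


def simulate(n, seed=0):
    rng = random.Random(seed)
    treatment, covariate, outcome = [], [], []
    for _ in range(n):
        x = rng.random()                       # pretreatment covariate in [0, 1)
        p = 0.2 + 0.6 * x                      # propensity linear in x (assumed)
        t = 1.0 if rng.random() < p else 0.0   # 1.0 = E, 0.0 = C
        y = 50.0 + 10.0 * x * x + 3.0 * t + rng.gauss(0.0, 2.0)
        treatment.append(t)
        covariate.append(x)
        outcome.append(y)
    return treatment, covariate, outcome


treatment, covariate, outcome = simulate(20000)

unadjusted = (mean(y for t, y in zip(treatment, outcome) if t == 1.0)
              - mean(y for t, y in zip(treatment, outcome) if t == 0.0))
adjusted = adjusted_effect(treatment, covariate, outcome)

print(f"unadjusted difference: {unadjusted:.2f}")
print(f"adjusted estimate:     {adjusted:.2f}")
```

In this setup the adjusted estimate should sit near the assumed effect of 3 even though the outcome is not linear in the covariate, because the propensity is. If the propensity were not linear in the covariates, or if a confounder were left out of the model, the adjusted estimate would carry no such guarantee.
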