ticipation is having positive effects on those served and, if so, why. Knowing that an intervention met its goals is a step in the right direction, but the real need is to move from this general conclusion to specific conclusions about which aspects of the intervention have which effects, to what degree, and under which circumstances (Rutter et al., in press). Assessing program impact on those served is once again a causal question. As in more basic research, threats to valid causal inference arise. Selection bias occurs when characteristics that predict program participation are also associated with outcomes. Simultaneity bias can also occur, especially when program activities and outcomes are studied over time.
There is, however, an important distinction between the kinds of causal questions that arise in basic research and those that arise in program evaluations. Basic developmental science typically assesses causal connections between events that unfold naturally over time. Opportunities for experimentation are limited. In contrast, program evaluations assess the effects of deliberately planned interventions: activities that would not occur in the absence of a new policy. For this reason, it is often more plausible, both ethically and logistically, to conduct experiments in program evaluation than in basic developmental science.
Such experimentation not only provides strong causal inferences about the impact of the program, but it can also provide new insights of great relevance to basic research. Experiments such as the Infant Health and Development Program (Gross et al., 1997), the High/Scope Project (Schweinhart et al., 1993; see Chapter 13) study of the long-term effects of high-quality child care, the Abecedarian Program (Campbell and Ramey, 1994, 1995), and the Nurse Home Visitation Program (Olds et al., 1986, 1999) provide a wealth of knowledge about how early environmental enrichment affects short- and long-term cognitive and social development, knowledge that would otherwise be unavailable. Experimental evaluations have also shown that some promising ideas do not appear to translate into better childhood outcomes, a result that requires deeper reflection on the validity of the theory behind the program, as well as on program implementation. This interplay between basic and applied research is essential to the vitality of the field.
This discussion may seem to imply that all program evaluations should be randomized experiments. Although we strongly suspect that randomized experiments are underutilized in program evaluation research, they are not the right tool for addressing all questions about interventions, and special conditions must hold before a randomized experiment is feasible or desirable. Moreover, even well-planned experiments can unravel.
Randomized experiments are of use when a clearly stated causal question is on the table. Many program evaluations are designed to answer other kinds of questions. Early in the life of a new program, the key