study, however, may have more to do with the particular characteristics of the agency or personnel involved than with the strengths or weaknesses of the program itself. Note, for example, the variation Braga (2003) found in the effects of hot spots policing across five randomized control group studies. Similarly, a strong program impact in one jurisdiction may not carry over to others that have offenders or victims drawn from different ethnic communities or socioeconomic backgrounds (Berk, 1992; Sherman, 1992). This does not mean that single-site studies cannot be useful for drawing conclusions about program effects or developing policy, only that caution must be used to avoid overgeneralizing their significance.
Such circumstances highlight the importance of conducting multiple studies and integrating their findings so that meaningful conclusions can be drawn. The most common technique for integrating results from impact evaluation studies is meta-analysis or systematic review (Cooper, 1998). Meta-analysis pools multiple studies in a specific area of interest into a single analysis in which each study is treated as an independent observation. Its main advantage over traditional narrative reviews is that it yields an estimate of the average size of the intervention effect over a large number of studies while also allowing analysis of the sources of variation in those effects across studies (Cooper and Hedges, 1994; Lipsey and Wilson, 2001).
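The pooling step described above can be illustrated with a minimal sketch. It assumes a fixed-effect model with inverse-variance weights, which is only one of several common ways to combine effect sizes and is not necessarily the approach taken in the works cited. The function name `fixed_effect_meta` and the three effect sizes and variances are hypothetical and chosen purely for illustration. The sketch returns a pooled average effect, its standard error, and Cochran's Q, a standard statistic for gauging whether effects vary across studies more than sampling error alone would suggest.

```python
import math


def fixed_effect_meta(effects, variances):
    """Pool study effect sizes using inverse-variance weights.

    Assumes a fixed-effect model: each study estimates the same
    underlying effect, and more precise studies receive more weight.
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("effects and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Each study is one observation, weighted by its precision (1 / variance).
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)

    # Weighted average effect size across studies.
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight

    # Standard error of the pooled estimate.
    se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))

    return {"pooled": pooled, "se": se, "q": q, "df": len(effects) - 1}


# Hypothetical effect sizes and sampling variances from three studies.
result = fixed_effect_meta([0.2, 0.4, 0.6], [0.04, 0.04, 0.04])
print(result)
```

Under a fixed-effect model, a Q value that is large relative to its degrees of freedom suggests that the studies do not share a single common effect. In that situation an analyst would typically examine study characteristics as moderators or turn to a random-effects model.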
Another approach for overcoming the inherent weakness of single-site studies is replication research. In this case, studies are replicated at multiple sites within a broader program of study initiated by a funding agency. The Spouse Assault Replication Program (Garner, Fagan, and Maxwell, 1995) of the National Institute of Justice is an example of this approach. In that program, as in other replication efforts, it has been difficult to combine the investigations into a single statistical analysis (e.g., Petersilia and Turner, 1993), and replication studies are commonly discussed in ways similar to narrative reviews. A more promising approach, the multicenter clinical trial, is common in medical studies but rare in criminal justice evaluations (Fleiss, 1982; Stanley, Stjernsward, and Isley, 1981). In multicenter clinical trials, a single study is conducted under very strict controls across a sample of sites. One exception to their rarity in criminal justice is the trial design described by Weisburd and Taxman (2000), which involved innovative drug treatments. In that trial, a series of centers worked together to develop a common set of treatments and common protocols for measuring outcomes. The multicenter approach enhances external validity by supporting inferences not only to the respondent samples at each site, but also to the more general population that the sites represent collectively.