2
What Questions Should the Evaluation Address?

Criminal justice programs arise in many different ways. Some are developed by researchers or practitioners and fielded rather narrowly at first in demonstration projects. The practice of arresting perpetrators of domestic violence when police were called to the scene began in this fashion (Sherman, 1992). Others spring into broad acceptance as a result of grass roots enthusiasm, such as Project DARE with its use of police officers to provide drug prevention education in schools. Still others, such as intensive probation supervision, arise from the challenges of everyday criminal justice practice. Our concern in this report is not with the origins of criminal justice programs but with their evaluation when questions about their effectiveness arise among policy makers, practitioners, funders, or sponsors of evaluation research.

The evaluation of such programs is often taken to mean impact evaluation, that is, an assessment of the effects of the program intervention on the intended outcomes (also called outcome evaluation). This is a critical issue for any criminal justice program and its stakeholders. Producing beneficial effects (and avoiding harmful ones) is the central purpose of most programs and the reason for investing resources in them. For this reason, all the subsequent chapters of this report discuss various aspects of impact evaluation.

It does not follow, however, that every evaluation should automatically focus on impact questions (Rossi, Lipsey, and Freeman, 2004; Weiss, 1998). Though important, those questions may be premature in light of limited knowledge about other aspects of program performance that are prerequisites for producing the intended effects. Or, they may be inappropriate in the context of issues with greater political salience or more relevance to the concerns of key audiences for the evaluation. In particular, questions about aspects of program performance other than impact that may be important to answer in their own right, or in conjunction with addressing impact questions, include the following:

Questions about the need for the program, e.g., the nature and magnitude of the problem the program addresses and the characteristics of the population served. Assessment of the need for a program deals with some of the most basic evaluation questions: whether there is a problem that justifies a program intervention and what characteristics of the problem make it more or less amenable to intervention. For a program to reduce gang-related crime, for instance, it is useful to know how much crime is gang-related, what crimes, in what neighborhoods, and by which gangs.

Questions about program conceptualization or design, e.g., whether the program targets the appropriate clientele or social units, embodies an intervention that could plausibly bring about the desired changes in those units, and involves a delivery system capable of applying the intervention to the intended units. Assessment of the program design examines the soundness of the logic inherent in the assumption that the intervention as intended can bring about positive change in the social conditions to which it is directed. One might ask, for instance, whether it is a sound assumption that prison visitation programs for juvenile offenders, such as Scared Straight, will have a deterrent effect for impressionable antisocial adolescents (Petrosino et al., 2003a).

Questions about program implementation and service delivery, e.g., whether the intended intervention is delivered to the intended clientele in sufficient quantity and quality, whether the clients believe they benefit from the services, and how well administrative, organizational, personnel, and fiscal functions are handled. Assessment of program implementation, often called process evaluation, is a core evaluation function aimed at determining how well the program is operating, especially whether it is actually delivering enough of the intervention to have a reasonable chance of producing effects. With a program for counseling victims of domestic violence, for example, an evaluation might consider the number of eligible victims who participate, attendance at the counseling sessions, and the quality of the counseling provided.

Questions about program cost and efficiency, e.g., what the program costs are per unit of service, whether the program costs are reasonable in relation to the services provided or the magnitude of the intended benefits, and whether alternative approaches would yield equivalent benefits at equal or lower cost. Cost and efficiency questions about the delivery of services relate to important policy and management functions even without evidence that those services actually produce benefits. Cost-benefit and cost-effectiveness assessments are especially informative, however, when they build on the findings of impact evaluation to examine the cost required to attain whatever effects the program produces. Cost questions for a drug court, for instance, might ask how much it costs per offender served and the cost for each recidivistic drug offense prevented.

The design and implementation of impact evaluations capable of producing credible findings about program effects are challenging and often costly. It may not be productive to undertake them without assurance that there is a well-defined need for the program, a plausible program concept for bringing about change, and sufficient implementation of the program to potentially have measurable effects. Among these, program implementation is especially critical. In criminal justice contexts, the organizational and administrative demands associated with delivering program services of sufficient quality, quantity, and scope to bring about meaningful change are considerable. Offenders often resist or manipulate programs, victims may feel threatened and distrustful, legal and administrative factors constrain program activities, and crime, by its nature, is difficult to control. Under these circumstances, programs are often implemented in such weak form that significant effects cannot be expected.

Information about the nature of the problem a program addresses, the program concept for bringing about change, and program implementation is also important to provide an explanatory context within which to interpret the results of an impact evaluation. Weak effects from a poorly implemented program leave open the possibility that the program concept is sound and better outcomes would occur if implementation were improved.
Weak effects from a well-implemented program, however, are more likely to indicate theory failure: the program concept or approach itself may be so flawed that no improvement in implementation would produce the intended effects. Even when positive effects are found, it is generally useful to know what aspects of the program circumstances might have contributed to producing those effects and how they might be strengthened. Absent this information, we have what is often referred to as a “black box” evaluation: we know if the expected effects occurred but have no information about how or why they occurred or guidance for how to improve on them.

An important step in the evaluation process, therefore, is developing the questions the evaluation is to answer and ensuring that they are appropriate to the program circumstances and the audience for the evaluation. The diversity of possible evaluation questions that can be addressed and the importance of determining which should be addressed in any given evaluation have several implications for the design and management of evaluation research. Some of the more important of those implications are discussed below.

EVALUATIONS CAN TAKE MANY DIFFERENT FORMS

Evaluations that focus on different questions, assess different programs in different circumstances, and respond to the concerns of different audiences generally require different designs and methods. There will thus be no single template or set of criteria for how evaluations should be conducted or what constitutes high quality. That said, however, there are several recognizable forms of evaluation to which similar design and quality standards apply (briefly described in Box 2-1).

A common and significant distinction is between evaluations concerned primarily with program process and implementation and those focusing on program effects. Process evaluations address questions about how and how well a program functions in its use of resources and delivery of services. They are typically designed to collect data on selected performance indicators that relate to the most critical of these functions, for instance, the amount, quality, and coverage of services provided. These performance indicators are assessed against administrative goals, contractual obligations, legal requirements, professional norms, and other such applicable standards. The relevant performance dimensions, indicators, and standards will generally be specific to the particular program. Thus this form of evaluation will be tailored to the program being evaluated and will show little commonality across programs that are not replicates of each other.
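
As a rough illustration of assessing performance indicators against applicable standards, the following minimal Python sketch flags indicators that fall short. It borrows the victim counseling example mentioned earlier in the chapter, but the indicator names, observed values, and standards are all invented for illustration; real indicators and standards would be specific to the program being evaluated.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    """One process indicator and the standard it is judged against."""
    name: str
    observed: float  # value measured for the program (hypothetical)
    standard: float  # minimum acceptable value (hypothetical)

    def meets_standard(self) -> bool:
        return self.observed >= self.standard


def shortfalls(indicators):
    """Return the names of indicators that fall below their standard."""
    return [i.name for i in indicators if not i.meets_standard()]


# Invented values for a victim counseling program; none come from a real study.
indicators = [
    Indicator("share of eligible victims enrolled", observed=0.42, standard=0.60),
    Indicator("mean sessions attended per client", observed=6.5, standard=6.0),
    Indicator("share of sessions led by licensed counselors", observed=0.90, standard=0.80),
]

print(shortfalls(indicators))
```

A tabulation of this kind describes how the program is operating; it says nothing about whether the services change outcomes.
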

Process evaluations may assess program performance at one point in time or be configured to produce periodic reports on program performance, generally referred to as “performance monitoring.” In the latter case, the procedures for collecting and reporting data on performance indicators are often designed by an evaluation specialist but then routinized in the program as a management information system (MIS). When conducted as a one-time assessment, however, process evaluations are generally the responsibility of a designated evaluation team. In that case, assessment of program implementation may be the main aim of the evaluation, or it may be integrated with an impact evaluation.

Program performance monitoring sometimes involves indicators of program outcomes. This situation must be distinguished from impact evaluation because it does not answer questions about the program’s effects on those outcomes. A performance monitoring scheme, for instance, might routinely gather information about the recidivism rates of the offenders treated by the program. This information describes the post-program status of the offenders with regard to their reoffense rates and may be informative if it shows higher or lower rates than expected for the population being treated or interesting changes over time. It does not, however, reveal the program impact on recidivism, that is, what change in recidivism results from the program intervention and would not have occurred otherwise.

BOX 2-1
Major Forms of Program Evaluation

Process or Implementation Evaluation
An assessment of how well a program functions in its use of resources, delivery of the intended services, operation and management, and the like. Process evaluation may also examine the need for the program, the program concept, or cost.

Performance Monitoring
A continuous process evaluation that produces periodic reports on the program’s performance on a designated set of indicators and is often incorporated into program routines as a form of management information system. It may include monitoring of program outcome indicators but does not address the program impact on those outcomes.

Impact Evaluation
An assessment of the effects produced by the program; that is, the outcomes for the target population or settings brought about by the program that would not have occurred otherwise. Impact evaluation may also incorporate cost-effectiveness analysis.

Evaluability Assessment
An assessment of the likely feasibility and utility of conducting an evaluation made before the evaluation is designed. It is used to inform decisions about whether an evaluation should be undertaken and, if so, what form it should take.

Impact evaluations, in turn, are oriented toward determining whether a program produces the intended outcomes, for instance, reduced recidivism among treated offenders, decreased stress for police officers, less trauma for victims, lower crime rates, and the like. The programs that are evaluated may be demonstration programs, such as the early forms of the Multidimensional Treatment Foster Care Program (Chamberlain, 2003), that are not widely implemented and which may be mounted or supervised by researchers to find out if they work (often called efficacy studies). Or they may involve programs already rather widely used in practice, such as drug courts, that operate with representative personnel, training, client selection, and the like (often called effectiveness studies). Such differences in the program circumstances, and many other program variations, influence the nature of the evaluation, which must always be at least somewhat responsive to those circumstances. For present purposes, we will focus on broader considerations that apply across the range of criminal justice impact evaluations.

EVALUATION MUST OFTEN BE PROGRAMMATIC

Determining the priority evaluation questions for a program or group of programs may itself require some investigation into the program circumstances, stakeholder concerns, utility of the expected information, and the like. Moreover, in some instances it may be necessary to have the answers to some questions before asking others. With relatively new programs, for instance, it may be important to establish that the program has reached an adequate level of implementation before embarking on an outcome evaluation. A community policing program, for example, could require changes in well-established practices that may occur slowly or not at all. In addition, any set of evaluation results will almost inevitably raise additional significant questions.
These may involve concerns, for example, about why the results came out the way they did, what factors were most associated with program effectiveness, what side effects might have been missed, whether the effects would replicate in another setting or with a different population, or whether an efficacious program would prove effective in routine practice. It follows that producing informative, useful evaluation results may require a series of evaluation studies rather than a single study. Such a sustained effort, in turn, requires a relatively long time period over which the studies will be supported and continuity in their planning, implementation, and interpretation.

EVALUATION MAY NOT BE FEASIBLE OR USEFUL

The nature of a program, the circumstances in which it is situated, or the available resources (including time, data, program cooperation, and evaluation expertise) may be such that evaluation is not feasible for a particular program. Alternatively, the evaluation questions it is feasible to answer for the program may not be useful to any identifiable audience. Unfortunately, evaluation is often commissioned and well under way before these conditions are discovered.

The technique of evaluability assessment (Wholey, 1994) was developed as a diagnostic procedure evaluators could use to find out if a program was amenable to evaluation and, if so, what form of evaluation would provide the most useful information to the intended audience. A typical evaluability assessment considers how well defined the program is, the availability of performance data, the resources required, and the needs and interests of the audience for the evaluation. Its purpose is to inform decisions about whether an evaluation should be undertaken and, if so, what form it should take. For an agency wishing to plan and commission an evaluation, especially of a large, complex, or diffuse program, a preliminary evaluability assessment can provide background information useful for defining what questions the evaluation should address, what form it should take, and what resources will be required to successfully complete it. Evaluability assessments are discussed in more detail in Chapter 3.

EVALUATION PLANS MUST BE WELL-SPECIFIED

The diversity of potential evaluation questions and approaches that may be applicable to any program allows much room for variation from one evaluation team to another. Agencies that commission and sponsor evaluations will experience this variation if the specifications for the evaluations they fund are not spelled out precisely. Such mechanisms as Requests for Proposals (RFPs) and scope of work statements in contracts are often the initial forms of communication between evaluation sponsors and evaluators about the questions the evaluation will answer and the form it will take.
Sponsors who clearly specify the questions of interest and the form in which they expect the answers are more likely to obtain the information they want from an evaluation. At the same time, an evaluation must be responsive to unanticipated events and circumstances in the field that necessitate changes in the plan. It is advantageous, therefore, for the evaluation plan both to be well specified and to have provisions for adaptation and renegotiation when needed.

Development of a well-specified evaluation solicitation and plan shifts much of the burden for identifying the focal evaluation questions and the form of useful answers to the evaluation sponsor. More often, in contrast, the sponsor provides only general guidelines and relies on the applicants to shape the specific questions and approach. For the sponsor to be proactive in defining the evaluation focus, the sponsoring agency and personnel must have the capacity to engage in thoughtful planning prior to commissioning the evaluation. That, in turn, may require some preliminary investigation of the program circumstances, the policy context, feasibility, and the like. When a programmatic approach to evaluation is needed, the planning process must take a correspondingly long-term perspective, with associated implications for continuity from one fiscal year to the next.

Agencies’ capabilities to engage in focused evaluation planning and develop well-specified evaluation plans will depend on their ability to develop expertise and sources of information that support that process. This may involve use of outside expertise for advice, including researchers, practitioners, and policy makers. It may also require the capability to conduct or commission preliminary studies to provide input to the process. Such studies might include surveys of programs and policy makers to identify issues and potential sites, feasibility studies to determine if it is likely that certain questions can be answered, and evaluability assessments that examine the readiness and appropriateness of evaluation for candidate programs.
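
One way an agency might record the results of such preliminary screening is sketched below in Python. The four criteria paraphrase the considerations this chapter lists for a typical evaluability assessment; the key names, the candidate programs, and the yes/no answers are hypothetical, and an actual assessment rests on qualitative judgment rather than a checklist.

```python
# Screening criteria paraphrased from the considerations a typical
# evaluability assessment weighs; the key names are hypothetical labels.
CRITERIA = (
    "program_well_defined",
    "performance_data_available",
    "resources_sufficient",
    "audience_identified",
)


def unmet_criteria(answers):
    """Return the criteria judged not satisfied, in CRITERIA order."""
    return [c for c in CRITERIA if not answers.get(c, False)]


# Invented candidate programs and judgments, for illustration only.
candidates = {
    "drug court": {c: True for c in CRITERIA},
    "community policing initiative": {
        "program_well_defined": False,
        "performance_data_available": False,
        "resources_sufficient": True,
        "audience_identified": True,
    },
}

for name, answers in candidates.items():
    gaps = unmet_criteria(answers)
    if gaps:
        print(f"{name}: preliminary work needed on {', '.join(gaps)}")
    else:
        print(f"{name}: no gaps flagged")
```

Treating a missing answer as unmet is a deliberate choice in this sketch: a criterion that nobody has examined yet should prompt further inquiry rather than pass by default.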