While the case for generating evidence and the value of applying it to decision making was explored in Chapter 2, building the evidence base can be complicated by many factors. Workshop speakers discussed several barriers to building the evidence base as well as considerations for methodological design.
BARRIERS TO EVIDENCE COLLECTION
Individual speakers discussed several barriers that complicate efforts to collect and generate evidence for violence prevention:
- the complexity of violence,
- the need for political and societal support,
- the difficulty of coordination among sectors,
- limited resources and the diversion of resources from directly responding to violence,
- various research methods, and
- restricted support for funding and publishing.
The Complexity of Violence
Throughout the workshop, speakers noted that collecting evidence on which to build violence prevention interventions is a challenging endeavor. Workshop speakers Jennifer Matjasko from the Centers for Disease Control and Prevention and Harriet MacMillan of McMaster University noted that
risk factors for violence develop at multiple social levels, including the individual, family and interpersonal, school and community, and institutional, and a variety of interventions are needed to address all of them. Jerry Reed from the Education Development Center pointed out that there is no single, easily measurable causative factor for violence; thus, no single intervention can solve the problem. Mary Lou Leary from the Department of Justice added that issues related to violence are not static, but always evolving, demanding continued assessment and new solutions.
Given the complexity of violence, efforts to prevent it can be complex and interact with each other and other factors in ways that may hinder straightforward understanding of effectiveness. Workshop speaker Lisbeth Schorr from the Center for the Study of Social Policy suggested that violence prevention solutions require (1) reforms of institutions, policies, and systems that are adapted to a variety of contextual issues; (2) thoughtful implementation; and (3) evolution in response to changes in context, advances in knowledge, and lessons learned. Workshop speaker Christopher Maxwell from Michigan State University also noted that it can be difficult to evaluate successful intervention components because of the many linkages among programs. For example, battered women’s shelters provide a variety of services to meet multiple needs, and it can be challenging to determine which shelter resources are most effective at reducing violence.
Need for Political and Societal Support
As Marta Santos Pais, the United Nations (UN) Special Representative for Violence Against Children, reminded the participants, although evidence itself should be neutral, sometimes political tensions can arise from research findings. Evidence on the rates of violence or program effectiveness does not always shed favorable light on political administrations, nor does it always support decisions that leaders believe to be the most politically strategic. For example, Forum member Rodrigo Guerrero, Mayor of Cali, Colombia, noted that policy makers face the decision of whether to keep published homicide rates low by using inconsistent methods of determining homicides or to remain transparent and use consistent methods but at the risk of inviting public questioning and criticism of the effectiveness of their leadership. Similarly, workshop planning committee member and speaker Daniel Webster from the Johns Hopkins Bloomberg School of Public Health mentioned that some law enforcement leaders are rewarded for maintaining low levels of violent crime and thus have little incentive to report higher rates of incidents.
Webster noted that politicians often want to respond quickly to violence by implementing the programs they believe are necessary to reduce incidents. Frequently, they are less interested in ensuring random assignment of interventions across study groups because they are hesitant to allocate limited resources to control groups or to the random provision of services.
Santos Pais discussed a global survey that she and other partners developed and sent to many UN Member States. The survey was an attempt to better understand the steps that national governments were taking toward violence prevention. It included questions on the prioritization of data and research in influencing policy. Fifty-five percent of the countries did not respond to these questions, and those that did reported mixed success with the use of evidence in policy making. Many responses suggested that statistics offices are effective in data collection, but these offices are not linked well to the public and policy makers. Other respondents mentioned that poor coordination among different organizations hinders effective data collection and dissemination.
Difficult Coordination Among Sectors
The results of the UN global survey highlight another barrier that extends beyond politics: the potential difficulty of working across stakeholder groups on multisectoral approaches to violence prevention. A workshop participant cautioned that as public health violence prevention approaches are designed, it is critical to include other relevant sectors, such as the human rights and criminal justice communities. A comprehensive approach is no easy task, however, and Santos Pais noted the challenges in bringing sectors together to address, for example, unemployment and poverty and their effects on families. She noted that too often people wait until after the violent crisis has been resolved to address these issues, but violence prevention is more likely to be successful and sustainable if sectors collaborate early to address violence.
Limited Resources and Diversion of Resources
Another challenge to evidence generation is that, in the short run, data collection can divert time, energy, and resources from implementing violence prevention programming. As Leary discussed, there is an ongoing struggle between expedience and evidence—is it worth the time to evaluate program effectiveness when there are problems that need to be addressed with an intervention now? Workshop speaker Daniela Ligiero from the Office of the U.S. Global AIDS Coordinator recognized that there is a debate among the research community, decision makers, and the public over the importance of evidence and, even within the research community, over the amount of evidence needed before taking action.
Forum co-chair and workshop planning committee member Mark Rosenberg from the Task Force for Global Health noted that while researchers are taking the time to get good data, they often lose the opportunity to influence policy. Policy makers often call for data on prevalence and effective interventions immediately after a crisis has occurred; Forum co-chair and workshop speaker Jacquelyn Campbell from the Johns Hopkins School of Nursing emphasized the need to have evidence available in the face of these crises in order to garner support for the most appropriate response. However, because political and financial support for research often does not reach its peak until after a crisis, Rosenberg noted the difficulty of generating usable evidence in a timely fashion without rushing methods and affecting the quality of the data.
Different Research Methods
Workshop participant Janice Humphries from the University of California, San Francisco, School of Nursing suggested that mixed-methods research, which uses both quantitative and qualitative methods and the appropriate analyses of each type of data, offers an opportunity to include the voices of stakeholders and community members in the evidence base in a rigorous way and can provide useful context about the community in which a program is being implemented. However, there are barriers to mixed-methods approaches in violence prevention research. Workshop speaker Neil Boothby from Columbia University mentioned that multilateral organizations and governments often resist tapping into informal networks to find information from the individuals and communities who are perhaps closest to the problems and the most insightful. Furthermore, Ligiero commented that evidence derived from the hard sciences is frequently valued over evidence from the social and behavioral sciences.
Restricted Support for Funding and Publishing
While there are calls for financial, material, political, and academic support for violence prevention, it is only one of a multitude of important public health priorities competing for a limited pool of resources. Schorr noted that the best funded prevention programs are those that are backed by significant evidence and carry the lowest risk of failure, and the most frequently published studies are those that employ what is viewed as proven methodologies. Webster added that grants often require applicants to demonstrate that they are applying evidence-based interventions to address a problem, but he questioned the value of broadly applying the criteria because effectiveness will vary, depending on the place and time of an intervention. Schorr noted that funders and journal editors have the
power to move the field forward, but will need to be convinced of the value of supporting a greater variety of programs and types of evidence. Don Berwick, formerly from the Centers for Medicare & Medicaid Services, and Paul Batalden, from the Geisel School of Medicine at Dartmouth, have attempted to address this issue by proposing new criteria to journal editors for the acceptance of articles for peer-reviewed publication. Schorr said it is less clear how to convince funders of the need to take risks to encourage innovation, but challenged the workshop audience to continue to spread the message that there is a difference between an intervention that is ineffective and an intervention that has not yet been proven to be effective.
Despite the noted challenges in building the evidence for violence prevention, several speakers recognized that progress continues in developing methods for data collection. Speakers shared some best practices in methodological design considerations from their experiences in research and practice:
- identifying meaningful, measurable outcomes;
- engaging stakeholders in design and dissemination;
- consulting affected communities;
- choosing high-quality and relevant comparison data;
- looking at effects across sectors; and
- applying judgment and common sense.
Identifying Meaningful, Measurable Outcomes
Several speakers highlighted the importance of identifying meaningful outcomes to measure in outcome-focused program evaluations. They cautioned against becoming distracted by measuring program effects that ultimately do not clarify whether a program is meeting its final goals. For example, Forum member Michael Phillips from the Shanghai Jiao Tong University School of Medicine mentioned that many suicide prevention programs promote their effectiveness in reducing depression or suicidal ideation, but it is still not clear whether they actually reduce suicidal behavior. Similarly, MacMillan pointed to studies showing that sexual abuse prevention programs improve children’s knowledge of the issues but do not determine whether the programs actually reduce the occurrence of child sexual abuse.
On the other hand, some speakers also noted that evaluations should not become so focused on measuring the predetermined meaningful outcomes that they miss other less obvious effects. Reed pointed out that the
Air Force Suicide Prevention Program, a successful program included on the Suicide Prevention Resource Center’s Best Practices Registry, not only has reduced suicide by 33 percent among active duty Air Force personnel, but has also been effective in reducing domestic and other forms of violence. These findings improve the knowledge base of violence prevention and can be used practically to change and refine program elements and objectives.
Engaging Stakeholders in Design and Dissemination
A recurring message throughout the workshop was that evidence offers an opportunity to break down silos and involve key stakeholders at all stages of research. Ligiero stressed the importance of working with communities and governments before embarking on survey research to ensure widespread support and to receive important feedback on research design from various perspectives. Santos Pais described a recent study conducted in Tanzania that involved multiple partners in an effort to uncover the prevalence of physical, emotional, and sexual violence in youth populations (United Republic of Tanzania, 2011). She suggested that the most strategic element of the project was that it was carried out by a multiministerial task force representing many governmental departments, religious organizations, youth, international agencies, and academia. She noted that this widespread participation resulted in more stakeholders feeling ownership of the issue and a willingness to invest in violence prevention. The broad reach of the task force also allowed it to disseminate the survey findings widely. She noted that when the task force presented to the government, it paired suggested policy responses with the concerns that arose in the findings. Its community communications strategy sought to convey its messages to everyone, including young people. By involving multiple stakeholders in the planning and dissemination of the research, the task force had knowledge, resources, and political support when it publicly presented the study results and ideas for how to move forward to reduce violence.
Reed encouraged the public sector to engage more with private companies to increase interest in and funds for more evidence generation. He noted one example of a successful public–private partnership started by Health and Human Services Secretary Kathleen Sebelius and Defense Secretary Robert Gates. This partnership, the National Action Alliance for Suicide Prevention, is co-chaired by Army Secretary John McHugh and Gordon Smith, chief executive officer and president of the National Association of Broadcasters. It brings together about 45 partners from the public and private sectors. They have formed task forces, a sustainability committee, and an operational arm that assists the country in advancing the objectives of the National Strategy for Suicide Prevention, and have set an objective of saving 20,000 lives in 5 years. Reed suggested that such bold goals can
be reached if researchers continue to determine what works in violence prevention while investors continue to invest in promising interventions.
Consulting Affected Communities
Throughout the workshop, speakers emphasized the broad reach of violence and its effects on all sectors and levels of society. Several speakers implied that too often, voices from affected communities are excluded from the data that are collected. Boothby contended that the information researchers need does not always come from the police or multilateral organizations, but from the victims, offenders, siblings, and neighbors. Santos Pais noted that surveys on violence in youth populations can be more comprehensive when completed directly by youth themselves. Several speakers highlighted the importance of qualitative data collection methods for illuminating often otherwise unheard perspectives.
When using surveys, one challenge of collecting data from victims and other people directly involved in violent incidents is determining how to limit reporting bias. For example, MacMillan mentioned that once parents have been through a parenting program, they know the program objectives and provider expectations. When reporting on their own behavior, they are thus more likely to alter their responses to better align with these program goals.
Choosing High-Quality and Relevant Comparison Data
In his presentation on methodological flaws in gun violence studies, Webster noted that in the information age, there is a tendency to think that all the available data should be used for study controls. However, Webster argued that more data are not necessarily better data, and instead of focusing on amassing high quantities of data, researchers should spend more time ensuring that the data are relevant to their research. For example, when trying to assess the effectiveness of a school bullying program in Baltimore, one could try to compile multitudes of data from numerous schools across the country. However, Webster claimed it would make more sense to find a smaller number of comparison schools in communities that are most similar to the Baltimore community. Webster suggested that finding similar comparisons will help researchers to develop study controls that most closely mirror the condition that the studied population would be in if the intervention had not been implemented.
Webster suggested two qualities that researchers should look for when searching for good controls to compare with intervention sites, using the antibullying program study as an example. First, the comparison schools should have similar baseline rates of bullying as the school with the intervention, and
second, the comparison schools should be on trajectories similar to the intervention school. For example, if the antibullying program were implemented in a school as a response to quickly rising rates of bullying, the comparison schools should also have similarly rising rates of bullying. Comparison schools with bullying levels that have remained relatively unchanged over time likely will not shed light on what levels could have been in the studied school if bullying continued to increase without the intervention.
Webster shared a study that adhered to the principles of finding comparisons with similar baselines and trajectories. The Maryland Saturday Night Special Ban study examined the effects of a ban on small, easily concealable, and poorly made handguns in Maryland. Rather than comparing Maryland to every state to measure the effects of this law, they searched for states that had similar trends and patterns in homicide as Maryland before the law was passed. They found that Pennsylvania and Virginia would be the best comparison states. They used these two states together, with prior cycles and patterns in Maryland, to model predicted gun homicide rates in Maryland had the law not been implemented. They found that rates did decrease after the policy came into effect. Webster explained that modeling must be accurate to give an evaluation study credibility, and the accuracy of this particular modeling was demonstrated by comparing the modeled preintervention homicide rates to the actual preintervention rates. The modeled rates were very similar to the observed rates, thus highlighting the effectiveness of using data from only two very similar states rather than many less similar states.
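Webster’s counterfactual logic can be sketched in a few lines of code. The sketch below uses a simple difference-in-differences calculation with invented rates; the actual Maryland study used more sophisticated time-series modeling of prior cycles and patterns in the comparison states.

```python
# Hypothetical annual gun-homicide rates per 100,000 (illustrative only;
# not the actual Maryland/Pennsylvania/Virginia data).
md_pre,  md_post  = [11.0, 11.4, 11.8], [10.9, 10.6, 10.4]  # intervention state, before/after the law
cmp_pre, cmp_post = [10.8, 11.2, 11.6], [11.5, 11.8, 12.0]  # matched comparison states (similar baseline and trajectory)

def mean(xs):
    return sum(xs) / len(xs)

# Counterfactual: what the intervention state's post-law mean would have been
# had it continued to follow the comparison states' trend.
counterfactual = mean(md_pre) + (mean(cmp_post) - mean(cmp_pre))
effect = mean(md_post) - counterfactual

print(f"counterfactual post-law rate: {counterfactual:.2f}")
print(f"estimated effect of the law:  {effect:+.2f}")
```

This only works if the comparison series is genuinely similar in baseline and trajectory, which is exactly why Webster argued for a few well-matched comparison states over a large pool of dissimilar ones.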
Looking at Effects Across Sectors
Workshop speaker Mark Bellis from Liverpool John Moores University discussed the importance of measuring program effects across sectors, especially when evaluating prevention interventions that address the underlying root causes of violence. When evaluating school-based interventions such as social and emotional learning programs, for example, one should look beyond measuring educational outcomes. Educational outcomes alone might show that it takes a long time to offset program costs, but assessment of the program benefits to criminal justice, health, and social sectors may reveal that the costs are actually offset much more quickly when a larger range of outcomes is considered. While it might be difficult to convince the education sector that its investments are worthwhile because of benefits to other sectors, perhaps a combined approach to program evaluations will encourage sectors to combine programmatic efforts for greater outcomes.
Applying Judgment and Common Sense
Schorr reminded the audience that numbers and data do not have meaning; it is individuals who use intelligence and judgment to understand the data who assign meaning. Webster cautioned against attaching the wrong meaning to numbers, and stressed the importance of using common sense. He added that politicians and policy makers do not necessarily need to know epidemiology to read study conclusions with a critical eye, but they can evaluate studies using what they know about causality from their experience in violence prevention. As an example, Webster noted that experienced decision makers should realize that policy changes that affect certain gun-carrying permits mostly held by individuals in suburban and rural communities will not likely affect urban violence.
In addition, Webster cautioned evaluators to be aware of other program effects that the data are not capturing, again in order to ensure that the data are being correctly interpreted. For example, an evaluation might show that drug use has declined in an area after a certain intervention. Before assuming that the intervention is effective, however, one would need to first rule out the possibility that the program has merely pushed drug use into a neighboring area. Thoughtfulness and a comprehensive understanding of the issues can help determine what to measure.
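The displacement caveat can be made concrete with a crude before/after comparison. All counts below are invented for illustration; a real analysis would also control for trends, seasonality, and other confounders before crediting the intervention.

```python
# Hypothetical monthly drug-use incident counts, before vs. after an
# intervention, in the target area and a neighboring area.
target_before, target_after     = 120, 80
neighbor_before, neighbor_after = 100, 135

target_change   = target_after - target_before      # decline in the target area
neighbor_change = neighbor_after - neighbor_before  # change next door

# Crude displacement check: if the neighboring area's increase absorbs most
# of the target area's decline, the problem may simply have moved.
displaced_share = neighbor_change / -target_change if target_change < 0 else 0.0

print(f"target change:   {target_change:+d}")
print(f"neighbor change: {neighbor_change:+d}")
print(f"possible displaced share: {displaced_share:.0%}")
```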
Efforts are ongoing to make existing evidence for violence prevention easily and quickly accessible. Several speakers presented on the development of these efforts, which assist practitioners and policy makers who need timely advice on successful interventions for responding to violence. Patrick Tolan from the University of Virginia noted that there are multiple strategies for estimating effectiveness and limited consensus about the standard for identifying programs as evidence based. Best practice lists are much-needed tools that use set criteria to standardize the evidence and aid practitioners and policy makers in comparing programs and their characteristics. Ongoing list-making initiatives that were discussed during the workshop are included in Table 3-1. Two lists that were presented in detail during the workshop, Blueprints for Healthy Youth Development and CrimeSolutions.gov, are described following the table.
TABLE 3-1 Initiatives for Integrating Violence Prevention Evidence Discussed During the Workshop
| Initiative | Website |
| --- | --- |
| Blueprints for Healthy Youth Development | http://www.blueprintsprograms.com |
| Evidence-based Prevention and Intervention Support Center at Penn State University (EPISCenter) (Prevention Research Center, Penn State University; Pennsylvania Commission on Crime and Delinquency; and the Pennsylvania Department of Public Welfare) | http://www.episcenter.psu.edu |
| Global Implementation Conference | http://globalimplementation.org/gic |
| National Suicide Prevention Resource Center Best Practices Registry | http://www.sprc.org/bpr |
| Secretary-General’s database on violence against women | http://sgdatabase.unwomen.org/home.action |
| U.S. Preventive Services Task Force | http://www.uspreventiveservicestaskforce.org |
| Virtual Knowledge Centre to End Violence Against Women and Girls | http://www.endvawnow.org |
Examples of Promising Evidence Integration
Blueprints for Healthy Youth Development
As described by Tolan, Blueprints for Healthy Youth Development (Blueprints) is a registry compiled by proactive search for and review of evaluations of individual prevention and treatment programs for violence, drug abuse, delinquency, mental health, educational achievement, and physical health. Blueprints staff perform literature reviews on a monthly basis and identify studies that might meet the Blueprints standards, and board members systematically review all material available about a particular program, including information directly received from program developers. Individual programs with positive effects on meaningful outcomes are then certified as promising or model programs, and the programs labeled as model are eligible for dissemination (see Table 3-2). Blueprints then produces a fact sheet that describes the program’s theoretical model, program costs, net benefits, funding, materials, and extra references.
Tolan noted that the critical issue in program evaluation is determining whether the effects shown are a result of the program or of alternative
TABLE 3-2 Blueprints’ Criteria for Promising and Model Programs
| Promising | Model |
| --- | --- |
| Study(s) meet design requirements to be considered valid | Study(s) meet design requirements to be considered valid |
| 1 high-quality RCT or 2 QEDs | 2 high-quality RCTs or 1 RCT and 1 QED |
| Significant positive effects | Significant positive effects |
| No health-compromising effects | No health-compromising effects; sustainability of 12 months or more on 1 outcome |
NOTE: QED = quasi-experimental design; RCT = randomized controlled trial.
SOURCE: Tolan, 2013.
causes, and Blueprints tries to identify as model programs those that have big effects. Ideally, model programs would be supported by a meta-analysis of at least two high-quality randomized controlled trials (RCTs), but Tolan pointed out that programs with multiple evaluation studies are difficult to find; replication is not a well-funded activity, nor is it exciting for a researcher’s career to replicate another’s work.
CrimeSolutions.gov
Another registry, CrimeSolutions.gov, was presented by Phelan Wyrick from the Department of Justice. This registry provides practitioners and policy makers with information on effective programs in criminal justice, juvenile justice, and crime victim services. The registry labels programs as effective, promising, or no effects and also indicates which conclusions are supported by multiple studies. The no effects category includes programs that have either null or negative effects, and it specifically identifies programs that have been proven to cause harm. Each program profile includes a program description, the measured outcomes, the study methodology used in the evaluation, cost information, implementation information, and additional references. Wyrick explained that the purpose of the site is not only to provide information on specific programs in which to invest resources, but also to highlight program characteristics that could be incorporated into existing programs.
Programs that are added to the CrimeSolutions.gov list are reviewed in an eight-step process of identifying programs, screening programs, searching literature, screening evidence, selecting evidence, sending evidence through an expert review, classifying studies, and rating the evidence. The reviewers assess four main dimensions of the studies: (1) the degree to which the program is based on a well-articulated conceptual framework; (2) the ability
of the research design to establish a causal association between treatment and outcome; (3) the degree to which the outcome evidence supports the program treatment; and (4) the degree to which the program was implemented as designed. Wyrick noted that much weight is given to reviewer confidence—if reviewers notice a significant flaw in the study that is not captured in the other criteria, they have the authority to change the rating to reflect it.
CrimeSolutions.gov targets mayors and police chiefs as users more than researchers and academics and thus tries to present the research in clear, nontechnical terms. The developers have invested in website design and accessibility, and usage of the site has continued to rise since its launch. Twenty percent of the site’s users are outside the United States. Wyrick mentioned that some users have expressed concern that the program descriptions are lengthy, but he emphasized to practitioners that, if they are investing large amounts of money in a program, the information in the descriptions is valuable.
There is significant overlap between the Blueprints and CrimeSolutions.gov lists. Blueprints uses a higher standard of evidence to rate its programs, and thus CrimeSolutions.gov’s “effective” category includes more programs than the Blueprints model program category. All of the Blueprints model programs are also deemed effective by CrimeSolutions.gov, which provides further confirmation of those programs’ effectiveness.
Both Tolan and Wyrick emphasized that the process of developing registries is ongoing and always evolving. While discussing further considerations for registry development, Tolan mentioned the need to better understand the role of cost-effectiveness studies in reviews to determine program effectiveness (see Box 3-1). The audience was asked: if reviewers encounter a study showing that a program has a small but statistically significant effect (p < 0.05) while its cost/benefit ratio is favorable because the program reaches many people, how is it best to balance effect size with cost-effectiveness?
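One way to frame this question is to compare aggregate impact and cost per unit of benefit rather than effect size alone. The helper functions and figures below are entirely hypothetical, intended only to make the trade-off concrete: a high-impact program serving few people versus a small-effect program with broad reach.

```python
# Invented figures for two hypothetical programs: a high-impact
# "intensive" program vs. a small-effect "broad" program with wide reach.

def total_benefit(effect_size, reach):
    """Crude aggregate impact: per-person effect times people reached."""
    return effect_size * reach

def cost_per_unit_benefit(cost_per_person, effect_size):
    """Dollars spent per unit of per-person benefit achieved."""
    return cost_per_person / effect_size

programs = {
    "intensive": {"effect_size": 0.40, "reach": 1_000,  "cost_per_person": 800.0},
    "broad":     {"effect_size": 0.08, "reach": 50_000, "cost_per_person": 25.0},
}

for name, p in programs.items():
    print(name,
          f"total benefit={total_benefit(p['effect_size'], p['reach']):,.1f}",
          f"cost per unit benefit=${cost_per_unit_benefit(p['cost_per_person'], p['effect_size']):,.2f}")
```

Under these invented numbers, the small-effect program delivers far more aggregate benefit at lower cost per unit, which is precisely the tension between effect size and cost-effectiveness that the audience was asked to weigh.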
Tolan also noted that much work needs to be done to develop consistent criteria for reviewing program evaluations so that practitioners and policy makers do not have to refer to multiple, similar-looking lists. Different lists have different standards of what is meant by “evidence base,” and as evidence becomes more important for program development, it will be helpful to have a shared understanding of what constitutes adequate evidence.
Looking Beyond Program Data
Several speakers stressed the importance of integrating evidence beyond program evaluations to determine what works in violence prevention. Schorr cautioned the audience that if the research community decides
BOX 3-1 Evolving Considerations for Blueprints Reviews
- Determine how to measure the sustained effect of continuous treatment/intervention programs
- Look at design and power issues between studies at different levels (school, neighborhood, community level)
- In addition to individual programs, evaluate the impacts and sustainability of program delivery systems (e.g., Communities That Care)
- Consider regression discontinuity and other “non-experimental” estimates of program effects
- Undergo independent replication
- Determine replication criteria
- Understand the role of effect sizes
- Consider cost-effectiveness of programs
SOURCE: Tolan, 2013.
that strictly defined experimental evaluation of programs is the preferred method of understanding effectiveness, they are likely to close off other valuable ways of learning.
She pointed out that current pressure to minimize risk and maximize effectiveness with public and private money often leads funders to invest primarily in programs proven successful by evaluation, which potentially limits creativity and innovation in developing new interventions that fill the gaps not covered by the proven programs. For example, evaluations of the generally successful Nurse–Family Partnership (NFP) found that the program was not able to retain the most depressed mothers or to enroll families with substance abuse and domestic violence issues. Community groups that wanted to address these issues as well were discouraged from doing so because they could not easily find funding for initiatives that fell outside the NFP’s proven framework.
Collective Impact of Interventions
In addition to facilitating more innovation and exploration, many speakers agreed that looking beyond program data will illuminate not only the programs that work, but also how and why they work. Schorr and Brian Bumbarger from The Pennsylvania State University both encouraged the violence prevention community to focus their research efforts on
understanding the effects of combined interventions, as well as identifying the successful underlying components of interventions.
Schorr suggested that the collective impact of programs is sometimes greater than the sum of their parts, and isolated program evaluations do not necessarily illustrate the true effects of programs. Bumbarger pointed out that real-world effect sizes are often different from effect sizes found in tightly controlled experimental trials, illustrating that implementing a proven program in a community will not necessarily bring about population-level health improvements. To understand the real effects of programs, much work is needed to develop evaluation methods that measure the comprehensiveness of programs and the interactions among interventions.
Schorr mentioned efforts in the United States to reduce tobacco use as an example of success that resulted from reaching beyond isolated interventions. Initially, in the 1960s, states were not finding significant effects of tobacco reduction programs because they were only measuring the effects of individual interventions. However, California and Massachusetts took a very comprehensive approach and studied the overall effects of multiple programs working in unison. They found that this combination of interventions eventually tripled the annual rate of decline in tobacco consumption. Schorr mentioned that other states were much more interested in the data from California and Massachusetts than in the thousands of controlled trials on individual programs, because those data provided useful information that could not be gleaned merely from individual program evaluations.
Boothby described a program in Burundi that combines household economic-strengthening programs with positive parenting programs. Together, these interventions produce positive results that might not appear at all if the programs were delivered in an uncoordinated and isolated manner. He suggested moving to multiyear efficacy trials and developing a larger and consistent research effort that would move the field from project-level evidence to knowledge across sectors.
Identifying Successful Intervention Components
A theme acknowledged by several workshop speakers was the need to go beyond determining which strategies work to understanding the underlying components that make them work. One method of uncovering successful elements of programs is the use of systematic reviews of the existing body of research. Workshop speaker Mark Lipsey from Vanderbilt University discussed the potential of systematic reviews to aggregate evidence from multiple studies in order to smooth out variability within the studies and summarize the full body of evidence on a particular topic. Lipsey pointed out that one study on program effectiveness is not enough to provide a
confident basis for action, and the strength of systematic reviews usually increases with the number of studies incorporated.
Lipsey highlighted the key steps in a systematic review, beginning with the development of clear criteria for study inclusion in the review. Researchers then perform methodical searches of research literature to find studies that meet the criteria. They continue with coding and extraction of data, which they then statistically analyze.
Lipsey noted that meta-analysis is one type of systematic review method that combines the results of individual quantitative studies by standardizing effect sizes. It involves the use of statistics to find a common metric that can be used to compare studies that use differing units to measure the same outcome. These standardized effect sizes are computed in a way that does not confound the magnitude of the effect with sample size or variability, and thus are distinct from statistical significance. Meta-analysis is not merely a tally of statistically significant results across the literature; instead, it examines the distribution of effects across studies.
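The standardization step described above can be illustrated with a minimal sketch. Cohen's d, one common standardized effect size, divides the difference in group means by the pooled standard deviation, so studies that measure the same outcome on different scales become directly comparable. The function and the study numbers below are invented for illustration and are not from the workshop material.

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference (Cohen's d): the treatment-vs.-control
    mean difference scaled by the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Two hypothetical studies measuring the same outcome on different scales:
# study A uses a 0-100 instrument, study B a 0-10 instrument.
d_a = cohens_d(62.0, 55.0, 14.0, 14.0, 120, 120)
d_b = cohens_d(6.2, 5.5, 1.4, 1.4, 30, 30)
# Both studies yield the same standardized effect (0.5) despite different
# units and sample sizes, which is what makes cross-study comparison possible.
```

Because the metric is scale-free, a reviewer can place both studies on one distribution of effects rather than comparing raw score differences.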
Lipsey explained how meta-analysis allows researchers to pull together the evidence on a particular type of intervention to produce a more accurate comparison of programs. For example, an effect size distribution of studies measuring the effectiveness of cognitive behavioral therapy (CBT) programs on prevention of juvenile and adult recidivism used an odds ratio as the standard measure for comparing 58 studies. It showed a wide spectrum of effectiveness—some CBT programs were very effective, while others were not, and no difference in effectiveness was found between brand-name and homegrown CBT programs.
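The odds ratio used as the common metric in the recidivism example above can be computed from each study's outcome counts. The sketch below uses invented numbers purely for illustration; it is not drawn from the 58 studies in Lipsey's analysis.

```python
def odds_ratio(reoffend_t, desist_t, reoffend_c, desist_c):
    """Odds of recidivism in the treatment group divided by the odds in
    the comparison group (an odds ratio below 1 favors the program)."""
    return (reoffend_t / desist_t) / (reoffend_c / desist_c)

# Hypothetical CBT study: 30 of 100 treated participants reoffended,
# versus 45 of 100 in the comparison group.
or_example = odds_ratio(30, 70, 45, 55)
# or_example is about 0.52: treated participants had roughly half the
# odds of reoffending, regardless of how large either study arm was.
```

Computing this same ratio for every included study yields the effect size distribution that reveals the wide spectrum of CBT program effectiveness.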
Lipsey suggested that researchers can use meta-analysis to determine the special ingredients of successful programs such as the characteristics associated with positive and negative effects of the CBT interventions mentioned above—the types of offenders involved, the environment of the intervention (in the community, a probation setting, a prison, etc.), the types of approaches and treatments used, or the quality of implementation. For example, this analysis found that programs focused on high-risk offenders had larger effects than those focused on lower risk offenders. High-intervention frequency, program fidelity, and interpersonal problem-solving approaches also correlated with CBT program success in this study.
One can go a step further and use comparative meta-analysis to compare the effectiveness of different types of approaches on an outcome. Lipsey mentioned that one such analysis gathered studies on the effects of CBT, behavioral, social skills, challenge, academic, and job-related interventions on recidivism rates of juvenile offenders. The analysis found the greatest effects from CBT and behavioral interventions. One can also look even more broadly to compare philosophies of interventions. For example,
TABLE 3-3 Characteristics of Two Methods Used to Compile and Evaluate Program Data
| Best Practices List | Meta-Analysis |
| --- | --- |
| Assesses programs | Assesses practices and approaches |
| Looks at programs that can be used as they are | Looks at large numbers of program studies |
| Takes study methodology into account in its analysis | Requires quality methodology for inclusion |
SOURCE: Tolan, 2013.
a meta-analysis that compiled studies on the effects of therapeutic (restorative, skill building, counseling, etc.) and control-oriented (discipline, deterrence, etc.) approaches on recidivism rates found that therapeutic interventions are far more effective.
Some of the characteristics of best practices lists and meta-analysis that were identified by Tolan are summarized in Table 3-3.
Risk and Protective Factors
Many speakers emphasized the need for more data on individual and community-level risk and protective factors. Risk factors are commonly considered to be conditions or variables associated with a lower likelihood of positive outcomes and a higher likelihood of negative or socially undesirable outcomes, while protective factors have the reverse effect, enhancing the likelihood of positive outcomes and lessening the likelihood of negative consequences from exposure to risk (Jessor et al., 1998). Campbell and Bumbarger both called for a shift from surveillance of outcomes to more surveillance of these risk and protective factors, as well as the development of long-term studies to better understand these factors and their effects. Bumbarger highlighted the program model Communities That Care, which collects local epidemiological data on risk and protective factors that predispose children to multiple poor outcomes.
Bellis explained that once risk factors are identified, programs can be designed to target risk factors and could affect a wider range of outcomes than if the program were designed only with a single outcome in mind. He highlighted two studies, one in the United Kingdom and one in the United States, which showed an increased risk of low mental well-being, severe obesity, and smoking in adulthood for those who experienced four or more adverse experiences in childhood. As these studies imply, the determination of the real impact of risk-focused interventions on well-being will require the measurement of appropriate outcomes (in this case, mental well-being, obesity, and smoking), even if a program’s main focus is measuring and
targeting risk factors and associated outcomes. Because preventive programs that address risk and protective factors can have such widespread impact, evaluating their effectiveness may require measurement of many outcomes across sectors.
Consideration of Theoretical Frameworks
Schorr suggested that in addition to identifying program components that seem to be successful, the violence prevention community should also use theoretical research to consider nonevaluated interventions and program components that might work. For example, a project on human development in Chicago neighborhoods found that the largest single predictor of crime levels in the neighborhoods studied was social cohesion and mutual trust among neighbors, combined with the willingness to intervene on behalf of the common good. No proven program interventions currently address neighborhood cohesion, but Schorr suggested that such areas need to be given more attention.
Consideration of Experiential Knowledge
Workshop speakers emphasized the need to look for evidence in places beyond traditional evaluations, studies, and analysis. As Schorr claimed, “We need a broader knowledge base, not a narrower one that considers experimental evidence as the sole proof of effectiveness.” Workshop planning committee co-chair and Forum member James Mercy from the Centers for Disease Control and Prevention highlighted the knowledge among workshop participants that can be derived not from the results of their study findings, but from their experience in implementing, practicing, and researching violence prevention interventions. In addition to learning from the scientific evidence presented at the workshop, he encouraged workshop participants to also learn from the experiences of other participants and audience members.
Key Messages Raised by Individual Speakers
- A better understanding of risk and protective factors can provide evidence for developing programs that address a broader range of outcomes (Bellis, Bumbarger, Campbell).
- Meaningful outcomes need to be identified when programs are developed and measured (MacMillan, Maxwell, Phillips, Tolan).
- Stakeholder communities should be engaged throughout the process to increase understanding and buy-in (Ligiero, Santos Pais).
Jessor, R., M. S. Turbin, and F. M. Costa. 1998. Risk and protection in successful outcomes among disadvantaged adolescents. Applied Developmental Science 2(4):194-208.
Tolan, P. 2013. Creating lists of “evidence based” programs: Utilizing set standards for what works in violence prevention. Presented at the IOM Workshop on the Evidence for Violence Prevention Across the Lifespan and Around the World. Washington, DC, January 23.
United Republic of Tanzania. 2011. Violence against children in Tanzania: Findings of a national survey, 2009. http://www.unicef.org/media/files/VIOLENCE_AGAINST_CHILDREN_IN_TANZANIA_REPORT.pdf (accessed July 1, 2013).