Researchers from diverse disciplines have contributed to the capital punishment literature, with prominent contributions by economists, criminologists, and sociologists. Although researchers’ disciplinary backgrounds have affected the methods used and the framing of the research questions, the failings of the capital punishment literature are not rooted in the use of particular empirical methods or theoretical models of criminal decision making. Rather, the failings are rooted in manifest deficiencies related to the research data and methods and the researchers’ interpretations of results. Chapters 4 and 5 call attention, respectively, to fundamental deficiencies in panel and time-series studies. Both approaches share two basic deficiencies and also manifest two others to some degree. One shared deficiency is grossly incomplete specification of the sanction regime for homicide. Even in states that make the most frequent use of capital sanctions, noncapital sanctions are the most common sanction imposed for a homicide conviction. No study of either type accounts for the noncapital component of the sanction regime in states with and without capital punishment. The second basic deficiency is failure to pose a credible model of the sanction risk perceptions of potential murderers and the behavioral response to such perceptions. In the absence of such a model, it is difficult, at best, to interpret data relating sanction regimes to homicide rates.
As discussed in Chapters 4 and 5, these two deficiencies are sufficient to make existing studies uninformative about the effect of capital punishment on homicide. Both of these deficiencies are potentially correctable. However, even if the research and data collection initiatives discussed in this chapter are ultimately successful, research in both literatures shares a
common characteristic of invoking strong, often unverifiable, assumptions in order to provide point estimates of the effect of capital punishment on homicides. A point estimate may offer the appearance of desirable certitude, but only at a high cost in credibility. Still another deficiency is inattention to potential feedbacks through which homicide rates, and crime rates more generally, may affect the specification and administration of a sanction regime while the regime simultaneously affects homicide rates. Recognition of potential feedbacks is relevant both to identify the direct effect of capital punishment on homicide rates and to predict the ultimate effect after feedbacks occur. Feedbacks affect the time-series and panel studies differently because of differences in the time frames of the data typically used in the two approaches—monthly, weekly, or even daily data in the time-series studies and annual data in the panel studies.
In light of these deficiencies, the committee has reached the following conclusion and recommendation:
CONCLUSION AND RECOMMENDATION: The committee concludes that research to date on the effect of capital punishment on homicide is not informative about whether capital punishment decreases, increases, or has no effect on homicide rates. Therefore, the committee recommends that these studies not be used to inform deliberations requiring judgments about the effect of the death penalty on homicide. Consequently, claims that research demonstrates that capital punishment decreases or increases the homicide rate by a specified amount or has no effect on the homicide rate should not influence policy judgments about capital punishment.
The committee was disappointed to reach the conclusion that research conducted in the 30 years since the National Research Council (1978) report on this subject has not sufficiently advanced knowledge to allow a conclusion, however qualified, about the effect of the death penalty on homicide rates. Yet this is our conclusion. Some studies play the useful role, either intentionally or not, of demonstrating the fragility of their claims to have found—or not to have found—deterrent effects. However, even these studies suffer from two intrinsic shortcomings that severely limit what can be learned from them about the effect of the death penalty on homicide rates from an examination of the death penalty as it has actually been administered in the United States in the past 35 years.
Commentary on research findings often pits studies claiming to find statistically significant deterrent effects against those finding no statistically significant effects, with the latter studies sometimes interpreted as implying that there is no deterrent effect. A fundamental point of logic about hypothesis testing is that failure to reject a null hypothesis does not imply
that the null hypothesis is correct. For evidence of even a small effect to be credible, it must be shown, first and foremost, that the estimate is based on a sound research design. Estimates that lack credibility are not informative regardless of the consistency of their estimated size. The estimated effect must also be small in size and estimated with good precision, for example, by being contained within a tight confidence interval.
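The point about hypothesis testing can be made concrete with a small numerical sketch. The numbers below are invented purely for illustration and do not come from any study reviewed in this report; the function names and the tolerance threshold are likewise hypothetical. The sketch shows that two estimates can both fail to reject the null hypothesis of no effect while only one of them is contained within a tight confidence interval.

```python
from typing import Tuple

Z_95 = 1.96  # normal critical value for an approximate 95 percent interval


def confidence_interval(estimate: float, std_err: float, z: float = Z_95) -> Tuple[float, float]:
    """Return an approximate normal-theory confidence interval."""
    half_width = z * std_err
    return (estimate - half_width, estimate + half_width)


def rejects_no_effect(estimate: float, std_err: float) -> bool:
    """True if the interval excludes zero, i.e. the null of no effect is rejected."""
    low, high = confidence_interval(estimate, std_err)
    return not (low <= 0.0 <= high)


def interval_is_tight(estimate: float, std_err: float, tolerance: float) -> bool:
    """True only if the whole interval lies within +/- tolerance of zero."""
    low, high = confidence_interval(estimate, std_err)
    return -tolerance <= low and high <= tolerance


# Hypothetical estimates (change in homicides per 100,000), not real results.
imprecise = {"estimate": -0.5, "std_err": 2.0}
precise = {"estimate": -0.1, "std_err": 0.2}

if __name__ == "__main__":
    for label, est in (("imprecise", imprecise), ("precise", precise)):
        print(label, confidence_interval(**est), rejects_no_effect(**est),
              interval_is_tight(**est, tolerance=1.0))
```

Neither hypothetical estimate rejects the null, but the imprecise one is compatible with sizable effects in either direction. Precision alone is not sufficient: as stated above, a tight interval is informative only if the underlying research design is sound.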
Our mandate was not to assess whether competing hypotheses about the existence of marginal deterrence from capital punishment are plausible, but simply to assess whether the empirical studies that we have reviewed provide scientifically valid evidence. In its deliberations and in this report, the committee has made a concerted effort not to approach this question with a prior assumption about deterrence. Having reviewed the research that purports to provide useful evidence for or against the hypothesis that the death penalty affects homicide rates, we conclude that it does not provide such evidence.
We stress, however, as noted above, that a lack of evidence is not evidence for or against the hypothesis. Hence, the committee does not construe its conclusion that the existing studies are uninformative as favoring one side or the other side in the long-standing societal debate about deterrence and the death penalty.
In this chapter, we elaborate on these deficiencies that form the basis for this conclusion and cautiously offer some ideas on potential remedies. With regard to remedies, our report provides a somewhat less pessimistic perspective than did the earlier National Research Council (1978, p. 63) report: “[T]he Panel considers that research on this topic is not likely to produce findings that will or should have much influence on policymakers.”
The committee does not expect that advances in collecting data on sanction regimes and obtaining knowledge of sanctions risk perceptions will come quickly or easily. However, data collection on the noncapital component of the sanction regime need not be entirely complete to be useful. And even if research on perceptions of the risk of capital punishment cannot resolve all major issues, some progress would be an important step forward. Even if these advances prove unsuccessful in providing useful information on the incremental deterrent effect of capital punishment in relation to a lengthy prison sentence, the committee believes that there are potentially major benefits from new data collection, theory, and methodology for study of the effect of noncapital sanctions on crimes not subject to the death penalty. As discussed in Chapter 1, because of the overlap in the methods and data used in studies of capital punishment and in broader studies on the effects of sanctions on crime, our charge included a provision for recommending research that might advance that broader research literature, and we do so in the rest of this chapter.
Incomplete and inaccurate data have marred research on the effect of capital punishment on homicides. The most important data problem is that studies have been based on a very incomplete specification of state sanction regimes. Part of the difficulty has been lack of conceptual agreement on how to measure the intensity of use of capital punishment. However, we see the primary problem as a complete absence of data on the noncapital sanctions that might be applied to offenders convicted of homicide. A study of capital punishment in North Carolina by Cook (2009) illustrates the importance of the problem of the absence of information on noncapital sanctions. Of 274 cases prosecuted as capital cases, only 11 resulted in a death sentence. Another 42 resulted in dismissal or a verdict of not guilty, which left 221 cases that resulted in convictions and received noncapital sanctions.
As discussed at length in Chapter 4 and below, there are sound reasons for predicting a correlation between the capital and noncapital components of a state’s sanction regime. Two examples of how this might occur are the plea bargaining leverage that the threat of capital punishment may afford prosecutors and the influence of the state’s political culture on the legislated design and administration of both the capital and noncapital components of the regime. Such a correlation would bias the estimated deterrent effect of capital punishment.
None of the studies we reviewed sought to measure the availability and intensity of use of the noncapital sanction alternatives for the punishment of homicide. Such alternatives may include a life sentence without the possibility of parole, a life sentence with the possibility of parole, and sentences of less than life. It would also be important to have data on the time actually served for convicted murderers who are paroled or who serve less than a life sentence.
It is currently not possible to measure noncapital sanction alternatives at the state level because the required data are not available. The data that are available include those from the Bureau of Justice Statistics (BJS), which publishes nationwide statistics on sentences for prison admissions and time served for prison releases, based on data collected as part of the National Corrections Reporting Program (NCRP) initiated in the early 1980s. More than 40 states now report annual data on sentences for admissions and time served for releases. Individual-level demographic characteristics are also reported. In principle, these data could be used to measure the actual administration of the legally authorized dimensions of most state sanction regimes, not only for murder but also for other types of crimes. The difficulty is that the data are often extremely incomplete.
In some years, states fail to report any data. Just as important, the data that are sent to BJS are often so incomplete that it is impossible to
construct valid state-level measures of the administration of the sanction regime. Indeed, the committee attempted to use these data for the purposes of this report but concluded that the data gaps made their use infeasible. More complete data on the actual administration of sanction regimes might be obtained by expanding the NCRP to include all 50 states and filling the data gaps due to incomplete reporting. Alternatively, an entirely new data collection system might be desirable. Either way, the collection of more complete data on sanction regimes for murder and other crimes is feasible. The data are available: the challenge is designing and implementing an effective system for their collection.
Even if data on the actual administration of state sanction regimes were complete, they could only be used to measure how sanction regimes are actually administered. The data do not specify the potential sanction regime in a state—the range of sanction alternatives that are legally authorized. We are not aware of any ongoing effort to assemble data on the legislated sanction regimes of the states for murder and other crimes. Data on the legislated regime are important because they define the range of penalties that can potentially be imposed. Thus, the measurement of legally authorized sanctions by the states for homicides and other crimes may require a new data collection system.
The committee did not explore the benefits and costs of alternative approaches for measurement of state-level sanction regimes for murder. We only emphasize the vital importance of collecting these data.
RECOMMENDATION: The committee recommends that a concerted effort be made to collect data on the sanctions regimes faced by potential murderers, with particular attention to fixing the current absence of data on noncapital sanctions.
As noted above, because the methods and data used to study the effect of noncapital sanctions on crimes other than murder are similar to those used in research on capital punishment, the committee’s charge includes a provision that we make recommendations for advancing research on the broad effects of sanctions on crime. Thus, we also stress the vital importance of an expanded effort to collect data suitable not only for measuring sanction regimes for murder, but also for measuring sanction regimes for other major crimes.
As emphasized in Chapter 3, it is not possible to interpret empirical evidence on the relationship of homicide rates to sanctions without understanding how potential murderers perceive sanction regimes. The committee’s review of the time-series and panel studies identified fundamental deficiencies in this regard.
In the case of the time-series studies, none of them explicitly articulates a model of sanction risk perceptions. The studies are silent on whether execution events and their frequency alter perceptions of sanction regimes. Moreover, the studies do not ask whether the trend lines specified by researchers correspond to the trend line (if any) perceived by potential murderers.
Panel studies typically suppose that people who are contemplating murder perceive sanctions risks as subjective probabilities of arrest, conviction, and execution. Lacking data on these subjective probabilities, researchers presume that they are somehow based on the observable frequencies of arrest, conviction, and execution.
The fundamental problem is that perceptions of the risk of sanction are subjective, but researchers have no direct measurements of the perceptions of potential murderers. In the absence of data on risk perceptions, the research practice in the panel studies has been to use publicly available data on homicides and executions to construct statistics that purport to measure the objective risk of execution. Then, having done that, many researchers assume that potential murderers have “rational expectations.” The word “rational” suggests that potential murderers carefully assess the risk of execution. What “rational expectations” actually means in practice is that researchers construct their own measures of execution risk and assume that potential murderers perceive the risk in the same way. However, the assumption of rational expectations of execution risk has no empirical foundation. Indeed, it hardly seems credible.
In Chapter 4, we discuss in detail the complications of calculating the objective risk of execution. One of these complications is that only 15 percent of individuals sentenced to death have actually been executed (since the resumption of the death penalty in 1976) and that a large fraction of death sentences are subsequently reversed. Another complication is that the volume of data on death sentences and executions available for forming perceptions depends on the size of the state. By various measures of execution risk, Delaware was at least as aggressive as Texas in its use of the death penalty. However, over the period 1976 to 2000, Delaware sentenced 28 people to death and carried out 11 executions, while Texas sentenced 753 people to death and carried out 231 executions. Still another complication is that sanction regimes are not stable due to changes in a state’s political leadership, moratoriums on executions, and legal decisions. Yet another complication is that there are within-state differences in the risk of execution due to differences across counties in prosecutorial vigor in the use of the death penalty and local differences in receptivity to its application.
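The Delaware and Texas comparison can be illustrated with a short calculation. The sketch below uses only the counts quoted above and, as an assumption made here for illustration, takes executions per death sentence as one crude measure of execution risk; Chapter 4 discusses why any such measure is problematic. The binomial standard error treats each death sentence as an independent trial, which is a simplification, and the function names are hypothetical.

```python
import math

# Counts for 1976 to 2000 as quoted in the text.
STATES = {
    "Delaware": {"death_sentences": 28, "executions": 11},
    "Texas": {"death_sentences": 753, "executions": 231},
}


def execution_share(executions: int, death_sentences: int) -> float:
    """Executions per death sentence (one crude risk measure, assumed here)."""
    return executions / death_sentences


def binomial_std_err(share: float, n: int) -> float:
    """Standard error of a proportion under a simple binomial model."""
    return math.sqrt(share * (1.0 - share) / n)


def summarize(state: str) -> dict:
    """Return the share and its rough standard error for one state."""
    counts = STATES[state]
    share = execution_share(counts["executions"], counts["death_sentences"])
    return {"share": share, "std_err": binomial_std_err(share, counts["death_sentences"])}


if __name__ == "__main__":
    for name in STATES:
        result = summarize(name)
        print(f"{name}: share={result['share']:.2f}, std_err={result['std_err']:.2f}")
```

On this crude measure the shares are roughly 0.39 for Delaware and 0.31 for Texas, but the Delaware figure rests on so few cases that its standard error is several times larger. An observer trying to infer risk from Delaware's record would have far less information to go on.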
These many complications make clear that even with a concerted effort
by careful, conscientious researchers to assemble and analyze relevant data on death sentences and executions, assessment of the evolving objective risk of execution facing a potential murderer is a daunting challenge. It is also clear that perceptions of this risk among potential murderers must at best be highly impressionistic. To make headway on whether and to what degree the death penalty affects the behavior of potential murderers, it is imperative to have knowledge about how their perceptions of execution risk are formed and then possibly revised on the basis of new information.
RECOMMENDATION: The committee strongly recommends that a concerted effort be made to research the origins and nature of perceptions of execution risk specifically and of noncapital sanction risks more broadly.
The essential task is to measure the perceptions of sanctions risks that potential murderers actually hold. How might this be done?
One possibility is to take seriously the presumption in the panel studies that people who are contemplating murder perceive sanctions risks as subjective probabilities of arrest, conviction, and execution. This possibility suggests that the risk perceptions of potential murderers be measured probabilistically.
Researchers have developed considerable experience measuring beliefs probabilistically in broad population surveys. Manski (2004) reviews the history in several disciplines, describes the emergence of the modern literature, summarizes applications, and discusses open issues. Among the major U.S. platforms for collection of such data, the Health and Retirement Study (HRS) has periodically elicited probabilistic expectations of retirement, bequests, and mortality from multiple cohorts of older Americans (see, e.g., Hurd and McGarry, 1995, 2002; Hurd, Smith, and Zissimopoulos, 2004). The Survey of Economic Expectations (SEE) has asked repeated population cross sections to state the percent chance that they will lose their jobs, have health insurance, or be victims of crime in the year ahead (see, e.g., Dominitz and Manski, 1997; Manski and Straub, 2000). The National Longitudinal Survey of Youth 1997 has periodically asked young people about the chance that they will become a parent, be arrested, or complete schooling (see, e.g., Fischhoff et al., 2000; Lochner, 2007). Examples of victimization and arrest questions include, “What do you think is the percent chance that your home will be burglarized in the next year?” “What do you think is the percent chance that you will be arrested in the next year?” Researchers have learned from these and other surveys that most people have little difficulty, once the concept is introduced, in using
subjective probabilities to express the likelihood they place on future events relevant to their lives.
However, success in measuring beliefs probabilistically within the general public does not imply that survey research could similarly measure the sanction risk perceptions of potential murderers. A major issue when initiating a study of this type is to obtain data from the relevant population, in this case, the population of potential murderers. Theoretically, most people who would be legally eligible to be executed (e.g., are not juveniles or of very low intelligence) are also physically capable of committing a murder and thereby are potential murderers. The reality, however, is that the probability of most people committing a murder is so small that as a practical matter it can be treated as zero. Even the probabilities of people committing other serious crimes, such as robbery and burglary, while likely greater, are still extremely small. Thus, when using the term “potential murderer,” one means that part of the population with a non-negligible risk of committing murder.
Thus, the first step and an important prerequisite for a program of research on sanction risk perceptions is to define the relevant population of potential murderers and, more generally, potential criminals. Such a definition will be required to devise cost-effective sampling strategies for interviewing people with nontrivial risks of committing crimes. We expect that one important segment of the relevant population is people with criminal records. The correlation between past and future offending is among the best documented empirical regularities in criminology (National Research Council, 1986; West and Farrington, 1973; Wolfgang, 1958). In the case of murder, for example, Cook, Ludwig, and Braga (2005) found that 43 percent of murderers in Illinois had a felony conviction.
Some may question the feasibility of collecting data on the sanction risk perceptions and criminal behavior of individuals with prior histories of serious crimes, especially if subjects are repeatedly interviewed for the purpose of obtaining longitudinal data. Longitudinal data are useful to study how offending experience and external events, such as police crackdowns or policy changes, affect sanction risk perceptions. However, experience demonstrates that, with sufficient diligence, it is feasible to collect longitudinal data on highly crime-prone people.
A leading example is the Pathways to Desistance Project (Mulvey, 2011), a two-site longitudinal study of desistance from crime among serious adolescent offenders. The project recruited 1,354 adolescents from the Philadelphia and Phoenix juvenile and adult court systems who had been adjudicated as delinquent or found guilty of a serious felony and were 14 to 17 years old at the time that they committed the offense. For the first 3 years of the study, interviews were conducted at 6-month intervals and for the next 4 years the interviews were annual. The retention rate was quite
high, with 87 percent of the subjects interviewed in at least 8 of the 10 interview cycles. Respondents were asked about their perceptions of sanctions risks, among other things. The success of this project indicates that collection of data on sanction risk perceptions from crime-prone populations is feasible with a sustained commitment among a cadre of researchers and with the availability of funding.
Apel (in press) reviews the existing research that measures perceptions of sanction risks. Although there have been a scattering of suggestive studies, there has not yet been systematic large-scale research on the subject. Moreover, there has been no research at all on the specific question of perceptions of the sanction risk associated with commission of murder.
With so much to learn, we think it prudent for research to proceed sequentially. A good beginning would be small-scale studies that include one-on-one cognitive interviews with respondents in the relevant population of potential murderers. These interviews, taking the form of structured conversations, would explore the feasibility and usefulness of probabilistic and other modes of questioning about sanction risk perception. The lessons learned from this exploratory research would inform the design of larger studies, the aim being to eventually develop a program of survey research that would regularly measure the perceptions of the sanction risk held by potential murderers and by potential criminals more generally.
The committee is not confident that measurement of the sanctions risk perceptions of potential murderers can succeed in producing information useful to the study of deterrence, but one cannot be sure unless the effort is made. As demonstrated by the discussion in Chapters 4 and 5, the alternative of continuing to make unfounded assumptions about these perceptions is not useful. Measurement of sanction risk perceptions may enable deterrence research to make progress that thus far has not been possible in the absence of data.
The committee is more optimistic about the feasibility and usefulness of measuring perceptions of sanctions risks among potential criminals more broadly. This greater optimism has two bases. First, homicide is the least frequent of the crimes included in the “Part 1-Crime Index” of the Federal Bureau of Investigation (FBI), which also includes rape, robbery, aggravated assault, burglary, larceny, and auto theft. More people commit each of the other crimes than commit homicide. Thus, it will probably be easier to survey sizable numbers of potential perpetrators of these crimes than of potential murderers. The National Longitudinal Survey of Youth, for example, already surveys youth and young adults about their involvement in such crimes as theft, selling drugs, and assault.
Second, perpetrators who are apprehended for crimes less serious than murder are far less likely to receive lengthy prison sentences, particularly if they are juveniles. Thus, these people have more opportunity to learn about
sanction risk on the basis of personal experience, a source of information that may be vital to formation of sanction risk perceptions.
As a complement to research that directly measures perceptions, some committee members believe that study of homicide rates immediately following execution events might also provide useful evidence of the perceptions of potential murderers. As discussed in Chapter 5, the time-series research has largely been devoted to the question of whether homicide rates change in the immediate aftermath of an execution. For the reasons detailed in that chapter, the committee concluded that existing studies were not informative about whether capital punishment affects homicide rates, in part because of the absence of any measure of perceptions.
The committee considered at length whether future research on execution events, if properly conducted, might be informative about whether homicide rates, at least in the short term, are responsive to execution events. We concluded that at best the information to be gleaned from this type of research would be limited and fall far short of establishing whether capital punishment increases, decreases, or has no effect on homicide rates. Even if a short-term impact could be established, it would be difficult to determine whether homicides were actually prevented or simply displaced in time. More fundamentally, execution event studies cannot speak to the question of whether and how the state’s overall sanction regime affects the homicide rate. For example, a null finding from an event study would leave open the possibility that a death penalty regime had a deterrent effect relative to a regime that precluded the death penalty or more narrowly prescribed its applicability. It is important to note that any one execution would only have a deterrent effect if it changed potential murderers’ perceptions of the likelihood of an execution, which is not necessarily the case.
Acknowledging these limitations, some committee members nonetheless argue that if a well-done event study did produce evidence of an effect—whether positive or negative and no matter how temporary—that result would be of considerable interest. It would demonstrate that potential murderers as a group are actually paying attention to the state’s actions and are influenced by them. In short, it would confirm a threshold condition for there to be a deterrent or brutalization effect and invite further inquiry. Other committee members are not convinced of the value of establishing this threshold condition or are not convinced that any study of this sort could make a convincing case that it had isolated a causal effect of executions.
Even with better data and information on sanction regimes and perceptions of sanction risks, formidable difficulties remain to understanding the impact of the death penalty on homicide. With only observational (nonexperimental) data on capital punishment and homicides, researchers must face the fundamental problem that the data alone cannot answer the counterfactual question of interest: What would have happened if the death penalty had not been applied in a “treatment” state or if the death penalty had been applied in a “control” state? Although this counterfactual-outcomes problem is common to all observational studies of cause and effect, it has long been understood to be particularly problematic for understanding the deterrent effect of the death penalty. A capital punishment regime evolves over time as a result, among other things, of a complex interplay of crime trends, social norms, criminal justice budgets, and election results. This context makes it very difficult to identify the effects of the capital sanction regime alone.
To better understand these issues, we highlight three related identification problems that complicate efforts to draw credible inferences on the effect of capital punishment on homicides. The first, referred to as a feedback effect, arises when homicide rates may directly affect the capital sanction regime. The second, referred to as the omitted variable problem, arises when variables that are jointly associated with the sanction regime and homicide rate are either unknown or unobserved. The third, referred to as an equilibrium effect, arises when the capital sanction regime may directly affect other aspects of the criminal justice system, including, most notably, noncapital sanction policies.
Deterrence research conducted in the early 1970s (Carr-Hill and Stern, 1973; Ehrlich, 1975; Sjoquist, 1973) recognized the possibility of feedbacks or simultaneity whereby crime rates may affect the sanction risk and severity even as the sanction risk and severity may affect crime rates. The nature of such feedbacks is not well understood, but there are good reasons for believing that feedbacks are present and may be substantial.
To illustrate the problem, suppose that in a particular state during a particular year there is an exogenous increase in the rate of homicide. If, given the additional workload and resulting strain on resources, district attorneys were more reluctant to pursue the death penalty, a continuing upward trend in homicides would appear to show that a reduction in the probability of a death sentence is associated with an increase in
homicides—a result compatible with a deterrent effect. But suppose instead that the upward trend in homicides resulted in greater public concern about violence and hence a greater willingness on the part of juries in capital cases to choose a death sentence rather than a life sentence. A continuing upward trend in homicides would then appear to show that an increase in capital sanctions is associated with an increase in homicide, a result compatible with a “brutalization” effect. In both these scenarios, the important fact is that the homicide trends influenced the sanction regime. These particular feedbacks are hypothetical, and indeed the very presence of feedbacks has yet to be documented. Still, there are plausible reasons for believing that feedbacks are present and possibly substantial in magnitude. If so, they increase the difficulty of identifying deterrent effects.
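The two hypothetical scenarios above can be mimicked in a small simulation. This is a stylized sketch, not a model of any actual jurisdiction: the homicide rate is drawn independently of the sanction regime (so the true deterrent effect is zero by construction), and a sanction intensity index responds to the homicide rate with a feedback coefficient whose sign distinguishes the two scenarios. Year-to-year fluctuations stand in for the trends described in the text, and all parameter values and function names are assumptions chosen for illustration.

```python
import random
from typing import List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(feedback: float, years: int = 200, seed: int = 0) -> float:
    """Correlation between sanction intensity and homicide when sanctions have NO effect.

    feedback < 0: strained prosecutors pursue death sentences less often.
    feedback > 0: alarmed juries impose death sentences more often.
    """
    rng = random.Random(seed)
    homicide, sanction = [], []
    for _ in range(years):
        h = 6.0 + rng.gauss(0.0, 1.0)  # exogenous; does not depend on sanctions
        s = 0.05 + feedback * (h - 6.0) + rng.gauss(0.0, 0.005)
        homicide.append(h)
        sanction.append(s)
    return pearson(sanction, homicide)


if __name__ == "__main__":
    print("strained resources:", simulate(-0.01))
    print("public concern:", simulate(0.01))
    print("no feedback:", simulate(0.0))
```

With negative feedback the data show a negative association that could be misread as deterrence; with positive feedback they show a positive association that could be misread as brutalization. In both cases the association arises entirely from homicide influencing the sanction regime.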
The second and related problem arises when unobserved changes in the social, political, and economic environment may have an impact on both capital sanctions and other aspects of the sanction regime. For example, a political shift that results in the election of “law and order” legislators may increase criminal justice resources and produce a broad shift toward greater severity in sentencing, with some effects on the homicide rate. In this case, changes in the capital sanction regime may be spuriously related to the changes in the homicide rate through the associated changes in the noncapital sanction regime. If variables that are jointly associated with the sanction regime and homicide rates are omitted from statistical models of the effect of capital punishment on homicide, then estimates of the deterrent effect will be biased.
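A stylized simulation can show how such an omitted variable produces a spurious estimate. In the sketch below, an unobserved political climate raises both capital and noncapital sanction intensity, and homicide responds only to the noncapital component, so the true effect of capital sanctions is zero by construction. All coefficients, variable names, and function names are assumptions made for illustration; nothing here is estimated from real data.

```python
import random
from typing import List, Tuple


def ols_slope(x: List[float], y: List[float]) -> float:
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = sum((a - mean_x) ** 2 for a in x)
    return num / den


def residuals(x: List[float], y: List[float]) -> List[float]:
    """Residuals of y after a simple linear regression on x."""
    slope = ols_slope(x, y)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return [b - mean_y - slope * (a - mean_x) for a, b in zip(x, y)]


def simulate(n_obs: int = 2000, seed: int = 1) -> Tuple[float, float]:
    """Return (naive, adjusted) estimates of the capital-sanction effect."""
    rng = random.Random(seed)
    capital, noncapital, homicide = [], [], []
    for _ in range(n_obs):
        climate = rng.gauss(0.0, 1.0)  # unobserved "law and order" shift
        c = climate + rng.gauss(0.0, 1.0)  # capital sanction intensity
        s = climate + rng.gauss(0.0, 1.0)  # noncapital sanction severity
        h = 10.0 - 1.0 * s + 0.0 * c + rng.gauss(0.0, 1.0)  # true effect of c is 0
        capital.append(c)
        noncapital.append(s)
        homicide.append(h)
    naive = ols_slope(capital, homicide)  # omits the noncapital regime
    # Partial out noncapital severity from both variables, then regress.
    adjusted = ols_slope(residuals(noncapital, capital), residuals(noncapital, homicide))
    return naive, adjusted


if __name__ == "__main__":
    print(simulate())
```

The naive regression attributes to capital sanctions an apparent deterrent effect that belongs to the noncapital regime, while the adjusted estimate is close to zero. The adjustment is possible here only because noncapital severity is observed in the simulation; as discussed above, such data are not currently available for actual state sanction regimes.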
The panel research includes studies that recognize and attempt to address the inferential consequences of feedback effects and omitted variable problems. As discussed in Chapter 4, these attempts have not been successful in advancing plausible identification strategies for these problems. In particular, the instrumental variables used in these analyses do not plausibly meet criteria for a valid instrument. The two key criteria are that (1) on average, sanction levels vary as a function of the instrumental variable but (2) on average, the crime rate at a given sanction level does not vary as a function of the instrumental variable. In Chapter 4, we argue that the instrumental variables used in the studies do not meet the second test. This criticism echoes the conclusions of the earlier National Research Council report (1978). Thus, the same elementary error in identification is being made in contemporary research on the deterrent effect of capital punishment that was made decades ago by early deterrence researchers.
We now turn to a third causal process that makes identification problematic, one that has been largely ignored in the research yet is of unique salience to studying the deterrent effect of capital punishment. For capital punishment, changes in the probability of capital sanctions may cause changes to other aspects of the sanctions regime. To illustrate the problem, consider two examples. First, a district attorney who can credibly threaten an accused homicide defendant with the death penalty may have greater bargaining leverage than one who lacks this threat; as a result, the defendants in the former situation may be more willing to plead guilty to first-degree murder with an agreement that their sentence will be life imprisonment rather than death (Cook, 2009; Kuziemko, 2006). Thus, a district attorney who is willing to devote resources to capital prosecutions may end up achieving more severe noncapital sentences, and the two types of sentences are intrinsically linked.
There may also be a negative linkage, if, for example, a district attorney’s proclivity to seek the death penalty in homicide cases comes at the cost of reduced prosecutorial resources available for other cases.1 Due to resource constraints and the additional costs of prosecuting capital murder cases rather than noncapital murder cases, emphasis on capital cases may diminish prosecutorial effectiveness in noncapital cases. The result in that situation may be that the more intense capital regime is achieved at the cost of reduced sentencing (and more dismissals) for the majority of homicide cases that are not capital. These potential links between capital and noncapital sentences make it difficult to isolate the deterrent effect of the threat of execution for homicide.
The equilibrium process, whereby capital and noncapital sanction policies are jointly related and jointly influence the outcome of interest, poses a qualitatively different challenge to identification than the first two. In principle, if the probability that a homicide case would result in a death sentence was randomly assigned across jurisdictions, then the identification problems resulting from feedbacks or omitted variables (discussed above) would be solved. What would remain, however, is the potential difficulty in isolating the deterrent effect of the death penalty by itself from the changes in the overall sanction regime that are influenced by the availability and use of the death penalty.
1 Numerous studies have documented that the prosecution of capital homicide cases is far more costly than noncapital homicide cases: see, for example, Roman, Chalfin, and Knight (2009) in Maryland; Cook (2009) in North Carolina; and Alarcón and Mitchell (2011) and the California Commission on the Fair Administration of Justice (2008) in California. Due to resource constraints, emphasis on capital cases may diminish prosecutorial effectiveness in noncapital cases.
Knowledge of the entire system, however, is not a necessary requirement for learning about the overall impact of the capital sanction regime. For some questions, the effects of the death penalty on sentence bargaining and on administrative resource constraints are an intrinsic part of the mechanism by which a capital regime affects murder rates. Consider, for example, a case in which a judicial ruling terminates the use of the death penalty for some category of homicides. It would be of considerable interest to have a reliable estimate of the overall effect of this reform on the murder rate, even if it is not possible to distinguish among the various mechanisms (reduction in the probability of a death sentence, weaker bargaining position by the district attorney, or increased court resources available for the average case) that led to that effect. Still, this sort of “black box” estimate is not satisfactory if the goal is to estimate the effect of the threat of execution, in part because the ancillary effects of the administration of the death penalty can be generated by other means, such as changes in court budgets.
Is a more reliable approach to identifying the deterrent effect of capital punishment possible? Part of the solution may be to develop a better understanding of the factors that affect sanction regimes, including possible feedbacks from homicide or other crime patterns. The earlier National Research Council report (1978, p. 47) observed: “Knowledge of the effect of crime on the behavior of the criminal justice system is still extremely limited.” This conclusion is still true today, 30 years later. The 1978 report went on to observe: “While the seeming dearth of untainted identification restrictions may reflect the fact that none exist, it is certainly as likely that it simply reflects our ignorance of the determinants of sanctions” (p. 48). Three decades later this committee observes that both of these assessments apply to contemporary research on deterrence.
As noted above, the 1978 report urged more research on the sanction-generation process for the purpose of accumulating a knowledge base that might reveal approaches to plausible identification. Although knowledge of the sanction-generation process is not required for identification of overall effects of certain relevant regime changes, that knowledge may be useful in determining the validity of a proposed identification method. Also, as a practical matter, some committee members believe that without better knowledge of sanction generation, the prospects for credible identification are small. Committee members holding this perspective argue that a deeper institutional and theoretical knowledge of the sanction process would materially increase the chances of researchers’ becoming aware of credible sources of identification and that without such knowledge the chances for credible identification are remote. Other committee members are less pessimistic, believing that a chance event or insight might provide a basis for credible identification.
However credible identification might ultimately be achieved, the committee fully endorses another observation from the earlier report (National Research Council, 1978, p. 49):
It must be noted, however, that identification restrictions cannot be manufactured. If the process generating the data is truly one that leaves the crime function unidentified, then persistent attempts to produce identifying restrictions because of the desire to estimate the deterrent effect will only produce different kinds of error. Even if all such attempts found a “deterrent” effect, no conclusion would be warranted unless some of them used validly based identification restrictions.
The persistent problems that researchers have had in providing meaningful answers about the deterrent effect of capital punishment are unsurprising once one recognizes that this body of empirical research rests on strong and unverified assumptions. Although, in practice, researchers often recognize and acknowledge that their assumptions may not hold, the assumptions are defended as necessary to provide meaningful answers and to make inferences. But the use of strong assumptions hides the problem that very little is understood about the process that may link capital punishment to future crimes.
The different findings in the deterrence research reflect different choices of assumptions, most of which cannot be supported by strong a priori justifications. As documented throughout this report, many of the assumptions used in the research on the deterrent effect of capital punishment are not credible. Furthermore, the state of social science knowledge does not support a unique model that can be used to identify the effects of capital punishment under the current U.S. sanction regime or to permit the evaluation of deterrence under alternative regimes. The study of deterrence is plagued by model uncertainty.
The failure of the existing research to address the issue of model uncertainty is evident in the debate initiated by Donohue and Wolfers (2005), who challenged claims of deterrence by a broad set of researchers. Much of their challenge involved demonstrations of how small changes in the models used in the various studies led to very different estimates of deterrence effects, in some cases changing from positive to negative or vice versa, and in others eliminating statistical significance. Some of their exercises altered the set of observations over which the analysis had been conducted; in other cases they changed the choice of control or instrumental variables.
Although Donohue and Wolfers provide useful evidence of the sensitivity of many claims of deterrence to model assumptions, their demonstration raises the question of how to adjudicate their findings relative to the papers they critique. This may be seen in two of the rejoinders that have been written to their study. Dezhbakhsh and Rubin (2011) and Mocan and Gittings (2010) provide a large number of modifications of their baseline homicide regressions and argue that deterrence effects generally appear in them. However, they fail to provide any guidance as to what is learned from the specifications that are inconsistent with their claim of evidence of deterrence. Rather, the authors’ claims are based on ad hoc choices of alternative model specifications; there is no systematic construction of the models from which to draw inferences. That changes in a given statistical model change the output of the model is hardly unique to the literature on capital punishment and deterrence. The problem is that there have been almost no serious attempts to reconcile the many different findings reported in the research.
Given this existing uncertainty, how might research proceed? Certainly, research aimed at reducing model uncertainty would be useful. To that end, the committee proposed, above, developing data and research on sanction regimes and perceptions of sanction risk. Another complementary and potentially useful approach would be to explicitly account for model uncertainty when drawing inferences on the impact of capital punishment. Rather than continue with the conventional practice of assuming whatever it takes to achieve point identification, and then providing ad hoc justifications for the particular set of assumptions behind a given model, deterrence studies might instead consider what can be learned when explicitly recognizing model uncertainty. Although the resulting inferences may reflect a certain degree of ambiguity about the effects of capital punishment on homicides, those inferences will necessarily possess greater credibility.
To explore the idea of addressing model uncertainty, the committee commissioned papers illustrating application of two complementary research paradigms—the model averaging approach and the partial identification approach.
Model averaging, though based on earlier work (Bates and Granger, 1969; Leamer, 1978), developed theoretically, algorithmically, and as an applied technique in the mid-1990s (examples include Chatfield, 1995; Draper, 1995; Draper et al., 1993; Raftery, Madigan, and Hoeting, 1997). The model averaging approach constructs a probability distribution for a range of estimates of the deterrent effect of capital punishment, and the researcher constructs this distribution to reflect the researcher’s own or other experts’ prior beliefs about the probability that a given model is valid. By asking what can be learned by combining the information obtained across a wide range of models, model averaging methods provide a natural way to make empirical claims robust to the details of uncertain model specifications.
This technique has recently been used in two studies of capital punishment: Cohen-Cole et al. (2009) and Durlauf, Fu, and Navarro (in press). These studies apply the model averaging approach to various specifications that have appeared in the research on capital punishment and deterrence. Cohen-Cole et al. (2009) use this method to adjudicate the different findings of Dezhbakhsh, Rubin, and Shepherd (2003) and Donohue and Wolfers (2005). Durlauf, Fu, and Navarro (in press), whose paper was written for this committee, consider a range of models based on alternative substantive assumptions that have appeared in the research, including, for example, how to measure subjective arrest, sentencing, and execution probabilities and whether the deterrent effect of capital punishment differs across states. These two papers aim to understand how different assumptions matter and whether differences in assumptions render deterrence estimates fragile. In both papers, the researchers find that model uncertainty swamps the informational content about deterrent effects. That is, after accounting for the modeling uncertainty, the empirical evidence does not reveal whether capital punishment increases or decreases homicides.
As an example of this result, consider the Cohen-Cole et al. (2009) analysis of the models in Dezhbakhsh, Rubin, and Shepherd (2003) and Donohue and Wolfers (2005). Dezhbakhsh, Rubin, and Shepherd (2003) report, under their preferred specification, a statistically significant point estimate of 18 lives saved for each execution. However, when all of the different specifications spanned in the two papers are given probability weights, Cohen-Cole et al. estimate an approximate 95 percent confidence interval on the number of lives saved per execution of [–24, 124]: see Figure 6-1, which is from Cohen-Cole et al. The figure illustrates the model uncertainty by providing a weighted histogram of the estimated net lives saved for all of the models considered. For the case illustrated in this histogram, the posterior probability for the models with point estimates suggesting deterrence is 72 percent, but there is substantial bunching around 0, the individual estimates vary widely, and there is a nontrivial probability on models that suggest a large increase in homicides associated with executions (a probability of 0.15 of point estimates of 20 or more homicides). Thus, the heterogeneity of the model-specific estimates makes it impossible to draw strong qualitative conclusions about the deterrent effect of capital punishment.
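The mechanics of weighting model-specific estimates can be sketched in a few lines. The estimates and weights below are hypothetical placeholders, not the values in Cohen-Cole et al. (2009); the function name and the sign convention (positive values mean net lives saved per execution) are likewise assumptions made for illustration.

```python
def model_average(estimates, weights):
    """Combine model-specific estimates using model probabilities.

    estimates: one point estimate per model (net lives saved per execution).
    weights:   nonnegative model weights; they are normalized to sum to 1.
    Returns the weighted mean and the total probability on models whose
    point estimate is positive (that is, suggests deterrence).
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(p * e for p, e in zip(probs, estimates))
    prob_deterrence = sum(p for p, e in zip(probs, estimates) if e > 0)
    return mean, prob_deterrence

# Hypothetical model space: five specifications with illustrative weights.
estimates = [18.0, 60.0, 124.0, 0.0, -24.0]
weights = [0.40, 0.15, 0.10, 0.20, 0.15]

mean_effect, prob_deterrence = model_average(estimates, weights)
```

With these placeholder inputs the averaged estimate is positive, yet a substantial share of the probability sits on models showing no effect or an increase in homicides. A single summary number therefore conceals the spread that the weighted histogram is meant to display.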
FIGURE 6-1 Weighted histogram of the net lives saved by the death penalty.
NOTES: The figure includes models for each of the DRS (Dezhbakhsh, Rubin, and Shepherd, 2003) categories. The weights are the posterior model probabilities (Bayes factors). The DRS and DW (Donohue and Wolfers, 2005) lines correspond to the individual model from each with the largest and smallest number of lives saved, respectively. The unweighted histogram is similar.
SOURCE: Cohen-Cole et al. (2009, Figure 1). Used with permission.
The model averaging approach provides a formal and elegant Bayesian method for incorporating uncertainty about the correct modeling assumptions into inferential methods. This approach can be effectively used to illustrate the importance of different assumptions and the fragility of the estimates to these assumptions, as is done in Cohen-Cole et al. (2009) and Durlauf, Fu, and Navarro (in press). The approach depends on researchers’ specifications of the model space and prior over that model space, over which there may be disagreement. Such disagreement should not obscure an essential strength of the model averaging approach: model averaging provides an approach for systematically exploring sensitivity over an explicitly defined model space.
Ultimately, this approach might also be used to infer the effect of the death penalty on homicides. However, for this purpose, a key challenge would be selecting a set of models to include in the averaging and providing a prior probability distribution over this set that is plausible. The approach presumes that the range of models included in the averaging routine includes the correct model that accurately describes the real world and, moreover, that the researcher can provide informed prior beliefs about the probability that each model is valid. In the context of the research on capital punishment, we have found no reason to believe that the existing range of point-identified models includes the correct one, and there is currently little basis for assigning probabilities to the correctness of each model in the literature. As discussed in Chapter 4, the committee did not find the instrumental variables used in the existing research to be credible. If the existing models are all invalid, using the model averaging approach to produce interpretable deterrence estimates can be problematic.2 With uncertainty about the model space and the prior probabilities, either research efforts to construct informative priors or research showing the sensitivity of the posterior to different prior distributions may be useful.
Partial identification methods provide an alternative approach for reducing the dependence of claims of a deterrence effect on arbitrary assumptions. Rather than start with a particular set of point-identified models and prior beliefs about the probability that each model is valid, both as defined by the researcher, one might instead begin by directly considering what can be inferred under a set of weak assumptions that may possess greater credibility. A natural starting point, for example, is to examine what can be learned in the absence of any assumptions. What do the data alone reveal? Under these weaker assumptions, deterrent effects may not be point identified, but they will be partially identified, with bounds rather than point estimates. Thus, the partial identification approach formalizes the inherent tradeoff between the strength of the maintained assumptions and the credibility of inferences (see Manski, 2003).
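The question of what the data alone reveal has a standard answer when the outcome is known to lie in a bounded range: the worst-case bounds associated with Manski (1990). The sketch below illustrates that general construction, not the calculation in any study discussed here. The function name, the input values, and especially the outcome range `y_min` to `y_max` are assumptions introduced for illustration.

```python
def worst_case_ate_bounds(mean_treated, mean_untreated, share_treated,
                          y_min, y_max):
    """Worst-case bounds on an average treatment effect.

    Uses only observed group means, the treated share, and the assumption
    that every outcome lies in [y_min, y_max]. The unobserved counterfactual
    mean for each group is allowed to be anywhere in that range.
    """
    p = share_treated
    # Mean outcome if everyone were treated: known for the treated share,
    # bounded by [y_min, y_max] for the untreated share.
    lo_treated = mean_treated * p + y_min * (1 - p)
    hi_treated = mean_treated * p + y_max * (1 - p)
    # Mean outcome if no one were treated, by the same reasoning.
    lo_untreated = mean_untreated * (1 - p) + y_min * p
    hi_untreated = mean_untreated * (1 - p) + y_max * p
    return lo_treated - hi_untreated, hi_treated - lo_untreated

# Hypothetical inputs: homicide rates per 100,000 assumed to lie in [0, 20].
lower, upper = worst_case_ate_bounds(
    mean_treated=9.0, mean_untreated=7.0, share_treated=0.6,
    y_min=0.0, y_max=20.0,
)
```

Whatever the inputs, the interval produced this way has width equal to `y_max - y_min` and always contains zero, so the data alone cannot reveal the sign of the effect. Narrower bounds require additional assumptions, which is the tradeoff between assumption strength and credibility that the approach makes explicit.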
The partial identification methodology has been developed and applied over the past 20 years, beginning with Manski (1989, 1990). In an early application to criminal justice policy, Manski and Nagin (1998) studied sentencing and recidivism of juvenile offenders in the state of Utah and demonstrated how partial identification can be used to produce more credible inferences than had previously been produced. Youth in Utah faced a policy that gave judges the discretion to order varying sentences. Using this discretion, judges sentenced some offenders to residential confinement and sentenced other offenders to no confinement. A policy question of potential interest was to compare recidivism under that policy with the recidivism that would occur under a policy proposal that removed judicial discretion and instead mandated that all offenders be sentenced to confinement. The study showed how bounds of varying width on the treatment effect could be obtained by combining data on outcomes under the status quo, which allows judicial discretion, with relatively weak assumptions regarding the manner in which (1) judges have made sentencing decisions and (2) criminality was affected by sentencing.
2 The Cohen-Cole et al. (2009) exercise was narrow in that it considered the smallest model space one could generate around the different assumptions in Donohue and Wolfers (2005) and Dezhbakhsh, Rubin, and Shepherd (2003). One can easily argue that for a full model averaging analysis, other models warrant a priori consideration. However, one could also argue that some of the models considered in Cohen-Cole et al. should not have been included, given a prior probability of 0.
More recently, in a paper written for this committee, Manski and Pepper (in press) illustrate the partial identification approach in a relatively simple setting by examining the effect of death penalty statutes on the national homicide rate (per 100,000) over 2 years, 1975-1977: 1975 was the last full year of the federal moratorium on the death penalty, and 1977 was the first full year after the moratorium was lifted. In 1975, the death penalty was illegal throughout the country; and in 1977, 32 states had legal death penalty statutes. Over this 2-year period, homicide rates in the 32 states that had adopted a death penalty statute in 1977 decreased by 0.6; in the remaining states, the homicide rates decreased by 1.1. It has been common in the relevant research to report the difference-in-difference estimate, which in this case is 0.5 (–0.6 + 1.1), as a point estimate of the effect of capital punishment on the national homicide rate. This interpretation suggests that the death penalty increases crime, but Manski and Pepper (in press) show that this difference-in-difference form only point identifies the impact of the death penalty under a number of strong assumptions, most notably that the effect is assumed to be homogeneous across states and dates. Under weaker assumptions that allow the deterrent effect to vary across states, the average effect of the death penalty is only partially identified, and it was found to lie in the interval [–1.9, 8.3]. Under still weaker assumptions under which the effect of the death penalty is allowed to vary over time, the bounds widen further. Thus, under these weaker models, the average treatment effect of capital punishment is bounded, but the data do not identify whether the death penalty increases or decreases homicides.
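The arithmetic behind the point estimate, and the contrast with the reported bounds, can be restated as a short calculation. The numbers are those given in the text; the function and variable names are illustrative only, and the sketch does not reproduce how Manski and Pepper derive their interval.

```python
def difference_in_differences(change_treated, change_comparison):
    """Change in the treated group minus change in the comparison group."""
    return change_treated - change_comparison

# Changes in homicide rates per 100,000 between 1975 and 1977, from the text.
change_death_penalty_states = -0.6
change_other_states = -1.1

point_estimate = difference_in_differences(
    change_death_penalty_states, change_other_states
)

# Interval reported in the text under assumptions that let the effect vary
# across states.
reported_bounds = (-1.9, 8.3)
sign_is_identified = not (reported_bounds[0] < 0 < reported_bounds[1])
```

The point estimate of about 0.5 falls inside the reported interval, but because that interval spans zero, `sign_is_identified` is false: the weaker assumptions do not determine whether the effect is an increase or a decrease.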
The committee does not endorse the specific findings of the recent studies applying the model averaging or partial identification approaches. These studies are largely illustrative and do not address many of the key problems identified throughout this report. Most notably, they do not define the counterfactual sanction regime and do not address the issue of how potential murderers perceive sanction risks. Still, these studies serve as a starting point for future research that might inform the debate on the death penalty. Rather than imposing the strong but unsupported assumptions required to identify the effect of capital punishment on homicides in a single model or an ad hoc set of similar models, approaches that explicitly account for model uncertainty may provide a constructive way for research to provide credible albeit incomplete answers. The basic insight is that with model uncertainty, the identification of deterrent effects need not be an all-or-nothing undertaking: the available data and credible assumptions may yield partial conclusions.
Some people may find partial conclusions unappealing and be tempted to impose strong assumptions in order to obtain definitive answers. We caution against this reaction. Imposing strong but untenable assumptions cannot truly resolve inferential problems. Rather, it simply replaces the modeling uncertainty with uncertainty associated with the underlying assumptions. We have seen this repeatedly in the literature on the death penalty. The earlier Panel on Research on Deterrent and Incapacitative Effects recognized this when it concluded (National Research Council, 1978, p. 63) “research on this topic is not likely to produce findings that will or should have much influence on policymakers.” Today, more than 30 years later, perhaps the primary lesson learned from the latest round of empirical research on the deterrent effect of the death penalty is that researchers and policy makers must cope with ambiguity. Explicitly recognizing and accounting for this uncertainty seems like the only hope of moving forward.
RECOMMENDATION: The committee recommends further investigation of the effects of capital punishment using assumptions that are weaker and more credible than those that have traditionally been invoked by empirical researchers.
Alarcón, A.L., and Mitchell, P.M. (2011). Executing the will of the voters?: A roadmap to mend or end the California legislature’s multi-billion-dollar death penalty debacle. Loyola of Los Angeles Law Review, 44(Special), S41-S224.
Apel, R. (in press). Sanctions, perceptions, and crime: Implications for criminal deterrence. Submitted to Journal of Quantitative Criminology, 28.
Bates, J.M., and Granger, C.W.J. (1969). The combination of forecasts. Operational Research Quarterly, 20(4), 451-468.
California Commission on the Fair Administration of Justice. (2008). Report and Recommendations on the Administration of the Death Penalty in California. Sacramento: Author.
Carr-Hill, R.A., and Stern, N.H. (1979). Crime, the Police and Criminal Statistics: An Analysis of Official Statistics for England and Wales Using Econometric Methods. New York: Academic Press.
Chatfield, C. (1995). Model uncertainty, data mining and statistical inference. Journal of the Royal Statistical Society Series A-Statistics in Society, 158(3), 419-466.
Cohen-Cole, E., Durlauf, S., Fagan, J., and Nagin, D. (2009). Model uncertainty and the deterrent effect of capital punishment. American Law and Economics Review, 11(2), 335-369.
Cook, P.J. (2009). Potential savings from abolition of the death penalty in North Carolina. American Law and Economics Review, 11(2), 498-529.
Cook, P.J., Ludwig, J., and Braga, A.A. (2005). Criminal records of homicide offenders. The Journal of the American Medical Association, 294(5), 598-601.
Dezhbakhsh, H., and Rubin, P.H. (2011). From the “econometrics of capital punishment” to the “capital punishment” of econometrics: On the use and abuse of sensitivity analysis. Applied Economics, 43(25), 3,655-3,670.
Dezhbakhsh, H., Rubin, P.H., and Shepherd, J.M. (2003). Does capital punishment have a deterrent effect? New evidence from postmoratorium panel data. American Law and Economics Review, 5, 344-376.
Dominitz, J., and Manski, C.F. (1997). Perceptions of economic insecurity - evidence from the survey of economic expectations. Public Opinion Quarterly, 61(2), 261-287.
Donohue, J.J., and Wolfers, J. (2005). Uses and abuses of empirical evidence in the death penalty debate. Stanford Law Review, 58(3), 791-845.
Draper, D. (1995). Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society Series B-Methodological, 57(1), 45-97.
Draper, D., Hodges, J.S., Mallows, C.L., and Pregibon, D. (1993). Exchangeability and data-analysis. Journal of the Royal Statistical Society Series A-Statistics in Society, 156(1), 9-37.
Durlauf, S., Fu, C., and Navarro, S. (in press). Capital punishment and deterrence: Understanding disparate results. Submitted to Journal of Quantitative Criminology, 28.
Ehrlich, I. (1975). Deterrent effect of capital punishment—Question of life and death. American Economic Review, 65(3), 397-417.
Fischhoff, B., Parker, A.M., De Bruin, W.B., Downs, J., Palmgren, C., Dawes, R., and Manski, C.F. (2000). Teen expectations for significant life events. Public Opinion Quarterly, 64(2), 189-205.
Hurd, M., and McGarry, K. (1995). Evaluation of the subjective probabilities of survival in the Health and Retirement Study. Journal of Human Resources, 30(5), S268-S292.
Hurd, M.D., and McGarry, K. (2002). The predictive validity of subjective probabilities of survival. The Economic Journal, 112(482), 966-985.
Hurd, M.D., Smith, J.P., and Zissimopoulos, J.M. (2004). The effects of subjective survival on retirement and social security claiming. Journal of Applied Econometrics, 19(6), 761-775.
Kuziemko, I. (2006). Does the threat of the death penalty affect plea bargaining in murder cases? Evidence from New York’s 1995 reinstatement of capital punishment. American Law and Economics Review, 8(1), 116-142.
Leamer, E.E. (1978). Specification Searches: Ad Hoc Inference with Nonexperimental Data. New York: Wiley.
Lochner, L. (2007). Individual perceptions of the criminal justice system. American Economic Review, 97(1), 444-460.
Manski, C.F. (1989). Anatomy of the selection problem. Journal of Human Resources, 24(3), 343-360.
Manski, C.F. (1990). Nonparametric bounds on treatment effects. American Economic Review, 80(2), 319-323.
Manski, C.F. (2003). Partial Identification of Probability Distributions. New York: Springer.
Manski, C.F. (2004). Measuring expectations. Econometrica, 72(5), 1,329-1,376.
Manski, C.F., and Nagin, D.S. (1998). Bounding disagreements about treatment effects: A case study of sentencing and recidivism. Sociological Methodology, 28(1), 99-137.
Manski, C.F., and Pepper, J. (in press). Deterrence and the death penalty: Partial identification analysis using repeated cross sections. Submitted to Journal of Quantitative Criminology, 28.
Manski, C.F., and Straub, J.D. (2000). Worker perceptions of job insecurity in the mid-1990s—Evidence from the survey of economic expectations. Journal of Human Resources, 35(3), 447-479.
Mocan, N., and Gittings, K. (2010). The impact of incentives on human behavior: Can we make it disappear? The case of the death penalty. In R.E.S. Di Tella and E. Schargrodsky (Eds.), The Economics of Crime: Lessons for and from Latin America (pp. 379-420). Chicago: University of Chicago Press.
Mulvey, E.P. (2011). Highlights from Pathways to Desistance: A Longitudinal Study of Serious Adolescent Offenders. Juvenile Justice Fact Sheet. Washington, DC: U.S. Department of Justice.
National Research Council. (1978). Deterrence and Incapacitation: Estimating the Effects of Criminal Sanctions on Crime Rates. Panel on Research on Deterrent and Incapacitative Effects. A. Blumstein, J. Cohen, and D. Nagin (Eds.), Committee on Research on Law Enforcement and Criminal Justice. Assembly of Behavioral and Social Sciences. Washington, DC: National Academy Press.
National Research Council. (1986). Criminal Careers and “Career Criminals.” Panel on Research on Criminal Careers, A. Blumstein, J. Cohen, J.A. Roth, and C.A. Visher (Eds.), Committee on Research on Law Enforcement and the Administration of Justice. Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.
Raftery, A.E., Madigan, D., and Hoeting, J.A. (1997). Bayesian model averaging for linear regression models. Journal of the American Statistical Association, 92(437), 179-191.
Roman, J.K., Chalfin, A.J., and Knight, C.R. (2009). Reassessing the cost of the death penalty using quasi-experimental methods: Evidence from Maryland. American Law and Economics Review, 11(2), 530-574.
Sjoquist, D.L. (1973). Property crime and economic behavior: Some empirical results. American Economic Review, 63(3), 439-446.
West, D.J., and Farrington, D.P. (1973). Who Becomes Delinquent? Second Report of the Cambridge Study in Delinquent Development. London: Heinemann Educational.
Wolfgang, M.E. (1958). Patterns in Criminal Homicide. Philadelphia: University of Pennsylvania Press.