
The Role of Extralegal Factors in Determining Criminal Case Disposition

Steven Garber, Steven Klepper, and Daniel Nagin

We thank Alfred Blumstein, Jacqueline Cohen, and Franklin Fisher for their helpful comments.

INTRODUCTION

The major participants in the criminal justice process exercise substantial discretion. An issue that has received considerable attention is the degree to which the existence of such discretion results in systematic inequities in the disposition of criminal cases. In particular, numerous empirical studies have examined the extent to which members of racial minority groups and/or disadvantaged social classes are treated more harshly because of their race or socioeconomic status.

Most of the empirical studies on case disposition use regression and related statistical techniques. Correlations between outcomes of the various processing stages in the criminal justice system and measured case and defendant characteristics are examined. Discrimination is analyzed by testing for an empirical association between extralegal characteristics, such as race and socioeconomic status, and various decisions in the criminal justice system, holding constant observable, legally relevant case and defendant characteristics.

A fundamental problem with this approach is that many of the important factors affecting case disposition are extremely difficult to measure. In particular, the seriousness of an offense and the quality of the evidence, perhaps the two most important factors affecting case disposition, involve important elements for which researchers typically can observe no data. When seriousness and case quality are correlated with race and social status, the techniques currently being employed yield biased estimates of the effects of extralegal factors on case disposition. This possibility is particularly troublesome for studies of discrimination in the criminal justice system because even if discrimination is present and of sufficient concern to warrant reform, extralegal factors are undoubtedly of secondary importance in explaining variations in case disposition. Under these circumstances, biases attributable to measurement error may dominate the estimated effects of extralegal factors. Thus inferences about the incidence of discrimination based on standard regression techniques may be seriously distorted and are unlikely to provide a reliable basis for policy reform.

One response to this problem is to measure more accurately the primary determinants of case disposition. However, the inherent unobservability of a number of the components of the primary determinants suggests strongly that this strategy is unlikely to resolve the ambiguities that plague existing studies. We propose an alternative approach known as structural equation modeling. It involves explicit mathematical representation of the fundamental mechanisms believed to generate the data. For the study of discrimination in the criminal justice system, it involves modeling the fundamental relationships linking observable case outcomes to both their observable and unobservable causes. If a sufficient number of decisions affected by the unobservable principal determinants of case disposition are observed, it is possible to control fully for forces that cannot be observed.
It is then possible in principle to make inferences about the extent of discrimination that are not distorted by the inevitable lack of accurate measurements of the primary determinants of case disposition. The methods we propose are relatively complex, but we know of no simpler way to control for the effects of unobservable variables. The paper is organized as follows. First we review nine recent and influential empirical studies of discrimination in the criminal justice system. The major

purpose of the review is to provide motivation and background for the discussion that follows. In the next section we discuss statistical implications of the impossibility of measuring accurately the primary determinants of criminal case disposition. We then illustrate the import of these statistical issues by presenting alternative interpretations of various results reported in the literature. We then illustrate the proposed approach by presenting a structural equation model of criminal case disposition. We model nine decisions affecting the criminal process, taking explicit account of the measurement difficulties discussed. The next section is a discussion of the estimability of the parameters of our illustrative structural model and an example that illustrates how the effects of unobserved variables can be estimated. We then provide a heuristic discussion indicating how our illustrative model aids in the effort to obtain less ambiguous data summaries. In the next section we indicate briefly how future studies might take account simultaneously of the measurement issues emphasized here and the sample selection issues discussed in Klepper et al. (in this volume). The next section is a brief discussion of the trade-offs in specifying alternative structural models of the criminal justice system. The final section contains concluding remarks.

EMPIRICAL STUDIES OF DISCRIMINATION IN THE CRIMINAL JUSTICE SYSTEM

This section reviews nine recent and influential studies on the incidence of discrimination in the criminal justice system. The studies are of three kinds: studies of the choice of sentence given conviction, studies of case disposition given arrest and/or indictment, and one longitudinal study of forcible sex offenses from arrest through sentencing. Some of the studies analyze samples combining dissimilar offenses, whereas others concentrate on specific offenses ranging from theft to murder.
We first discuss the studies of sentence given conviction and case disposition given arrest. This is followed by a review of studies on the various stages preceding sentencing, beginning with the conviction process and working backward to the choice of plea, release on and setting of bail, choice of legal

representation, charge, and the decision to prosecute. Since only LaFree (1980a) examines separately the stages preceding sentencing, it was necessary to supplement the nine studies with a few additional studies of the stages preceding sentencing. We conclude with a summary of the major findings of the various studies.

Case Disposition and Sentencing Studies

The various studies of case disposition and sentencing focus on a small number of common forces. They include:

(1) Seriousness of the offense. Nearly all the studies include a measure of the charge to control for seriousness of the offense. The exceptions are Farrell and Swigert (1978), Swigert and Farrell (1977), and Chiricos and Waldo (1975), which concentrate on specific offenses. Some of the studies also try to use characteristics of the offense to control more completely for seriousness. For example, Lizotte (1977:569) takes account of such factors as whether the defendant resisted arrest, the number of defendants, the sobriety of the defendant, injury to the victim, and the value of property taken. For forcible sex offenses, LaFree (1980a) considers such factors as whether a weapon was used and the type of offense (i.e., rape or attempted rape).

(2) Prior record. All of the studies include a variable to represent the criminal history of the defendant. Measures of prior record range from a dummy variable indicating whether the defendant was ever arrested to the total of the maximum statutory penalties of the defendant's prior convictions.

(3) Type of legal representation. A number of studies examine the choice of legal representation, distinguishing no attorney, a public defender, and privately retained counsel. The choice of legal representation is expected to affect primarily the probability of conviction and the sentence resulting from a plea bargain. Legal representation is not generally viewed as affecting sentence if the defendant is convicted at trial, although Tiffany et al.
(1975) include a measure of legal representation in their sentencing study of defendants convicted at trial.

(4) Release on bail. A number of the studies include a variable indicating whether the defendant was released on bail. Being out on bail is expected to improve the defendant's ability to develop an effective defense, which is expected to be helpful both at trial and in plea bargaining.

(5) Type of conviction. Some of the studies that combine guilty pleas and trial convictions include an additive dummy variable denoting whether the defendant pleaded guilty. In studies of case disposition, a plea of guilty is generally expected to lead to a worse outcome in that it precludes acquittal. In studies of convictions, guilty pleas are generally assumed to be the result of plea bargains and hence are expected to result in lighter sentences, ceteris paribus.

(6) Miscellaneous factors. Some of the studies include the age of the defendant, whether the defendant is employed, and the type of county (urban versus rural) in which the defendant is convicted.

(7) Discriminatory or extralegal factors. Various characteristics of the defendant that are not legally relevant are included in all the studies. They include race, socioeconomic status (SES), sex, and the racial composition of the victim-defendant dyad. In addition, Clarke and Koch (1976) use the average income in the Census tract in which the defendant resides as a measure of the defendant's income, while other studies use SES as a proxy for income. Swigert and Farrell (1977) distinguish a characteristic they label "normal primitive" to denote particularly lower class, black defendants who are (stereotypically) thought to be disposed toward violent behavior.

The conclusions of the various studies of final case outcome can be summarized as follows. First, virtually all the studies that include a variable measuring the charge found that the seriousness of the offense is the most important factor affecting case outcome. This is most evident for studies that analyze only convictions.
Second, all the studies conclude that the prior record of the defendant is important. Third, all the studies that include a variable denoting whether the defendant makes bail infer that it is an important factor in case outcome. Fourth, most of the studies that include legal representation found that it affects case outcome, but the nature of this effect varies considerably among the

studies. Clarke and Koch (1976) and Tiffany et al. (1975) conclude that for some types of cases legal representation affects the sentence received, while Hagan (1975), Lizotte (1977), Farrell and Swigert (1978), and Swigert and Farrell (1977) infer that legal representation matters principally through making bail and secondarily through choice of plea. Fifth, type of conviction generally seems to be important: Defendants who plead guilty fare worse on average than those who plead not guilty (Hagan, 1975:541; Farrell and Swigert, 1978:449; Swigert and Farrell, 1977:26) but fare better than defendants who are convicted at trial (LaFree, 1980a:850).

The inferences concerning the role of extralegal characteristics differ considerably across the studies. One point of agreement is that if extralegal characteristics affect case outcome, their quantitative significance is small compared with the other factors discussed above. This view is consistent with Hagan's (1974) review of earlier studies. Most of the studies find a role for some extralegal characteristics, and different characteristics appear to be important in different studies. Swigert and Farrell (1977) and Farrell and Swigert (1978) infer that for murder cases, SES has a significant effect on case outcome, holding constant a number of other factors. LaFree (1980a) found that for forcible sex offenses, cases involving white victims and black defendants are generally treated more harshly. Clarke and Koch (1976) infer that for burglaries and larcenies, defendants with lower incomes are more likely to be imprisoned. They attribute most of this effect to the correlations between income and making bail and income and the choice of legal representation. Tiffany et al. (1975) found that for defendants with no prior record, blacks receive significantly more severe sentences, holding constant a number of other factors.
Some studies that did not find a direct role for extralegal characteristics in determining case disposition suggest an indirect role for such factors. Hagan's (1975) results suggest that individuals with lower socioeconomic status in Canada are charged with more serious offenses and that charge directly affects case outcome. Lizotte (1977) infers that race and SES play important roles in determining whether the defendant is released on bail, which in turn has an important effect on case outcome. Only the results of Gibson (1978) and Chiricos and Waldo (1975) suggest a role for neither race nor SES, although Gibson does present some evidence of racial discrimination on the part of some judges.2

Overall the studies suggest that low-status blacks fare worse in the criminal justice system than other defendants. Below we examine the information provided by the various studies concerning the extent to which this disadvantage is attributable to events prior to the sentencing decision.

The Conviction Process

LaFree (1980a, 1980b) focuses on factors affecting conviction at trial for forcible sex offenses. None of the other studies focuses directly on the conviction process, although Clarke and Koch (1976) provide some evidence concerning the conviction process for burglaries and larcenies. Indirect evidence about the conviction process is also provided by the studies of Lizotte (1977), Hagan (1975), Farrell and Swigert (1978), and Swigert and Farrell (1977), all of whom examined case outcome following arrest or indictment.

The various studies emphasize two types of factors: quality of the evidence and prior record of the defendant. For forcible sex offenses, LaFree (1980b) constructed measures of the quality of the prosecution case and the quality of the defense case. He included other variables, such as misconduct on the part of the defendant and the victim's living arrangement, to proxy for whether the alleged act was voluntary. Lizotte (1977) also recognizes the importance of such factors. He constructed an index that represents the availability of 10 different types of evidence, including such factors as the number of eyewitnesses, length of time between arrest and the incident, the recovery of a weapon, etc. (Lizotte, 1977:568-9). Clarke and Koch (1976) also constructed an admittedly crude measure of the quality of the evidence using the length of time elapsed between the offense and the arrest. LaFree (1980b) used a measure of promptness of the report of the offense to the police.
Prior record is also cited in some of the studies. LaFree (1980b:843) notes that despite legal procedures intended to conceal from the jury the defendant's prior record, it was often inferred by jurors from other testimony or through the defendant's failure to testify. Clarke and Koch (1976:72) conjecture that prior record

might affect the prosecutor's efforts to convict the defendant. Swigert and Farrell (1977) and Farrell and Swigert (1978) also consider the role of prior record in the conviction process.

The findings of the various studies suggest that both the quality of the evidence and the defendant's prior record affect the likelihood of conviction. LaFree (1980a, 1980b) infers that both factors are relevant in forcible sex offenses. Lizotte's (1977) results concerning the role of the quality of the evidence are inconclusive, but he attributes this to the equivocal nature of his index when applied to different types of crimes (Lizotte, 1977:57). Clarke and Koch (1976) found a minor role for the promptness of arrest and an insignificant effect of prior record, although their results are difficult to interpret since cases settled by guilty plea as well as at trial were considered jointly. Only LaFree (1980a) examined the role of the race of the defendant in affecting the likelihood of conviction at trial. He did not find a significant role for race.

The Plea Decision

A number of the studies examined the choice of plea. None of the studies proposes an explicit theory of the plea bargaining process. The choice of plea is approached in an exploratory fashion, with different researchers examining the role of different factors.

LaFree's (1980b) analysis is the most detailed investigation of the plea decision. He found that the amount of evidence assembled by the defense is an important factor in the choice of plea, with the accumulation of more evidence lowering the probability of a guilty plea. Concerning the role of race, he found that black defendants were less likely to plead guilty, regardless of the race of the victim. He was unable to determine whether this is attributable to the attitude of the prosecutor or the defendant or both. Hagan (1975) also analyzed the choice of plea.
He concludes that defendants charged with more serious offenses and represented by private counsel are less likely to plead guilty. He found no role for race or SES. The other study that considered the choice of plea is Swigert and Farrell (1977). They conclude that the single most important factor affecting the choice of plea

is the perceived characteristics of the defendant, and those classified as normal primitive are more likely to plead guilty.

Release on Bail

Three factors are cited as affecting whether the defendant makes bail: the amount of bail, the income of the defendant, and the defendant's legal representation.3 The role of the first two factors is obvious. The role of legal representation is less obvious and differs across the studies that consider it.

The importance of bail amount is supported by Lizotte (1977:571). Clarke and Koch (1976:83) found that the defendant's income is an important factor in the ability to make bail. Lizotte (1977:571) provides evidence of a role for race and SES in making bail, which he interprets as proxies for the defendant's income. Lizotte (1977:572) and Swigert and Farrell (1977:25) found significant but somewhat opposite roles for legal representation.

Setting of Bail

Only Lizotte (1977) analyzed the setting of bail, although other studies contain speculation concerning the determinants of the bail amount. Lizotte (1977:571) found that seriousness of the offense, the defendant's prior record, and the defendant's legal representation influence the bail amount. Defendants represented by courtroom regulars and public defenders on average are required to post lower bonds. Lizotte (1977:566) offers, but is not able to test, the hypothesis that the quality of the evidence also affects the level of bail. He did not find a significant effect of race or SES on bail amount, although the results of Swigert and Farrell (1977:25) on making bail suggest possible discrimination against "normal primitives" in the determination of the level of bail.

Choice of Legal Representation

Defendants can choose no attorney, a public defender, or a private attorney. (Lizotte further

distinguishes between courtroom regular and nonregular private attorneys.) This choice was studied by Lizotte (1977), Hagan (1975), Farrell and Swigert (1978), and Swigert and Farrell (1977).

Four factors are cited: the seriousness of the offense, the prior record of the defendant, the quality of the evidence, and extralegal characteristics of the defendant. The seriousness of the offense and the prior record of the defendant are included as predictors of the sentence the defendant would receive if convicted. The quality of the evidence is included as a predictor of the probability of conviction. It is expected that the greater the probability of conviction and the more serious the offense, the greater the incentive of the defendant to retain higher-quality legal representation. A private attorney is assumed to mount the best defense and no attorney the worst. The primary extralegal characteristic that is expected to affect the choice of attorney is the defendant's income. Other characteristics of the defendant are included only to proxy for income when income is not observed.

The results of the various studies suggest that seriousness of the offense is the most important determinant of choice of attorney (Lizotte, 1977:570; Clarke and Koch, 1976:83; Hagan, 1975:541). None of the researchers was able to measure the quality of the evidence and hence none can test its effect on choice of attorney. However, Clarke and Koch (1976:83) found that case outcome and choice of attorney are highly correlated, those who choose no attorney having a much smaller probability of conviction and imprisonment. Prior record appears to affect the choice of attorney in Lizotte (1977) and, to a lesser degree, in Swigert and Farrell (1977): Those with more extensive prior records are more likely to choose either no attorney or a private nonregular attorney in Lizotte (1977:570) and a public defender in Swigert and Farrell (1977:23).
Hagan's (1975:541) results, however, suggest that the effect of prior record is insignificant. As for extralegal characteristics of the defendant, the results of Clarke and Koch (1976:83) suggest a role for income, and Farrell and Swigert (1978:448) and Swigert and Farrell (1977:23-24) found a role for SES, which they interpret as a proxy for income. In contrast, Hagan (1975:541) and Lizotte (1977:571) found no role for race or SES in the choice of attorney.4

The Charge Decision

The charge decision was analyzed in detail only by LaFree (1980a). The role of race (alone) was considered by Hagan (1975). For forcible sex offenses, LaFree (1980a:850) found that the type of offense (i.e., attempted rape or rape), the use of a weapon, and victim preference are important elements of the charge decision. These variables are interpreted primarily as measures of the seriousness of the offense (LaFree, 1980a:852). LaFree (1980a:850) also found that the racial composition of the victim-defendant dyad affects the charge decision; cases involving a black defendant and a white victim led to a more serious charge. Hagan (1975:541) also found that SES is correlated with charge: Individuals with lower SES were charged with more serious crimes. Hagan did not control for other factors (such as seriousness), presumably because of a lack of data.

The Decision to Prosecute

The only study that focused on the decision to prosecute is LaFree (1980a). He found that for forcible sex offenses, the charge, the presence of witnesses, the use of a weapon, and the defendant's age are important determinants of the decision to prosecute. These findings generally agree with Frase's (1980) detailed investigation of the reasons given by U.S. attorneys for dismissals. Frase found that the three factors cited most often for dismissing a case are the seriousness of the offense, the quality of the evidence, and the defendant's prior record. LaFree (1980a:850) also found that the racial composition of the victim-defendant dyad affects the decision to prosecute; cases involving a white victim and a black defendant were less likely to be dismissed. Frase did not consider the role of race or other extralegal characteristics in his study.

Summary of the Major Findings

Virtually all the studies suggest that three factors are of particular importance in the processing of cases

through the criminal justice system: the seriousness of the offense, the quality of the evidence, and the prior record of the defendant. These factors were measured in various ways.

Seriousness of the offense is generally acknowledged to have a number of dimensions. Lizotte (1977) and LaFree (1980a) measured some of these dimensions, while most of the other studies used the charge as their only measure of seriousness (presumably because other measures are either not available or are too costly to compile). Some of the studies that concentrate on specific offenses (such as murder in Farrell and Swigert, 1978, and Swigert and Farrell, 1977) used no measure of seriousness at all. Seriousness of the offense was generally found to play a role at a number of stages. It appears to be particularly relevant in the decision to prosecute, the charge, the size of bail, and the sentence (given conviction). It also appears to be an important factor affecting the defendant's choice of attorney.

The quality of the evidence also has many dimensions and was measured in various ways. LaFree (1980a) and Lizotte (1977) measured some of these dimensions, while others used the promptness of arrest as a crude proxy for the quality of the evidence. Most of the studies include no measure of the quality of the evidence, presumably because the relevant information is too costly to compile. The quality of the evidence appears to play an important role in the decision to prosecute, the choice of plea, and trial conviction.

Different researchers emphasize different aspects of prior record, the third primary factor. It too is multidimensional and is inferred to play an important role in the decision to prosecute, the size of the bail amount, sentencing, and (to some degree) conviction.

Other legal factors were also found to be important at some stages. Making bail consistently appears to affect case disposition.
It presumably operates through the conviction process by affecting the defendant's ability to put together a successful defense. In some studies the quality of legal representation and the type of plea also seem to play a role. Extralegal factors seem to affect outcomes at a number of stages, including the decision to prosecute, the charge decision, the choice of plea, making bail, and sentence. The only stage at which extralegal factors

have not been found to play a role is the trial conviction stage, although only LaFree (1980a, 1980b) studied convictions directly.

A number of the studies emphasize the cumulative nature of the role of extralegal factors. By the time black and lower-status defendants reach the sentencing stage, they are claimed to be at a considerable disadvantage. They appear to face more serious charges, be more often induced to plead guilty, be less able to make bail and thus organize a successful defense, and have restricted access to good legal representation. All of these factors are believed to affect sentence and more generally case disposition. Swigert and Farrell (1977) also note that discrimination can start a vicious cycle, with discrimination contributing to the creation and growth of a criminal record, which in turn leads to harsher treatment in subsequent encounters with the criminal justice system.

A number of the researchers discuss measures that might be adopted to reduce the inequities they perceive in the criminal justice system. These include reforms of the bail system, constraints on the use of prosecutorial discretion in plea bargaining, and various sentencing reforms. Each of these reforms undoubtedly has some undesirable aspects. A crucial question is whether there really is sufficient evidence of discrimination to consider seriously implementation of some of the suggested reforms. In the following section we discuss problems associated with the measurement of the key forces that may substantially obscure the true extent of discrimination in the criminal justice system. This provides motivation for the modeling approach we propose, illustrate, and develop in the following sections of the paper.

BIASES INDUCED BY MEASUREMENT ERROR AND OMITTED VARIABLES

The discussion above suggests that it is extremely difficult to measure many of the principal determinants of case disposition.
In this section we discuss the implications of measurement error and omitted variables for inferences about the extent of discrimination in the criminal justice system. Our discussion suggests that failure to control adequately for the primary determinants of case disposition may seriously distort

the true extent of discrimination. The discussion serves as the motivation for the model we propose.

The Statistical Consequences of Failing to Control Adequately for the Primary Determinants

We begin with a simple example. Suppose that we are interested in the following relationship:5

$$y = \beta x^* + \delta z + u, \tag{1}$$

where $y$ = sentence (given conviction), $x^*$ = seriousness of the offense, $z$ = socioeconomic status of the accused, $u$ = a random disturbance that is distributed independently of $x^*$ and $z$, and $\beta$ and $\delta$ = parameters to be estimated. If we could observe a random sample of $y$, $x^*$, and $z$, the least-squares regression of $y$ on $x^*$ and $z$ would yield unbiased and consistent6 estimates of $\beta$ and $\delta$.

Suppose, however, that seriousness of the offense cannot be measured and the researcher regresses sentence on SES alone. It is well known that this will generally cause the estimate of the coefficient of $z$ to be biased and inconsistent for $\delta$. In particular, the bias can be shown to equal $\alpha\beta$, where $\alpha$ is the coefficient of SES in the so-called auxiliary regression of seriousness on SES. Thus, presuming that seriousness affects sentence (i.e., $\beta \neq 0$), the omission of seriousness from equation (1) will lead to biased estimation of $\delta$ unless seriousness and SES are uncorrelated (i.e., unless $\alpha = 0$).

The intuitive basis for this result is straightforward. Suppose that we do not control for seriousness and that seriousness and SES are correlated. Under these circumstances, variations in sentence that are really due to variations in seriousness cannot be attributed to seriousness. Instead, they will be attributed to SES to the extent that SES and seriousness are correlated. In effect, regressions can only summarize correlations among observed variables: when seriousness is not measured, SES will proxy in part for seriousness and the coefficient of SES will in part reflect variations in sentence that are actually attributable to seriousness.

Thus, if sentence is regressed on SES alone, the coefficient of SES is properly interpreted as an estimate of $\delta + \alpha\beta$ rather than an estimate of $\delta$ alone. Suppose that more serious offenses are punished more severely, ceteris paribus. This implies that $\beta > 0$. In addition, suppose that seriousness and SES are negatively correlated, which implies that $\alpha < 0$. Then the coefficient of SES in the regression of sentence on SES will underestimate $\delta$ (i.e., $\alpha\beta$ is negative). Thus even if there is no discrimination (i.e., $\delta = 0$) or there is reverse discrimination (i.e., $\delta > 0$), it is conceivable that the regression of sentence on SES might be interpreted as suggesting the presence of discrimination. More generally, the omission of relevant factors like seriousness might cause the regression of case disposition on SES and other variables to suggest the presence of discrimination when no such discrimination exists. Alternatively, depending on the coefficients of the omitted variables and their correlation with SES, it is conceivable that the omission of relevant factors could cause the true extent of discrimination to be underestimated.

In many of the studies we reviewed, it is common practice to use an observed variable to proxy for a relevant variable that could not be observed. For example, suppose that a variable $x$ is used to proxy for seriousness, where $x$ is related to seriousness by

$$x = \gamma x^* + \varepsilon, \tag{2}$$

where $\gamma$ is an unknown coefficient and $\varepsilon$ is a random disturbance that is independent of $x^*$, $z$, and $u$. The variable $x$ is called a classical proxy and $\varepsilon$ is referred to as the measurement error in $x$.7 Suppose that $x$ is used to "control" for $x^*$ and $y$ is regressed on $x$ and $z$. Then it can be shown that the coefficient of $z$ is properly interpreted as an estimate of $\delta + f\alpha\beta$ rather than $\delta$, where $\alpha$ is (again) the coefficient of SES in the regression of seriousness on SES and $f$ equals the fraction of the independent (of $z$) variation in the proxy that is due to the measurement error.8 Since $f$ is between zero and one, this implies that the inclusion of the proxy in the regression of sentence on SES reduces the absolute magnitude of the bias from $\alpha\beta$ to $f\alpha\beta$. A bias remains, however, because the proxy does not fully control for the effects of seriousness, and SES still "picks up" some of the effect of seriousness on sentence that is not attributed to the proxy.
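The two bias results above can be checked numerically. The following is a minimal simulation sketch, not an analysis of any real data. It assumes the notation beta (coefficient of seriousness), delta (coefficient of SES), alpha (coefficient of SES in the auxiliary regression), and gamma (coefficient of seriousness in the proxy relation). All numerical values, the normal distributions, and the helper `ols` are hypothetical choices made only for illustration.

```python
import numpy as np

def ols(y, *regressors):
    """Least-squares slopes of y on a constant plus the given regressors."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]  # drop the intercept

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical parameter values (not taken from any study).
beta, delta = 2.0, 0.0      # delta = 0: no discrimination in the simulated data
alpha, gamma = -0.5, 1.0    # auxiliary-regression and proxy coefficients
sd_v, sd_eps = 1.0, 1.0     # spread of seriousness given SES, and of the proxy error

z = rng.normal(size=n)                                   # SES
x_star = alpha * z + rng.normal(scale=sd_v, size=n)      # seriousness (unobserved in practice)
y = beta * x_star + delta * z + rng.normal(size=n)       # sentence, as in equation (1)
x = gamma * x_star + rng.normal(scale=sd_eps, size=n)    # classical proxy, as in equation (2)

delta_full = ols(y, x_star, z)[1]   # infeasible regression that observes seriousness
delta_omit = ols(y, z)[0]           # seriousness omitted entirely
delta_proxy = ols(y, x, z)[1]       # proxy used to "control" for seriousness

# Fraction of the proxy's variation (independent of z) due to measurement error.
f = sd_eps**2 / (gamma**2 * sd_v**2 + sd_eps**2)

print(f"seriousness observed: {delta_full:+.3f}  (delta = {delta:+.3f})")
print(f"seriousness omitted:  {delta_omit:+.3f}  (delta + alpha*beta = {delta + alpha * beta:+.3f})")
print(f"proxy included:       {delta_proxy:+.3f}  (delta + f*alpha*beta = {delta + f * alpha * beta:+.3f})")
```

Under these assumed values the simulated data contain no discrimination at all, yet both the omitted-variable regression and the proxy regression produce a clearly negative coefficient on SES. The proxy shrinks the bias by the factor f but does not remove it.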

More generally, this suggests that if the primary determinants of case disposition are measured with error (or are not measured at all) in the studies we reviewed, then estimates of the effects of extralegal variables on case disposition will be biased. Of course, some measurement error is present in all regressions. The crucial question is whether the estimated coefficients of extralegal variables in the various studies are likely to reflect primarily discrimination or primarily statistical bias. The following discussion of the conditions under which the bias will be large relative to the true effect suggests that the estimates of the role of extralegal variables in the various studies may be seriously distorted.

For the case of one variable measured with error, it can be shown that the bias in the estimate of a correctly measured variable will be larger relative to its true coefficient:10

(1) The greater the fraction of the variation in the dependent variable attributable to the incorrectly measured variable;
(2) The smaller the fraction of the variation in the dependent variable attributable to the correctly measured variable;
(3) The greater the correlation between the correctly and incorrectly measured variables; and
(4) The greater the fraction of the variation in the incorrectly measured variable (holding constant the correctly measured variable) attributable to the measurement error.

Conditions (1) and (2) suggest that the bias in the estimates of the role of extralegal factors will be larger when extralegal factors play a relatively small role in affecting case disposition. The various studies suggest that the primary determinants of case disposition are the seriousness of the offense, the quality of the evidence, and prior record. While most of the studies suggest discrimination, they also suggest that if discrimination is present, it explains only a small fraction of variation in case disposition (see note 7).
This does not imply that there isn't sufficient discrimination to warrant reform of the criminal justice system. Rather, it simply says that variations in case disposition can be explained primarily by clearly appropriate factors, such as seriousness of the offense.
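The four conditions can be related to the single-proxy example given earlier. The following is a sketch under stated assumptions, not a general derivation. It writes the outcome equation as y = beta x* + delta z + u, the proxy as x = gamma x* + epsilon, and the auxiliary regression as x* = alpha z + v. The symbols gamma and v and the variance notation are labels introduced here for illustration.

```latex
\[
\operatorname{plim}\,\hat{\delta} \;=\; \delta + f\,\alpha\beta ,
\qquad
f \;=\; \frac{\sigma_{\varepsilon}^{2}}{\gamma^{2}\sigma_{v}^{2} + \sigma_{\varepsilon}^{2}} ,
\qquad
\frac{\text{bias}}{\text{true effect}} \;=\; \frac{f\,\alpha\beta}{\delta} .
\]
```

Read loosely, the ratio grows in magnitude with beta (condition 1), with a smaller delta (condition 2), with alpha (condition 3), and with f (condition 4). The conditions in the text are stated in terms of fractions of variation rather than raw coefficients, so this correspondence is only approximate.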

Conditions (1) and (4) suggest that the bias in the estimates of the role of extralegal factors will be large when the primary determinants of case disposition are measured with considerable error. In the review in the previous section, we noted that two of the three primary determinants of case disposition (the seriousness of the offense and the quality of the evidence) are measured very crudely when they are measured at all. Both factors include many components that are difficult and/or costly to measure from available records. Combined with condition (3), this suggests that if seriousness of the offense and quality of the evidence are correlated with race and SES, the bias in the estimates of extralegal variables might dominate the true effects of extralegal variables on case disposition. In the subsection that follows, we explore the nature of these correlations.

Correlations Between the Primary Determinants and Extralegal Variables

Perhaps the most troublesome correlation between extralegal variables and the primary determinants of case disposition involves the seriousness of the offense. The literature reviewed above provides both theoretical and empirical reasons to expect seriousness of the offense to be correlated with race and SES. For example, Lizotte (1977) suggests two pertinent but quite different mechanisms. The first is the "labeling" mechanism associated with conflict theory. According to this view, society may perceive and therefore define some crimes to be more serious precisely because they are disproportionately committed by members of racial minorities and individuals with lower SES. The second mechanism is economic: Members of socially and economically disadvantaged groups may rationally choose to commit more serious offenses (particularly in samples of property crimes) precisely because their legitimate labor market opportunities are restricted.
In addition, the "normal primitive" concept investigated by Swigert and Farrell (1977) could be interpreted as suggesting a cultural basis for correlation between seriousness and race and SES (see Swigert and Farrell, 1977:19).

Furthermore, various correlations are reported in the literature suggesting that seriousness is correlated with race and SES. Lizotte (1977) reports that SES and his index of seriousness are negatively correlated. Hagan (1975) reports a negative correlation between SES and charge, which suggests a negative correlation between SES and seriousness unless (quite implausibly) charge is unrelated to seriousness. Another indication of correlation between seriousness and the race and income of the defendant is found in Clarke and Koch (1976). They found that estimates of the relationships between race and income and the likelihood of imprisonment were weakened when variables were added to control for seriousness and prior record. Finally, in a study of capital punishment in Dade County, Florida, Arkin (1980) found that among all murder cases, blacks were disproportionately involved in execution-style felony murders. Such offenses might reasonably be viewed as particularly serious relative to other murders.

There are at least three reasons that race and SES might be correlated with the quality of the evidence. First, members of racial minority groups and/or people with low SES who participate in criminal activities might on average be less competent than other criminals and thus tend to leave more incriminating evidence. Second, police and/or prosecutors might work harder to amass evidence against members of socially and economically disadvantaged groups. Finally, economic theories of crime suggest that individuals with restricted opportunities in legitimate labor markets might, because of this fact, be willing to undertake crimes involving higher probabilities of arrest and conviction. We could not find any evidence in the literature concerning the sign of the correlation between quality of the evidence and race and SES. This is presumably a reflection of the difficulty of constructing measures for quality of the evidence.

It seems well accepted that race and SES are correlated with prior record. This correlation does not seem problematic, however, because numerous aspects of prior record are generally observable to the investigator.
Consequently, the fraction of variation in observable measures of prior record that is attributable to measurement error is likely to be quite small relative to the corresponding fractions for observable measures of seriousness and quality of the evidence. Condition (4) above suggests that under these circumstances, the measurement error in prior record will bias the estimates of the effects of extralegal factors considerably less than the measurement errors involved in seriousness and quality of the evidence.

Thus the possibility of nontrivial correlations of race and SES with seriousness of the offense and quality of the evidence seems to be the most troublesome. Further complications are introduced by the possibility that these correlations may differ for different types of crimes. As a result, the biases attributable to measurement error in seriousness and quality of the evidence might be critical in some cases and trivial in others. The biases might even work in opposite directions in different studies. This suggests that measurement error bias could substantially obscure the true incidence of discrimination in the criminal justice system. In the following subsection we present four examples of how measurement error bias might have distorted inferences in the literature about the incidence of discrimination in the criminal justice system.

Alternative Interpretations of Results Reported in the Literature

The statistical perspective offered here suggests alternative interpretations of a number of the results reported in the review above. We discuss four such interpretations in this section in order to demonstrate the potential importance of taking account of measurement error.

First, consider Hagan's (1975) finding of a negative correlation between the initial charge and SES. One obvious explanation for this finding other than discrimination is that individuals with lower SES commit more serious crimes. Since Hagan does not control at all for seriousness, the analysis above suggests that SES may be "picking up" some of the effects of seriousness. This is a simple example of omitted variable bias.

Second, a similar explanation can be provided for the finding by Swigert and Farrell (1977) and Farrell and Swigert (1978) that final case disposition and SES are negatively correlated in murder cases, holding constant a number of other factors, such as prior record, but not seriousness of the offense.
Arkin's (1980) findings cited above support the hypothesis that individuals with lower SES commit more serious murders, which could explain Swigert and Farrell's and Farrell and Swigert's findings without appeal to discrimination.

A third example concerns the finding in a number of studies that defendants who are not released on bail experience less favorable case dispositions. This is often cited as evidence of discrimination in that lower-SES and black defendants are found to be less often released on bail than other defendants, ceteris paribus. Lizotte (1977) in particular emphasizes this finding. He infers that failure to make bail hinders the development of an effective defense.

An alternative explanation for this result is based on the difficulty of measuring seriousness of the offense and case quality. Lizotte's (1977) findings suggest that the likelihood of making bail is highly related to the bail amount. Lizotte (1977) also finds that bail amount is directly related to a crude measure he constructs for the seriousness of the offense. Thus it might be expected that on average, defendants who are not released on bail commit more serious offenses. Suppose that making bail does not have any causal effect on case disposition. Consider a regression of case disposition on a proxy for seriousness and a dummy variable for making bail. The results above suggest that if making bail is negatively related to seriousness and a proxy variable is used to "control" for seriousness, then the estimate of the effect of making bail on case disposition will be biased downward. Thus even if making bail has no causal effect on case disposition, making bail will appear to contribute to a less severe disposition if seriousness is measured with error.

A similar explanation may account for the finding in a number of studies that defendants with higher-quality legal representation experience more favorable case dispositions, holding constant a number of other legally relevant factors. Suppose that higher-income individuals commit less serious crimes and that the quality of legal representation is highly (positively) correlated with income. Then on average, defendants with higher-quality legal representation will have committed less serious crimes. The results above suggest that, if case disposition is regressed on a proxy for seriousness and the quality of legal representation, then the coefficient of legal representation may be negative even if legal representation has no effect on case disposition.

Each of our explanations for the four findings rests on measurement error bias. One hypothesis (that seriousness is correlated with SES and race) coupled with measurement error in seriousness is sufficient to explain all four findings. While not conclusive, this discussion is suggestive of the extent to which measurement error bias might account for some of the more prominent findings in the literature.
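The bail example discussed above can be illustrated with simulated data. This is a minimal sketch under invented assumptions, not a reanalysis of Lizotte's data. The threshold rule for making bail, the noise levels, and the coefficient 2.0 on seriousness are all hypothetical. By construction, making bail has no causal effect on the disposition.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Latent seriousness of the offense (not observed by the researcher).
seriousness = rng.normal(size=n)

# Hypothetical rule: defendants charged with more serious offenses face
# higher bail and are therefore less likely to be released.
made_bail = (-seriousness + rng.normal(size=n) > 0).astype(float)

# Crude observed proxy for seriousness (seriousness plus measurement error).
proxy = seriousness + rng.normal(size=n)

# Severity of disposition depends on seriousness only; bail plays no role.
severity = 2.0 * seriousness + rng.normal(size=n)

def slope_on_bail(control):
    """Coefficient on the bail dummy when `control` is also included."""
    X = np.column_stack([np.ones(n), control, made_bail])
    coef, *_ = np.linalg.lstsq(X, severity, rcond=None)
    return coef[2]

bail_with_proxy = slope_on_bail(proxy)         # what a researcher could estimate
bail_with_truth = slope_on_bail(seriousness)   # infeasible benchmark

print(f"bail coefficient, proxy control:       {bail_with_proxy:+.3f}")
print(f"bail coefficient, true seriousness:    {bail_with_truth:+.3f}")
```

With the proxy as the only control, the bail dummy picks up a substantial negative coefficient, as if release on bail led to less severe dispositions. With seriousness itself controlled, the coefficient is close to zero. The same logic applies if the bail dummy is replaced by a measure of the quality of legal representation.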

Future Research Directions

Our analysis suggests that biases due to measurement error may obscure the true incidence of discrimination in the criminal justice system. The most obvious response to this problem, better measurement of the primary determinants of case disposition, does not seem promising because of the inherent unobservability of a number of dimensions of seriousness of the offense and quality of the evidence. Thus further efforts that rely on crude proxies to control for seriousness of the offense and quality of the evidence are unlikely to provide a reliable basis for making inferences about discrimination in the criminal justice system. In the following sections we discuss an alternative approach that directly confronts the inherent unobservability of the primary determinants of case disposition.

AN ILLUSTRATIVE STRUCTURAL MODEL OF CRIMINAL CASE DISPOSITIONS

A natural response to the interpretation difficulties raised in the previous section is to model explicitly the various links among observable variables and the unobservable theoretical constructs that are hypothesized to be the primary determinants of case disposition. Such models are often described as "structural models involving latent variables" and have found applications in economics, political science, psychology, and sociology.11 The effects of unobserved variables may be estimable if we can observe multiple variables (often called "indicators") whose values are postulated to be determined as functions of the unobservable variables.

In this section we present a particular structural model that we believe to be plausible, consistent with the (often implicit) theorizing discussed above, and potentially capable of empirical implementation. Structural equations are viewed as mathematical representations of the fundamental processes generating the data. Useful structural equation models are best formulated with explicit reference to a specific question of interest.
We have chosen to model various aspects of criminal proceedings relating to individuals arrested for residential thefts who do not enter guilty pleas.12

The following notational conventions are employed. Outcomes of the criminal process (which are assumed to depend on the unobserved primary determinants) are denoted by $y$'s with various subscripts (these are thought of as endogenous indicators of the latent variables). The latent variables are denoted by $x^*$'s with various subscripts. Observed exogenous variables believed to have direct effects on the indicators are denoted by $z$'s with various subscripts and observed exogenous variables believed to affect case disposition through the latent variables are denoted by $x$'s with various subscripts.

We treat the three primary determinants of case disposition (seriousness, case quality, and prior record) as latent variables that are not observed. This does not suggest that it is impossible to measure some dimensions of each of these variables. Rather, it emphasizes that these variables can be measured only with considerable error relative to the other important determinants of case disposition. Part of our model, which we discuss at the end of this section, involves linking the three latent variables to their observable components.

Our structural equation model involves equations for nine indicators: (1) dismissal, (2) victim cooperation, (3) charge, (4) dollar level of bail, (5) release on bail, (6) type of legal representation, (7) conviction at trial, (8) presentence report recommendation, and (9) severity of punishment. We focus on these particular indicators for two reasons. First, the literature suggests that each is substantially influenced by one or more of the latent variables of interest: seriousness, case quality, and prior record. Second, each of these indicators either represents the outcome of an important juncture in the criminal justice system or is likely to have a major influence on a significant decision in it (e.g., the recommendation of the presentence report).

A number of these indicators can be observed only to a limited degree. For example, the dismissal decision is based on the prosecutor's desire to proceed, which we generally do not observe.
Rather, we observe only whether the prosecutor decides to dismiss a case, which provides only limited information about the strength of the prosecutor's desires. For expositional convenience, we assume throughout that in instances such as the dismissal decision, we observe the indicator of interest. For example, we assume that in the case of the dismissal decision, we actually observe a continuous measure of the strength of the prosecutor's desire to prosecute. In later sections we consider complications that arise when such continuous measures are not available.

The structural model begins with the arrest of a suspect. Following the arrest, the first major function is the initial screening. The prosecution must decide whether to file charges or reject the case. In the event of an affirmative decision to press charges, the prosecutor may subsequently choose to nolle the charges.13 We combine the rejection and nolle decisions into a single prosecutorial decision concerning whether to prosecute a case. Studies suggest that the decision to prosecute or dismiss depends principally on the gravity of the offense, the quality of the evidence, and the record of prior criminal activity of the accused. Given the discretion the prosecutor can exercise at this stage, it is also possible that the decision to prosecute could be affected by discrimination. This suggests that the decision to prosecute can be specified as

$$y_1 = \theta_{11} x_1^* + \theta_{12} x_2^* + \theta_{13} x_3^* + \delta_{11} z_1 + \delta_{12} z_2 + u_1, \tag{3}$$

where $y_1$ is an index variable that determines the probability that a case will be dismissed, $x_1^*$ is the (unobserved) seriousness of the offense, $x_2^*$ is the (unobserved) quality of the evidence, $x_3^*$ is the (unobserved) prior record of the defendant, $z_1$ is the (observed) race of the defendant, $z_2$ is the (observed) wealth of the defendant, $u_1$ is an (unobserved) disturbance, and $\theta_{11}$, $\theta_{12}$, $\theta_{13}$, $\delta_{11}$, and $\delta_{12}$ are parameters to be estimated.14 The variables $z_1$ and $z_2$ are included to represent the possibility of discrimination.15 The disturbance $u_1$ captures the effects of all factors that affect $y_1$ other than $x_1^*$, $x_2^*$, $x_3^*$, $z_1$, and $z_2$.

The second indicator is whether the victim cooperates with the prosecutor. The victim is often a pivotal figure in the preparation of the prosecution case. In some instances the victim may be able to identify the perpetrator and for the crime of residential theft is likely to be the person best able to identify property seized from the accused. Cooperating in the preparation of a criminal case can be very time-consuming and the likelihood of the victim's being willing to make this sacrifice is undoubtedly determined in part by the seriousness of the offense, $x_1^*$. Victims themselves may also exercise biases by being more willing to cooperate if the accused is of a different race. Let $y_2$ be an index variable that determines the probability that the victim cooperates with the prosecution. We assume that $y_2$ is determined by

$$y_2 = \theta_{21} x_1^* + \delta_{23} z_3 + u_2, \tag{4}$$

where $x_1^*$ is as defined in the text below equation (3), $z_3$ is a dichotomous variable taking the value one if the victim and the accused are of different races and zero otherwise, $u_2$ is a disturbance summarizing all relevant factors besides $x_1^*$ and $z_3$, and $\theta_{21}$ and $\delta_{23}$ are structural parameters to be estimated.

The third structural equation represents the determinants of the severity of the charges filed against the accused. For residential theft, the prosecutor may be free to choose among a number of different charges ranging from petit larceny to first degree burglary. We expect that the principal case characteristics affecting the choice of charge seriousness are the seriousness of the offense and the defendant's prior record. We include the latter because the evidence suggests that prosecutors proceed more vigorously against defendants with prior criminal records, particularly if the record is extensive. As with the dismissal decision, the prosecutor may be influenced by his racial and class prejudices. These considerations suggest the following specification of the determinants of charge seriousness:

$$y_3 = \theta_{31} x_1^* + \theta_{33} x_3^* + \delta_{31} z_1 + \delta_{32} z_2 + u_3, \tag{5}$$

where $x_1^*$, $x_3^*$, $z_1$, and $z_2$ are as defined in the text below equation (3), $y_3$ is the severity of the charge(s) filed against the accused, $u_3$ is a disturbance, and $\theta_{31}$, $\theta_{33}$, $\delta_{31}$, and $\delta_{32}$ are structural coefficients to be estimated. Thus the charge is assumed to depend on seriousness, prior record, and race and class considerations, but not on the quality of the evidence.

The bail-setting decision is typically made by a magistrate. The stated purpose of bail is to ensure that the accused appears for trial.
The dollar amount of bail is supposed to be set at a level that is sufficient to ensure that the accused does not abscond. The probability of the accused not appearing at trial is generally thought to be related to the probability of conviction and the expected sentence if convicted. Our review of the literature suggests that the former is chiefly influenced by the quality of the evidence and the latter by the seriousness of the offense and the defendant's prior record. The bail amount is also supposed to be graduated to the defendant's income; wealthier defendants are required to post a larger bond to ensure appearance at trial. Finally, there is a possibility that judges discriminate against certain types of defendants in setting bail. These observations suggest that the bail amount, $y_4$, is influenced by all of the case and defendant characteristics previously defined. We relate $y_4$ to these variables by

$$y_4 = \theta_{41} x_1^* + \theta_{42} x_2^* + \theta_{43} x_3^* + \delta_{41} z_1 + \delta_{42} z_2 + u_4, \tag{6}$$

where $u_4$ is a disturbance, $\theta_{41}$, $\theta_{42}$, $\theta_{43}$, $\delta_{41}$, and $\delta_{42}$ are parameters to be estimated, and all other symbols are as defined above.

The fifth structural equation represents the accused's decision to make the payment required to be released on bail.16 The likelihood of the accused being released on bail will be determined primarily by the amount of bail, $y_4$, and the defendant's wealth, measured by $z_2$. Let $y_5$ be an index variable determining the probability of the accused making bail. We assume $y_5$ is determined by

$$y_5 = \gamma_{54} y_4 + \delta_{52} z_2 + u_5, \tag{7}$$

where $u_5$ is a disturbance term and $\gamma_{54}$ and $\delta_{52}$ are parameters to be estimated.

A second decision made by the accused concerns the type of legal representation. Available options include no counsel, court-appointed counsel, and private attorney. The various studies we reviewed suggest that the choice of attorney is associated with the seriousness of the charge, the quality of the evidence, and the defendant's prior record. We would also expect this choice to be affected by the defendant's financial capabilities. This suggests the following representation for the quality of legal representation:

$$y_6 = \theta_{62} x_2^* + \theta_{63} x_3^* + \gamma_{63} y_3 + \delta_{62} z_2 + u_6, \tag{8}$$

where $y_6$ is an index variable denoting the quality of the legal counsel chosen, $u_6$ is a disturbance, and $\theta_{62}$, $\theta_{63}$, $\gamma_{63}$, and $\delta_{62}$ are structural coefficients to be estimated.

The next observable indicator is whether the accused is convicted at trial.17 The literature suggests that the likelihood of conviction at trial is primarily determined by the quality of the evidence implicating the accused. In addition, the quality of legal representation may play an important role in affecting the likelihood of conviction. On the basis of these considerations we model the probability of conviction at trial by

$$y_7 = \theta_{74} x_4^* + \gamma_{76} y_6 + \delta_{71} z_1 + \delta_{72} z_2 + u_7, \tag{9}$$

where $y_7$ is an index variable determining the probability that the accused is convicted, $x_4^*$ is the quality of the evidence implicating the accused at the time of the trial,18 $u_7$ is a disturbance term, and $\theta_{74}$, $\gamma_{76}$, $\delta_{71}$, and $\delta_{72}$ are structural coefficients to be estimated. The variables $z_1$ and $z_2$ are included in equation (9) to allow for the possibility that juries discriminate on the basis of race or wealth.

The next structural equation specifies the determinants of the presentencing recommendation. Presentence reports may contain especially valuable information because in many instances they all but recommend the type of sentence the accused should receive. Presentence reports are often lengthy documents that emphasize the convicted criminal's psychological traits, prior criminal record, and the seriousness of the offense. This suggests the following specification for the presentence recommendation:

$$y_8 = \theta_{81} x_1^* + \theta_{83} x_3^* + \delta_{81} z_1 + \delta_{82} z_2 + u_8, \tag{10}$$

where $y_8$ is a variable measuring the severity of the sanction recommended by the presentence report, $u_8$ is a disturbance, and $\theta_{81}$, $\theta_{83}$, $\delta_{81}$, and $\delta_{82}$ are parameters to be estimated. The severity of the presentence report is thus related to the seriousness of the offense ($x_1^*$) and prior record ($x_3^*$). The variables $z_1$ and $z_2$ are included in equation (10) to allow for discrimination at this stage of the sentencing process.
The final outcome we consider is the severity of punishment. Perhaps the most important factor determining the severity of punishment is the seriousness of the charge for which the defendant is convicted. A discussion with a judge in Pittsburgh and the results of Swigert and Farrell (1977) suggest that the presentence report may also be a major influence on the sentencing decision. These considerations, along with the possibility that judges may directly discriminate on the basis of race and/or wealth, suggest the following specification of the sentencing equation:

$$y_9 = \gamma_{93} y_3 + \gamma_{98} y_8 + \delta_{91} z_1 + \delta_{92} z_2 + u_9, \tag{11}$$

where $y_9$ is the severity of the sentence imposed on the accused, $u_9$ is a disturbance, and $\gamma_{93}$, $\gamma_{98}$, $\delta_{91}$, and $\delta_{92}$ are structural coefficients to be estimated. Sentence is thus assumed to depend directly on the charge ($y_3$), the forcefulness of the presentence recommendation ($y_8$), and the race ($z_1$) and wealth ($z_2$) of the defendant. The seriousness of the offense and prior record are allowed to affect sentence indirectly through their influence on charge seriousness, $y_3$, and the presentence recommendation, $y_8$.

The next four structural equations relate the unobservables $x_1^*$, $x_2^*$, $x_3^*$, and $x_4^*$ to observable variables. This has two purposes. The first is to structure the suspected correlations among race, seriousness, and case quality. This will allow us explicitly to sort out potential sources of the various reported correlations of race with case disposition. Second, estimated parameters from these equations provide important diagnostic information. Since the primary variables believed to affect sentencing outcomes are unobserved, it is worthwhile to structure the estimation in order to be able to test whether the indicators are in fact indicators of the theoretical variables specified. This issue can be addressed by considering the conformity of the estimated coefficients with a priori expectations and more formally by statistical testing of implied constraints on the estimated parameters.

We begin by specifying observable factors affecting the seriousness of the offense, $x_1^*$.
We relate $x_1^*$ to observables by

$$x_1^* = \beta_{11} z_1 + \beta_{1e} e + \boldsymbol{\alpha}_1' \mathbf{x}_1 + \varepsilon_1, \tag{12}$$

where $z_1$ is the race of the accused (as defined above), $e$ is the education of the accused, $\mathbf{x}_1$ is a $K_1 \times 1$ vector of observable variables affecting seriousness but appearing nowhere else in the model, $\varepsilon_1$ is a disturbance, and $\beta_{11}$, $\beta_{1e}$, and $\boldsymbol{\alpha}_1$ are $K_1 + 2$ structural parameters to be estimated. The racial variable $z_1$ is included in equation (12) to allow for a correlation between race and seriousness. Reasons to expect correlation between race and seriousness of the offense are discussed above. The education of the accused is included in equation (12) because people with less favorable opportunities in legitimate labor markets are likely to participate in more serious offenses.19 The vector $\mathbf{x}_1$ includes measurable case characteristics that are conventionally thought to affect the perception of seriousness. For residential theft, this vector might include: (1) the value of the property taken, (2) whether the dwelling was occupied at the time of the crime, (3) whether there was forced entry, (4) whether the accused carried a weapon, (5) the age of the accused, etc.

The quality of the evidence prior to trial is specified as

$$x_2^* = \lambda y_2 + \beta_{21} z_1 + \beta_{2e} e + \boldsymbol{\alpha}_2' \mathbf{x}_2 + \varepsilon_2, \tag{13}$$

where $\mathbf{x}_2$ is a $K_2 \times 1$ vector of observable variables affecting the quality of the evidence but appearing nowhere else in the model, $\varepsilon_2$ is a disturbance, and $\lambda$, $\beta_{21}$, $\beta_{2e}$, and $\boldsymbol{\alpha}_2$ are $K_2 + 3$ parameters to be estimated. As previously discussed, victim cooperation, $y_2$, is often critical for successful prosecution. Its inclusion as an observable cause of case quality reflects this view. Case quality will also be determined by observable variables such as: (1) the number of witnesses besides the victim, (2) whether the property was recovered, (3) whether the accused's fingerprints were found at the scene, (4) whether the accused was found with burglary tools, (5) whether the accused had an alibi, etc. These observables are included in $\mathbf{x}_2$. Race and education are included in equation (13) to control for common variation in $x_1^*$ and $x_2^*$. Reasons to expect a correlation between race and case quality are discussed above.
Education is included in equation (13) because, all other things being equal, people with little education are likely to be less skillful even in illegitimate activities and hence may leave more incriminating evidence.

The prior record of the defendant consists of a number of dimensions. Generally, the information that is available to the judge concerning prior record can also be observed by the investigator. It is not clear, however, how this information is typically interpreted by the various decision makers in the criminal justice system. The various studies we reviewed emphasize different aspects of prior record. To reflect these uncertainties, we have chosen to model prior record as unobservable.20 Our specification of prior record is

$$x_3^* = \boldsymbol{\alpha}_3' \mathbf{x}_3 + \varepsilon_3, \tag{14}$$

where $\mathbf{x}_3$ is a $K_3 \times 1$ vector of observable variables determining the prior record of the accused, $\varepsilon_3$ is a disturbance, and $\boldsymbol{\alpha}_3$ contains $K_3$ parameters to be estimated. The components of the vector $\mathbf{x}_3$ might include the number of previous arrests (felony and misdemeanor separately), the number of previous convictions (felony and misdemeanor separately), previous time served, recency of previous offenses, whether the accused was on parole or probation at the time of the offense, etc. The disturbance $\varepsilon_3$ represents our uncertainty about the factors that determine prior record. Race and wealth are excluded from the equation for prior record because they are unrelated to our uncertainty concerning the determinants of prior record.

Finally, we specify an equation determining the quality of the evidence at the time of the trial, $x_4^*$. We distinguish between evidence quality at the time of the trial and prior to trial because some have conjectured that a defendant who is detained while awaiting trial is at a disadvantage in preparing a defense. This might be examined with the following specification of the quality of the evidence at the time of trial:

$$x_4^* = \eta y_5 + x_2^* + \varepsilon_4, \tag{15}$$

where $\varepsilon_4$ is a disturbance and $\eta$ is a parameter to be estimated.21 Here $y_5$ indicates whether the accused was released on bail.22

The 13 equations (3) through (15) are the structural equations of the illustrative model of criminal case disposition. Specification of the model is finalized by specification of the stochastic properties of the disturbances $u_1, u_2, \ldots, u_9, \varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4$.

The structural equations contain 13 disturbances: $u_i$ ($i = 1, 2, \ldots, 9$), and $\varepsilon_j$ ($j = 1, 2, 3, 4$). We assume that each $u_i$ ($i = 1, 2, \ldots, 9$) has a zero mean, a constant variance denoted by $\omega_{ii}$ ($i = 1, 2, \ldots, 9$), and is distributed independently of $z_m$ ($m = 1, 2, 3$), $e$, $\varepsilon_n$ ($n = 1, 2, 3, 4$), and the vectors $\mathbf{x}_n$ ($n = 1, 2, 3$). Equations (3) through (11) were specified so that any common determinants of every pair of indicators are explicitly taken into account. Accordingly, we assume that the covariance of $u_i$ and $u_{i'}$ is zero for all $i$ not equal to $i'$. We assume that each $\varepsilon_j$ ($j = 1, 2, 3, 4$) has a zero mean and is distributed independently of $z_m$ ($m = 1, 2, 3$), $e$, and the vectors $\mathbf{x}_n$ ($n = 1, 2, 3$). Equations (12) through (15) were specified so that any common determinants of every pair of unobservable $x^*$'s are explicitly taken into account. Accordingly, we assume that the covariance of $\varepsilon_j$ and $\varepsilon_{j'}$ is zero for all $j$ not equal to $j'$. Finally, in order to specify the scales of $x_1^*$, $x_2^*$, and $x_3^*$ we adopt the convenient normalizations $V(\varepsilon_1) = V(\varepsilon_2) = V(\varepsilon_3) = 1$. These normalizations and the coefficient of unity on $x_2^*$ in equation (15) remove the trivial indeterminacies due to the otherwise arbitrary units of measurement of $x_1^*$, $x_2^*$, $x_3^*$, and $x_4^*$.

ESTIMABILITY OF THE PARAMETERS OF THE ILLUSTRATIVE STRUCTURAL MODEL

The structural equation model we have presented involves coefficients with direct interpretations. In contrast, we argued in the previous section that the coefficients estimated in many of the studies we reviewed are best interpreted as mixtures of structural parameters. Such mixtures admit many different interpretations, often with very different policy implications. Our structural parameters, in contrast, yield direct tests of the incidence of discrimination in the criminal justice system. Hence an obviously important question is: Can the structural parameters of direct interest be estimated from observable data? A structural parameter that can be estimated consistently is said to be identified.
The structural parameters of the model presented above are the $40 + K_1 + K_2 + K_3$ structural coefficients (i.e., 14 $\beta$'s, 15 $\theta$'s, 5 $\gamma$'s, 4 $\delta$'s, $\eta$, x, and the $K_1 + K_2 + K_3$ elements of the vectors $\alpha_1$, $\alpha_2$, and $\alpha_3$) and 10 unnormalized variances $\omega_{ii}$ ($i = 1, 2, \ldots, 9$) and $V(\varepsilon_4)$. We have, in fact, verified that all of these parameters except $\omega_{77}$ and $V(\varepsilon_4)$ are identified.23 This exercise, while straightforward, involves many tedious algebraic manipulations. Here we merely indicate the nature of the processes by which the identification status of the parameters was examined.

The issue of identification is conveniently examined by considering the reduced form of the model comprised of equations (3) through (15). The reduced-form equation for each indicator ($y$ variable) is obtained by manipulating algebraically equations (3) through (15) to obtain expressions for the indicators solely as functions of observable exogenous variables (i.e., the $z$'s, $x$ vectors, and $e$) and disturbances (i.e., the $u$'s and $\varepsilon$'s). The reduced form for the system is the collection of the nine reduced-form equations.

The reduced-form system can be written most conveniently employing matrix notation. Let $y = (y_1, y_2, \ldots, y_9)'$ be the $9 \times 1$ vector containing the indicators of the structural model. Then the nine equations comprising the reduced form of equations (3) through (15) can be written compactly as

$$y = \pi_1 z_1 + \pi_e e + \pi_2 z_2 + \pi_3 z_3 + \Pi_1 x_1 + \Pi_2 x_2 + \Pi_3 x_3 + v , \tag{16}$$

where:

$\pi_1$ = a $9 \times 1$ vector containing the reduced-form coefficients of the race variable $z_1$ (i.e., the $i$th element of $\pi_1$ is the coefficient of $z_1$ in the reduced-form equation for $y_i$),
$\pi_e$ = a $9 \times 1$ vector containing the reduced-form coefficients of the education variable,
$\pi_2$ = a $9 \times 1$ vector containing the reduced-form coefficients of $z_2$,
$\pi_3$ = a $9 \times 1$ vector containing the reduced-form coefficients of $z_3$,
$\Pi_1$ = a $9 \times K_1$ matrix containing the reduced-form coefficients of the $K_1$ variables contained in the vector $x_1$,
$\Pi_2$ = a $9 \times K_2$ matrix containing the reduced-form coefficients of the $K_2$ variables $x_2$,
$\Pi_3$ = a $9 \times K_3$ matrix containing the reduced-form coefficients of the $K_3$ variables $x_3$, and
$v$ = a $9 \times 1$ vector of reduced-form disturbances.

The reduced-form coefficients (i.e., the elements of the $\pi$ vectors and $\Pi$ matrices) are specific, derivable functions of the structural coefficients. The reduced-form disturbances (i.e., the elements of $v$) are specific,

derivable functions of the structural parameters and the structural disturbances $u_i$ ($i = 1, 2, \ldots, 9$) and $\varepsilon_j$ ($j = 1, 2, 3, 4$). The reduced-form coefficients are displayed in Table 3-1 and the reduced-form disturbances are displayed in Table 3-2.

The identification issue can be decomposed into the following two questions. Can the reduced-form coefficients and the variances and covariances of the reduced-form disturbances be consistently estimated from observable data? Assuming an affirmative answer, can the values of the structural parameters be deduced from knowledge of the values of these mixtures of the structural parameters? The first of these questions is statistical and the second purely algebraic.24

With regard to the first question, our assumptions concerning the $u_i$ ($i = 1, 2, \ldots, 9$) and $\varepsilon_j$ ($j = 1, 2, 3, 4$), plus standard assumptions about the sampling process, imply that least-squares regressions of the indicators on the observed exogenous variables (i.e., the $z$'s, $x$ vectors, and $e$) yield consistent estimates of the elements of the $\pi$ vectors and $\Pi$ matrices.25 In addition, the variances and covariances of the $v$'s, which involve structural coefficients and the 10 unnormalized variances, can be consistently estimated from the reduced-form residuals.26 In turn, it can be shown that these two sets of estimates are sufficient to solve uniquely for estimates of the structural parameters (with the exception of $\omega_{77}$ and $V(\varepsilon_4)$). Thus the parameters of the structural model (with the exception of $\omega_{77}$ and $V(\varepsilon_4)$) are identified.

To provide further insight into unobservable variable models generally and the identification issue in particular, consider the following simple model, which embodies the essential features of the reduced form of our illustrative model:27

$$y_1 = \theta_1 z + v_1 , \tag{17a}$$
$$y_2 = \theta_2 z + v_2 , \text{ and} \tag{17b}$$
$$y_3 = \theta_3 z + v_3 , \tag{17c}$$

where $y_1$, $y_2$, $y_3$, and $z$ are observable and $v_1$, $v_2$, and $v_3$ are unobservable.
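The claim that least squares recovers the reduced-form coefficients, and that the residuals carry information about the disturbance covariances, can be illustrated with a small simulation of a system shaped like (17). This is only a sketch under assumptions that are not in the chapter: all variables are drawn as normal, the disturbances share a single common component, and the parameter values (`THETA`, `BETA`, `OMEGA`) are hypothetical numbers chosen for illustration.

```python
import random

# Hypothetical parameter values, chosen for illustration only.
THETA = [0.5, -0.3, 0.8]   # coefficients of the observable z
BETA = [1.0, 0.7, 0.5]     # loadings on a common unobserved component
OMEGA = [0.5, 0.5, 0.5]    # variances of the equation-specific noise


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def simulate(n, seed=0):
    """Draw z and three indicators y_i = theta_i * z + v_i with correlated v_i."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    common = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = []
    for theta, beta, omega in zip(THETA, BETA, OMEGA):
        sd = omega ** 0.5
        ys.append([theta * zi + beta * ci + rng.gauss(0.0, sd)
                   for zi, ci in zip(z, common)])
    return z, ys


def ols(y, z):
    """Least-squares slope of y on z (with intercept) and the residuals."""
    slope = cov(y, z) / cov(z, z)
    mean_y, mean_z = sum(y) / len(y), sum(z) / len(z)
    resid = [yi - mean_y - slope * (zi - mean_z) for yi, zi in zip(y, z)]
    return slope, resid


z, ys = simulate(20000)
fits = [ols(y, z) for y in ys]
slopes = [slope for slope, _ in fits]
# Covariance of the residuals from equations 1 and 2; in this setup it
# should be close to BETA[0] * BETA[1], the common-component contribution.
resid_cov_12 = cov(fits[0][1], fits[1][1])

print("estimated slopes:", [round(s, 3) for s in slopes])
print("residual covariance (eq. 1, eq. 2):", round(resid_cov_12, 3))
```

With a large simulated sample the slopes settle near `THETA` even though the disturbances are correlated across equations, because z is uncorrelated with every disturbance. The residual covariance is the kind of quantity the second, algebraic step of the identification argument works with.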
This model is a simplified version of the reduced form of our illustrative structural model. It contains an observable variable $z$ and a set of unobservable disturbances $v_1$, $v_2$, and $v_3$. To further the analogy between this simple model and the reduced form of our structural model, two additional

[TABLE 3-1 Reduced-form coefficients. The table body is not legible in the machine-read text.]

[TABLE 3-2 Reduced-form disturbances. The table body is not legible in the machine-read text.]

assumptions are introduced. First, it is assumed that $z$ is uncorrelated with $v_1$, $v_2$, and $v_3$. This is analogous to the reduced form of our illustrative structural model in that the observable variables of the reduced form are uncorrelated with the reduced-form disturbances. Second, it is assumed that the unobservable disturbances are composed of a common factor $x^*$ and equation-specific factors $u_1$, $u_2$, and $u_3$ with variances $\omega_1$, $\omega_2$, and $\omega_3$ respectively:

$$v_1 = \beta_1 x^* + u_1 , \tag{18a}$$
$$v_2 = \beta_2 x^* + u_2 , \text{ and} \tag{18b}$$
$$v_3 = \beta_3 x^* + u_3 . \tag{18c}$$

The variable $x^*$ (whose variance is normalized to one) is assumed to be the sole common factor leading to correlations among $v_1$, $v_2$, and $v_3$. Accordingly, it is assumed that $x^*$, $u_1$, $u_2$, and $u_3$ are mutually uncorrelated. This is a simplified version of the structure of the reduced-form disturbances of our illustrative model. It captures the essential features of the reduced-form disturbances in our structural model.28

The principal value of this simple model is that it allows us to demonstrate fairly simply the way identification was verified for our more complicated structural model (and the way identification can generally be verified in structural equation models). The model contains 10 parameters: $\theta_1$, $\theta_2$, $\theta_3$, $\beta_1$, $\beta_2$, $\beta_3$, $\omega_1$, $\omega_2$, $\omega_3$, and $V(z)$. Data on the four observable variables ($y_1$, $y_2$, $y_3$, and $z$) can be used to compute 10 sample moments (i.e., four variances and six covariances). In the population these moments are related to the parameters of interest by29

$$V(y_1) = \theta_1^2 V(z) + \beta_1^2 + \omega_1 , \tag{19a}$$
$$V(y_2) = \theta_2^2 V(z) + \beta_2^2 + \omega_2 , \tag{19b}$$
$$V(y_3) = \theta_3^2 V(z) + \beta_3^2 + \omega_3 , \tag{19c}$$
$$C(y_1, y_2) = \theta_1 \theta_2 V(z) + \beta_1 \beta_2 , \tag{19d}$$
$$C(y_1, y_3) = \theta_1 \theta_3 V(z) + \beta_1 \beta_3 , \tag{19e}$$
$$C(y_2, y_3) = \theta_2 \theta_3 V(z) + \beta_2 \beta_3 , \tag{19f}$$
$$C(y_1, z) = \theta_1 V(z) , \tag{19g}$$
$$C(y_2, z) = \theta_2 V(z) , \tag{19h}$$
$$C(y_3, z) = \theta_3 V(z) , \text{ and} \tag{19i}$$
$$V(z) = V(z) .30 \tag{19j}$$
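The moment relations in (19) can be checked numerically. Subtracting $C(y_i,z)C(y_j,z)/V(z)$ from $C(y_i,y_j)$ removes the part of each covariance that runs through $z$ and leaves $\beta_i \beta_j$, so a ratio of three such terms isolates $\beta_1^2$. The sketch below simulates the simple model and applies that calculation to sample moments. It assumes normally distributed variables and hypothetical parameter values; the function names (`simulate`, `net_cov`, `estimate_first_equation`) are illustrative and not from the chapter.

```python
import math
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def simulate(n, theta, beta, omega, seed=1):
    """Draw z and y_i = theta_i * z + beta_i * x_star + u_i, with V(x_star) = 1."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_star = [rng.gauss(0.0, 1.0) for _ in range(n)]  # never shown to the estimator
    ys = []
    for t, b, w in zip(theta, beta, omega):
        sd = math.sqrt(w)
        ys.append([t * zi + b * xi + rng.gauss(0.0, sd)
                   for zi, xi in zip(z, x_star)])
    return z, ys


def net_cov(yi, yj, z):
    """C(yi, yj) - C(yi, z) C(yj, z) / V(z): covariance net of the effect of z."""
    return cov(yi, yj) - cov(yi, z) * cov(yj, z) / cov(z, z)


def estimate_first_equation(z, y1, y2, y3):
    """Moment estimates of theta_1, |beta_1|, and omega_1 from observables only."""
    theta1 = cov(y1, z) / cov(z, z)
    beta1 = math.sqrt(net_cov(y1, y2, z) * net_cov(y1, y3, z) / net_cov(y2, y3, z))
    omega1 = cov(y1, y1) - theta1 ** 2 * cov(z, z) - beta1 ** 2
    return theta1, beta1, omega1


# Hypothetical true values, for illustration only.
TRUE_THETA = [0.5, -0.3, 0.8]
TRUE_BETA = [1.0, 0.7, 0.5]
TRUE_OMEGA = [0.5, 0.5, 0.5]

z, (y1, y2, y3) = simulate(50000, TRUE_THETA, TRUE_BETA, TRUE_OMEGA)
theta1_hat, beta1_hat, omega1_hat = estimate_first_equation(z, y1, y2, y3)
print("theta_1, beta_1, omega_1 estimates:",
      round(theta1_hat, 3), round(beta1_hat, 3), round(omega1_hat, 3))
```

The square root returns the magnitude of $\beta_1$ only. Reversing the sign of $x^*$ together with the signs of all three $\beta$'s leaves every observable moment unchanged, so the overall sign is a normalization rather than something the data can settle.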

These 10 equations can be solved simultaneously for the parameters $\theta_1$, $\theta_2$, $\theta_3$, $\beta_1$, $\beta_2$, $\beta_3$, $\omega_1$, $\omega_2$, and $\omega_3$. For example, the effect of $x^*$ on $y_1$ can be written31 as

$$\beta_1 = \left( \frac{\{C(y_1,y_2) - [C(y_1,z)C(y_2,z)/V(z)]\} \times \{C(y_1,y_3) - [C(y_1,z)C(y_3,z)/V(z)]\}}{\{C(y_2,y_3) - [C(y_2,z)C(y_3,z)/V(z)]\}} \right)^{1/2} . \tag{20}$$

An estimator of $\beta_1$ formed by substituting corresponding sample moments for the population moments on the right side of equation (20) is consistent for $\beta_1$. The fact that a similar procedure can be used to derive estimators for the other parameters of the model establishes that all of the parameters are identified.

In essence, it is possible to estimate effects of unobservable variables appearing in multiple structural equations because common movements of the indicators reflect in part the effects of movements in the common unobserved factors. This is illustrated by the expressions in equation (19), which relate moments of observable variables to the $\beta$'s, the coefficients of the unobservable variables. In an identified model such expressions can be solved for the structural parameters. Verification of the identification status of our illustrative structural model is a straightforward if tedious application of this principle.32

HOW THE ILLUSTRATIVE STRUCTURAL MODEL ALLOWS DISENTANGLING THE VARIOUS SOURCES OF RACE-OUTCOME CORRELATIONS: A HEURISTIC DISCUSSION

The fundamental difficulty in empirical examinations of racial discrimination in the criminal justice system is the fact that correlations between race and case outcomes are likely to reflect both discriminatory and nondiscriminatory factors at distinct stages of the criminal justice process. Distinguishing empirically among these various sources of correlation appears to be the appropriate focus of future empirical studies.
Regression analysis and other standard multiple correlation techniques seem ill-equipped for this objective: It seems quite implausible that the primary determinants of case disposition can be measured well enough to allow confident interpretation of partial

165 correlations between race and case outcomes. In the previous two sections we presented a structural model of case dispositions and reported that the parameters of interest are in principle estimable. In this section we attempt to indicate essentially how this model allows disentangling of the parameters involving race. The structural equations presented above can be represented most generally as Bx* + 0z + u , Ax + Ay + ~ , where: a 9 x 9 nonsingular matrix33 of structural coefficients of the indicators, As- (Y1, Y2, ~ Y9) B = a 9 x 4 matrix of structural coefficients of the unobserved exogenous variables, (xl*, x2*, X3*, X4*)t, - a 9 x 3 matrix of structural coefficients of the observed exogenous variables directly (21) (22) affecting the indicators, Z - (Zl, Z2, Z3) ' U - (U1, U2, · · · ~ U9) ~ H = a 4 x 4 nonsinaular matrix of structural coefficients, A = a 4 x (K1 + K2 + K3) matrix of structural coefficients, x - (x1 , x2 , x3 ) ~ A - a 4 x 9 matrix of structural coefficients, and ~ - (£1, ~2, ~3, ~4) We have placed numerous a priori restrictions on the matrices r , B. 0, H. A, and A, the covariance matrices of _ and £, and other covariances. Without restrictions, the parameters of the model described by equations (21) and (22) would not be identified. Thus, a priori restric- tions are the source of identification in structural equation models. Restrictions that are inappropriate, however, introduce biases. Thus it seems especially useful to consider the nature of the restrictions of the model presented above that allow disentangling of the various possible sources of correlation between race and case outcomes.

The first column of Table 3-1 displays the elements of $\pi_1$. They are the coefficients of the race variable $z_1$ in the nine reduced-form equations. The structural coefficients of race are $\delta_{11}$, $\delta_{21}$, $\theta_{11}$, $\theta_{31}$, $\theta_{41}$, $\theta_{71}$, $\theta_{81}$, and $\theta_{91}$. Inspection of Tables 3-1 and 3-2 reveals that these parameters appear in no other reduced-form coefficients, nor do they appear in the variances and covariances of the reduced-form disturbances. Thus even if all the other structural parameters in $\pi_1$ were known, only these mixtures of structural parameters can be helpful in identifying $\delta_{11}$, $\delta_{21}$, $\theta_{11}$, $\theta_{31}$, $\theta_{41}$, $\theta_{71}$, $\theta_{81}$, and $\theta_{91}$. Note that the reduced-form coefficient of $z_1$ in the $y_5$ equation is merely $\gamma_{54}$ times the reduced-form coefficient of $z_1$ in the $y_4$ equation. Thus, if we know $\gamma_{54}$, one of these reduced-form coefficients is redundant. Accordingly, we have eight equations (corresponding to the eight linearly independent elements of $\pi_1$) that can be used to solve for the eight structural parameters that appear in $\pi_1$ but do not appear elsewhere: $\delta_{11}$, $\delta_{21}$, $\theta_{11}$, $\theta_{31}$, $\theta_{41}$, $\theta_{71}$, $\theta_{81}$, and $\theta_{91}$. Clearly, if these eight equations were to involve more than eight unknown parameters, then at least some of these parameters would not be identified.

It would appear, then, that the restrictions that $z_1$ does not play a role in determining $y_2$ (victim cooperation) or $y_5$ (whether bail is made) are essential in separating the various sources of correlation between race and case disposition. The exclusion of $z_1$ from equation (4) seems quite warranted once it is recognized that many victims of residential thefts are of the same race as the people we fear are the victims of discrimination in the criminal justice system. It seems entirely reasonable to assume that if people discriminate on the basis of race they do not systematically discriminate against people of their own race. The other crucial restriction that allows identification is the exclusion of $z_1$ from the accused's decision to post bail.
Such a restriction seems quite reasonable on the basis that people of the disadvantaged group do not discriminate against themselves. In fact, decisions made by the accused appear to provide extremely valuable information. This discussion seems to provide useful guidance for construction of other models of the criminal justice process. If one is directly concerned about disentangling the various sources of correlation between race and case outcomes, especially useful are indicators

from whose structural equations the race variable may be legitimately excluded. The literature has raised the specter that all agents of the state (i.e., police, prosecutors, juries, and judges) use their discretion to discriminate against disadvantaged groups. Thus it appears that excluding race from equations representing decisions of agents of the state will always be controversial. Decisions made by the accused that depend on (at least some of) the primary determinants thus seem to provide the most useful information in identifying the structural parameters of racial variables.

We close this section by suggesting indicators not represented in our illustrative model that might be especially useful. Perhaps the most promising is whether the accused testifies at trial. This decision should depend on at least some of the $x^*$'s but not directly on race.34 Another potentially powerful indicator is whether the accused accepts a plea bargain. Klepper et al. (in this volume) provide a model of the plea bargaining process that would be helpful in structuring a plea bargaining equation (this is discussed at greater length below). A number of other decisions made by the accused might also warrant more explicit consideration. These include the choice between a bench and a jury trial, whether the accused volunteers restitution, whether the accused enters a drug or alcohol rehabilitation program, and more detailed information concerning the extent of the defense effort.

IDENTIFICATION IN LATENT VARIABLE MODELS WITH SAMPLE SELECTION

Sample selection is a natural feature of the criminal justice system. Cases are screened from the system at a number of junctures. In our illustrative model, some cases are dropped from the system because of dismissals and acquittals. The model implies that these sample selections cannot be considered to be random. Klepper et al. (in this volume) emphasize that this type of nonrandom sample selection may introduce substantial biases into statistical analyses of case disposition. The purpose of this section is to analyze the implications of sample selection for the identification status of our model. It appears that sample selection does not alter the identification status of our illustrative model (subject

to an important qualification discussed below). The basis for this claim is demonstrated in the context of the simple unobservable variable model composed of equations (17) and (18). We introduce sample selection by assuming that we observe a random sample on $y_1$, but $y_2$ and $y_3$ are observed only if $y_1 > 0$ (e.g., a presentence report and a sentence are observed only if the accused is convicted). In order to compute the variances and covariances of the observables in the selected sample, we augment the assumptions of the simple model by assuming that $x^*$, $z$, $u_1$, $u_2$, and $u_3$ are normally distributed.35 Then the moments of the observables in the selected sample can be written as36

$$V(y_1 \mid y_1 > 0) = a - ab^2 , \tag{23a}$$
$$V(y_2 \mid y_1 > 0) = V(y_2) - d\,(\theta_1 \theta_2 V(z) + \beta_1 \beta_2)^2 , \tag{23b}$$
$$V(y_3 \mid y_1 > 0) = V(y_3) - d\,(\theta_1 \theta_3 V(z) + \beta_1 \beta_3)^2 , \tag{23c}$$
$$C(y_1, y_2 \mid y_1 > 0) = (a - ab^2)\,a^{-1}(\theta_1 \theta_2 V(z) + \beta_1 \beta_2) , \tag{23d}$$
$$C(y_1, y_3 \mid y_1 > 0) = (a - ab^2)\,a^{-1}(\theta_1 \theta_3 V(z) + \beta_1 \beta_3) , \tag{23e}$$
$$C(y_2, y_3 \mid y_1 > 0) = C(y_2, y_3) - d\,(\theta_1 \theta_2 V(z) + \beta_1 \beta_2)(\theta_1 \theta_3 V(z) + \beta_1 \beta_3) , \tag{23f}$$
$$C(y_1, z \mid y_1 > 0) = (a - ab^2)\,a^{-1}\theta_1 V(z) , \tag{23g}$$
$$C(y_2, z \mid y_1 > 0) = C(y_2, z) - d\,(\theta_1 \theta_2 V(z) + \beta_1 \beta_2)\,\theta_1 V(z) , \tag{23h}$$
$$C(y_3, z \mid y_1 > 0) = C(y_3, z) - d\,(\theta_1 \theta_3 V(z) + \beta_1 \beta_3)\,\theta_1 V(z) , \text{ and} \tag{23i}$$
$$V(z \mid y_1 > 0) = V(z) - d\,(\theta_1 V(z))^2 , \tag{23j}$$

where $V(y_1 \mid y_1 > 0)$ denotes the variance of $y_1$ given that $y_1$ is positive, etc.,

$$a = \beta_1^2 + \theta_1^2 V(z) + \omega_1 , \qquad b = \phi(0)/(1 - \Phi(0)) , \qquad d = a^{-1} b^2 ,$$

and $\phi(\cdot)$ and $\Phi(\cdot)$ are respectively the standard normal density and cumulative distribution functions. The expressions in equation (23) are more complicated than their counterparts in equation (19), but in fact yield solutions for all of the structural parameters.37 Sample selection does not introduce any additional parameters to be estimated and does not reduce the number

of moments38 of observable variables. Thus the presence of sample selection does not appear to alter the identification status of an otherwise identified structural model involving latent variables.

This somewhat surprising result is attributable to the fact that we observe multiple indicators after sample selection occurs. To see this, consider the complications introduced by sample selection in Heckman's (1979) model. Heckman considers a two-equation system in which the dependent variable in the second equation is observed only if the dependent variable in the first equation exceeds a threshold. To make this concrete, consider the two equations of our structural model corresponding to the dismissal and charge decisions (i.e., equations (3) and (5)). Suppose that the prosecutor discriminates against lower-SES defendants in the dismissal decision. Equation (3) implies that the greater the three primary determinants, the lower the probability of dismissal. Consequently, cases that are not dismissed will tend to be above average on an index that combines these three factors (i.e., $\beta_{11} x_1^* + \beta_{12} x_2^* + \beta_{13} x_3^*$ in equation (3)). If there is discrimination against lower-SES defendants, then higher-SES defendants whose cases are not dismissed will tend to have a higher value for each of the three primary determinants than lower-SES defendants whose cases are not dismissed. Now consider the charge decision, which is also affected by the seriousness of the offense and prior record. The above argument implies that higher-SES individuals whose cases are not dismissed will tend to be above average on seriousness and prior record. If either of these factors is not adequately controlled for, estimates will suggest reverse discrimination at the charging stage when in fact no discrimination is present.
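The selection argument above can be made concrete with a toy simulation. The sketch below is an illustration under assumptions that are not in the chapter: a single unobserved seriousness factor that is independent of SES, a dismissal rule that favors higher-SES defendants by a hypothetical amount `DELTA`, and a charge decision that depends on seriousness alone. Comparing mean charge severity across SES groups, with no control for seriousness, stands in for a regression that omits the unobservable.

```python
import random

# Hypothetical size of the advantage given to higher-SES defendants at the
# dismissal stage (i.e., discrimination against lower-SES defendants).
DELTA = 1.0


def simulate_cases(n, seed=2):
    """Return (high_ses, prosecuted, charge) tuples for n simulated cases."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        high_ses = rng.random() < 0.5
        seriousness = rng.gauss(0.0, 1.0)  # unobserved by the analyst
        # Strength-of-case index: lower for higher-SES defendants, all else equal.
        strength = seriousness - DELTA * high_ses + rng.gauss(0.0, 1.0)
        # Charge severity depends on seriousness only: no discrimination here.
        charge = seriousness + rng.gauss(0.0, 1.0)
        cases.append((high_ses, strength > 0.0, charge))
    return cases


def ses_gap(cases):
    """Mean charge severity of higher-SES minus lower-SES defendants."""
    high = [charge for high_ses, _, charge in cases if high_ses]
    low = [charge for high_ses, _, charge in cases if not high_ses]
    return sum(high) / len(high) - sum(low) / len(low)


cases = simulate_cases(40000)
gap_all = ses_gap(cases)                                      # no selection
gap_prosecuted = ses_gap([case for case in cases if case[1]])  # after dismissals

print("SES gap in charge severity, all cases:", round(gap_all, 3))
print("SES gap in charge severity, prosecuted cases only:", round(gap_prosecuted, 3))
```

In the full set of cases the gap is close to zero, as it should be when charging does not depend on SES. Among prosecuted cases the gap is positive: higher-SES defendants who survive the dismissal screen have more serious cases on average, so an analysis that cannot observe seriousness would read the gap as harsher charging of higher-SES defendants.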
More generally, sample selection will tend to introduce biases understating the extent of discrimination at all stages following the dismissal decision when discrimination occurs at the dismissal stage. This problem does not appear to affect the estimability of structural equation models precisely because they control completely for the effects of unobservable variables. In the context of the present application this conclusion must be qualified in one important respect. In our simple model, we assumed that $y_1$ is observable. This enabled us to estimate the covariances of $y_1$ with $y_2$, $y_3$, and $z$ that would apply to

the selected sample. The unconditional covariances of $y_1$ with $y_2$, $y_3$, and $z$ were then computed by adjusting the covariances calculated from the observations that were not screened from the sample. The first step in the process, the estimation of the population covariances for the selected population of cases, is essential. Because of the presence of limited dependent variables, however, we may not observe the value of $y_1$ when $y_2$ and $y_3$ are observed. Instead, the variable $y_1$ may be qualitative and be constant for all observations that remain in the sample after selection. For example, consider the dismissal equation (3) in our illustrative model. The dependent variable in this model is an index of the strength of the case for purposes of prosecution. Generally, we cannot observe this index, but only whether a case is dismissed. In this instance, we learn nothing about the value of $y_1$ except that it is above or below some threshold. For cases that are not dismissed, we may observe no variation in $y_1$. This will preclude estimation of the covariances involving $y_1$ using the selected observations, implying that the approach to identification described here breaks down.

This suggests that if some information can be observed for the indicator that determines selection when cases are not dropped from the system, then it will be possible to control for the effects of selection. This may be quite possible for one of the two selection points in our model: the dismissal decision. In some instances, prosecutors must justify in writing their reasons for dropping a case. The final decision to drop a case is often made only after a number of reviews of the same case by different attorneys in the prosecutor's office (for example, see Frase's (1978) detailed analysis of the dismissal decision for federal cases). Such reviews may provide the basis for measuring the strength of the case for purposes of prosecution.
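The adjustment from selected-sample covariances to unconditional ones, which relies on $y_1$ being observed in the selected sample, can be sketched for the simple model. Under joint normality with zero means and selection on $y_1 > 0$, every covariance involving $y_1$ is scaled by $1 - b^2$, where $b = \phi(0)/(1 - \Phi(0))$, so dividing by that factor undoes the scaling. The parameter values and names below are hypothetical, and the zero-mean, zero-threshold setup is an assumption of this illustration.

```python
import math
import random

# Hypothetical parameters of the first two equations of the simple model.
THETA = [0.5, -0.3]
BETA = [1.0, 0.7]
OMEGA = [0.5, 0.5]


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def simulate(n, seed=3):
    """Return lists y1, y2 with y_i = theta_i * z + beta_i * x_star + u_i."""
    rng = random.Random(seed)
    y1, y2 = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x_star = rng.gauss(0.0, 1.0)
        y1.append(THETA[0] * z + BETA[0] * x_star + rng.gauss(0.0, math.sqrt(OMEGA[0])))
        y2.append(THETA[1] * z + BETA[1] * x_star + rng.gauss(0.0, math.sqrt(OMEGA[1])))
    return y1, y2


# b = phi(0) / (1 - Phi(0)) for the standard normal, so b**2 equals 2/pi.
B = (1.0 / math.sqrt(2.0 * math.pi)) / 0.5
SHRINK = 1.0 - B ** 2  # scaling applied by selection to moments involving y1

y1_all, y2_all = simulate(50000)
# y2 is treated as observed only when y1 > 0; y1 is observed for every case.
pairs = [(p, q) for p, q in zip(y1_all, y2_all) if p > 0.0]
y1_sel = [p for p, _ in pairs]
y2_sel = [q for _, q in pairs]

var_ratio = cov(y1_sel, y1_sel) / cov(y1_all, y1_all)  # compare with SHRINK
cov_selected = cov(y1_sel, y2_sel)                      # attenuated by selection
cov_adjusted = cov_selected / SHRINK                    # estimate of C(y1, y2)
cov_true = THETA[0] * THETA[1] * 1.0 + BETA[0] * BETA[1]  # V(z) = 1 here

print("variance ratio vs. 1 - b^2:", round(var_ratio, 3), round(SHRINK, 3))
print("selected, adjusted, true covariance:",
      round(cov_selected, 3), round(cov_adjusted, 3), round(cov_true, 3))
```

If $y_1$ were recorded only as a dismissed/not-dismissed indicator, `cov_selected` could not be computed at all, which is the breakdown described above.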
Moreover, in some instances a qualitative ranking of the strength of all cases may be assigned before a final prosecution decision is made. Such information would certainly provide a means of distinguishing the strength of the case for purposes of prosecution for cases that are not dismissed. The other selection point occurs at the trial stage. It is unlikely that any information would be available to construct an index of the strength of the evidence against the defendant for cases that result in conviction. This is not problematic, however, because this selection does not alter the distribution of any of

the unobservable variables affecting the decisions that follow conviction. This is because the conviction decision is assumed to depend only on the quality of the evidence and not seriousness and prior record. Only the latter two unobservables enter later decisions.

If it is not possible to observe information about the strength of the prosecution case in cases that are not dismissed, then it may still be possible to deal with selection. An alternative approach to the selection problem is provided by Heckman (1979) and Olsen (1980). It requires restrictions on the coefficients of observable variables in the selection and later equations. Such an approach appears to be feasible and natural in our model. It is discussed in a more general context in Klepper et al. (in this volume).

ADDITIONAL SPECIFICATION ISSUES

In this section we discuss briefly three types of specification issues. First we consider some potentially worthwhile simplifications of our illustrative structural model. We then consider some interesting and (in principle) desirable extensions of the illustrative model. Finally, we discuss issues relating to the fact that often only qualitative rather than continuous data concerning various outcomes of the criminal justice system are available.

Data limitations might dictate that some of the equations of our illustrative model cannot be estimated. In fact, the equations for two indicators for which data are costly to obtain (bail amount and the forcefulness of the presentence report) could be eliminated39 without sacrificing all of the benefits of the structural approach. Neither of these equations involves restrictions that seem crucial to sort out the roles of race in affecting the other seven indicators. The major cost of failing to observe the bail amount would be an inability to examine directly whether estimated effects of race on the probability of making bail are due to effects of race on the level of bail.
The major drawback of failing to measure the forcefulness of the presentence report is similar. In such a case we would be unable to distinguish between discrimination in presentence reports and discrimination in sentencing. Less complicated structural models would also result from further a priori restrictions on the illustrative

model. Two particular restrictions seem quite reasonable and useful in terms of simplifying the analysis.40 The first involves the assumption that all of the relevant factors affecting prior record are observed.41 Since this would reduce the number of latent variables, we would then need fewer observable indicators (and hence structural equations) to identify the structural parameters of a model simplified in this way.42 Another plausible restriction that would simplify the analysis is the assumption that case quality is uncorrelated with race. As discussed above, the reasons and evidence supporting such a correlation are somewhat less compelling than the reasons and evidence supporting a correlation between race and seriousness. Such a simplification would mean that $\delta_{21}$ in equation (13) would equal zero; then, following the discussion above, we would need only a single structural equation omitting race to identify the structural coefficients of the race variable. One equation, such as the victim cooperation equation, could then be deleted without losing the ability to sort out the sources of correlations between race and the other indicators. Moreover, in some contexts it might be appropriate to assume that case quality is uncorrelated with all of the other exogenous variables in the model. Such an assumption would also reduce the number of indicators (and hence structural equations) necessary for identification.

Another possibility is to eliminate restrictions from the model even if this compromises the identification of some of the parameters. It may still be possible to extract useful information about underidentified parameters. For example, one might be willing to estimate some structural parameters conditional on imposed values of other structural parameters. Another approach involves estimating upper and lower bounds on structural parameters from the fact that the variance-covariance matrix of the true variables must be nonnegative definite (see Klepper, 1980). Finally, Bayesian analysis explicitly incorporating subjective prior information about the structural parameters may also contribute to our knowledge concerning the role of extralegal factors in affecting criminal case disposition.

There are important reasons to begin with consideration of parsimonious models. Despite this fact, it seems worthwhile to mention a few apparently useful extensions of the illustrative structural model. First, it might be particularly valuable to incorporate a plea

bargain decision into the model because most convictions result from guilty pleas. Using the plea bargain theory in Klepper et al. (in this volume), an additional unobservable variable representing culpability might be introduced into the model.

Another desirable extension of the illustrative structural model would involve use of an explicit theory to structure the suspected correlations between race and seriousness and case quality. In the present formulation, for example, a positive estimate of $\delta_{11}$ would indicate the existence of a correlation between race and seriousness but would provide no information concerning the source of such a correlation. Recall from the discussion that the literature contains at least three reasons to expect race and seriousness to be correlated. If the source of the correlation is labeling, as emphasized by Lizotte, then we might interpret this as legislative discrimination. If the economic basis to expect a correlation is operative, we might interpret $\delta_{11}$ as reflecting discrimination in the labor market. Distinguishing among such possibilities would seem to be quite desirable for policy purposes.

Finally, we might not be able to observe directly anything about the financial capabilities of defendants. If so, wealth could be modeled as an unobservable. Then variables such as the average income in the census tract containing the residence of the accused might be regarded as a classical proxy and hence an indicator of the wealth of the defendant. When other indicators of wealth are observable (e.g., level of bail, release on bail, choice of attorney) models treating wealth as an unobservable are likely to be identified.

Finally, we briefly mention specification issues relating to the fact that some of the indicators we have defined may not be directly observable. In some instances such variables are modeled as direct causes of other indicators or latent variables. An interesting specification issue is whether such variables should be viewed identically in their roles as both indicators and causes. For example, consider $y_2$, which was defined as an index determining the probability that the victim cooperates with the prosecution. This variable also appears in equation (13) as a factor contributing to case quality. One might interpret $y_2$ in equation (13) as a dichotomous outcome determined by the continuous-index version of $y_2$. But victim cooperation might be more plausibly viewed as a continuous variable representing

the enthusiasm with which the victim cooperates. Similar remarks apply to $y_6$ (i.e., attorney choice), which appears as an indicator in equation (8) and a cause of the trial outcome in equation (9). It seems clear, however, that $y_3$ (i.e., charge) and $y_5$ (i.e., release on bail) are most plausibly viewed as qualitative outcomes of the indexes defined in equations (5) and (7) when they appear as observable causes in equations (8) (i.e., choice of attorney) and (15) (i.e., the case quality at trial), respectively. A useful paper in this regard is Heckman (1978), which presents an estimation scheme for situations involving endogenous qualitative regressors.

Finally, the fact that we observe only partial information about various indicators raises estimation issues, even if these variables do not also appear as causes of other variables. Estimation would require explicit modeling of the links between the observed data and the indicators of the illustrative model. For example, Muthen (1979) discusses latent variable models with qualitative indicators. Maximum-likelihood techniques, while somewhat complex in this context, are likely to provide a feasible approach to estimation.

CONCLUDING REMARKS

In our view, the empirical literature on the criminal justice system has evolved in a natural and appropriate way. Total correlations between case disposition and race and SES suggest an alarming degree of discrimination. Studies aimed at probing the source of these correlations have attempted to control for the effects of legally relevant variables using crude, albeit the best available, proxies for such factors. The studies we reviewed are of this variety. For the most part, they still find evidence of discrimination, although less than is suggested by the total correlations between case disposition and race and SES.
The fact that inclusion of legally relevant variables reduces the correlations between case outcomes and variables such as race and SES provides empirical support for theories that predict correlations between legal factors and personal characteristics of defendants. Under such circumstances, failure to control fully for the effects of legally relevant factors implies that inferences about the extent of discrimination are likely to be erroneous. For this reason the techniques

currently being used offer little hope of providing a reliable basis for policy reform. The approach proposed here seems like a logical next step.

Satisfactory resolution of the role of extralegal factors in determining criminal case disposition will be difficult. Structural equation modeling should help, but it is not a panacea. The restrictions in any identified structural model are likely to be controversial. The possibility of compatible indications emerging from a broad range of structural models raises the hope of developing a consensus.

NOTES

1. By case disposition we mean the outcome of a case following arrest and/or indictment. The alternative outcomes include acquittal, dismissal, and various types of sentences (given conviction). Case disposition is generally measured by an index that (arbitrarily) places the different types of outcomes on a common scale.
2. Klepper et al. (in this volume) argue that among the nine studies, these two are the most sensitive to statistical biases due to sample selection. Consequently, the extent of discrimination may be particularly underestimated in these two studies because of the specially selected nature of their samples.
3. Other factors such as prior record and extralegal characteristics of the defendant are also cited (for example, Swigert and Farrell, 1977:25; Farrell and Swigert, 1978:447), but it seems these factors are expected to operate through bail amount (only Lizotte holds constant the effect of bail amount).
4. Although Lizotte (1977) used a strange ordering for the quality of different types of legal representation.
5. For convenience, throughout this paper all random variables are expressed as deviations from their means.
6. An estimator is unbiased for a parameter if its mathematical expectation is equal to that parameter. An estimator is consistent if, in the limit as the sample size goes to infinity, the estimates are arbitrarily close to the parameter of interest. Strictly speaking,

consistency of least squares requires additional (but in this context uninteresting) assumptions concerning the sampling process on x* and z.
7. When the coefficient of x* in the measurement equation equals one and the measurement error $\varepsilon$ is independent of x*, z, and u, x is called a classical measurement of x*. The results discussed here are straightforward extensions of results discussed by Garber and Klepper (1980) for the case of classical measurements.
8. Formally, $f = V(\varepsilon)/V(x|z)$.
9. This result is due independently to McCallum (1972) and Wickens (1972). For cases in which more than one variable is measured with error, this result does not generalize straightforwardly (see Garber and Klepper, 1980).
10. These conditions follow from faD/8 = $f\rho_{xz}[\rho_{yx^*|z}V(y|x^*)/\rho_{yz|x^*}V(y|z)]$, where $\rho_{yx^*|z}$ and $\rho_{yz|x^*}$ are respectively the correlation coefficients of y and x* given z and of y and z given x*, and $\rho_{xz}$ is the correlation coefficient of x and z. This result can be derived by exploiting the relationship between regression coefficients and the second moments of the respective conditional distributions.
11. An especially valuable collection of theoretical and empirical studies is Goldberger and Duncan (1973). Our discussion borrows from Goldberger (1973), the introductory essay in that volume.
12. We consider residential thefts to include the crimes of breaking and entering, petit larceny, grand larceny, second-degree burglary, first-degree burglary, etc. We avoid the plea bargain issue, despite its importance, because of the apparent lack of any widely held views concerning the determinants of the plea bargain decision. Incorporation of especially controversial relationships in the illustrative model could seriously compromise our objectives.
13. In the Washington, D.C., superior court about 50 percent of all arrests are rejected at the initial

screening or subsequently nolled by the prosecutor (Forst et al., 1977).
14. We use $\theta$'s throughout for coefficients of unobservable variables and $\beta$'s for coefficients of observable variables. The first subscript of each coefficient refers to the indicator and the second to the respective unobservable or observable exogenous variable.
15. Discrimination is presumed to be on the basis of race and wealth. The two variables together can be interpreted as the primary components of the SES of the defendant. Alternatively, we might have introduced another observable variable to represent the SES of the defendant. The addition of an accurate measure of SES would not alter the model in any fundamental way. We assume discrimination on the basis of race and wealth alone only to simplify the exposition.
16. Typically, the bail is paid by a bail-bonding agency. The accused pays a nonrefundable fee based on the amount of the bail. Equation (7) can be interpreted as describing the decision to pay this fee.
17. Recall that we model only those cases of residential theft disposed of by a dismissal or trial verdict.
18. As will become clear with the introduction of equation (15) below, we distinguish between case quality before and at trial to examine the claim that release on bail is an important determinant of the quality of the defense.
19. As will become clearer when equation (13) and the stochastic assumptions pertaining to $\varepsilon_1$ are introduced, we attempt to use observable variables to control entirely for correlation between $x_1^*$ and $x_2^*$. Education is viewed as an important correlate of each of these variables.
20. In the section on additional specification issues, we consider treating prior record, or more precisely all the determinants of prior record, as known and observable.
21. Since $x_1^*$, $x_2^*$, $x_3^*$, and $x_4^*$ are never observed, the scales on which these variables are measured are arbitrary. Until these scales are specified, the

magnitudes of their coefficients are trivially indeterminate. In order to remove such indeterminacies it is necessary to "normalize" these coefficients by directly or indirectly specifying the units of measurement of each of the unobservable variables. Specific normalizations are chosen for analytic convenience. The coefficient of $x_2^*$ in equation (15) is chosen as one to specify the scale on $x_4^*$, given the normalization (presented below), which defines the scale of $x_2^*$.
22. Note that $y_5$ was specified as an index variable in equation (7) but as a dichotomous outcome in the present equation. The complications associated with this type of specification issue are discussed in a later section.
23. Inspection of equations (3) through (15) reveals that $\omega_{77}$ and $V(\varepsilon_4)$ are not identified. This is obvious since both $x_4^*$ and $u_7$ appear only in equation (9). Thus in this model randomness in convictions (represented by $u_7$) cannot be distinguished empirically from random influences on the quality of the evidence at trial (represented by $\varepsilon_4$). This lack of identifiability is of minimal concern since neither $\omega_{77}$ nor $V(\varepsilon_4)$ is of direct interest.
24. The second step involves solving for the structural parameters as functions of the reduced-form parameters (this is illustrated by equation (7)). Use of Slutsky's Theorem (see Goldberger, 1964:118-119) establishes that these solutions provide a basis for consistent estimation of the structural parameters.
25. This follows from the fact that the reduced-form disturbances are uncorrelated with the independent variables of the reduced form.
26. These population variances and covariances can be computed quite simply from Table 3-2 since the $u_i$ (i = 1, 2, ..., 9) and $\varepsilon_j$ (j = 1, 2, 3, 4) are mutually uncorrelated.
27. Various symbols (such as $y_1$, $v_1$, etc.) are redefined here in order to emphasize the analogies between the simplified model employed here and our illustrative model of the criminal justice process.

28. These features are the ones that enable us to estimate the various mixtures of structural parameters appearing in the reduced-form disturbances of our model.
29. To see how these expressions are computed, consider for example equation (19d). Since all variables are assumed to have zero means, the covariance of $y_1$ and $y_2$ can be computed as $E(y_1 y_2) = E[(\beta_1 z + \theta_1 x^* + u_1)(\beta_2 z + \theta_2 x^* + u_2)]$. Equation (19d) then follows from the assumptions that z, x*, $u_1$, and $u_2$ are independent and V(x*) = 1.
30. The apparently trivial nature of equation (19j) merely reflects the fact that since z is observable its variance can be estimated directly.
31. This expression for $\beta_1$ results from the use of equations (19d), (19e), (19f), (19g), (19h), (19i), and (19j). It can be verified straightforwardly by substitution of these equations into equation (20).
32. For example, consider how we checked that $\gamma_{54}$ is identified. Table 3-1 indicates that the coefficients of $z_3$ in the reduced-form equations for $y_5$ and $y_4$ are respectively $\gamma_{54}\theta_{42}\delta_{23}$ and $\theta_{42}\delta_{23}$. Division of the former coefficient by the latter provides a solution for $\gamma_{54}$ and thus a basis for consistently estimating $\gamma_{54}$.
33. The assumption that $\Gamma^{-1}$ exists says merely that values of the x* and z variables and the values of the structural disturbances uniquely determine the values of the indicators.
34. It might be argued that race would enter into this decision: Perhaps anticipation of racial discrimination would affect the accused's decision concerning testifying. If that is the case, however, this indicator would still be quite valuable. Suppose that defendants anticipate discrimination in the way described by the other structural equations. In that case race will affect the testimony decision through its effect on expected sentence. This would provide other restrictions that would be useful in disentangling the various structural parameters associated with race.
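The identification argument in note 32 (recovering the coefficient of y4 in the y5 equation as a ratio of two reduced-form coefficients) can be checked numerically. The following is a minimal sketch, not the authors' estimation procedure. It assumes a stripped-down three-equation system in which an observable z3 affects a latent variable with coefficient delta23, the latent variable affects y4 with coefficient theta42, and y4 affects y5 with coefficient gamma54. The parameter values, the standard normal disturbances, and the function names are hypothetical choices made only for illustration.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def reduced_form_ratio(n=100_000, gamma54=0.7, theta42=1.5, delta23=0.8, seed=0):
    """Simulate the toy system and return (pi4, pi5, pi5 / pi4).

    pi4 and pi5 are the estimated reduced-form coefficients of z3 in the
    equations for y4 and y5. The latent variable is generated but never
    used in estimation, mimicking an unobservable.
    """
    rng = random.Random(seed)
    z3, y4, y5 = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                         # observable exogenous variable
        x2_star = delta23 * z + rng.gauss(0.0, 1.0)     # latent variable (assumed form)
        y4_i = theta42 * x2_star + rng.gauss(0.0, 1.0)  # indicator driven by the latent variable
        y5_i = gamma54 * y4_i + rng.gauss(0.0, 1.0)     # indicator driven by y4
        z3.append(z)
        y4.append(y4_i)
        y5.append(y5_i)
    pi4 = ols_slope(z3, y4)  # approximates theta42 * delta23
    pi5 = ols_slope(z3, y5)  # approximates gamma54 * theta42 * delta23
    return pi4, pi5, pi5 / pi4


if __name__ == "__main__":
    pi4, pi5, ratio = reduced_form_ratio()
    print(f"pi4={pi4:.3f}  pi5={pi5:.3f}  ratio={ratio:.3f}")
```

In large samples the ratio should settle near the assumed gamma54 even though the latent variable is never observed, which is the sense in which the reduced-form coefficients provide a basis for consistent estimation.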

35. Dealing with sample selection requires the use of a specific distribution because the observed sample is viewed as a random sample from a truncated distribution. Assuming normality per se is not required.
36. The expressions in equation (23) correspond to those in equation (19) and are derived using results reported in Johnson and Kotz (1970:81-83; 1972:70).
37. Perhaps the easiest way to verify this is to solve the expressions in equation (23) for the unconditional moments given by the expressions in equation (19). As reported above, knowledge of these moments is sufficient to identify all of the structural parameters. A particularly helpful fact in checking identification here is that a = $V(y_1)$ can be consistently estimated by the sample variance of $y_1$ computed from all of the observations on $y_1$.
38. Although it does complicate the form of these moments.
39. Formally, they can be eliminated by substitution of $y_4$ from equation (6) into equation (7) and substitution of $y_8$ from equation (10) into equation (11).
40. We did not invoke these restrictions in the illustrative model because incorporation of controversial assumptions would compromise the major purpose of this paper.
41. Formally, this involves assuming that the variance of $\varepsilon_3$ (see equation (14)) is zero. The attractiveness of such an assumption certainly depends on the extensiveness of the available information concerning prior record.
42. Note, however, that one would not want to delete equations representing decisions resulting in sample selections because it is precisely the structure provided by these equations that allows us to correct the sample selection bias.
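Notes 35 and 36 rest on the fact that moments computed from a selected sample are moments of a truncated distribution. The sketch below is a generic illustration of that point and is not a reproduction of equation (23). It assumes, purely for illustration, that the selected variable is normal and that an observation is retained only when it exceeds a cutoff c. It compares the standard closed-form mean of a normal variable truncated from below with the mean of a simulated selected sample. The function names and parameter values are hypothetical.

```python
import math
import random


def truncated_normal_mean(mu, sigma, c):
    """Closed-form E[y | y > c] for y ~ N(mu, sigma^2)."""
    a = (c - mu) / sigma
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)  # standard normal density at a
    tail = 0.5 * math.erfc(a / math.sqrt(2.0))               # P(Z > a) for standard normal Z
    return mu + sigma * pdf / tail


def simulated_truncated_mean(mu, sigma, c, n=200_000, seed=1):
    """Mean of the draws that survive the selection rule y > c."""
    rng = random.Random(seed)
    kept = [y for y in (rng.gauss(mu, sigma) for _ in range(n)) if y > c]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    mu, sigma, c = 0.0, 1.0, 0.0
    print("closed form:", truncated_normal_mean(mu, sigma, c))
    print("simulated:  ", simulated_truncated_mean(mu, sigma, c))
```

The selected-sample mean differs from the unconditional mean mu by a term that depends on the cutoff, which is why a distributional assumption is needed to map conditional moments back to the unconditional ones.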

REFERENCES

Arkin, S. D. 1980 Discrimination and arbitrariness in capital punishment: an analysis of post-Furman murder cases in Dade County, Florida, 1973-1976. Stanford Law Review 33(November):75-101.
Chiricos, T. G., and Waldo, G. P. 1975 Socioeconomic status and criminal sentencing: an empirical assessment of a conflict proposition. American Sociological Review 40(December):753-772.
Clarke, S. H., and Koch, G. G. 1976 The influence of income and other factors on whether criminal defendants go to prison. Law and Society Review 11(Fall):59-92.
Farrell, R. A., and Swigert, V. L. 1978 Prior offense record as a self-fulfilling prophecy. Law and Society Review 12(Spring):437-453.
Forst, B., Lucianovic, J., and Cox, S. J. 1977 What Happens After Arrest? A Court Perspective of Police Operations in the District of Columbia. Washington, D.C.: Institute for Law and Social Research.
Frase, R. S. 1980 The decision to prosecute federal criminal charges: a quantitative study of prosecutorial discretion. University of Chicago Law Review 47:246-330.
Garber, S., and Klepper, S. 1980 Extending the classical normal errors-in-variables model. Econometrica 48(September):1541-1546.
Gibson, J. L. 1978 Race as a determinant of criminal sentences: methodological critique and a case study. Law and Society Review 12(Spring):455-478.
Goldberger, A. S. 1964 Econometric Theory. New York: Wiley.
Goldberger, A. S. 1973 Structural equation models: an overview. Pp. 1-18 in A. S. Goldberger and O. D. Duncan, eds., Structural Equation Models in the Social Sciences. New York: Seminar Press.

Goldberger, A. S., and Duncan, O. D., eds. 1973 Structural Equation Models in the Social Sciences. New York: Seminar Press.
Hagan, J. 1974 Extra-legal attributes and criminal sentencing: an assessment of a sociological viewpoint. Law and Society Review 8(Spring):357-383.
Hagan, J. 1975 Parameters of criminal prosecution: an application of path analysis to a problem of criminal justice. Journal of Criminal Law & Criminology 65(4):536-544.
Heckman, J. J. 1978 Dummy endogenous variables in a simultaneous equation system. Econometrica 46(July):931-959.
Heckman, J. J. 1979 Sample selection bias as a specification error. Econometrica 47(1):153-161.
Johnson, N. L., and Kotz, S. 1970 Distributions in Statistics: Continuous Distributions 1. New York: Wiley.
Johnson, N. L., and Kotz, S. 1972 Distributions in Statistics: Continuous Multivariate Distributions. New York: Wiley.
Klepper, S. 1980 Summarizing the Data for the Classical Normal Errors-in-Variables Model. Unpublished manuscript. Department of Social Science, Carnegie-Mellon University.
LaFree, G. D. 1980a The effect of sexual stratification by race on official reactions to rape. American Sociological Review 45(October):842-854.
LaFree, G. D. 1980b Variables affecting guilty pleas and convictions in rape cases: toward a social theory of rape processing. Social Forces 58(March):833-850.
Lizotte, A. J. 1977 Extra-legal factors in Chicago's criminal courts: testing the conflict model of criminal justice. Social Problems 25(5):564-580.
McCallum, B. T. 1972 Relative asymptotic bias from errors of omission and measurement. Econometrica 40(July):757-758.
Muthen, B. 1979 A structural probit model with latent variables. Journal of the American Statistical Association 74(December):807-811.

Olsen, R. 1980 A least squares correction for selectivity bias. Econometrica 48(November):1815-1820.
Swigert, V. L., and Farrell, R. A. 1977 Normal homicides and the law. American Sociological Review 42(February):16-32.
Tiffany, L. P., Avichai, Y., and Peters, G. W. 1975 A statistical analysis of sentencing in federal courts: defendants convicted after trial, 1967-1968. The Journal of Legal Studies 4:397-417.
Wickens, M. R. 1972 A note on the use of proxy variables. Econometrica 40(July):759-761.