9
An Illustration of Methodological Complexity: Racial Profiling

We end Part II with a specific example of an area for which research on the role of racial discrimination is important but difficult to carry out. The example we use is racial profiling. Given the challenges to measurement, we do not endeavor to prescribe state-of-the-art methods for determining when racial profiling exists or its effects. Rather, our discussion of specific issues regarding methods and data is intended to remind researchers, policy makers, and the public of the difficulties of causal inference with regard to profiling, which may also be relevant for other areas in which racial discrimination may occur.

We begin with definitions of profiling and racial profiling. Profiling is a statistically discriminatory screening process in which some individuals in a population (e.g., automobile drivers, income tax filers, people going through customs, people boarding an airplane) are selected on the basis of one or more observable characteristics and then investigated to determine whether they have committed or intend to commit a criminal act (e.g., sell or smuggle drugs, cheat on taxes, blow up an airplane) or other act of interest. The particular characteristics used in profiling are chosen with the goal of selecting people who are most likely to warrant further investigation and typically depend on the setting. For example, people who purchase one-way airline tickets using cash on the day of their flight may be selected for further scrutiny by airport personnel based on an assumption or empirical evidence that they are more likely than others to pose a risk of premeditated violence to passengers.

We reserve the term “profiling” for screening situations in which there is reason to believe that criminal behavior could be committed, but there is no specific knowledge of a particular suspect or criminal scheme.1 We thereby distinguish profiling from situations in which a specific description of a suspect is issued on the basis of presumably reliable information. Racial (or ethnic) profiling is a statistically discriminatory screening process in which race (or ethnicity) is used as one, or the only, observable characteristic in the profile.

The problem of racial profiling in law enforcement has attracted a great deal of public attention in recent years. Such profiling is conceptually no different from the kinds of discrimination previously discussed in this report (see Chapter 4); it is simply one instance of the more general phenomenon we have termed statistical discrimination. Racial profiling in the criminal justice arena entails the use by law enforcement personnel of statistical generalizations about a group of people based on their race. To the extent that these generalizations reflect overt racial prejudice or issue from subtle, race-influenced cognitive biases, profiling is indistinguishable from the explicit prejudice we have already discussed. Even when race-based generalizations are consistent with one reading of the evidence (as when, in a certain locality, police officers give heightened scrutiny to blacks because they know that in that locality and on average blacks are more likely than whites to be involved in certain kinds of crime), it remains the case that profiling is a type of statistical discrimination. Thus, our earlier discussion of statistical discrimination based on race also applies to racial profiling.

Earlier we noted that it is unlawful to judge an individual job applicant on the basis of the average characteristics of the applicant’s racial group, regardless of whether the employer’s assessment of the racial average is accurate (see Chapter 4).
Similarly, most observers believe it is wrong for domestic law enforcement personnel to base their routine treatment of individuals on the average behaviors of racial groups. Thus, the results of a Gallup poll in 1999 showed that 81 percent of Americans did not approve of racial profiling, defined as the practice by police officers of stopping drivers from certain racial or ethnic backgrounds because officers believe these groups are more likely to commit certain crimes (Gallup Poll, 1999). There have also been many policy statements by police officials and legislative bodies declaring the unacceptability of racial profiling in police work.2 Recently, the Bush administration issued policy guidance on racial or ethnic profiling forbidding its use in federal domestic law enforcement: “‘Routine patrol duties must be carried out without consideration of race,’ the Justice Department policy states. ‘Stereotyping certain races as having a greater propensity to commit crimes is absolutely prohibited’” (Allen, 2003:A14). The only instance in which domestic law enforcement officers may use race is when it is part of a specific description obtained from a witness or informant about a specific crime.

1 For concreteness, we refer to profiling with reference to a criminal act, but the term applies to screening to detect any activity of interest.
2 See, for example, National Conference of State Legislatures (2002); Minnesota’s statute on racial profiling (http://www.aele.org/minnprofile.html [accessed January 29, 2004]); and Tulsa Police Department Policy 31-316B (http://www.tulsapolice.org/racial_profiling_policy.html [accessed June 9, 2003]).

Even when statistical profiling is not explicitly racial, to the extent that it relies on characteristics that are distributed differently for different racial groups, the result may be to produce a racially disparate impact. For example, if the police tend to stop cars with broken tail lights more frequently and if disadvantaged racial groups are more likely to drive older cars, then the profile (stop cars with broken fixtures) will result in a higher stop rate for these groups. Recall that in the employment context the use of screening criteria having a disparate impact on a protected racial group is legitimate only if the employer can demonstrate an objective and suitably compelling connection between the screening criteria and the employer’s economic bottom line. So, too, in the context of law enforcement, nonracial profiling that relies on traits distributed differently among racial groups and that results in a racially disparate impact must be justified by demonstrating an objective association between those traits (e.g., broken tail lights) and the outcome of interest (criminality). This would be the case, for example, if it could be shown that drug couriers typically drive older cars (e.g., because they are poorer or because their cars would be confiscated if they were caught carrying drugs).
In this chapter we discuss racial profiling primarily in the context of measurement, that is, how it may be possible to determine when racial profiling is (or is not) occurring in law enforcement. Allegations of discriminatory racial profiling (mainly by police making traffic stops) have increased in frequency in the past few decades.3 Yet methods and data with which to establish that disadvantaged racial groups are being stopped at higher rates than others and that racial profiling explains some or all of the differences in selection rates are not well developed. The measurement and modeling issues are similar to those discussed in Chapter 7 on using statistical models with observational data to measure discrimination by inference, but some special issues in the profiling situation warrant attention.

3 Issues of racial profiling in other settings, such as inspection by customs officials for carrying of drugs or other contraband, have also attracted political and legal attention (see Harris, 1997, 1999a; Washington Post, 2002; Webb, 1999; White, 2000).

We also briefly discuss racial or ethnic profiling as a policy option in the context of the increased threats to public security from terrorist attacks. The panel began its deliberations scarcely one month after the attacks of September 11, 2001, so we could not help but be aware of how public discussion and perceptions regarding profiling had changed. Hence, we deemed it of value to discuss the issues involved in the possible use of racial or ethnic profiling (or profiling using characteristics that correlate highly with race or ethnicity) as a tool with the potential to help prevent future terrorist attacks. Some issues are technical, involving how or whether one could determine the potential effectiveness of race, ethnicity, and other characteristics as profiling factors. Other, even more important, issues involve the heavy societal costs of using race or ethnicity (or variables highly correlated with them) in profiles.

Of course, time has passed since we began our deliberations, and public officials, as well as the nation as a whole, have continued to discuss and debate the pros and cons of profiling in the terrorism context. We have not been connected to those debates and do not comment on specific rulings or positions that have been proposed or adopted in the interim (e.g., the Bush administration policy guidance that permits ethnic profiling in narrow circumstances involving international terrorism). Our deliberations were concerned with the general issue of racial or ethnic profiling: how to determine when and whether it occurs in situations when one would want to prevent it and what considerations might need to be taken into account if one wanted to implement it even though it is, by our definition, discriminatory. Although we have not deliberated about and have no comment on specific profiling proposals, we hope the general points we raise will serve to aid public evaluation of the issues.

MEASUREMENT ISSUES

Two main measurement issues arise in attempting to establish the existence of racial profiling in a law enforcement situation.
The first is how to determine that racial or ethnic groups are being subjected to enforcement actions (e.g., traffic stops, searches, citations, arrests) at disparate rates. The second is how to determine that racial profiling is a causal factor in disparate selection rates. The discussion here addresses primarily the first issue; the second presents modeling challenges similar to those discussed in Chapter 7 on measuring racial discrimination in labor markets and other settings.

Establishing Disparate Outcomes in Profiling Situations

Data Sources on Racial Profiling

Much of the available data on racial profiling come from anecdotal experiences of nonwhites. In a typical case, a nonwhite person may be pulled over for a minor traffic violation (e.g., speeding 5 miles over the limit) and searched on suspicion of carrying contraband. Similarly, a nonwhite person may be stopped and questioned for being in a predominantly white neighborhood (see Harris, 1999b). Although these incidents are clearly discriminatory, such complaints do not prove that police officers and security personnel engage in racial profiling generally, or even that members of minority groups are necessarily detained more often than others. However, the substantial number of complaints occurring in certain types of situations (e.g., traffic stops) indicates how widely racial profiling is believed to be (and could in fact be) used.

A second source of data on racial profiling is official records, such as state and local police data on traffic and pedestrian stops, searches, warnings, citations, and arrests. Many states, including Maryland, New Jersey, and North Carolina, have enacted legislation for the collection of detailed data on stops and have mandated studies of racial profiling.4 Despite these efforts, however, relatively few data sets are complete, accurate, and available for analysis (Glaser, 2003; Harris, 1999b). For example, police records on stops may not include the race of those individuals stopped but not cited or arrested by police, and there may be little consistency in reporting race for a variety of reasons.

An important use of detailed police data on traffic stops is to provide early warning of individuals who engage in inappropriate racial profiling. This use can be fraught with danger, however, if the data do not reliably indicate such behavior. One obvious concern is that officers may manipulate their reports if they perceive they are in danger of disciplinary action.
Or if they stop members of disadvantaged groups on the basis of race, they may make unnecessary stops of advantaged groups to balance their “portfolio.” These corrective actions may keep the record clean but are inefficient as well as discriminatory against the members of such groups. On the other hand, police officers who make appropriate stops may in some cases face unwarranted charges of racial profiling if their stop rates by race are compared with population (baseline) rates that are poorly measured (see below). To the extent that official data are biased in any of these ways because of their use for individual disciplinary actions, the data will also be biased for research purposes.

Yet another source of data on profiling is direct observation of selection decisions. It may be possible for researchers to collaborate with officers in the field to elicit information on what factors they take into account as they make stop decisions. One could then examine the consistency in those factors across different officers and between the decisions made when accompanied by a researcher and those made when officers are on their own. Such studies must be carried out carefully to avoid biasing the results by virtue of the direct involvement of the researcher.

4 For state legislation mandating data collection and other efforts, see Institute on Race and Poverty (2001), National Conference of State Legislatures (2002), and Police Foundation (2001).

Methods for Estimating Disparate Selection Rates

Regardless of how complete or accurate the data collected on such law enforcement actions as traffic stops or customs searches may be, those data are likely not to be sufficient in and of themselves to establish the existence of racially disparate outcomes. For example, a finding that more blacks are stopped than whites at a certain intersection may reflect the fact that more black drivers pass through that intersection (because of residential or employment isolation) than do white drivers. Indeed, the most common problem cited across studies of police profiling (e.g., Engel et al., 2002; Fagan, 2002; Lamberth, 1994, 1996; Ramirez et al., 2000; Zingraff et al., 2000) is identifying the appropriate population to classify by race for comparison with the racial classification of those stopped by the police: the so-called denominator or base rate problem. For example, if one has police data on the percentage of nonwhites stopped at an intersection among all people stopped, is an appropriate comparison measure the percentage of nonwhites in the population living around that intersection, the percentage of nonwhites observed to drive by that intersection on a daily basis, the percentage of nonwhites observed to violate speed limits or other traffic rules at that intersection, or some other measure?
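The sensitivity of a disparity estimate to the choice of denominator can be illustrated with a short sketch. All counts below are hypothetical and chosen only to show the arithmetic; the function name `disparity_ratio` and the three candidate baselines are illustrative assumptions, not figures or methods taken from any study cited here.

```python
def disparity_ratio(group_stops, total_stops, group_baseline, total_baseline):
    """Group's share of stops divided by its share of a baseline population."""
    stop_share = group_stops / total_stops
    baseline_share = group_baseline / total_baseline
    return stop_share / baseline_share


# Hypothetical stop counts for one intersection (not taken from any study).
stops = {"nonwhite": 120, "total": 400}

# Three candidate denominators for the same stop data (all hypothetical).
baselines = {
    "residents near intersection": {"nonwhite": 1_000, "total": 10_000},
    "drivers observed passing": {"nonwhite": 2_400, "total": 12_000},
    "observed traffic violators": {"nonwhite": 900, "total": 3_000},
}

ratios = {
    name: disparity_ratio(stops["nonwhite"], stops["total"], b["nonwhite"], b["total"])
    for name, b in baselines.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

With these made-up counts the same stop data yield ratios of about 3.0, 1.5, and 1.0 against the three baselines, where a ratio of 1 means the group's share of stops equals its share of that baseline. The point of the sketch is only that the conclusion depends heavily on which comparison population is chosen.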
Engel and Calnon (forthcoming) report on five different approaches used by researchers to gather baseline data for determining racially disparate outcomes for traffic stops: census data, observations of roadway usage, assessments of traffic-violating behavior, citizen surveys, and internal departmental comparisons. They review various studies that use these strategies to construct baseline measures for traffic stops and describe the strengths and limitations of each.

Census data. Estimates of the driving population derived from decennial census data are commonly used as baseline measures of traffic or pedestrian stops (see, e.g., Harris, 1999b; Zingraff et al., 2000). In practice, the racial composition of stops is often compared with the racial composition of the census population in the immediate vicinity of a stop point, sometimes in combination with motor vehicle records on the racial composition of drivers resident in the area (Engel et al., 2002). However, the flow population can be quite different from the resident population (Zingraff et al., 2000). This is certainly the case with traffic flow: The composition of the drivers passing through a particular neighborhood, particularly on a major highway, may bear little relationship to the neighborhood’s residential composition. One might try to take a random sample of drivers passing a particular point (perhaps using pictures taken with a bright flash camera so there will be enough light to permit the identification of race) to establish a distribution across the relevant racial groups.5 However, there could be serious concerns about the accuracy of such identification, and the sample results could well change with the time of day, day of the week, or season. Pedestrian stops might be more representative of the underlying population but not necessarily so in business districts or high pedestrian traffic areas, where stops are more likely to occur.

More sophisticated (although not necessarily more accurate) estimates of the relevant baseline population have been developed from census data by using the racial composition of neighboring counties weighted inversely by the county’s distance from the observation point. Engel and Calnon (forthcoming) suggest using baselines that capture differences in frequency and patterns of driving by race. And estimates for a city with a large minority population have been corrected using census data to take account of the mix using public transportation (Rojek et al., forthcoming), although the validity of such a correction process has not been established. All of these approaches need to be calibrated with observation samples.

Observational data. Reports on racial differences in driving patterns and frequency obtained by observation can be compared with differences in rates of stops, citations, searches, and arrests, although the collection of observational data entails costs that can limit the utility of this method for establishing differential outcomes.
Examples of observational studies include those of Lamberth (1994, 1996), using data from rolling surveys of the driving population and traffic violators in New Jersey and Maryland, respectively. Lamberth (1996) reports on a study in which observers driving at the posted speed limit categorized the racial composition of about 5,700 drivers traveling over the speed limit (violators) or not (nonviolators) on particular stretches of I-95 in Maryland. Lamberth used these data to establish a benchmark of law-violating and law-abiding behavior. Although one can imagine the difficulty involved in spotting the race of drivers in cars speeding past the observers, Lamberth does establish an important point: that most of the cars observed (93 percent) were traveling above the posted speed limit, a situation in which police have the ability to stop almost any car for speeding.

5 Because census race reports are provided by household members, whereas police stops are based on observation, visual identification of race would need to be compared with self-reports so the census data could be adjusted to reflect the likely distribution that would result from observation.

Lamberth clearly believes that racial differences in stop rates when almost everyone is speeding must reflect racial bias. This conclusion, however, rests implicitly on the proposition that speeding was the only basis for stopping cars on the Maryland highway (although one could look only at those stopped for speeding) and that there was virtually no difference in the distribution of speeds for white and black drivers. Lamberth’s own data show that whites were more likely than blacks to be driving at the lawful speed on I-95 in Maryland. (Specifically, 7.9 percent of the white drivers observed in Lamberth’s study, but only 3.6 percent of the black drivers, were not speeding.) Indeed, a subsequent study conducted on the New Jersey turnpike using radar devices and cameras to determine car speeds and the race of drivers revealed that blacks did drive at very high speeds more often than whites, which would likely cause them to attract more attention from police.6

Yet even if racial differences in the rate of stopping motorists on Maryland highways can be explained by differences in driving behavior, the racial disparities in rates of search for illegal activity conditional on being stopped appear to be quite large. The Maryland State Police reported stopping and searching 823 drivers on I-95 during the observation period of Lamberth’s (1996) study; 73 percent of those drivers were black and only 20 percent white (the remaining drivers were other racial minorities). Yet blacks accounted for only 18 percent of the speeding drivers who were eligible to be stopped on I-95 (from Lamberth’s data), compared with 73 percent of those who were actually searched (from the police data). Lamberth (1994) obtained similar results in his New Jersey study.
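The arithmetic behind the Maryland comparison can be made explicit. The sketch below uses only the percentages quoted in the text; the variable names are illustrative, and the resulting ratio is a descriptive measure of disparity, not an estimate of the causal effect of profiling.

```python
# Percentages quoted in the text for I-95 in Maryland
# (Lamberth, 1996, and Maryland State Police data).
black_share_of_searched = 0.73  # share of the 823 drivers stopped and searched
black_share_of_speeders = 0.18  # share of speeding drivers eligible to be stopped
share_not_speeding = {"white": 0.079, "black": 0.036}

# Share among those searched divided by share among those eligible to be stopped.
search_disparity_ratio = black_share_of_searched / black_share_of_speeders

# Speeding rates implied by the reported shares of drivers who were not speeding.
speeding_rate = {race: 1.0 - p for race, p in share_not_speeding.items()}

print(f"search disparity ratio: {search_disparity_ratio:.2f}")
for race, rate in speeding_rate.items():
    print(f"{race} speeding rate: {rate:.1%}")
```

On these figures black drivers appear among those searched at roughly four times their share of eligible speeders, while the implied speeding rates for white and black drivers (about 92 and 96 percent) differ by only a few percentage points. The numerator and denominator come from different sources (police records and observer classification), so the ratio inherits the measurement concerns of both.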
Assessment of traffic-violating behaviors. Few studies have determined whether traffic-violating behaviors vary by race. Lamberth (1994, 1996) tried to establish base rates in his studies; however, he did not determine the severity of violating behaviors. Severity in the case of speeding involves both the rate of speed of a driver and the speed at which state police issue citations, which can differ from state to state. For example, if police in a state routinely allow drivers to exceed the posted speed limit by 10 mph, researchers would need to establish the rates at which different racial groups exceed that limit to use in comparisons with stop rates. Engel and Calnon (forthcoming) cite researchers who have estimated the degree to which drivers violate the speed limit (e.g., Lange et al., 2001; Smith et al., 2000) but conclude that their methods still do not fully capture differences in the severity of speeding. One reason is the difficulty of reliably measuring all behaviors associated with traffic-violating behaviors.7

6 The study found that in the southern part of New Jersey, where claims of racial profiling had been most common and where the speed limit was 65 mph, 2.7 percent of black drivers compared with 1.4 percent of white drivers drove faster than 80 mph. The racial disparity was even greater for those driving faster than 90 mph. On the other hand, the study did not find any racial differential in speeding in northern New Jersey areas having speed limits of only 55 mph (Kocieniewski, 2002).

Citizen surveys. Researchers may conduct surveys of individuals regarding their driving patterns to create baselines for comparison with data on traffic stops. (They may also conduct surveys of individuals concerning their interactions with police to compare with some baseline.) One advantage of citizen surveys is that they provide self-reports on a driver’s race. However, self-reporting is less relevant to race as perceived by the police, who are potentially profiling. Moreover, self-reporting is probably less effective for gathering information on traffic violations because of underreporting by respondents, who may view admitting to such violations as socially undesirable or fail to report their violations for other reasons. Baselines developed from citizen surveys may also be inaccurate as a result of differences in driving patterns across local jurisdictions and in the driving population by day of week or time of day (Farmer, 2001).

Internal departmental comparisons. An alternative to creating external baselines is to use comparisons of rates of stops and other behaviors among police officers to identify typical rates. This method is often used as part of a police department’s approach to identifying and studying officers who exhibit problematic behaviors, such as high rates of complaints (Walker, 2001).
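An internal comparison of this kind can be sketched in a few lines. Everything in the example is hypothetical: the officer labels, the stop counts, the use of the peer median as the "typical rate," and the 15-percentage-point flagging margin are assumptions made for illustration and do not describe any department's actual procedure.

```python
from statistics import median

# Hypothetical stop counts per officer within one patrol area (not real data).
officer_stops = {
    "officer_a": {"nonwhite": 30, "total": 100},
    "officer_b": {"nonwhite": 28, "total": 90},
    "officer_c": {"nonwhite": 66, "total": 110},
    "officer_d": {"nonwhite": 25, "total": 95},
}


def nonwhite_share(counts):
    """Share of an officer's stops that involved nonwhite drivers."""
    return counts["nonwhite"] / counts["total"]


shares = {name: nonwhite_share(c) for name, c in officer_stops.items()}
peer_median = median(shares.values())

# Illustrative rule: flag officers whose share exceeds the peer median
# by more than 15 percentage points (an arbitrary margin).
FLAG_MARGIN = 0.15
flagged = [name for name, s in shares.items() if s - peer_median > FLAG_MARGIN]

print(flagged)
```

Because the benchmark is the peers' own behavior, a rule like this measures deviation from the departmental norm and says nothing about how that norm compares with an external baseline.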
Walker acknowledges that such an approach would not be effective in departments in which institutional discrimination was practiced (i.e., in which departmental policy, explicitly or implicitly, allowed or encouraged race-based profiling). It would also not be effective in cases in which police data reports did not include officers’ names for fear of civil and criminal liability or in which officers manipulated the data in one or more respects (as discussed above).

Summary. Engel and Calnon (forthcoming) conclude that methods for identifying racial disparities in police stops are weak but improving. They suggest the best strategy is to use multiple baseline measures to make comparisons with official police data. To best estimate a baseline population, they suggest using surveys and observational studies conducted in various locations over a long time period, although such factors as cost and size or composition of geographic areas can impede the collection of appropriate baseline data.

7 It may be that data from jurisdictions that have installed cameras at intersections that automatically take pictures of certain kinds of violations will be helpful in this regard.

Disparities Versus Discrimination

Assuming that the existence of racially disparate outcomes in law enforcement situations has been established, the second and more difficult analytical challenge is to determine the extent to which race-based profiling explains the measured disparities. Seven of 13 studies of traffic stops conducted between 1996 and 2001 (reviewed in Engel et al., 2002) concluded that racial discrimination by police officers fully explained the observed racial differences in stops (American Civil Liberties Union, 2000; Harris, 1999b; Lamberth, 1996; State of New Jersey v. Pedro Soto, 734 A.2d 350, 1996; Smith and Petrocelli, 2001; Spitzer, 1999; Verniero and Zoubek, 1999).8 However, these studies have been criticized for not having the right type of data to rule out other explanations for the disparities. For instance, Lamberth’s (1996) findings (see above) revealed a disproportionately negative outcome for nonwhites in a population for which the likelihood of being stopped was assumed equal for both whites and nonwhites. Yet it is possible that differences in offense rates existed across these groups and that disparities were in part the result of differences in driver behavior and not police behavior.

8 All 13 studies estimated at least some degree of racial disparities in policing behavior.

The remaining six studies reviewed by Engel et al. (2002) acknowledge that factors other than race, such as differences in driving behavior or in neighborhood characteristics that affect the level of policing, could explain the observed disparities (Cordner et al., 2000; Cox et al., 2001; Lansdowne, 2000; Texas Department of Public Safety, 2000; Washington State Patrol, 2001; Zingraff et al., 2000). For example, Zingraff et al. looked at citation rates for black and white men categorized by age and found an interaction effect between race and age such that blacks did not always have the higher traffic citation rate. Thus, black men aged 22 and younger were 24 percent less likely to receive citations than were white men in this age group. (The same was true in comparing young black with young white women.) In contrast, black men aged 23 to 49 were 23 percent more likely to receive citations than were comparably aged white men, while black men aged 50 and older were 70 percent more likely to receive citations than their white counterparts.

Generally, Engel et al. (2002) conclude that interpreting the findings from extant studies of racial profiling is problematic because there is no theory guiding the research and data collection. As we have argued in other areas of analysis of discrimination, such as discrimination in hiring behavior by firms (see Chapter 7), it is essential to have an appropriate model of the process that could lead to racial profiling with clearly articulated and justified assumptions if one is to credit a conclusion about the existence of profiling. At least two different models could be examined in the area of racial profiling. One model would attribute racial profiling largely to the behavior of individual officers (“bad apples”) who are prejudiced against minorities. Another model would attribute racial profiling largely to statistical and institutional discrimination.9 Each model has implications for data collection and analysis. As in other arenas, the difficulty of causal attribution strongly suggests that multiple approaches and kinds of data should be used to understand the extent and types of racial profiling behavior in law enforcement situations.

PROFILING IN THE CONTEXT OF TERRORISM

Because of renewed interest in the United States in the possible use of profiling to identify and apprehend potential terrorists before they commit violent acts, we briefly examine the challenges of identifying screening factors that could potentially select would-be terrorists with a significantly higher probability than purely random selection. Following the attacks of September 11, 2001, media commentators discussed the possibility of racial or ethnic profiling for selecting airplane flight passengers for additional investigation; some also questioned the value of purely random screening, which results in picking up individuals likely to be harmless (e.g., elderly women) (Quindlen, 2002; Wilson and Higgins, 2002).
We identify two sets of issues for consideration: The first involves the difficulties of specifying an effective profile; the second relates to the possible benefits and costs of profiling, including not only monetary costs but also social costs that are difficult to measure yet highly important to take into account. We consider not only racial or ethnic profiling as such but also the use of other profiling factors that correlate highly with race or ethnicity so that minorities are singled out disproportionately when the profile is used (disparate impact discrimination).

9   As noted above, statistical discrimination occurs when police officers use their belief, for example, that young nonwhite males are more likely to be carrying contraband, to justify targeting this group disproportionately in traffic stops, or rely on data showing higher arrest rates for this group for drug offenses and violent crimes. Institutional discrimination occurs when police departments, overtly or implicitly, condone or encourage racial profiling by officers.

By using such terms as “costs” and “benefits,” we do not mean to deny the fundamental importance of the civil rights context in considering the issue of racial or ethnic profiling. In that context, racial profiling is considered statistical discrimination and therefore wrong under any circumstances, whether or not it could be proven that there are costs associated with not profiling. Consider the analogy to free speech. People have a right to express themselves. We do not talk about the benefits and costs of free speech; instead, we say there is a right to free expression that continues to exist even when that free expression poses costs to others. But even that right has limits: It cannot be exercised when the costs to others are very large (e.g., yelling “Fire!” in a crowded theater).10 Thus, we believe it important to review arguments about effective and ineffective profiles and possible costs and benefits of profiling because arguments for the use of the practice have been and will likely continue to be made in an environment of heightened concerns for public safety.

Developing Effective Profiles

We first review the kinds of additional screening that could potentially help protect the public in such situations as boarding an airplane to provide a context for the possible development of racial or ethnic screening factors. At one extreme, a decision could be made to subject every passenger to intensive scrutiny and interrogation well beyond the previous norm. At this time, however, the public does not appear to be willing to tolerate such a level of scrutiny for all passengers because of the hassles and delays as well as the higher costs for security personnel. Given agreement, however, that some kind of screening is desirable to help prevent a terrorist attack, a procedure must be developed for selecting a subset of passengers to be screened.
The selection could be random or, more likely, could be based on several profiling factors. Such factors could include one or more of the following: immutable (or relatively immutable) characteristics such as skin color, sex, and national origin; behavior and dress (e.g., wearing a turban, carrying a backpack, appearing nervous); flight patterns (e.g., purchasing a ticket at the last minute); and background information associated with a name, address, and date of birth obtained from various databases (e.g., credit card histories).

10   Of course, the analogy is only partially on point. Limiting the freedom of individuals to yell “Fire!” in a crowded theater constrains the freedom of everyone. In contrast, profiling constrains the civil liberties of a subset of persons and leaves the civil liberties of others intact. Hence, although the free speech analogy does suggest that civil liberties are not absolute and have been limited for the public good, it also suggests that their limitation usually imposes constraints that are universally shared. By definition, profiling, to be effective, cannot impose widely shared constraints.

The goal in developing a screening profile is to identify factors that will select would-be terrorists with a significantly higher probability than purely random selection. Several problems make achieving this goal extremely difficult: in particular, the lack of adequate experience with which to establish the effectiveness of various profiling factors, the ways in which the predictive performance of profiling models can be impaired, and the difficulty involved in setting false-positive and false-negative standards for effectiveness.

Inadequate Data

Data must be available with which to evaluate the predictive power of alternative profiling models in terms of the factors to include and the weight to assign to each factor. In the case of airline security, this evaluation is especially difficult because terrorist incidents in the United States are very rare events, and the estimated numbers of known terrorists and their associates are very small compared with more than 2 million air passengers and the number of innocent people who are profiled on any given day. Even though all 19 of the September 11 attackers were young Middle Eastern men, it is difficult to draw reliable conclusions from this fact regarding the propensity of any other young Middle Eastern men, let alone anyone else, to engage in future terrorist acts, given the many other factors involved and the rarity of terrorist actions. Even when large numbers of data points are available for analysis, as is true of traffic stops, it is difficult to draw valid conclusions about the relative effectiveness of race or other profiling factors.
In this context, effectiveness can be measured by comparing “hit rates” among different groups of automobile drivers, usually defined as the percentage of drivers whose cars are found to contain contraband (e.g., drugs) among the subset of drivers who are stopped and searched.11 Engel and Calnon (2001:Table 1) review 15 studies that examined the effectiveness of racial profiling in traffic, pedestrian, and airport stops. The estimated hit rates (in terms of finding contraband in searches given a stop) varied from under 10 percent to as high as 60 percent. By race, eight studies found similar hit rates in searches for whites and nonwhites, but it is difficult to interpret these findings without other information about the stops. If blacks are stopped and searched at higher rates than whites solely because of racial profiling, similar hit rates may indicate similar propensities for carrying contraband and hence the ineffectiveness of racial profiling.12 Such a conclusion may not be valid, however, if other factors enter into the profiling.13 The remaining seven studies found higher hit rates for blacks and Hispanics compared with whites. Engel and Calnon (2001) conclude that these studies do not provide sufficient evidence about racial or ethnic differences in hit rates either, primarily because of the lack of control for other factors (e.g., extralegal and legal characteristics of the stop) that might influence the likelihood of discovering contraband.

11   If the same factor, such as race, is used to determine which drivers to stop and also which of those stopped to search, hit rates could be defined for each group as the percentage of drivers found to be carrying contraband among all drivers stopped.

Given the undesirability of using racial or ethnic variables in profiling on civil rights grounds, any proposed model for detecting terrorists that includes variables that are highly correlated with ethnicity (and especially ethnic variables) would have to be challenged in terms of those variables’ contribution to its predictive value. The model would also have to be evaluated very carefully to determine the reliability of the estimates of each variable’s contribution to the model’s effectiveness and especially the contribution of those variables directly or indirectly related to ethnicity.

Prediction, Not Causation

A second serious problem with developing effective profiling models is that they are almost by definition predictive, not causal, models. There is no process from which one can infer that such characteristics as wearing torn clothing or a turban or appearing to be of Arab origin are related causally to terrorist behavior; one can only hope to identify factors that have a high correlation with terrorist behavior, which is rare in any case.
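To make the hit-rate comparison discussed earlier in this section concrete, the following minimal Python sketch computes the two definitions of a hit rate mentioned in the text and in footnote 11: hits among drivers who are searched, and hits among all drivers who are stopped. The class name, function names, and all counts are hypothetical and chosen only for illustration; they are not drawn from any of the studies reviewed.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    """Hypothetical counts for one group of drivers (illustrative only)."""
    stopped: int   # drivers stopped
    searched: int  # stopped drivers who were also searched
    hits: int      # searches in which contraband was found


def hit_rate_given_search(c: GroupCounts) -> float:
    """Share of searched drivers found to be carrying contraband."""
    return c.hits / c.searched


def hit_rate_given_stop(c: GroupCounts) -> float:
    """Share of all stopped drivers found to be carrying contraband."""
    return c.hits / c.stopped


# Assumed numbers: group A is searched far more often than group B.
group_a = GroupCounts(stopped=1000, searched=200, hits=50)
group_b = GroupCounts(stopped=400, searched=40, hits=10)

for name, counts in (("A", group_a), ("B", group_b)):
    print(
        name,
        "search rate:", counts.searched / counts.stopped,
        "hit rate given search:", hit_rate_given_search(counts),
        "hit rate given stop:", hit_rate_given_stop(counts),
    )
```

In this made-up example the two groups have the same hit rate among searched drivers even though one group is searched twice as often, and the two definitions give different rankings. As the text cautions, neither number by itself establishes whether profiling occurred or was effective, because the other factors that influenced each stop and search are not observed.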
12   For estimating hit rates in this situation, the higher stop and search rates for blacks simply provide a larger sample for that group.

13   If an experiment could be conducted in which people were stopped and searched at random in the same areas in which security personnel initiate stops, it might be possible to examine this issue.

In the event that a profiling model is developed with factors that are reliably estimated to be highly associated with terrorism at a point in time, terrorist groups are likely to take steps to invalidate or “game” the profile, because causation is not involved. Thus, if a terrorist group were able to identify the kinds of characteristics that result in being pulled aside (or not) for additional investigation, it could enlist a person without those characteristics to carry out a terrorist act.14 If this is the case, random screening may be more effective than profiling because it cannot be gamed.15

A related problem is that an effective profile would essentially harden the primary targets, which in this case comprise airliners. This effect could cause terrorists to shift their attention to “softer” targets. If so, that would represent success in protecting the primary targets, but it would force attention to the question of how broadly we can protect the wide array of potential targets. Would the same profiling instruments work as well elsewhere (say, on mass transit)? That forces consideration of the broad array of threats and vulnerabilities of all possible targets, an issue that is clearly beyond the scope of this panel.

Standards for Effectiveness

A third problem in developing profiles for such purposes as screening airline passengers is determining the standard by which one judges effectiveness. Because associations are never perfect, any profiling model will fail to detect some terrorists, and models developed with limited data may well generate high rates of false negatives. In other words, such models may fail to select terrorists, especially those who do not fit the profile. Moreover, because the base rate is so low, any profiling model will also generate a very high rate of false positives; that is, it will select many people who fit the profile but are innocent of any crime or criminal intent.

Costs and Benefits of Profiling

The benefits of an effective profiling model are readily stated in general terms. In the terrorism context, they include the possible prevention of terrorist acts that, if not detected, could result in catastrophic loss of lives and property. Furthermore, it might be posited that a high rate of prevention of planned attacks could, over time, discourage terrorist groups from planning further attacks.
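The base-rate point made under Standards for Effectiveness can be illustrated with a short calculation. The Python sketch below is a minimal illustration under stated assumptions, not an estimate of any real screening system: the only figure taken from the text is the order of magnitude of daily air passengers (about 2 million); the number of would-be attackers, the detection rate, and the false-positive rate are all hypothetical, as is the function name `screening_outcomes`.

```python
def screening_outcomes(population, n_threats, sensitivity, false_positive_rate):
    """Expected screening counts for a profile applied to a whole population.

    sensitivity: assumed share of actual threats that the profile selects.
    false_positive_rate: assumed share of innocent people that it selects.
    """
    true_positives = n_threats * sensitivity
    missed_threats = n_threats - true_positives          # false negatives
    false_positives = (population - n_threats) * false_positive_rate
    flagged = true_positives + false_positives
    # Share of flagged people who are actual threats (positive predictive value).
    ppv = true_positives / flagged if flagged else 0.0
    return {
        "true_positives": true_positives,
        "missed_threats": missed_threats,
        "false_positives": false_positives,
        "ppv": ppv,
    }


# Hypothetical scenario: 10 would-be attackers among 2 million passengers,
# a profile that selects 90 percent of them and 5 percent of everyone else.
out = screening_outcomes(
    population=2_000_000, n_threats=10, sensitivity=0.9, false_positive_rate=0.05
)
for key, value in out.items():
    print(key, value)
```

Under these assumed inputs the profile would flag roughly 100,000 innocent passengers while still missing one of the ten threats, so fewer than 1 in 10,000 of the people selected would be an actual threat. Changing the assumed rates shifts the numbers, but as long as the base rate stays this low the flagged group remains overwhelmingly innocent.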
Because terrorists typically seek to inflict severe damage, societal concern about the potential loss of hundreds or thousands of lives in an attack (as occurred on September 11) is understandably high.

14   For this reason, security agencies strive to keep profiling features secret. The possibility of gaming also argues against using such obvious factors as ethnicity or other features indicative of national origin and turning instead to less obvious factors (e.g., particular travel patterns).

15   Random screening is not the same as haphazard selection; random screening involves the use of a randomizing device, such as a computer algorithm, to determine which persons to stop.

Yet when an antiterrorism profiling model uses race or ethnicity or factors that correlate highly with race or ethnicity, particularly when such factors are given high weight in the profile, the inevitably large false-positive rates mean that large numbers of members of disadvantaged groups will be falsely singled out for scrutiny. As a result, not only will these individuals experience hassles and delays, they will also likely feel angry, humiliated, and stigmatized. Such stigmatization could well have high costs for society at large, if not in the immediate future, then in the longer term. One such cost could be the reinforcement of stereotypes associating minorities with criminal propensities, which could have the damaging effect of reinforcing discriminatory attitudes and behaviors in other domains and having negative feedback for some behaviors of the targets of discrimination (see Chapter 11). A related cost could be the desensitization of the public to the need to be vigilant in protecting important civil liberties, which could lead in turn to readier acceptance of the erosion of civil rights for more and more groups of people who were not originally targeted in profiling. Yet another cost could be possible retaliation (e.g., future terrorist acts) by individuals driven by anger and resentment for being wrongly targeted.

With regard to which groups in society are likely to bear the costs of profiling disproportionately, we note two related points. First, profiling on the basis of race or ethnicity is by its very nature less useful when applied to large groups (when there is only one group, it cannot be used at all). To reduce the false-positive rate, one wants to target profiling on small, narrowly focused groups. The consequence is that the burden of racial profiling will typically fall on smaller groups.
Second, such groups may be disadvantaged in other ways and less able to oppose the use of profiling compared with the majority group. Finally, as noted above, it could happen that assessing people on the basis of race or ethnicity in one domain (which is what racial profiling does) may spill over into a reduced concern for civil liberties in other contexts.

Trade-offs

Analysts might consider developing formal cost-effectiveness models to compare the benefits and costs that could be expected from the use of racial or ethnic profiling as a tool in such situations as screening flight passengers to help identify terrorists. Such a task would be challenging in the extreme, although attempts to develop such models could help illuminate the difficult trade-offs involved in assessing the value of profiling. Thus, on the benefit side, it would be difficult and contentious to estimate the number of lives that might be saved through profiling and, further, to estimate the value of those lives. On the cost side, although it might be possible to assign monetary values to the hassles and delays experienced by those law-abiding people who are improperly singled out for scrutiny, it would be very difficult to weigh stigmatization and such larger societal values as the possible serious erosion of civil liberties over the long term.

Ultimately, assessment of the possible use of ethnic profiling in fighting terrorism should involve careful, sober, deliberate consideration by policy makers and the public of three main factors: the desire to protect against the likelihood, albeit very small, of catastrophic terrorist events; the reality that racial (or ethnic or national origin) profiling is likely to be only marginally effective in detecting terrorists in airports and similar venues and, at the same time, will subject many innocent people to harassment and stigmatization; and the importance our society places on protecting core societal values of equal protection and liberties for all.

Over time our society has progressed, through civil war, constitutional amendments, legislation, and court cases, to a conclusion that race-based discrimination in such domains as job markets, housing, and voting is unacceptable and should not be allowed, despite arguments that might be offered to the contrary (e.g., allegations that the presence of disadvantaged racial groups lowers property values). We have reached that conclusion not only for overt race-based discrimination but also for discrimination against racial minorities that results from the use of ostensibly neutral procedures lacking a clear justification. One might argue that similar conclusions extend to discrimination based on ethnicity. Whether our society should maintain that posture in fighting international terrorism is a matter the public might wish to debate.
What we have endeavored to do in this brief review is to identify the difficult issues involved, not only in developing profiles but also in assessing their costs and benefits when such vitally important and almost impossible-to-quantify dimensions as public security and core principles of liberty and equality are at stake.