Revising Federal Standards: Issues for Consideration

The U.S. Office of Management and Budget is considering revision of the federal standards of racial and ethnic classification in response to two interrelated factors: the demographic and social changes that are occurring in the United States and increasing expressions of dissatisfaction with the current standards among data users, data providers, and the public. Moreover, these two factors affect and are affected by changes in people’s subjective views of their own racial and ethnic identity.

From a demographic perspective, the U.S. population has reached a stage at which it is ethnically diverse and ethnic intermarriage is increasingly common; consequently, there will be increasing numbers of people with multiple ancestries, for whom future preferences for self-identification are unknown. Workshop participants observed that these factors raise questions about the usefulness of demographic analysis and population projections based on conventional assumptions of “closed” ethnic groups with no exogamy. Perhaps the largest degree of consensus at the workshop was that any revision in the standard will itself need to be able to adapt to change.

Yet this recognition of the need for flexibility and adaptation conflicts with other considerations. For example, the responses by federal agencies to questions of criteria for standards showed fairly strong agreement on three issues (see below): exhaustive and mutually exclusive categories; simplicity in categorization; and as much continuity with historical data as possible.1 Agencies also

1  The 1970 census definition of Hispanic, for example, based on surname allocation, makes it virtually impossible to compare data from it with data from the 1980 and 1990 censuses, based on self-identification.







Spotlight on Heterogeneity: The Federal Standards for Racial and Ethnic Classification

agreed that considerable field testing should be undertaken before changes are made. It is not possible to both meet these criteria and have a flexible and adaptable system that recognizes a variety of subjective self-perceptions of racial and ethnic identity.

This conflict about the criteria for a classification standard relates in part to the many purposes for the standard, which are far more varied than originally intended. At the federal level, the standards are used for statistical records and analysis, program administration, and civil rights compliance. Although Directive 15 was originally promulgated solely for the use of federal agencies, it has become the de facto standard for state and local agencies, the private sector, the nonprofit sector, and the research community. Users of the data are limited by available reporting and tabulations unless they go back to original survey tapes and create their own cross-classification tables by other groupings. In many cases even this is impossible, since the responses were coded in accordance with the Directive 15 categories.

Directive 15 has also had a trickle-down effect. Since state and local agencies and the private and nonprofit sectors often use Directive 15 for information they report to the federal government, they may use the same categories to achieve internal consistency, even for data that are not destined for federal reporting. Similarly, researchers often use the Directive 15 race and ethnicity categories because they are the only classifications available in the data.2 Although Directive 15 was never intended to establish a national standard for race categories, it has come to function partly in that way. Thus, in considering revising federal standards, it is important to keep in mind that they are used to meet a wide range of data needs, collection methods, and presentation formats, from the detailed tables in government publications to the question format given to school children for enrollment.

2  The major exception is research analysis of decennial census data, for which specific ethnic or ancestry categories are often created for analysis. And as noted above, some researchers also conduct their own surveys and develop their own race and ethnicity coding schemes.

BASIS FOR CLASSIFICATION

The Directive 15 standards for reporting of race and ethnicity are explicitly nonscientific, stating: “These classifications should not be interpreted as being scientific or anthropological in nature . . .” Although the directive is clear about what it is not, it is silent on the basis of its chosen categorization. National origin, race, culture, and community recognition are all combined to create a complicated and inconsistent classification scheme. At the workshop, one frequently heard suggestion for improving the categorization would be to state more clearly the principles on which it is based.

One of the prime criticisms of the current classification is that it lacks a consistent logic. Some of the categories are racial, some are geographic, some are cultural. Several participants commented that a primary concern of respondents, if not of federal agencies, is the perception of fair treatment in the definition of categories. The categories now represent a combination of historical, legal, and sociological factors, but that is not explicitly acknowledged.

Much of the difficulty in the creation of a consistent, rational classification system lies in the fluid nature of what race and ethnicity are. Race and ethnicity are inherently complex concepts, with multiple sources of definition. There is no scientific basis for the legitimacy of race or ethnicity as taxonomic categories. That is, although there clearly are many and varied racial and ethnic distinctions, their multiplicity of sources defies a single-variable classification scheme based on a single individual characteristic. The challenge of creating logically consistent standards is magnified even more by self-definitions of race and ethnicity, in which a devised set of categories may well not coincide with people’s views of themselves.

Schematization of the current classification highlights areas of inconsistency in the current definitions (see Table 2, above, and Appendix B). Geographic origin is the only criterion applied to all five categories, although the term used is “a person having origins in,” and “origin” is not defined. Since the initial origins of all humans are unknown, the term does not refer to the far distant past; nor does it refer to the immediate past, when the origins of all U.S. residents except immigrants would be North America. Rather, the concept is indeterminate as to timing (one’s parents? grandparents? any ancestor?) and so very open to differences in interpretation, especially if the basis of classification is self-identification.

The black category is the only explicitly racial one: “A person having origins in any of the black racial groups of Africa.” It is also inconsistent with the definition for the white category: “A person having origins in any of the original peoples of Europe, North Africa, or the Middle East,” which does not include anything about racial membership.3 The racial definition for black is apparently meant to exclude white immigrants from Africa.

3  The lack of a definition of “origin” is particularly noticeable for this category, which uses both “having origins in” and “original people of.”

Cultural affinity is used to define Hispanics and American Indians, although in slightly different ways. Hispanics are “of” Spanish culture, while American Indians or Alaskan Natives “maintain cultural identification through tribal affiliation or community recognition.”

From a purely taxonomic standpoint, the current classification suffers from several faults. It does not cover persons of Australian or New Zealand origin. It places persons from Spain in both the white and Hispanic categories, since Spain is a European country. It allows no place for American Indians whose cultural identity is not recognized by a tribe, but who are nonetheless descendants of the original peoples of North America. It leaves ambiguous or confused the status of many groups, including people from Brazil, Madagascar, and Cape Verde. From a social standpoint, the current definitions seem to reinforce the link between dark skin color and race, while ignoring skin color in the definitions of nonblack groups.

Although some observers protest the inconsistency and conceptual vagueness of the current categories, defenders of them point out that the very complexity of the definitions reflects the real-world complexity of race and ethnicity. And in some practical ways, Directive 15 is successful: the categories correspond to generally identifiable categories for the vast majority of Americans.

A MIXED-RACE OR “OTHER” CATEGORY

The practical considerations of adding a multirace category or some other option for people of multiple ancestry are many. First, there is the issue of the status of multiracial peoples in terms of current civil rights legislation. Given that the current approach includes only one majority group, “white,” a key question would be whether multiracial people would be a protected category, with the same legal rights to representation as current minority categories, or whether multiracial responses would be coded back to one of the existing five categories.

Second, if a variety of responses are going to be reallocated to the existing or any other set of categories, any coding algorithm will present potential controversies. For example, one option that has been proposed is to redistribute multiracial people to the current single-race categories: that is, if there were a population of 80 whites, 10 blacks, and 10 multiracial individuals, the recoding would be 89 whites and 11 blacks.
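The arithmetic behind the 80/10/10 example can be sketched in a few lines of Python. This is only an illustration, assuming that multiracial respondents are spread across the single-race categories in proportion to each category's size; the function name `reallocate_proportionally` is hypothetical and is not drawn from any agency's procedures.

```python
from fractions import Fraction

def reallocate_proportionally(single_race_counts, multiracial_count):
    """Hypothetical helper: add multiracial respondents to each single-race
    category in proportion to that category's share of single-race responses."""
    total_single = sum(single_race_counts.values())
    return {
        group: count + Fraction(multiracial_count * count, total_single)
        for group, count in single_race_counts.items()
    }

# Counts taken from the example in the text: 80 whites, 10 blacks, 10 multiracial.
counts = {"white": 80, "black": 10}
recoded = reallocate_proportionally(counts, 10)

for group, value in recoded.items():
    # Exact shares are fractional; the text reports them rounded to whole persons.
    print(group, round(value))
```

Rounded to whole persons, the sketch gives the 89 and 11 reported in the text, and the ratio between the two recoded categories stays at 8 to 1, the same as before the reallocation.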
This algorithm does not change the relative sizes of the single-race categories, but it is an arbitrary allocation of multiracial people to single-race categories. If the 10 multiracial people each had one black parent and one white parent, the recoding algorithm totally misrepresents the multiracial people. And if a person’s parents are multiracial, the complexity of classification will further increase.

Another option for a multirace category is to allow respondents to write in multiple races, but the large number of races that is likely to result only raises more questions of classification. The reallocation of a multiple-race response to a single-race category would also raise difficulties, since the respondent had clearly indicated a preference for an identity other than one of a single race (or ethnicity). If self-identification is taken as a basic principle, there are no grounds for recoding a multirace person to a single race. It is difficult to imagine any logical recoding algorithm for people who decline to provide a single-race affiliation.

In a 1993 set of congressional hearings by the House Subcommittee on Census, Statistics and Postal Personnel on the federal racial and ethnic standards, representatives of some civil rights and ethnic organizations said that they would be open to the testing of a multirace category or the self-reporting of multiple race ancestries (Fletcher, 1994; Der, 1994). But there was opposition from some American Indian groups, who argued that inclusion of a “multirace” category would compromise the current usefulness of data on the American Indian and Alaskan Native population. According to workshop participants, American Indian groups have not been asked about their reaction to collecting data with multiple responses. Such an approach would allow separate counts of the groups, although in a more complicated manner.

Third, there are questions about the statistical reliability of any new category (see below). Small categories are more vulnerable to inaccuracies from sampling error than are large categories. They are also more vulnerable to the mismatches of numerators and denominators in rate calculations. Another factor that is related to statistical reliability concerns the variability of categorization over time. People who identify themselves as multiracial at one stage in life may identify themselves in a single race or ethnic category at another time.4

Fourth, there is likely to be an element of confusion and nonresponse to a mixed-race category. The majority of Americans are probably of mixed ancestry, if one defines ancestry with sufficient narrowness, so the question of how far back in time to consider would be raised. Presumably, the purpose of a mixed-race category would be to include only those people whose parents are in separate race and ethnicity categories under the definitions in the current federal standard. It would be important and perhaps difficult to make this clear to respondents.
4  Of course, this possibility also exists for multiracial people under the current standards, but it is likely that the creation of an intermediate category would increase transitions between categories, since the “distance” between the categories would presumably be closer (the distance between black and multirace in contrast to the distance between black and white).

The Census Bureau’s experience with the “other” race category in the last two censuses provides some information about what the addition of such a category might entail. A large proportion (41 percent) of the write-in responses were reclassified into one of the Directive 15 categories. As noted above, for the Modified Age-Race-Sex (MARS) file that the Census Bureau prepares for the use of other agencies and researchers, people in the “other race” category are assigned the same race as another nearby (in terms of processing) person who gave the same answer to the Hispanic-origin question. There is no way to evaluate how this reclassification corresponds to people’s self-perceptions or, in fact, to any other basis for classification.

Additional experience with write-in responses comes from those to the open-ended ancestry questions in the 1980 and 1990 censuses. Such questions have the advantage of flexibility for respondents and offer a means of studying trends and characteristics of self-identity. But they also have several disadvantages. First, although there are certainly some people who prefer a write-in response, there are probably also some people for whom an open-ended question may cause confusion about what they are “supposed” to respond. Thus, it is not entirely clear that an open-ended question results in higher response rates or more accurate responses. Second, the processing costs of open-ended questions are higher than those for fixed categories. Third, if the write-in responses are to be reallocated to major groups, classification algorithms must be written, tested, and harmonized across agencies. Without reallocation to a limited group of categories, agencies face the burden of how to present hundreds of different responses, making it virtually impossible to cross-classify race with other variables. Even some apparently simple nomenclature or classification issues may be troublesome: for example, should “blacks” and “African Americans” be grouped together, and if so, under what label? Lastly, as noted above, it is not clear how existing civil rights law, written to accommodate major groups, would accommodate the varied and subtle distinctions that would result from open-ended questions. More broadly, it is also not clear what significance to attach to such distinctions: are they linguistic or cultural and do they have social consequences?

Open-ended questions may prove more feasible for major statistical agencies with large data processing resources, like the Census Bureau, than for agencies for which the collection of racial and ethnic data is only a small portion of their administrative mandate, such as the Federal Reserve Board or the Department of Veterans Affairs. Overall, however, closed-ended questions are preferable.
Most surveys and administrative forms have limited space, and answer categories need to be obvious to respondents.

MEASUREMENT ISSUES

Measurement issues are at the heart of the reasons for considering revisions to Directive 15; they are also central concerns for the evaluation of possible changes. This section describes five major measurement issues: relationship between statistical systems, validation of race and ethnicity data, correspondence of self-identification and observer identification, coding, and sample size requirements for surveys.

Census data and data from other federal and nonfederal statistical systems are closely related and interdependent. For example, census data are the denominator for virtually all birth, mortality, and morbidity rates. In turn, these rates are used to develop the Census Bureau’s population estimates and projections. Most federal agencies use census data as the baseline for designing sampling frames, as well as the basis for personnel pools for equal employment opportunity compliance. However, the census is not the only source of racial and ethnic data. Other agencies may have different estimates and projections due to variability in data collection methods, the content and format of questions, and the application of Directive 15 categories. Different methods for obtaining information on race (namely self-identification and observer identification) can yield different responses even for the same person. As noted above, the choice in the current standards to report racial and ethnic data separately or in a combined format results in different counts of racial groups.

The decennial census collects data on an “other” race, unlike most other agency surveys. As noted above, in order to provide comparable race data with other statistical systems, the Census Bureau developed a Modified Age-Race-Sex file that reassigns the “other race” entries to one of the standard categories. The Bureau uses a complicated set of algorithms to specify the reallocation process. Since it is the only agency with an exception for this “other” category, it does not need to standardize its algorithm with other agencies. However, if a new system provides this other option, a standard algorithm would be needed across agencies. The alternative option would be one of not reclassifying, simply listing “other race” in tabulations.

Statistical categories are often measured by two not necessarily complementary criteria: reliability and validity. Reliability is measured by the consistency of responses at two or more times, while validity is the degree to which the categories measure the concept that the researchers intend to measure (i.e., one’s racial or ethnic identity). For racial and ethnic data in the decennial census, consistency varies by categories. In general, there are higher consistency rates for larger and older categories, such as blacks and whites, and lower rates for smaller and newer categories, such as Hispanics.
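One simple way to express consistency by category is the share of people in each category at a first interview who give the same answer at a second interview. The sketch below assumes such a two-interview design. The function name `consistency_by_category`, the placeholder labels "A" and "B", and the toy responses are all hypothetical, and statistical agencies may use other reliability measures.

```python
from collections import defaultdict

def consistency_by_category(paired_responses):
    """For each category reported the first time, return the fraction of
    people who reported the same category the second time."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for first, second in paired_responses:
        totals[first] += 1
        if first == second:
            matches[first] += 1
    return {category: matches[category] / totals[category] for category in totals}

# Invented toy data: (first response, second response) for six people.
pairs = [("A", "A"), ("A", "A"), ("A", "A"), ("A", "B"), ("B", "B"), ("B", "A")]
rates = consistency_by_category(pairs)
print(rates)
```

In this toy data, three of the four people who first reported "A" repeat that answer, while only one of the two who first reported "B" does, so the smaller category shows the lower consistency rate.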
Because of the many factors of self-identity and social context discussed throughout this report, there are no commonly accepted methods for measuring the validity of racial and ethnic data.

Racial and ethnic classifications as outlined in Directive 15 explicitly allow for self-identification. Some workshop participants pointed out that the acceptance of subjective self-classification was itself quite revolutionary in its time. It replaced such methods as blood quantum and observer identification, which were at one time probably believed to be objective, but are now also recognized as subjective. The fact that self-identification is taken for granted as the preferred means for classification reflects the social changes that have taken place in U.S. society in recent years.

Although most federal agencies, localities, businesses, and researchers express strong preferences for self-identification of race and ethnicity, self-reporting is not always practical or possible. Births and deaths are examples of vital events that cannot be self-reported, and they are only the extremes of the mechanisms for identifying race and ethnicity that vary throughout a person’s life. Newborns are assigned the race and ethnicity of their mother, as noted above. In childhood, race and ethnicity are identified by several different parties. At home, for example, when responding to the census, the person completing the questionnaire reports children’s race and ethnicity. In school, children, parents, or administrators may be the primary reporters. In hospitals and other institutional settings, children, parents, or administrators may do the classifying. As people get older, they self-report more, particularly if they become heads of households. Many adult members of households are not the heads of their households, however, and even for the census any household member may provide responses for all household members. Moreover, in some states, people are asked to self-report race and ethnicity in employment situations, but institutions do sometimes use observer classification. The extent of administrative writing in is not known, but examples of its occurrence are found both when a person does not respond to race and ethnicity questions and when employers (or other institutions) decide to classify individuals, a permissible practice under the current standards. In fact, a workshop participant from private industry pointed out that some states, including New York and Ohio, prohibit self-reporting on employment applications (H. Kintner, personal communication). The various types of racial identification over the life cycle of an individual are shown in Table 5.

In addition to the various sources of identification and classification over a person’s life, there is also the temporal effect of evolving racial and ethnic identity. During people’s lives, the concepts of race and ethnicity around them may change, and as they get older, their own racial and ethnic identifications may change. Newly identified groups emerge, in response to changes in their civil status and the presence of new immigrants. As the society adjusts to new immigrants and the immigrants themselves assimilate to U.S. culture, the categories may again change.
TABLE 5 Racial and Ethnic Identification and Classification in a Person’s Life

Life-Cycle Stage    Person Making Decision
Birth               Mother
Childhood           Household head for the decennial census
                    Self, parents, or administrators for school forms
                    Self, parents, or administrators for health forms
Adulthood           Household respondent for the decennial census
                    Self or administrators for employment forms
                    Self or administrators for health forms
                    Self or administrators for miscellaneous governmental and business forms
Death               Physician or funeral home administrator (perhaps in consultation with relatives)

The principal requirement of the coding of race and ethnicity responses under the current standard is that subcategories may be grouped under the five major categories. Since the current directive allows neither a write-in response nor multiple responses, coding is not a major issue. However, if a new standard included open-ended questions, there would be two major concerns: the cost of processing the responses to such a question and the algorithms used to recode or reclassify the write-in responses to other categories.

Two distinct types of accuracy are at issue in racial and ethnic classification. The first, accurate classification of individuals, receives the bulk of attention in this report. The second, the accuracy of group estimates, is a statistical issue that is at least of equal importance. Group estimates (whether of unemployment rates, vital statistics, housing discrimination claims, or any other measure that classifies by race and ethnicity) are actually the object of most of the work that involves Directive 15. Since the primary objective of the racial and ethnic breakdown is to gain insight into disparities between groups, stable estimates are of critical importance.

Statistical reliability becomes an issue whenever the characteristics of a sample population are being generalized to the whole population. Estimates of the national unemployment rate, for example, are developed not by complete enumeration of the population, but rather by information from a sample survey of the population. The estimate for the national population is the estimate from the sample plus or minus the standard error generated by the sampling process. The bigger the sample, the smaller the error attached to the estimate from it so that the estimate is more likely to be an accurate measure of the larger population.

The importance of sample size for data on racial and ethnic groups is that statistical reliability depends on the number of people sampled within each group, not on the percentage of the full sample that belongs to the group. To achieve the same degree of statistical reliability, one must sample as many cases from a small group as from a large group. Thus, if one wants the same statistical reliability for two groups, one of which is 2 percent of the population and one of which is 20 percent of the population, one may need a sampling rate for the small group that is 10 times higher than that needed for the large group. In the current categorization system, the smallest group, American Indians, accounts for close to 1 percent of the population.
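The two points made here, that error shrinks as the sample grows and that a small group needs a much higher sampling rate to reach the same number of cases, can be checked with a short calculation. The sketch below assumes simple random sampling and uses the textbook standard error of a proportion. The population size, the target number of cases, and the function names are hypothetical values chosen only for illustration.

```python
import math

def standard_error_of_proportion(p, n):
    """Standard error of an estimated proportion p from a simple random
    sample of n people (assumes simple random sampling)."""
    return math.sqrt(p * (1 - p) / n)

def sampling_rate(target_cases, group_size):
    """Fraction of a group that must be sampled to obtain target_cases people."""
    return target_cases / group_size

# Hypothetical numbers chosen only for illustration.
population = 1_000_000
target_cases = 1_000

small_group = population * 2 // 100    # a group that is 2 percent of the population
large_group = population * 20 // 100   # a group that is 20 percent of the population

rate_small = sampling_rate(target_cases, small_group)
rate_large = sampling_rate(target_cases, large_group)
rate_ratio = rate_small / rate_large

# Quadrupling the sample size halves the standard error.
se_100 = standard_error_of_proportion(0.5, 100)
se_400 = standard_error_of_proportion(0.5, 400)

print(f"sampling rate for the 2 percent group:  {rate_small:.3f}")
print(f"sampling rate for the 20 percent group: {rate_large:.3f}")
print(f"ratio of sampling rates: {rate_ratio:.1f}")
print(f"standard error with n=100: {se_100:.3f}, with n=400: {se_400:.3f}")
```

Because the standard error depends on the number of cases in the group rather than on the group's share of the population, equal reliability requires equal case counts, which is what drives the tenfold difference in sampling rates.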
In an extremely large national survey, like the decennial census long-form questionnaire (which was sent to 1 of every 6 households in the United States in 1990), analyzing data for a small group is not necessarily problematic because the absolute number of respondents will still be quite large. However, the number of people in a small group may not be sufficiently large for statistical reliability for data for such subnational areas as states, cities, counties, or census tracts. In smaller surveys, such as the Current Population Survey, or for detailed multivariate analyses, reliable estimation of the characteristics of small racial or ethnic groups can only be achieved, even at the national level, through oversampling, which is expensive. This problem is even more difficult for characteristics for small subgroups, such as high school dropouts among black or Hispanic women.

TECHNICAL ISSUES OF CLASSIFICATION

Workshop participants raised four important issues regarding classification of racial and ethnic data (some of which overlap the broader issues discussed above): comparability of data between agencies and over time, mutually exclusive categories, fuzzy and dynamic boundaries between groups, and exhaustive categories.

Comparability of data between agencies and over time is of great concern to both the federal agencies and to the users of government data. The first goal is an explicit aim of Directive 15, and the second has emerged as the directive has become a widely used standard. The large degree of change in census formats makes comparability of racial and ethnic data for long periods difficult. However, in accordance with Directive 15, administrative statistics have been kept on an annual basis since 1978, creating an extensive amount of comparable data for almost two decades. Agencies, particularly those involved in enforcing civil rights legislation, have expressed concern that changing the standards would make comparisons within and across race and ethnicity categories over time more difficult. They urge that consideration be given to the ways in which any new categories would be made to correspond to the current categories.

Interagency consistency in using the categories seems to be fairly high under the present system. What remained unclear from the workshop is whether data are comparable across agencies in view of the varied data collection and presentation procedures and the lack of evaluation of these data by most agencies. Proposed changes that require additional interpretation of race and ethnicity categories also raise the possibility of differential implementation by agencies.
The desire for mutually exclusive categories is widespread among data gatherers and users. Not only does such exclusivity make data processing and analysis a manageable task (since the total population will equal the sum of category subpopulations), but it also requires people to identify with exactly one group, which is believed to enhance the consistency of responses over time.

The alternative to mutually exclusive classification is to allow multiple responses. Although such an approach permits some people a more accurate self-identification, there is no straightforward method for comparing multiple responses with single responses. And, as noted above, there is no unambiguous logical way to recode multirace responses into single-race categories.5

5  Multiple ethnic responses have been used for recent censuses in Canada, with Statistics Canada (the national statistical agency) coding and aggregating the data as needed for federal agencies and other data users. The Canadian experience provides interesting ways to deal with nonexclusive categories; see White et al. (1993) for a summary of Canadian use of ethnic data. It is not clear, however, how applicable the Canadian statistical experience is to the demands in the United States for racial and ethnic data for protected classes in civil rights legislation and for other compliance and legislative purposes.
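The bookkeeping difference between mutually exclusive categories and multiple responses can be illustrated with a small tabulation. The sketch below uses invented toy responses with placeholder labels "A", "B", and "C", and the function names are hypothetical. It shows one counting convention per function and is not a description of how any agency tabulates its data.

```python
from collections import Counter

def tally_per_category(responses):
    """Count every category a person selects, so a person who selects two
    categories is counted once in each of them."""
    counts = Counter()
    for selected in responses:
        counts.update(selected)
    return counts

def tally_per_combination(responses):
    """Count each distinct combination of categories as its own group, so
    every person is counted exactly once."""
    return Counter(frozenset(selected) for selected in responses)

# Invented toy data: four people, one of whom selects two categories.
responses = [{"A"}, {"A", "B"}, {"B"}, {"C"}]

per_category = tally_per_category(responses)
per_combination = tally_per_combination(responses)

print(sum(per_category.values()), "category counts for", len(responses), "people")
print(sum(per_combination.values()), "combination counts for", len(responses), "people")
```

Counting per category makes the category totals exceed the number of people as soon as anyone selects more than one category, while counting per combination keeps the totals equal to the population at the cost of a larger and less comparable set of groups.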

As noted repeatedly at the workshop and throughout this report, ethnic boundaries can be very changeable, and the data on ethnic groups have been fluid and fuzzy. The numbers reported in the decennial censuses for ethnic groups, primarily American Indians, have shifted in ways that cannot be explained by purely demographic flows (births, deaths, and international migration).6 Although some of the shifts have occurred in response to changes in the design and content of questionnaires, shifts have also occurred in response to the perceived popularity (e.g., higher status ethnic groups) or unpopularity (e.g., German ethnicity during World War II) of particular groups (Ryder, 1955; Winawer-Steiner and Wetzel, 1982). New groups have emerged or faded as some immigrants have changed their sense of identification with acculturation, prolonged residence, intermarriage, and assimilation. Some groups, notably indigenous peoples, have attempted to maintain their group boundaries despite intermarriage.

The dynamic nature of ethnicity has resulted in boundaries that are not distinct for some groups. A highly distinct group would not have any members that might be in another group. The boundary between Hispanic and non-Hispanic, for example, is not highly distinctive. One could specify two characteristics that are common to Hispanic group membership: origin from one of the Spanish-speaking countries of Latin America, the Caribbean, or Spain, and being Spanish-speaking or a descendant of a Spanish-speaking person. People who have these characteristics are likely to report themselves as Hispanic. However, there are people who report themselves as Hispanic who do not have both of these characteristics: they may be from Latin America but speak a native language. And there are people who have both characteristics but do not think of themselves as Hispanic: for example, Chileans of Italian origin.
There are also groups such as Filipinos, who differ widely in their reporting of Hispanic ethnicity. And of course there are people of mixed black and Hispanic ancestry, who may identify more strongly with the black identity and thus not report themselves as Hispanic. Because of the complexity of ethnicity, with its determination affected by many characteristics, the fuzzy nature of boundaries presents special challenges for the reporting of ethnicity for censuses and surveys. For most ethnic groups, there is a lack of separateness in the characteristics for classification. Groups usually overlap on one or more variables for classification so that, especially for surveys and censuses with self-reported data, there is a special elasticity in reported ethnicity. And as people shift their sense of ethnicity, other groups may also shift in response.

6   For example, the number of people reporting Cajun as their ancestry group grew from 30,000 in the 1980 census to 600,000 in the 1990 census (Statistics Canada and U.S. Bureau of the Census, 1993:42). Although natural fluctuation may explain part of the increase, another possible factor is that the examples provided on the census questionnaire serve as a response category. In 1990 Cajun was added as an example to the ancestry question on the census long form. Further evidence of this effect is that the number of respondents reporting French ancestry declined from about 13 million in 1980, when French was given as an example, to 10 million in 1990, when French was not given as an example (McKenney and Cresce, 1993:189).

In speaking about ethnic boundaries, Canadian demographer T. John Samuel remarked at the 1992 Statistics Canada-Bureau of the Census Conference on the Measurement of Ethnicity: “[Maybe] it is time to move away from notions of mosaic, flower garden, the rainbow, the symphony orchestra, patchwork quilt and kaleidoscope to a fruit punch, the ingredients of which can easily change, thereby changing the taste itself” (Statistics Canada and U.S. Bureau of the Census, 1993:81).

Along with mutually exclusive categories, workshop participants expressed a strong preference for exhaustive categories: a system of categorization that includes every person in the population or every combination of attributes used for classification.7 The current system, at least in its formal rules, has been criticized for not being exhaustive (e.g., Hahn and Stroup, 1994): for instance, it does not include people whose ancestry is American Indian but who do not identify culturally with a tribe, and it does not include blacks whose origins are from outside Africa. Since these groups are quite small, however, their omission does not appear to have caused any serious distortions in racial and ethnic data. The argument in favor of modifying existing categories to be exhaustive is a logical one, in line with that for a rational classification system. But there are many pitfalls. For example, the definition of American Indian is strongly intertwined with legal definitions of tribal status.
For another example, the inclusion of people of Australian aboriginal populations in the black category raises questions about the meaning of “black”: in Directive 15, black is defined as people of African origin.

7   The idea of exhaustive categories is sometimes taken to mean that all socially recognized groups should be specified in the federal standards, that is, exhaustive of racial and ethnic groups, rather than exhaustive of everyone in the population. Such an approach might call for categories for Jews or Muslims. However, there appears to be little rationale for a standard that is exhaustive of all groups.

OPERATIONAL ISSUES

Because Directive 15 has been in place for more than a decade, most agencies have adjusted to its categories in their data collection and reporting systems. Workshop participants noted concerns about several operational aspects of a possible change in the federal standard: the burden on respondents and those who deal with the data, effects on current operations, and costs.

Simplicity is a much-valued characteristic of a system for individual respondents, for the clerical workers who process data, and for analysis and presentation. The present classification system may have its inconsistencies, logical peculiarities, and unhappy respondents, but at least it is now fairly simple and effective for a large portion of the population and the administrative agencies. The one major exception is Hispanics. For the race question in the 1990 census, about 40 percent of Hispanics checked “other race”: the meaning of the current categories, white or black, clearly does not conform with the self-identification of a substantial proportion of Hispanics.

If the current system is modified, it will alienate those who are comfortable with the current standard. In considering whether to change the standard, one has to consider the tradeoff between the people who are dissatisfied with the current system and those who are likely to be dissatisfied with a new one. The latter group may simply choose not to respond because of added complexity (more choices, more explanations).

In conforming more closely to the nuances of social reality, changes in the system that include more complex categorization will undoubtedly increase the burden on respondents, whether they are self-reporting or dealing with an observer (e.g., a survey taker). Most obviously, instructions for new categories would be needed, and in some cases they might be relatively lengthy. For example, an explanation of what is “multirace” or the definition of “Middle East” ancestry would be essential if those categories are introduced. An open-ended question may also be burdensome for some respondents: it is much easier for most respondents to choose from fixed categories than to write in a response. Open-ended questions require understanding beyond the given categories and the ability to provide an alternate response (Schuman and Presser, 1981).
Given the inconsistencies between self-reporting and observer reporting for fixed categories, open-ended items may prove to be even more challenging for respondents than the current system.

Operations will necessarily be affected by a new classification system. Inaccuracies introduced in the transition from one system to another are inevitable and of unknown cost and quantity. If a new system involves reallocation (recoding) procedures, they will introduce more errors. Even after a new system is in place, there is the ongoing burden of reconciling data collected under the different systems. Changes would also seriously affect the monitoring of activities related to civil rights compliance, which require comparable data over time. Because the Directive 15 standards are now widely used, any change in them would also have operational effects on state and local agencies, as well as on some private companies and researchers.

Finally, there are the actual time and money costs of a change. Workshop participants agreed that these costs cannot be quantified, but they certainly must be considered. The most obvious short-term cost to the statistical agencies (although probably the least expensive in the long term) would be that of redesigning and printing forms. More substantial costs would be those related to data processing and presentation. If a new system included open-ended responses, there would be costs for processing those responses. For example, the staff costs of coding the 8 million write-in responses to the “other” race item in the 1990 census were $1.7 million, or 21 cents per response. A different kind of cost is that related to educating respondents, data collectors, and data users about the new standards.

In light of the many and unknown costs, there was wide consensus among the workshop participants that the advantages of any new standards should be demonstrated and that there should be substantial field testing of them. Nevertheless, they noted that the burdens and costs related to possible revisions, in and of themselves, are not sufficient reasons for maintaining the current standards.
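The per-response figure quoted above for coding the 1990 write-in responses can be checked with a few lines of Python. The two input figures come from the text; the projection function is a hypothetical illustration that assumes staff coding cost scales linearly with the number of responses, which the workshop did not assert.

```python
# Figures quoted in the text for the 1990 census "other" race item.
WRITE_IN_RESPONSES = 8_000_000
STAFF_CODING_COST_DOLLARS = 1_700_000

# Unit cost in dollars, and the same figure rounded to whole cents.
cost_per_response = STAFF_CODING_COST_DOLLARS / WRITE_IN_RESPONSES
cents_per_response = round(cost_per_response * 100)


def estimated_coding_cost(n_responses, unit_cost=cost_per_response):
    """Hypothetical projection: assumes cost is linear in response volume."""
    return n_responses * unit_cost


print("cents per response:", cents_per_response)
```

Any real estimate would also depend on factors the text describes as unquantified, such as transition errors and the cost of educating respondents and data users.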