Eliminating Health Disparities: Measurement and Data Needs

3
Measuring Race, Ethnicity, Socioeconomic Position, and Acculturation

The last chapter discussed the meaning of the concepts of race, ethnicity, socioeconomic position (SEP), and acculturation and language use and how they interact in affecting health and health care. This chapter discusses how these concepts are measured. The chapter first describes how race and ethnicity are typically measured in U.S. data systems and then describes measures of SEP and acculturation and language use. We then discuss some data collection issues that apply to the measurement of each of these concepts.

RACE AND ETHNICITY

The complexities of defining race and ethnicity make measuring these concepts complex as well. In part, these complexities arise because the concepts are defined socially and politically, and although categorizations are often made based on phenotypical characteristics, they are not clear biological concepts. Individuals classify themselves in racial and ethnic categories but are also classified based on others’ perceptions. An individual’s self-report of race or ethnicity is probably the most useful and the most consistent measure of his or her race and ethnicity and is, therefore, the one most frequently used. But even self-classifications may not be consistent across settings or time (Harris, 2002). Others’ perceptions of an individual’s race and ethnicity are less consistent but may be of interest in some circumstances; for example, a physician’s assessment of a patient’s racial or ethnic background may be relevant to understanding the pattern of treatment received.








The ways that race and ethnicity have been classified in data collection have changed over time and across settings. This includes classification in federal data collection systems, as we will detail below, but also data collected through record systems outside the purview of the federal government. This lack of consistency across settings or across methods for collecting data on race and ethnicity can pose problems for the interpretation of such data. For example, the population of American Indians is reported to have tripled between the 1960 and 1990 censuses, an increase that cannot be explained by migration or demographic changes (Sandefur et al., 2002) but that may be attributable to variations in self-identification or in the federal definition of that identity.

Office of Management and Budget Standards for Collecting Data on Race and Ethnicity

The federal government has been collecting data on race since the first U.S. census in 1790, with data on ethnic background added in later censuses. Census standards for racial classification have changed greatly over time. In the first census, enumerators classified free residents as white or “other,” with slaves counted separately. Table 3-1 gives a history of how race and ethnicity have been classified in each census since 1790. Over time, as the nation’s population became more diverse and as individual ethnic groups identified themselves, more categories were added and occasionally some were dropped. Some changes worth noting in recent decades include the addition of a question on Hispanic ethnicity as a separate item in 1977, and in 2000 the option for allowing individuals to identify with more than one racial group.

No federal standards for the collection of data on race and ethnicity existed until 1977, when the Office of Management and Budget (OMB) developed and issued a set of standards, called Statistical Directive Number 15, for the collection of these data. These standards were developed to provide consistency in defining race and ethnicity for civil rights legislative use, monitoring equal treatment, and other public policy uses (NRC, 2004). The Statistical Directive Number 15 classification system included four categories for race (white, black, Asian or Pacific Islander, and American Indian or Alaska Native) and two for ethnicity (Hispanic and non-Hispanic). Self-report was established as the preferred method of collecting data, and respondents were instructed to choose only one race and one ethnicity.1 The use of these standards was required in all census and survey data collected by the federal government, as well as for federal administrative records and federally sponsored research (OMB, 1977). The standards did not apply to state or private data collection efforts except as required by state-run federal programs, although some private surveys, especially those financed by the federal government, used these same classifications (see also Mays et al., 2003).

1   Because Hispanic origin is given special priority, equal to basic racial categories, in the OMB standards, the term ethnicity is often used to refer solely to the response to the question on Hispanic or non-Hispanic origin. Throughout this report, the term ethnicity is used both in that limited sense (to refer only to Hispanic origin) and in a broader sense to refer to other ethnic distinctions.

TABLE 3-1 Racial Categories in the U.S. Census, 1790-2000

Year            Category
1790            Free whites, Other free persons, Slaves
1800 and 1810   Free whites; Other free persons, except Indians not taxed; Slaves
1820            Free whites; Slaves; Free colored persons; Other persons, except Indians not taxed
1830 and 1840   Free white persons, Slaves, Free colored persons
1850            White, Black, Mulatto
1860            White, Black, Mulatto, Indian
1870 and 1880   White, Black, Mulatto, Chinese, Indian
1890            White, Black, Mulatto, Quadroon, Octoroon, Chinese, Japanese, Indian
1900            White, Black, Chinese, Japanese, Indian
1910            White, Black, Mulatto, Chinese, Japanese, Indian, Other (plus write-in)
1920            White, Black, Mulatto, Indian, Chinese, Japanese, Filipino, Hindu, Korean, Other (plus write-in)
1930            White, Negro, Mexican, Indian, Chinese, Japanese, Filipino, Hindu, Korean, Other races (spell out in full)
1940            White, Negro, Indian, Chinese, Japanese, Filipino, Hindu, Korean, Other races (spell out in full)
1950            White, Negro, Indian, Japanese, Chinese, Filipino, Other race (spell out)
1960            White, Negro, American Indian, Japanese, Chinese, Filipino, Hawaiian, Part Hawaiian, Aleut, Eskimo
1970            White, Negro or Black, Indian (American), Japanese, Chinese, Filipino, Hawaiian, Korean, Other (print race)
1980            White, Negro, Japanese, Chinese, Filipino, Korean, Vietnamese, Indian (American), Asian Indian, Hawaiian, Guamanian, Samoan, Eskimo, Aleut, Other (specify)
1990            White, Black, Indian (American), Eskimo, Aleut, Chinese, Filipino, Hawaiian, Korean, Vietnamese, Japanese, Asian Indian, Samoan, Guamanian, Other Asian Pacific Islander, Other race
2000            White; Black, African American, or Negro; American Indian or Alaska Native (specify tribe); Asian Indian; Chinese; Filipino; Other Asian (print race); Japanese; Korean; Vietnamese; Hawaiian; Guamanian or Chamorro; Samoan; Other Pacific Islander (print race); Some other race (individuals who consider themselves multiracial can choose two or more races)

SOURCE: National Research Council (2004).

In the past 2 decades, the U.S. population has become significantly more heterogeneous as the populations of nonwhite minority groups have grown. After congressional hearings in 1993 (House Subcommittee on Census, Statistics, and Postal Personnel), OMB announced a review of the 1977 standards. In addition to accounting for increased heterogeneity, the review was also to address challenges in identifying those of multiracial heritage. The interagency review included tests of questions regarding race, multiple racial categories, and ethnic categories in a special supplement to the 1995 Current Population Survey (CPS) (BLS, 1995), as well as in both the 1996 National Content Survey and the Race and Ethnicity Targeted Test (U.S. Bureau of the Census, 1996). Based on the results of these tests and public comment, OMB revised Statistical Directive Number 15 in 1997.

The revised standards featured three important changes. First, five racial categories were to be used in measuring race: Black or African American, White, Asian, American Indian and Alaska Native, Native Hawaiian and other Pacific Islander. Second, respondents would be permitted to select more than one race. And third, the question on ethnicity was to be changed by asking respondents whether or not they were Hispanic or Latino, and was to be asked before the race question was asked. The standards were to be effective immediately for all new and revised federal systems and no later than January 1, 2003, for existing systems. Agencies are permitted to add categories when more detailed data are needed as long as the data can be aggregated to the minimum five categories of race.

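The aggregation requirement in the revised standards can be illustrated with a minimal sketch. The five minimum categories below are the ones named in the text; the detailed categories, the roll-up table, and the function name `aggregate_race` are illustrative assumptions, not part of the OMB standards or of any agency's actual coding scheme.

```python
# The five minimum race categories named in the 1997 revision (wording from the text).
OMB_MINIMUM = (
    "Black or African American",
    "White",
    "Asian",
    "American Indian and Alaska Native",
    "Native Hawaiian and other Pacific Islander",
)

# Hypothetical roll-up table for detailed categories an agency might add.
DETAILED_TO_MINIMUM = {
    "Chinese": "Asian",
    "Filipino": "Asian",
    "Japanese": "Asian",
    "Samoan": "Native Hawaiian and other Pacific Islander",
    "Guamanian or Chamorro": "Native Hawaiian and other Pacific Islander",
}


def aggregate_race(responses):
    """Map a respondent's selections to the set of minimum categories.

    Respondents may select more than one race, so the result is a set.
    A detailed category with no defined roll-up is rejected, because the
    standards allow extra detail only when it can be aggregated back.
    """
    minimum = set()
    for response in responses:
        if response in OMB_MINIMUM:
            minimum.add(response)
        elif response in DETAILED_TO_MINIMUM:
            minimum.add(DETAILED_TO_MINIMUM[response])
        else:
            raise ValueError(f"no roll-up defined for {response!r}")
    return minimum
```

Under these assumptions, a respondent who marks both Filipino and Japanese aggregates to the single category Asian, while one who marks Chinese and White aggregates to two categories.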
As with the 1977 guidelines, these minimum standards apply to all federal data collection activities but not to state or private-sector data collection, except when required for federally sponsored statistical data collections, including all federal administrative and grant reporting (OMB, 1997).2 OMB emphasized, “The categories represent a social-political construct designed for collecting data on the race and ethnicity of broad population groups in this country, and are not anthropologically or scientifically based” (OMB, 1997, p. 16).3

2   See http://www.whitehouse.gov/omb/fedreg/ombdir15.html.

3   As social and political changes occur, these broad categories may or may not be appropriate in the future. Presumably the OMB will revisit the categories as the need arises.

In many instances, especially in health settings where culture and environment play an important role in outcomes, finer measures of ethnicity are needed in order to measure heterogeneity within broad ethnic categories. For example, heterogeneity among Hispanics in terms of their health has been documented. One landmark study of low birth weight and infant mortality among Hispanics (Becerra et al., 1991) showed that infant mortality rates varied by the infant’s mother’s Hispanic ethnicity: infants of Puerto Rican descent had the lowest birth weights and the highest infant mortality rates of all Hispanic groups, while infants of Cuban descent had the lowest infant mortality rate. Thus, especially for more localized use of data, it is important to distinguish among ethnic groups within a racial category (e.g., between Filipinos and Japanese). State and local public health agencies need to target programs and interventions to particular groups with specific problems and communication needs. More refined measures of ethnicity are also needed because these ethnic groups want to know about the health of the populations in their communities. The revised OMB standards allow for the collection of data in more narrowly defined categories such as these, as long as the additional categories can be aggregated back to the standard categories.

The Importance of Collecting Data on Race and Ethnicity

As discussed above, data on race and ethnicity are necessary to measure disparities in health and health care and to understand their causes. Such data can be obtained from federal, state, and private-sector data collection systems, including surveys and records used for health care services and programs. There is currently considerable variability in the kinds of data collected in these systems, and the panel believes that it is important for each system to collect standardized racial and ethnic data.

A concerted effort involving both the public and private sectors is needed to improve the recording of individuals’ racial and ethnic descriptors in health information systems and to enhance the nation’s capacity to generate health information of comparable content and quality for all racial and ethnic groups, particularly those segments at highest risk of health problems.

CONCLUSION 3-1: Measures of race and ethnicity should be obtained in all health and health care data systems.

SOCIOECONOMIC POSITION

Chapter 2 identified crucial dimensions of SEP (education, occupation, current income, wealth, and life history of income) and briefly discussed how they are related to health and health care. In this section, we briefly discuss measurement of these dimensions of SEP. There is a significant literature on the measurement of these dimensions and on their relationship to health and health care (see the paper by O’Campo and Burke in Appendix C; Duncan et al., 2002; Oaks and Rossi, 2003; Williams, 1996).

Educational Attainment

Level of education reflects several aspects of resources and ability relevant to health and health care. In addition to conveying a certain level of ability (intellectual, behavioral, and financial), it can also be used as a proxy for one’s knowledge of or ability to process and understand factors affecting health as well as diagnoses and treatments. It may also bear on one’s ability to communicate effectively with physicians and other health care professionals. One important characteristic of educational measures is that they are relatively stable over the adult life course and thus measure a “permanent” component of SEP. Thus education is also a measure of resources that have consequences throughout the life course. Education is correlated with other concepts of SEP (occupation, income, and social status) that are related to health and health outcomes.

Most often, education level is measured by years of schooling completed or by credential obtained (for example, high school diploma, associate’s degree, or bachelor’s degree). Usually, for health surveys and data collection, level of education is measured in one question. The federal government currently has no standard way of measuring educational attainment in its data collection efforts, but a Federal Interagency Committee on Measures of Educational Attainment recommended that the measure of education used on the 2000 census long form should be the model for measuring education in all federal surveys and administrative data collections. This measure, shown in Box 3-1, combines years of schooling completed through high school with detailed categories for college and advanced degrees (Federal Interagency Committee on Measures of Educational Attainment, 2000).

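Because the census long-form measure is a single ordered item, analysts often need to collapse it into broader groups. The sketch below shows one way this could be done. The response labels are abbreviated from Box 3-1; the four-group collapse and the function name `collapse_education` are assumptions made for illustration and are not recommended by the committee or the panel.

```python
# Response options of the 2000 census long-form item, in questionnaire order
# (labels abbreviated from Box 3-1).
CENSUS_2000_LEVELS = [
    "No schooling completed",
    "Nursery school to 4th grade",
    "5th grade or 6th grade",
    "7th grade or 8th grade",
    "9th grade",
    "10th grade",
    "11th grade",
    "12th grade, no diploma",
    "High school graduate",
    "Some college credit, but less than 1 year",
    "1 or more years of college, no degree",
    "Associate's degree",
    "Bachelor's degree",
    "Master's degree",
    "Professional degree",
    "Doctorate degree",
]


def collapse_education(level):
    """Collapse a detailed response into one of four hypothetical groups."""
    rank = CENSUS_2000_LEVELS.index(level)  # raises ValueError if unknown
    if rank < CENSUS_2000_LEVELS.index("High school graduate"):
        return "Less than high school"
    if rank == CENSUS_2000_LEVELS.index("High school graduate"):
        return "High school graduate"
    if rank <= CENSUS_2000_LEVELS.index("Associate's degree"):
        return "Some college or associate's degree"
    return "Bachelor's degree or higher"
```

Keeping the detailed responses and collapsing only at analysis time preserves the option of regrouping later, which a coarse question asked at collection time would not allow.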
One deficiency of measures of educational attainment is that they do not reflect geographic and individual variations in the quality of education and therefore imprecisely reflect literacy levels and other intellectual skills.

BOX 3-1 Educational Attainment Question from the 2000 Census Long Form

What is the highest degree or level of school this person has COMPLETED? Mark ONE box. If currently enrolled, mark the previous grade or highest degree received.

No schooling completed
Nursery school to 4th grade
5th grade or 6th grade
7th grade or 8th grade
9th grade
10th grade
11th grade
12th grade, NO DIPLOMA
HIGH SCHOOL GRADUATE - high school DIPLOMA or the equivalent (for example: GED)
Some college credit, but less than 1 year
1 or more years of college, no degree
Associate’s degree (for example: AA, AS)
Bachelor’s degree (for example: BA, AB, BS)
Master’s degree (for example: MA, MS, MEng, MEd, MSW, MBA)
Professional degree (for example: MD, DDS, DVM, LLD, JD)
Doctorate degree (for example: PhD, EdD)

SOURCE: U.S. Bureau of the Census (2000).

Occupation

Occupation constitutes another distinct aspect of SEP. A limited amount of information on occupation can be collected by a single open-ended response item. This measure has the advantage of simplicity and brevity, and so can be collected on record systems where there is not much time to get many details about one’s occupation. Furthermore, people are less reluctant to report their occupation than their income and wealth. However, coding these responses can be quite difficult. In addition, such a measure may not provide much information on social or economic conditions gained through the occupation, or on environmental and work conditions to which one might be exposed while at work. Thus, the simple measure may not always be a strong indicator for health research. Additional information on specific occupation, industry, and years at the position may be collected to give a better sense of the conditions of the occupation. Studies concerned with environmental and occupational risks that may affect health would need additional information on those conditions. These are more difficult to collect and unlikely to be collected except for studies examining very specific health threats.

The federal government’s standard categorization system for measuring occupation, the Standard Occupational Classification System (see www.bls.gov/soc/home.html), hierarchically classifies occupations into both broad and very specific groups. The system is intended to be used by all federal statistical agencies that collect data to classify workers.

Current Income

Current income can be an important determinant of the resources available to an individual or family. It is usually measured as an individual’s, family’s, or household’s total cash income, over a month, the preceding 12 months, or the calendar year, and usually on a pretax basis. Posttax information may be a more relevant measure, but it is more difficult to obtain (Duncan et al., 2002).

Individual, family, and household income measures can be relevant to health and health care. Family or household income is usually based on the sum of the incomes of family or household members and is often used to measure individual SEP, implicitly assuming that resources are shared by the family. In reality, however, some members may have access to more of the family’s resources than others, or have more control over how the assets are allocated; for example, when one adult family member is a homemaker and one has paid employment. Conversely, individual income may not accurately measure an individual’s resources if those resources are shared with family members. When family income is measured it is also important to know the number of family members and sometimes the composition of the family (in terms of adults and children or perhaps elderly adults as opposed to nonelderly adults) to determine the adequacy of the resources.

Measures of income can be considerably affected by how questions are asked, as shown by comparisons of results from single-item income measures like the one in the long form of the census and the more complex series of items in the Annual Demographic Supplement to the CPS. Furthermore, many respondents consider income a sensitive topic, leading to high levels of item nonresponse on income items in surveys and some administrative systems.

Wealth

Wealth measures a dimension of SEP different from current income. The two are correlated, but the correlation is not perfect.4 Income is a measure of current flow of resources, although higher levels of income may enable a family to accumulate wealth. Wealth reflects an individual’s or household’s stock of accumulated resources, such as property, savings, and other assets at a point in time. Since wealth is transferable from generation to generation, family wealth can be an important factor, although it may be very difficult to measure.

4   For example, elderly people often have little current income but significant wealth, and so wealth may be more important than income when assessing their SEP. In contrast, younger people may have high current income, but lower levels of accumulated wealth. Venti and Wise (1999) find substantial variation in wealth among people with similar income from earnings, even among those of the highest wealth levels.

Wealth is important because it may be a buffer against periods of low income or high consumption (such as a catastrophic health event requiring expensive health care that is not covered by insurance), and thus may affect both health and health care. For example, by buffering against such periods, wealth may allow an individual or family to avoid the effects of deprivation, which may result in better health. Duncan and colleagues (2002) and J. Smith (1999) both document the significant effect wealth has on health even when other measures of SEP are considered. Wealth may also allow a family to obtain health care treatments that could not otherwise be paid for through current income resources. For example, the decision to obtain care may depend on whether accumulated assets are available (and sufficiently liquid) to pay for the procedure.

The inequality in wealth levels between blacks and whites is even more pronounced than the inequality in income (Barsky et al., 2002; Oliver and Shapiro, 1995). Differences exist both within levels of social class, where middle-class blacks have lower levels of net worth and net financial assets than middle-class whites, and among those with similar educational background, where college-educated blacks have lower levels of net worth than college-educated whites (Oliver and Shapiro, 1995).

Wealth is most effectively measured by collecting extensive data on the value of financial assets (e.g., savings accounts, stocks, and bonds), retirement accounts such as 401(k) funds, pensions, real estate holdings and home ownership status, business equity, and ownership of large durables, such as vehicles. Debt information is also part of wealth measurement and may be used to capture a household’s or individual’s net wealth.

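The arithmetic behind net wealth, as described above, is total assets minus debts. The sketch below makes that explicit. The class name, field names, and dollar amounts are hypothetical and do not correspond to the items of any particular survey.

```python
from dataclasses import dataclass


@dataclass
class HouseholdBalanceSheet:
    """Hypothetical asset and debt components, all in dollars."""
    financial_assets: float = 0.0     # savings accounts, stocks, bonds
    retirement_accounts: float = 0.0  # 401(k) funds, pensions
    real_estate: float = 0.0          # real estate holdings, including a home
    business_equity: float = 0.0
    durables: float = 0.0             # large durables such as vehicles
    debts: float = 0.0                # all outstanding debt

    def total_assets(self) -> float:
        return (
            self.financial_assets
            + self.retirement_accounts
            + self.real_estate
            + self.business_equity
            + self.durables
        )

    def net_wealth(self) -> float:
        # Net wealth can be negative when debts exceed assets.
        return self.total_assets() - self.debts
```

For example, a household with $10,000 in financial assets, a $150,000 home, an $8,000 vehicle, and $120,000 in debt would have a net wealth of $48,000 under this definition.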
Extensive measures of wealth are collected in only a few surveys, such as in the Health and Retirement Study. In other surveys, less extensive measures of wealth are collected. For example, the Medical Expenditure Panel Survey (MEPS) collects information on asset and debt levels, and the National Health Interview Survey (NHIS) and National Health and Nutrition Examination Survey (NHANES) collect data on home ownership. Measures of wealth are rarely found in administrative or private records unless they are needed to administer the program (e.g., Medicaid collects some information on assets). O’Campo and Burke (see their paper in Appendix C) review how wealth is measured in health and health care data collection systems.

Lifetime Income and Wealth History

While wealth and current income measure current economic resources, lifetime income and wealth and their dynamics can also be important dimensions of SEP. For example, prolonged economic deprivation, which may entail exposure to negative environmental factors, poorer nutrition, or stressful conditions, may wear on one’s health. Exposure to economic deprivation during critical times over the life course may affect health. Or severe fluctuations in current income or wealth levels may have negative impacts on general health.

Measures of lifetime income and wealth are usually available only from longitudinal surveys and are quite difficult to collect. The Panel Study of Income Dynamics is unique in that this longitudinal survey has followed the same families since 1968 and has collected extensive income and wealth data on these families. Another possible source of lifetime income measurement is from earnings records reported to the Social Security Administration (SSA), which could be linked to Medicare records to measure lifetime earnings for beneficiaries.

Area-Based Measures of SEP

A person’s address and Zip Code are obtained for many data systems and are often used to link area-based measures of SEP to the individual record as a proxy for the SEP of the individual. For example, per capita income for the Zip Code area in which the individual resides may be used as a proxy for an individual’s income, or the percent of the residents of the Zip Code area with a college degree may serve as a proxy for individual education level. These area-based measures of SEP usually come from U.S. Census long-form data items, which include information on income, employment status, education, language, country of origin, citizenship, and housing and residency.

The smallest area of geography for which long-form data are released is the census block group, but data are also released at the tract and Zip Code levels.5 For example, Singh and colleagues (2003) linked county-level and census tract-level information on poverty rates and matched them to cancer registry data (from the Surveillance, Epidemiology, and End Results program) on incidence and mortality from various forms of cancer to study the relationship between cancer and socioeconomic position. More commonly, however, Zip Code-level information (e.g., median income in the Zip Code) is used as a proxy.

5   A census block group is a set of census blocks that optimally includes 1,500 people, but sizes of block groups in the 2000 census ranged from 300-3,000 people. Tracts optimally include 4,000 people but in the 2000 census ranged from 1,000-8,000. Zip Code areas average 30,000 people.

The process of matching an individual’s address to a census unit (tract or block group) or other geographic unit (a county) is called geocoding. Geocoding allows the linking of individual data to group level data from different sources. We discuss the use of geocoding for developing proxies for individual measures later in this chapter.
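Once each record carries an area identifier, attaching an area-based proxy is a simple lookup. The following is a minimal sketch of that linkage step only; it does not perform address geocoding. The function name, field names, Zip Codes, and income figure are all made up for illustration.

```python
def attach_area_proxy(records, area_table, key="zip_code",
                      proxy_name="area_median_income"):
    """Return copies of the records with an area-level value attached.

    records    -- list of dicts, one per individual
    area_table -- dict mapping an area identifier to an area-level measure
    Records whose area is missing from the table get None, so unmatched
    cases stay visible instead of being silently dropped.
    """
    linked = []
    for record in records:
        enriched = dict(record)  # copy so the input records are not modified
        enriched[proxy_name] = area_table.get(record.get(key))
        linked.append(enriched)
    return linked


# Invented example data: one Zip Code has an area-level value, one does not.
area_median_income = {"00001": 41000}
patients = [
    {"id": 1, "zip_code": "00001"},
    {"id": 2, "zip_code": "00002"},
]
linked_patients = attach_area_proxy(patients, area_median_income)
```

The attached value describes the area, not the person, so it should be carried under a name that makes the proxy status obvious to later users of the file.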

Collecting SEP Data

Many health surveys and program data collection efforts do not collect detailed SEP data (Duncan et al., 2002; O’Campo and Burke, Appendix C). Measuring SEP in general is as difficult as measuring race and ethnicity. Many questions and follow-up questions may be needed in order to accurately measure income and wealth. Income and wealth are also highly sensitive topics for questionnaires. People tend to underreport their income and are often reluctant, for confidentiality reasons, to report it at all. The depth of this problem depends on the survey and on the type of income information requested. For additional reading on the quality of income and wealth data, see Bound and Krueger (1991); Cilke (1998); Coder (1992); Hotz and Scholz (2002); Moore, Stinson, and Welniak (1997); Rodgers, Brown, and Duncan (1993); Roemer (1999, 2000).

Because of the difficulty of measuring income and wealth, many health surveys do not obtain detailed information on those dimensions of SEP. Some surveys (e.g., NHANES, NHIS), sensitive to respondent reluctance to provide detailed information on income and wealth, only ask respondents to record the amount of income they receive each year on a pretax basis and also to check a category of annual income level. On the other hand, education and occupation are less sensitive items on surveys and useful data can be collected with single questions.

Some administrative databases collect detailed information on income if a certain level of income is a prerequisite to participate in the program. For example, income information is collected to determine eligibility for the Medicaid program, and some states with assets tests collect information on household assets such as savings accounts, investments, and automobiles.6 Some surveys collect information about participation in low-income programs such as Medicaid, State Children’s Health Insurance Program (SCHIP), the Women, Infants, and Children (WIC) program, and food stamps. Thus, participation in these programs can be used as a proxy measure of low-income status. However, the income eligibility limits of these programs vary across programs and across states. Furthermore, individuals tend to underreport their participation in these programs. Otherwise, the collection of SEP data in administrative records systems is rare and often limited to measures of education and occupation.

6   Measures of assets from these records may understate their true value because applicants to means-tested programs like Medicaid have an incentive not to report all of their income and assets. Means-tested program participation and benefit receipt are also underreported by survey respondents, so survey measures of income may be biased downward as well (see Hotz and Scholz, 2002; Wheaton and Giannarelli, 2000).
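The program-participation proxy described above can be sketched as a one-sided flag. Because participation tends to be underreported, the sketch treats the absence of a reported program as unknown and not as evidence of higher income; that design choice, the function name, and the program labels are assumptions made for illustration.

```python
# Programs named in the text as low-income programs (labels are illustrative).
MEANS_TESTED_PROGRAMS = frozenset({"Medicaid", "SCHIP", "WIC", "food stamps"})


def low_income_proxy(reported_programs):
    """Return True if any means-tested program is reported, else None.

    None means "unknown": underreporting makes non-participation a weak
    signal, and eligibility limits differ across programs and states.
    """
    if any(program in MEANS_TESTED_PROGRAMS for program in reported_programs):
        return True
    return None
```

A flag built this way identifies some low-income respondents but cannot rank them, since it carries no information about how far below an eligibility limit a household falls.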

OCR for page 41
Eliminating Health Disparities: Measurement and Data Needs SEP indicators have a substantial relationship to health and health care and can be important mediating factors for understanding racial and ethnic disparities. The panel therefore recommends that collection of these measures in health surveys and administrative data collections be made a priority, both to better understand racial and ethnic disparities and as a means of identifying effects on deprived groups that are not defined by race or ethnicity. Because of limitations in survey lengths and in the data that can be collected as part of an administrative process, it will not be feasible to collect a full set of SEP measures in all instances.7 However, DHHS should consider which measures can be collected in different settings and push for their collection. CONCLUSION 3-2: Measures of socioeconomic position should, where feasible, be obtained along with data on race and ethnicity. ACCULTURATION AND LANGUAGE USE Language barriers and cultural differences between patients and providers and other actors in the health care system can lead to lower quality of care or poor health outcomes.8 With knowledge of the acculturation of a patient or a community, health care providers are better positioned to provide effective health care, and the causes of disparities in health and health care can be better understood. As discussed in Chapter 2, acculturation involves a complex process of interaction with aspects of the U.S. culture by individuals from different cultures. Less acculturated individuals will be more likely to have been born outside the United States, to speak their language of origin, and have cultural traits that are more closely linked to their culture of origin. 
An individual’s degree of acculturation can be measured by cultural characteristics, such as language preference and English proficiency or cultural practices, or by other variables that measure status, such as place of birth or years or generations living in the United States. These variables may yield more information useful for studying health and health care because they are closer proxies for factors that tend to vary across cultures and that may affect health (for example, diet or views of traditional medicine) and health care (interpretation and understanding of health care treatment). In an ideal setting a battery of questions would collect extensive measures of acculturation. However, this is often not possible. Status variables are more easily collected, and so are often used as proxies for the degree of acculturation, on the presumption that individuals born in another country tend to be less acculturated than those born in the United States, or that those who have lived in the United States for a longer time are more acculturated than those who have arrived more recently. This conceptualization of acculturation, although simplistic, provides the opportunity to characterize this phenomenon with the use of individual cultural and social proxies (Berry, 2003).

7 In the context of this broad review of data systems and their collection of SEP, the panel cannot specify which measures of SEP should be collected in every data system. In later chapters, where specific data systems are discussed, the panel suggests measures of SEP it believes may be reasonable to collect. The primary point is that greater effort is needed to collect SEP data systematically along with data on race and ethnicity.

8 In general, language barriers occur when the patient and the health care professional cannot understand each other. In most cases in the United States, language barriers occur when one party (most often the patient) is not fluent in English.

Language proficiency (in most instances in the United States, this means the ability to communicate in English) is a fairly reliable proxy measure of acculturation and can be assessed in many different ways, ranging from simple to fairly complex. Longer language acculturation scales have been developed as instruments that assess acculturation processes with more multidimensional domains, including language and other indicators such as generation status and cultural orientation variables. The Acculturation Rating Scale for Mexican Americans-II (ARSMA II) is a good example of this type of multidimensional scale that includes concepts such as integration, separation, assimilation, and marginalization (Cuellar, Arnold, and Maldonado, 1995).9 Language indicators within these multidimensional scales have been shown to be powerful predictors of health status among Hispanics (Deyo et al., 1985; Cobas et al., 1996).
Extensive scales of language proficiency or of acculturation exist and can be used when more detail is needed (Cuellar, Arnold, and Maldonado, 1995), but because of their length, they are better suited for surveys than for records-based data collections. Some measures of language proficiency may include a set of questions on language that are scaled to give a score of language proficiency. One short language-based acculturation scale (Deyo et al., 1985) has just four questions: What language do you prefer to speak? What language is most often spoken in your home? What was your first language as a child? Do you read any English? These four simple questions cover a range of possibilities of language use and offer a useful gauge of a person’s English fluency. But often only one question is asked about the respondent’s preferred or primary language. These simple scales of acculturation based on language proficiency, either combined or as a single question, have proven to be good and reliable indicators for health care research among immigrant and racial and ethnic minority populations (Deyo et al., 1985; Cobas et al., 1996; Aguirre-Molina, Molina, and Zambrana, 2001; Byrd, Balcazar, and Hummer, 2001; Unger et al., 2000). Language proficiency does not convey the entire spectrum of acculturation as a complex construct, but it has proven to be a reliable measure. Language generally accounts for over 70 percent of the variance in the total acculturation score on acculturation scales (Clark and Hofsess, 1998).

9 Acculturation scales for Asian subpopulations have also been developed and are being used to assess variations in health and health care among these groups (Anderson et al., 1993).

In addition to language, other elements of the acculturation experience include place of birth, cultural expressions and feelings, attitudes, emotional behaviors and beliefs, and ethnic loyalty (Cuellar et al., 1995; Marin et al., 1987; Balcazar, Peterson, and Cobas, 1996; Clark and Hofsess, 1998).10 Place of birth is also a simple and useful (although not perfect) proxy indicator of acculturation. Such information is likely to be useful to improve program administration, to target language-appropriate educational materials and information, to suggest providers, or to allocate translation services. Generational status and time in the United States are additional indicators of the acculturation experience and can provide pertinent information about health care access and utilization of health services in addition to differences in health outcomes among immigrant populations living in the United States. However, these indicators, along with language, do not satisfy the criteria of multidimensional acculturation measurement (Clark and Hofsess, 1998; Chun et al., 2003). These are not perfect measures, of course. An individual’s degree of acculturation depends on other characteristics as well: home, work, and social interactions; exposure to the cultures in the United States; or proficiency with the English language before immigration.
Other aspects of the individual that are associated with acculturation include socioeconomic status, discrimination, occupational experiences, and neighborhood environments (Clark and Hofsess, 1998; Cuellar et al., 1995). Furthermore, some of these measures, such as place of birth, will not be relevant to some Native American populations for which language and cultural practices may be barriers to interactions with the health care system. An even more challenging concept for acculturation is the notion of who represents the host majority culture. Many “minority” communities represent the majority or predominant population in a given geographic area or community. However, discussion of the implication of this for acculturation is beyond the scope of this review.

10 In a health care treatment setting, in addition to collecting data on the language preference of the patient, it may also be useful to collect data on whether any health care professionals treating the patient spoke the language of the patient.
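
Returning to the short four-question language scale described earlier in this section, the following Python sketch shows one way such items might be combined into a single score. The item keys, the one-point-per-English-response rule, and the treatment of missing answers are assumptions made for illustration; they are not taken from Deyo et al. (1985) or from any data system discussed in this report.

```python
# Hypothetical scoring of a four-item language scale. The item names and the
# one-point-per-English-answer rule are illustrative assumptions only.
ITEMS = ("preferred_language", "home_language", "first_language", "reads_english")


def language_score(responses):
    """Count how many of the four items (0-4) are answered toward English use."""
    score = 0
    for item in ITEMS:
        # Missing answers are treated as "not English" in this sketch.
        answer = str(responses.get(item, "")).strip().lower()
        if item == "reads_english":
            score += int(answer == "yes")
        else:
            score += int(answer == "english")
    return score


respondent = {
    "preferred_language": "Spanish",
    "home_language": "Spanish",
    "first_language": "Spanish",
    "reads_english": "yes",
}
print(language_score(respondent))
```

A single-question version, as is often used in practice, would simply keep one of these items. A real instrument would also need an explicit rule for missing or bilingual responses, which this sketch does not attempt.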

CONCLUSION 3-3: Measures of acculturation and proxies such as language use, place of birth, and generation and time in the United States should, where feasible, be obtained.

We noted earlier that race and ethnicity are fluid concepts. While the meanings of race and ethnicity change over time, the relationship between race and ethnicity and SEP and acculturation may also change over time. For example, if patterns of immigration change (i.e., new groups from new areas immigrate to the United States and groups that have immigrated continue to assimilate), the existence and nature of disparities may also change. It is also possible that the economic and social positions of minority groups may change. Thus, how race and ethnicity are measured and the relationship of these two concepts to social and economic position and acculturation are likely to be reconsidered from time to time.

CROSS-CUTTING ISSUES IN MEASUREMENT AND USE OF THESE DATA

Some measurement issues are common to the collection of data on race, ethnicity, SEP, and acculturation and language use. Survey and administrative data systems are often limited in the amount of data that can be collected on each of these dimensions. As indicated earlier, sample sizes also limit statistically reliable estimation of disparities in health and health care within small population subgroups. In addition, health data are collected for several different purposes. Some data are collected to administer a program; for example, the primary use of Medicare claims data is to process payments for services. Data collected in the application process for health insurance are used to underwrite policies. Data on race and ethnicity are sometimes used to enforce civil rights laws. Because many of these data sets are not collected for research purposes, they may not have all the characteristics of an ideal data set for research on disparities in health and health care.
Confidentiality and privacy issues may also limit their use because individuals who provided the data may not have been informed of, or have consented to, the use of their data for purposes other than the primary reason the information was collected. Finally, because no one database is fully comprehensive for measuring race, ethnicity, SEP, and language use and acculturation, data linkages are often necessary to avoid the cost of new collections. The linkages can be difficult to arrange and come with their own privacy and confidentiality problems. This section discusses some of these problems in the collection of data on race and ethnicity, SEP, and language use and acculturation.

Data Content and Coverage

Surveys

A number of surveys collect information on health, health care access, utilization, and quality. Some surveys are conducted at the national level, often sponsored by the federal government. Some surveys ask specifically about health status, such as the National Health Interview Survey, while others focus on other topics such as medical expenditures (the Medical Expenditure Panel Survey) and collect only limited information on health status. States and localities conduct their own surveys as well. For example, the state of Hawaii has conducted the Hawaii Health Survey since 1968. The Pregnancy Risk Assessment Monitoring System (PRAMS), which is state-based but coordinated by the Centers for Disease Control and Prevention (CDC), surveys a sample of women in 13 states and the District of Columbia who recently gave birth to collect information on maternal behaviors before, during, and after pregnancy.

Most of these surveys collect data on race and ethnicity.11 In federally supported surveys, if racial and ethnic data are collected, the minimum OMB standards for the collection of these data must be followed. There are, however, no such standards for the collection of SEP data (although there is a standard way to classify occupation), nor for data on language use and acculturation. Because these surveys focus on health data collection, they do not contain extensive measures of SEP. (See the paper by O’Campo and Burke in Appendix C for a complete listing of what SEP measures are collected in these surveys.) Questions about educational attainment and occupation are often included in the surveys, but only very limited information on income and wealth is collected. Most national surveys are designed to produce estimates for the nation as a whole, not specific racial and ethnic subgroups.
Sample size limitations in many studies allow reliable estimates of health status or health care utilization for only the larger racial and ethnic groups: whites, blacks, and Hispanics. Some surveys do oversample12 certain minority groups (e.g., the Health and Retirement Study (HRS) and NHIS both oversample Hispanics and blacks, and the NHANES oversamples blacks and Mexican Americans). But samples are often not large enough to produce reliable estimates for smaller racial groups: Asians, Native Hawaiians and Pacific Islanders, Alaska Natives and American Indians. An exception is the Hispanic Health and Nutrition Examination Survey (Hispanic HANES), conducted from 1982 to 1984 for three Hispanic subgroups: Cubans, Puerto Ricans, and Mexicans. This survey made it possible to examine differences in health status among Hispanic subgroups and among non-Hispanic whites, blacks, and Hispanics. Special efforts, like the Hispanic HANES, are often required to obtain adequate coverage and sample size for specific subgroups.

11 See the DHHS Directory of Health and Human Services Data Resources: http://aspe.hhs.gov/datacncl/datadir/index.shtml.

12 A specific population is oversampled when a survey interviews a disproportionately large number of units (e.g., households or individuals) from that population relative to its share of the total population being sampled; for example, a survey that oversamples Hispanics will attempt to interview a higher proportion of Hispanics in the survey than Hispanics represent in the overall population. This is usually done in order to create a sample of the specific population that is large enough to make statistically reliable estimates of the characteristics of the population.

A survey, unlike administrative records, affords the researcher more control over the populations that are covered (although limitations in survey methods may limit coverage). But with limited resources for data collection, it is not always possible to conduct special surveys with the same frequency as those conducted for the nationally representative populations. Thus, which populations are covered in surveys is determined mainly by resources and other government priorities (e.g., the need to obtain answers to specific policy questions) rather than by limitations inherent to the data collection methods and processes.

Administrative Data Systems

Data collected through federal and state health programs or through the operations of health care providers (e.g., hospitals) are often used to measure and understand health status and health care utilization. Examples of administrative data systems include the Medicare Enrollment Database, which represents enrollees since 1966, and the federal Healthcare Cost and Utilization Project, which collects hospital discharge abstracts in a uniform way from 28 state data organizations.
These administrative data sets are more often used for their information on health care utilization than on health status, as measures of the latter are not usually collected as part of the administrative process. Administrative data are also used to measure disease incidence; for example, the CDC maintains a number of disease surveillance reporting systems to which hospitals, labs, clinics, and other health care institutions submit information on disease incidence (for example, bacterial meningitis, HIV/AIDS, food-borne illnesses).

Administrative data are collected to fulfill the particular purposes of a program or administrative process. Therefore, unlike survey data, where content is often (within budget and time constraints) under the control of the data collection agency, the content of administrative data is usually limited to the specific information needed for administrative objectives. For example, a clinic may ask for a patient’s age, residence, occupation, and insurance coverage as well as medical information as part of the admission process. Similarly, information on income and assets might be collected to assess eligibility for Medicaid. It is sometimes possible to collect additional data items, but usually only a limited number. As we will discuss in Chapters 5 and 6, race and ethnicity are sometimes included, although not consistently.

The populations included in administrative data collections are limited both by the scope of the specific program and to those directly served by the program. For example, only Medicare enrollees are covered by the Medicare Enrollment Database, and only those with a specific disease are covered by the disease surveillance systems. An advantage of administrative data sets, however, is their large sample sizes, which make it possible to obtain measures of interest for relatively small groups. Furthermore, medical records and billing or reimbursement systems contain extensive information on health care received by individuals. Thus, with the addition of suitable items on race, ethnicity, SEP, and acculturation, these data could be highly useful for studying disparities in health care.

Statistical Uses of Data

Information about individuals is used to make general statistical inferences about populations to monitor trends in disparities, to understand how disparities arise, and ultimately to design interventions to reduce and eliminate them. These inferences may be descriptive (describing health status, disease prevalence, and health care outcomes for a population of interest), or they may be used to draw causal conclusions about why an outcome or a difference in outcomes between groups is observed.
While information at the individual level is needed in order to make these inferences, the specific identities of individuals are irrelevant and inferences are always drawn at an aggregate level. Such statistical uses are distinct from other uses of the data that require information about a specific individual. For example, income data on individuals applying for Medicaid are collected to assess eligibility for the program; data on particular hospitals or individuals treated in hospitals are collected to ensure enforcement of civil rights laws. Data on individuals may be used to underwrite insurance policies, that is, to assess whether or not coverage should be offered to an individual and at what rate. In each of these cases, data on individuals are collected to take action regarding a specific individual. In contrast, data on individuals used for statistical purposes are collected to make inferences at an aggregate level.

Using data on individuals can create a situation where an individual’s identity and private information could be disclosed and could potentially be used in a way that harms the individual. To prevent this from happening, when an investigator gains access to a confidential data set, he or she is usually required to agree to use it for statistical purposes and not release any confidential data on an individual. Breaches of such agreements may be punished by loss of access to data, of research funding, and of permission to conduct research.

Many of the data sources used to understand health disparities are not collected specifically for these statistical purposes, but rather are used to administer services and programs. Their use for statistical purposes is secondary. The panel, in this report, will make recommendations that encourage the collection of additional items of race and ethnicity, SEP, and language and acculturation where possible so that statistical inferences about disparities can be made. But it does so with the recognition that these data need to be useful to the federal, state, and private institutions and systems for which they are collected, and ultimately for the individuals who provide the data voluntarily, not just for statistical purposes. There are some examples of how data on race and ethnicity could be used to benefit the individuals who provide the data and the institutions that collect the data. For example, a health insurance plan might collect ethnicity and language data on enrollees to target culturally appropriate information or program interventions to individuals in their primary language. Or a plan may want to target information on disease prevention to enrollees who belong to racial or ethnic groups with a higher prevalence of certain diseases.

CONCLUSION 3-4: Health and health care data collection systems should return useful information to the institutions and local and state government units that provide the data.
Data Linkages

Data linkages, or combining variables from two or more data sets, can facilitate new analyses (for policymaking, quality improvement, and research) without the expense and time needed for additional data collection. While there are tremendous opportunities for new analyses with linked data, there are also barriers to linking data sets. Linking data sets usually entails bringing together information that identifies individuals, such as names, social security numbers, or a program identification number. This means that privacy and confidentiality regulations and concerns must be addressed. Confidentiality concerns are also increased when data are linked because more than one data system is being employed, and individuals who provide data to one system may not want the information made available to another.
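
To make the idea of combining variables from two data sets concrete, the Python sketch below joins two small, made-up record sets on a shared program identification number and then drops that identifier from the linked output. The field names, the sample records, and the salted-hash `link_key` are hypothetical choices for illustration; a salted hash alone would not satisfy the privacy and confidentiality requirements of any real program.

```python
import hashlib


def pseudonym(program_id, salt):
    """Replace a program ID with a short salted SHA-256 digest (illustrative only)."""
    return hashlib.sha256((salt + program_id).encode("utf-8")).hexdigest()[:12]


def link_records(claims, survey, salt="example-salt"):
    """Attach survey variables to claims rows that share a program_id."""
    survey_by_id = {row["program_id"]: row for row in survey}
    linked = []
    for claim in claims:
        match = survey_by_id.get(claim["program_id"])
        if match is None:
            continue  # claims with no survey match are skipped in this sketch
        # Copy every field except the direct identifier from both sources.
        merged = {k: v for k, v in claim.items() if k != "program_id"}
        merged.update({k: v for k, v in match.items() if k != "program_id"})
        merged["link_key"] = pseudonym(claim["program_id"], salt)
        linked.append(merged)
    return linked


# Made-up records: utilization from one source, language and SEP from another.
claims = [
    {"program_id": "A100", "visits": 3},
    {"program_id": "B200", "visits": 1},
]
survey = [
    {"program_id": "A100", "primary_language": "Spanish", "education_years": 12},
]

linked = link_records(claims, survey)
print(linked)
```

Only the record present in both sources appears in the linked result, which illustrates why the coverage of each source matters as much as the matching step itself.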

Sometimes there are legal limitations on the use of data linkages. For example, employers are legally forbidden to link some claims data from their employees’ health insurance records to employer-based records without protection of the employee’s privacy. The Health Insurance Portability and Accountability Act (HIPAA) regulations require removal of identifiers from publicly released data, although exceptions can be authorized (with appropriate safeguards) when required for research. The use of social security earnings data by researchers outside the agency is severely limited. A further barrier to data linkages is the need for negotiation across agencies and entities that maintain the data, which might have varying confidentiality provisions. Thus, linkages across agencies may require complex interagency negotiations.

Several methods are used to guard against harmful uses of linked data and to protect confidentiality. Masking and deidentification are two procedures that maintain the integrity of an individual’s data but strip any personally identifying information from the linked record. The National Center for Health Statistics (NCHS), the Agency for Healthcare Research and Quality (AHRQ), and the Census Bureau all maintain restricted-access data centers that are housed in a secure setting but make the data available to researchers with proper credentialing and assurances of nondisclosure. These techniques facilitate the use of linked data. (See NRC, 2000 and NRC, 1993 for more extensive discussions of these methods.)

Sometimes when it is impossible to link data on individuals from two or more data sets, individual data from one set are linked to geocoded area-based measures from another set of data, which serve as a proxy for individual measures. As mentioned previously in this chapter, geocoding and the use of area-based measures are not perfect proxies for an individual-level variable.
Area-based measures at both the Zip Code and census tract levels are not as precise as individual-level data (Geronimus and Bound, 1998). But some area-based measures have been found to be better than others for health outcomes models: for example, aggregate income, education, and occupation were better predictors of health outcomes than socioeconomic index measures (Geronimus and Bound, 1998). This study also found that census tract-level measures are not significantly better than Zip Code-level measures. Krieger (1992) found that block group measures of SEP performed better than census tract measures of SEP for some health outcomes, but that the opposite was true for others. In a more recent study that examined many different health outcomes (e.g., birth and death outcomes, incidence of cancer and other diseases, and homicide), Krieger et al. (2003) found that census tract- and census block-level measures of SEP gave consistent parameter estimates of the effects of these SEP measures on outcomes across different racial, ethnic, and gender groups, while Zip Code-level measures were less consistent. This study also found that the percent of the area in poverty (tract, block, or Zip Code) was the area-based SEP measure that gave the most consistent estimates across different racial, ethnic, and gender groups.

Linking across data sets has great potential payoff in terms of increased content coverage over a single source of data. For example, SEP and language data from other data sources could be linked to a data set that does not cover these items, or demographic information or information on health outcomes could be linked to information about health care received. While there may be technical and privacy issues to contend with in the linkage of these data, these issues can be addressed. Linking data sets collected by entities outside DHHS (e.g., by states and the private sector) may be more difficult because data sharing agreements may need to be negotiated and because data formats may be less consistent. However, since many of the data sets available to measure health disparities are within DHHS, some of the burdens of dealing with cross-agency protections of privacy and confidentiality within the department could be reduced if strong leadership is exercised by the department’s groups and agencies with data collection and coordination responsibilities.

CONCLUSION 3-5: Linkages of data should be used whenever possible, with due regard to proper use and the protection of confidentiality, in order to make the best use of existing data without the burden of new data collection.

Improvement of the data systems available to study racial and ethnic disparities in health will impose some additional burdens and costs on data systems. As we stated in the introduction to this report, it is beyond the panel’s charge and would require a special set of expertise to provide a detailed assessment of the costs of these improvements.
However, some general principles for collecting costs and reducing burden are discussed in the next chapters of this report. The collection of data on race and ethnicity and some simpler measures of SEP and acculturation and language has proven to be feasible and not difficult, although some of the more complex measures of SEP and acculturation and language use are difficult to collect and may not be practical to collect in every situation. It is true that the collection of these data may require some changes in computer systems, but such changes occur in the normal course of events from time to time and would not be more burdensome in this case. The major costs in collecting these data appear to be at the point of contact—that is, when a patient’s or program enrollee’s information is obtained. These burdens can be reduced, for example, by designing the systems in a way that avoids repeatedly collecting the same information.