practices and issues related to collecting and sharing data across the health care system. Next is a discussion of steps that can be taken to address these issues and improve data collection processes. This is followed by a review of methods that can be used to derive race and ethnicity data through indirect estimation when obtaining data directly from many patients or enrollees is not possible.


Health care involves a diverse set of public and private data collection systems, including health surveys, administrative enrollment and billing records, and medical records, used by various entities, including hospitals, CHCs, physicians, and health plans. Data on race, ethnicity, and language are collected, to some extent, by all these entities, suggesting the potential of each to contribute information on patients or enrollees. The flow of data illustrated in Figure 5-1 does not even fully reflect the complexity of the relationships involved or the disparate data requests within the health care system. Currently, fragmentation of data flow occurs because of silos of data collection (NRC, 2009).

No single entity in Figure 5-1 can by itself gather data on race, ethnicity, and language for the entire population of patients, nor does any single entity currently collect all health data on individual patients. One way to increase the usefulness of data is to integrate them with data from other sources (NRC, 2009). Thus there is a need for better integration and sharing of race, ethnicity, and language data across health care entities and even, where suitable information technology (IT) processes are lacking, within a single entity.

A substantial fraction of the U.S. population does not have a regular relationship with a provider who integrates their care (i.e., a medical home) (Beal et al., 2007). For some, the usual source of care is the emergency department (ED), a situation that complicates the capture and use of race, ethnicity, and language data and their integration with quality measurement. While health plans insure a large portion of the U.S. population, their direct contact with enrollees tends to be minimal, even during enrollment. Hospitals, which tend to have more developed data collection systems, serve only a small fraction of the country’s population. As a result, no one setting within the health care system can capture data on race, ethnicity, and language for every individual.

Health information technology (HIT) may have the potential to improve the collection and exchange of self-reported race, ethnicity, and language data, as these data could be included, for example, in an individual’s personal health record (PHR) and then utilized in electronic health record (EHR) and other data systems.[1] There is little reliable evidence, though, on the adoption rates of EHRs (Jha et al., 2009). While substantial resources were devoted to this technology in the American Recovery and Reinvestment Act of 2009,[2] it will take time to develop the infrastructure necessary to fully implement and support HIT (Blumenthal, 2009). Thus, the consideration of other avenues of data collection and exchange is essential to the subcommittee’s task.

Until data are better integrated across entities, some redundancy will remain in the collection of race, ethnicity, and language data from patients and enrollees, and equivalently stratified data will remain unavailable for comparison purposes unless entities adopt a nationally standardized approach. Methods should be considered for incorporating these data into currently operational data flows, with careful attention to concerns regarding efficiency and patient privacy.


Because hospitals tend to have information systems for data collection and reporting, staff who are used to collecting registration and admissions data, and an organizational culture that is familiar with the tools of quality improvement, they are relatively well positioned to collect patients’ demographic data. In addition, hospitals have a history of collecting race data. With the passage of the Civil Rights Act of 1964[3] and Medicare legislation in 1965,[4]


[1] A PHR is a medical or health record owned and maintained by the patient. EHRs are further defined in Chapter 6.


[2] American Recovery and Reinvestment Act of 2009, Public Law 111-5 § 3002(b)(2)(B)(vii), 111th Cong., 1st sess. (February 17, 2009).


[3] The Civil Rights Act of 1964, Public Law 88-352, 78 Stat. 241, 88th Cong., 2d sess. (July 2, 1964).


[4] The Social Security Act of 1965, 89th Cong., 42 U.S.C. § 7, 1st sess. (July 30, 1965).

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.