The Bradys have recently moved to Capital City from a small town in another state. According to the family practitioner in their former residence, their young daughter needs heart surgery, and they want to identify a surgeon and hospital with considerable experience and good outcomes to do the surgery. They need to choose a health plan from among those offered by Mr. Brady's employer that covers services by these providers. Which plan should they choose?
Mike, a shy three-year-old, has been brought to the attention of the Montgomery County Protective Services unit because of concerns about his failure to thrive. He has been living on and off with an aunt. She has no records of any previous medical care and no special knowledge of any illnesses. How can the county caseworker acquire health-related information as part of her responsibilities to manage this case and make recommendations for appropriate referrals?
Alice Johns, an elderly woman who rarely visits any physicians, appears in Dr. Mark's office with fever and other flu-like symptoms. Dr. Mark needs to know what infectious organisms have been appearing in their community lately as a guide to treating Mrs. Johns appropriately on this one examination. How can he quickly get this information?
Gerry Middlemarch, a health services researcher at State University, leads a team that is studying treatments for low back pain. They want to know: What are the outcomes for patients who have surgery, who attend pain control clinics, or who receive chiropractic services? How well are these patients functioning? How satisfied are they with their treatment? What have been the relative costs of care?
A man in jogging clothes arrives unconscious at the emergency department of Santa Teresa Memorial Hospital, having been found collapsed at an intersection nearby. Who is he? What medications is he taking? Is he diabetic? Does he have a history of heart disease? Is he allergic to any medications?
Officials in the Columbia County Department of Health must begin to plan for long-term development of health facilities in the area, making the best use of local resources, tax revenues, and bonds. In particular, they need to decide: whether to renovate the only community hospital to expand traditional inpatient care services; whether to shift inpatient beds to rehabilitation and skilled nursing beds or to authorize construction of a new nursing home; and whether to establish additional neighborhood clinics for maternal and child care, to upgrade the emergency medical services system for both adults and children, or to add staff for substance abuse facilities. How can they determine the community's greatest needs now and five to ten years from now, and how can they calculate the most cost-effective use of the county's limited health budget?
The Tectonic Plate Manufacturing Company, a large employer in Ironweed City, is facing soaring costs for its health benefit plan. The company is a somewhat paternalistic one, with a generous health plan, and it does not want simply to direct its employees to the "cheapest" hospitals in the area. As the metropolitan area has many hospitals, which range widely in size, services offered, and reputation, how can the company determine which ones will likely have good outcomes with only moderate charges?
The local chapter of the National Paralysis Foundation has a young and energetic executive vice president who wants to move the organization more in the direction of outreach and case management and away from simple fund raising. She wants to find better ways to identify children and families to whom a greater range of services might be offered. How can she best target these efforts in this city of 800,000 as well as in the larger suburban counties surrounding the city?
These vignettes are fiction. Yet, every day, across the nation, these and similar scenarios play out and each describes a valid and legitimate need for medical information. Sometimes the questions are answered quickly and correctly. Sometimes they are not, especially when no central repository of information or network of data sources exists or can be queried easily. In response to this situation, many experts in the health care field share an exciting vision: a community-oriented database or group of linked databases that can address all kinds of inquiries about health care matters in a timely and satisfactory manner.
This report from an Institute of Medicine (IOM) study committee examines the potential that existing and emerging health databases offer for fulfilling this vision. It gives special attention to appropriate uses of data in such repositories and to adequate protections for the privacy and confidentiality of individually identifiable information. It concludes that "health database organizations" can play a pivotal role in health care delivery and research but that they, or other interested parties, will have to take significant steps to ensure that private information remains private. To promote these ends, the committee advances recommendations that are detailed in subsequent chapters. Taken together, the committee's findings, conclusions, and recommendations underscore the extreme importance of the ways in which health care information is to be controlled and used in the future.
Advancing the Prospects for Comprehensive Health Databases and Networks
The desire to understand and improve the performance of the health care system begets a need for data to answer the questions that opened this report. This, in turn, motivates proposals for the creation and maintenance of comprehensive, population-based health care databases that can provide such information with ease and reliability. The past quarter-century has already seen an exponential rise in the number, complexity, and sophistication of health databases, yet they do not approach in extent, inclusiveness, or quality the vision offered above.
What is the state of health databases today? The databases briefly noted here illustrate the range of existing databases; Chapter 2 discusses selected databases in more detail. Among the oldest and best known of the so-called administrative data sets are those associated with the Medicare program, particularly the Part A and Part B files (for, respectively, inpatient and outpatient services) and more recent compilations such as the National Claims History system. All states maintain some form of database for their Medicaid programs; more than two-thirds maintain databases on hospital discharges, such as the Statewide Planning and Research Cooperative System (SPARCS) database in New York State; a similar proportion collect information on emergency medical services (chiefly for prehospital emergency vehicle runs); and more and more states are establishing state databases to support research, policy analysis, and performance of the health care delivery system. As something of a counterpoint to existing databases in the United States, Canadian provincial databases, such as those in Manitoba, contain information on virtually all health encounters (inpatient and outpatient) for all persons in the province, permitting the analyses contemplated for state databases and by this committee.
Other databases are maintained by insurers in the private sector; they are derived from insurance claim forms and include groups covered by service benefit, indemnity, or employer-based health insurance plans. In the past, such databases served chiefly to adjudicate claims for reimbursement; today, they also support research applications. Some health maintenance organizations (HMOs), particularly group and staff model HMOs, maintain patient health records that can be used both for patient care and research. Other major databases (and public use files) have been specially constructed for research studies, such as the RAND Health Insurance Experiment, and for national surveys, such as the National Medical Expenditures Survey and the National Health and Nutrition Examination Survey and its various supplements.
Despite this activity and progress, many difficulties obstruct the realization of this committee's vision. Some problems relate to the content and structure of current health databases; others pertain more to the difficulties of creating and maintaining comprehensive databases. One major drawback is that most information gathered today reflects independent events and a single setting (almost always hospital admissions). In the absence of computer-based patient records, even hospital databases are often limited in the quality and quantity of the patient data they contain. Correspondingly, databases often have little or no information about ambulatory and other nonhospital services; thus, they lack facts about primary care, despite the major impact that primary care has on the public's health. Another, related challenge is that episodes of care—longitudinal records that tell how patients fare "in the system as a whole"—cannot easily be constructed. In addition, currently available information is not (or cannot easily be) adjusted for important characteristics about patients' sociodemographic circumstances or health status, and this makes it difficult to compare the performance of providers and practitioners or to set insurance premiums or capitated payments correctly and without bias.
Databases created from information generated by the use of health care services, such as those assembled from insurance claim forms, reflect information only on users of the health care system; missing is information on those who never seek or obtain care. As a consequence, planners and others usually cannot use today's databases to learn much about the population as a whole or to assess unmet needs in a community. Moreover, many contemporary databases are essentially archives of information collected at some time in the (possibly remote) past; this retrospective aspect of the information may give little if any support for real-time patient care. Furthermore, much more information will be available on what was done to patients (the processes of care) than on the end results (the outcomes) of that care, yet those wishing to make decisions about treatments or providers prefer, indeed require, outcome-related information. Even the clinical information, if gathered through insurance claims or encounter forms, may be quite limited and of questionable reliability and validity; if obtained from paper-based medical records, then considerable manual abstraction and computer data entry are required (all tasks that introduce their own inaccuracies and biases). The cleanest and most comprehensive data on some topics may come from research projects, but such databases have their own limitations in populations covered, timeliness, and access by individuals or organizations not involved in research.
Other issues may be more prosaic, albeit no less difficult. Chief among these is cost. Creating and maintaining databases, whatever the original source(s) of information, can be expensive. When private entities bear the costs, they may see little reason to share information with others who have not helped to shoulder the monetary burden; when the public sector bears the costs, other claims on the public treasury may take precedence.
Another obstacle is competition in the health field. Rival health care providers or insurers have not been (and are not likely to be) willing to share what they may regard as sensitive, proprietary information. Antitrust considerations may also play a role in the reluctance of possible or actual competitors to share data; health care reform may prompt reinterpretation of antitrust rules, but this area was well beyond the committee's charge or expertise. Even organizations that do not directly compete may see little or no incentive to make their databases available to others. In any case, such groups may not wish to participate in collective actions to set standards for terminology, definitions of data elements, or electronic transmission of information; this has been especially true for organizations whose long-established internal systems would be expensive to change or upgrade.
Finally, as reasons accumulate for creating large health databases, so do the possibilities that such databases (or, more correctly, their users) will do harm to patients, to providers (institutions, physicians, and others), to payers (government, private insurers, and corporations), and to the public at large. The balance between the advantages of such databases and their potential for harm, or at least unfairness, to some groups is not yet clear, and the question of whether and how such entities ought to evolve has been incompletely explored. This perception of potential harm from the proliferation of large databases is itself a barrier to their development.
In the past few years, diverse groups of researchers, business leaders, and policymakers at state and regional levels have begun to design and develop an array of databases, networks, repositories, and the like. These are intended to overcome some of the above problems and to permit far more sophisticated analyses of community health needs, practice patterns, and costs and quality of care than has been possible to date. The interests that have prompted such action cover a broad range: controlling business costs attributable to health benefits; applying computer technologies to decrease costs of processing insurance claims; evaluating and improving health care; conducting technology assessments; planning the expansion and contraction of health care facilities and services across the nation; and transmitting medical history information for an increasingly mobile population. The success of health care reform—as well as the ability to assess the effect of a reformed system on the health of the public—depends on access to the kinds of data that too often are unavailable.
Coincident with this conjunction of needs, interests, and enthusiasm are greatly enhanced electronic capabilities for data management in many aspects of daily life. Comprehensive, computer-based health data files can easily be linked, and information from those files can be moved essentially instantaneously. Thus, an unparalleled opportunity exists to apply computer technologies creatively to address many of the informational needs and data problems noted above. This report focuses on the actions that might be taken to foster such progress by what the IOM committee terms health database organizations.
Health Database Organizations
Many kinds of health databases, networks, and repositories exist today, although they differ in many characteristics. They may be created by business coalitions, built by entities supported with private funds, mandated by state health legislation, or established by federal action. For purposes of this report, these entities are collectively termed health database organizations (HDOs).
As ideally conceptualized by the committee, and discussed more fully in Chapter 2, HDOs have several important characteristics in common. They:
- operate under a single, common authority;
- acquire and maintain information from a wide variety of sources and put their databases to multiple uses;
- have files containing person-identified and person-identifiable data;1
- serve a specific, defined geographic area;
- have inclusive population files;
- have comprehensive data with elements that include administrative, clinical, health status, and satisfaction information;
- manipulate data electronically; and
- support electronic access for real-time use.
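The distinction drawn in the third characteristic, between person-identified and person-identifiable data, can be illustrated with a minimal sketch. It is not drawn from the committee's work: the field names (name, ssn, birth_date, sex, zip_code), the reference roster, and the function classify are all hypothetical, and the rule used (a combination of facts that matches exactly one person in a roster) is only one simple way to express the idea that facts can collectively point to an individual.

```python
# Hypothetical field names; a real HDO would define its own data elements.
DIRECT_IDENTIFIERS = ("name", "ssn")
INDIRECT_FIELDS = ("birth_date", "sex", "zip_code")


def classify(record, roster):
    """Label a record by how readily it points to one individual.

    Person-identified records are also person-identifiable; this
    function simply returns the most specific label that applies.
    """
    # An obvious individual-related identifier is present.
    if any(record.get(field) for field in DIRECT_IDENTIFIERS):
        return "person-identified"

    # Otherwise, count how many people in the (assumed) roster share
    # the same combination of indirect facts.
    key = tuple(record.get(field) for field in INDIRECT_FIELDS)
    matches = sum(
        1
        for person in roster
        if tuple(person.get(field) for field in INDIRECT_FIELDS) == key
    )
    if matches == 1:
        return "person-identifiable"
    return "not readily identifiable"


if __name__ == "__main__":
    roster = [
        {"birth_date": "1950-01-01", "sex": "F", "zip_code": "11111"},
        {"birth_date": "1950-01-01", "sex": "F", "zip_code": "11111"},
        {"birth_date": "1962-07-04", "sex": "M", "zip_code": "22222"},
    ]
    claim = {"birth_date": "1962-07-04", "sex": "M", "zip_code": "22222"}
    print(classify(claim, roster))
```

In this sketch a record with no name or Social Security number can still single out one person, which is why removing obvious identifiers alone may not protect confidentiality.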
The prospect of creating these entities has raised numerous issues. Among the more conspicuous are: (1) worries on the part of health care providers and clinicians about use or misuse of the information that HDOs will compile and release and (2) alarm on the part of consumers, patients, and their physicians about how well the privacy and confidentiality of personal health information will be guarded. Addressing these two concerns was the chief focus of the IOM committee appointed to conduct this study. A third issue—the technical and political feasibility of building such repositories of health information and assuring that the expected benefits are achieved—is often voiced, but addressing it was beyond the scope of the study.
The Institute of Medicine Study
The direct impetus for this study came from discussions between staff of the John A. Hartford Foundation and the IOM in the early 1990s. The Hartford Foundation has a long-standing interest in issues relating to the generation and application of information to improve health care delivery and to increase the value of health care spending. Its interests have intensified in the present context of vastly greater computer capabilities in the health care sector, increasing attention to health matters in the business community, rising interest among health professionals in understanding the effectiveness and appropriateness of the health care services they deliver, and growing sophistication among consumers about health care matters. In this framework, the foundation had supported earlier IOM studies on computer-based patient records (IOM, 1991a) and clinical practice guidelines (IOM, 1992a). It now had specific questions concerning potential obstacles to the successful implementation of regional health data networks or repositories, known as Community Health Management Information Systems (CHMISs), whose creation it was supporting in several areas of the country.2
1 Person-identified means that the record contains an obvious individual-related identifier such as name or Social Security number. Person-identifiable means that the record contains a variety of facts that collectively can be used to infer the identity of the individual. That is, person-identified is a subset of person-identifiable data. These two related terms are discussed more fully in Chapter 2.
The Study Committee and Its Charge
In early 1992 the IOM appointed a study committee that conducted the major part of its work between March 1992 and December 1993.3 The study committee, chaired by Roger Bulger, M.D., consisted of 16 individuals (see roster) with expertise in the administration of medical centers and academic health centers, the practice of medicine, health insurance, utilization management, use of large administrative and research databases for research purposes, administration of large (nonhealth) corporations, consumer services, health and privacy law, ethics, data security, informatics, and state health data organizations.
During meetings and other study activities, the committee addressed the charge given below, which incorporated both the concerns of the Hartford Foundation (about what was then termed regional health data networks) and a somewhat wider set of issues and concepts that committee members themselves believed significant:
The study committee will examine regional health data networks and possible impediments to their effective implementation. The focus will be on ways to facilitate cooperative regional efforts among payers, employers, insurers, health care providers, and other parties that will be practical, useful, and acceptable to a wide array of community interests and members. The study will address privacy, confidentiality, security, and other concerns about health-related information in several kinds of regional data repositories and files in the broad context of the uses to which these databases might be put. Specifically, the committee will seek to understand more about databases now in existence and those now under development and will consider how current impediments to their successful implementation might be addressed in the context of public and private decision making about the costs, quality, appropriateness, effectiveness, and cost-effectiveness of health care services and care providers. The committee will seek information from many sources (e.g., site visits, expert panels or workshops, focus groups, commissioned papers) and will produce an NRC-reviewed report.4
Questions Confronting the Study Committee
The IOM committee took as a given that, even as it conducted its investigations, a variety of HDOs were being created and moving into operational phases. It thus initially addressed itself to two critical "downstream" questions: (1) What current dangers arising from electronic data interchange and the widespread sharing of personal health data might continue, be exacerbated, be ameliorated, or be prevented by such entities? (2) What new harms might be anticipated and minimized or avoided by design? Within the broad sweep of these questions, several more specific issues surfaced during the study.
First, how will HDOs be governed? Developers, providers, consumer representatives, and others ask who will and should own these organizations, what sorts of organizations they will be, and how they should be governed. The different legal mandates that might give rise to such entities and contribute to their effectiveness also come into play. For instance, HDOs might emerge in the private sector as the result of the interests of a business coalition or provider association. Conversely, state legislation might prompt and direct their development. Yet other data repositories might come about through a combination of public- and private-sector interests, data sources, and governing structures.5
Second, what is the scope of an HDO? One factor is whether such databases can be designed and implemented to ensure that they encompass a given region's population, not just the users of health care in the area. The more inclusive and comprehensive the database, the more likely it is to have value for a broad range of users and uses, such as research into the epidemiology of disease or the effectiveness of medical treatments, health care planning, and quality assurance and improvement. Clearly, however, the more expansive the database, the more difficult and more expensive it is to create and to maintain. The arguments concerning the breadth of the population covered relate equally to health care providers; that is, databases that include all independent practitioners and types of facilities are certain to be more useful than those that, for example, cover only physicians with hospital admitting privileges.
Third, how good will the data be? Many of the experts contributing to the IOM committee's fact-finding efforts raised questions about data accuracy, quality, comprehensiveness, reliability, and validity. This led many committee members to wonder how the public, policymakers, providers, and others can determine whether data are factual, sufficiently complete, and appropriate for the analyses in which they are used. Even when data for a given purpose appear to be adequate in these respects, many observers worry about using data for aims other than those originally intended; a case in point is the use of information originally intended for administrative functions to support patient care or quality assessment applications.
Fourth, what about the "safety" of personal health data? Many individuals question whether private information about an individual (however "private" is construed by the individual in question) can be kept private and confidential in these databases, especially when such information is accessible over electronic networks. Aggregation of personal health data in data repositories greatly increases the possible benefits as well as the potential for harm. Thus, many wonder whether it will be possible to assure the public that very sensitive personal health data will be protected, and they ponder the circumstances under which various users should gain access to person-identifiable data.
Fifth, how "secure" will these HDOs be? Apart from protecting privacy and confidentiality through rules about access to data files that contain person-identified information or about release of person-identified information to others, what security measures for the system as a whole can and should be put in place? Many experts state that the threats of breaches of security are myriad and sometimes difficult to detect; although less technologically oriented, lay persons worry as well about unauthorized access to their personal information. All consider that finding ways to prevent, or alternatively to detect and mitigate, such security problems is a significant challenge.
Sixth, who will see and use whose data? The rules that now govern access to patient-, provider-, employer-, and payer-specific data, and should continue to do so in the future, all occasion concern. The reasons for and levels of apprehension differ widely depending on the potential users: patients (including their families and proxies); health care providers (and their employees); insurers and third-party administrators; employers (including those who self-insure medical care for employees); local health care planners; clinical and health services researchers; community and consumer interest groups; attorneys (including patients' attorneys); law enforcement officials; and other interested parties. For health care providers, employers, and insurers, data on competitors may be of intense interest; similarly, plaintiffs' lawyers in malpractice suits will seek to acquire information from HDOs concerning other patients cared for by the defendants. Many observers question whether access to such information should be permitted.
As another case in point: even if access to or use of person-identified information is severely restricted, one can still inquire about the proper uses of data on defined populations. For instance, should analyses be done and made public (even if individuals are never identified) on groups characterized by having certain diseases or belonging to a given socioeconomic or ethnic group?
One significant issue relating to health care providers is whether different rules should govern access to and public disclosure of data on specific institutions versus named practitioners. One can also ask about the propriety of releasing information on groups or categories of providers and practitioners. A principle of fairness in the use of data lay behind much of the committee's thinking on these matters.
Seventh, how should information be made public? Given that HDOs meet conditions for adequate data as well as those relating to security and to privacy and confidentiality of person- or patient-specific information, a further question is how to ensure that they release and disseminate useful knowledge and information in ways that can be understood by the public at large.
Eighth, where do current laws and statutes fit in? Present-day laws and regulations at both the national and state levels may pose constraints for regional HDOs, or they may not affect them at all. The impact of current statutes will depend on the issue at hand, the jurisdiction under consideration, and the reach of existing laws to secondary records. Among the issues are barriers to accessing certain categories of data, such as information on mental health or substance abuse treatment, and statutes establishing time limitations on keeping (or destroying) data.
The committee met five times between March 1992 and August 1993 to debate these matters. Several outside experts (see Appendix A to this volume) were invited to three of these meetings. They described analogs of regional HDOs and other databases and discussed several specific problems with the committee, such as the range of organizations, agencies, and individuals who now seek access to patient health records and possible approaches for addressing misuse of patient data.
To avail itself of expert and detailed legal analysis of issues beyond the time resources of its members, the committee commissioned a paper from an expert in privacy and confidentiality matters (Belair, 1993). The paper identified privacy interests relevant to HDOs (chiefly of the CHMIS variety); examined the impact of existing law on these organizations; advanced some short- and long-term options and strategies for privacy protection of patient-identified information; and gave particular consideration to the status and protection of clinical and other patient-identified data once they move beyond legal or other protections afforded to primary medical records.
When IOM studies with national significance involve activities initiated at the grassroots, state, and local levels, the IOM often makes a concerted effort to reach out to a wide range of people in those locales. The aims are to learn about the activities and to understand the views of interested parties about issues pertinent to the local efforts, and then to apply those lessons, as appropriate, to broad national, professional, and policy-related issues. The IOM takes care, in these circumstances, not to evaluate or draw public judgments about local efforts.
During the summer and fall of 1992, the committee conducted five major site visits to the following cities (and nearby locales): Memphis, Tennessee; Cleveland, Ohio; Des Moines, Iowa; Seattle, Washington; and Albany and Rochester, New York. During these site visits two or three committee members and IOM staff met with groups developing HDOs in business coalitions and other organizations, practicing physicians and representatives of local medical societies, insurers and third-party claims administrators, health maintenance organizations, consumers, hospital administrators and hospital associations, researchers, state and county health officials, employers, and computer system developers. (Sites and organizations visited are listed in Appendix A.)
Organization of the Report
The committee considered all the questions raised earlier but focused on two primary issues. The first is the public release of descriptive and evaluative data on the costs, quality, and other attributes of health care institutions, practitioners, and other providers, which the committee assumed would be a major function if not a hallmark of HDOs. The second involves the opportunities, risks, and remedies for protecting the privacy and confidentiality of data that do (or may) identify individuals in their role as patients or consumers, not as clinicians or providers. These topics are taken up, respectively, in Chapters 3 and 4. Before that, Chapter 2 describes health databases and HDOs in more detail, discusses their ostensible benefits in general and with respect to a wide range of potential users, and introduces some caveats about how their intrinsic limitations (e.g., poor or incomplete data) must be recognized and overcome.
This report reviews the tremendous promise of regional health data networks for evaluating and improving health care and controlling its administrative costs. While the potential for great benefit to the public may be understood by those in the relevant fields, the potential for harm or lack of fairness in their use may create doubt and fear in many.
Powerful technologies (and electronic technologies are increasingly powerful) can be deliberately or inadvertently misused and cause great harm, in this case primarily in the loss of privacy and confidentiality and the resultant harms this may engender. To gain public support for the vision in this report, and for the public to make best use of the health-related information that will be released, carefully planned strategies must be developed for education about the data networks, about how the data can be used to help the public access and obtain better care, and about what each individual needs to know about the right to privacy and confidentiality and the steps being taken to protect their rights. The responsibility for providing usable public information should be assumed by those who undertake to make the vision of regional data networks become reality.