Computational Technology for Effective Health Care: Immediate Steps and Strategic Directions

5 Research Challenges

There are deep intellectual challenges where the disciplines of computer science and engineering, health/biomedical informatics, related social sciences, information technology (IT), and health care overlap. Indeed, interdisciplinary work will be necessary to go beyond incremental improvement of existing health care IT or the automation of traditional paper-based workflows. Systematic development of the health care IT-related research agenda is beyond the scope of this brief study, but the committee offers a framework for organizing such an agenda.

It is important to distinguish between a solution to a specific problem in the health care domain and the technology-related efforts needed to realize it. The committee conceptualized the necessary technology-related efforts with respect to two separate dimensions. The first lies along an axis describing the extent to which new, generally applicable research is needed. A second lies along an axis describing the extent to which new research specific to health care and biomedicine is needed. Technology-related efforts can thus be separated into four (2 × 2) quadrants, as illustrated in Box 5.1.[1] From a research management standpoint, such a clustering is helpful for better understanding the parties needed to undertake any given technology-related research effort, the likelihood of its success, the timescale needed to achieve success, the appropriate funding mechanisms, and other such parameters. For example, efforts in quadrants 1 and 3 might be pursued by computer science researchers working in loose cooperation with the health and biomedical informatics communities, whereas efforts in quadrants 2 and 4 would require much tighter coordination and cooperation.

Box 5.1 A Segmentation of Health-Care-Related Technology Efforts
Relatively clear path forward from existing technologies:
  General applicability: Quadrant 1 (General—applied efforts)
  Health care specific: Quadrant 2 (Health care—applied efforts)
Advanced research needed:
  General applicability: Quadrant 3 (General—advanced efforts)
  Health care specific: Quadrant 4 (Health care—advanced efforts)

These two dimensions emerge from the observation that health care IT draws on classic computer science challenges such as providing high availability with low system management overhead [C4O18], high data integrity, and a very high degree of usability. Such goals are essential foundations of many IT systems but are especially challenging to achieve in the context of health care IT, given the scale and diversity of the health care establishment and, in some cases, the need to support a large, broad user base. In addition, many benefits of systems often accrue only when they are viewed by researchers and caregivers as sufficiently trustworthy to replace older solutions. At the same time, some problems related to health care IT involve solutions that are highly specific to health care (e.g., developing high-quality devices for human-computer interaction [C1O2] that do not inadvertently help to spread infection as care providers move from patient to patient).

[1] Conceptually, the segmentation of the domain into these four quadrants is quite similar to the division proposed in Donald Stokes, Pasteur’s Quadrant: Basic Science and Technological Innovation, Brookings Institution Press, Washington, D.C., 1997.
As an illustration of how a solution to a major problem in health care might be decomposed into a technology-related research agenda, consider that most clinicians spend a significant amount of time in documenting the care provided to a patient.[2] One challenge for health care IT would be the creation of a self-documenting environment in which the necessary documentation could be generated with little or no additional effort on the part of the clinicians [C5O19] (see Section 5.2.5). But making progress toward this goal calls for efforts in all four quadrants of the matrix shown in Box 5.1.

The existing technology and general applications of Quadrant 1 provide a clear path for indexing voice recordings. Speech-to-text transcription is a relatively mature technology for vocabularies of modest size, as indicated by the variety of commercial software packages available. Speaker identification is routinely performed using voiceprints of the known participants, the patient typically being the remaining unknown speaker during a clinical encounter, and once a voice recording is transcribed to text, indexing within a known domain borders on the trivial. Full-text transcription today has relatively high error rates that make it unreliable as a basis for making clinical decisions, although as the technology further matures, error rates can be expected to drop.[3] Another general application is information extraction from discourse analysis: a computer listening to a dialog (or examining a transcript) between two people would be able to make inferences about the topics under discussion. Research in this area would build on work in computational linguistics that dates to the 1980s. For deep information extraction (e.g., linking the conversations to key terms in the medical literature), fundamental research in Quadrant 3 is needed (for example) to understand how to relate concepts embedded in the words themselves to the rich store of background knowledge about the world that informs everyday discourse.

As for health-care-specific applications, there is a fairly clear path using existing technology to develop systems that support patient-supplied documentation or documentation provided by the patient’s support system (e.g., family), which would increase the continuity and richness of information available for the clinician, as well as being helpful in dealing with expected future burdens on patients to manage their own care outside traditional health care organizations; this research agenda would fit into Quadrant 2. On the other hand, a system to provide a patient or caregivers with interactive explanations of a disease, particularized by the patient’s culture, learning style, value system, education, and life experience, remains beyond the current state of today’s science and would fit into Quadrant 4.

Other examples of technology-related research efforts in each of the four quadrants are provided below:

Quadrant 1 (General—applied efforts). Adaptation of existing IT and process solutions from other domains and industries, e.g., process and data integration technologies, human-computer interaction technologies, ubiquitous networking technologies, security, search, blogging, and social networking.

Quadrant 2 (Health care—applied efforts). Identification of the best examples of coupled health care improvement and health care IT that have been successfully deployed or prototyped, followed by wide deployment of those examples. Use of existing data and process standards to obtain low-hanging fruit, e.g., portals, electronic messaging, disease management dashboards, decision support and reminders, process automation, and so on.

Quadrant 3 (General—advanced efforts). Invention of new information technologies that are needed in health care, such as ontology management, systems that help to explain why decisions are made, large-scale machine learning, voice technologies, natural language processing, privacy management for access and data mining, and so on.

Quadrant 4 (Health care—advanced efforts). Specific advanced work on advanced ontologies and reasoning in the medical domain, modeling of the human body and the virtual patient, interpretation of medical information to different communities, approaches to learning and improving data quality, aggregation of patient health care information into a trustworthy database with explicit representation of uncertainty [C4O17, C5O23], and so on.

[2] The committee noted this point in its site visits. And the literature has important examples as well. For instance, a survey of more than 2500 clinical oncologists showed that the amount of time they spend filling out paperwork and documenting patient care has increased more than fourfold over the past 25 years. See S. Mayor, “U.S. Cancer Care Is Worse Due to More Paperwork,” British Medical Journal 322(7296):1201, 2001.
[3] To be sure, claims regarding the impending maturity of speech recognition have been made for a long time, but as with user customization of interfaces (see Footnote 22), speech recognition is another example of an idea that was difficult to implement with the technology of 20 years ago but now is much more feasible with today’s technology and just as important today to pursue.
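The indexing step mentioned earlier for transcribed clinical encounters can be sketched briefly. The following is a minimal illustration, not a description of any deployed system: it assumes that transcription and speaker identification have already produced speaker-labeled text segments, and the `Segment` type, the `build_index` function, the vocabulary, and the sample utterances are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Segment:
    """One speaker-labeled stretch of transcribed speech (hypothetical type)."""
    start_s: float   # offset into the recording, in seconds
    speaker: str     # label assumed to come from a speaker-identification step
    text: str        # text assumed to come from a speech-to-text step


def build_index(segments, vocabulary):
    """Map each domain term to the segments in which it is spoken.

    `vocabulary` is an assumed list of lowercase terms for the known domain.
    """
    index = defaultdict(list)
    for seg in segments:
        words = set(re.findall(r"[a-z]+", seg.text.lower()))
        for term in vocabulary:
            if term in words:
                index[term].append(seg)
    return dict(index)


# Made-up sample data for illustration only.
segments = [
    Segment(0.0, "clinician", "Any chest pain or shortness of breath?"),
    Segment(4.2, "patient", "Some pain when I climb stairs."),
    Segment(9.8, "clinician", "Let's review your medication list."),
]
index = build_index(segments, ["pain", "medication", "breath"])
for term, hits in sorted(index.items()):
    print(term, [(s.start_s, s.speaker) for s in hits])
```

A lookup such as `index["pain"]` returns the matching segments with their time offsets, so a reader could jump to the relevant part of a recording. Any transcription errors of the kind the text describes would propagate directly into such an index.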
5.1 AN OVERARCHING RESEARCH GRAND CHALLENGE: PATIENT-CENTERED COGNITIVE SUPPORT

Patient-centered cognitive support emerged as an overarching grand research challenge during the committee’s discussions. This section discusses how a research agenda might be assembled, together with representative research challenges, to illustrate the magnitude of the opportunity.

Much of health care is transactional: admitting a patient, encountering a patient at the bedside or clinic, ordering a drug, interpreting a report, or handing off a patient. Yet transactions are only the operational expression of an understanding of the patient and a set of goals and plans for that patient. Clinicians have a “virtual patient” in mind, a conceptual model of the patient reflecting their understanding of interacting physiological, psychological, societal, and other dimensions. They use new findings (raw data) to refine their understanding of their virtual patient. Then, based on medical knowledge, medical logic, and mostly heuristic decision making, they formulate a plan, expressed as an order (transaction), to try to change the (real) patient for the better.

Today, clinicians spend a great deal of time and energy searching and sifting through raw data about patients and trying to integrate the data with their general medical knowledge to form mental abstractions and associations relevant to the patient’s situation. As reported by Kushniruk, decision making by health care professionals is often complicated by the need to integrate ill-structured, uncertain, and potentially conflicting information from various sources.[4] These sources include but are not limited to myriad journal articles; memories from personal clinical experience; clinical guidelines; medical records from a host of providers (often working for different health care organizations); informal observations and thoughts from colleagues; and patient commentary and insights. Efforts to sift the data from this collection of sources force clinicians to devote precious cognitive resources to the details of data and make it more likely that they will overlook some important higher-order consideration.

The health care IT systems of today tend not to provide assistance with this sifting task. Rather, they squeeze all cognitive support for the clinician through the lens of health care transactions and the related raw data, without an underlying representation of a conceptual model for the patient showing how data fit together and which data are important or unimportant.
There is little or no cognitive support for clinicians to reason about their “virtual patient.” So the health care IT systems force clinicians to a transactional view of the raw data. As a result, an understanding of the patient can be lost amidst all the data, all the tests, and all the monitoring equipment.

In the committee’s vision of patient-centered cognitive support, the clinician interacts with models and abstractions of the patient that place the raw data into context and synthesize them with medical knowledge in ways that make clinical sense for that patient.[5] Raw data are still available, but they are not the direct focus of the clinician. These virtual patient models are the computational counterparts of the clinician’s conceptual model of a patient. They depict and simulate the clinician’s working theory about interactions going on in the patient and enable patient-specific parameterization and multicomponent alerts. They build on submodels of biological and physiological systems and also exploit epidemiological models that take into account the local prevalence of diseases. The availability of these models would free clinicians from having to scan raw data, and thus they would have a much easier time defining, testing, and exploring their own working theories.

What links the raw data to the abstract models might be called medical logic; that is, computer-based tools examine raw data relevant to a specific patient and suggest their clinical implications given the context of the models and abstractions. Computers can then provide decision support, that is, tools that help clinicians decide on a course of action in response to an understanding of the patient’s status. At any time, clinicians have the ability to access the raw data as needed if they wish to explore the presented interpretations and abstractions in greater depth.

One possible framework for future health care IT is depicted in Figure 5.1. This framework, which emerged over the course of the committee’s discussions and contrasts with the limited focus of today’s health care IT, represents an all-encompassing view of components and interactions among components needed to support the Institute of Medicine’s vision of 21st century health care. Future clinician and patient-facing systems would draw on the data, information, and knowledge obtained in both patient care and research to provide decision support sensitive to workflow and human factors.

[4] A. Kushniruk, “Analysis of Complex Decision-Making Processes in Health Care: Cognitive Approaches to Health Informatics,” Journal of Biomedical Informatics 34(5):365-376, 2001.
[5] The notion of putting individual medical facts into an appropriate context is not new, having been described in the literature as early as 1969 (Lawrence L. Weed, Medical Records, Medical Education and Patient Care, Case Western Reserve University Press, 1969). Nevertheless, IT has progressed a long way since then, providing a more suitable medium in which to implement such a notion.
The decision support systems would explicitly incorporate patient utilities, values, and resource constraints such as those mentioned above. They would support holistic plans and would allow users to simulate interventions on the virtual patient before doing them for real. To carry out orders, clinicians would use transactional systems like today’s, but built into the decision support system rather than the other way around. In today’s systems, decision support is commonly an add-on to systems designed primarily for transaction processing and does not benefit directly from results of data mining. Rather than having data entered by clinicians into computer systems, the content of clinical interactions would be captured in self-documenting environments with little or no additional effort on the part of the clinicians. (That is, an intelligent, sensor-rich environment would monitor clinical interactions and reduce sensor input to notes that document the medically significant content of those interactions.)

FIGURE 5.1 The virtual patient—a component view of systems-supported, evidence-based practice. The left side of the figure concerns patient care. Raw data about a patient (the electronic health record) constitute the foundational base. Next come the transactional systems that both produce and use raw data as health care is provided. These two components make up the majority of today’s health care IT. Above them, the committee envisions a computational model of the virtual patient. The right side of the figure represents biomedical science and research and its integral role in health care. Again, raw research data about biological and medical phenomena are at the base. Clinical research transactional systems add to and use raw data during the process of executing or running clinical research protocols. At the top are the models and abstractions that constitute biomedical knowledge. The thread connecting the top three components is what might be called medical logic. Mapping from medical logic to cognitive decision support is the process of applying general knowledge to a care process and then to a specific patient and his or her medical condition(s). This mapping involves workflow modeling and support, usability, cognitive support, and computer-supported cooperative work and is influenced by many non-medical factors, such as resource constraints (cost-effectiveness analysis, value of information), patient values and preferences, cost, time, and so on. The virtual patient poses the greatest research challenge but is only one component. Smooth integration with other components is the goal.

In addition to the research challenges related to modeling the virtual patient and biomedical knowledge are the challenges in modeling and supporting multiparty decision making (that is, medical decisions made by family, patient, primary care provider, specialist, payer, and so on). Techniques to interconnect the components are likely to be equally challenging (see, for example, the discussions in Sections 5.2.3 and 5.2.4 on data integration and data management). Box 5.2 describes some of the technical research challenges for patient-centered cognitive support organized by quadrant.

Box 5.2 Research Problems Categorized by Quadrant for Patient-Centered Cognitive Support
Quadrant 1 (General—applied efforts). Data and process integration technologies, high-quality graphics and sensitive user interface design, coding and application of existing human/health models, application of human language translation technology in some regions
Quadrant 2 (Health care—applied efforts). Careful use of existing data standards and models, codification of best practices
Quadrant 3 (General—advanced efforts). Reasoning, machine learning, explanation (why the software reaches a particular conclusion), multimodal interfaces (see Section 5.2.5 below); a model of models that would support needed extensibility
Quadrant 4 (Health care—advanced efforts). Creation of new advanced models of differential diagnosis; automated machine learning at large-population scale, based on outcomes; a model of models for this domain supporting requisite extensibility

On the non-technical side, a variety of questions arise as to how the use of clinically oriented systems such as those described above might fit into the actual workflow of a health care organization. How would such support fit into the work patterns of future clinicians? What would the impact be on their work efficiency? How and under what circumstances would clinicians trust the output of these systems? How would responsibility for clinical error be apportioned given the integrative functions of these systems? A failure to answer such questions adequately may well impede clinician acceptance of new approaches, even if the technical challenges can be overcome.

The committee’s vision for patient-centered cognitive support is not wholly new. Indeed, development of IT-based tools that examine raw data relevant to a specific patient and suggest their clinical implications was the focus of a great deal of medical expert system work a number of decades ago.[6] Similarly, biomedical informaticians have worked for decades on the problem of how best to summarize and present data using visual methods, a point of special import in the setting of hospital intensive care units (ICUs), where multiple streams of real-time data can be overwhelming. Much of that research also had to deal with issues of acceptance by ICU clinicians and with trust of the technology.[7] And the importance of connecting biological knowledge to clinical applications has been given new emphasis by a recent focus on translational research by the National Institutes of Health.[8] Nevertheless, the committee believes both that new challenges have indeed emerged and that many “old” problems have proven more difficult to address effectively than was first appreciated. Advances in IT such as the World Wide Web and ubiquitous computing challenge the health care IT community to think differently about how to exploit IT for health care purposes.

A final and significant benefit for the committee’s vision of patient-centered cognitive support is that patients themselves should be able to make use of tools designed with such support in mind. That is, entirely apart from being useful for clinicians, tools and technologies for patient-centered cognitive support should also be able to provide value for patients who wish to understand their own medical conditions more completely and thoroughly. Obviously, different interfaces would be required (e.g., interfaces that translate medical jargon into lay language), but the underlying tools for medical data integration, modeling, and abstraction designed for patient-centered cognitive support are likely to be the same in any system for lay end users (i.e., patients).
[6] One of the primary lessons from this work was that although well-designed medical expert systems did have potential to improve clinical diagnoses and recommendations for treatment, many other issues needed to be addressed before they were ready for “prime-time” application. In addition, much of the early work on medical expert systems focused on relatively small problem domains, whereas the overarching medical context for improving health care involves the large problem domain of how all of the patient’s data and problems fit together.
[7] See, for example, R.A. Fleming and N.T. Smith, “Density Modulation—A Technique for the Display of Three-Variable Data in Patient Monitoring,” Anesthesiology 50(6):543-546, June 1979; M.M. Shabot, P.D. Carlton, S. Sadoff, and L. Nolan-Avila, “Graphical Reports and Displays for Complex ICU Data: A New, Flexible and Configurable Method,” Computer Methods and Programs in Biomedicine 22(1):111-116, March 1986; I.A. Galer and B.L. Yap, “Ergonomics in Intensive Care: Applying Human Factors Data to the Design and Evaluation of Patient Monitoring Systems,” Ergonomics 23(8):763-779, August 1980; Y. Shahar and C. Cheng, “Intelligent Visualization and Exploration of Time-Oriented Clinical Data,” Topics in Health Information Management 20(2):15-31, November 1999.
[8] See, for example, Jocelyn Kaiser, “NIH Funds a Dozen ‘Homes’ for Translational Research,” Science 314(5797):237, October 13, 2006, available at http://www.sciencemag.org/cgi/content/full/314/5797/237a.
5.2 OTHER REPRESENTATIVE RESEARCH CHALLENGES

In addition to patient-centered cognitive support, health care offers the computer science community many other interesting research challenges. Several examples are provided to illustrate this point, but there are many more that are not covered in this report.

5.2.1 Modeling

One aspect of the “virtual patient” in Section 5.1 involves modeling various subsystems within a real patient (e.g., different organs, digestive system, and so on) to show how they interact.[9] Such models might operate on different or variable timescales: a model focusing on the absorption of nutrients through the digestive system might operate on a timescale of hours, whereas a model focusing on skeletal health, calcium depletion, osteoporosis, or particular bones might operate over years. Similarly, some models might represent molecular interactions, and others might represent particular cells, organs, or organisms.

To first order, the physiological subsystems of all human beings are identical. Thus, a sensible approach to modeling subsystems in a specific patient is to appropriately parameterize a generic model of those subsystems. But finding appropriate parameterizations for any given model and coupling the different models and the data to drive them pose significant intellectual challenges. Some insight into model interoperability can be gained through the use of ad hoc techniques (e.g., XML-based “mash-ups” [Web applications that combine data from multiple sources] used in Web 2.0 applications) or through other existing component frameworks, but the overall problem of model interoperability for health care purposes is vastly more complex than applications that have been tackled before.
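The idea of parameterizing a generic subsystem model for a specific patient can be made concrete with a toy example. The sketch below is an illustration only: it assumes a one-compartment, first-order elimination model as a stand-in for a physiological submodel, and the class name, parameter names, and numeric values are hypothetical rather than drawn from the report or from clinical data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EliminationModel:
    """Generic submodel: first-order elimination from a single compartment.

    The structure is shared by every patient; only the parameters differ.
    """
    volume_l: float      # apparent volume of distribution (assumed parameter)
    half_life_h: float   # elimination half-life (assumed parameter)

    def concentration(self, dose_mg: float, t_h: float) -> float:
        """Concentration in mg/L at t_h hours after a single bolus dose."""
        k = math.log(2) / self.half_life_h   # elimination rate constant
        return (dose_mg / self.volume_l) * math.exp(-k * t_h)


# Two hypothetical patients share the generic model but not its parameters.
patient_a = EliminationModel(volume_l=40.0, half_life_h=4.0)
patient_b = EliminationModel(volume_l=40.0, half_life_h=8.0)

for t in (0, 4, 8):
    print(t, patient_a.concentration(200.0, t), patient_b.concentration(200.0, t))
```

The hard parts that the text identifies, namely finding the right parameter values for a given patient and coupling many such submodels that run on different timescales, are exactly what this toy omits.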
Progress is being made in understanding specific metabolic pathways.[10] The effects of a medication, as well as of some other treatments, are candidates for modeling. Such models will still require many of the parameters used to manage and classify the data.[11] Genetic makeup, including the capability to produce pathway-controlling enzymes, is one of the most challenging aspects of making such simulations relevant.

[9] The notion of a computational virtual human being that would provide a high-fidelity computational model of a human being that would respond realistically to various stimuli is not new. See, for example, “The Virtual Human Project: An Idea Whose Time Has Come?,” Oak Ridge National Laboratory Review 33(1), 2000.
[10] See, for example, www.HumanCyc.org.
[11] See, for example, PharmGKB, a project to curate information that establishes knowledge about the relationships among drugs, diseases, and genes, including their variations and gene products, available at http://www.pharmgkb.org/.

Coupling models will require a computational platform that can support multiple interacting components that can be combined into larger and more complex models. Such a platform must not only support parallel operation of the analytical processes but also allow assembly of hierarchical simulation and information structures, dynamically built, exploited, modified when possible on the basis of individual patient data and statistical aggregates thereof, and abandoned when no longer effective. At the supporting levels, multiple processing alternatives will exist. Specific, detailed simulations will provide the most specific and current results. Cached results can greatly reduce the computational effort for repeated sub-analyses. Where no analytical methods exist, results from biological or clinical trials or clinician assessments can be provided. Search and interpretation can provide yet another set of inputs. Being able to operate with a variety of computational paradigms in one setting can greatly enhance collaboration among communities that have similar objectives but that now ignore each other.

Yet another challenge in modeling is building multilevel models that can successfully couple highly detailed physiologic models to the much looser clinical “models” that typically are based more on phenomenological relationships than on true underlying causes.

Finally, keeping records of predictions and actual patient outcomes will allow incremental tuning of the approach. It will take much experience as well as careful approaches to do so in a way that converges on a stable and more optimal outcome. The actual determination of patient treatment will remain in the hands and minds of the clinician.
But the feedback that can be provided by bringing data collections, metabolic models, and their processing to an interactive care setting is essential to extract value out of the many technology investments that are in process or being planned. Box 5.3 describes some of the technical research challenges for modeling organized by quadrant.

5.2.2 Automation

The technical definitions of automation allow for multiple forms, depending on the degree of intelligence and autonomy exhibited. Systems that are completely automatic and that can be trusted to work properly without any need for human oversight or attention have proven to be effective and valuable. Systems that require human oversight or control, which in actuality is almost any complex system, fall under the category [...]
[...] large number of publications and symposia dedicated to this problem in all industries that are affected: aviation, process control, and medicine.[14]

The worst problem of automatic systems is an issue of trust. If personnel trust them, the trust is often over-generous, so that personnel are apt to believe erroneous indicators and operations for longer than is prudent, or they may neglect attending to and monitoring of the system even though it is not fully reliable. Similarly, a lack of trust may also be inappropriate, leading people to add to their workload to continually check on the operation of a system that is, in fact, quite capable of autonomous operation.

The problems of over- and underautomation have been well documented in other domains and industries, but the committee believes that they have not been appropriately appreciated within the medical community. Much can be gained in an industry by the introduction of more intelligent, more autonomous systems, but the lessons from other disciplines must also be acquired and followed.[15]

Automation has been implemented most successfully in aviation and process-control manufacturing. Automation is also used in warehousing and traditional manufacturing, as well as in many modern electronic-commerce back-end systems. Stock trading is another example of an activity in which automation can be used successfully. All these cases differ from medicine (although prescription filling and checking may come closest to matching order-filling systems), however, and the lessons they provide cannot be carried over directly into medicine. But drawing on such hard-earned experience as a point of departure for medicine makes good sense.

Finally, the introduction of automation is always a systems problem that intermixes equipment, administrative procedures, and real people. Accordingly, research on automation for medicine will require a multidisciplinary team approach, including technical, medical, and social science expertise. Good design cannot be added on afterward, and intensive cooperative efforts involving people from all disciplines affected by any IT-based system are necessary from the start. Box 5.4 describes some of the technical research challenges for automation organized by quadrant.

Box 5.4 Research Problems Categorized by Quadrant for Automation
Quadrant 1 (General—applied efforts). Application of automation systems that exist; more use of business process integration technology as it exists in information technology; application of simple rules that can make a big difference
Quadrant 2 (Health care—applied efforts). Codification of low-hanging fruit; use of open-source and other community techniques to pool necessary information to produce better automation rules; application of simple things first, like electronic messaging, automated scheduling of various resources, and so on, and an emphasis on avoiding paralysis by analysis
Quadrant 3 (General—advanced efforts). Explanation, self-testing of efficacy, advanced learning, and management of false-negative and false-positive conditions
Quadrant 4 (Health care—advanced efforts). Extension of underlying data uses and modeling to improve model precision (e.g., more data feeding into drug interactions systems could be used to reduce false alarms); efforts to ensure that outcomes are known to the system so that it can self-report and learn

[14] In the medical domain, see, for example, J. Edworthy and E.J. Hellier, “Fewer But Better Auditory Alarms Will Improve Patient Safety,” Quality and Safety in Health Care 14:212-215, 2005; J. Edworthy and E.J. Hellier, “Alarms and Human Behaviour: Implications for Medical Alarms,” British Journal of Anaesthesia 97(1):12-17, 2006; A. Otero, P. Felix, F. Palacios, C. Perez-Gandia, and C.O.S. Sorzano, “Intelligent Alarms for Patient Supervision,” Proceedings of the IEEE International Symposium on Intelligent Signal Processing, WISP 2007, pp. 1-6, 2007.
[15] See, for example, T.B. Sheridan, Humans and Automation: System Design and Research Issues, Human Factors and Ergonomics Society, Santa Monica, Calif. (Wiley Series in Systems Engineering and Management), 2002; D.A. Norman, “The ‘Problem’ of Automation: Inappropriate Feedback and Interaction, Not ‘Over-Automation’,” in D.E. Broadbent, A. Baddeley, and J.T. Reason (Eds.), Human Factors in Hazardous Situations, pp. 585-593, Oxford University Press, Oxford, 1990; C.E. Billings, Aviation Automation: The Search for a Human-Centered Approach, Lawrence Erlbaum Associates Publishers, Mahwah, N.J., 1997; D.A. Norman, The Design of Everyday Things, Doubleday, New York, 1990; B. Lussier, A. Lampe, R. Chatila, J. Guiochet, F. Ingrand, M.-O. Killijian, and D. Powell, “Fault Tolerance in Autonomous Systems: How and How Much?,” in 4th IARP-IEEE/RAS-EURON Joint Workshop on Technical Challenges for Dependable Robots in Human Environments, Nagoya, Japan, 2005.

5.2.3 Data Sharing and Collaboration

The data relevant to health care are highly heterogeneous, and the types and quantity of data evolve rapidly.
In addition to patient-record information that exists in multiple forms, health care requires data about drugs and diagnoses, including data from signals captured by biomedical devices, voice recordings, and data captured as codes. Data are typically stored in multiple locations on multiple systems. Sometimes such data are stored in structured databases, and in other cases relevant data are found in legacy systems, structured files, and databases and text files behind Web forms. Data are increasingly multimedia and high-dimensional, including voice, imaging, and continuous biomedical signals. Data of various types have different degrees of reliability, ranging from test results (which may be quite conclusive) to patient-provided data (which could contain significant biases).

Numerous health care IT challenges require the ability to share and integrate data across multiple systems and seamlessly move data from one system to another. To exploit highly heterogeneous data effectively, users—such as caregivers, medical researchers, and patients—need the ability to pose queries that span multiple data sources without requiring the data to be standardized or requiring the user to query each database in isolation. That is, the user wants a single interface through which any query can be posed.

Today, the challenge for data integration, by which is meant systems that enable data owners to share data and collaborate in flexible ways without having to store all the data in a single repository or have them all conform to a common schema, is understood from the systems and logical perspectives. One approach is to aggregate patient health care information into a common data repository [C4O14]. Although aggregation is a basic building block of data integration, aggregating all relevant data into a single repository is likely to be infeasible. As a result of a significant amount of research, there are commercial systems today that are capable of answering queries that span multiple sources without loading all the data into a single warehouse with a uniform schema. The user of such a system accesses the data through an abstraction called a mediated schema, and queries are then reformulated from the mediated schema onto the relevant data sources using a set of semantic mappings. These systems perform adequately, and the small additional cost of accessing remote systems at query time is offset by the management benefits of having systems that can share locally owned and maintained components.
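The idea of a mediated schema with semantic mappings can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the source names (`clinic_db`, `pharmacy_files`), the column names, and the mediated attributes `patient_id` and `medication` are all invented here, and real systems reformulate queries against remote databases rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """One locally owned data source that keeps its own column names."""
    name: str
    rows: list      # records in the source's native schema (list of dicts)
    mapping: dict   # semantic mapping: mediated attribute -> local column


def query(sources, wanted, where):
    """Answer a query posed against the mediated schema.

    wanted: mediated attributes to return
    where:  mediated attribute -> required value
    """
    results = []
    needed = set(wanted) | set(where)
    for src in sources:
        # Skip sources whose mappings cannot express the whole query.
        if not needed.issubset(src.mapping):
            continue
        # Reformulate: translate mediated names into the source's own columns.
        local_where = {src.mapping[a]: v for a, v in where.items()}
        for row in src.rows:
            if all(row.get(col) == v for col, v in local_where.items()):
                answer = {a: row.get(src.mapping[a]) for a in wanted}
                answer["source"] = src.name
                results.append(answer)
    return results


# Hypothetical sources; the data stay where they are and keep their own schemas.
clinic = Source(
    name="clinic_db",
    rows=[{"pt_id": "p1", "rx": "warfarin"}, {"pt_id": "p2", "rx": "aspirin"}],
    mapping={"patient_id": "pt_id", "medication": "rx"},
)
pharmacy = Source(
    name="pharmacy_files",
    rows=[{"PatientNo": "p1", "DrugName": "ibuprofen"}],
    mapping={"patient_id": "PatientNo", "medication": "DrugName"},
)

meds = query([clinic, pharmacy], wanted=["medication"], where={"patient_id": "p1"})
print(meds)
```

The user writes one query in mediated terms; each source is consulted in its own vocabulary, and no data are copied into a common warehouse.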
The main shortcoming of current data integration systems is that they are too hard to use. Designing a mediated schema and creating the semantic mappings between the sources and the mediated schema entail a significant effort that requires considerable subject-matter expertise. This is especially true when the schema is large, complicated, and likely to be continually evolving, as in the case of health care data. As a consequence, integration projects often fail midway since the costs of this design work are incurred up front before the benefits from that work are obtained.

The above challenge suggests three specific research directions:

Data integration systems that are fundamentally easier to use. The system should be able to examine the data sources available and suggest to the designers a possible mediated schema and mappings from the data sources to semantically related entries in the mediated schema. The system should point to gaps in the coverage of the data sources so that additional sources can be discovered or enhanced. The system should present to the designer effective visualizations of the data and the schemata to further facilitate the process. Gaps in the system’s coverage can be detected by analyzing queries (e.g., frequent queries asking for an attribute of a patient which is not represented in any of the data sources of which the system is aware).

Data integration that can proceed incrementally. It should not be necessary to completely integrate data sources in order to get some benefit from the collection of sources. One approach to reducing the effort required in data integration is what might be called “pay-as-you-go” data integration. A design goal should be the construction of systems that offer access to multiple data sources with little or no human effort, and that improve over time as the users realize where integration is needed most. For example, a system could begin by guessing approximate (and possibly incorrect) semantic mappings; over time, semantic mappings would be improved, thereby enabling more comprehensive answers to queries over the collection of data sources. Some of the specific challenges to obtaining such systems are (1) leveraging user interactions with the system to understand the semantics of the data, (2) developing collaborative techniques for improving the semantic cohesion of a collection of data sources, and (3) maintaining compatibility of incremental integration efforts with previous versions.

More flexible architectures for data sharing and integration. Currently, the common architecture for such systems envisages a single mediated schema and mappings to that schema.16 While this architecture has the advantage that the data can still remain in the sources and be managed there, the creation of the mediated schema is still a centralized effort.
Systems are needed that enable data owners to share data in a more ad hoc fashion and extend the coverage of data sharing as they see fit.17 Peer-to-peer architectures are needed for sharing data whereby it is easy to (1) discover data sources, (2) join the network of available sources without significant effort, and (3) retain control over the data and its privacy as necessary.18 In addition, such a system should enable tracking different versions of the data as the data evolve over time, and highlight the changes when appropriate.

If these challenges can be met, it will be much easier to build and deploy data integration systems that require minimal set-up time and provide valuable services without specifying complete and accurate semantic mappings. For example, certain data regarded as critical might be made interoperable through explicitly designed semantic mappings. But all data might be made available (i.e., visible) subject to control for confidentiality even if no mappings had been created. A care provider needing data for which no mappings were available would have to work harder to query those data, but those data would at least be visible and usable for clinical purposes. If and when a need is recognized for making a particular class of data semantically consistent, mappings could be created—and the system’s overall interoperability could be incrementally improved. Box 5.5 describes some of the technical research challenges for data sharing and collaboration organized by quadrant.

16 See, for example, a common architecture for enterprise information integration products from IBM (http://www-01.ibm.com/software/data/integration/) and BEA (now Oracle) (http://edocs.bea.com/liquiddata/docs81/index.html).

17 This embodies the philosophy underlying the Semantic Web approach.

18 See, for example, Gio Wiederhold, “Mediators in the Architecture of Future Information Systems,” IEEE Computer 25(3):38-49, March 1992.

Box 5.5 Research Problems Categorized by Quadrant for Data Sharing and Collaboration

Quadrant 1 (General—applied efforts). Application of known data integration technology, ontology management and analysis tools, and state-of-the-art search techniques (including user-machine learning and information retrieval technology to enable systems to self-tune)

Quadrant 2 (Health care—applied efforts). Application of existing ontologies and knowledge sources in scalable, efficient systems

Quadrant 3 (General—advanced efforts). Development of easier-to-use data integration and ontology management systems, to allow for incremental creation and annotation of semantic information; work toward understanding how to decide when and where semantics must be added, and when semantics can be induced based on raw information stored and usage models

Quadrant 4 (Health care—advanced efforts). Advanced privacy management that supports needs for aggregative, epidemiological research
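The “pay-as-you-go” idea discussed above (start from guessed mappings, then improve them where users find integration is needed) can be sketched in a few lines of Python. This is a minimal illustration, not a description of any existing product: the mediated attributes, the source column names, and the 0.6 similarity threshold are assumptions made for the example, and plain string similarity from the standard `difflib` module stands in for the far richer matching a real system would need.

```python
from difflib import SequenceMatcher

# Hypothetical mediated schema, invented for this sketch.
MEDIATED = ["patient_id", "medication", "birth_date"]


def normalize(name):
    """Lower-case a name and drop separators so 'Patient_ID' matches 'patientid'."""
    return name.lower().replace("_", "").replace(" ", "")


def guess_mappings(columns, confirmed=None, threshold=0.6):
    """Return mediated attribute -> (source column, status).

    Mappings confirmed by a person always win; the rest are guessed from
    name similarity and kept only if they clear the threshold.
    """
    confirmed = confirmed or {}
    mappings = {}
    for attr in MEDIATED:
        if attr in confirmed:
            mappings[attr] = (confirmed[attr], "confirmed")
            continue
        scored = [
            (SequenceMatcher(None, normalize(attr), normalize(col)).ratio(), col)
            for col in columns
        ]
        score, best = max(scored, default=(0.0, None))
        if score >= threshold:
            mappings[attr] = (best, "guessed")
    return mappings


def unmapped(mappings):
    """Mediated attributes with no mapping yet: where human effort is still needed."""
    return [attr for attr in MEDIATED if attr not in mappings]


source_columns = ["PatientID", "Medication_Name", "DOB"]

# First pass: no human effort at all, only guesses.
first_pass = guess_mappings(source_columns)
# Later: a user who needed birth dates supplies the one missing mapping.
second_pass = guess_mappings(source_columns, confirmed={"birth_date": "DOB"})

print(first_pass, unmapped(first_pass))
print(second_pass, unmapped(second_pass))
```

The source is usable after the first pass, and its coverage improves only where someone has shown that it matters; guessed mappings remain labeled as such so that they can be reviewed.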
To illustrate the importance of data integration, consider its application to the personal health record. In its ideal future form (not that of today), a personal health record contains an individual’s entire medical history, that is, from all interactions with all health care providers (and self-provided care as well) and is under the control of the patient.19 For information to be easily accessible to the patient, data supplied by different providers—likely each with their own local health care IT systems generating data in idiosyncratic formats and with different meanings—must be integrated in a way that they appear to have common semantics. Data protection—a key element of personal health records, in that the patient is empowered to apply fine-grained control of the information contained therein—also requires that patient-specified security and privacy policies act on all data elements referring to the targets of those policies. This requirement presents yet another data integration task.

19 See, for example, Kenneth D. Mandl and Isaac S. Kohane, “Tectonic Shifts in the Health Information Economy,” New England Journal of Medicine 358(16):1732-1737, April 17, 2008.

5.2.4 Data Management at Scale20

Presuming the existence of large integrated corpora of data (the focus of Section 5.2.3 on data integration), another major challenge is in managing those data. Some of the important dimensions of medical information management include:

Annotation and metadata. Raw data almost never speak for themselves, and their interpretation inevitably relies on metadata—annotations to the primary data that provide the necessary context. For example, the primary data for the human genome consist of a sequence of some 3 billion nucleotides. Metadata associated with the primary data help scientists to identify significant patterns within those data—a given sequence might be annotated as a gene or a regulatory element. Metadata could also be used to trace the provenance or lineage of data. For example, the value of certain data in an electronic health record could be enhanced if the record included information about the conditions under which those data were obtained (e.g., physician observations of a patient’s description of symptoms might be accompanied by video and audio recordings of the session with the patient). With metadata, a primary problem is the design and development of tools to facilitate machine-readable annotations in large databases.

Information extraction from text.
The volume of medically significant information rendered in text form (e.g., physician or nursing notes) is large, and may in various instances be as significant as, or more significant than, information rendered in different forms (e.g., lab instrument readings). Extracting useful medical information from textual notes is therefore an important problem that calls for computer science expertise in text processing, natural language processing, and statistical text-mining techniques as well as medical expertise to understand the concepts and ideas to which the information refers. New techniques are needed for extracting information such as patient names, doctor names, medicine names, and disease names from textual notes, and for generating automatic linkages between different relevant entities. Such extraction would make it possible to piece together a larger picture automatically while pulling information from multiple heterogeneous data and information sources. Extraction of data from tables and figures in reports is another example of a useful information extraction capability.

20 An extended discussion of the data management challenges in biomedical data can be found in National Research Council, Catalyzing Inquiry at the Interface of Computing and Biology, The National Academies Press, Washington, D.C., 2005.

Linkage. Clinicians often rely on multiple types of data to render a diagnosis—e.g., blood tests and clinical observations and imaging. Relationships between different types of data are best captured in ontologies,21 which are descriptions of concepts and relationships that exist among the concepts for a particular domain of knowledge. In addition to providing controlled, hierarchically structured vocabularies for medical terminology, they specify object classes, characteristics, and functions in ways that capture important concepts and relationships between those concepts (perhaps in a given area, such as internal medicine or cardiology or oncology). Ontologies containing such information facilitate the representation of working hypotheses and the evidence that supports and refutes them in machine-readable form, and can help clinicians reason their way through complex cases. Ontologies must also be revisable in the light of new research that may discover previously unknown relationships or develop new interpretations of existing concepts. An important research problem is thus the design of appropriate ontologies and automated approaches to populating and updating them through sources such as medical dictionaries, textbooks, and recent articles in the relevant literature, although it is an open question to what extent declarative approaches can capture and exploit all the relevant relationships.
Fallback to programmed solutions provides an escape route and should remain possible, so that implementations can be put into practice, provide feedback, and thus enable progress.

21 The term “ontology” is a philosophical term referring to the subject of existence. The computer science community borrowed the term to refer to “specification of a conceptualization” for knowledge sharing in artificial intelligence. (See, for example, T.R. Gruber, “A Translation Approach to Portable Ontology Specification,” Knowledge Acquisition 5(2):199-220, 1993.)

Privacy. Epidemiological research and phase IV drug testing (post-approval) both depend on the aggregation of select medical data from large numbers of individual records, even if individual identities need not be associated with these data. The electronic storage of these records facilitates such aggregation, but aggregation on a large scale also has many privacy implications. An important research problem is thus how to mine these data without unduly compromising individual privacy when individuals have not explicitly granted data access permission. Additionally, even outside the world of epidemiological research, the management of data in ways that permit data sharing among those with a need to know, while prohibiting other access, is a significant technical challenge.

Scale and other systems issues. There are many challenges in creating and implementing the protocols and systems that will allow a variety of interlocking systems to provide a robust, high-performance information store that can be reliably and easily accessed by a variety of different classes of users, ranging from the patient and her designees to caregivers. For example, interlocking health care IT systems must enable and preserve the relationships among the different applications and workflows. In addition, the need to store data for a lifetime presents significant technical challenges if only because the storage lifetime could exceed the lifetime of some organizations.

User interface. While technically not data management per se, the data models, data federation technologies, and security and privacy approaches must all support the wide variety of usage that is expected. What an emergency room physician needs is very different from what is required by a physician reviewing the data with an eye toward wellness, a point understood by at least some in the biomedical informatics community since the 1980s.22 Visualization tools that help users integrate and manage data pulled from multiple sources might also be considered part of a sophisticated user interface, and coupled with analytic techniques may help to solve problems that are not possible to solve using analytic techniques alone.

There are many more dimensions to the problem than those described above, which are intended to be illustrative rather than exhaustive. In addition, Box 5.6 describes some of the technical research challenges for data management at scale organized by quadrant.
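As a point of reference for the information-extraction dimension described above, the simplest possible baseline is dictionary lookup. The Python sketch below is an assumption-laden illustration: the term lists and the sample note are invented, and a fixed lexicon with regular expressions is far cruder than the natural language processing and statistical text-mining techniques the text calls for (it misses abbreviations, misspellings, and negation such as "no history of ...").

```python
import re

# Hypothetical term lists; a real system would draw on curated vocabularies.
LEXICON = {
    "medication": ["warfarin", "aspirin", "metformin"],
    "disease": ["atrial fibrillation", "type 2 diabetes"],
}


def extract_entities(note):
    """Return (offset, label, matched text) for every lexicon term in the note."""
    found = []
    for label, terms in LEXICON.items():
        for term in terms:
            # Whole-word, case-insensitive match of the literal term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, note, flags=re.IGNORECASE):
                found.append((match.start(), label, match.group(0)))
    return sorted(found)  # order of appearance in the note


note = "Pt with atrial fibrillation, started Warfarin; continues metformin."
entities = extract_entities(note)
for offset, label, text in entities:
    print(offset, label, text)
```

Keeping the character offset with each entity is a deliberate choice: it lets later steps link entities that occur close together in the same note, which is one route to the automatic linkages mentioned above.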
In summary, the problems addressed in Section 5.2.3 and in this section are core problems that would lead to the creation of health care records with enormously diverse applications. These applications include providing the information that would, among other things, (1) power the virtual patient described in Section 5.1, (2) provide a strong foundation for epidemiological research, (3) improve communication throughout the caregiver ecosystem, and (4) offer information storage and retrieval that would enable patients and their family and friends to be more involved in their own health care.

22 See, for example, Eric Sherman and Edward Shortliffe, “A User-Adaptable Interface to Predict Users’ Needs,” pp. 285-315 in M. Schneider-Hufschmidt, T. Kuhme, and U. Mallinowski (Eds.), Adaptive User Interfaces, Elsevier, Amsterdam, 1993. User customization of an interface is an example of an idea that was difficult to implement with the technology of 20 years ago but now is much more feasible with today’s technology and just as important today to pursue.
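To make the tension between epidemiological aggregation and privacy noted above more tangible, the following Python sketch releases only aggregate counts and withholds any group smaller than a minimum size. Small-cell suppression is chosen here purely as a simple, well-known illustration; it is not a technique proposed in this chapter, it is not sufficient on its own to protect privacy, and the field names and the threshold of 5 are assumptions made for the example.

```python
from collections import Counter


def aggregate_counts(records, group_by, min_cell=5):
    """Count records per group, withholding groups smaller than min_cell.

    Returns the releasable counts and the number of records held back.
    """
    counts = Counter(tuple(r[field] for field in group_by) for r in records)
    released = {}
    suppressed = 0
    for key, n in counts.items():
        if n >= min_cell:
            released[key] = n
        else:
            suppressed += n  # small groups could point to individuals
    return released, suppressed


# Hypothetical post-approval reaction reports with identities already removed.
records = (
    [{"region": "north", "reaction": "rash"}] * 7
    + [{"region": "north", "reaction": "nausea"}] * 2
    + [{"region": "south", "reaction": "rash"}] * 5
)

released, suppressed = aggregate_counts(records, ["region", "reaction"])
print(released, suppressed)
```

Reporting how many records were withheld lets a researcher judge how much the suppression may bias the released counts.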
Box 5.6 Research Problems Categorized by Quadrant for Data Management at Scale

Quadrant 1 (General—applied efforts). Creation of systems that scale, using notions of “cloud” computing, coupled with local information to reduce management complexity

Quadrant 2 (Health care—applied efforts). Compression, understanding of what to store and what not to store, prioritization of information; privacy of patient information

Quadrant 3 (General—advanced efforts). Techniques for correcting or coding degrees of accuracy and precision in data; techniques for learning about and forming aggregate data sets; automated management techniques for large, highly valuable data sets that are often used across many organizations

Quadrant 4 (Health care—advanced efforts). Applications for handling inaccurate data to improve input to health care data models, better coding techniques for information

5.2.5 Automated Full Capture of Physician-Patient Interactions

As noted above, care providers spend a great deal of time documenting their interactions with patients. Automated capture of patient-provider interactions would release such time for more productive uses and help to ensure more complete and more timely patient records. A comprehensive environment for capturing interactions would necessarily be multimodal, involving ways of capturing and interpreting visual images and conversations. Rather than one general-purpose environment, capture environments would likely be specialized to different settings—such as hospital room (e.g., nurse/patient), emergency room (e.g., ER physician/patient), routine consultation (primary care provider/patient), and specialist consultation (e.g., cardiologist or surgeon and patient). Some of the important dimensions in this problem domain include:

Real-time transcription and interpretation of the dialog between patient and provider.
Individual voices must be identified as being associated with the provider or the patient. The transcript must be parsed unambiguously, irrelevant information identified and ignored, and relevant information interpreted.

Summarization of physical interactions between patient and provider based on the interpretation of images recorded by various cameras in the patient care room. In a hospital room, the system must be able to distinguish between the administration of an intravenous antibiotic and a tubal feeding. In an examination room, the system must be able to identify parts of the body to which the patient or provider is pointing and correlate such gestures with the dialog. In all settings, cameras should be able to identify documents presented to patients, and to capture written annotations made by patient or provider, subject to appropriate privacy safeguards. The goal would be a system able to produce a useful summary and/or the equivalent of a video transcript that describes what happened.

Transcript visibility for patients, and patients’ ability to correct and annotate the transcript.

Correlation of the information contained in the audio and visual transcripts. Use of both types of information should increase the accuracy and utility of the resulting summaries.

Some pieces of this technology exist, but even when they do, integrating them and making the results available smoothly, with little latency, are challenges to today’s computer science. Box 5.7 describes some of the technical research challenges for automated full capture of physician-patient interactions organized by quadrant.

Box 5.7 Research Problems Categorized by Quadrant for Automated Full Capture of Physician-Patient Interactions

Quadrant 1 (General—applied efforts). Use of photographic technology, integration of sensor systems (perhaps, from the simple temperature sensor to imaging), use of speech dictation for transcription and/or indexing of audio files, natural language processing on existing textual records

Quadrant 2 (Health care—applied efforts). Creation of high-quality workflows, customization of physical devices for the hospital environment (e.g., with due regard for infection control and to minimize physician/patient distance), creation and use of appropriate language models to maximize machine capabilities, workflows to make transcripts available to patients, use of software systems post-visit to provide information

Quadrant 3 (General—advanced efforts). Ever-improved speech recognition, multimodal interface development, summarization and extraction of key information, sentiment analysis, automatic privacy management

Quadrant 4 (Health care—advanced efforts). Development of new modes of caregiver-patient-computer interaction where the interaction is tri-partite and the computer is not “in the way”; advanced empirical, health care informatics work aimed at understanding how to efficiently acquire and provide information via computer systems
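The correlation of audio and visual transcripts discussed above can be sketched, in its most basic form, as alignment by timestamp. The Python below assumes that speech recognition and image interpretation have already produced timestamped utterances and events, which is the hard part and is not shown; the class names, the sample dialog, and the one-second tolerance are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    start: float   # seconds from the start of the encounter
    end: float
    speaker: str   # already attributed to "provider" or "patient"
    text: str


@dataclass
class VisualEvent:
    time: float
    description: str


def correlate(utterances, events, slack=1.0):
    """Attach to each utterance the visual events that occur during it,
    allowing `slack` seconds of tolerance on either side."""
    merged = []
    for u in utterances:
        nearby = [
            e.description
            for e in events
            if u.start - slack <= e.time <= u.end + slack
        ]
        merged.append((u.speaker, u.text, nearby))
    return merged


utterances = [
    Utterance(0.0, 2.5, "provider", "Where does it hurt?"),
    Utterance(3.0, 5.0, "patient", "Right here."),
]
events = [VisualEvent(3.8, "patient points to left knee")]

summary = correlate(utterances, events)
for speaker, text, seen in summary:
    print(speaker, text, seen)
```

In this example the gesture resolves what "here" refers to, which is the kind of gain in accuracy and utility that combining the two transcripts is meant to deliver.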
Lastly, a key non-technical issue to be faced by any full-capture system is patient acceptance. In some of today’s interactions between clinician and patient, a patient may rely on a clinician’s discretion to refrain from entering into the record certain sensitive information related by the patient. In the absence of believable assurances in full-capture clinical interactions that such sensitive information will not be recorded, patients may well be less forthcoming or complete in their accounting of their medical histories and circumstances. Such problems will have to be addressed before any such system will be widely acceptable.