inherent to this query included that the ICD-9 codes used for TIA and stroke were not validated in Mini-Sentinel, and that the look-back period for stroke or TIA events was at most 1 year, so patients whose events occurred more than 1 year earlier were missed.

In both of these examples, FDA was able to obtain information very quickly, and found it useful both for gauging how much urgency to attach to a specific question and for developing next steps. Along these lines, Query Health, an ONC-sponsored initiative, is working with many partners to develop standards for distributed data queries. As Elmore emphasized, the idea is to send questions to voluntary, collaborative networks whose varied data sources may range from EHRs to health information exchanges (HIEs) to other clinical records. These queries have the potential to dramatically cut cycle time on population questions, from years to days. For that reason, Elmore said, they are critical to ONC’s strategy to bend the curve toward transformed health, and they will play a foundational role in the digital infrastructure for a learning health system that focuses on the patient and patient populations while ensuring privacy and trust.
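The distributed query idea can be illustrated with a minimal Python sketch. It assumes each network participant evaluates the question against its own records and returns only an aggregate count, so patient-level data never leave the site. The site names, record fields, and functions below are hypothetical and are not drawn from Query Health or Mini-Sentinel.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def run_locally(records: List[Record], predicate: Callable[[Record], bool]) -> int:
    """Run at a data partner: count matching records without exporting them."""
    return sum(1 for record in records if predicate(record))


def distributed_count(
    sites: Dict[str, List[Record]], predicate: Callable[[Record], bool]
) -> Dict[str, int]:
    """Send the same question to every site and collect only aggregate counts."""
    return {name: run_locally(records, predicate) for name, records in sites.items()}


# Hypothetical data held separately by two participating sites.
sites = {
    "site_a": [
        {"age": 71, "had_stroke": True},
        {"age": 64, "had_stroke": False},
        {"age": 80, "had_stroke": True},
    ],
    "site_b": [
        {"age": 59, "had_stroke": True},
        {"age": 45, "had_stroke": False},
    ],
}

# The "question" is a predicate that each site evaluates on its own records.
counts = distributed_count(sites, lambda r: bool(r["had_stroke"]))
total = sum(counts.values())
```

In this sketch the coordinator sees only per-site counts and their total, which is one simple way to reconcile fast population queries with the privacy requirement noted above.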


In his comments on data harmonization and normalization, Christopher Chute stressed that data from patient encounters must be comparable and consistent in order to provide knowledge and insights to inform future care decisions. This normalization is also necessary for big-data approaches to queries. However, most clinical data in the United States, even within institutions, are heterogeneous, which presents a major challenge for harmonization efforts. ONC’s initiation of Meaningful Use is mitigating this challenge, but more work is needed.

Data normalization, Chute said, comes in two varieties: clinical data normalization of structured information, and processing of unstructured natural language. Three potential approaches to instituting this normalization exist. The first is for all generators of data, including lab systems, departmental systems, and physician entry systems, to normalize their data at the source. Given the institutional effort this would require, it is not realistic in the short term. The second places all hopes for normalization in transformation and mapping on the back end of data systems; this sometimes works, but it is often associated with ambiguous meanings and other transformation difficulties. The third and most promising method is a hybrid approach, in which new systems begin by normalizing their data at the source, established systems implement standard normalization protocols such as Meaningful Use, and data from legacy systems are transformed.
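A rough Python sketch of the hybrid approach for structured data follows. It assumes that records from new systems arrive already coded in a shared vocabulary, while records from legacy systems pass through a back-end mapping table. The local codes, the "STD-" target codes, and the function names are invented for illustration and do not correspond to any real terminology or to a method described by Chute.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from legacy local lab codes to a shared vocabulary.
LEGACY_MAP = {
    "GLU": "STD-GLUCOSE",
    "GLUC-SER": "STD-GLUCOSE",
    "HGB": "STD-HEMOGLOBIN",
}


@dataclass
class Observation:
    code: str
    value: float
    normalized: bool = False  # True when the source system emits standard codes


def normalize(obs: Observation) -> Optional[Observation]:
    """Return the observation in the shared vocabulary, or None if unmappable."""
    if obs.normalized:
        # New systems normalize at the source, so the record passes through.
        return obs
    target = LEGACY_MAP.get(obs.code)
    if target is None:
        # Back-end mapping can fail on unknown or ambiguous codes;
        # report the failure rather than guess a meaning.
        return None
    return Observation(code=target, value=obs.value, normalized=True)


records = [
    Observation("STD-GLUCOSE", 5.4, normalized=True),  # from a new system
    Observation("GLUC-SER", 5.9),                      # from a legacy system
    Observation("XYZ", 1.0),                           # unmapped legacy code
]
mapped = [normalize(r) for r in records]
unmapped = [r.code for r, m in zip(records, mapped) if m is None]
```

Returning `None` for unmapped codes keeps the transformation difficulties visible, so they can be reviewed instead of silently producing ambiguous data.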
