In discussing these approaches, Chute emphasized, it is important to fully understand the definition of normalization, as it has both syntactic and semantic meanings. Syntactic normalization is largely mechanical and involves correcting malformed messages; an example of such work is the Health Open Source Software pipeline created by the Regenstrief Institute, which performs this type of syntactic normalization. Semantic normalization, on the other hand, typically involves vocabulary and concept mapping.
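A minimal sketch may make the distinction concrete. The message layout, local lab codes, and code mappings below are hypothetical illustrations, not drawn from the Regenstrief pipeline or any other system Chute described; they show only the flavor of each kind of normalization.

```python
# Illustrative sketch only: the message layout, local codes, and mappings below
# are hypothetical and are not taken from any specific production pipeline.

# Syntactic normalization: repair a malformed, pipe-delimited lab-result segment
# (stray whitespace, missing trailing fields) into a fixed-width field list.
def normalize_syntax(raw_segment, expected_fields=7):
    fields = [f.strip() for f in raw_segment.strip().split("|")]
    fields += [""] * (expected_fields - len(fields))  # pad any missing fields
    return fields[:expected_fields]

# Semantic normalization: map a site-specific lab code to a standard vocabulary.
LOCAL_TO_LOINC = {
    "GLU_SER": "2345-7",  # hypothetical local code for serum glucose
    "K_SER": "2823-3",    # hypothetical local code for serum potassium
}

def normalize_semantics(local_code):
    return LOCAL_TO_LOINC.get(local_code)

fields = normalize_syntax("OBX|1|NM|GLU_SER|  98 |mg/dL")
print(fields)                          # ['OBX', '1', 'NM', 'GLU_SER', '98', 'mg/dL', '']
print(normalize_semantics(fields[3]))  # '2345-7'
```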

Both types of normalization assume that there is a normal form to target, yet extant national and international standards do not fully specify that target. Many standards exist, but, Chute said, they do not specify what is needed. The current standards and specifications for HIE and messaging are narrow and do not address the full representational problems of clinical data, so efforts to meet the standards fall short on those fronts. Additionally, while there is tension on this point, machine-readable, rather than human-readable, standard representation is necessary for large-scale inferencing and secondary use.

Having elaborated on the definition and current characteristics of normalization, Chute turned to describing current efforts under ONC’s Strategic Health IT Advanced Research Projects (SHARP) Program, specifically SHARPn, whose major focus is normalizing and standardizing data. SHARPn approaches data normalization through clinical element model (CEM) structures, which provide a basis for retaining consistent meaning when data are exchanged between heterogeneous computer systems or when clinical data are referenced in decision-support logic or other modalities of secondary use. CEMs include the context and provenance of data; for example, a patient’s position and body location are recorded alongside his or her blood pressure reading.
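A simplified, hypothetical rendering of such a structure may help illustrate the idea. The field names and codes below are illustrative only and do not reproduce the actual CEM specifications used by SHARPn.

```python
# Hypothetical, simplified sketch of a CEM-like structure; field names and
# codes are illustrative, not the actual clinical element model specification.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Coded:
    code: str     # code from a standard vocabulary (e.g., SNOMED CT)
    system: str   # which vocabulary the code comes from
    display: str  # human-readable label

@dataclass
class BloodPressureElement:
    systolic_mmHg: int
    diastolic_mmHg: int
    body_location: Coded     # context: where the cuff was placed
    patient_position: Coded  # context: sitting, standing, supine, ...
    measured_at: datetime    # provenance: when the reading was taken
    recorded_by: str         # provenance: who or which system captured it

bp = BloodPressureElement(
    systolic_mmHg=128,
    diastolic_mmHg=82,
    body_location=Coded("368209003", "SNOMED CT", "Right upper arm"),
    patient_position=Coded("33586001", "SNOMED CT", "Sitting position"),
    measured_at=datetime(2012, 3, 14, 9, 30),
    recorded_by="clinic-ehr-01",  # hypothetical source-system identifier
)
```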

This promising model has generated an international consortium, the Clinical Information Model Initiative (CIMI), which brings together a variety of efforts focused on CEMs. When comparing the resulting CEMs across participating partners, it becomes clear that different secondary uses require different metadata, which raises the question of what structured information should be incorporated into these models. By binding value sets to CEMs, Chute suggested, it is possible to effectively institute semantic normalization. Ideally, all collaborating groups would implement the same value sets, drawn from “standard vocabularies” such as LOINC and SNOMED. However, it is likely that many value sets would have to be bound to these CEMs in order to truly have interoperability and a comparable and consistent representation of clinical data. Value-set management, therefore, is a major component of normalization, and one suggested approach to handling this challenge is terminology services and a national repository of value sets managed by the National Library of Medicine. Local codes would have to map to the major value sets, and
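A brief sketch of value-set binding may make this concrete. The value-set membership, local codes, and mappings below are hypothetical illustrations, not content from any actual repository or from the National Library of Medicine.

```python
# Hypothetical sketch of value-set binding: the value-set membership and the
# local-to-standard map below are illustrative, not from an actual repository.

# A value set bound to the "patient position" slot of a blood-pressure model:
# only these codes are considered valid for that slot.
PATIENT_POSITION_VALUE_SET = {"33586001", "10904000", "40199007"}

# Site-specific local codes mapped into the bound value set.
LOCAL_POSITION_MAP = {
    "POS_SIT": "33586001",
    "POS_STAND": "10904000",
    "POS_SUP": "40199007",
}

def bind_position(local_code):
    """Map a local code into the bound value set, rejecting anything outside it."""
    standard = LOCAL_POSITION_MAP.get(local_code)
    if standard not in PATIENT_POSITION_VALUE_SET:
        raise ValueError(f"{local_code!r} has no mapping into the bound value set")
    return standard

print(bind_position("POS_SIT"))  # '33586001'
```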


