National Academies Press: OpenBook
Suggested Citation:"1 Summary." National Academies of Sciences, Engineering, and Medicine. 2021. Development of a Comprehensive Approach for Serious Traffic Crash Injury Measurement and Reporting Systems. Washington, DC: The National Academies Press. doi: 10.17226/26305.


1 Summary

The goal of the National Cooperative Highway Research Program (NCHRP) Project 17-57 is to develop a comprehensive roadmap for states to measure serious injuries in crashes. This goal has been motivated by the Moving Ahead for Progress in the 21st Century Act (MAP-21), which requires a set of performance metrics to include assessment of serious injuries in crashes.

The first task of the NCHRP 17-57 project was to recommend a definition of serious injury for use in these performance metrics. We recommended using a Maximum Abbreviated Injury Scale score of 3 or greater (MAIS 3+) to define serious injury (Flannagan et al., 2012). The key element of this recommendation is to use a diagnosis-based definition of serious injury. However, using a diagnosis-based definition of serious injury for highway performance metrics requires data linkage between crash and medical outcome.

The second task of the project was to recommend near-term solutions for measuring serious injuries in crashes. We recommend two approaches that allow states to measure serious injuries using a medical-diagnosis-based definition such as MAIS 3+. The first is to use state trauma or hospital discharge databases to count serious injuries in crashes. A majority of states have reasonably comprehensive trauma databases in place and can use them for this purpose while more comprehensive linkage is being put in place. The second near-term approach is to use sampling of hospital records for a subset of crashes. Efficient, stratified sampling can allow states to estimate the number of serious injuries and their association with certain roadway, crash, vehicle, behavioral, and occupant characteristics. A third near-term approach discussed was to use regression to “correct” KABCO-based measures. We do not recommend this as a near-term approach except in limited circumstances where other options are not available or older, legacy datasets are being used.
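To make the MAIS 3+ definition concrete, the following is a minimal Python sketch, not taken from the report. It assumes each person's injuries have already been coded with Abbreviated Injury Scale (AIS) severity scores from 1 to 6, and that any other value (such as 9 for unknown severity) should be ignored. The record layout and the names `mais`, `is_serious`, and `count_serious` are hypothetical.

```python
from typing import Iterable, Optional


def mais(ais_scores: Iterable[int]) -> Optional[int]:
    """Return the maximum known AIS severity (1-6), or None if none is known."""
    # Assumption: scores outside 1-6 (for example 9, unknown) carry no information.
    known = [score for score in ais_scores if 1 <= score <= 6]
    return max(known) if known else None


def is_serious(ais_scores: Iterable[int], threshold: int = 3) -> bool:
    """Apply the MAIS 3+ rule: serious if the maximum AIS meets the threshold."""
    score = mais(ais_scores)
    return score is not None and score >= threshold


def count_serious(people: Iterable[dict]) -> int:
    """Count people whose injuries meet the MAIS 3+ definition."""
    # Hypothetical record layout: {"person_id": ..., "ais_scores": [...]}
    return sum(1 for person in people if is_serious(person["ais_scores"]))


if __name__ == "__main__":
    sample = [
        {"person_id": "p1", "ais_scores": [1, 2]},
        {"person_id": "p2", "ais_scores": [2, 4, 1]},
        {"person_id": "p3", "ais_scores": [9]},
    ]
    print(count_serious(sample))
```

In practice the AIS scores would come from trauma registry or hospital discharge records, which is why this definition depends on the linkage or sampling approaches described in this summary.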
A survey of states indicated that data linkage is a priority for a majority of states. Those that are currently linking are generally doing so using probabilistic linkage methods, typically developed through an existing Crash Outcome Data Evaluation System (CODES) program. Probabilistic linkage is a method of estimating which cases in a pair of datasets refer to the same person, even when the datasets do not contain a unique identifier for those individuals. Probabilistic linkage was the focus of the CODES program and allows states to link datasets after the fact. A variety of alternatives to probabilistic linkage are being considered and tried in a number of states. However, at this time, no state has successfully implemented a non-probabilistic approach to statewide linkage.

This report presents a roadmap for states to develop comprehensive crash-related data linkage systems, with special attention to measuring serious injuries in crashes. The summary below provides a brief background on data and data linkage, and then presents a set of ten steps that states must complete to set up a linkage system at the state level.

Requirements for Linkage

Before presenting the roadmap, it is important to understand the four underlying requirements for linkage to occur. These include: 1) having statewide datasets in place, 2) having common identifiers in the datasets to be linked, 3) having access rules and a mechanism for controlling access, and 4) having a place or mechanism for storing the linked data.

Statewide databases are required for linking data at the state level, and these databases must also be evaluated in certain ways before linkage results can be understood. Characteristics of databases to be considered include coverage, schema, quality control, and timeliness. Coverage refers to the percent of possible reporting agencies that are participating or percent of possible cases that are included in the database. A schema is a codebook of variables, possible values, and formats. Variables and values should be complete and collected and reported in the same way across reporting units. Quality control (QC) is the process of checking consistency and completeness at the individual data-element level (e.g., percent missing data for different data elements; agreement between related data elements). Database quality issues will multiply through the linkage process, so it is critical to start with high-quality databases. Finally, databases should be quality-checked and available as quickly as possible. In some cases, linkage in near-real-time (e.g., within 24 hours) is possible. In others, linkage occurs on an annual cycle. Either way, the component databases need to be processed in a timely way to allow linkage to proceed in time for analysis and planning to be completed.

To link people in any pair of datasets, one or more common identifiers must be present in both datasets. The gold standard of identifiers is a single, unique, permanent, person-specific, alphanumeric identification code (ID) used in all datasets and assigned to all people in all datasets. However, several less ambitious forms of linking variables can also be used effectively. These include event-specific/person-specific identifiers or collections of non-unique but common identifiers that can be used in combination.

A good statewide data linkage system has rules for access as well as software to allow appropriate access and prevent inappropriate access. Rules for access must comply with the Health Insurance Portability and Accountability Act (HIPAA) and state law, and the level of access may be different for different individuals.
It is even possible that state laws that impede linkage may need to be changed.

Finally, the datasets must be stored in some way that allows future use of the linkage. A data warehouse is an integrated and standardized means of storing a variety of different datasets and allowing linkage between subsets of them. Access to only the de-identified linked data can be controlled through individual login and password in the data warehouse analysis software. Although a linked dataset can be stored separately, we recommend the data warehouse model because only one copy of each component dataset is kept and updated. That way, updates to the dataset are reflected in the linked dataset and copies do not proliferate. In addition, the owner of the original dataset can maintain control of the database.

Roadmap to Link Data

The roadmap consists of ten steps:

Step 1: Set up a system for collaboration and communication among all relevant agencies. This is often done through the Traffic Records Coordinating Committee (TRCC), but data linkage projects might also have a focused advisory group and specific buy-in. This group will need to assess whether there are legal hurdles to linkage and address these (possibly through changes to state law) early on.

Step 2: Catalog all available relevant databases. The catalog should include schemas, inclusion criteria, coverage, and quality issues. In some cases, datasets will have to be brought up to a higher level of quality or coverage before linkage can proceed.

Step 3: Determine which databases will be linked. A long-term plan for the order of adding linkages might be mapped out in this step, along with hurdles that need to be overcome for each. The most effective linkage between crash and medical diagnosis data will include EMS (crash-EMS-hospital/trauma).

Step 4: Identify the identifiers. This involves determining what variables the datasets have in common. In some cases, a unique identifier will be present, allowing direct linkage via the identifier. However, in most cases, a unique identifier will not be present and groups of common variables should be identified.

Step 5: Determine the linkage mechanism. If a unique, common identifier is available, linkage should be immediately possible. However, this is unlikely for most state crash databases. Instead, linkage will require the addition of a unique, common identifier to the data systems or probabilistic linkage, which uses a group of common (non-unique) identifiers. This report provides extensive discussion of potential linkage mechanisms that are used or being piloted by states. We are not aware of any state that is currently using a non-probabilistic approach to linkage involving crash data, though several are piloting methods or planning pilot tests. Two methods of after-the-fact linkage are probabilistic linkage and hand linkage. Other states are exploring ways to pass a unique identifier between databases to enable future linkage.

Probabilistic Linkage. Probabilistic linkage was the focus of the CODES program. This method uses common variables in two datasets that do not have a unique identifier in common to infer which cases are linked and assign a probability to that linkage. While probabilistic linkage is an excellent tool for linking datasets that do not share unique identifiers, the quality of the linked dataset needs to be assessed, just as the quality of the original datasets was. Poor-quality linkage can result in biased analysis results and requires more complex analytical techniques to overcome this limitation. Linkage quality metrics exist, but are not generally used by states.
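As an illustration of how common, non-unique variables can yield a match probability, here is a minimal Python sketch in the style of Fellegi-Sunter match weights. It is not the CODES software or any state's implementation. The field names, the m and u probabilities (the chance that a field agrees for true matches and for non-matches), the prior, and the threshold are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field (m, u) probabilities:
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELDS = {
    "birth_date": (0.95, 0.003),
    "sex": (0.98, 0.5),
    "event_date": (0.97, 0.01),
    "county": (0.90, 0.05),
}


def match_weight(a: dict, b: dict) -> float:
    """Sum log2 likelihood ratios over fields present in both records."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def match_probability(weight: float, prior: float) -> float:
    """Convert a match weight to a posterior probability given a prior match rate."""
    odds = (prior / (1 - prior)) * 2 ** weight
    return odds / (1 + odds)


def link(crash_people, hospital_people, prior=0.001, threshold=0.9):
    """Return (crash_id, hospital_id, probability) for each best pair above threshold."""
    links = []
    for c in crash_people:
        best = None
        for h in hospital_people:
            p = match_probability(match_weight(c, h), prior)
            if best is None or p > best[1]:
                best = (h["id"], p)
        if best is not None and best[1] >= threshold:
            links.append((c["id"], best[0], best[1]))
    return links


if __name__ == "__main__":
    crash = [{"id": "c1", "birth_date": "1980-01-01", "sex": "F",
              "event_date": "2020-05-01", "county": "A"}]
    hospital = [
        {"id": "h1", "birth_date": "1980-01-01", "sex": "F",
         "event_date": "2020-05-01", "county": "A"},
        {"id": "h2", "birth_date": "1975-06-30", "sex": "M",
         "event_date": "2020-05-01", "county": "A"},
    ]
    for crash_id, hospital_id, prob in link(crash, hospital):
        print(crash_id, hospital_id, round(prob, 4))
```

This sketch compares every crash record with every hospital record and uses exact agreement only. Production linkage tools typically add blocking to limit comparisons, partial-agreement scoring, and estimation of m and u from the data, and the retained probabilities are what allow linkage quality to be assessed afterward.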
Although software complexity has been an issue for many states, especially at the beginning, North Carolina (for example) has demonstrated that the method can be used in a timely fashion, producing a linked crash-EMS-trauma dataset for use in planning within two months after the dataset closes (e.g., at the end of the year).

Hand Linkage. Hand linkage is an after-the-fact method that is used to link EMS and trauma data in some states. The process uses software to identify a small set of possible cases in one database (e.g., a crash database) that might link to a single case in another database (e.g., trauma). The choice is then given to a human (e.g., the trauma registrar) who will make a judgment about which case is most likely to be the correct match.

Unique Identifiers. Other linkage methods involve assigning and passing an identifier between agencies (e.g., police and EMS) at the scene. The gold standard of identifiers is the person-specific/global identifier. This is a number that is assigned to an individual and follows them throughout datasets and time, across hospitals and even across different crash events. Two states are working towards this, but it is challenging to implement. A person-specific/event-specific identifier applies to a single person within a specific event. Often, these identifiers do not follow the individual through hospital transfers, though it may be possible to identify transfers after the fact. Finally, an event-specific identifier is assigned to all victims in a single crash event, and then individuals must be separately identified and linked after the fact. For example, Global Positioning System (GPS) location and crash report number are event-specific. Although further identification of individuals is required using probabilistic linkage, the event-specific ID will improve probabilistic linkage quality.

Step 6: Determine a storage mechanism for the linkage. The recommended approach is the data warehouse.
The data warehouse typically consists of a front end (reporting or analysis tools), software that can access component databases, and translation software custom-built to standardize each component database. The included databases do not need to be stored in any specific location, and updates to the original database will be reflected in the data warehouse. The warehouse can link databases that have a common unique identifier. With this approach, a unique identifier, either passed at the scene or added after the fact via probabilistic or hand linkage, is incorporated in the original datasets, wherever they reside.

The data warehouse approach requires some up-front costs to write the translation software and incorporate each database. However, the approach has several advantages. First, it does not proliferate copies of the original datasets, but instead translates the original, allowing for changes to be made in only one place. Second, it allows the owner agency to retain control over the original dataset and its versions and corrections. Third, the data warehouse provides a single system controlling user access on a database-by-database and even variable-by-variable basis. Finally, it allows users to use one software tool for reporting and analysis for all included databases.

An alternative used in some states is to store the linked database separately. The advantage of this approach is reduced cost, but the disadvantage is that the database is static and therefore does not automatically reflect any changes to the original. The database itself may fall under different laws than the original crash dataset because of the inclusion of medical information, and this may create access issues.

Step 7: Harmonize common data elements. For databases to be linked, all common variables must have the same schema. This includes variable formats as well as codes and values. Wherever possible, it is best to standardize to a national schema or data standard, such as the National EMS Information System (NEMSIS) or the Model Minimum Uniform Crash Criteria (MMUCC).
This facilitates linkage to other databases that are harmonized with those standards (e.g., National Trauma Data Bank (NTDB)) and allows for re-use of existing Extensible Markup Language (XML), tools, and training materials.

Step 8: Set up a pilot project. A pilot of an assigned-on-scene identifier might occur in a limited geographical area. A pilot of probabilistic linkage should be done on at least one year of whole-state datasets because the method’s success depends on dataset size and the contents of variables. Piloting allows identification of logistical and dataset issues before a full-scale effort is launched.

Step 9: Set up a sampling program. Step 9 is technically optional, but potentially very useful. We recommend setting up a sampling program where medical outcome is obtained for a subset of crash-involved people. This sample provides an estimate of serious injuries in crashes that can be used for performance metrics, without implementing a statewide linkage program. Moreover, it provides a way of evaluating the outcome of a developing linkage system. Early in any linkage process, it is likely that the resulting linked dataset will be biased or of low quality. An independent sample can provide accurate numbers for planning while setting the bar for a more comprehensive approach.

Step 10: Statewide linkage. Finally, the last step is to set up a full-scale statewide linkage program. It should be an expanded version of the pilot program, and any issues that arose during piloting should be resolved.

Recommendations

In carrying out a data linkage program at the state level, there are some challenges that states will have to face. In some cases, finding answers to these challenges at a national level will be efficient. In other cases, state-specific challenges will need to be addressed individually. The following recommendations revolve around issues that would be most effective to address at a centralized level, so that all states can benefit.

1. Development of a national crash data schema and corresponding XML, based on MMUCC, which would provide the same benefits that NEMSIS has provided to the EMS community. In particular, such a schema should be designed to incorporate MMUCC and additional state-specific variables, and to facilitate linkage to NEMSIS and NTDB schemas.
2. Development of clear methods and criteria for testing quality of linkage systems (probabilistic or otherwise). Levels of linkage quality (in terms of bias, accuracy, and completeness) should also be associated with guidance in how to analyze the data and how to improve linkage quality.
3. Development of a repository for lessons learned, methods used (including those tried and rejected), and contacts in states that can provide advice. This should include (but not be limited to):
   a) Lists of variables states use for probabilistic linkage (if appropriate) and linkage success;
   b) Software available and algorithms used for probabilistic linkage, along with the pros and cons of each;
   c) Successes and failures of non-probabilistic linkage approaches;
   d) Background on the data warehouse model and how to build one over time;
   e) Lists of vendors used by states for different elements of the data linkage process; and
   f) Contact information for individuals involved in state data linkage projects to provide assistance or advice.
4. Development of marketing materials that TRCCs can use to advertise the benefits of linkage to all groups that need to be involved. Coordination of a message at the national level would be helpful to gain the involvement of agencies that are less accustomed to working together (e.g., state health agencies and state departments of transportation).
5. Development and hosting of workshops for state data holders to learn about linkage approaches and discuss challenges with other states.
6. Generate a clear, written interpretation of HIPAA in the context of data linkage that defines clearly what mechanisms must be put in place to link data and still maintain HIPAA compliance. While HIPAA does not prevent data linkage or even including linked (de-identified) data in a state data warehouse, it does put additional security requirements on datasets that include such information.
7. Investigate the potential for vehicle-to-vehicle (V2V) communication to aid in passing identifiers on the scene. This should include assessment of what an application would need to do, potential hurdles in implementation, and estimated short-term (software development) and long-term costs. This project could also investigate the general problem of using event-specific (but not person-specific) identifiers to improve probabilistic linkage among occupants within the event. Such work could be applied to other event-specific linkage approaches (such as passing crash report number to EMS and trauma databases).
8. Develop a more detailed sampling protocol that includes costs of sampling and estimates of sample size needed for a set of target analyses. A pilot sampling project should be included to ensure that logistical challenges and costs are fully identified.
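The sampling approach recommended in this summary can be sketched with a standard stratified estimator of a total. The Python example below is illustrative only and is not the report's protocol. It assumes strata defined by police-reported KABCO level, simple random sampling without replacement within each stratum, and invented counts. The class `Stratum` and the function `estimate_serious` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Stratum:
    name: str          # hypothetical stratum label, e.g. a KABCO level
    population: int    # crash-involved people in the stratum (N_h)
    sampled: int       # people whose hospital records were reviewed (n_h)
    serious: int       # sampled people meeting the MAIS 3+ definition


def estimate_serious(strata: Iterable[Stratum]) -> Tuple[float, float]:
    """Return (estimated total serious injuries, standard error)."""
    total = 0.0
    variance = 0.0
    for s in strata:
        if s.sampled < 2 or s.sampled > s.population:
            raise ValueError(f"stratum {s.name!r} needs 2 <= sampled <= population")
        p = s.serious / s.sampled                  # within-stratum proportion
        total += s.population * p                  # expand to the stratum total
        fpc = 1 - s.sampled / s.population         # finite population correction
        variance += s.population ** 2 * fpc * p * (1 - p) / (s.sampled - 1)
    return total, math.sqrt(variance)


if __name__ == "__main__":
    # Invented counts for illustration only.
    strata = [
        Stratum("A", population=1000, sampled=200, serious=80),
        Stratum("B", population=5000, sampled=100, serious=5),
    ]
    estimate, std_err = estimate_serious(strata)
    print(round(estimate), round(std_err, 1))
```

Sampling the higher-severity strata more heavily, as in the invented numbers above, is one way to keep the standard error manageable for a fixed review budget. The same variance formula can be inverted to explore the sample sizes that a target precision would require.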


The Moving Ahead for Progress in the 21st Century Act (MAP-21) requires a set of performance metrics to include assessment of serious injuries in crashes.

The TRB National Cooperative Highway Research Program's NCHRP Web-Only Document 302: Development of a Comprehensive Approach for Serious Traffic Crash Injury Measurement and Reporting Systems presents a roadmap for states to develop comprehensive crash-related data linkage systems, with special attention to measuring serious injuries in crashes.
