When monitoring and assessing MCM use during a PHE, different stakeholders have different data and information needs (see Stakeholder Perspectives in Chapter 2). Throughout the workshop, panelists and participants discussed potential sources of data to answer questions important to monitoring and assessing MCM use and opportunities and challenges around their use. This chapter is organized conceptually according to the following four areas:
- Defining and answering questions to inform data needs for MCM distribution and monitoring;
- Considerations and approaches to data collection;
- Existing data sources and datasets; and
- Disseminating data and information.
Yu called on panelists to describe, from their sector’s perspective, key questions that should be asked and the corresponding data needed to inform the monitoring and assessment of MCM use when responding to a PHE. The bidirectionality between crafting the right questions and understanding the applicability of data sources to answer those questions is critical to monitoring and assessment efforts, she said. Key takeaways from individual workshop panelists and participants are described throughout this section.
Operational Questions That Drive Data Needs
From his experience managing cross-state EHR databases with upward of 15 million patients and fielding data requests from researchers, Wilcox noted the importance of understanding what data are needed to answer pre-defined questions. He referred participants to the 2013 PCORI Methodology Report,1 which stated that problems can arise when research questions are driven by the availability of data, rather than by what needs to be known.
From the federal standpoint, Patel said, broad operational questions apply to MCMs in every PHE response that must be answered before a distribution plan can be developed:
- Does the “right” product (or products) exist?
- Is there sufficient confidence in the safety and efficacy profile of the MCM to proceed with its deployment?
- Are there adequate supplies to deploy the MCM (i.e., is the product commercially available, or does it need to be stockpiled)?
- What regulatory considerations apply to distribution and use of the MCM (e.g., EUA, investigational new drug application, mass dispensing orders)?
Once these operational questions have been considered, a distribution plan can be crafted to support the use of the MCMs as governed by the appropriate regulatory mechanism. Decisions made regarding access to MCMs are fully contingent on the safety and efficacy profile of the products (see Box 3-1 for a case study of this issue) and include the following considerations, noted Patel and Petersen:
- How should the MCM be made accessible (e.g., access points, level of triage)?
- If an MCM is in limited supply, what are the ethical considerations underlying how distribution should be prioritized to populations with the greatest potential benefit, and how can the decision-making process be made as transparent as possible?
- What is the risk communication and action plan for the public, including consideration of the current acceptance level of the MCM by patients and providers?
- How can administration, adherence, and compliance be tracked? What communications channels will deliver data back to decision makers (i.e., patient to provider to local, regional, or state level health agencies), and what is the level of understanding by patients and providers for reporting this information?
1 The PCORI Methodology Report is available at http://www.pcori.org/sites/default/files/PCORI-Methodology-Report.pdf (accessed August 23, 2017). This report is currently under revision with an updated version forthcoming.
Data Needs and Data Elements
As summarized by Wilcox, three key questions to consider when determining data needs for monitoring and assessing MCM use are:
- What data do you need based on questions that have been identified?
- How can you feed the data to the requisite stakeholders as quickly as possible?
- What is a good metric for determining if you are using the right data?
Certain baseline data are needed for any PHE, said Patel; however, depending on the specifics of the PHE (e.g., the severity and accompanying benefit–risk profile of an MCM), additional categories of information may be required. Henry “Skip” Francis, director of Data Mining and Informatics Evaluation and Research at the Center for Drug Evaluation and Research at FDA, added that different stakeholders need different types of data. For example, at a national level, organizations such as the Biomedical Advanced Research and Development Authority (BARDA) need data on cases, locations, and association of a PHE to the MCMs available in the Strategic National Stockpile (SNS). Hospitals have a more situational and regional focus and need data on cases relevant to their area, he added, and FDA looks at as many information sources as possible. Three main categories of data discussed by workshop panelists and participants for answering key operational questions for MCM monitoring and assessment included medical history data, symptomatology data, and data to inform threat containment.
Ataher noted that in many PHEs, there may be limited safety and efficacy information available for an MCM. The product may be FDA approved, but it is being distributed for an unapproved indication, and/or the product may be dispensed in a less than rigorously monitored setting (e.g., a POD). To fill this data gap for MCMs that may only have investigational-level data, one should consider the short-term data needs for dispensing the MCM and the long-term data needs for properly assessing the safety and effectiveness of the MCM for the indication for which it is being prescribed or distributed, added Ataher.
Baseline medical information is needed on the patient at the point of distribution in an emergency situation, said Ataher, including both prior medical conditions and symptomatology (see next section). EHRs may contain this information for the affected population and could be accessed after the event. In most cases, it is unlikely at the present time that EHRs will be readily available in an emergency situation to inform MCM treatment decisions, he added. However, Cobb remarked that stakeholders should continue to look for ways to better incorporate EHRs into PHE preparedness and response efforts.
Cullen highlighted the need to better capture symptomatology data. Data on diagnoses, treatments, laboratory test results, and vital signs are generally well captured. Symptomatology is more challenging to capture as definitions are not standardized (e.g., one provider’s definition of a cough may be different than another’s). In the early stages of an event, a patient’s clinical course and their symptomatic response to an intervention are important, Cullen said.
Petersen suggested leveraging existing IT resources for syndromic surveillance and monitoring that are already available at local health care facilities to help inform decision making, and Levy suggested that 911 call dispatch records could be tapped for symptomatology data (see Existing Data Sources and Datasets later in this chapter). Levy added that data from 911 calls are used by local public health departments when dealing with a variety of incidents, including monitoring for carbon monoxide poisoning, power outages, or potential Ebola patients.
In the face of no available MCMs, Lee highlighted the diverse types of data that could inform the containment of a pandemic or emergent threat. Data of interest could include epidemiology data from the affected area, human travel patterns, social behavior patterns, clinical data, and information about vectors and the disease cycle, as well as information about the environmental conditions that support the disease cycle. These data provide a picture of the population and how they interact, allowing operational aspects to be overlaid to determine how to achieve containment. Containment was critical at the start of the Zika outbreak, for example, when no MCMs were available, she said.
Just because data can be collected does not mean they should be collected, said workshop participant Sheldon Jacobson at the University of Illinois. Furthermore, he added, data are not evidence; they are potential for evidence. Not all data are created equal, Wilcox said, and data can generally be divided into two categories based on their collection and analysis:
- Data that are collected for a specific purpose following a predefined methodology (prospective data collection and analysis), and
- Data that happen to be available because there are systems that collect them, most likely for a purpose other than surveillance (retrospective data collection and analysis).
It is important to recognize the difference between these two types of data, develop ways to segment data based on this criterion, and determine which data can be used in what ways, said Wilcox. Exemplifying these points, Lance shared an example of an all-hazards approach to data collection in New York State from which lessons could be learned about systems for collecting and disseminating data (see Box 3-2). Throughout the workshop, individual workshop panelists and participants highlighted important considerations when collecting data to answer specific questions (see Operational Questions That Drive Data Needs earlier in this chapter), as well as approaches for collecting those data.
Understanding the End User of Data
When determining what data infrastructure is needed for monitoring MCM use, Jeff Brown, associate professor at Harvard Pilgrim Healthcare Institute, suggested considering three questions: Who is the user or the audience? What questions do they need to answer, and with what level of precision? How quickly do they need the information? Answering these three questions will determine the type of information to obtain, he said. If the users need the rigor of a clinical trial, they must understand it will take months to years to obtain the data. If that level of rigor is not needed, and the users can work with data that show signals and trends, there may be data already being collected (i.e., immediately available) that could be relevant. It is a matter of matching the question to the data and to the method and the timing.
Determining the Physical Location for Point-of-Care Data Collection
Rather than focusing efforts on data collection at “outpatient” MCM dispensing centers (e.g., PODs), where there is a lack of infrastructure and resources to collect these data, Petersen suggested that efforts be focused on data collection by health care providers, such as those in physicians’ offices or hospitals. Tools are already available for tracking upticks in admission and adverse events at these locations, he added, and they could be leveraged for monitoring and assessing MCM use. Lee proposed an intermediate solution of registering individuals who are treated at a POD and having that registration linked to their EHR and the corresponding immunization registry, as appropriate.
Aligning Data Collection with Timing and Sequence of PHEs
Data are often collected late in the chain of events because they are easy to collect at that point, but they are often more valuable early in the course of a PHE, said Jacobson. Yu observed that some key data elements are needed in support of clinical endpoints (e.g., for registration-enabling studies or post-marketing requirements or commitments) that can only be collected during the course of a PHE, that is, at the time of disease onset or exposure.
Multiple sources of data could be used in parallel during a PHE, said Yu. Collection from one data source could result in a sequence of events leading to different types of data collection, she noted, including observational studies, patient registries, electronic health data, big data, and clinical studies. In terms of how these different data streams are used, timing of data collection becomes an important issue, said Cullen. Can a system be designed so that the data points, the manner in which they are collected, and the way in which they are aggregated are responsive to the timing needs of each?
Lee emphasized the need for a feedback system to facilitate continuous improvement and real-time decision making and that data modelers must learn to manage evolving data and real-time information. Francis added that databases must adapt over time as an event evolves (see Box 3-3). Having a feedback system in the data collection and analysis process is valuable, Lee added. Analyzing data early during the course of a PHE can help to prioritize data needs and detect key data elements, allowing for allocation of resources to collect those elements. A feedback system could also facilitate adaptation of data models as new information is incorporated.
Patel emphasized the need for better integration of all levels of data from PHE responses into the evolving response structure. These data could be better integrated into distribution, dispensing, and upstream activities from MCM development to regulatory decisions. For example, the National Collaborative for Biopreparedness, funded by the U.S. Department of Homeland Security, has developed a system that collects and analyzes EMS data, 911 call center data, emergency department data, and other relevant information. This system both aids syndromic surveillance and establishes baselines, and it has fairly sophisticated analytic tools that can detect deviations.2
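The idea of establishing a baseline and then detecting deviations from it can be illustrated with a minimal sketch. This is not the National Collaborative for Biopreparedness system, whose methods were not described at the workshop; the function name `flag_deviations`, the 14-day baseline window, the threshold of three standard deviations, and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_deviations(daily_counts, baseline_days=14, threshold=3.0):
    """Return indices of days whose count is well above the recent baseline.

    Hypothetical illustration: the baseline for each day is the mean of the
    preceding `baseline_days` counts, and a day is flagged when its count
    exceeds that mean by more than `threshold` standard deviations.
    """
    flagged = []
    for i in range(baseline_days, len(daily_counts)):
        window = daily_counts[i - baseline_days:i]
        baseline = mean(window)
        # Floor the spread at 1.0 so a perfectly flat baseline does not
        # cause every small fluctuation to be flagged (an assumption).
        spread = max(stdev(window), 1.0)
        if daily_counts[i] > baseline + threshold * spread:
            flagged.append(i)
    return flagged


# Made-up daily counts of, say, 911 calls matching a symptom category.
counts = [10, 12, 11, 9, 10, 11, 12, 10, 9, 11, 10, 12, 11, 10, 11, 30]
print(flag_deviations(counts))
```

In this made-up series only the final day stands out from the preceding two weeks. A real surveillance system would also need to account for seasonality, day-of-week effects, and reporting delays, none of which this sketch attempts.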
Identifying Missing or Unknown Data
Scott Proestel, director of the Division of Epidemiology, Office of Biostatistics and Epidemiology at the Center for Biologics Evaluations and Research at FDA, remarked that a key challenge in data collection efforts is capturing data that are missing or unknown. Lee added that, in some cases, data that are not collected can also provide helpful information. For example, patients not returning for follow-up care could indicate they are doing well after the intervention. It is important to use both known and unknown data to predict what is happening in the population, Lee said.
Recognizing which data are missing, and finding ways to obtain them, is also important, said workshop participant Harlan Dolgin of the Bio-Defense Network. For example, if an MCM for anthrax were distributed at PODs, he said, important data points would be how many affected persons did not come to the PODs, the adherence rate of those who did receive treatment at a POD, and how many people experienced adverse effects of the MCM but did not report them to a doctor. Avenues such as social media and polling could be helpful to collect these data after MCM distribution, he said.
Using Machine Learning and Artificial Intelligence
Francis suggested that machine learning strategies or artificial intelligence could identify data sources for MCM monitoring and assessment that would not normally be considered. Joe Vasey, an epidemiologist and biostatistician with Practice Fusion, observed that some artificial intelligence and deep learning technologies are already being used in other fields, such as finance or defense, which also work with large amounts of “noisy” data. The question raised is whether and how such technologies could be applied to health data. Lee said that most models are designed for a specific type of data, instead of using all available data in a systematic and integrative way. Artificial intelligence technology is not yet developed enough for health care data, Lee said, which are noisier than other sectors’ data.
Modeling and Interpreting the Data
Levy said having better methods for modeling data before a disaster occurs would be helpful, as response plans are often based on these data-modeling assumptions. She suggested local partnerships with academic research centers to test and validate assumptions prior to a PHE and noted the need to have all relevant stakeholders at the table when data modeling is developed. Rhona Cooper, public health preparedness clinical coordinator with the Philadelphia Department of Public Health, shared an example of an algorithm for a dual-model anthrax response composed of a flow chart with basic questions to be answered by entering data into fields in a database. This algorithm drives the production of a database that is usable at every level, she said.
When developing systems to monitor or detect safety signals, it is important to understand the expected error rate, said Wilcox. When tracking the safety or efficacy of a medical product, he emphasized, it is critical to predict what the expected rates should be. Until you observe a signal above the expected error rate, he added, “Your problem isn’t that you don’t have a problem, your problem is that you can’t see it,” and the monitoring system is ineffectual.
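Wilcox's point about expected rates can be made concrete with a small worked sketch. It assumes, purely for illustration, that adverse event counts in a monitoring period follow a Poisson distribution around a known expected count; the function names `poisson_tail` and `is_signal`, the 0.01 cutoff, and the example numbers are hypothetical and do not describe any system discussed at the workshop.

```python
import math


def poisson_tail(observed, expected):
    """Probability of seeing `observed` or more events when `expected`
    events are anticipated, under an assumed Poisson model."""
    if observed <= 0:
        return 1.0
    # Accumulate P(X = k) for k = 0 .. observed - 1, then take the complement.
    term = math.exp(-expected)
    cumulative = term
    for k in range(1, observed):
        term *= expected / k
        cumulative += term
    return max(0.0, 1.0 - cumulative)


def is_signal(observed, expected, alpha=0.01):
    """Flag a count as a signal if it would be very unlikely (probability
    below `alpha`) were the true rate equal to the expected rate."""
    return poisson_tail(observed, expected) < alpha


# Hypothetical example: 4 adverse event reports are expected per week.
print(is_signal(5, 4.0))   # slightly above expectation
print(is_signal(12, 4.0))  # far above expectation
```

Under these assumptions, 5 reports against an expectation of 4 is unremarkable, whereas 12 reports would be flagged. Without an estimate of the expected count, neither observation could be interpreted, which is the difficulty the paragraph above describes.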
Stakeholders should have a robust understanding of existing data sources, what the capability and interoperability of data systems are, who has access to these data systems, and how existing technology can be used in different ways, said Patel. Ataher added that predesigning programs for rapid queries of multiple datasets in parallel could pool requisite information quickly. Because current data collection systems were not designed for monitoring medical products during a crisis situation, however, rapidly deployable health IT architectural designs should be developed and retrofitted into existing systems, as appropriate, to detect and report unexpected signals, said Cullen (see Box 3-4 for a case study of this issue). Workshop panelists and participants discussed some of the many types of data and existing datasets that could be leveraged for monitoring and assessing the outcomes of MCM use, including patient narratives, EHRs, pharmacy databases, federal surveillance systems, big data, and social media.
Point-of-Care Patient Narrative
From a local health department point of view, Cooper noted that the most raw source of data is the patient interview. Stakeholders should consider the following questions when soliciting data from a patient: Who is the patient? What do they need? What are their individual issues? She noted that Medical Reserve Corps volunteers are trained to conduct patient interviews and enter the resulting data into an electronic database, and patients’ driver’s licenses are scanned to automatically enter demographic information.
Electronic Health Records
Data in EHRs are collected purposefully and are usually the most consistent and least biased data sources, said Wilcox. However, EHR data are time intensive to collect, so they may not be as expansive as other data sources, and they are not amenable to large pattern-recognition algorithms because they are retrospectively analyzed, he added. Although there is a place for EHR data in conducting research on MCMs in PHEs, it is not in determining safety and efficacy, Higgs said. EHR systems were built to assist the physician and not the researcher, said Vasey, and data are generally not collected in the same way they would be collected for research purposes. In general, Francis added, EHR systems work well for claims data, but it is not always clear if the right medical information is being collected. Lee also noted that EHR systems include many unstructured tags, meaning there could potentially be hundreds of different ways for providers to characterize the same thing. It takes significant text-mining to merge all of these data, Lee observed. Furthermore, some EHR formats force providers to enter data at a level of specificity that may run counter to the care delivery goals in a PHE, noted Francis, because of time constraints at the point of care and in following up on data entries.
Groom of the Indian Health Service reiterated that EHR systems were designed for billing, not public health, although they have been evolving in response to Centers for Medicare & Medicaid Services (CMS) meaningful use requirements. She suggested the need for policies that would enable EHR vendors to respond in an emergency situation, assist with the development of solutions that could be broadly shared, and eliminate the need for third-party workarounds to extract data from the systems. Lushniak said that, in his experience, EHR vendors are very interested in helping public health move forward, but there are technical barriers that make it very hard to be nimble. The deployment of new software, for example, is subject to regulatory and certification processes. There is also tremendous variability in how institutions implement EHR systems, making it unlikely that any single solution could be deployed effectively across diverse systems.
Despite these limitations, Cobb noted that EHRs will continue to evolve to better serve the needs of preparedness and response efforts. One workshop participant noted that EHR data can help to establish baselines (e.g., baseline morbidity and mortality rates in the affected population). Baseline data are critical to interpreting the clinical implications of an emerging infectious disease outbreak, and they could inform adaptive clinical trial designs and allow for some level of generalizability of trial findings beyond the trial setting. Maher added that EHR data could detect important signals to help define the focus of response preparations, such as what protocols to develop and where to preposition them to enhance response in an event. Higgs noted EHR data might be helpful in supporting observational studies in the real-world setting in order to identify additional indications or patient populations.
Cloud-Based EHR Systems
Cloud-based EHR systems represent an opportunity for more rapid transfer of information collected during PHEs. Cloud-based EHR systems are more adaptable and flexible than systems based on servers, noted Vasey, and the continuous data feed that results from entries by system users allows for MCM monitoring. Modules can be added to the system to monitor for certain signals in real time and to provide relevant educational materials for users. During the Zika epidemic, for example, Practice Fusion’s cloud-based EHR system allowed for a module of educational materials to be added for providers, said Vasey. Providers were also asked questions about their experiences and actions related to Zika. For future influenza outbreaks, cloud-based EHR systems can provide day-by-day monitoring of where illness is being reported, who is becoming ill, how many people are ill, what interventions are given, and so forth. He added that routine adverse event monitoring for medications allows physicians to record the adverse event occurrence in the EHR in real time, as they are talking to a patient. Another analogous system from which to derive lessons learned could be the Web-based countermeasure and response administration system being developed by CDC,3 which is customizable to a specific disaster scenario.
Government Surveillance and Tracking Systems
Brown highlighted Sentinel, FDA’s network of safety surveillance for the medical products it regulates, as a system that is currently leveraged by FDA in emergency situations. Systems such as this can be designed to be responsive in an emergency scenario when a PHE is declared and information is needed quickly. Platt noted that although the FDA Sentinel system is a public health initiative, it is interpreted as research by many stakeholders. He recalled that one-on-one meetings and calls were held with the IRBs of the 20 systems that participate in Sentinel to explain that IRBs do not have jurisdiction over Sentinel’s data collection. Letters were obtained from the HHS Office for Human Research Protections confirming that Sentinel was not research covered by the Common Rule and from FDA stating that Sentinel was public health practice.
Cooper noted the lack of interoperability between concurrently utilized systems, such as the Pennsylvania Statewide Immunization Information System and the Knowledge Center (a software platform for real-time incident management). She mentioned that barcoded wristbands are used in mass casualty responses in Philadelphia through the Knowledge Center capabilities, allowing hospitals and EMS to communicate patient information. However, health officials have not yet been able to feed these data back into local PHE responses.
When the Philadelphia Department of Public Health activates a POD or conducts a POD exercise, Cooper said, it is required to create a new database using the Microsoft Access system. This is time consuming, she said, and during a PHE there is neither the time nor expertise to create more than a primitive database using questions and algorithms from federal sources. Furthermore, these databases currently lack long-term follow-up components (e.g., next vaccination appointment). She suggested the development of ready-made databases for already stockpiled, available, or approved MCMs that any local jurisdiction could use depending on the scenario.
Wilcox suggested that a potentially useful aspect of big data, in addition to mining data to answer specific questions, is the potential to identify changes in data flows and detect when data are changing in an unexpected way. Patel highlighted the need for innovative ways to look at the vast volumes of existing, unstructured big data and consider the interoperability between these data and other data systems. She referred workshop participants to a recent study of Yelp reviews of foodservice businesses that included reports of food-borne illness (Nsoesie et al., 2014), and asked whether and how such an approach could be applied to MCM monitoring and assessment: What information could be gleaned from big data, and is this type of data robust enough to inform monitoring and assessment efforts?
Emergency Medical Services
Workshop participant Rob Lawrence of the National Association of Emergency Medical Technicians highlighted the potential role of EMS in MCM monitoring and data collection. For example, EMS works at the epicenter of the opioid crisis, he said, and has the ability to conduct biosurveillance and syndromic surveillance to identify the point of consumption of opioids. A significant amount of data could aid opioid taskforces that are under way, he said. EMS also works closely with local public health departments. From a syndromic surveillance perspective, he said, EMS is often the first to see the signs of what is to come.
An example of an existing system that could be better leveraged is the pharmacy data system, Brown said. Pharmacy data systems are incredibly good in the United States, he said, and operate essentially in real time. Many people now get flu vaccinations at their pharmacy, and the data system could provide an almost “live” view of the status of flu vaccination across the country. He suggested that pharmacies could potentially be leveraged for mass vaccinations or mass dispensing, as an appropriate system for collecting dispensing data is already in place.
There is untapped potential in patient-collected data, whether they are collected by a mobile health application (see next section) or from social media, said Vasey. For example, social media data can be useful for reconstructing an epidemic, added Francis. Bakken shared a recent article on content and structural mining of tweets during the Ebola outbreak (Odlum and Yoon, 2015). The authors were able to collect data about public knowledge, sentiment, and the spread of information during the outbreak. Bakken emphasized that from the perspective of health equity, it is important to understand the demographics of a social media outlet when mining data or when considering using social media as an intervention strategy. For example, data suggest that Latinos and African Americans use Twitter more heavily than whites, she noted. Twitter provides a daily public sample that can be downloaded and used retrospectively, Bakken added, and Twitter data are not necessarily unstructured: the use of hashtags in tweets, for example, would be considered structured. She reiterated the importance of understanding the user demographics of any technology to be aware of how the sample might be skewed.
Lee noted that younger generations seem eager to share their experiences and voice their opinions, and they can often be engaged to complete online surveys or post comments as a follow-up to medical interventions. A challenge is how to merge such data with more objectively collected data, she said, given that they are highly unstandardized and difficult to fit into a structured framework that makes them amenable to analytic tools. Vasey suggested working with hardware manufacturers and software developers to make those data sources more capable of being integrated with other data sources (e.g., EHR or imaging data). Greg Burel, director of the CDC Division of SNS, suggested that social media could be mined to gather data on how receptive people are to taking the MCMs that were dispensed to them. For example, opinion polling suggests that the public will accept MCMs that are provided to them, but they will wait to see if they get sick before taking them. He pointed out that the issue of monitoring MCM use is not just what happens after people take MCMs, but also whether they take them at all, a question that could potentially be answered by social media.
Mobile Health Applications
The more nimble, user-friendly capabilities of mobile health applications on portable devices (smartphones in particular) provide opportunities to engage the public and collect data from individuals, said Cobb. Lee added that countries in Africa rely heavily on smartphone capabilities for health care events (e.g., screening, sharing test results, and follow-up activities). She suggested that the United States could similarly use this technology, though there are potential hurdles in terms of patient confidentiality concerns. Some U.S. patients are already connected to their providers through remote patient-monitoring devices for chronic diseases (e.g., blood glucose meters that automatically send a patient’s readings to their provider). Although user-based capabilities will not be evenly distributed across sociodemographic lines, Lee said, it is important to move forward and start to build the networks of knowledge and capability.
Delivering Data to the Correct End-User
How can the right data get to the people who could use them the best? What data are on-the-ground responders receiving? How are they using the data, and do the data look correct to them? The closer the data are to the point of care, the better, Wilcox said. It is critical that the people who are most familiar with what the data represent have the tools to navigate them, Wilcox added, and in many cases, stakeholders do not have the capability to query relevant datasets to meet their needs. Ataher added that just having data is not enough: data need to be in the hands of the right person, someone who is capable of understanding, analyzing, and using the data. Wilcox also pointed out that sharing data across organizations is good, but linking volumes of standardized data does not necessarily mean there are more comprehensive data on any individual.
Direct Access to Information Sources
Levy pointed out that in a PHE, such as the recent Zika and Ebola outbreaks, information about the threat, MCMs, and the impact on the population comes through surveillance systems, to CDC, then to Public Health Emergency Preparedness awardees, and then to the local-level partners. At each level, nuances are lost, she said, and local health departments do not have the ability to directly question those who performed the research and the data collection. This inability handicaps local decision making, and she advocated for the local level to have more direct access to the source of information.
Petersen highlighted the challenges of dealing with guidance changes and updates during a PHE, which affects the implementation of plans and operations at the state and local levels. He called for better processes for disseminating guidance updates and informing the affected population, including clinicians who have to implement the guidance. For example, over the course of the Ebola outbreak, the guidance for use of personal protective equipment changed, and there was confusion about how organizations should be protecting health care workers.
Maher asked workshop participants how FDA and other agencies could better communicate changes in guidance and communicate information and questions upstream from local health systems to federal agencies. A variety of tools are available to gather information from local providers and health care facilities, Petersen said. For example, the Tennessee Joint Information Center works with health and association partners to push information out to stakeholders, and information is gathered through the Emergency Operations Center. Information gathered through these mechanisms is shared through webinars, conference calls, and other approaches.
Collection of health data is different from the collection of data in other sectors. Runnels summarized some of the challenges around MCM data collection that were raised throughout the workshop discussions, including lack of standardization of health data; the difficulties of connecting databases and integrating data (and a variety of challenges specific to EHR systems); the dispersal of the data across many different sources during an event; the reliability and validity of the data; shortcomings of crowdsourcing and data mining; use and availability of analytical tools; and managing structured versus unstructured data. Some of these barriers and potential solutions for addressing them are detailed throughout this section.
Time and Resource Limitations
Other challenges to data collection during PHEs, said Amanda Peppercorn, senior medical director in Infectious Disease Research and Development at GlaxoSmithKline (GSK), are that health systems are stressed and physicians in a critical care situation do not have the capacity to enter large amounts of data in real time. As a potential solution, she said, GSK is now working with a clinical research organization to keep a particular MCM protocol up and running for the immediate future in order to test the protocol during an appropriate PHE.
Local health departments do not have the funding for sophisticated data collection, nor do they have the personnel to provide the epidemiological oversight needed to answer the questions asked by researchers, said Cooper. She emphasized that most local health systems do not have the capacity to collect the data requested of them by those who wish to analyze the data, and described data collection in MCM dispensing operations as being in its early days. She cautioned that unless local health departments receive better guidance and assistance, they will continue to use unstandardized, self-created data collection formats. Staff can be trained to use basic electronic data systems, but local health departments generally have limited ability to collect data and could use support in this area. Researchers should be clear regarding what they want to know, Cooper added, and local jurisdictions can encompass those fields within their databases for their response operations. Ideally, she said, researchers would provide local-level health departments with all-hazards or MCM-specific databases.
Workshop participant Nick Boukas of the National Association of County and City Health Officials added that medium, small, and rural health agencies do not have the funding to make use of cloud-based IT systems to collect MCM data. In addition, they often do not have a full-time staff person dedicated to preparedness activities. Many county and city health officials serve dual roles; for example, the environmental health director might also serve as the preparedness coordinator and cover other assigned duties. Many of these communities do not have academic institutions they can partner with for analysis of the data they collect. Boukas said his organization has been promoting a regional approach so that more rural communities have access to an academic institution that is not in their community, or even in their region. Boukas reminded participants that, although workshop discussions have been about what data need to be collected, large segments of the country are largely unable to collect the requisite data at this point. This gap needs to be addressed so that smaller health jurisdictions can collect and report their data and understand what the data mean for them on the local level.
The health care system in the United States is complex and heterogeneous, said Lee. There is no one unified system with common standards for collecting and sharing data. Even local health departments within a state, or individual hospitals under the same network, face challenges sharing information, she added. There are many operational limitations to be overcome, and the barriers are both vertical and horizontal, but progress must be made and should not be stunted by the search for the perfect solution.
A workshop participant said that different researchers, with the same or similar technology, might take different approaches to address the same questions and may define the necessary data elements differently. Similarly, community physicians (during an emergency response or for routine health care) have multiple ways to answer the same set of questions. The focus needs to be on first defining the questions and then collecting data to answer them.
There is also a range of idiosyncrasies across the various health data streams, Vasey said, and the way each individual or system collects data is different. When designing and developing data collection systems, it is important to keep in mind the multiple potential uses for those systems, he added.
Health data differ from other sectors’ data because they stem from the human condition; Francis observed that, in 30 years of clinical practice, he has never seen pneumonia present in exactly the same way. General surveillance of EHR data for a particular event can be very difficult because of that variety in symptomatology. The diversity of the treatment effect within patient populations also causes a lot of confusion and fuzziness in the data, Francis said. Nonetheless, noted Vasey and Cooper, focusing on a particular MCM could make it somewhat easier to define a set of data that might be useful for monitoring and assessment.
Policy and Regulation (Dis)Incentives
Workshop participant Jessica Keralis of the Cadence Group suggested that lack of political will is a barrier to data collection. Political buy-in is needed to support the development of data systems that can function toward public health preparedness. Money follows politics, she said, and the form of the data collection systems follows the money. For example, EHRs were designed for billing, and many nurses and doctors resisted EHRs until financial incentives for implementation became available through the Patient Protection and Affordable Care Act (ACA). Software developers are now building EHR systems according to the requirements of the ACA, and it appears there is still no public health functionality forthcoming. Data scientists need to sell the importance of public health preparedness to the politicians who make the policies that influence the design of these products, she added.
Data Science Training
The most undeveloped resource is not necessarily a particular database or data source, Francis said, but our own expertise in working with the available data and the available IT tools. Joelle Simpson noted that Children’s National Health System has volumes of data, but it has been struggling to help staff develop the skills to use the data and produce the information being requested. Charles Cairns of the University of Arizona College of Medicine agreed that educational and training programs are a necessary component of using data systems most effectively.
The field of data science requires a broad range of skill sets, including computer science, mathematics and statistics, machine learning, and traditional research methodologies, as well as specific subject-matter expertise, said Bakken. No single individual has all the skills and expertise needed to be able to address a given problem, she added. There are both statistical and computational challenges to working with big data, and there is a need for information visualization (applying visualization techniques to help detect signals and help experts understand the data).
Bakken also noted that many data science programs are being developed at both the undergraduate and graduate levels. She added that it has become a popular field, and admission to master’s programs is highly competitive. There are also mechanisms in place to augment existing doctoral and post-doctoral training grants with data science supplements, and there are excellent online training programs for those already in the field who need to advance their skills. Bakken also noted that informatics competencies exist for public health professionals. To advance data science for MCM monitoring and assessment, it would be helpful to identify the existing knowledge and skills gaps, she said, and what might be done at both undergraduate and graduate levels to offer better training and develop a pipeline of individuals with data science and informatics competencies, along with their domain knowledge. Lee reminded participants of NIH training grant opportunities in this area.