With the amount of data in the world exploding, big data could generate significant value in the field of infectious disease. The increased use of social media provides an opportunity to improve public health surveillance systems and to develop predictive models. Advances in machine learning and crowdsourcing may also offer the possibility to gather information about disease dynamics, such as contact patterns and the impact of the social environment. New, rapid, point-of-care diagnostics may make it possible to capture not only diagnostic information but also other potentially epidemiologically relevant information in real time. With a wide range of data available for analysis, decision-making and policy-making processes could be improved. The Forum on Microbial Threats determined that the broader applications and implications of big data in these areas ought to be explored. Here, “big data” refers to any voluminous amount of structured, semi-structured, and unstructured data that has the potential to be mined for insights and information. Big data is characterized by the “four Vs”: the extreme volume of data, the wide variety of data types, the velocity at which the data must be processed, and the veracity of the data. Big data may be thought of as a tool for machine-aided human intelligence that will not replace human decision making but will rather provide insights that can make humans more efficient in choosing where to focus further research.
1 The planning committee’s role was limited to planning the workshop, and the Proceedings of a Workshop has been prepared by the workshop rapporteur as a factual summary of what occurred at the workshop. Statements, recommendations, and opinions expressed are those of individual presenters and participants and have not been endorsed or verified by the National Academies of Sciences, Engineering, and Medicine, and they should not be construed as reflecting any group consensus.
While there are many opportunities for big data to be used for infectious disease research, operations, and policy, many challenges remain before it is possible to capture the full potential of big data. Specifically, there are questions related to usage, access, interoperability, analysis, quality, validation, storage, privacy, security, and liability. If these issues are left unexplored, grave consequences can ensue. Some of these challenges could be elucidated by drawing on lessons from other sectors that have been immersed in using big data. For years, companies such as Google and Amazon have taken advantage of big data, condensing it into actionable insights or predictions and tailoring ad placement, customer experiences, and products to certain audiences. Much can be learned from these sectors that would allow the field of infectious diseases to harness big data and unlock various opportunities to enhance infectious disease research, operations, and policy.
The Forum on Microbial Threats convened a 1.5-day workshop in Washington, DC, to explore some of the opportunities and issues associated with the scientific, policy, and operational aspects of big data in relation to microbial threats and public health. The workshop was organized by an ad hoc committee whose members were Scott Dowell, Jennifer Gardy, Margaret Hamburg, Kent Kester, Lonnie King (Chair), George Poste, Martin Sepúlveda, Jay Siegel, and Lance Waller. As reported here, this workshop, which was held May 10–11, 2016, featured invited presentations and discussions that explored a number of topics, including preventing, detecting, and responding to infectious disease threats using big data and related analytics; varieties of data (including demographic, geospatial, behavioral, syndromic, and laboratory) and their broader applications; means to improve their collection, processing, utility, and validation; and approaches that can be learned from other sectors to inform big data strategies for infectious disease research, operations, and policy.
In accordance with the policies of the National Academies of Sciences, Engineering, and Medicine, the workshop did not attempt to establish any conclusions or recommendations about needs and future directions, focusing instead on issues identified by the speakers and workshop participants. In addition, the organizing committee’s role was limited to planning the workshop. The workshop proceedings have been prepared by workshop rapporteur Joe Alper as a factual summary of what occurred at the workshop.