As an introduction to the day’s proceedings, Victor Dzau, the president of the National Academy of Medicine; Alton Romig, the executive officer of the National Academy of Engineering; Francisco Becerra, the assistant director of the Pan American Health Organization; and Lonnie King, a professor and dean emeritus of the College of Veterinary Medicine at The Ohio State University, each provided a view of why the use of big data has the potential to be a valuable global public health tool.
As the amount of data that can be captured, communicated, aggregated, stored, and analyzed continues to grow exponentially, Dzau said, the opportunities to improve many aspects of health and health care are expanding as well. In addition to the trillions of bytes of data being generated by the global health care enterprise, social media venues, and personal devices such as mobile phones and wearable devices, companies and various organizations going about their business generate a huge amount of “digital exhaust data” that is captured as a byproduct of other activities. The expanding number of analytic tools that can extract meaningful insights about the world from these enormous datasets is leading to what Dzau called a tipping point where big data will revolutionize society.
For example, Dzau predicted, the ever-growing number of databases containing large amounts of aggregated data that can provide a picture of the state of health of the general public will allow the health care enterprise to spot problems early in their development and prepare remedies in advance that could prevent small incidents from becoming major pandemics. Clinical decision-support tools will mine data to recommend specific treatments tailored to individual needs, leading to a new era of truly personalized medicine. Particularly relevant to this workshop, big data will also be used to manage global health risks, from antimicrobial resistance to pandemic preparedness. It is this last area, Dzau said, where big data may make the biggest contribution to humanity, given that there are few risks to humankind that threaten life at the scale of pandemics. He said he hopes that new approaches for leveraging big data will improve disease surveillance and enable a more timely response to outbreaks that today go unchecked at first because of a lag in detection. Such approaches, he hopes, will eliminate instances such as the most recent outbreak of Ebola in West Africa, where the first case was seen in December 2013 but the World Health Organization (WHO) did not declare a public health emergency until August 2014.
According to a report from McKinsey & Co. (Groves et al., 2013), U.S. public health surveillance alone could capture $9 billion in value annually. Dzau explained how traditional epidemiology uses data from field investigators, surveys, and hospital records, and he said that mining that information takes time and manpower—boots on the ground—and is always in catch-up mode. Using big data such as social media sources and advanced real-time analytics has the potential to, among other things, identify and track outbreaks as they are happening. The key to realizing this potential, Dzau said, is to learn from other industries that are far advanced in their use of big data and apply those lessons to problems relevant to infectious disease surveillance.
Another key to success, Romig said, will be taking a systems-based approach to the application of big data to health surveillance. Technology and engineering alone will not be enough to make optimal use of big data, he said, but this should not come as a surprise given that, in the real world, engineers quickly learn that the best solutions require thinking about problems in a systems context and working in partnerships with other disciplines. While a disease outbreak may start with one person, he explained, it soon spreads to a community, involving multiple health care providers, hospitals, and public health systems, and it may affect law enforcement, business, schools, and transportation systems. “A disease outbreak is not just about treating a person,” Romig said. “It involves a system that goes from the person all the way up to an entire region, a country or potentially even the globe.”
One aspect of big data, cybersecurity, worries Romig and, he said, calls for caution. Data can be corrupted or stolen, Romig explained, but equally worrisome to him is the idea that someone could input false data into the system or trigger detectors to suggest that an outbreak is occurring when it is not. This is not an intractable problem but, again, one that will require input from many disciplines in what he called an “active partnership” to address fully.
There are many sources of data containing information about the health of a community, Becerra said. Some sources, such as hospitals and health clinics, provide relatively reliable inputs to existing health monitoring systems, but the data are often incomplete or delayed. More novel data sources, such as school attendance records, veterinary clinics, social media, pharmaceutical sales, global transportation patterns, and climate, do not contain as much information by themselves, Becerra said. But when multiple traditional and nontraditional data sources, both structured and unstructured, are combined, they can yield a more rapid, reliable, and actionable picture of a community’s health than is possible from clinical data alone.
Internet and mobile phone use, for example, provides accessible data about social behavior, including the types of searches performed on the Web, information communicated to schools and social networks, tweets written by specific people, the types and sources of medicines bought by individuals, places an individual visits or eats, whether an individual is absent from school or work due to illness, and many other daily activities. If used correctly and ethically, Becerra said, this information could lead to early warning or alert systems in health care that would enable public health sectors to identify and track disease outbreaks and quickly identify clusters of food-borne illnesses, facilitating early responses.
Referring back to Dzau’s mention of digital exhaust data, Becerra said that there is some suggestion from examining such data that even noncommunicable diseases can become metaphorically contagious through the imitation of human behavior that is so readily on display via social networks and social media. Affirmation of unhealthy behaviors such as smoking, eating high-fat and high-calorie food, eschewing exercise, or indulging in unsafe activities among members of social networks could lead to an increase in morbidity and mortality related to such behaviors. Conversely, Becerra said, social networks could also serve as a means of reinforcing positive behaviors that would improve health. “All the affirmation behaviors could be imitated through mutual connections or as a result of following people with interests in the social media setting and the same is true for communicable diseases,” he said. A deeper understanding of these issues will help to highlight the vulnerable aspects of the population’s health and also shed light on events that can be prevented, he said.
While there is enormous potential for these many data sources to vastly improve disease surveillance, King said, the challenge will be to wrangle these multiple data sources into a usable and informative form. “We need to deal with the explosion of data and develop the ability to connect the dots to ensure new discoveries and the creation of new knowledge to improve and possibly redefine health and well-being in the future,” he said. He expressed optimism that this challenge will be met thanks to the emerging science of big data analytics, but he said he believes that a dose of reality is necessary to prevent being seduced by technology for technology’s sake. Technology, he said, is a wonderful enabler, but the ultimate goal is to reduce disease.