dispensing, and 2.4 billion unique patient encounters, including 40 million acute inpatient stays. Each of the 17 data partners involved in the project uses a common data format so that remote programs can operate on the data. Data checks ensure that the data are correct. Data partners have the option of stopping and reviewing the queries that arrive before the code is executed. They also can stop and inspect every result before it is returned to the coordinating center. The amount of patient-level data that is transferred is minimized, with most of the analysis of patient-level data done behind the firewall of the organization that holds the data. “Our goal is not to never share data,” Platt said. “Our goal is to share as little data as possible.” The analysis dataset is usually a small fraction of all the data that exist, and the data can usually be de-identified.
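The flow described above (a shared program sent to each data partner, optional review of the query and of the result, and only summaries returned to the coordinating center) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record fields, the `DataPartner` class, the function names, and the toy records are hypothetical and do not represent the actual Mini-Sentinel software or data.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]   # hypothetical row in a shared, common data format
Summary = Dict[str, int]  # aggregate counts only; no patient-level rows


def count_events_by_drug(records: List[Record]) -> Summary:
    """The shared 'program': runs locally and returns only aggregate counts."""
    counts: Summary = {}
    for row in records:
        if row["event"] == "yes":
            counts[row["drug"]] = counts.get(row["drug"], 0) + 1
    return counts


class DataPartner:
    """Hypothetical data partner; its records stay behind its own firewall."""

    def __init__(
        self,
        name: str,
        records: List[Record],
        approve_query: Callable[[str], bool] = lambda description: True,
        approve_result: Callable[[Summary], bool] = lambda summary: True,
    ) -> None:
        self.name = name
        self._records = records
        self.approve_query = approve_query
        self.approve_result = approve_result

    def run(
        self, description: str, program: Callable[[List[Record]], Summary]
    ) -> Optional[Summary]:
        # The partner may stop and review the query before any code runs.
        if not self.approve_query(description):
            return None
        summary = program(self._records)
        # The partner may also inspect the result before it is returned.
        if not self.approve_result(summary):
            return None
        return summary


def run_distributed(
    partners: List[DataPartner],
    description: str,
    program: Callable[[List[Record]], Summary],
) -> Summary:
    """Coordinating center: combines summaries and never sees the records."""
    combined: Summary = {}
    for partner in partners:
        summary = partner.run(description, program)
        if summary is None:
            continue  # the partner declined the query or withheld the result
        for key, count in summary.items():
            combined[key] = combined.get(key, 0) + count
    return combined


# Toy, made-up records used purely to exercise the sketch.
partners = [
    DataPartner("partner_1", [
        {"drug": "drug_a", "event": "yes"},
        {"drug": "drug_a", "event": "no"},
        {"drug": "drug_b", "event": "yes"},
    ]),
    DataPartner("partner_2", [{"drug": "drug_a", "event": "yes"}]),
    DataPartner("partner_3", [{"drug": "drug_b", "event": "yes"}],
                approve_query=lambda description: False),
]

totals = run_distributed(partners, "event counts by drug", count_events_by_drug)
print(totals)
```

In this sketch the coordinating center receives only per-partner counts, and a partner that declines a query simply contributes nothing to the combined result.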
As an example of the kinds of projects that can be done using this system, Platt described a study looking at comparative risks of angioedema related to treatment with drugs targeting the renin-angiotensin-aldosterone system. The results of the study had not yet been released at the time of the workshop, but Platt concluded from the experience that data from millions of people could be accessed to do the study without sharing any patient-level data. Yet, from the perspective of the investigators, “essentially everything that was interesting in those datasets that could answer this question was accessible and was used to address the questions of interest.”
Such a system could address a large fraction of the questions thought to require data sharing by instead sharing programs among organizations that are prepared to collaborate on distributed analyses, Platt insisted. Organizations also could participate in multiple networks, further expanding the uses of the data they hold. At the same time, every network could control its own access and governance.
Today, only FDA can submit questions to Mini-Sentinel, but FDA believes the system should be a national resource and is working on ways to make it accessible to others. Toward that end, the week before the workshop, NIH announced the creation of the Health Care Systems Research Collaborative, which will develop a distributed research network with the capability of communicating with the Mini-Sentinel distributed dataset. By sharing information rather than data, such systems could make progress faster than would be possible by waiting for all the issues surrounding data sharing to be resolved, said Platt.