Real-world data (RWD), as defined by FDA, are any data related to patient health status and/or the delivery of health care routinely collected from a variety of sources, [including] electronic medical records (EMRs), claims and billing activities, product and disease registries, patient-generated data including in home-use settings, or data gathered from other sources that can inform on health status, such as mobile devices.1
RWD and real-world evidence (RWE) are playing an increasingly important role in clinical practice, clinical trials, and regulatory and reimbursement decision making, said Margaret Sutherland, program manager of science at the Chan Zuckerberg Initiative and former program director at NINDS. As digital devices and other new modes of data collection are implemented, the volume of data will be very large and will require either a huge server bank or cloud support systems to handle it, said Sutherland.
RWD may be useful for hypothesis testing and identifying questions that need to be answered, said Adriana Di Martino. Stuart Hoffman noted that smaller studies can help investigators generate hypotheses, but testing them may require large populations, large cloud-based databases, and cloud-based analytics.
1 To learn more about FDA’s definitions of real-world data, see https://www.fda.gov/scienceresearch/science-and-research-special-topics/real-world-evidence (accessed November 7, 2019).
Sutherland described how RWD were collected and stored for a study called Assessing Tele-Health Outcomes in Multiyear Extensions of PD Trials (AT-HOME PD).2 AT-HOME PD is an ancillary study to two NINDS-funded Phase 3 clinical trials in PD patients: STEADY-PD III, a trial of isradipine in recently diagnosed PD patients, and SURE PD3, a trial of inosine. Participants in those trials had consented to be recontacted by study coordinators for ancillary studies. Those who agreed to take part in AT-HOME PD were then reconsented for raw data sharing using cloud infrastructure. The data they agreed to share included data from clinical telemedicine visits; data collected on the mPower platform created by Sage Bionetworks3; and participant-entered data collected through The Michael J. Fox Foundation’s Fox Insight.4
The goal of AT-HOME PD, according to Sutherland, is to link individual-level data from different data streams to create a better picture of the patient experience of living with PD across multiple data modalities. To make these data work together through cloud computing, the investigators established a global unique identifier (GUID) for each participant, said Sutherland. The GUID has been adopted by several NIH institutes for use in clinical studies, she said. It is a computer-generated code based on personally identifiable information, such as name and date of birth, that allows investigators to strip data of personal information before submitting the de-identified information to the database. The GUID allows de-identified data to be integrated across multiple databases and studies.
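The linking idea behind such an identifier can be illustrated with a minimal Python sketch. This is not the NIH GUID algorithm, which the workshop summary does not describe; it assumes a simplified scheme in which a keyed one-way hash of normalized personal information serves as the identifier. The function names, the secret key, and the sample records are all hypothetical.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Lowercase, trim, and strip accents so formatting differences
    in the same name do not produce different identifiers."""
    decomposed = unicodedata.normalize("NFKD", value.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def pseudonymous_id(first: str, last: str, dob: str, secret: bytes) -> str:
    """Derive a stable one-way identifier from personal information.

    Assumption: a keyed hash (HMAC-SHA256) stands in for the real
    identifier service; the key must stay with the trusted party.
    """
    message = "|".join(normalize(part) for part in (first, last, dob))
    digest = hmac.new(secret, message.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


SECRET = b"hypothetical-study-key"  # illustrative only

# Two data streams keyed by the identifier; no names or birth dates are stored.
telemedicine = {
    pseudonymous_id("José", "Alvarez", "1950-03-14", SECRET): {"visit_score": 21},
}
smartphone = {
    pseudonymous_id(" jose", "ALVAREZ", "1950-03-14", SECRET): {"tapping_rate": 4.2},
}

# Link records that share an identifier across the two streams.
linked = {
    guid: {**telemedicine[guid], **smartphone[guid]}
    for guid in telemedicine.keys() & smartphone.keys()
}
```

Because both streams derive the same identifier from the same person, the records can be merged without either dataset carrying the underlying personal information.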
The quality, reproducibility, robustness, and usefulness of RWD must be established, which may require validating the various data sources to confirm what they contribute to the overall picture of the patient going forward, said Sutherland. Obtaining quality EMR data, for example, is challenging in part because clinicians may see the EMR as interfering with their interactions with patients and may not find EMR data useful, said Sean Hill. Incentives are needed to encourage clinicians to collect higher quality data, he said. He and his colleagues have developed a visualization chart intended to build quality into the medical health record by integrating measurements that are valuable for clinicians and giving them a quick way to see where data are missing, inaccurate, or irrelevant. EMR data collected using this approach, de-identified using a GUID model, can also be entered into a research database. Hill said they are also planning to integrate decision support tools into this model.
3 For more information about mPower, see https://sagebionetworks.org/research-projects/mpower-researcher-portal (accessed November 12, 2019).
Datasets may also have substantial differences in terms of how much data were recorded, said Peter Wahl, senior director of clinical research at Optum Analytics. In a study that tried to leverage EMR data in a health information exchange comprising multiple institutions, he and his colleagues found that only 5 percent of the population had blood pressure recorded. He found that only nurses involved in specific research projects had gone to the trouble of entering this information in the records, raising questions about the usefulness of the dataset. Ferguson suggested one possible solution to the problem of incomplete data entry: having scribes whose sole job is to follow a physician around and record data. Although this might be a very expensive solution, it could also reduce physician burnout, said Wahl. Jane Roskams raised the possibility of going back and looking retrospectively at EMR data. Indeed, said Sutherland, this might be a way to identify early signs of disorders such as PD, where the prodromal state is currently not well characterized. However, Wahl noted that older data tend to be of lower quality.
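A gap like the sparsely recorded blood pressure field can be surfaced before analysis with a simple completeness check. The following Python sketch is illustrative only: the record layout, field names, and the 50 percent threshold are assumptions, not details from the study Wahl described.

```python
from typing import Any, Dict, List, Optional


def completeness(
    records: List[Dict[str, Optional[Any]]], fields: List[str]
) -> Dict[str, float]:
    """Return, for each field, the fraction of records with a recorded value.

    A field counts as missing when the key is absent or its value is None.
    """
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in fields}
    return {
        name: sum(1 for rec in records if rec.get(name) is not None) / total
        for name in fields
    }


# Hypothetical EMR extract; values are made up for illustration.
records = [
    {"patient": "a", "diagnosis": "PD", "systolic_bp": 128},
    {"patient": "b", "diagnosis": "PD", "systolic_bp": None},
    {"patient": "c", "diagnosis": None},
    {"patient": "d", "diagnosis": "PD"},
]

report = completeness(records, ["diagnosis", "systolic_bp"])

# Flag fields recorded for fewer than half of the patients (assumed threshold).
sparse = [name for name, share in report.items() if share < 0.5]
```

A report of this kind does not explain why a field is sparse, but it tells investigators which variables need a closer look before the dataset is used.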
Moreover, structured data may provide an incomplete assessment of patient state in part because these data are codified using ontological constructs that may be severely limiting, said Wahl. For example, there is often no way to quantify a status change from one stage of disease to another, he said. He advocated augmenting and validating structured data with terms and concepts extracted from free text notes in the EMR.
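As a rough illustration of augmenting structured data from free-text notes, the Python sketch below pulls disease-stage mentions out of dated notes and reports when the stage changes. It assumes a hypothetical note format and a naive "stage N" pattern; it does not handle negation, hedged language, or differing staging scales, all of which a real clinical text pipeline would need to address.

```python
import re
from typing import List, Optional, Tuple

# Naive pattern for mentions such as "stage 2" or "Stage 3" (assumed wording).
STAGE_PATTERN = re.compile(r"\bstage\s+([1-5])\b", re.IGNORECASE)


def extract_stage(note: str) -> Optional[int]:
    """Return the first stage number mentioned in a note, if any."""
    match = STAGE_PATTERN.search(note)
    return int(match.group(1)) if match else None


def stage_changes(notes: List[Tuple[str, str]]) -> List[Tuple[str, int, int]]:
    """Given (ISO date, note text) pairs, return (date, old, new) per change."""
    changes = []
    previous = None
    for date, text in sorted(notes):  # ISO dates sort chronologically
        stage = extract_stage(text)
        if stage is None:
            continue  # note carries no staging information
        if previous is not None and stage != previous:
            changes.append((date, previous, stage))
        previous = stage
    return changes


# Hypothetical notes, deliberately out of order.
notes = [
    ("2019-06-02", "Tremor worse on the left. Now Stage 3; discussed fall risk."),
    ("2018-01-15", "Mild unilateral tremor, stage 1."),
    ("2018-09-30", "Patient reports poor sleep. No staging documented."),
]

changes = stage_changes(notes)
```

The extracted transitions could then be stored alongside the coded diagnosis, giving the structured record a progression signal it otherwise lacks.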
Another reason to use RWD to understand the natural history of a disease is that many disorders are misdiagnosed or underdiagnosed, said Wahl. While symptomatology is an important component of how a neurodegenerative disease, its subtypes, and its progression are defined, symptoms reported by undiagnosed or misdiagnosed patients may not be captured in a disease-defined dataset. EMR data may also fail to capture patient-reported information, which is increasingly important for informing regulatory approval and labeling, and for postmarketing surveillance, said Wahl.
Incentivizing clinicians to participate in real-world studies will require understanding what brings value to clinicians, said Sutherland. Hill noted that providing physicians with digital tools may allow them to spend more time with patients and less time entering data into the EMR as they try to decipher the trajectory of the patient’s illness. Ruth Marinshaw noted that in developing tools such as these, it is important to partner with the users of those tools as well as the data providers to ensure that the interface and approach bring value to all of them. She called this a “carrot-flavored stick” because it brings them value but also requires them to put their data into the EMR system.
Capturing the context and provenance of how RWD were generated is equally important, said Hill. Was the parameter measured during the day or at night? Was the participant well rested or sleepy? Was he or she taking medications? Having patients record data or sharing data with patients may, however, alter behaviors and outcomes, noted Wahl. For example, Horgan noted that in the context of sleep, patients wearing devices to measure sleep parameters may become so obsessed about their sleep that their sleep is disrupted.
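One way to keep such context attached to each observation is to store it in the same record as the measurement. The Python sketch below is a hypothetical schema, not one presented at the workshop; the field names and the day/night cutoff are assumptions chosen to mirror the questions Hill raised.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class Measurement:
    """A single observation together with the context in which it was taken."""

    name: str
    value: float
    unit: str
    recorded_at: datetime
    source: str                            # device or form that produced the value
    well_rested: Optional[bool] = None     # None means the context was not captured
    medications: Tuple[str, ...] = ()      # medications reported at the time

    @property
    def time_of_day(self) -> str:
        # Assumed cutoff: 07:00 through 18:59 counts as "day".
        return "day" if 7 <= self.recorded_at.hour < 19 else "night"


# Hypothetical reading; all values are made up for illustration.
reading = Measurement(
    name="tapping_rate",
    value=4.2,
    unit="taps/s",
    recorded_at=datetime(2019, 11, 7, 22, 30),
    source="smartphone app",
    well_rested=False,
    medications=("levodopa",),
)
```

Keeping missing context explicit (`None` rather than a default) lets analysts distinguish "not rested" from "not recorded" when they later adjust for these factors.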
While acknowledging the potential of large cloud-based databases to test hypotheses, Di Martino noted that the scale of these databases may make it difficult to clean up all the confounding variables. She also noted that in many developmental and psychiatric conditions, there are no objective diagnostic standards and suggested that to make RWD useful, diagnostic standards and validation of those standards are needed.