The first workshop panel considered the current landscape for clinical trial data sharing and reuse, and provided stakeholder perspectives on current data-sharing policies in practice. Harlan Krumholz, Harold H. Hines Jr. Professor of Medicine at the Yale School of Medicine, discussed the challenges that researchers face as both data generators and users of data generated by others. Moses Taylor, Jr., spoke from his personal experience as a clinical trial participant. Sonali Kochhar, medical director of Global Healthcare Consulting, discussed facilitating data sharing from her perspective as a member of an independent review panel. Lyric Jorgenson, deputy director in the Office of Science Policy at the National Institutes of Health (NIH), discussed data-sharing policies from the perspective of a funder. The panel discussion was moderated by Bernard Lo.
Harlan Krumholz, Harold H. Hines Jr. Professor of Medicine, Yale School of Medicine
Sharing his perspective as both a data generator and a user of data generated by others, Krumholz said that unshared data are wasteful, result in missed opportunities to learn, and do not serve the public good. Researchers are faced with a host of “incentives to sequester data” for themselves, he said. He acknowledged the progress that has been made in facilitating transparent sharing of clinical trial data, such as the Yale University Open Data Access Project,1 and expressed optimism about the opportunities to continue to enhance data sharing.
As a researcher, Krumholz said he has benefited from having access to data generated by others. As an example, he described revisiting data from the mid-1990s clinical trial of digoxin for the management of heart failure. The National Heart, Lung, and Blood Institute (NHLBI), which sponsored the study, had made the data available for analysis, but for years this resource had not been leveraged. Krumholz and colleagues published two secondary analyses, which he said sparked interest from the original NHLBI investigators and other investigators in analyzing the dataset for new insights.2 These subsequent analyses “were complementary and synergistic and served the public good,” he said.
Krumholz also spoke from the perspective of a data generator, describing the “emotional distress” of having to share the resulting data from a challenging, multi-site clinical trial he conducted. Researchers often have a social/emotional perspective (incorrectly, he added) that the data they generate are their own, which can impede progress on data sharing. This can occur, he added, despite their acknowledgment of the public good of data sharing and even though data would not exist without trial participants and a broad range of others on the clinical trial team.
This tension between benefiting from data shared by others and the hesitancy to share generated data is fed by the culture within academia. Krumholz said that academia does not recognize, reward, or incentivize data sharing or team science when evaluating researchers for promotions and grants. Little recognition is given to those who enable the important work done by others. In addition, little to no funding support is available to facilitate data sharing, and the infrastructure for sharing is lacking. However, great advances, Krumholz emphasized, can only occur through team science, which includes other researchers and trial participants. Furthermore, sharing data brings new perspectives and ideas to the analyses. Moving forward will require attention to the incentives and rewards structures for scientists, as well as the funding and infrastructure needed to readily enable sharing.
Alex Sherman, director of the Center for Innovation and Bioinformatics at Massachusetts General Hospital, suggested that an incentive for sharing could be including the data contributor as a co-author on publications that analyze his or her data. He noted that several consortia are already doing so and include a provision in the data sharing agreement (DSA). Tim Feeney from The BMJ noted that many journals require preregistration of clinical trials as a condition of consideration for publication. He asked whether there might be a similar role for journals in promoting the sharing of data. Krumholz agreed with the spirit of acknowledging data contributors with authorship, but raised his own personal concerns that the data contributors might not meet the criteria for authorship as currently outlined by the International Committee of Medical Journal Editors for its member journals. He suggested that journal articles could indicate the provenance of the data and promote the prior work, perhaps in the supplemental information for a publication or by including a link that would facilitate tracing back to and crediting the data originator. Amy Nurnberger of the Massachusetts Institute of Technology called attention to the practice of data citation, which allows authors to include references to data similar to how publications are cited. Data citations can include digital object identifiers and can be tracked in the same way that article citations are tracked. Nurnberger mentioned ORCID IDs,3 which are intended to serve as unique identifiers for individuals in the research community.
2 See https://www.nejm.org/doi/full/10.1056/nejmoa021266 (accessed March 30, 2020) and https://jamanetwork.com/journals/jama/fullarticle/195990 (accessed March 30, 2020).
Moses Taylor, Jr., Participant, Systolic Blood Pressure Intervention Trial
Taylor shared his perspective as a clinical trial participant, drawing from his personal experience in the Systolic Blood Pressure Intervention Trial (SPRINT)4 as well as his interactions with other trial participants during the 2017 Aligning Incentives for Sharing Clinical Trial Data summit.5
The Benefits and Risks of Sharing and Not Sharing Trial Data
Other than being asked to consent to sharing of their data, trial participants generally receive little or no information about what will be done with the data generated from a given study, Taylor reported. He said participants need a better understanding of the potential benefits and risks associated with data sharing before giving consent. He suggested that trial participants should be involved in decision making about how their data are shared.
Taylor observed that, in his experience, trial participants do not perceive any disadvantage to widely sharing their clinical trial data and generally believe that data sharing is beneficial. Most people participate in a clinical trial for personal benefit, in the hope that the results of the trial will ultimately impact their own clinical care, he said. For patients with life-threatening illnesses, trial participation can be lifesaving, and participants and their providers need to know the results of the trial as soon as possible. Trial participants are also motivated by the opportunity to help others through their participation. From a patient perspective, Taylor said, there is a belief that if more people conduct secondary analysis on data, there is a better chance that discoveries will be made that will help trial participants and others with the same or similar conditions. Sharing data widely can provide “checks and balances,” Taylor added, and, compared with not sharing the data, helps to reduce unintended bias in interpretation as well as misuse of the data by the original trialists.
3 ORCID IDs are persistent digital identifiers assigned to individual researchers that can be used to associate them with their research outputs. (ORCID was originally derived from Open Researcher and Contributor IDentifier but is now referred to as an ORCID ID.) See https://orcid.org (accessed March 2, 2020).
Taylor acknowledged the potential for inappropriate use of shared data but believed this risk was fairly low due to the data governance systems in place in the United States. There is also a risk of reidentification of individual participants, but again, he perceived this risk to be low due to the privacy protections that are in place. He added that, depending on the subject of the trial, participants might not think it is harmful to be identified.
Data from SPRINT were shared over the course of the study and at the conclusion, Taylor recalled. In addition, the “SPRINT Data Analysis Challenge” invited analysts after the conclusion of the study to use the trial dataset in novel ways to identify new findings.6 Taylor said his experience with sharing the data from SPRINT was positive.
In conclusion, Taylor advocated for sharing trial data widely, under the governance of experts, as it provides great benefit to both study participants and patients like them. The consent process ensures that participants understand the health risks of enrolling in a clinical trial, but they generally do not understand what will be done with the data that are generated, he said. Based on his experience, Taylor said that data sharing is an important indicator of the benefit a trial might have for patients, and he recommended that potential trial participants be given information on data-sharing plans to inform their decision on whether to participate. Jeffrey M. Drazen noted that participant data might be used to answer questions in the future that cannot be predicted at the time of enrollment and consent. Taylor responded that he personally would accept the risk involved with data sharing, and he reiterated the importance of informing potential participants about the secondary uses of their data during the consent process so they can make informed decisions.
Sonali Kochhar, Medical Director, Global Healthcare Consulting
Kochhar discussed clinical trial data sharing from her perspective as a member of the Wellcome Trust’s Independent Review Panel (IRP) for the data-sharing platforms, ClinicalStudyDataRequest.com (CSDR) and Vivli.7 A Wellcome Trust survey of its grantees found that researchers share data for a variety of reasons, including requirements by funders or journals; belief that data sharing is a good research practice; and understanding that data sharing can facilitate collaboration as well as enable external validation and replication of results. The survey also found that researchers are concerned about the potential for misuse or misrepresentation of their shared data, the loss of opportunities to publish those data, and the time and effort required to deposit data (van Den Eynden et al., 2016). A survey by Kratz and Strasser (2015) found that researchers share data primarily through direct contact, such as email.
Kochhar suggested that data-sharing platforms can help researchers share data by providing “a standard, transparent access route, independent of the data generator.” Platforms can also help protect trial participants and align with patient consent by anonymizing data and restricting data access. In addition, platforms can facilitate metadata searches, and in some cases provide data storage.
Data-sharing platforms also present the following challenges, Kochhar said:
- The cost recovery model (the cost of running the platform might be paid for by the data contributor, the data requester, and/or a central core grant)
- The costs of maintaining secure analysis environments
- The lack of internationally agreed-upon minimum standards for metadata (further complicated when metadata are not in English)
- The lack of sufficient information about a study to assess whether data are worth requesting for a given research analysis
- The lack of interoperability among platforms (resulting in the need for researchers to search and make requests of multiple platforms)
- The lack of internationally agreed-upon standards for DSAs
- The different governance structures used by platforms
7 CSDR and Vivli are discussed further in Chapter 3. Information about Vivli is available at https://vivli.org. Information about the CSDR platform is available at https://www.clinicalstudydatarequest.com (accessed February 10, 2020). The charter for the CSDR IRP is available at https://clinicalstudydatarequest.com/Documents/Independent_Review_Panel_Charter.pdf (accessed February 10, 2020).
Two examples of data-sharing platforms are CSDR and Vivli. CSDR, established in 2013, is a searchable database of more than 3,000 studies.8 Briefly, researchers submit proposals to CSDR to request access to data from studies of interest. Requests are approved through a process that includes an IRP, and approved data users must sign a DSA before being provided access to the data in a secure system. There is no charge to access the data for secondary analyses. Kochhar noted that data requesters must also seek approval from their own institutional review boards or ethics committees as required by their institutions. A similar process is employed by Vivli, which currently houses more than 4,000 studies.
The Role of Independent Review Panels
The Wellcome Trust serves as the secretariat for the CSDR and Vivli IRPs. Using an IRP helps to ensure that access to the data in the platforms is provided in a consistent, transparent, trusted, and controlled manner, Kochhar explained. The multi-disciplinary IRP includes a layperson in addition to members who are highly experienced in reviewing proposals and who have expertise in global health, clinical research, statistics, and ethics. The IRP also provides constructive feedback to researchers.
To illustrate the process, Kochhar said that a total of 531 research proposals had been submitted to CSDR through August 2019. Of those, 409 met the initial check criteria, and, of those, 338 were approved by the IRP. Of those that were rejected, 60 percent were revised to address IRP feedback, resubmitted, and ultimately accepted (Kochhar noted that similar proposal approval rates have been observed at Vivli). Georgina Humphreys, clinical data sharing manager from Wellcome Trust, added that some attrition is due to requesters withdrawing their requests or not responding to correspondence from the platform. In some cases, she said, researchers are approved for access but never log in or conduct analysis. Kochhar noted that a study sponsor has the right to veto secondary data use requests due to concerns, such as potential conflict of interest or competitive risk. However, this option has not been exercised by a sponsor thus far.
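The proposal funnel Kochhar described can be restated as simple rates. The following is a minimal sketch that uses only the three counts reported above; the variable and function names are illustrative, and the percentages are derived here rather than quoted from the workshop. It does not try to account for rejected proposals that were later revised and accepted, because the text gives a percentage for those but not a count.

```python
# Counts reported by Kochhar for CSDR proposals submitted through August 2019.
submitted = 531             # total research proposals submitted
passed_initial_check = 409  # proposals that met the initial check criteria
approved_by_irp = 338       # proposals approved by the Independent Review Panel


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of submitted proposals that cleared the initial check.
initial_check_rate = pct(passed_initial_check, submitted)

# Share of checked proposals that the IRP approved.
irp_approval_rate = pct(approved_by_irp, passed_initial_check)

# Share of all submitted proposals that were ultimately approved.
overall_approval_rate = pct(approved_by_irp, submitted)

print(f"Met initial check criteria: {initial_check_rate}% of submitted")
print(f"Approved by IRP:            {irp_approval_rate}% of those checked")
print(f"Approved overall:           {overall_approval_rate}% of submitted")
```

Under these counts, roughly 77 percent of submissions passed the initial check, about 83 percent of those were approved by the IRP, and about 64 percent of all submissions were approved.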
The requests to CSDR have led to 80 published or forthcoming publications, which Kochhar suggested is a low number relative to the time and resources invested in the platform. Drazen noted that there are other potential metrics to consider besides publications. For example, secondary analysis of data can inform future clinical trial design. Kochhar agreed but added that thus far most secondary data users have been interested in conducting novel research, not designing future clinical research.
8 See https://www.clinicalstudydatarequest.com/Metrics.aspx (accessed March 2, 2020).
The average time from proposal submission to data access at CSDR is about 6 months, Kochhar said. The longest segment is review and signing of the DSA, which averages just over 3.5 months. Kochhar emphasized that this time could be reduced if institutions accepted a standard DSA. Other factors that impact the timeline include the researchers’ responses to IRP questions and review by the sponsor’s publication steering committee. Kochhar listed some of the common reasons for IRP rejection of proposals:
- The Plain English summary of the proposal was too technical and/or there was not a clear benefit to patients described.
- The information about the scientific rationale and/or clinical relevance of the proposed research was insufficient and the research plan and aims were unclear.
- The study design, methodology, and/or analysis plan provided insufficient detail or was incorrect.
- The research team did not have the relevant skills, qualifications, or experience to carry out the proposed plan.
- No publication or dissemination plan was submitted.
In conclusion, Kochhar shared suggestions to enhance data sharing and reuse based on lessons learned from the IRP process (see Box 2-1). Volumes of data are now available to request for analysis, she said, and there is a need to incentivize data sharing and reuse. She suggested that guidance from professional bodies on issues such as consent or common DSAs would help promote data sharing. She also noted the need to decrease the time from proposal to data access and reduce the cost of data sharing and reuse.
Lyric Jorgenson, Deputy Director, Office of Science Policy, National Institutes of Health
NIH has long been committed to promoting data sharing. Jorgenson discussed NIH’s approach, as a funder, to incentivizing data sharing. NIH recently released a draft policy for comment that calls for researchers to prospectively design a plan for data management and sharing, taking into consideration what data are necessary for replicating and validating the findings of the study.9 Plans should include timelines for sharing, where data will be shared, and anticipated costs of sharing, she said. Although the default should be to share data, she said that NIH recognizes there are circumstances in which sharing might not be appropriate (e.g., where prohibited by state or tribal laws, or where sharing data is not in the best interest of the research participants). The intent is to submit information to the repository that will enable others to replicate and validate the study.
9 Jorgenson referred participants to the Draft Data Management and Sharing Policy that was recently released by NIH for public comment, available at https://osp.od.nih.gov/scientific-sharing/nih-data-management-and-sharing-activities-related-to-public-access-and-open-science (accessed February 10, 2020).
Jorgenson highlighted the three main reasons NIH is investing significantly in sharing data from clinical trials:
- Sharing clinical trial data supports the concept of engaging participants as partners in research. People are no longer referred to as research “subjects,” Jorgenson said. The focus is now on including them in discussions of study design and how their data are used now and in the future. Data sharing provides accountability, allowing trial participants to see how their time and data help to move clinical science forward.
- Sharing clinical trial data supports the accountability and transparency of NIH as a public funder of clinical trials. NIH invests an estimated $3 billion annually in clinical trials,10 Jorgenson said. As a public agency, NIH must demonstrate that it is a good steward of taxpayer-funded research, and the sharing and reuse of data are important metrics of that stewardship.
- Sharing clinical trial data advances science. Jorgenson emphasized the importance of sharing all data, both positive clinical trial findings that might impact care and negative data that can inform future research and avoid waste of time and resources. NIH is focused on enhancing the value of data generated in clinical trials through prospective planning (e.g., data elements to be collected, where data will be shared).
Measures of success are needed for data sharing as it relates to participant engagement and public trust in the clinical research enterprise. Metrics are needed to better understand whether information about clinical trials is useful and understandable to the public, Jorgenson said. Other metrics are needed to evaluate whether the public is willing to participate in clinical trials, whether participants understand the purpose of a given trial, and how participants’ data may or may not be used.
Metrics are also needed to assess and track the scientific value of the clinical trial data, as well as the ease with which the data can be shared, Jorgenson said. This involves understanding factors related to how sharing is facilitated—including the standards for data collection, the infrastructure for data sharing (e.g., platforms, repositories), and the processes for accessing human data (e.g., consent, appropriate use). Jorgenson noted that the variability in how data are reported is a key challenge in data sharing.
Also discussed was the importance of understanding whether and how data are being shared and/or reused. Jorgenson observed that researchers’ concerns about data sharing and use span the spectrum, from being “scooped” (i.e., by sharing their data they are at risk for others publishing their findings before they do) to having their data go unused.
Challenges and Opportunities
One challenge to sharing clinical trial data is meeting the needs of the different stakeholders that NIH serves. Jorgenson pointed out that the level of granularity needed by scientists is different from that needed by the public. She said that ClinicalTrials.gov continuously strives to strike a balance between providing the level of detail needed by researchers, and providing understandable, usable information for members of the public who are seeking to participate in a trial. Another challenge is meeting infrastructure needs, including physical storage capacity as well as technical knowledge to ensure that data are findable and usable. Shared data, including associated metadata, need to be accessible to be used.
One area where NIH is focusing attention is understanding the value of sharing individual participant data. Jorgenson said that a small pilot study done by NIH of several data repositories found that researchers were accessing individual participant data and combining it with other datasets to pursue new lines of inquiry. It was also found that the reuse of shared data contributes to patents, systematic reviews, and guidelines, she said.
In closing, Jorgenson said that, as a funder, NIH is working to implement policies that incentivize a culture of data sharing. For example, as discussed by Krumholz, there might be approaches that could be used to reward those who generate and share data that are of value and that enable the work of other researchers. In response to a question from Rebecca Li, Jorgenson elaborated that potential incentives include finding ways to make data sharing easier by, for example, reducing administrative hurdles, improving infrastructure for sharing, and supporting cost allowances.