In Chapter 4, the committee offers recommendations for what specific types of data should be shared and at what times during the life of a clinical trial. These recommendations are intended to strike a balance between benefiting the public through timely access to data and allowing investigators and sponsors time to complete planned analyses and obtain regulatory approval. This chapter examines with whom the data are shared and under what conditions. Potential data recipients may seek access to data for a variety of purposes (see Box 5-1), which may present different potential benefits and risks.
Potential Recipients of Clinical Trial Data
- Researchers seeking to carry out additional analyses or explore new scientific questions
- Attorneys, who may be seeking information for use in litigation*
- Other companies, which may be competitors of the sponsor of the trial
- Consultants, whose clients may include investment and financing companies and research organizations
- Participants in the trial, who are interested in its results
- Journalists writing about a specific treatment or condition or about clinical trials generally
- Disease advocacy groups seeking to provide information to patients, families, and the public or to advance research
- Interested members of the public who wish to know more about the treatment or condition studied
- Research Ethics Committees, Institutional Review Boards, or scientific peer review committees reviewing a new study of the same or a similar intervention to obtain a more comprehensive safety profile of the intervention
- The Data and Safety Monitoring Board/Data Monitoring Committee for another clinical trial, whose decision to recommend continuing or terminating that trial may be informed by the results of a trial that has been completed but not published
- Educators seeking to use a data set for teaching purposes (e.g., in a biostatistics class)
* In the European Medicines Agency’s (EMA’s) experience with sharing clinical trial data, lawyers, other companies, and consultants were the most common data requestors (Rabesandratana, 2013).
As previously stated, data sharing is the practice of making data from scientific research available to other investigators for secondary uses. The term “open access” was first applied to allowing any member of the public with Internet access to read and download for free the full text of articles from scientific journals for unrestricted use. In the context of clinical trial data, “open access” implies unrestricted and free access to data (Krumholz and Peterson, 2014). An example of such open access is the posting of registration information and summary trial results on ClinicalTrials.gov, a public website. In Chapter 4, the committee recommends that sponsors and investigators publicly share their data sharing plans at registration and summary-level results 12 months after study completion (as currently required under the Food and Drug Administration Amendments Act [FDAAA]). In Chapter 4, the committee concludes that the risks of sharing individual participant data and clinical study reports (CSRs) are significant and that these data elements may contain sensitive data, a risk that in most cases needs to be mitigated through appropriate controls on data access and use. The committee applies the term “controlled access” to any arrangement whereby data sharers place certain restrictions on access to or conditions of use of data. Controlled access includes a range of models, from relatively light controls, such as requiring registration and data use agreements, to more extensive controls, such as review of secondary users’ qualifications, research proposals, or data analysis plans.1 Thus, controlled access can be viewed along a spectrum, from more open to more restrictive models.
1 At the extreme is a closed model in which access is available only to persons in an organization or research network. The committee viewed this model as so restrictive that for purposes of this report it is not discussed in depth.
The key argument in favor of open access is that removing barriers for those who seek access to data and not placing limitations on how data can be used will promote transparency, reproducibility, and more rapid advancement of new knowledge and discovery. Proponents of open access argue that it is more important to promote this potential for innovation—with the accompanying risk of invalid analyses—than to impose barriers that are too restrictive and impede potential scientific discovery and progress. Proponents argue that barriers to access in the past have led to invalid information in the medical literature, resulting in serious adverse public health consequences (see Table 3-1) (Godlee, 2009). Furthermore, proponents of open access believe that individuals and organizations with bad intentions could easily find ways to overcome the controls instituted by sponsors, and the controls would therefore serve only to slow the rate of scientific discovery and advancement without mitigating risks (Butte, 2014; Eichler, 2013; Wilbanks, 2014). Current proposals for restricted access and conditions of use have been criticized as deeply flawed: ambiguous wording and poorly specified provisions have mired those seeking secondary access in prolonged delays, legal risks, and lawsuits (Goldacre et al., 2014). It is also argued that data derived from research that is publicly funded or publicly subsidized (for example, through tax incentives or support for public universities) should be shared with the public that paid for it.
Some organizations, such as the European Medicines Agency (EMA) and the U.S. National Institutes of Health (NIH), in its Genomic Data Policy, employ a graded approach to data sharing, placing more controls on sharing data that are considered more sensitive (EMA, 2013; NIH, 2014). For example, the EMA plans to provide public access to redacted CSRs in a nondownloadable format and will provide the CSRs in a downloadable format only to known requesters who commit to using the data for scientific purposes only and to other conditions of use (EMA, 2014a).
The remainder of this chapter analyzes concerns about and the risks of sharing clinical trial data and the controls proposed for addressing them. The final section presents the committee’s recommendation on operational strategies for addressing these concerns and mitigating these risks. In general, the committee believes no single approach to access can be recommended at this time for all types of clinical trials.
Use of Shared Data for Another Company’s
Sponsors of clinical trials have serious concerns about competitors copying data packages that lack strong regulatory data protection. If competitors can obtain regulatory approval primarily on the basis of shared data and not their own work, companies and their investors may be reluctant to assume the high costs and risks of developing new therapies and carrying out the clinical trials required for regulatory approval. In the long run, patients and the public would suffer if the development of new therapies declined. These concerns have some empirical basis; the majority of early requests to the European Medicines Agency (EMA) to access clinical trial data were made by drug companies, lawyers, and consultants, not academic investigators (Rabesandratana, 2013).
What types of policies might be implemented to protect data from “unfair commercial use” is a difficult question. Some such policies may have unintended adverse effects on attempts to reanalyze data and merge data from other clinical trials. As of October 2014, the EMA’s data release policy included a contractual provision under which data requestors agree not to reuse the data to seek regulatory approval in other jurisdictions (EMA, 2014a). Although this provision is unlikely to pose difficulties for researchers, effective enforcement and sanctions may be difficult to implement. Additionally, only the person or entity that first accesses the data is bound by this contractual provision. If the data become available publicly, those who access the data are under no restrictions. Perhaps in recognition of the relatively limited effectiveness of a contractual provision, the EMA will place watermarks on published clinical report data “to emphasize the prohibition of its use for commercial purposes” (EMA, 2014a). The efficacy of a watermark-based approach remains to be seen. Further, according to its October 2014 statement, the EMA will consider “the nature of the product concerned, the competitive situation of the therapeutic market in question, the approval status in other jurisdictions, and the novelty of the clinical development” in making determinations regarding redaction (EMA, 2014a).
By not allowing data to be downloaded or copied, the Yale University Open Data Access (YODA) agreement with Johnson & Johnson goes a substantial step further, which may make it more difficult for researchers to aggregate data sets (for example, for meta-analyses) (YODA Project, 2014). Thus, proposals for such limitations should be approached cautiously, or other provisions should be made for providing data sets for meta-analyses. That being said, to the extent that data set providers use a common website, some aggregation may be possible. For example, although the website for sharing clinical trial data that Bayer, Boehringer Ingelheim, Eli Lilly, GlaxoSmithKline, Novartis, Roche, Sanofi, and ViiV Healthcare have agreed to use (www.ClinicalStudyDataRequest.com) does not allow data downloading, researchers working on the website are permitted to aggregate data from the different study sponsors.
The discussion thus far has assumed that a competitor would submit for marketing approval precisely the same molecule as that of the originator. However, certain jurisdictions, including the United States, allow applicants to submit modified molecules for approval through a pathway whereby they rely on approval of the originator’s product and submit new data relating to the modification. In the United States, this is known as the 505(b)(2) pathway (Minsk et al., 2010). In the United States, data exclusivity regimes adopted with generic molecules in mind appear to apply to the 505(b)(2) case, so this route cannot be used until the data exclusivity of the originator’s product expires.* However, such exclusivity may be absent in other countries (see also the discussion of data protection and exclusivity laws in Appendix C). Moreover, in the case of pathways similar to 505(b)(2) in other countries, these countries may actually require detailed clinical trial data, in which case public release of even redacted clinical study reports could result in a risk of competitive harm, although how significant this risk might be is unclear.
* 21 U.S.C. § 355(b)(2).
Various approaches to mitigating the risks of sharing clinical trial data are currently being implemented according to the interests, concerns, and resources of the organizations and individuals involved. As stated above, sharing analyzable data sets and the CSRs presents risks. The first risk is to participant privacy. Analyzable data sets and the CSRs contain identifiable data, and participants may be harmed if the data are not adequately de-identified and other appropriate privacy protections are not in place. Second, the CSRs may be used for “unfair commercial purposes,” such as wholesale copying of originator data sets for purposes of receiving regulatory approval in jurisdictions with limited regulatory data protection laws (see also Box 5-2). Such use of shared data could harm individual companies, the industry as a whole, and ultimately the public by reducing incentives to develop new therapies. Third, data recipients may perform and disseminate invalid secondary analyses as a result of misunderstanding the analyzable data set and its limitations or performing improper data analysis, or even intentionally. Invalid secondary analyses may lead to inappropriate conclusions about the safety and effectiveness of therapies, which may in turn harm patients. Lastly, investigators who conduct clinical trials want some assurance that they will receive appropriate professional credit for their work and publications resulting from additional analyses of the data they collected.
The committee is aware that the likelihood or severity of any of these risks will be specific to the circumstances of a given trial. For trials involving sensitive or stigmatizing conditions, for example, the risk to privacy is great; for trials of innovative first-in-class agents, the risk to commercial interests is large. For other trials, these risks may be relatively small. Consequently, the committee does not recommend one access model for all trials and types of clinical trial data but instead presents the rationale for using various controls on access to clinical trial data and recommends operational strategies for their use. Current practices for mitigating the risks of data sharing include the de-identification of data; making data available for inspection and analysis but not for downloading; registration and the use of data use agreements (DUAs); and review of data requests, including review by an independent third party. Box 5-3 lists which specific controls are designed to address the various specific risks; further elaboration is provided below.
Approaches to Mitigating the Risks of Sharing Individual Participant Data and Clinical Study Reports (CSRs)
Participant Privacy
- De-identification and application of other privacy-enhancing technologies and algorithms
- Required registration
- DUA clause prohibiting the re-identification or misuse of data
- Security protections
Unfair Commercial Use
- DUA clauses
- Watermarking CSRs
- Making CSRs nondownloadable
- Review of data requests/restrictions on who gets access
Invalid Secondary Analyses
- Review of data requests/restrictions on access based on qualifications and/or merit of research proposal
- DUA clause requiring public posting of analysis plan
- DUA clause allowing sponsor/clinical trialist to review analyses before publication
Credit for Clinical Trialists/Sponsors
- DUA clause to credit data generator in any publication
De-identification is commonly used to protect the privacy of participants in a clinical trial (see also Appendix B). Various jurisdictions may differ on the degree to which the risk of re-identification must be reduced for the data to be considered sufficiently de-identified to justify more widespread sharing, particularly in the absence of specific informed consent of the data subjects.2 In the United States, the Health Insurance Portability and Accountability Act (HIPAA) provides two methodologies for rendering health information “de-identified,” but it does not set a specific numerical threshold for unacceptable re-identification risk. Similarly, the European Union’s (EU’s) Data Protection Directive and similar directives around the world do not provide explicit guidelines for how data should be protected through de-identification or anonymization.3 In Sweden, any possibility, however theoretical, that data can be re-identified is sufficient to render the data identifiable.4 Thus, jurisdictions vary considerably in their standards for de-identification.
2 In general, privacy or data protection laws regulate or reserve the most stringent regulations for identifiable personal data—data that either directly or indirectly identify the data subjects. In the United States, for example, the Health Insurance Portability and Accountability Act (HIPAA) does not regulate health data that are “de-identified,” which is defined as “health information that does not identify an individual and with respect to which there is no reasonable basis to believe the information can be used to identify an individual” (45 CFR 164.514). The European Union’s Data Protection Directive 95/46, Recital 26, states that “the principles of data protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable.”
3 The U.K. Information Commissioner’s Office has published a code of practice providing examples of de-identification methods and issues to consider when assessing the level of identifiability of data, but it does not provide a full methodology or specific standards to follow (El Emam and Malin, 2014).
4 E-mail communication, M. Barnes, B. Bierer, and R. Li, Multi-Regional Clinical Trials (MRCT) Center, to A. Claiborne, Institute of Medicine, regarding comments to the Institute of Medicine questions on data sharing, April 1, 2014.
Frequently, the risk of re-identification depends on the context in which data are released. For example, are mitigating controls in place (e.g., release only in controlled environments or to recipients with strong data management policies and practices) that would reduce the likelihood of re-identification? What is the potential for harm or an invasion of privacy of the data subjects (e.g., due to sensitivity of the data) if re-identification were to occur? What are the possible motives and capacity of the recipient to re-identify the data? Providing some assurances to the public with respect to the risk of re-identification may require more than removing or masking direct or potential (quasi) identifiers.5 At a minimum, recipients of data from data sharing initiatives should commit to not intentionally re-identifying the data subjects, for example, through a DUA (see the section below). In addition, all holders of even de-identified or anonymized data should adopt reasonable security safeguards to help prevent inadvertent, unauthorized access.
De-identifying data does not eliminate all risk of re-identification, and reducing that risk to zero, as by coarsening the data or combining cells in the data set that contain few individuals, often destroys or significantly impairs the utility of the data for subsequent research.
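The trade-off between coarsening and utility can be illustrated with a small sketch. It is not drawn from the report: the record fields (`age`, `zip3`), the 10-year age bands, and the minimum cell size `k` are all hypothetical choices made for illustration. The sketch counts how many records share each combination of quasi-identifiers and flags the small cells, before and after exact ages are replaced with bands.

```python
from collections import Counter

def small_cells(records, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}

def coarsen_age(record, width=10):
    """Replace an exact age with a band such as '40-49' (illustrative only)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Hypothetical toy records; real trial data would have many more fields.
records = [
    {"age": 41, "zip3": "021", "outcome": 1},
    {"age": 47, "zip3": "021", "outcome": 0},
    {"age": 44, "zip3": "021", "outcome": 0},
    {"age": 68, "zip3": "021", "outcome": 1},
]

before = small_cells(records, ["age", "zip3"], k=3)
after = small_cells([coarsen_age(r) for r in records], ["age", "zip3"], k=3)
```

In this toy example every record is unique on exact age, so all four cells are small; after banding, only the single participant in the 60-69 band remains in a small cell. The cost is that any later analysis needing exact age is no longer possible, which is the loss of utility described above.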
Protecting privacy is a particular challenge in the era of “big data,” where the variety of data, the size of data sets, and the scope of data analysis are unprecedented. Inferences can be drawn about an individual even if there are no data about the individual that are traditionally considered identifying. Even if overt identifiers are removed from a data set, it may be possible to re-identify individuals by bringing auxiliary information from other sources to bear on the data set (Dwork, 2014). Moreover, it is possible to detect the presence of genomic DNA from a specific individual within an admixture of genomic DNA from many individuals (Homer et al., 2008). As one privacy scholar wrote, big data analytics “make certain facts newly inferable that anonymity promised to keep beyond reach” (Barocas and Nissenbaum, 2014, p. 56). Similarly, the President’s Council of Advisors on Science and Technology (PCAST) declared that anonymization is now not “sufficiently robust to be a dependable basis for privacy protection where big data is [sic] concerned” (PCAST, 2014).
Successful re-identification attacks on properly de-identified or anonymized health or clinical data are rare, but they happen.6,7 Reducing the risk of re-identification of data subjects is a valuable tool for ensuring that the benefits of data sharing outweigh the risks, but it should not be the only tool leveraged to protect the privacy of research participants. Entities storing clinical trial data for sharing should take advantage of innovations in data privacy and security protections and deploy additional safeguards to bolster protections against residual re-identification risk, for example, by deploying advanced cryptographic techniques when running analyses on encrypted data (Zeldovich, 2014) or relying on distributed data sets for analysis in lieu of centralized collection of data (which creates a single target for attack) (The White House Office of Science and Technology Policy and MIT, 2014). In addition, holders of clinical trial data may need to consider differential privacy, an approach that protects individuals by introducing calibrated random noise into a data set or into the results of analyses run on it (Dwork, 2014; The White House Office of Science and Technology Policy and MIT, 2014). Stakeholders in responsible sharing of clinical trial data need to keep up to date with emerging privacy protection techniques being developed by computer scientists.
6 The well-publicized re-identification of the medical records of the governor of Massachusetts depended on using the governor’s birthdate and zip code. These data would need to be removed from a de-identified data set under the HIPAA safe harbor requirements (DHS, 2005).
7 E-mail communication, M. Barnes, B. Bierer, and R. Li, Multi-Regional Clinical Trials (MRCT) Center, to A. Claiborne, Institute of Medicine, regarding comments to the Institute of Medicine questions on data sharing, April 1, 2014.
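As a rough illustration of the noise-based techniques mentioned above, the following sketch applies the Laplace mechanism, one standard way of achieving differential privacy, to a simple counting query. It is an assumption-laden example rather than a method described in this report: the function name `noisy_count`, the privacy parameter values, and the true count of 120 are all hypothetical.

```python
import random

def noisy_count(true_count, epsilon, rng):
    """Return a count with Laplace noise added (Laplace mechanism).

    A counting query changes by at most 1 when one person is added or
    removed, so its sensitivity is 1 and the noise scale is 1 / epsilon.
    The difference of two independent exponential draws with the same
    rate follows a Laplace distribution centered at zero.
    """
    scale = 1.0 / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

def mean_abs_error(samples, truth):
    """Average absolute distance between the noisy answers and the truth."""
    return sum(abs(s - truth) for s in samples) / len(samples)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical query: number of participants reporting an adverse event.
stronger_privacy = [noisy_count(120, 0.1, rng) for _ in range(5000)]
weaker_privacy = [noisy_count(120, 1.0, rng) for _ in range(5000)]
```

A smaller `epsilon` gives stronger privacy protection but noisier answers, so the released counts in `stronger_privacy` sit further from the true value on average than those in `weaker_privacy`. This is the same tension between protection and utility that applies to coarsening.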
Making Data Available for Use But Not Downloadable
Several data sharing programs are granting some access to clinical trial data to secondary users but not allowing them to download the data to their own computers. The EMA is allowing users to view data online after simple registration; to download data, secondary users must agree to additional conditions. The consortium of drug companies ClinicalStudyDataRequest.com does not allow secondary users to download individual participant data to their computers; analyses must be carried out using standard software programs on the website in a secure workspace (ClinicalStudyDataRequest.com, 2014a). This approach helps protect sponsors from secondary users’ carrying out analyses beyond those proposed in the data request, compromising participant privacy, or using data for their own regulatory submission. Secondary users can combine data from different clinical trials that are accessible on the website, for example, to carry out meta-analyses. However, secondary users may be concerned that this process may make their work more cumbersome or take longer.
Registration and Use of Data Use Agreements
recipient would pledge to use the data only for the purposes specified; not to disclose the data to others (except insofar as needed to assure research integrity and except under specific publication conditions); and not to try at any point to re-identify subjects” (SACHRP, 2013). Common provisions in DUAs that may reduce risks to various parties currently include
- prohibitions on any attempt at re-identification or contact of individual trial participants,
- prohibitions on further sharing of the data unless permitted or required,
- prohibitions on the use of shared data to support a competitor sponsor’s application for licensing of a product or new indication or for “unfair commercial use” (see Box 5-2),
- requirements to acknowledge in any publication or dissemination the trial whose data were shared so that the original trialists will gain appropriate professional credit for the value of their work for secondary analyses, and
- assignment of intellectual property rights for discoveries from the shared data.
Other provisions frequently included in DUAs are intended to enhance the scientific value of secondary analyses. They include
- requirements that secondary users seek to publish their analyses in peer-reviewed publications and make their statistical analysis plan available to other researchers;
- requirements to send copies of submitted manuscripts and publications to the trial investigators or study sponsor, with no right of revision or approval; and
- restrictions on using the data for purposes other than those originally proposed in the application to access the data.
Finally, to ensure that safety signals concerning medical products are used to protect the public health, DUAs may include requirements to notify industry sponsors and appropriate regulatory authorities of any findings that raise significant safety concerns.
The committee does not endorse all the above provisions in DUAs but believes that sponsors, funders, and intermediaries that hold and release clinical trial data should consider these provisions as potential options for increasing the benefits and reducing the risks of sharing clinical trial data. From a legal perspective, it is not clear whether and how these DUAs can be enforced if violated by secondary users, and the committee could not find any relevant case law. Nevertheless, the committee believes the terms in DUAs have a significant normative, symbolic, and deterrent value, setting standards for responsible behavior, even if their legal enforceability has not been tested in the courts.
Conclusion: DUAs are a useful strategy and best practice for increasing the benefits and mitigating the risks of sharing clinical trial data.
Review of Data Requests
Thus far, industry sponsors that have established data sharing arrangements have employed an additional level of control on data access beyond registration and use of DUAs: review of data requests, as employed, for example, by GlaxoSmithKline (GSK) and other sponsors via ClinicalStudyDataRequest.com (see Box 5-4). Proponents contend that review of data requests helps protect against invalid secondary uses of the data, which may occur for a number of reasons, including unfamiliarity with the data set and its limitations, invalid statistical methods, the inherent inaccuracy of safety signals obtained without prespecification of adverse effects, or analyses aimed at reaching a preconceived conclusion. These data request submissions are reviewed either by the sponsor (e.g., Merck and as endorsed in Biotechnology Industry Organization [BIO] principles for data sharing), by an independent panel that applies criteria set by the sponsor on a case-by-case basis (e.g., the companies signed on to ClinicalStudyDataRequest.com), or by an independent intermediary organization to which the sponsor has transferred authority and jurisdiction for reviewing data requests (e.g., Johnson & Johnson’s transfer of authority to Yale University Open Data Access [YODA]). The criteria used in reviewing requests may or may not be stated explicitly, and the review committee may or may not be publicly named (see Appendix D for a detailed description of the data sharing policies of the 12 largest pharmaceutical companies). Criteria for reviewing requests have focused on review of requesters’ qualifications and the scientific merit and validity of the proposed research. One sponsor’s independent review committee has published its experience during its first year (Strom et al., 2014).
ClinicalStudyDataRequest.com is a multisponsor Web system for requesting clinical trial data. It was launched in January 2014 and is based on a system initially launched in May 2013 by GlaxoSmithKline (GSK). Thus far, in addition to GSK, Bayer, Boehringer Ingelheim, Eli Lilly, Novartis, Roche, Sanofi, Takeda, UCB, and ViiV Healthcare have agreed to release data through the website.
The Web request system provides de-identified individual participant data from medicines that had received regulatory approval (in any country) or whose development had been terminated. As with the earlier system, ClinicalStudyDataRequest.com requires investigators to submit a research proposal to an independent review panel before a request for data is granted. The review panel (1) assesses whether the research proposal has a valid “scientific rationale and relevance to medical science or patient care” and (2) considers requesters’ qualifications (e.g., statistical expertise) and potential conflicts of interest (Nisen and Rockhold, 2013). Once the review panel has accepted a data request and investigators have signed a data sharing agreement, access to individual participant data, analyzable data sets, and supporting or metadata documents (including the protocol, statistical analysis plan, clinical study report, blank annotated case record form, and data specifications) is granted through a password-protected secure Internet connection. Data are not downloadable. Finally, investigators that analyze shared data are required to post their analysis plan publicly and, after the study is completed, to post summary results and seek publication in a peer-reviewed journal (Hughes et al., 2014).
Controlling Access Based on Qualifications
Sponsors that have implemented data sharing policies suggested to the committee that controlling access on the basis of qualifications may help reduce the risk of invalid secondary analyses. For example, requiring someone on the team to have expertise in biostatistics may reduce the likelihood that multiple analyses will be carried out without appropriate statistical correction. Bayer, Eli Lilly, GSK, Merck, and Roche all require that data requesters have a biostatistician on their research team before granting requests. However, it is difficult to judge an individual’s qualifications to carry out a proper statistical analysis. Some who are qualified to do so may not have a formal degree in statistics, while others who have a degree in statistics may lack the training or experience to carry out a rigorous biostatistical analysis of clinical trial data. Thus, a categorical requirement for formal biostatistics training may exclude some data user teams with appropriate expertise.
To address the problem of unwarranted malpractice claims driven by invalid secondary analyses, some have proposed restricting lawyers’ access to clinical trial data (see Box 5-5). As noted earlier, the EMA reports that lawyers were among those most frequently requesting clinical trial data from the EMA—far more frequently than academic researchers.8
8 Personal communication, Virtual WebEx Open Session, L. Brown and G. Fleming, to Committee on Strategies for Responsible Sharing of Clinical Trial Data, Institute of Medicine, regarding clinical trial data sharing: product liability, April 9, 2014.
Use of Shared Data for Malpractice Litigation
The benefits and risks of litigators’ use of shared clinical trial data are hotly debated.
Plaintiffs’ attorneys contend that malpractice suits have revealed serious problems with medical products that caused widespread harm to patients (e.g., COX-2 inhibitors, gabapentin).a From their perspective, lawsuits help protect patients from unsafe drugs and devices. Access to clinical trial data would help plaintiffs’ attorneys identify new risks of therapies and obtain further evidence through the discovery process. Noting that in U.S. federal courts, the Daubert rule excludes invalid scientific evidence from being admitted into trial, these attorneys believe that concerns about “rogue science” are misplaced. In their view, moreover, contingency fees provide powerful incentives to bring only lawsuits that are based on sound scientific evidence.a
Defense attorneys have a sharply different perspective on the sharing of clinical trial data. Post hoc analyses can significantly overstate the risks of a therapy by including multiple subgroup analyses, varying the endpoints of an analysis, or selecting studies for inclusion in a meta-analysis in a biased manner. In their view, the Daubert rule does not reliably exclude statistically invalid subgroup analyses that identify a group of patients as more likely than not to be harmed by a therapy.a Moreover, the Daubert rule is applied differentially by judges and may not be used in a state court where a lawsuit is brought. From the perspective of defense attorneys, lawyers should not be categorically excluded from access to clinical trial data; like other secondary users, they should be able to obtain access to the data under controlled access by submitting a sound data analysis plan.a From this perspective, moreover, plaintiffs’ lawyers do not need direct access to clinical trial data; they can obtain the data through the discovery process. In addition, plaintiffs’ attorneys can draw on published reports by researchers who have conducted secondary analyses of the data.
Restricting litigators from direct access to clinical trial data poses difficult implementation challenges. The Yale University Open Data Access (YODA) policy states that YODA will deny access to data sought for purposes of litigation, but it does not address how those who do not self-identify as litigators might be identified. Prohibiting the downloading or copying of data may deter litigators from using only one result from multiple subgroup analyses without acknowledging the statistical limitations of such use. As discussed earlier, however, such prohibitions may also deter useful data analysis and aggregation by researchers. Similarly, provisions in data use agreements that the data not be used for litigation are difficult to enforce. Nonetheless, there may still be a normative, symbolic, or deterrent value in such restrictions.
a Personal communication, Virtual WebEx Open Session, L. Brown and G. Fleming, to Committee on Strategies for Responsible Sharing of Clinical Trial Data, Institute of Medicine, regarding clinical trial data sharing: product liability, April 9, 2014.
However, trying to restrict lawyers’ access to clinical trial data may be impractical. Lawyers can arrange with academic researchers to obtain clinical trial data in order to seek evidence that sponsors knew about and failed to respond to serious adverse events. Furthermore, a lawyer may be part of a data requesting team that has an appropriate research question and analysis plan (Eichler, 2013).
Conclusion: Controlling access to data based on the requester’s qualifications is not effective for mitigating risks and may present barriers to some qualified teams of requesters.
Controlling Access Based on Data Access Requests
Several models for sharing clinical trial data entail reviewing the scientific rationale or purpose of data requests and the ability of the proposed research and data analysis plan to achieve the scientific objectives. The rationale for such review is to screen out data requests that lack a valid purpose or research or data analysis plan and therefore will not produce valid scientific knowledge that benefits the public. For example, requiring secondary users to prespecify a research question and submit a data analysis plan will reduce the risk of multiple comparisons leading to spurious conclusions because these requirements make it possible to identify secondary users who report a different analysis from what they originally proposed. However, at least one independent review panel does not see its role as performing scientific peer review, leaving that task to peer review of publications (Strom et al., 2014).
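The concern about multiple comparisons can be made concrete with a little arithmetic. The following is a minimal sketch, not a method described by the committee. It assumes that the tests are statistically independent, that every null hypothesis is true, and that each test uses the conventional significance level of 0.05. The function names are hypothetical and chosen only for illustration.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false-positive finding among `num_tests`
    independent tests when no true effect exists (assumed setting)."""
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the overall false-positive
    chance at or below `alpha` (Bonferroni correction)."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return alpha / num_tests


if __name__ == "__main__":
    # One prespecified analysis versus 5 or 20 unplanned subgroup analyses.
    for k in (1, 5, 20):
        print(k, round(familywise_error_rate(k), 3))
```

Under these assumptions, a single prespecified analysis carries a 5 percent chance of a spurious finding, whereas 20 unplanned subgroup analyses carry roughly a 64 percent chance of producing at least one. A prespecified plan lets reviewers see how many comparisons were intended and whether a reported result was among them.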
Review of a prespecified research question may also exclude secondary users whose analysis could benefit the public even though they lack a prespecified data analysis plan. For instance, an investigator may request a data set to generate new hypotheses for additional basic and clinical research but have no prespecified data analysis plan. As another example, a teacher of a biostatistics or evidence-based medicine course may wish to use a data set as a classroom exercise, to give students hands-on experience with analyzing clinical trial data.
Concerns about invalid secondary analyses are controversial. On the one hand, proponents of open science, who advocate public access to clinical trial data, argue that a free marketplace of ideas and vigorous debate are in the long run the best path to better understanding of clinical trial data (Goldacre, 2013a,b, 2014; Goldacre et al., 2014). From this perspective, sponsors and the original trial investigators may introduce serious bias into trials by withholding unfavorable data, manipulating variables, or carrying out inappropriate statistical analyses (Doshi et al., 2013). With some industry-sponsored trials, there is no peer review of the protocol or amendments. The resulting bias in the evidence base for clinical decisions harms patients. In this view, it is unfair to subject secondary analyses to more scrutiny than that received by the original trial. From this perspective, some invalid secondary analyses are an unavoidable side effect of sharing clinical trial data and gaining the benefits of correcting invalid original clinical trial reports and analyzing unpublished data. Moreover, proponents of open access argue that concerns about inaccurate secondary analyses are speculative (Doshi et al., 2013, p. 4), unlike documented cases of serious distortion in the evidence base for clinical care caused by biased published manuscripts and by unpublished data and trials (Doshi et al., 2013; Joober et al., 2012).
On the other hand, proponents of some controls over data access argue that biased primary analyses of clinical trial data and failure to publish unfavorable results will be addressed effectively by providing data to other investigators under controlled access. In their view, the reasonable controls currently being used by some drug manufacturers have not impeded the vast majority of requests for access to clinical trial data (Strom et al., 2014). Proponents of controls over access contend that with uncontrolled access to the CSRs and individual participant data, a secondary user of the data could carry out multiple analyses in an invalid manner. At least one team of clinical trialists has claimed that repeated challenges and accusations based on erroneous data and improper analyses can consume large amounts of time and effort; this team wrote that it wished to warn of the “dangers of open access to data when the peer review and editorial processes fail to do due diligence” (Wallentin et al., 2014).
Because a fundamental goal of responsible sharing of clinical trial data is to produce additional analyses that are scientifically valid, the committee believes some control over access to clinical trial data based on the research proposal may be beneficial, provided that the controls are not unduly burdensome for secondary users.
Conclusion: Controlling access on the basis of the purpose and/or scientific validity of the research proposal may be an effective strategy for mitigating risk, although overly restrictive controls are undesirable because they would inhibit valid secondary analyses and innovative scientific proposals.
Independent Review Panels
Use of review panels to control access to clinical trial data by reviewing and approving data requests raises important questions regarding implementation: Who decides whether data requestors gain access to the data, and what criteria are used to make the decision? One option is for the trial sponsor or investigator to make decisions about access. However, this arrangement may raise concern about conflicts of interest and bias and cause mistrust. Another option is for a “trusted intermediary” or “honest broker” to make the decisions. The intermediary may negotiate the conditions for data sharing (with the data provider retaining control over the data and their release) or take full responsibility for deciding who gets access and delivering the data to recipients (Mello et al., 2013). Trusted intermediaries may also accept and facilitate data analysis queries from secondary investigators if a model of “bringing the question to the data,” as discussed in Chapter 6, is adopted. Moreover, an independent oversight panel could have the authority to use its discretion to alter the timing of data release; examples of how an urgent public health need might justify earlier release of the analytic data set supporting a pivotal publication were offered in Chapter 4. If the review of data requests is carried out by such a group that is independent of the sponsor and investigators in the original trial, high-profile concerns raised in the past about sponsors placing undue barriers and delays in the path of data requests can be avoided (Doshi et al., 2012; Godlee, 2009).
Conclusion: It is best practice to designate an independent review panel, rather than the sponsor or investigator of a clinical trial, to be responsible for reviewing and approving requests for clinical trial data.
The use of independent review panels also raises a number of practical issues that need to be resolved, including selection of the panel members, administration, funding, and compensation for the panel’s services (ADNI, 2013; MRCT Center at Harvard, 2013; Nisen and Rockhold, 2013; PhRMA and EFPIA, 2013; YODA Project, 2013). Several large drug manufacturers have established programs for sharing clinical trial data using an independent review panel to determine access to the data. As stated above, GSK and nine additional industry sponsors have hired an independent panel of four members to review research requests (ClinicalStudyDataRequest.com, 2014b). Johnson & Johnson and Bristol-Myers Squibb have taken a slightly different approach and contracted with YODA and the Duke Clinical Research Institute, respectively. Currently, large pharmaceutical companies are paying for the cost of these services. However, investigators funded by public and nonprofit organizations generally lack the resources to establish their own independent review panels. Thus, public and nonprofit funders would need to provide support and funding to establish these panels for their investigators to use.
Composition of Independent Review Panels
In general, independent review panels are currently composed of persons with expertise in clinical research, clinical trials, biostatistics, and clinical medicine who have no conflicts of interest in deciding whether a data requester receives access. Some panels also have members with expertise in law and ethics, which is helpful if the panel is charged with reviewing whether consent forms from legacy trials allow data sharing.
Currently, many review panels established by pharmaceutical companies do not include representation of clinical trial participants, their communities, disease advocacy groups, or the public. But as discussed in Chapter 3, engaging these stakeholders and giving them a meaningful voice can help sponsors and investigators better understand their concerns and can suggest constructive ways of addressing those concerns and improving the sharing of clinical trial data generally.
There are several examples of how representatives of communities and patient and disease advocacy groups can contribute fresh perspectives and constructive ideas to research institutions and research projects. Furthermore, some research funders have required that patient and community representatives play formal roles in research organizations and projects. The 2005 National Research Council report Ethical Considerations for Research on Housing-Related Health Hazards Involving Children points out how community-based participatory research could enhance the scientific usefulness of such studies, reduce the risks of the research, and build trust among the communities being studied (NRC, 2005). The Centers for Disease Control and Prevention (CDC) has suggested that community-based participatory research may help increase the net benefits of research in cancer prevention and intervention by reducing disparities in cancer outcomes and by “addressing many of the challenges of traditional practice and research” (CDC and ATSDR, 1997; Simonds et al., 2013). The NIH-sponsored HIV Prevention Trials Network requires sites to have community advisory boards. Studies suggest that such community advisory boards may help investigators understand community concerns about proposed research and improve the informed consent process and consent forms (Morin et al., 2008). The California Institute for Regenerative Medicine, which provides public funding for stem cell research, has disease advocates serve on both the governing body and scientific panels that review grant applications (IOM, 2012). The 2014 NIH working group responding to the Institute of Medicine (IOM) report on the Clinical and Translational Science Awards (IOM, 2013) recommended having community or disease advocacy groups in all stages of planning for the awards and in oversight and administration (NCATS, 2014). In the health care delivery context, patients have been called on to play a key role. 
Taken together, these examples offer proof of the principle that stakeholders from disease advocacy groups and communities where research is carried out can enhance the mission of organizations funding and conducting clinical research.
Conclusion: Representatives of communities and patient and disease advocacy groups can contribute fresh perspective and constructive ideas to the bodies responsible for decisions about access to clinical trial data.
Transparency in Data Processes and Procedures
Steps can be taken to enhance the trustworthiness of data sharing programs in several ways. First, the criteria for data sharing and the process for determining access should be publicly available. This transparency allows others to review the criteria and process, compare criteria and policies from different sponsors, and identify best practices. For instance, best practices regarding data use agreements (DUAs) are likely to emerge as experience with different terms is shared. Public reporting of the number of requests for data and the number refused and for what reasons, as is done in the sharing program established by GSK, would further enhance trustworthiness (Strom et al., 2014); for example, the report on the program’s first year of work reveals that the vast majority of data requests are granted. Going beyond these summary data, YODA will publicly post all data requests; the requester’s research proposal and analysis plan; and, if access is denied, the reasons for refusal. This information will allow others to review the reasons for refusal and discuss whether they are appropriate.
Conclusion: It is best practice that policy and procedures regarding access to clinical trial data be transparent, including
- public reporting of the policies and procedures for sharing clinical trial data (including criteria for determining access and conditions of use), as well as the names of individuals making decisions about access and serving on the governing body of the unit determining access; and
- public reporting of a summary of the disposition of data sharing requests, including the number of requests and approvals and the reasons for disapprovals.
The experiences of early adopters of the sharing of clinical trial data will undoubtedly offer lessons and best practices from which others can learn. In fact, programs for sharing clinical trial data have already evolved in response to experience and feedback from stakeholders. For example, YODA now has revised policies regarding the sharing of clinical trial data based on its experience with data sharing for Medtronic and on public comments it solicited on its draft policies to share data for Johnson & Johnson (YODA Project, 2013). In another noteworthy example, the EMA conducted a series of meetings and solicited public comments on draft regulations and revised its policies accordingly (EMA, 2014b). Policies on sharing clinical trial data can be expected to continue to evolve in the future. Sponsors will try different approaches, and comparing the outcomes of these approaches will provide useful information on what does and does not work in various contexts. Moreover, approaches to data sharing are likely to change as new approaches to clinical trials are introduced. Finally, new issues and challenges are likely to emerge as more experience is gained with data sharing.
The committee drew together the deliberations detailed in this chapter with the following overarching recommendation:
Recommendation 3: Holders of clinical trial data should mitigate the risks and enhance the benefits of sharing sensitive clinical trial data by implementing operational strategies that include employing data use agreements, designating an independent review panel, including members of the lay public in governance, and making access to clinical trial data transparent. Specifically, they should take the following actions:
- Employ data use agreements that include provisions aimed at protecting clinical trial participants, advancing the goal of producing scientifically valid secondary analyses, giving credit to the investigators who collected the clinical trial data, protecting the intellectual property interests of sponsors, and ultimately improving patient care.
- Employ other appropriate techniques for protecting privacy, in addition to de-identification and data security.
- Designate an independent review panel—in lieu of the sponsor or investigator of a clinical trial—if requests for access to clinical trial data will be reviewed for approval.
- Include lay representatives (e.g., patients, members of the public, and/or representatives of disease advocacy groups) on the independent review panel that reviews and approves data access requests.
- Make access to clinical trial data transparent by publicly reporting
  - the organizational structure, policies, procedures (e.g., criteria for determining access and conditions of use), and membership of the independent review panel that makes decisions about access to clinical trial data; and
  - a summary of the decisions regarding requests for data access, including the number of requests and approvals and the reasons for disapprovals.
- Learn from experience by collecting data on the outcomes of data sharing policies, procedures, and technical approaches (including the benefits, risks, and costs), and share information and lessons learned with clinical trial sponsors, the public, and other organizations sharing clinical trial data.
References
ADNI (Alzheimer’s Disease Neuroimaging Initiative). 2013. ADNI overview. http://www.adni-info.org/Scientists/ADNIOverview.aspx (accessed October 17, 2014).
Barocas, S., and H. Nissenbaum. 2014. Big data’s end run around anonymity and consent. In Privacy, big data, and the public good: Frameworks for engagement, edited by J. Lane, V. Stodden, S. Bender, and H. Nissenbaum. New York: Cambridge University Press. Pp. 44-75.
Butte, A. 2014. Clinical trials data sharing and the ImmPort experience. Paper presented at IOM Committee on Strategies for Responsible Sharing of Clinical Trial Data: Meeting Two, February 3-4, Washington, DC.
CDC (Centers for Disease Control and Prevention) and ATSDR (Agency for Toxic Substances and Disease Registry). 1997. Committee on community engagement: Principles of community engagement. http://www.cdc.gov/phppo/pce (accessed October 17, 2014).
ClinicalStudyDataRequest.com. 2014a. How it works: Access to data. https://clinicalstudydatarequest.com/How-it-works-Access.aspx (accessed November 24, 2014).
ClinicalStudyDataRequest.com. 2014b. Members of the independent review panel. https://clinicalstudydatarequest.com/Members-of-the-Independent-Review-Panel.aspx (accessed November 24, 2014).
DHS (U.S. Department of Homeland Security). 2005. Statement of Latanya Sweeney, PhD, associate professor of computer science, technology and policy, director, data privacy laboratory, Carnegie Mellon University, before the privacy and integrity advisory committee of the Department of Homeland Security (“DHS”), “privacy technologies for homeland security.” http://www.dhs.gov/xlibrary/assets/privacy/privacy_advcom_06-2005_testimony_sweeney.pdf (accessed December 8, 2014).
Doshi, P., T. Jefferson, and C. Del Mar. 2012. The imperative to share clinical study reports: Recommendations from the Tamiflu experience. PLoS Medicine 9(4):e1001201.
Doshi, P., K. Dickersin, D. Healy, S. S. Vedula, and T. Jefferson. 2013. Restoring invisible and abandoned trials: A call for people to publish the findings. British Medical Journal 346:f2865.
Dwork, C. 2014. Differential privacy: A cryptographic approach to private data analysis. In Privacy, big data, and the public good, edited by J. Lane, V. Stodden, S. Bender, and H. Nissenbaum. New York: Cambridge University Press. Pp. 296-322.
Eichler, H. G. 2013. Selecting data sharing activities. Paper presented at IOM Committee on Strategies for Responsible Sharing of Clinical Trial Data: Meeting One, October 22-23, Washington, DC.
El Emam, K., and B. Malin. 2014. Concepts and methods for de-identifying clinical trials data. Paper commissioned by the Committee on Strategies for Responsible Sharing of Clinical Trial Data (see Appendix B).
EMA (European Medicines Agency). 2013. Publication and access to clinical-trial data. http://www.ema.europa.eu/docs/en_GB/document_library/Other/2013/06/WC500144730.pdf (accessed October 15, 2014).
EMA. 2014a. European Medicines Agency policy on publication of clinical data for medicinal products for human use. http://www.ema.europa.eu/docs/en_GB/document_library/Other/2014/10/WC500174796.pdf (accessed November 17, 2014).
EMA. 2014b. Finalisation of the EMA policy on publication of and access to clinical trial data—targeted consultation with key stakeholders in May 2014. http://www.ema.europa.eu/docs/en_GB/document_library/Report/2014/09/WC500174226.pdf (accessed October 17, 2014).
Godlee, F. 2009. We want raw data, now. British Medical Journal 339:b5405.
Goldacre, B. 2013a. Are clinical trial data shared sufficiently today? No. British Medical Journal (Clinical Research Edition) 347:f1880.
Goldacre, B. 2013b. My comments to IOM on IPD sharing. In bengoldacre: passing thoughts. http://bengoldacre.tumblr.com/post/64880190003/my-comments-to-iom-on-ipd-sharing (accessed December 16, 2014).
Goldacre, B. 2014. Commentary on Berlin et al. Clinical Trials 11(1):15-18.
Goldacre, B., F. Godlee, C. Heneghan, D. Tovey, R. Lehman, I. Chalmers, V. Barbour, and T. Brown. 2014. Open letter: European Medicines Agency should remove barriers to access clinical trial data. British Medical Journal 348:g3768.
Homer, N., S. Szelinger, M. Redman, D. Duggan, W. Tembe, J. Muehling, J. V. Pearson, D. A. Stephan, S. F. Nelson, and D. W. Craig. 2008. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics 4(8):e1000167.
Hughes, S., K. Wells, P. McSorley, and A. Freeman. 2014. Preparing individual patient data from clinical trials for sharing: The GlaxoSmithKline approach. Pharmaceutical Statistics 13(3):179-183.
Immune Tolerance Network. 2014. Public studies on ITN TrialShare. https://www.itntrialshare.org/public/catalog.html (accessed November 24, 2014).
IOM (Institute of Medicine). 2012. The California Institute for Regenerative Medicine: Science, governance, and the pursuit of cures. Washington, DC: The National Academies Press.
IOM. 2013. The CTSA program at NIH: Opportunities for advancing clinical and translational research. Edited by A. I. Leshner, S. F. Terry, A. M. Schultz, and C. T. Liverman. Washington, DC: The National Academies Press.
Joober, R., N. Schmitz, L. Annable, and P. Boksa. 2012. Publication bias: What are the challenges and can they be overcome? Journal of Psychiatry & Neuroscience 37(3):149-152.
Krumholz, H. M., and E. D. Peterson. 2014. Open access to clinical trials data. Journal of the American Medical Association 312(10):1002-1003.
Mello, M. M., J. K. Francer, M. Wilenzick, P. Teden, B. E. Bierer, and M. Barnes. 2013. Preparing for responsible sharing of clinical trial data. New England Journal of Medicine 369(17):1651-1658.
Minsk, A., L. Nguyen, and D. R. Cohen. 2010. The 505(b)(2) new drug application process: The essential primer. FDLI Monograph Series 1(6).
Morin, S. F., S. Morfit, A. Maiorana, A. Aramrattana, P. Goicochea, J. M. Mutsambi, J. L. Robbins, and T. A. Richards. 2008. Building community partnerships: Case studies of community advisory boards at research sites in Peru, Zimbabwe, and Thailand. Clinical Trials 5(2):147-156.
MRCT (Multi-Regional Clinical Trials) Center at Harvard. 2013. Proceedings: Multi-Regional Clinical Trials Center (MRCT) at Harvard 2nd annual meeting. Paper read at Multi-Regional Clinical Trials Center (MRCT) at Harvard 2nd Annual Meeting, December 4, 2013, Cambridge, MA.
NCATS (National Center for Advancing Translational Sciences). 2014. NCATS Advisory Council Working Group on the IOM report: The CTSA program at NIH. Bethesda, MD: NCATS.
NIH (U.S. National Institutes of Health). 2014. NIH issues finalized policy on genomic data sharing. http://www.nih.gov/news/health/aug2014/od-27.htm (accessed October 14, 2014).
Nisen, P., and F. Rockhold. 2013. Access to patient-level data from GlaxoSmithKline clinical trials. New England Journal of Medicine 369(5):475-478.
NRC (National Research Council). 2005. Ethical considerations for research on housing-related health hazards involving children. Washington, DC: The National Academies Press.
PCAST (President’s Council of Advisors on Science and Technology). 2014. Big data and privacy: A technological perspective. http://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf (accessed November 17, 2014).
PhRMA (Pharmaceutical Research and Manufacturers of America) and EFPIA (European Federation of Pharmaceutical Industries and Associations). 2013. Principles for responsible clinical trial data sharing: Our commitment to patients and researchers. http://www.phrma.org/sites/default/files/pdf/PhRMAPrinciplesForResponsibleClinicalTrialDataSharing.pdf (accessed October 16, 2014).
Rabesandratana, T. 2013. Europe. Drug watchdog ponders how to open clinical trial data vault. Science 339(6126):1369-1370.
SACHRP (Secretary’s Advisory Committee on Human Research Protections). 2013. SACHRP comment regarding the June 4, 2013 FDA request for comment relating to the availability of masked and de-identified non-summary safety and efficacy data. http://www.hhs.gov/ohrp/sachrp/commsec/attachmentasecretarialletter21.pdf (accessed October 24, 2014).
Simonds, V. W., N. Wallerstein, B. Duran, and M. Villegas. 2013. Community-based participatory research: Its role in future cancer research and public health practice. Preventing Chronic Disease 10:E78.
Strom, B. L., M. Buyse, J. Hughes, and B. M. Knoppers. 2014. Data sharing, year 1—access to data from industry-sponsored clinical trials. New England Journal of Medicine 371:2052-2054.
Wallentin, L., R. C. Becker, C. P. Cannon, C. Held, A. Himmelmann, S. Husted, S. K. James, H. S. Katus, K. W. Mahaffey, K. S. Pieper, R. F. Storey, P. G. Steg, and R. A. Harrington. 2014. Review of the accumulated PLATO documentation supports reliable and consistent superiority of ticagrelor over clopidogrel in patients with acute coronary syndrome. Commentary on: DiNicolantonio JJ, Tomek A, Inactivations, deletions, non-adjudications, and downgrades of clinical endpoints on ticagrelor: Serious concerns over the reliability of the PLATO trial, International Journal of Cardiology, 2013. International Journal of Cardiology 170(3):E59-E62.
The White House Office of Science and Technology Policy and MIT (Massachusetts Institute of Technology). 2014. Big data privacy workshop: Advancing the state of the art in technology and practice, workshop summary report. Cambridge, MA: MIT.
Wilbanks, J. 2014. Open access to clinical data. Paper presented at IOM Committee on Strategies for Responsible Sharing of Clinical Trial Data: Meeting Two, February 3-4, Washington, DC.
YODA (Yale University Open Data Access) Project. 2013. Yale University Open Data Access (YODA) Project policy for the public availability of RHBMP-2 clinical trial data. New Haven, CT: Yale University Center for Outcomes Research and Evaluation.
YODA Project. 2014. Yale University Open Data Access (YODA) Project procedures to guide external investigator access to clinical trial data. New Haven, CT: Yale University Center for Outcomes Research and Evaluation.
Zeldovich, N. 2014. Using cryptography in databases and web applications. http://web.mit.edu/bigdata-priv/pdf/Nickolai-Zeldovich.pdf (accessed December 8, 2014).