Key Messages Identified by Individual Speakers
• The public expresses a strong demand for biomedical innovation, but privacy remains a concern, and regulations designed to protect the privacy of personal health information can impede data sharing.
• Incentives in academia to keep data private for purposes of professional advancement can hinder data sharing, but new models for research allow competition to continue in more open environments.
• Means of ensuring the quality of secondary analyses of shared data prior to publication would help ease concerns, particularly for those in industry, regarding misuse of shared data. The development of trust relationships, not only between patients and researchers but also among researchers, can enable progress that contractual agreements cannot achieve.
• Policies that mandate data sharing are often not observed.
• Technical challenges to data sharing may be more easily addressed by making arrangements to share participant-level clinical data and implementing data standards at the outset of a trial.
Despite the widely acknowledged benefits to be gained by sharing research data, significant barriers to sharing remain to be addressed. Several speakers described these barriers, and others mentioned them in
passing. Some barriers, such as the need to maintain patient privacy, are common to all organizations, but others apply more strongly to some organizations than others. For example, researchers in private industry (and their partners or potential partners in academia) are more concerned about protecting proprietary information, while academic researchers are more interested in keeping data under their control to generate publications and professional acclaim. These differing incentives complicate efforts to disseminate data more widely, as described in the chapters on models for sharing clinical data (Chapter 4), standardization (Chapter 5), and changing the culture of research (Chapter 6).
Jennifer Geetter, a partner at McDermott Will & Emery, talked about patient privacy considerations and regulations as barriers to data sharing. Despite a desire to see innovation and progress in biomedical research, the public remains very concerned about a potential loss of privacy. However, the exact nature of these concerns is not well understood, Geetter observed. Are people afraid their personal information will be used in ways they do not intend? Are they afraid their information will be used in ways that have adverse employment or insurance consequences? “It is difficult to know,” said Geetter, “but everyone out there perceives a real privacy concern.”
Geetter described two opposing approaches to privacy protection. The first holds that access to information should be very restricted with a presumption of nondisclosure. For example, the classic doctor–patient relationship assumes this model, which obviously is a significant impediment to data sharing. The second model balances confidentiality with socially useful sharing and disclosures of data using generally accepted rules for doing so. The public is ambivalent about which of these models to adopt, said Geetter. Despite this ambivalence, Geetter encouraged researchers to involve patients in data-sharing decisions. When patients are asked to share their information, are given a voice in that process, and are thanked for their participation, they are more likely to choose to share health information, she said.
A good deal of legal uncertainty exists regarding privacy regulations and the associated impacts on data sharing, said Geetter. At the time of the workshop, several regulations were being developed and revised that affect data sharing. Release of modified rules promulgated under the
Health Insurance Portability and Accountability Act (HIPAA),1 which protect the privacy of individually identifiable health information, was imminent and, Geetter noted, will influence the sharing of medical information for research as well as health care. Also, the Common Rule,2 which provides protection for human research subjects, was undergoing “a massive upgrade for the 21st century,” according to Geetter. For example, a proposed change to the rule dealt with whether human biospecimens could ever be considered de-identified, which would have a substantial effect on research. To further complicate matters, there is a lack of harmonization among regulations: Food and Drug Administration (FDA) regulations, the Common Rule, and HIPAA do not agree on aspects of data sharing. In addition, many states have their own rules, which tend to be disease specific and can include provisions related to data sharing.
Another uncertainty mentioned by Geetter related to the implications of the FDA’s “Part 11” rule,3 which covers the submission of electronic data to the FDA. Parts of the rule are currently enforced, but other parts are not. This affects data sharing because most data sharing involves electronic data, which is likely to be included in future FDA submissions.
The “preparatory to research exception” is a HIPAA pathway that allows sharing of protected health information (PHI) without an individual’s authorization in order to prepare for a clinical trial. However, data sharing using this pathway is limited by a restriction preventing PHI from leaving the premises of a covered entity. “This made sense when you were looking at dusty paper medical records,” Geetter noted. “It does not make as much sense, in my view, when you are talking about electronic data that may never be at the covered entity’s premises to begin with” because the data may be in an electronic health record held by a vendor who uses cloud computing services.
Geetter concluded by observing that the line between clinical care and research is growing progressively more blurred, especially as electronic health record systems become more common. Opportunities for systematic data collection across institutions will become more plentiful as these systems become increasingly interoperable. Data generated during clinical care, in addition to that resulting from research studies, may become available for mining. However, competing public priorities will likely place limits on the ability of that data to be viewed and shared. For example, physicians may be encouraged to use that data to troubleshoot the care they provide, but privacy advocates may oppose making it available for discovery purposes.
1 45 CFR 160, 164.
2 45 CFR 46.
3 21 CFR 50, 56.
Need for Incentives
Andrew Vickers, attending research methodologist in the Department of Epidemiology and Biostatistics at the Memorial Sloan-Kettering Cancer Center, has experienced firsthand the reluctance of investigators to share clinical research data; he described several such instances. In one case, data from the control arm of a trial funded by the National Institutes of Health (NIH) were needed to help design a new study, but the response to his request was “I am not prepared to release the data at this point.” Another time, when data were needed for a meta-analysis of results from trials involving chemotherapy, he was told, “I would love to send you these data but my statistician won’t allow it.” In a third case, Vickers was developing a statistical model for predicting response to treatment for specific patient populations and needed access to a dataset from another NIH-funded trial. Despite providing the investigators with numerous reassurances, including that the data would be used only for a statistical methodological study, that the paper would explicitly state that no clinical conclusions should be drawn from the analysis, that the data would be slightly corrupted, and that he would send a draft of the paper to the investigators and give them veto power, “We never heard back from them,” he said.
Among the explanations Vickers received for refusal to share data was the cost and trouble of putting datasets together, typified by the comment, “I would love to help you, but it would take too much time.” Vickers labeled this argument unacceptable. In the case of the 10 papers surveyed in his study of data requests to PLoS authors (Savage and Vickers, 2009), the authors had just published results based on the requested datasets. He questioned how authors could publish a study without having a clean and well-annotated dataset, which could easily be distributed on request. “You have to do this anyway, right?” he said.
Other arguments against sharing data had more validity, Vickers acknowledged. For example, career advancement in an academic setting depends on the ability of investigators to generate publications from data they might have spent years collecting. This concern was raised repeatedly by speakers throughout the workshop as a major barrier to sharing for those working in academia.
In response to his frustrations with the current problems with data access, Vickers wrote an essay in the New York Times drawing attention to this cultural barrier and pointing out the moral obligation that researchers have to share data that have been collected from patients who volunteered to participate in clinical trials, in part, for the benefit of future patients. In addition to the essay, Vickers co-authored a study of data sharing by authors publishing in Public Library of Science (PLoS) journals. Despite the journals’ data-sharing policies, which require authors to share their raw data, only 1 of the 10 requested datasets was received (Savage and Vickers, 2009), indicating that additional incentives or enforcement mechanisms are necessary to change the culture surrounding data sharing.
John Ioannidis, C.F. Rehnborg Chair in Disease Prevention at Stanford University, also discussed the inadequacy of current incentives and policies promoting data sharing. A study by Ioannidis and his colleagues of data-sharing policies at the 50 scientific journals with the highest impact factors found that most have policies in place for making data available (Alsheikh-Ali et al., 2011). However, when the authors sampled 10 papers from each journal, they found that few authors had actually deposited the data summarized in the papers. “Even though there is a lot of interest and a lot of investment in trying to make data sharing work, in practice we still have some ways to go,” Ioannidis said.
Fears Regarding Misuse of Shared Data
Another common concern raised during the workshop, particularly by those in industry, was how clinical data will be reused once they become more accessible. One fear was that data will be misused or misinterpreted if, for example, too little attention is paid to how the data were collected and analyzed or to the nature of the patient population. Such misinterpretations can be published outside the peer-reviewed literature, where standard quality controls do not apply. Ensuring that secondary analyses of data are done responsibly is an important issue, said Michael Rosenblatt, executive vice president and chief medical officer of Merck & Co., Inc., because in many cases the information that comes out of a secondary review will not be subject to the same kind of scrutiny and peer review as the original analysis. But, said Elizabeth Loder of BMJ,
in the long view, “we should have confidence in the fact that eventually science is self-correcting.” It may take a long time, during which multiple and competing analyses coexist. “The anxiety that we feel about many different people looking at the data and coming up with different interpretations is somewhat misplaced. I do not think we should be afraid of that future.”
Other workshop participants worried that researchers will not have the time to rebut all misconceived analyses. Incorrect information can have extremely harmful results, observed Robert Califf of Duke University Medical Center. “There will be consequences of people being killed by poor use of data because, if it hits the news, a lot of people will stop taking their medications … based on what is in the news,” he said. “You can kill people with bad science very quickly. It is a problem that we are going to have to grapple with.”
Kelly Edwards, acting associate dean at the graduate school and associate professor in the Department of Bioethics and Humanities at the University of Washington, described some of the cultural barriers to greater data sharing—along with several ways of overcoming those barriers.
Contractual arrangements may be necessary for data sharing, but they are insufficient, said Edwards. Trust also needs to be relational, with contracts serving to punctuate what has already been agreed to rather than constituting the sum total of how a relationship will work. Different elements enter into relational trust. In some cases, people share core values and interests or are committed to a common cause. Someone may have another person’s best interests in mind, as in the doctor–patient relationship. Relational trust can be built on transparent and consistent (or logical) rules, and trust can
depend on associations. “When someone else I trust trusts you, I can also get in the door.”
Reemphasizing a point made by several other participants, Edwards noted that the promotion system in academic settings, which relies on individual credit for grants awarded and papers published, interferes with the establishment of trust relationships and can be antagonistic to data sharing. However, Edwards pointed to several developments that can allow rewards, acknowledgment, and attribution to coexist in a more open research system. The old models of medical research, in which data are kept close to the chest, are beginning to crumble, she said, and new models can be built that still promote competition, but in a more open research space. Other industries are helping to drive more open systems and the democratization of data, such as the information technology industry’s move toward mobile and cloud computing. “This culture shift has happened already in other fields,” said Edwards. “Let’s move it into health research.”
One way to encourage openness is to stay grounded in traditional research ethics. The 1978 Belmont Report, subtitled Ethical Principles and Guidelines for the Protection of Human Subjects of Research, referred to respect for persons, beneficence, and justice (HEW, 1979). These principles also can be applied to the emergence of more open research systems. For example, respect for persons can be embodied in both partnerships and communication. As a concrete example, Edwards mentioned the simple step of thanking research participants. “I am taking an informal poll of researchers I work with on how often they just say thank you to the participants who are in their studies, and it is embarrassingly low numbers,” she said. “Simply saying thank you can go such a long way.”
Regulations provide a minimum standard for behavior, Edwards said, and researchers need to do more than just what regulations mandate. Thus, data-sharing policies can provide a scaffolding, but the research community needs to set standards of excellence and strive to meet those standards.
Planning for sharing participant-level data at the outset of a study is important, Ioannidis pointed out. A more difficult, or impossible, task is to unearth data after the paper describing those data has been published. Far
better, he said, is to arrange upfront for the full individual-level data to be available. Issues such as coding, cleaning, and logical queries take some time to resolve, but their difficulty is probably overrated. Post-hoc efforts to standardize data can also prove challenging and costly and may limit the usefulness of such data (see Chapter 5). “If we wait to see what we can do after the fact, it is very difficult.” He was involved in one study in which the investigators sought to repeat the analyses of microarray expression studies published in Nature Genetics using the datasets deposited with the papers (Ioannidis et al., 2009). Four independent teams of microarray analysts could reproduce only 2 of the 18 tables and figures from the papers. Much of the time, key information was not available, despite a precondition to publication in the journal that data be made available to independent investigators.
Some fields have adopted strong principles of data sharing. One of the best examples is the field of human genomics, which has principles on how to share information among all investigators working in the same area and, in some cases, with other investigators and the public. Without those standards in place, said Ioannidis, the “fantastic growth” in the field of human genomics would have occurred much more slowly. “Clinical trials could learn from such paradigms and try to adopt them.”