Different Interpretations of Existing Standards
A fundamental issue in the debate over the sharing of publication-related data, information, and materials is whether exceptions should be made to a community standard if the progress of science might be advanced by them. Participants at the workshop expressed different opinions as to whether a narrow or broad interpretation of a putative community standard is appropriate. To illustrate more clearly the strain that exceptions place on the publication process, some of the common arguments in favor of exceptions are presented here and examined in the context of the principles of publication put forward in this report.
The essential “value” of a scientific publication resides in the scientific finding and its implications as seen by the peer scientific community. If the finding is viewed as important, a journal’s readers may be willing to accept partial disclosure of supporting data and materials by the author so as not to delay awareness of the finding having been made.
Journal editors make decisions routinely about how much information the authors must provide to support their findings, in part on the basis of what will satisfy their readers. But progress in science is most efficient when an author’s peers can critically evaluate published findings. Announcing a finding without making available the data or materials that are integral to the publication (that is, necessary to support the major claims of a paper and allow knowledgeable peers to validate or refute the major claims) may be appropriate for an advertisement or press release, but it is not appropriate for scientific publication.
An author should only need to disclose or share that which is required to reproduce and validate the published result, nothing more, nothing less.
Presenting results with enough detail so that they can be repeated might be the minimum a journal officially requires of an author, but it is substandard from the perspective of the community, particularly if repeating the work will be labor intensive. Scientists do not read others’ papers in order to repeat those experiments; rather, they read articles to find insights and gain knowledge that allows them to move forward from that point. Only when scientists are unable to build successfully on the results of a paper are they inclined to repeat the author’s experiments. Taking the stance that authors need only furnish what is necessary to repeat their experiments removes the value of the cumulative process of science and, considering how science is conducted today, is unrealistic. It is not possible, for example, to get a public research grant to repeat the experiment of another scientist.
Partial access to data in a publication is better than no access at all.
It has been proposed that providing data on a private Web site, with no limit on what can be viewed but with limits on the amount of data that can be downloaded at one time, satisfies the quid pro quo of publication. However, if the data are central or integral to the reported findings, this arrangement violates the spirit of the fundamental principle of allowing other researchers to replicate, verify, and build on the findings. Researchers need to be able to manipulate, query, and transform the data that support a publication’s findings so that they can build on them.
Some categories of materials are difficult, time-consuming, or expensive to reproduce; therefore, requiring authors to share them is unreasonable.
The community has never required an author to provide extensive or ongoing technical support for a requester of materials. If materials are scarce or difficult to replicate, a request could be reasonably met by providing requestors with detailed protocols and advice regarding their synthesis, or with information on how they were obtained. Stock centers and repositories (such as museum collections and herbaria, the Jackson Laboratory, and the American Type Culture Collection) constitute another appropriate alternative for distribution or maintenance of certain materials and are standard for systematic and evolutionary biology, including paleontology.
Because large-scale data assemblies can be extremely expensive to produce initially, authors should not have to make them available free.
Part of the responsibility of publishing is to share what is integral or central to the findings of a publication in exchange for credit and acknowledgment of research achievements. The commercial market has mechanisms other than publishing for making data available by subscription. The costs associated with distributing and updating a publication-related dataset might reasonably be charged to users if no public archive is available, but it is not appropriate to impose a charge to recoup the original costs of production. In the future, National Institutes of Health (NIH)-funded scientists might be able to request supplemental support to an existing grant to help distribute data (NIH, 2003).
It is not always possible to distribute materials or freely share data because of prior contractual agreements made with a research sponsor.
If a contractual agreement (for example, between an academic researcher and an industrial sponsor of the research) prohibits such sharing, the researcher should not publish. Researchers who wish to publish should avoid entering into contractual agreements that prevent unconstrained sharing of publication-related data or materials.
Researchers should have an exclusive right to analyze, or “mine,” the data they produce for a specified period after publication. The delayed release of full datasets in some disciplines should be permitted.
Science moves forward most rapidly when the research community has the ability to view and use all the data integral to a published research finding immediately on publication. There is precedent for placing a time-limited hold on some aspects of data, such as atomic coordinates in crystallography, but the discipline has moved steadily away from that practice. The adoption of such a moratorium by a particular community can be justified, at best, only as an interim step toward the goal of full release upon publication.
Young investigators will easily be scooped and their careers potentially will suffer if they are required to share data or materials related to their publications. Researchers just starting their academic careers should be granted a moratorium on sharing to prevent them from being overrun by their competitors.
Participating in the scientific enterprise involves agreeing to “do the right thing,” which entails some risks. Young investigators abide by community standards and risk facilitating their competitors’ research because there is an equal, if not greater, probability that they will be the beneficiaries of unrestricted sharing by others. A principal investigator can certainly try to protect the interests of graduate students or postdoctoral fellows, for example, by asking other investigators to become collaborators, and to use data or materials other than those being used by a young investigator in the laboratory. However, the sharing of materials cannot be made contingent on a promise by the recipient to enter into collaboration or to avoid competing with the one who supplies the materials. Furthermore, granting a special exception to some researchers is problematic for a variety of reasons, including the difficulty of deciding who qualifies for the exception. Who is more vulnerable: a starting researcher who has just finished a postdoctoral position in a famous laboratory, or a late-career researcher whose only research grant has just been turned down for renewal? The exposure to both benefit and risk associated with competitive activities triggered by publication must be shared equally by all participants in the publication process.
Authors should have the right to request a collaboration or coauthorship of future publications in exchange for publication-related materials, particularly if the materials are scarce or difficult to produce.
One of the advantages of publishing is that it may lead to new and fruitful collaboration. Authors can pursue mutual research interests with requestors of materials, but it is not reasonable for an author to demand a scientific collaboration or coauthorship of a future publication in exchange for material. The principle of publication implies that authors must make their materials accessible on terms that do not interfere with a recipient’s work. However, authors who provide data or materials should be acknowledged for their contributions (see Chapter 6).
The life-sciences community must recognize the growing role of the for-profit sector in basic research and acknowledge differences in culture and tradition between academe and industry. If the for-profit sector is not treated exceptionally, it may choose not to publish and instead charge for access to its data or materials. Moreover, some fields of the life sciences, such as plant biology, have received less public funding than others; in these fields, the for-profit sector is likely to be generating the most cutting-edge data and scientific findings. If there is no competing public effort that will make the same data public eventually, it is not realistic to think that companies will disclose data without some incentive or special exception.
By requiring sharing of publication-related materials and data, the scientific community does risk the possibilities that the for-profit sector will choose not to publish some of its findings and that some data or materials will be made available only at a cost. But allowing data to be provided on terms that do not meet community standards also has a cost: the accumulation of fragmented datasets that are difficult to validate, compare, search, or combine with the data in public repositories. In addition, allowing data to be made available to companies on different terms from those given to academics places another burden on the publication system. That fact might not concern the academic community, but it is a double standard that makes it difficult to expect the commercial sector to share data and materials as “freely” as academics claim to do.
There may also be data and materials in the for-profit sector of which the larger community is not even aware, because they are too commercially valuable to be shared. It is possible, therefore, that loosening standards for sharing publication-related data and materials (for example, by allowing companies to make them available only under restrictive licenses) will not lead the for-profit sector to disclose via publication much more than it does now, because such disclosure is likely still to entail too many commercial risks.
Whether or not the short-term gain of partial and restricted access to data is worth the long-term setbacks to the system of publication is a matter of debate and difficult to prove. However, other ways exist to facilitate access to data or materials generated in the for-profit sector that might result in greater benefits to the scientific community than making exceptions to community standards and principles. These include, for example, the creation of private consortia (such as the SNP [single nucleotide polymorphism] Consortium) and public-private consortia (such as the consortium to accelerate sequencing of the mouse genome).
Requestors may use data or materials provided to them to compete, gain commercial advantage, or find a flaw in the original study and disprove its findings.
Competition, the correction of erroneous conclusions, and the later generation of data that represent commercializable inventions are part and parcel of the scientific enterprise and part of the risk of publishing. Indeed, they are vital. Identifying problems in a flawed study improves the overall scientific process. Denying potential competitors access to data undermines the basic principles of sharing.
Authors should have the right to share publication-related data or materials with academic investigators only.
This view must be rejected as an artificial taxonomy. There should be a single scientific community that operates under a single set of principles regarding the pursuit of knowledge, a common ethic with regard to the integrity of the scientific process, and a long-held commitment to the validation of concepts by experimentation and later verification or falsification of published observations. There is no clear line between “for-profit sector” and “academic” research. Some of the research done in companies is basic research with no predictable commercial end point, whereas some academic laboratories are directly connected to companies via sponsored research agreements, collaborations, consulting agreements, or stock ownership.
In the systematic and evolutionary biological community, nations often restrict the right of commercial firms to investigate or use biological samples for which they have allowed export. Because it is a matter of law, and the material is unequivocally the property of the nations involved, investigators are obligated to abide by these restrictions. While there is interest in pursuing the use of such materials by commercial firms in a regulated legal context, at present companies are compelled to negotiate a separate agreement with each country to use biological samples of interest.
Aside from this unique situation, the committee became convinced during its deliberations that exceptions to standards in one form or another could not be rationalized without sacrificing the integrity of the principle of publication. In considering arguments for making exceptions to community standards, including the need to accommodate commercial interests, the costs of producing data and materials, the vulnerability of young investigators to competition, and an investigator’s right to mine his or her data before others, the committee found that participants in the publication system were just as likely to benefit as to be hurt by a system that favored the sharing of data and materials. In some instances, avenues other than publication are available for those investigators who want to publicize their findings while maintaining control of the related data. In other cases, reasonable and innovative ways can be found to overcome the problems of costs, contractual restrictions, and competition. Notwithstanding the rule of law and other common-sense situations, exceptions unfairly penalize the community, which would have otherwise had access to the data, information, or material being withheld. Furthermore, granting a special exception to certain categories or particular researchers is problematic for a variety of reasons, including the difficulty of deciding who qualifies for the exception. Considering that publication standards maintain quality and facilitate the work of the community in moving science forward, the committee observed that exceptions are likely to weaken the effectiveness of that process over the long term:
Universal adherence, without exception, to a principle of full disclosure and unrestricted access to data and materials will promote cooperation and prevent divisiveness in the scientific community, maintain the value and prestige of publication, and promote the progress of science.