elaborated by quite a number of subsequent social scientists, points to the dominant norms of science, which he indicated were organized skepticism, universalism, disinterestedness, and what he called communism, which is of most relevance to this symposium: the idea that findings belong not to the individual but to the entire scientific community, that is, they become part of the public domain.[5]
This notion of collective ownership was always, in some sense, prescriptive rather than descriptive of the behavior of scientists. One has to look no further than James Watson's book The Double Helix[6] to know that, but nowadays, with commercial interests so permeating the scientific process, even a pretense of normativeness is often gone. At one time, scientists who were unwilling to share data were often responding to concerns that other scientists would steal their findings to get credit for discoveries rightly their own. It might be said that they wanted credit more than ownership. These types of concerns have certainly survived. But a change in recent years has been the extent to which commercial interests have affected the desire to establish ownership over biomedical research data. Under these circumstances, researchers and their commercial entities want not only credit but also ownership. Because of this desire for both ownership and credit, scientists often restrict access to their data by keeping them privatized, either by their own choice or at the insistence of their commercial collaborators. These restrictions often revolve around publication of the data, which may be delayed or suppressed entirely. However, they sometimes affect informal exchanges of data as well. The magnitude of these concerns will be discussed in later sessions, as will the concerns that arise with disputes over data access. Here, it is sufficient to note that supporters of a free flow of scientific data believe that resistance to data sharing and disputes over data sharing can:
waste resources by leading to duplication of efforts,
slow the progress of science because scientists cannot easily build on the efforts of others or discover errors in completed work, and
lead to a generalized level of mistrust and hostility among scientists in place of what should be a community of scientists.
Let me give one example from a study I conducted several years ago together with Stephen Hilgartner, who will be speaking with you later in this meeting. We studied data-sharing practices among x-ray crystallographers. One of these scientists reported to us that an industry group had published a paper with an incomplete structure, containing just what he referred to as “the juicy parts of the analysis.” He wrote to ask them for their coordinates, and they responded, “Well, maybe in a couple of years after we look at it a little bit more.” Three years later, he finally gave up waiting and went ahead and did the structure for a homologous substance, for which he intended to deposit coordinates and to publish. Not only was he looking forward to a significant publication, but he was especially gleeful about the possibility of harming the first group by putting into the public domain the very data they sought to keep private. This is surely not the most productive way for science to proceed. One could not even regard this as productive from the standpoint of replication because the original data were not made accessible to be replicated. It does, however, provide an example of the “disappearing property rights” referred to by Paul Uhlir and Jerry Reichman.
Exploring these issues requires a detailed analysis of what the basic terms mean. At the least, we need to understand what we mean by data and what we mean by sharing. In addition, we need to consider how, with whom, and under what circumstances and conditions scientists share and withhold data, recognizing that sharing and withholding constitute a spectrum of behaviors. Few scientists can afford the time and resources involved in sharing everything with everyone and few, if any, refuse to share anything. The hypothetical scientist who shares everything could never be productive, since she would be spending all her time e-mailing and talking on the phone. The chimera who shares nothing would have a career that is nasty, brutish, and short. He would never publish or speak at meetings and probably would never even talk to colleagues. Indeed, he could not really be said to have colleagues. Nobody makes everything public and nobody keeps everything private. Data sharing constitutes a flexible concept