18 Committee on National Statistics

large. Data organized in complex file structures may need to be converted to simpler structures by the subsequent analyst. The data-base dictionary may be tied to an incompatible software package and require conversion. The original data collectors may not have used standard data preparation and documentation practices. The data documentation may be inadequate; the codes may be undocumented, inconsistent, or erroneous. Undiscovered errors are inevitable.

These costs can be reduced if data sharing is recognized as a goal by initial data collectors. And the costs may be shared if data tapes are transferred to an intermediate archive that takes responsibility for editing and documenting them.

Sharing Costs

One strategy for encouraging data sharing is to impose a cost for not sharing data. A public statement that a researcher was withholding data may encourage the researcher and others to share their data. Reinforcing data sharing as a scientific obligation may be fruitful in promoting data sharing more widely.

The practice of data sharing probably will become more widespread if the costs are not borne exclusively by the initial researcher. Data sharing, then, must also be cost sharing; subsequent analysts should contribute appropriately to the costs of documentation and pay the costs to transfer data.

Sharing data primarily benefits science and society; the costs are borne mostly by the initial investigators. Yet most scientists are willing to share their data to some extent despite this relationship. One reason is that recognition of the initial investigator usually is provided by subsequent analysts. Another reason is that scientific institutions do foster data sharing through peer recognition of altruistic behavior that advances science.

THE CHANGING ENVIRONMENT FOR DATA SHARING

Developments in computers and software, changes in research practices, the different rewards and incentives for research, and new laws and regulations may all affect the sharing of data. This section describes how a few of these changing circumstances may affect the propensity of researchers to share their data.

Sharing Research Data 19

Use of Computers

The widespread use of computers for recording, summarizing, and analyzing research data facilitates sharing data. The use of computers avoids time-consuming clerical work and permits the transfer of large data bases that would not have been feasible in the past. Large machine-readable data files are a research resource in the social sciences analogous to large-scale instrumentation in the physical sciences.

Transfer of machine-readable data is hindered by incompatibility of computer equipment and software. Help to overcome such technical problems may come from the acceptance of common conventions for the internal storage and representation of data, from the development of standard analytic packages, and the development of conversion capabilities to move from one system to another. More burdensome to an initial investigator are the time-consuming tasks of file cleaning, preparation of data-base dictionaries and other appropriate documentation, and dissemination.

As the importance of these activities has become more widely recognized, some aids have been developed; more are expected in the future. The literature on computer file management, standards for file documentation, and similar matters is growing. Moreover, institutions have been organized that specialize in the collection, maintenance, and dissemination of machine-readable data files. Some of these institutions are international in scope. Both the technical guidelines for data documentation and the number of institutions that serve as intermediaries to transfer data are growing (see Clubb, in this volume, for a further discussion on using computers for data sharing).

Privacy and Confidentiality

Confidentiality refers to not disclosing responses to questions that could be identified as belonging to an individual organization or person. Privacy refers to the right of an individual not to make personal information available to another.
Confidentiality is obviously relevant to data sharing. Privacy is also relevant: as the public has become more concerned about invasion of privacy, researchers have attempted to overcome respondent hesitation by making stronger promises of confidentiality. Legal protections for privacy attempt to protect privacy by maintaining confidentiality of records, and in many cases, restricting their use to the agency to which the respondent provided information.

Growing concerns about confidentiality and the protection of privacy have affected research involving information about individuals and the conditions under which data may be shared, especially if the research is undertaken under
federal contract. As a result, more attention is paid to maintaining the confidentiality of records, whether legally required or not; to removing identifiable information from records before data are shared; and to using other disclosure avoidance techniques.

Paralleling the burgeoning use of computers in business and government, public awareness of issues of privacy and confidentiality has increased during the past two decades. Respondents express concern over invasion of privacy and are skeptical of assurances that confidentiality will be protected (see, for example, National Research Council, 1979). Also, the public is apprehensive of the growth of large-scale computerized data banks that contain personal, individually identifiable information. Investigators have become more sensitive to issues of privacy and confidentiality because of this public discussion and respondent reactions.

The public concerns have led to enactment of statutes designed to protect privacy and ensure the confidentiality of data concerning individuals (see Cecil and Griffin, in this volume). A major federal statute is the Privacy Act of 1974. Designed to protect the confidentiality of records collected and maintained by the federal government, it provides, with certain exceptions, that identifiable information about individuals may not be disclosed outside the agency that collected the information unless the prior consent of the individuals concerned is obtained.5 A key characteristic of this statute is that it does not distinguish between data for administrative purposes and data for research or statistical purposes. The provisions of the law apply directly to investigators whose research or surveys are undertaken under a contract with a federal agency, as are, for example, most evaluations of federal programs.
Such investigators must observe the provisions of the Privacy Act in sharing data by deleting identifying names and numbers from individual records; sometimes, other disclosure-avoidance techniques are used. These rules may hamper and at times prevent the matching or linking of data files.

In some research requiring access to federal data, identification of individuals is essential. In epidemiological studies, for example, it may be necessary to know the names of persons exposed to certain suspected hazards over long periods in order to match these with records of death or disease at a later time. Unless such epidemiological research is considered "routine use" under the terms of the Privacy Act, access to this information may be restricted.

Biomedical researchers in particular are affected by federal regulations governing research on humans that require review of research plans by institutional review boards. In some cases, such boards may go beyond the requirements of the Privacy Act and so have an effect on the ability of researchers to share data.

5 In addition to federal law, several states have enacted statutes to protect privacy that may also affect research.

The Privacy Protection Study Commission, called for by the Privacy Act of 1974, urged among other recommendations that the Act be revised to distinguish between data for research purposes and those maintained for administrative purposes (Privacy Protection Study Commission, 1977: especially pages 567-604). If the law is changed, investigators might find fewer restrictions on access to individually identifiable federal data for research purposes. It is certain, however, that there would still be strong injunctions and safeguards calling on researchers to protect the confidentiality of data.

Freedom of Information

Another federal statute, the Freedom of Information Act, enacted in 1966, which provides for greater public access to many kinds of federal data, has had the opposite effect of the Privacy Act (see Cecil and Griffin, in this volume). There are two specific exemptions to access in the Freedom of Information Act that are most relevant to research data: "personnel and medical and similar files the disclosure of which would constitute a clearly unwarranted invasion of privacy" and "trade secrets and commercial or financial information obtained from a person and privileged or confidential." An investigator whose contract with a federal agency calls for transfer to the agency of microdata that do not qualify for these exemptions should expect that the data may be shared with others, researchers or not, under the Freedom of Information Act. The act does not appear to apply, however, to data maintained solely under the control of the investigator. Even investigators working on funds from private sources may be subject to the Freedom of Information Act should they submit data to a federal agency for advice or checking.
For example, a privately sponsored survey that used computer assistance from the federal Centers for Disease Control was ruled subject to the Freedom of Information Act (Dickson, 1980).

Patents, Profits, and Proprietary Data

The possibility that a research effort may lead to the development of a patentable product or process may affect the willingness of investigators to share their data. Patent laws may also delay publication of research results and, therefore, may delay data sharing. A recent change in the U.S. patent law, for example, led the Office of Management and Budget to suggest that federal agencies require notification of any potentially patentable results at least three months before research reports are submitted for publication. The rule would
apply to federally sponsored research in universities and small businesses and is intended to allow time to apply for patent rights in certain European countries. In the United States, patents can be applied for up to one year following publication of research results, but in some European countries patent rights may be forfeited by publication. In commenting on these developments, Dickson (1981:501) noted: "The proposed rule has already created a storm of protest from the U.S. research community, which claims that, by threatening to deny a scientist patent rights to a discovery if the procedure is not followed, it could seriously impede scientific communication."

The Copyright Act is also relevant to data sets developed by researchers. Under that act, the proprietary rights of a person who has developed information are balanced against the public benefits from distribution of the information. Interpretations of the Copyright Act, which was significantly amended in 1976, may affect the extent to which data are shared. The doctrine of fair use, which limits the exclusive rights of copyright owners in order to permit reasonable use by others for purposes such as criticism, news reporting, teaching (including multiple copies for classroom use), or research, was expanded in the Copyright Act amendments (see Cecil and Griffin, in this volume). Scholarly journals that insist on copyrighting all articles may impede reanalysis of previously published information by requiring secondary analysts to obtain copyright releases from original researchers, although the fair use provision makes this requirement unnecessary.

Recent applications of research on DNA have drawn dramatic attention to the potential profitability of some research. Academic research scientists and private firms engaged in developing profitable applications have sometimes found themselves with very different interests.
A report in Science of a dispute between the University of California and the pharmaceutical firm of Hoffmann-La Roche concerning a human gene containing the genetic information for the synthesis of interferon earned the following headline: "University and Drug Firm Battle Over Billion-Dollar Gene: A lawsuit over interferon may change the informal ways by which researchers exchange materials" (Wade, 1980). Donald Kennedy, president of Stanford University, commented: "Scientists who once shared prepublication information freely and exchanged cell lines without hesitation are now much more reluctant to do so" (Roark, 1981). And the New York Times (1981) editorialized: "The values of the marketplace have so invaded the campus that on several occasions researchers have refused to share with their colleagues the exact details of how they did their experiments. Such attitudes are incompatible with the ethos of a scholarly community." Similar views were expressed in a Nature (1980) editorial. Potentially lucrative applications of scientific research are not widespread, but, in the scientific disciplines in which they occur, the effect on data sharing is significant.

At a recent meeting of university and company officials, the need for faculty freedom to report research was discussed, and it was agreed that research contracts or licensing agreements between universities and private companies should avoid secrecy (Chronicle of Higher Education, 1982:12). The joint statement included, under the heading "Open Communication Encouraged," the following:

The traditions of open research and prompt transmission of research results should govern all university research, including research sponsored by industry. Those traditions require that universities encourage open communication about research in progress and research results. However, it is appropriate for institutions to file for patent coverage for inventions and discoveries that result from university research. This action may require brief delays in publication or other public disclosure. Receipt of proprietary information from a sponsor may occasionally be desirable to facilitate the research. Such situations must be handled on a case-by-case basis in a manner which neither violates the principles stated above nor interferes with the educational process. Any other restrictions on control of information disclosure by institutions are not appropriate as general policy.

Restrictions on International Sharing of Data

Restrictions on the sharing of data across national boundaries are likely to fluctuate with international political tensions and changes in perceived national interests. Such restrictions may apply not only to defense-related technology, but more broadly to research that is deemed to be of advantage to other nations. The Export Administration Act of 1979, administered by the U.S.
Department of Commerce, requires that export controls be used where necessary "to restrict the export of goods and technology which would make a significant contribution to the military potential of any other country or combination of countries which would prove detrimental to the national security of the United States."

In the United States, restrictions on sharing data with other countries apparently are being tightened. Examples include:

(1) Proposed revisions in the 1972 International Traffic in Arms Regulations, published in preliminary form in the Federal Register (December 19, 1980), require that an export license be obtained for transfer to a foreigner of technical data that may have a defense application.

(2) During 1981, an amendment was proposed to the Arms Export Control