(including degree of review and certification) and institutional origin, have given rise to additional complications associated with the increasingly pervasive electronic distribution of scientific data.
Some data issues are more discipline-specific. Perennial problems affecting access to data in the observational sciences, for example, include gaps in quality control, incompatibility of data streams, inadequate documentation of data sets, and difficulty in meeting the requirements for long-term retention of data. In the biological sciences, the variety of attributes and qualifiers included with each observation and differences in terminology and usage put a heavy burden on any supplier of data to identify and specify the character of the data precisely enough to prevent misinterpretation. In the laboratory physical sciences, as in many other branches, fragmentation of data into numerous, autonomous, and often incompatible databases with different formats and levels of quality is a chronic problem.
Putting scientific data to use rapidly in sectors outside the immediate discipline of origin poses additional challenges to the longer-term effort to provide full and open access. In the observational environmental sciences, for instance, massive archives and reliable institutional memory are necessary to keep the data accessible and intelligible. Simultaneously, however, data also must be available to meet the public's need for warnings of natural hazards and disasters and for commercial use by the private sector. In addition, availability of data can be affected by governmental concerns related to national security, foreign policy, and international trade. Newly adopted or proposed restrictions on previously open and unrestricted data have caused particular concern in the Earth science communities, for example.
Another significant concern regarding full and open access to scientific data is related to commercialization of electronic publication and electronic databases. Science operates according to a "market" of its own, one that has rules and values different from those of commercial markets. While protection of intellectual property may concern a scientist who is writing a textbook, that same scientist, publishing a paper in a scientific journal, is motivated by the desire to propagate ideas, with the expectation of full and open access to the results. To commercial publishers (including many professional societies), protection of intellectual property means protection of the rights to reproduce and distribute printed material. To scientists, protection of intellectual property usually signifies assurance of proper attribution and credit for ideas and achievements. Generally, scientists are more concerned that their work be read and used than that it be protected against unauthorized copying. These conflicting viewpoints pose challenging problems for science and the rest of society. Current discussions are seeking a balance between protecting publicly supported activities that advance the public welfare and strengthening individual rights to intellectual property.
Associated with the internationalization of scientific data collection and use has been the growth of data centers—dedicated, stable institutions supporting collaborative data sharing across international boundaries and providing verifica-