Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age
The generation of complete genome sequences for a growing number of organisms has intensified the digitization of biomedical research. These data have many applications in both basic and applied research, and the line between the two is often difficult to discern. For example, computational processing and reference to information and knowledge bases about organisms and disease processes allow researchers to reach conclusions faster about the likely results of a therapy.[c] The combination of cellular data, genomic profiling, and biological simulation may reduce the failure rate of drug candidates and the cost of testing. In the near future, it will even be possible, given sufficient computing and storage resources, to record the genotype of each person in a secure database. Variations in genes may indicate specific disease susceptibility or responses to known drug types. This information could enable physicians to prescribe a personal immunization and screening schedule or to recommend specific preventive measures for each patient.
Further integration of the biomedical sciences using digital technologies could allow independent investigators to remain the engine of innovative research by participating in “virtual team science.” Early examples of such “cyberinfrastructure” (including the Biomedical Informatics Research Network, myGrid, and the cancer Biomedical Informatics Grid) indicate that it is technically feasible, if not easy, to integrate the many threads of biomedicine. The challenge is to ensure that new “cybersilos” do not replace existing disciplinary and institutional silos.[d]
a “The race to computerize biology.” 2002. Economist, Dec. 12, 2002.
b David R. Bentley. 1996. “Genomic sequence information should be released immediately and freely in the public domain.” Science 274:533–534. This statement was written on behalf of the Sanger Institute at the Wellcome Trust Genome Campus and the Genome Sequencing Center at Washington University in St. Louis.
c Chris Sander. 2000. “Genomic medicine and the future of health care.” Science 287:1977–1978.
d Kenneth H. Buetow. 2005. “Cyberinfrastructure: Empowering a ‘third way’ in biomedical research.” Science 308:821–824.
journals require the submission and public dissemination of the data supporting an accepted manuscript. Funding agencies and research institutions also have policies that require the open sharing of the data on which research conclusions are based. Codes of conduct in a research community, whether explicit or tacit, can exert a powerful influence on researchers to make data accessible.
Advances in information technology (for instance, the advent of grid computing and cloud computing[3]) will continue to transform the environment for
3 In grid computing, distributed computing resources link experimental apparatus, processing, analysis, and storage; cloud computing involves large-scale, data-intensive, Internet-hosted applications and related infrastructure.