2.3.5 Cyberinfrastructure and Data Acquisition

Cyberinfrastructure for science and engineering is a term coined by the National Science Foundation to refer to the distributed computing, information, and communication technologies, together with the associated organizational facilities, that support modern scientific and engineering research conducted on a global scale. In the life sciences, cyberinfrastructure is increasingly the enabling mechanism for large-scale, data-intensive biological research that is inherently distributed over multiple laboratories and investigators around the world: it facilitates the integration of experimental data, enables collaboration, and promotes communication among the various actors involved.

Obtaining primary biological data is a separate question. As noted earlier, 21st century biology is increasingly a data-intensive enterprise, so tools that facilitate acquisition of the requisite data types in the requisite amounts will become ever more important. Although they are by no means the whole story, advances in IT and computing will play key roles in the development of new data acquisition technologies that can be used in novel ways.

Chapter 7 focuses on the roles of cyberinfrastructure and data acquisition for 21st century biology.


The forthcoming integration of computing into biological research raises deep epistemological questions about the nature of biology itself. For many thousands of years, a doctrine known as vitalism held that the stuff of life was qualitatively different from that of nonlife and, consequently, that living organisms were made of a different substance from nonliving things or that some separate life force existed to animate the materials that composed life.

While this belief no longer holds sway today (except perhaps in bad science fiction movies), the question of how biological phenomena can be understood has not been fully settled. One stance, known as reductionism in the philosophy of science, holds that the behavior of a given system is explained wholly by the behaviors of the components that make up that system. A contrasting stance, known as autonomy in the philosophy of science, holds that understanding a biological system requires understanding not only its individual components but also the specific architecture and arrangement of those components and the interactions among them.

If autonomy is accepted as a guiding worldview, introducing the warp of computing into the weft of biology creates additional possibilities for intellectual inquiry. Just as the invention of the microscope extended biological inquiry into new arenas and enlarged the scope of questions that were reasonable to ask in the conduct of biological research, so will the computer. Computing and information technology will enable biological researchers to consider heretofore inaccessible questions, and as the capabilities of the underlying information technologies increase, such opportunities will continue to open up.

New epistemological questions will also arise. For example, as simulation becomes more pervasive and common in biology, one may ask, Are the results from a simulation equivalent to the data output of an experiment? Can biological knowledge ever arise from a computer simulation? (A practical example is the following: As large-scale clinical trials of drugs become more and more expensive, under what circumstances and to what extent might a simulation based on detailed genomic and pharmacological knowledge substitute for a large-scale trial in the drug approval process?) As simulations become more and more sophisticated, pre-loaded with more and more biological data, these questions will become both more pressing and more difficult to answer definitively.
