Catalyzing Inquiry at the Interface of Computing and Biology
John C. Wooley and Herbert S. Lin, editors
Committee on Frontiers at the Interface of Computing and Biology
Computer Science and Telecommunications Board
Division on Engineering and Physical Sciences
NATIONAL RESEARCH COUNCIL OF THE NATIONAL ACADEMIES
THE NATIONAL ACADEMIES PRESS
Washington, D.C.
www.nap.edu
THE NATIONAL ACADEMIES PRESS, 500 Fifth Street, N.W., Washington, DC 20001
NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.
Support for this project was provided by the Defense Advanced Research Projects Agency under Contract No. MDA972-00-1-0005, the National Science Foundation under Contract No. DBI-0094528, the Department of Health and Human Services/National Institutes of Health (including the National Institute of General Medical Sciences and the National Center for Research Resources) under Contract No. N01-OD-4-2139, the Department of Energy under Contract No. DE-FG02-02ER63336, the Department of Energy’s Office of Science (BER) under Interagency Agreement No. DE-FG02-04ER63934, and National Research Council funds. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the organizations or agencies that provided support for the project.
International Standard Book Number 0-309-09612-X
Library of Congress Control Number: 2005936580
Cover designed by Jennifer M. Bishop.
This report is available from Computer Science and Telecommunications Board, National Research Council, 500 Fifth Street, N.W., Washington, DC 20001
Additional copies of this report are available from the National Academies Press, 500 Fifth Street, N.W., Lockbox 285, Washington, DC 20055; (800) 624-6242 or (202) 334-3313 (in the Washington metropolitan area); Internet, http://www.nap.edu.
Copyright 2005 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
THE NATIONAL ACADEMIES
Advisers to the Nation on Science, Engineering, and Medicine
The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Ralph J. Cicerone is president of the National Academy of Sciences.
The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. Wm. A. Wulf is president of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Harvey V. Fineberg is president of the Institute of Medicine.
The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy’s purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Ralph J. Cicerone and Dr. Wm. A. Wulf are chair and vice chair, respectively, of the National Research Council. www.national-academies.org
COMMITTEE ON FRONTIERS AT THE INTERFACE OF COMPUTING AND BIOLOGY
JOHN C. WOOLEY, University of California at San Diego, Chair
ADAM P. ARKIN, University of California at Berkeley and Lawrence Berkeley National Laboratory
ERIC BRILL, Microsoft Research Labs
ROBERT M. CORN, University of California at Irvine
CHRIS DIORIO, University of Washington
LEAH EDELSTEIN-KESHET, University of British Columbia
MARK H. ELLISMAN, University of California at San Diego
MARCUS W. FELDMAN, Stanford University
DAVID K. GIFFORD, Massachusetts Institute of Technology
TAKEO KANADE, Carnegie Mellon University
STEPHEN S. LADERMAN, Agilent Laboratories
JAMES S. SCHWABER, Thomas Jefferson Medical College
Staff
Herbert Lin, Senior Scientist and Study Director
Geoff Cohen, Consultant to CSTB
Mitchell Waldrop, Consultant to CSTB
Daehee Hwang, Consultant to Board on Biology
Robin Schoen, Senior Staff Officer
Elizabeth Grossman, Senior Staff Officer (through March 2001)
Jennifer Bishop, Program Associate
D.C. Drake, Senior Program Assistant (through March 2003)
COMPUTER SCIENCE AND TELECOMMUNICATIONS BOARD
JOSEPH TRAUB, Columbia University, Chair
ERIC BENHAMOU, Benhamou Global Ventures, LLC
DAVID D. CLARK, Massachusetts Institute of Technology, CSTB Chair Emeritus
WILLIAM DALLY, Stanford University
MARK E. DEAN, IBM Almaden Research Center
DEBORAH ESTRIN, University of California, Los Angeles
JOAN FEIGENBAUM, Yale University
HECTOR GARCIA-MOLINA, Stanford University
KEVIN KAHN, Intel Corporation
JAMES KAJIYA, Microsoft Corporation
MICHAEL KATZ, University of California, Berkeley
RANDY H. KATZ, University of California, Berkeley
WENDY A. KELLOGG, IBM T.J. Watson Research Center
SARA KIESLER, Carnegie Mellon University
BUTLER W. LAMPSON, Microsoft Corporation, CSTB Member Emeritus
TERESA H. MENG, Stanford University
TOM M. MITCHELL, Carnegie Mellon University
DANIEL PIKE, GCI Cable and Entertainment
ERIC SCHMIDT, Google Inc.
FRED B. SCHNEIDER, Cornell University
WILLIAM STEAD, Vanderbilt University
ANDREW J. VITERBI, Viterbi Group, LLC
JEANNETTE M. WING, Carnegie Mellon University
RICHARD ROWBERG, Acting Director
KRISTEN BATCH, Research Associate
JENNIFER M. BISHOP, Program Associate
JANET BRISCOE, Manager, Program Operations
JON EISENBERG, Senior Program Officer and Associate Director
RENEE HAWKINS, Financial Associate
MARGARET MARSH HUYNH, Senior Program Assistant
HERBERT S. LIN, Senior Scientist
LYNETTE I. MILLETT, Senior Program Officer
JANICE SABUDA, Senior Program Assistant
GLORIA WESTBROOK, Senior Program Assistant
BRANDYE WILLIAMS, Staff Assistant
For more information on CSTB, see its Web site at http://www.cstb.org, write to CSTB, National Research Council, 500 Fifth Street, N.W., Washington, DC 20001, or call (202) 334-2605, or e-mail the CSTB at email@example.com.
Preface
In the last decade of the 20th century, computer science and biology both emerged as fields capable of remarkable and rapid change. Moreover, they evolved as fields of inquiry in ways that draw attention to their areas of intersection. The continuing advancements in technology and the pace of scientific research present the means for computing to help answer fundamental questions in the biological sciences and for biology to demonstrate that new approaches to computing are possible.
Advances in the power and ease of use of computing and communications systems have fueled computational biology (e.g., genomics) and bioinformatics (e.g., database development and analysis). Modeling and simulation of biological entities such as cells have joined biologists and computer scientists (and mathematicians, physicists, and statisticians too) to work together on activities from pharmaceutical design to environmental analysis.
On the other side, computer scientists have pondered the significance of biology for their field. For example, computer scientists have explored the use of DNA as a substrate for new computing hardware and the use of biological approaches in solving hard computing problems. Exploration of biological computation suggests a potential for insight into the nature of and alternative processes for computation, and it also gives rise to questions about hybrid systems that achieve some kind of synergy of biological and computational systems. And there is also the fact that biological systems exhibit characteristics such as adaptability, self-healing, evolution, and learning that would be desirable in the information technologies that humans use.
Making the most of the research opportunities at the interface of computing and biology—what we are calling the BioComp interface—requires illuminating what they are and effectively engaging people from both computing and biology.
As in other contexts, the challenges of interdisciplinary education and of collaboration are significant, and each will require attention, together with substantive work from both policy makers and researchers. At the start of the 1990s, attempts were made to stimulate mutual interest and collaboration among young researchers in computing and biology. Those early efforts yielded nontrivial successes, but in retrospect represented a Version 1.0 prototype for the potential in bringing the two fields together. Circumstances today seem much more favorable for progress. New research teams and training programs have been formed as individual investigators from the respective communities, government agencies, and private foundations have become increasingly engaged. Similarly, some larger groups of investigators from different backgrounds have been able to obtain funding to work together to address cross-disciplinary research problems. It is against this background that the committee sees a Version 2.0 of the BioComp interface emerging that will yield unprecedented progress and advance.
The range of possible activities at the BioComp interface is broad, and accordingly so is the range of interested agencies, which include the Defense Advanced Research Projects Agency (DARPA), the National Science Foundation (NSF), the Department of Energy (DOE), and the National Institutes of Health (NIH). These agencies have, to varying degrees, recognized that truly cross-disciplinary work would build on both computing and biology, and they have sought to advance activities at the interface.
This report by the Committee on Frontiers at the Interface of Computing and Biology seeks to establish the intellectual legitimacy of a fundamentally cross-disciplinary collaboration between biologists and computer scientists. That is, while some universities are increasingly favorable to research at the intersection, life science researchers at other universities are strongly impeded in their efforts to collaborate. This report addresses these impediments and describes some strategies for overcoming them. In addition, this report provides a wealth of well-documented examples. As a rule, these examples have generally been selected to illustrate the breadth of the topic in question, rather than to identify the most important areas of activity. That is, the appropriate spirit in which to view these examples is “let a thousand flowers bloom,” rather than one of “finding the prettiest flowers.” It is hoped that these examples will encourage students in the life sciences to start or to continue study in computer science that will enable them to be more effective users of computing in their future biological studies.
In the opposite direction, the report seeks to describe a rich and diverse domain—biology—within which computer scientists can find worthy problems that challenge current knowledge in computing. It is hoped that this awareness will motivate interested computer scientists to learn about biological phenomena, data, experimentation, and the like—so that they can engage biologists more effectively.
To gather information on such a broad area, the committee took input from a wide variety of sources. The committee convened two workshops in March 2001 and May 2001, and committee members or staff attended relevant workshops sponsored by other groups. The committee mined the published literature extensively. It solicited input from other scientists known to be active in BioComp research. An early draft of the report was examined by a number of reviewers far larger than usual for National Research Council (NRC) reports, and the draft was modified in accordance with their extensive input, which helped the committee to sharpen its message and strengthen its presentation.
The result of these efforts is the first comprehensive NRC study that suggests a high-level intellectual structure for federal agencies for supporting work at the BioComp interface. Although workshop reports have been supported by individual agencies on the subject of computing applied to various aspects of biological inquiry, the NRC has not until now undertaken a study whose intent was to be inclusive.
Within the NRC, the lead unit on this project was the Computer Science and Telecommunications Board (CSTB), and Marjory Blumenthal and Elizabeth Grossman launched the project. The committee also acknowledges with gratitude the contribution of the Board on Biology—Robin Schoen continued work on the project after Elizabeth Grossman’s departure. Geoff Cohen and Mitch Waldrop, consultants to CSTB, made major substantive contributions to this report. A variety of project assistants, including D.C. Drake, Jennifer Bishop, Gloria Westbrook, and Margaret Huynh, provided research and administrative support. Finally, grateful thanks are offered to DARPA, NIH, NSF, and DOE for their financial support for this project as well as their patience in awaiting the final report.
No single agency can respond to the challenges and opportunities at the interface, and the committee hopes that its analysis will facilitate agency efforts to define their own priorities, set their own path, and participate in what will be a continuing adventure along the frontier at this exciting and promising interface, which will continue to develop throughout the 21st century.
A Personal Note from the Chair
The committee found the scope of the study and the need to achieve an adequate level of balance in both directions around the BioComp interface to be a challenge. This challenge, I hope, has been met, but this was only possible due to the recruitment of an outstanding physicist turned computer science policy expert from the NRC. Specifically, after the original series of meetings, Herb Lin from the CSTB side of the NRC joined the effort, and most notably, followed up on the committee’s earlier analyses by interviewing numerous individuals engaged in both biocomputing (applications of biology to computing) and computational biology (applications of computing to biology). This was invaluable, as was Herb’s never-ending enthusiasm, insight into the nature of the interdisciplinary discussions that are growing, and his willingness to engage in learning a lot about biology. The report could never have been completed without his persistence. His expertise in editing and analytical treatment of policy and technical material allowed us to sustain a broad vision. (Even with the length and breadth of this study, we were able to cover only selected areas at the interface.) The committee’s efforts were sustained and accelerated by Herb’s determination that we stay the course despite the size of the task, and by his insightful comments, criticisms, and suggestions on every aspect of the study and the report.
John Wooley, Chair
Committee on Frontiers at the Interface of Computing and Biology
Acknowledgment of Reviewers
This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the National Research Council’s Report Review Committee. The purpose of this independent review is to provide candid and critical comments that will assist the institution in making its published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process. We wish to thank the following individuals for their review of this report:
Harold Abelson, Massachusetts Institute of Technology
Eric Benhamou, Benhamou Global Ventures, LLC
Mina Bissell, Lawrence Berkeley National Laboratory
Gaetano Borriello, University of Washington
Dennis Bray, University of Cambridge
Steve Burbeck, IBM
Andrea Califano, Columbia University
Charles Cantor, Boston University
David D. Clark, Massachusetts Institute of Technology
G. Bard Ermentrout, University of Pittsburgh
Lisa Fauci, Tulane University
David Galas, Keck Graduate Institute
Leon Glass, McGill University
Mark D. Hill, University of Wisconsin-Madison
Tony Hunter, The Salk Institute for Biological Studies
Sara Kiesler, Carnegie Mellon University
Isaac Kohane, Children’s Hospital
Nancy Kopell, Boston University
Bud Mishra, New York University
William Noble, University of Washington
Alan S. Perelson, Los Alamos National Laboratory
Robert J. Robbins, Fred Hutchinson Cancer Research Center
Lee Segel, The Weizmann Institute of Science
Larry L. Smarr, University of California, San Diego
Sylvia Spengler, National Science Foundation
William Stead, Vanderbilt University
Suresh Subramani, University of California, San Diego
Charles Taylor, University of California, Los Angeles
Andrew J. Viterbi, Viterbi Group, LLC
Although the reviewers listed above have provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations, nor did they see the final draft of the report before its release. The review of this report was overseen by Russ Altman, Stanford University. Appointed by the National Research Council, he was responsible for making certain that an independent examination of this report was carried out in accordance with institutional procedures and that all review comments were carefully considered. Responsibility for the final content of this report rests entirely with the authoring committee and the institution.
Contents
EXECUTIVE SUMMARY
1 INTRODUCTION
  1.1 Excitement at the Interface of Computing and Biology
  1.2 Perspectives on the BioComp Interface
    1.2.1 From the Biology Side
    1.2.2 From the Computing Side
    1.2.3 The Role of Organization and Culture
  1.3 Imagine What’s Next
  1.4 Some Relevant History in Building the Interface
    1.4.1 The Human Genome Project
    1.4.2 The Computing-to-Biology Interface
    1.4.3 The Biology-to-Computing Interface
  1.5 Background, Organization, and Approach of This Report
2 21st CENTURY BIOLOGY
  2.1 What Kind of Science?
    2.1.1 The Roots of Biological Culture
    2.1.2 Molecular Biology and the Biochemical Basis of Life
    2.1.3 Biological Components and Processes in Context, and Biological Complexity
  2.2 Toward a Biology of the 21st Century
  2.3 Roles for Computing and Information Technology in Biology
    2.3.1 Biology as an Information Science
    2.3.2 Computational Tools
    2.3.3 Computational Models
    2.3.4 A Computational Perspective on Biology
    2.3.5 Cyberinfrastructure and Data Acquisition
  2.4 Challenges to Biological Epistemology
3 ON THE NATURE OF BIOLOGICAL DATA
  3.1 Data Heterogeneity
  3.2 Data in High Volume
  3.3 Data Accuracy and Consistency
  3.4 Data Organization
  3.5 Data Sharing
  3.6 Data Integration
  3.7 Data Curation and Provenance
4 COMPUTATIONAL TOOLS
  4.1 The Role of Computational Tools
  4.2 Tools for Data Integration
    4.2.1 Desiderata
    4.2.2 Data Standards
    4.2.3 Data Normalization
    4.2.4 Data Warehousing
    4.2.5 Data Federation
    4.2.6 Data Mediators/Middleware
    4.2.7 Databases as Models
    4.2.8 Ontologies
      4.2.8.1 Ontologies for Common Terminology and Descriptions
      4.2.8.2 Ontologies for Automated Reasoning
    4.2.9 Annotations and Metadata
    4.2.10 A Case Study: The Cell Centered Database
    4.2.11 A Case Study: Ecological and Evolutionary Databases
  4.3 Data Presentation
    4.3.1 Graphical Interfaces
    4.3.2 Tangible Physical Interfaces
    4.3.3 Automated Literature Searching
  4.4 Algorithms for Operating on Biological Data
    4.4.1 Preliminaries: DNA Sequence as a Digital String
    4.4.2 Proteins as Labeled Graphs
    4.4.3 Algorithms and Voluminous Datasets
    4.4.4 Gene Recognition
    4.4.5 Sequence Alignment and Evolutionary Relationships
    4.4.6 Mapping Genetic Variation Within a Species
    4.4.7 Analysis of Gene Expression Data
    4.4.8 Data Mining and Discovery
      4.4.8.1 The First Known Biological Discovery from Mining Databases
      4.4.8.2 A Contemporary Example: Protein Family Classification and Data Integration for Functional Analysis of Proteins
    4.4.9 Determination of Three-dimensional Protein Structure
    4.4.10 Protein Identification and Quantification from Mass Spectrometry
    4.4.11 Pharmacological Screening of Potential Drug Compounds
    4.4.12 Algorithms Related to Imaging
      4.4.12.1 Image Rendering
      4.4.12.2 Image Segmentation
      4.4.12.3 Image Registration
      4.4.12.4 Image Classification
  4.5 Developing Computational Tools
5 COMPUTATIONAL MODELING AND SIMULATION AS ENABLERS FOR BIOLOGICAL DISCOVERY
  5.1 On Models in Biology
  5.2 Why Biological Models Can Be Useful
    5.2.1 Models Provide a Coherent Framework for Interpreting Data
    5.2.2 Models Highlight Basic Concepts of Wide Applicability
    5.2.3 Models Uncover New Phenomena or Concepts to Explore
    5.2.4 Models Identify Key Factors or Components of a System
    5.2.5 Models Can Link Levels of Detail (Individual to Population)
    5.2.6 Models Enable the Formalization of Intuitive Understandings
    5.2.7 Models Can Be Used as a Tool for Helping to Screen Unpromising Hypotheses
    5.2.8 Models Inform Experimental Design
    5.2.9 Models Can Predict Variables Inaccessible to Measurement
    5.2.10 Models Can Link What Is Known to What Is Yet Unknown
    5.2.11 Models Can Be Used to Generate Accurate Quantitative Predictions
    5.2.12 Models Expand the Range of Questions That Can Meaningfully Be Asked
  5.3 Types of Models
    5.3.1 From Qualitative Model to Computational Simulation
    5.3.2 Hybrid Models
    5.3.3 Multiscale Models
    5.3.4 Model Comparison and Evaluation
  5.4 Modeling and Simulation in Action
    5.4.1 Molecular and Structural Biology
      5.4.1.1 Predicting Complex Protein Structures
      5.4.1.2 A Method to Discern a Functional Class of Proteins
      5.4.1.3 Molecular Docking
      5.4.1.4 Computational Analysis and Recognition of Functional and Structural Sites in Protein Structures
    5.4.2 Cell Biology and Physiology
      5.4.2.1 Cellular Modeling and Simulation Efforts
      5.4.2.2 Cell Cycle Regulation
      5.4.2.3 A Computational Model to Determine the Effects of SNPs in Human Pathophysiology of Red Blood Cells
      5.4.2.4 Spatial Inhomogeneities in Cellular Development
        5.4.2.4.1 Unraveling the Physical Basis of Microtubule Structure and Stability
        5.4.2.4.2 The Movement of Listeria Bacteria
        5.4.2.4.3 Morphological Control of Spatiotemporal Patterns of Intracellular Signaling
    5.4.3 Genetic Regulation
      5.4.3.1 Cis-regulation of Transcription Activity as Process Control Computing
      5.4.3.2 Genetic Regulatory Networks as Finite-state Automata
      5.4.3.3 Genetic Regulation as Circuits
      5.4.3.4 Combinatorial Synthesis of Genetic Networks
      5.4.3.5 Identifying Systems Responses by Combining Experimental Data with Biological Network Information
    5.4.4 Organ Physiology
      5.4.4.1 Multiscale Physiological Modeling
      5.4.4.2 Hematology (Leukemia)
      5.4.4.3 Immunology
      5.4.4.4 The Heart
    5.4.5 Neuroscience
      5.4.5.1 The Broad Landscape of Computational Neuroscience
      5.4.5.2 Large-scale Neural Modeling
      5.4.5.3 Muscular Control
      5.4.5.4 Synaptic Transmission
      5.4.5.5 Neuropsychiatry
    5.4.6 Virology
    5.4.7 Epidemiology
    5.4.8 Evolution and Ecology
      5.4.8.1 Commonalities Between Evolution and Ecology
      5.4.8.2 Examples from Evolution
        5.4.8.2.1 Reconstruction of the Saccharomyces Phylogenetic Tree
        5.4.8.2.2 Modeling of Myxomatosis Evolution in Australia
        5.4.8.2.3 The Evolution of Proteins
        5.4.8.2.4 The Emergence of Complex Genomes
      5.4.8.3 Examples from Ecology
        5.4.8.3.1 Impact of Spatial Distribution in Ecosystems
        5.4.8.3.2 Forest Dynamics
  5.5 Technical Challenges Related to Modeling
6 A COMPUTATIONAL AND ENGINEERING VIEW OF BIOLOGY
  6.1 Biological Information Processing
  6.2 An Engineering Perspective on Biological Organisms
    6.2.1 Biological Organisms as Engineered Entities
    6.2.2 Biology as Reverse Engineering
    6.2.3 Modularity in Biological Entities
    6.2.4 Robustness in Biological Entities
    6.2.5 Noise in Biological Phenomena
  6.3 A Computational Metaphor for Biology
7 CYBERINFRASTRUCTURE AND DATA ACQUISITION
  7.1 Cyberinfrastructure for 21st Century Biology
    7.1.1 What Is Cyberinfrastructure?
    7.1.2 Why Is Cyberinfrastructure Relevant?
    7.1.3 The Role of High-performance Computing
    7.1.4 The Role of Networking
    7.1.5 An Example of Using Cyberinfrastructure for Neuroscience Research
  7.2 Data Acquisition and Laboratory Automation
    7.2.1 Today’s Technologies for Data Acquisition
    7.2.2 Examples of Future Technologies
    7.2.3 Future Challenges
8 BIOLOGICAL INSPIRATION FOR COMPUTING
  8.1 The Impact of Biology on Computing
    8.1.1 Biology and Computing: Promise and Skepticism
    8.1.2 The Meaning of Biological Inspiration
    8.1.3 Multiple Roles: Biology for Computing Insight
  8.2 Examples of Biology as a Source of Principles for Computing
    8.2.1 Swarm Intelligence and Particle Swarm Optimization
    8.2.2 Robotics 1: The Subsumption Architecture
    8.2.3 Robotics 2: Bacterium-inspired Chemotaxis in Robots
    8.2.4 Self-Healing Systems
    8.2.5 Immunology and Computer Security
      8.2.5.1 Why Immunology Might Be Relevant
      8.2.5.2 Some Possible Applications of Immunology-based Computer Security
      8.2.5.3 Immunological Design Principles for Computer Security
      8.2.5.4 An Example: Immunology and Intruder Detection
      8.2.5.5 Interesting Questions and Challenges
        8.2.5.5.1 Definition of Self
        8.2.5.5.2 More Immunological Mechanisms
      8.2.5.6 Some Possible Difficulties with an Immunological Approach
    8.2.6 Amorphous Computing
  8.3 Biology as Implementer of Mechanisms for Computing
    8.3.1 Evolutionary Computation
      8.3.1.1 What Is Evolutionary Computation?
      8.3.1.2 Suitability of Problems for Evolutionary Computation
      8.3.1.3 Correctness of a Solution
      8.3.1.4 Solution Representation
      8.3.1.5 Selection of Primitives
      8.3.1.6 More Evolutionary Mechanisms
        8.3.1.6.1 Coevolution
        8.3.1.6.2 Development
      8.3.1.7 Behavior of Evolutionary Processes
    8.3.2 Robotics 3: Energy and Compliance Management
    8.3.3 Neuroscience and Computing
      8.3.3.1 Neuroscience and Architecture in Broad Strokes
      8.3.3.2 Neural Networks
      8.3.3.3 Neurally Inspired Sensors
    8.3.4 Ant Algorithms
      8.3.4.1 Ant Colony Optimization
      8.3.4.2 Other Ant Algorithms
  8.4 Biology as Physical Substrate for Computing
    8.4.1 Biomolecular Computing
      8.4.1.1 Description
      8.4.1.2 Potential Application Domains
      8.4.1.3 Challenges
      8.4.1.4 Future Directions
    8.4.2 Synthetic Biology
      8.4.2.1 An Engineering Approach to Building Living Systems
      8.4.2.2 Cellular Logic Gates
      8.4.2.3 Broader Views of Synthetic Biology
      8.4.2.4 Applications
      8.4.2.5 Challenges
    8.4.3 Nanofabrication and DNA Self-Assembly
      8.4.3.1 Rationale
      8.4.3.2 Applications
      8.4.3.3 Prospects
      8.4.3.4 Hybrid Systems
9 ILLUSTRATIVE PROBLEM DOMAINS AT THE INTERFACE OF COMPUTING AND BIOLOGY
  9.1 Why Problem-focused Research?
  9.2 Cellular and Organismal Modeling
  9.3 A Synthetic Cell with Physical Form
  9.4 Neural Information Processing and Neural Prosthetics
  9.5 Evolutionary Biology
  9.6 Computational Ecology
  9.7 Genome-enabled Individualized Medicine
    9.7.1 Disease Susceptibility
    9.7.2 Drug Response and Pharmacogenomics
    9.7.3 Nutritional Genomics
  9.8 A Digital Human on Which a Surgeon Can Operate Virtually
  9.9 Computational Theories of Self-assembly and Self-modification
  9.10 A Theory of Biological Information and Complexity
10 CULTURE AND RESEARCH INFRASTRUCTURE
  10.1 Setting the Context
  10.2 Organizations and Institutions
    10.2.1 The Nature of the Community
    10.2.2 Education and Training
      10.2.2.1 General Considerations
      10.2.2.2 Undergraduate Programs
      10.2.2.3 The BIO2010 Report
        10.2.2.3.1 Engineering
        10.2.2.3.2 Quantitative Training
        10.2.2.3.3 Computer Science
      10.2.2.4 Graduate Programs
      10.2.2.5 Postdoctoral Programs
        10.2.2.5.1 The Sloan/DOE Postdoctoral Awards for Computational Molecular Biology
        10.2.2.5.2 The Burroughs-Wellcome Career Awards at the Scientific Interface
        10.2.2.5.3 Keck Center for Computational and Structural Biology: The Research Training Program
      10.2.2.6 Faculty Retraining in Midcareer
    10.2.3 Academic Organizations
    10.2.4 Industry
      10.2.4.1 Major IT Corporations
      10.2.4.2 Major Life Science Corporations
      10.2.4.3 Start-up and Smaller Companies
    10.2.5 Funding and Support
      10.2.5.1 General Considerations
        10.2.5.1.1 The Role of Funding Institutions
        10.2.5.1.2 The Review Process
10.2.5.2 Federal Support, 353
10.2.5.2.1 National Institutes of Health, 353
10.2.5.2.2 National Science Foundation, 356
10.2.5.2.3 Department of Energy, 357
10.2.5.2.4 Defense Advanced Research Projects Agency, 359
10.3 Barriers, 361
10.3.1 Differences in Intellectual Style, 361
10.3.1.1 Historical Origins and Intellectual Traditions, 361
10.3.1.2 Different Approaches to Education and Training, 362
10.3.1.3 The Role of Theory, 363
10.3.1.4 Data and Experimentation, 365
10.3.1.5 A Caricature of Intellectual Differences, 367
10.3.2 Differences in Culture, 367
10.3.2.1 The Nature of the Research Enterprise, 367
10.3.2.2 Publication Venue, 369
10.3.2.3 Organization of Human Resources, 369
10.3.2.4 Devaluing the Contributions of the Other, 369
10.3.2.5 Attitudinal Issues, 370
10.3.3 Barriers in Academia, 371
10.3.3.1 Academic Disciplines and Departmental Structure, 371
10.3.3.2 Structure of Educational Programs, 372
10.3.3.3 Coordination Costs, 373
10.3.3.4 Risks of Retraining and Conversion, 374
10.3.3.5 Rapid But Uneven Changes in Biology, 374
10.3.3.6 Funding Risk, 375
10.3.3.7 Local Cyberinfrastructure, 375
10.3.4 Barriers in Commerce and Business, 375
10.3.4.1 Importance Assigned to Short-term Payoffs, 375
10.3.4.2 Reduced Workforces, 376
10.3.4.3 Proprietary Systems, 376
10.3.4.4 Cultural Differences Between Industry and Academia, 376
10.3.5 Issues Related to Funding Policies and Review Mechanisms, 377
10.3.5.1 Scope of Supported Work, 377
10.3.5.2 Scale of Supported Work, 379
10.3.5.3 The Review Process, 380
10.3.6 Issues Related to Intellectual Property and Publication Credit, 381
11 CONCLUSIONS AND RECOMMENDATIONS 383
11.1 Disciplinary Perspectives, 383
11.1.1 The Biology-Computing Interface, 383
11.1.2 Other Emerging Fields at the BioComp Interface, 384
11.2 Moving Forward, 385
11.2.1 Building a New Community, 386
11.2.2 Core Principles for Practitioners, 387
11.2.3 Core Principles for Research Institutions, 388
11.3 The Special Significance of Educational Innovation at the BioComp Interface, 389
11.3.1 Content, 389
11.3.2 Mechanisms, 390
11.4 Recommendations for Research Funding Agencies, 392
11.4.1 Core Principles for Funding Agencies, 392
11.4.2 National Institutes of Health, 395
11.4.3 National Science Foundation, 397
11.4.4 Department of Energy, 397
11.4.5 Defense Advanced Research Projects Agency, 398
11.5 Conclusions Regarding Industry, 398
11.6 Closing Thoughts, 399
APPENDIXES
A The Secrets of Life: A Mathematician’s Introduction to Molecular Biology 403
B Challenge Problems in Bioinformatics and Computational Biology from Other Reports 429
C Biographies of Committee Members and Staff 437
D Workshop Participants 443
What Is CSTB? 445