intended only for viewing, but the data in a database have the potential to be downloaded, manipulated, analyzed, annotated, and combined with data from other databases. In short, databases can be far more than repositories—they can serve as tools for creating new knowledge.
For that reason, databases hold the key to how well biologists deal with the flood of information in which they now find themselves awash. Getting control of the data and putting them to work will start with getting control of the databases. With that in mind, on February 16, 2000, the National Research Council's Board on Biology held a workshop titled “Bioinformatics: Converting Data to Knowledge.” Bioinformatics is the emerging field that deals with the application of computers to the collection, organization, analysis, manipulation, presentation, and sharing of biologic data. A central component of bioinformatics is the study of the best ways to design and operate biologic databases. In that respect it contrasts with the field of computational biology, where specific research questions are the primary focus.
At the workshop, 15 experts spoke on various aspects of bioinformatics, identifying some of the most important issues raised by the current flood of biologic data. The pages that follow summarize and synthesize the workshop's proceedings, both the presentations of the speakers and the discussions that followed them. Like the workshop itself, this report is intended less to offer answers than to pose questions and to point to subjects that deserve more attention.
The stakes are high, and not only for biologic researchers. “Our knowledge is not just of philosophic interest,” said Gio Wiederhold, of the Computer Science Department at Stanford University. “A major motivation is that we are able to use this knowledge to help humanity lead healthy lives.” If the data now being accumulated are put to good use, the likely rewards will include improved diagnostic techniques, better treatments, and novel drugs, all generated faster and more economically than would otherwise be possible.
The challenges are correspondingly formidable. Biologists and their bioinformatics colleagues are in terra incognita. On the computer science side, handling the tremendous amount of data and putting them in a form that is useful to researchers will demand new tools and new strategies. On the biology side, making the most of the data will demand new techniques and new ways of thinking. And there is not a lot of time to get it right. In the time it takes to read this sentence, another discovery will have been made and another few million bytes of information will have been poured into biologic databases somewhere, adding to the challenge of converting all those data into knowledge.