Creating Databases
Pages 5-10

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 5...
... Today, biologic researchers face an entirely different sort of problem: how to handle an unaccustomed embarrassment of riches. "We have spent the last 100 years as hunter-gatherers, pulling in a little data here and there from the forests and the trees," William Gelbart, professor of molecular and cellular biology at Harvard University, told the workshop audience.
From page 6...
... Technology that has recently become available allows us to study individual cells or individual clusters of similar cells to look at either the genes that are being expressed in the cells or the gene products. If you do this in any one cell, you can easily come up with thousands of data points." A single brain cell, Koslow noted, may contain as many as 10,000 different proteins, and the concentration of each is a potentially valuable bit of information.
From page 7...
... The database contains descriptions of protein function as reported in the scientific literature, information on gene sequences and protein structures, details about proteins' roles in the cell and their interactions with other proteins, and data on where and when various proteins are produced in the body.
DATABASE CURATION
It is a major challenge, Garrels said, simply to capture all that information and structure it in a way that makes it useful and easily accessible.
From page 8...
... We train our curators a lot, and to have 6,000 untrained curators all sending us data on yeast would not work." Researchers, Garrels said, should deposit some of their results directly into databases (genetic sequences should go into sequence databases, for instance), but most of the work of curation should be left to specialists. In addition to acquiring and arranging the data, curators must perform other tasks to create a workable database, said Michael Cherry, technical manager for Stanford University's Department of Genetics and one of the specialists who developed the Saccharomyces Genome Database and the Stanford Microarray Database.
From page 9...
... Along another path, we would like to study how, by perturbing the normal parts list or instruction manual, we create aberrations in how organisms look, behave, carry out metabolic pathways, and so on. We need databases that support these operations." One stumbling block to such integration, Gelbart said, is that the best way to organize diverse biologic data would be to reflect their connections in the body.
From page 10...
BIOINFORMATICS: CONVERTING DATA TO KNOWLEDGE
... the natural world can be expected to flow from a well-organized collection of data, but organizing the data well demands a good understanding of that world. The solution is, as it was with Linnaeus, a bootstrap approach: Organize the data as well as you can, use them to gain more insights, use the new insights to further improve the organization, and so on.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.