
Converting Data to Knowledge
Pages 23-28

From page 23...
... DATA MINING Perhaps the best-known technique is data mining. Because many data are now available in databases (including information on genetic sequences, protein structure and function, genetic mutations, and diseases), and because data are available not only on humans but also on many other species, scientists are finding it increasingly valuable to "mine" the databases for patterns or connected bits of information that can be assembled into a larger picture.
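The "larger picture" described here comes from joining records held in separate databases. The sketch below is purely illustrative; the record layout, field names, and toy entries are assumptions for demonstration, not the schema of any real database.

    # Illustrative sketch of "mining" linked biological records; the field
    # names and toy data below are invented for demonstration only.
    genes = {"BRCA1": {"sequence_id": "NM_007294"}}
    proteins = {"NM_007294": {"name": "BRCA1_HUMAN", "function": "DNA repair"}}
    mutations = [
        {"gene": "BRCA1", "variant": "185delAG", "disease": "breast cancer"},
    ]

    def assemble_picture(gene_name):
        """Join gene, protein, and mutation records into one combined view."""
        gene = genes[gene_name]
        protein = proteins[gene["sequence_id"]]
        linked = [m for m in mutations if m["gene"] == gene_name]
        return {"gene": gene_name, "protein": protein, "mutations": linked}

    print(assemble_picture("BRCA1"))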
From page 24...
... Over the next year, each time a new human protein was identified, they analyzed it by using homologues and a technique developed in Brutlag's laboratory called eMATRICES. "Using both methods, we assigned biologic functions to almost 77% of the human proteins."
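Two ideas are in play in that passage: giving a new protein the function of its closest known homologue, and scoring it against a position-specific scoring matrix, the general family of methods to which eMATRICES belongs. The Python sketch below illustrates both in miniature; the sequences, matrix values, and threshold are invented, and it is not a reimplementation of Brutlag's software.

    # Hedged sketch: homology-based function assignment plus a toy
    # position-specific scoring matrix (PSSM). All data here are invented.
    known_proteins = {
        "P1": {"sequence": "MKVLAT", "function": "kinase"},
        "P2": {"sequence": "MGGATC", "function": "transporter"},
    }

    def identity(a, b):
        """Fraction of matching positions between two sequences."""
        return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

    def assign_by_homology(query, min_identity=0.5):
        """Give the query the function of its closest homologue, if close enough."""
        best = max(known_proteins.values(), key=lambda p: identity(query, p["sequence"]))
        return best["function"] if identity(query, best["sequence"]) >= min_identity else None

    # Toy PSSM for a 3-residue motif: per-position scores for each residue.
    pssm = [{"M": 2.0, "K": 0.5}, {"K": 1.5, "G": 0.5}, {"V": 1.0, "G": 1.0}]

    def best_motif_score(query):
        """Slide the motif along the query and return the highest total score."""
        width = len(pssm)
        return max(
            sum(pssm[i].get(query[start + i], -1.0) for i in range(width))
            for start in range(len(query) - width + 1)
        )

    query = "MKVLGT"
    print(assign_by_homology(query))  # closest homologue's function
    print(best_motif_score(query))    # best-scoring placement of the motif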
From page 25...
... Any representation must be probabilistic; that is, the representation will not describe exactly where each structure or function lies, but will instead provide a set of possible locations and the likelihood of each. So instead of creating a single, sharply defined map laying out the various features of the human brain and coloring in the areas responsible for different functions, any brain-mapping project must find some way to capture and display the inherent fuzziness in where things lie in the brain.
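A minimal way to make that probabilistic idea concrete is to average binary structure masks across subjects, so each location carries a probability that the structure lies there rather than a yes-or-no label. The NumPy sketch below uses small synthetic "slices" invented purely for illustration.

    import numpy as np

    # Hedged sketch of a probabilistic map: 4x4 synthetic slices in which 1
    # marks voxels belonging to some structure in each subject (invented data).
    subject_masks = np.array([
        [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0]],
        [[0, 1, 1, 0], [0, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
        [[0, 0, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    ])

    # Averaging across subjects turns yes/no labels into a per-voxel probability:
    # the likelihood that the structure occupies that location in a given brain.
    probability_map = subject_masks.mean(axis=0)
    print(probability_map)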
From page 26...
... "The only problem was that this was a library exercise," Mazziotta concluded. "What it needs to be and what we want it to be is a digital database exercise, where the framework is the structure of human brain, so we can do an experiment, find this observation, and go deep into the data and find other features that are now very awkward to identify." The second motivating factor for developing the brain-atlas database, Mazziotta said, was the sheer amount of data generated by even the simplest experiments with the human brain.
From page 27...
... A second study compared the brains of a population of patients who had early Alzheimer's disease, averaged in probabilistic space, with the brains of a population of patients in the later stages of the disease. It found that Alzheimer's disease causes changes in the gross structure of the brain, thinning the corpus callosum and causing the upper part of the parietal lobe to shrink.
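Once every brain has been mapped into the same probabilistic space, a comparison like the one described reduces to comparing a structural measurement between the two groups. The sketch below shows that general pattern with synthetic placeholder numbers, not data from the study.

    import numpy as np
    from scipy import stats

    # Hedged sketch of a two-group comparison in a common atlas space.
    # The thickness values are synthetic placeholders, not study data.
    early_stage = np.array([6.1, 5.8, 6.3, 5.9, 6.0])  # e.g., callosal thickness, mm
    late_stage = np.array([5.2, 5.0, 5.4, 4.9, 5.1])

    # A two-sample t-test asks whether the group means differ by more than
    # chance would explain, given the within-group variation.
    statistic, p_value = stats.ttest_ind(early_stage, late_stage)
    print(f"mean difference: {early_stage.mean() - late_stage.mean():.2f} mm, p = {p_value:.4f}")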
From page 28...
... "And if you could compare her brain with those of a well-matched population of other 19-year-old left-handed Asian women who smoke cigarettes, had 2 years of college, and had not read Gone With the Wind, you might find that there is an extra fold in the gyrus here, the cortex is a half-millimeter thicker, and so on." In short, because of the data that it is gathering on its subjects and the capability of isolating the brains of subjects with particular characteristics, the probabilistic brain atlas will allow physicians and researchers not only to say what is normal for the entire population, but also what is normal for subgroups with specific traits. And that is something that would not be possible without harnessing the tremendous data-handling capabilities of modern biologic databases.

