Digital Data in the Neurosciences
The neurosciences illustrate both the potential value of well-organized and accessible data and the variety of issues raised by the increased importance of data handling and data sharing.
Given the complexity of the nervous system, it is not surprising that the neurosciences are rich in both the use of data and the need for it. The brain has roughly a hundred billion neurons and more than 1,000 subdivisions, each with different structures and circuitry. In the past, neurological research depended heavily on autopsy for clues about function and structure. Now it relies heavily on in vivo imaging methods and computational models, both of which depend on computing power and mathematical techniques.
This new universe of neuroscience data is too vast and complex for manual analysis. Large-scale detailed maps of the brain can require some 25 gigabytes of memory per image. Also, neuroscientists must work across multiple scales of resolution because they do not yet know which levels are critical for many neurological processes. They must integrate such diverse datasets as cellular neuroimaging, gene expression data, genotype data, neuronal morphology, and clinical data.
Making neuroscience data widely available holds tremendous potential for helping science and society. The potential benefits include:
Facilitating replication and validation of experimental results,
Promoting collective analyses of large numbers of experiments by different groups,
Improving communication within and between groups, and
Several very effective databases have been developed in the neurosciences. They include:
The Cell Centered Database, started in 2002, makes two-dimensional and three-dimensional static and dynamic microscopic data available to the research community. It also links data obtained at cellular and subcellular scales to molecular and higher order structure. It is built on the Biomedical Informatics Research Network and Telescience grid infrastructure for distributed collaboration.
SumsDB is a repository of brain-mapping data, including surfaces and volumes, with both structural and functional data. It includes more than 500 studies on monkeys, rodents, apes, humans, and others, totaling about 10 percent of the published literature. It also includes a data mining tool called WebCaret so that SumsDB