
Currently Skimming:

2. Challenges and Opportunities
Pages 15-30

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 15...
... The process of acquiring environmental data for research or commercial use, however, continues to be difficult. Users must first seek out the data they need, which can be time consuming and difficult because there is no comprehensive list of or universal access point to all government data holdings.
From page 16...
... Inexperienced users and investigators using many different data sources require a substantial investment of time to acquire data. Almost without exception, data centers offer multiple methods of retrieving data in their holdings (e.g., file transfer protocol [FTP])
From page 17...
... Metadata in government data centers should include the following types of information: data formats (how information is stored within data files)
From page 18...
... STANDARD TRANSLATABLE FORMATS. Typically, standards for data and metadata management are created by the individuals and organizations collecting the data; community organizations such as professional societies, data centers, and sponsoring government agencies; and international organizations.
From page 19...
... Recommendation: With their user communities, data centers should accelerate work toward standardizing and making formats more transparent for data and metadata and thereby improve distribution and interoperability between data centers, between data centers and users, and between users. Metadata formatted in XML would assure that recipients would be able to parse data automatically and feed them directly to their applications.
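As a minimal sketch of what parsing XML-formatted metadata automatically can look like, the following Python example reads a small metadata record with the standard library. The element and attribute names (`dataset`, `title`, `format`, `variable`, `units`) are hypothetical and are not taken from the report or from any particular metadata standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record; the element names are illustrative only.
SAMPLE_METADATA = """\
<dataset>
  <title>Sea surface temperature, monthly mean</title>
  <format>netCDF</format>
  <variable name="sst" units="degC"/>
  <variable name="time" units="days since 1970-01-01"/>
</dataset>
"""


def parse_metadata(xml_text):
    """Return the title, data format, and per-variable units of a record."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "format": root.findtext("format"),
        "units": {v.get("name"): v.get("units") for v in root.findall("variable")},
    }


record = parse_metadata(SAMPLE_METADATA)
```

Because the record is structured, an application can look up the data format or the units of a variable directly from `record` instead of relying on a person to read free-text documentation.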
From page 20...
... While disk storage capacities continue to increase dramatically, tape capacities and transfer speeds have barely increased during the past five years. In addition, without random access to on-line data, subsetting through a network is unworkable, as users cannot capture slices of the linearly stored datasets.
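The role of random access in subsetting can be illustrated with a small Python sketch. It assumes a hypothetical dataset stored as fixed-size binary records (one 8-byte float each) and uses an in-memory buffer to stand in for on-line disk storage; the record layout and function names are assumptions made for this example.

```python
import io
import struct

# Assumed layout: one little-endian 8-byte float per record.
RECORD = struct.Struct("<d")


def write_records(stream, values):
    """Append each value to the stream as one fixed-size record."""
    for value in values:
        stream.write(RECORD.pack(value))


def read_slice(stream, start, count):
    """Read `count` records starting at record `start` by seeking directly."""
    stream.seek(start * RECORD.size)
    data = stream.read(count * RECORD.size)
    n_read = len(data) // RECORD.size
    return [RECORD.unpack_from(data, i * RECORD.size)[0] for i in range(n_read)]


store = io.BytesIO()
write_records(store, [float(i) for i in range(1000)])
subset = read_slice(store, 500, 3)
```

With a seekable store, extracting records 500 through 502 touches only 24 bytes. On a strictly sequential medium such as tape, the same request would require reading past the first 500 records.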
From page 21...
... , and support for indexing data on its spatial and temporal attributes enables efficient query execution. The complexity of the SQL query relates directly to the complexity of the database.
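A minimal sketch of a query over spatial and temporal attributes, using Python's built-in `sqlite3` module. The table name, column names, and sample rows are hypothetical, and a plain composite index stands in for the more specialized spatial indexes a production database might provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obs (id INTEGER PRIMARY KEY, lat REAL, lon REAL, "
    "obs_time TEXT, value REAL)"
)
# Composite index on the temporal and spatial columns used by the query below.
conn.execute("CREATE INDEX idx_obs_time_space ON obs (obs_time, lat, lon)")

# Illustrative sample observations (not real data).
rows = [
    (10.0, 20.0, "2001-01-15", 1.5),
    (10.5, 20.5, "2001-02-15", 2.5),
    (45.0, -70.0, "2001-02-20", 3.5),
    (10.2, 20.1, "2002-03-01", 4.5),
]
conn.executemany(
    "INSERT INTO obs (lat, lon, obs_time, value) VALUES (?, ?, ?, ?)", rows
)


def query_box(conn, lat_min, lat_max, lon_min, lon_max, t_start, t_end):
    """Return observations inside a bounding box and an ISO-date time window."""
    sql = (
        "SELECT lat, lon, obs_time, value FROM obs "
        "WHERE obs_time BETWEEN ? AND ? "
        "AND lat BETWEEN ? AND ? "
        "AND lon BETWEEN ? AND ? "
        "ORDER BY obs_time"
    )
    params = (t_start, t_end, lat_min, lat_max, lon_min, lon_max)
    return conn.execute(sql, params).fetchall()


hits = query_box(conn, 9.0, 11.0, 19.0, 21.0, "2001-01-01", "2001-12-31")
```

Dates are stored as ISO 8601 strings so that string comparison matches chronological order. The query stays simple here because the schema is a single flat table; a schema with more tables and relationships would need correspondingly more involved SQL.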
From page 22...
... Data centers have spent considerable effort preserving metadata by routinely documenting information on data lineage, such as the source data, transformation processes, and quality assurance information of their datasets.
From page 23...
... Fortunately, database technology and standard formats can be as useful for metadata management as they are for data management. The self-describing approach adopted in the definition of extensible languages such as XML Schema is an important step in realizing technologies to support metadata management in government data centers.
From page 24...
... data producers should include data lineage and authenticity information in the metadata; (2) data centers should improve management of and access to metadata through standard formats and database technologies; and (3)
From page 25...
... These computers generally have far smaller capabilities than the scientific computing hardware currently in the data centers. To be useful for scientific applications, the data segments, or granules, have to be broken into smaller units that can be ingested, processed, stored, and served with larger numbers of small processors.
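The idea of breaking a granule into smaller units that many small processors can handle is sketched below in Python. The granule is represented as a plain list of numbers, a thread pool stands in for a cluster of commodity machines, and the per-unit work is a simple sum; all of these are simplifying assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def split_granule(values, unit_size):
    """Break one granule into consecutive units of at most `unit_size` items."""
    return [values[i:i + unit_size] for i in range(0, len(values), unit_size)]


def process_unit(unit):
    """Placeholder per-unit computation: here, a partial sum."""
    return sum(unit)


granule = list(range(100))          # stand-in for one large data granule
units = split_granule(granule, 16)  # small units sized for small processors

# Each worker handles one unit at a time; results come back in unit order.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(process_unit, units))

total = sum(partial_sums)
```

The partial results are combined at the end, so the answer is the same as processing the whole granule on one large machine, while each worker only ever needs to hold one small unit.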
From page 26...
... Three current projects are attempting to implement this: MODster, NEpster, and the Distributed Oceanographic Data System (Sidebar 2.2). Recommendation: Data centers should adopt commodity hardware and commercial and open-source software solutions to the widest extent possible and concentrate their own efforts on problems that are unique to environmental data management.
From page 27...
... Due to their number and complexity, searching for a specific dataset is not a trivial task. To combat this, the Federation of Earth Science Information Partners is supporting the development of MODster to support the decentralization and distribution of MODIS data and services and to promote sharing of remote-sensing standard products.
From page 28...
... In addition, DODS applications allow users to transform existing data analysis and visualization applications into those able to access remote DODS data. Because DODS data are distributed by the same scientists who develop the data, the DODS protocol and software rely on the user community to use, improve, and extend the system.
From page 29...
... The costs of implementing demonstration data centers can be minimized by building on work that is already in progress (e.g., Sidebar 2.2). Finally, the demonstration centers would also help the data centers and communities adapt to serving and interacting with a wider range of users.
From page 30...
... Recommendation: Data centers and their sponsoring agencies should create independent demonstration data centers aimed at testing applicable technologies and satisfying the data needs of a range of users, including interdisciplinary and nontechnical users. These centers might best prove technological approaches through several participants working in parallel.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.