1
Introduction

NOAA is a mission agency and historically it has collected environmental and geospatial data of many types to meet its primary meteorological, oceanographic, and geophysical operational mission requirements. The data managed by NOAA stretch from the surface of the sun to the core of the earth, and affect every aspect of society. Although it is difficult and beyond the charge of this committee to assess the monetary value of environmental data, the importance can be inferred: For example, the Department of Commerce’s Bureau of Economic Analysis estimates that 42 percent of U.S. Gross Domestic Product is sensitive to weather and climate.2 Customers of this investment include NOAA, other Federal agencies, state and local governments, industry, business interests, scientists, educators, the general public, and the international community. The needs of these customers are diverse, making it difficult to assess the value of any particular environmental data stream. For instance, while outdated weather forecasts are of little use to most customers, they may be crucial for the legal and research communities.

NOAA’s National Environmental Satellite, Data, and Information Service (NESDIS) operates three national data centers (the National Climatic Data Center, National Geophysical Data Center, and National Oceanographic Data Center) and over thirty centers of data (e.g., the National Ice Center); collectively these entities are responsible for “acquiring, integrating, managing, disseminating, and archiving environmental and geospatial data and information obtained from worldwide sources to support NOAA’s mission”.3 For the purposes of this report, the term “data” will be taken to mean both environmental or geospatial observations (including physical samples) and model output. Equally important are metadata: all the information necessary for data to be independently understood by users, to ensure proper stewardship of the data, and to allow for future discovery.

In the NOAA 2007 budget request4 it is noted that “Collectively, the three national data centers acquire over one petabyte (10^15 bytes) of new data annually, provide access to an archive exceeding 3.5 petabytes, and support over 100 million worldwide queries per year, providing data transfers to over two million customers.” The rapid increase in the volume of data distribution (Figure 1) and its associated stewardship and management activities is a significant concern. Furthermore, the challenges associated with managing NOAA’s data are only expected to increase with the anticipated explosion in model output and new satellite systems in the years ahead (Figure 2). Even though the launch dates of some assets such as the National Polar-orbiting Operational Environmental Satellite System (NPOESS) are uncertain, NOAA has agreed to archive certain data collected by other agencies, such as MODIS (Moderate Resolution

2  Bureau of Economic Analysis figures reported in National Research Council, 1998, The Atmospheric Sciences Entering the Twenty-First Century, National Academy Press, Washington, D.C., page 25.

3  NOAA Administrative Order 212-15, effective 22 December 2003, available at http://www.corporateservices.noaa.gov/~ames/NAOs/Chap_212/naos_212_15.html

4  NOAA 2007 budget request “blue book,” available at http://www.corporateservices.noaa.gov/~nbo/07bluebook_highlights.html



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




Preliminary Principles and Guidelines for Archiving Environmental and Geospatial Data at NOAA: Interim Report

Imaging Spectroradiometer) data from the EOS (Earth Observing System) satellites operated by NASA. This large and exponentially growing data volume indicates an urgent need for NOAA to address its ability to handle the current and future needs of NOAA archive users; it has already begun to do so, but significant work remains.

In addition to data volume, data diversity is another challenge: NOAA’s consolidated observation requirements include over 2000 diverse variables ranging from hyperspectral satellite imagery to the stomach contents of fish (McLean S., 2006). These data come from a broad range of platforms including (but not limited to) satellites, fixed and mobile radars, research aircraft, buoys, and ships of opportunity, and may be derived from such diverse sources as embedded sensors, models, physical samples, and self-organizing networks, each of which is associated with unique challenges in organizing, cataloguing, archiving, and providing access to the data it collects or generates.

Figure 1: Quarterly data downloads from NOAA’s National Geophysical Data Center (NGDC), in gigabytes (line plot and left axis), and number of distinct hosts served (bars and right axis) for fiscal years 1993-2006 (Source: Fox C., 2006)

Figure 2: Current NOAA-NESDIS data archive volume projections under the Comprehensive Large Array-data Stewardship System (CLASS), in petabytes (Source: Updated May 4, 2006 from NOAA, 2003)

NOAA deserves praise for the steps it has taken and is taking to address its considerable and growing data management challenges. For instance, the recently completed Assessment of NOAA’s Environmental Data and Information Management report (NOAA, 2006) includes a comprehensive, NOAA-wide assessment of data management capabilities organized by its mission goals. This effort, along with the establishment of the NOAA Observing Systems Council (NOSC) and its components (NOAA Observing System Architecture (NOSA), Data Management Committee (DMC), and Chief Information Officer (CIO) efforts), will eventually bear fruit in an appropriate enterprise-wide culture and coordinated best processes, if they can be made effective.