Review of NASA's Distributed Active Archive Centers

9
Oak Ridge National Laboratory DAAC

Panel Membership

J.-BERNARD MINSTER, Chair, Scripps Institution of Oceanography, La Jolla, California

FERRIS WEBSTER, Vice Chair, University of Delaware, Lewes

KENNETH D. DAVIDSON, National Climatic Data Center, Asheville, North Carolina

BETH H. DRIVER, National Imagery and Mapping Agency, Chantilly, Virginia

WILLIAM J. PARTON, JR., Colorado State University, Fort Collins

ERIC T. SUNDQUIST, U.S. Geological Survey, Woods Hole, Massachusetts

ABSTRACT

The Oak Ridge National Laboratory (ORNL) DAAC manages data needed to study biogeochemical fluxes and processes. In contrast to other DAACs, the ORNL DAAC deals with data derived mainly from intensive field campaigns and process studies, rather than from satellites. ORNL DAAC data are important to the satellite program, however, because they provide a means for validating data from high-resolution imaging satellites, such as Landsat, and multisensor platforms, such as AM-1. In fact, without the integration of remote sensing and in situ observations, it is doubtful whether the EOS program would ever fulfill its full potential.

At the time of the site visit, the ORNL DAAC was only beginning to get
involved in the EOS Land Validation Program. Given that Landsat 7 and the AM-1 platforms are scheduled to be launched less than a year from now, the DAAC will have to be resourceful and assertive to become meaningfully involved in the calibration and validation plans for these programs. It will also have to have a vision of what it intends to accomplish in the EOS Land Validation Program and other data activities, and the panel's main recommendation is that the DAAC develop a vision and implementation strategy for fulfilling its special mission within the DAAC system. A plan for participating in the EOS flight missions is a key part of the implementation strategy, and given the imminence of the launch dates, the DAAC needs to develop this plan immediately. Only by becoming involved with satellite programs will the ORNL DAAC effect the transition from being a biogeochemical data center to becoming a fully functional DAAC.

INTRODUCTION

The Oak Ridge National Laboratory DAAC was created in 1993 to archive and disseminate biogeochemical dynamics data from NASA and non-NASA field activities (Box 9.1). It is collocated with several environmental data centers at Oak Ridge National Laboratory, including the Carbon Dioxide Information Analysis Center, the World Data Center for Atmospheric Trace Gases, and the Atmospheric Radiation Measurement (ARM) archive.

The DAAC has a unique role within EOSDIS. Rather than serving as a repository for large volumes of data from a small number of remote sensing missions, it serves as collecting agent and repository for small, disparate sets of data from principal investigators of field projects sponsored by NASA and other federal agencies. Rather than processing data, the DAAC makes available data that have been processed by the scientists who collected them. These ground- and aircraft-based data are critical for validating EOS remote sensing instruments.
In the future, the DAAC will continue to provide data management support for field campaigns, such as the Large-Scale Biosphere Atmosphere Experiment in Amazonia (LBA), as well as for process studies related to biogeochemical dynamics. It will also place increasing emphasis on the EOS Land Validation Program. These activities will not pose a technological problem for the DAAC: processing capacities and response times are ample for current needs, and the DAAC is able to add on-line storage as required. In the panel's opinion, the greatest challenge facing the DAAC is implementing its role in integrating ground-based measurements into the overall array of remotely sensed data available from EOS and other NASA programs. The ORNL DAAC is a crucial link in the validation of biogeochemical inferences from remotely sensed data. It is not an exaggeration to assert that the success (or failure) of the ORNL DAAC in this role could make (or break) large components of the stated mission of the EOS program. Yet, neither the DAAC, ESDIS, nor the instrument teams seem to appreciate fully the DAAC's critical role in EOS.

The Panel to Review the ORNL DAAC held its site visit on March 19–20, 1998. The following report is based on the results of the site visit and e-mail correspondence with the DAAC manager in June through September 1998.

BOX 9.1. Vital Statistics of the ORNL DAAC

History. The ORNL DAAC, which was created in 1993, is one of several centers that manage environmental data in the Environmental Sciences Division of Oak Ridge National Laboratory. The DAAC's holdings go back to the 1800s, although most data sets are relatively modern.

Host Institution. DOE's Oak Ridge National Laboratory in Oak Ridge, Tennessee.

Disciplines Served. Biology, ecology, geology, and chemistry.

Mission. To provide data and information about the Earth's biogeochemical dynamics and ecology to the global change research community, policy makers, educators, and interested general public.

Holdings. The DAAC holds about 48,000 files with a total volume of 6–7 GB.

Users. There were 1,143 distinct users in 1997.

Staff. In FY 1998 the DAAC had 14 FTEs and 1 ECS contractor.

Budget. Approximately $2.6 million in FY 1998 (including DAAC costs and ECS-provided hardware, software, and personnel), decreasing to $2.4 million by FY 2000.

HOLDINGS

The ORNL DAAC archives and distributes biogeochemistry data sets associated with intensive field campaigns and global terrestrial ecosystems process studies (Box 9.2). The intensive field campaigns, such as the First International Satellite Land Surface Climatology Project Field Experiment (FIFE), are funded through NASA's Terrestrial Ecology Program. The ORNL DAAC is preparing to receive the data from the Boreal Ecosystem-Atmosphere Study (BOREAS) field campaign and will manage the data from the LBA field campaign in the next few years. An example of data from a field site is shown in Figure 9.1.
These data contribute to the overall goals of the ORNL DAAC by providing data to test and develop biophysics and ecosystem models at different sites and to compare with biogeochemistry data that will be collected in the future. The global terrestrial ecosystem data sets (Box 9.2), such as Net Primary Production (NPP), are primarily used to develop and test global ecosystem models. The ORNL DAAC is also collecting and processing data from the Fluxnet program, an international program in which microclimate variables and fluxes of CO2, trace gases, and heat are being measured from regional networks of eddy-correlation towers.

FIGURE 9.1. Response of plant biomass to monthly and yearly variations in seasonal precipitation at a grassland site in Kenya. The dry season, from June to October, is indicated by the Walter-Lieth Climate Diagram, a commonly used method of visualizing climate based on mean temperature and precipitation data (the dark shaded area represents the period of relative drought). Data on biomass response to climate variation are important for understanding the global carbon cycle and how net primary productivity on a grassland may vary in response to global change. SOURCE: ORNL DAAC.

BOX 9.2. Data Holdings as of January 1998

Intensive Field Campaigns

First International Satellite Land Surface Climatology Project Field Experiment (FIFE): Monthly data from the Kansas prairie for May 1987 to October 1989.

Superior National Forest (SNF): Daily, monthly, and yearly data from Minnesota for 1972 to 1990.

Oregon Transect Ecosystem Research (OTTER): Monthly data for May 1989 to June 1991.

Boreal Ecosystem-Atmosphere Study (BOREAS): Data are currently available from the Goddard Space Flight Center.

Process Studies

Net Primary Production (NPP): Daily, monthly, and yearly data from grassland and woodland sites worldwide. Studies range from 3 to 51 years in duration.

Amazon River Basin Precipitation: 0.2-degree gridded monthly and daily data from Peru, Bolivia, and Brazil for January 1972 to December 1992.

Global Wetlands and Methane Emissions: Global monthly data at 1-degree resolution from the 1980s.

U.S. Streamflow: Monthly data from the United States for 1874 to 1988.

River Discharge: Measurements from more than 1,000 stations around the world for 1807 to 1991.

Hydroclimatology: Monthly point data from the continental United States for 1948 to 1988.

SOURCE: NASA (1998).

Commitment to Data

Perhaps the greatest strength of the ORNL DAAC lies in its commitment to data and its readiness to do the often thankless work required to render the data it receives useful to the larger scientific community. Such commitment is particularly important for data collected in terrestrial field studies in which investigators, often working in difficult and rapidly changing conditions, must make on-the-fly decisions to control the most significant variables in a unique setting. Rendering such descriptions useful to other scientists requires devoting time and attention to compiling, organizing, and presenting information about how the data were collected, and storing the data so that they can be readily used in conjunction with other data from the DAAC, work that the academic and research communities do not generally reward.

Integration of Ground-Based and Remotely Sensed Data

The ORNL DAAC's role in the EOS Land Validation Program will be (1) to make ground- and aircraft-based data available for validating satellite instruments and algorithms, and (2) to facilitate integration of these disparate data types by working with researchers to ensure that the data are in suitable, self-consistent formats. Both satellite data validation and data set integration are necessary for scientists to gain a more complete understanding of biogeochemical processes.

With regard to validating the satellite measurements, one of the most important issues has to do with geolocation. For incoming satellite data and for modern field studies using the Global Positioning System (GPS), sources of data can be located on the Earth's surface precisely and accurately. However, it may not be possible to provide locations of comparable accuracy for many historical sources of data that are essential to understanding biogeochemical dynamics. Even modern field measurements must often change locations opportunistically, depending on local conditions, yielding aggregated data that may be representative of a less-than-precise location. Thus, for many biogeochemical studies, geolocation attributes must include estimates not only of position but also of geolocation precision and accuracy. Such estimates, which are often difficult and are seldom undertaken in a comprehensive manner, exemplify the kind of data treatment that should be provided by the ORNL DAAC. Yet, at the site visit, the ORNL DAAC did not seem to be aware of these issues.

Recommendation 1. As a crucial element of the EOS program, the DAAC should work to resolve the issues of accurate co-registration of in situ and remotely sensed data, and ensure that both its staff and its users fully understand what the DAAC's data holdings mean to the proper interpretation of remotely sensed data.
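As an illustration only, the geolocation attributes discussed above (a position together with an estimate of its accuracy) might be recorded and used as in the following Python sketch. Nothing here is drawn from the DAAC's systems: the record layout, the function names, the circular-footprint simplification, the assumption that the two position errors are independent, and the coordinates and error values in the example are all hypothetical.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


@dataclass(frozen=True)
class FieldSite:
    """A field measurement location with an explicit position error (hypothetical layout)."""
    name: str
    lat_deg: float
    lon_deg: float
    position_error_m: float  # estimated horizontal error of the recorded position


def great_circle_distance_m(lat1, lon1, lat2, lon2):
    """Haversine distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def combined_error_m(site_error_m, pixel_error_m):
    """Combine two position errors in quadrature (assumes they are independent)."""
    return math.hypot(site_error_m, pixel_error_m)


def can_be_matched(site, pixel_lat, pixel_lon, pixel_size_m, pixel_error_m):
    """True if the site plausibly lies inside the pixel footprint.

    Simplification: the pixel is treated as a circle of radius pixel_size_m / 2.
    """
    offset = great_circle_distance_m(site.lat_deg, site.lon_deg, pixel_lat, pixel_lon)
    margin = combined_error_m(site.position_error_m, pixel_error_m)
    return offset + margin <= pixel_size_m / 2


if __name__ == "__main__":
    # Illustrative values only: a GPS-located plot and a historical plot
    # recorded at the same nominal coordinates.
    gps_plot = FieldSite("GPS-located plot", 39.08, -96.56, 10.0)
    historical_plot = FieldSite("historical plot", 39.08, -96.56, 2000.0)
    for site in (gps_plot, historical_plot):
        print(site.name, can_be_matched(site, 39.08, -96.56, 1000.0, 50.0))
```

In this sketch the two plots share the same nominal coordinates, yet only the one with a small stated position error can be paired with a 1-km pixel; the point is that the accuracy estimate, not the position alone, determines whether a comparison is defensible.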

Metadata

Many of the DAAC's holdings are documented by the principal investigators of individual field campaigns and process studies, who tend to work with the data for three to four years until the quality control and documentation are as complete as possible. Incorporating metadata about field projects into the EOSDIS metadata model, which was designed for satellite data, is not direct and requires numerous interactions between the DAAC and ESDIS. In addition, the existing metadata descriptors are apparently changed by staff at the Global Change Master Directory (GCMD), often to descriptors that are not in common usage by field investigators, requiring the ORNL DAAC to expend additional effort to make such changes. For example, DAAC staff told the panel that ESDIS and GCMD staff have changed the descriptor "carbon dioxide" to "CO2" to "atmospheric carbon dioxide concentration" and back to "carbon dioxide" over the past two years, requiring the DAAC to change its metadata files accordingly.

Recommendation 2. Given the importance of validating the EOS remote sensing measurements, ESDIS should ensure that the ECS metadata model accommodates data derived from ground- and aircraft-based studies.

Formats

Although the standard format for EOS data is HDF-EOS, all of the ORNL DAAC's holdings are kept as ASCII files. Users are happy with ASCII formats and ESDIS does not require that ground-based data be put into HDF-EOS because the overhead is too high for small data sets. An ability to work with HDF-EOS, however, will be needed for validating land data from MODIS and other remote sensing instruments, and the panel encourages the DAAC to become familiar with HDF-EOS.

Processing Strategy

Although the principal investigators of field experiments and process studies process most of the data eventually held by the DAAC, the DAAC acquires some unprocessed data from other sources.
In addition to processing these data, the DAAC will become involved in processing Fluxnet data several years from now. On occasion, the scientific community has asked the DAAC to generate value-added products, but the DAAC has no immediate plans to reprocess data sets.

Long-Term Archive

NASA is currently negotiating with NOAA to provide a long-term archive for EOS data. Until a Memorandum of Understanding is concluded, the ORNL DAAC will continue to archive its holdings as long as funding is available. The panel notes, however, that the ORNL DAAC is collocated with two long-term DOE archives: the Carbon Dioxide Information Analysis Center and the ARM archive. Moving the DAAC's data sets to one or both of those centers would likely be cheaper than moving them to NOAA and would keep the data sets where the scientific expertise resides. At the urging of the ORNL DAAC, NASA and DOE apparently considered an MOU on long-term archive several years ago. The panel urges the two agencies to resume their discussion.

Recommendation 3. NASA and DOE should consider establishing a Memorandum of Understanding for the long-term archive of biogeochemical data from the ORNL DAAC.

USERS

Characterization of the User Community

The ORNL DAAC primarily serves the global change research community, which includes scientists who use terrestrial ecology and biogeochemical dynamics data from process studies, field experiments, and remote sensing. A high proportion (about 30% of users in 1997) are foreign researchers, who learn about the data from scientific publications. Many U.S. scientists who are interested in the types of data held by the DAAC were originally involved in the field experiment or study and, consequently, are not frequent users. Individuals from private corporations (e.g., corporations that harvest forested lands) and institutions are also occasional users of the DAAC.

The DAAC does not keep track of the more detailed characteristics of its user community, and the panel encourages the DAAC to do so. This will permit the DAAC to develop a more accurate user profile, which in turn will assist it in expanding user activities and increasing usage of its data. The ORNL DAAC is arguably the most prominent U.S. data center focused on terrestrial ecology and environmental data, and it has the potential to serve a much larger segment of the biological and ecological communities.

In the panel's view, the DAAC should also begin to capture and more carefully analyze statistics on data usage and use the results to construct metrics of its performance. Not only would such data be useful in better understanding how users respond to DAAC initiatives or to external changes, but they would afford a deeper understanding of user differences and of gradual shifts in user behavior, which might be expected as holdings grow and become more diverse. A metrics program need not be elaborate; it must be inexpensive to implement, it must result in consistent data collection, and it must provide data that can be analyzed and used with ease and with confidence.
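To suggest how modest such a metrics program could be, the following Python sketch reduces a request log to a handful of consistently defined measures. It is an assumption-laden illustration rather than a description of any DAAC system: the `Request` record, its field names, and the sample log are all invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One data request; every field name here is hypothetical."""
    user_id: str
    dataset: str
    bytes_delivered: int
    is_foreign: bool


def summarize(requests):
    """Reduce a list of requests to a few consistently defined usage metrics."""
    users = {r.user_id for r in requests}
    foreign_users = {r.user_id for r in requests if r.is_foreign}
    return {
        "distinct_users": len(users),
        "foreign_user_share": len(foreign_users) / len(users) if users else 0.0,
        "requests_per_dataset": Counter(r.dataset for r in requests),
        "bytes_delivered": sum(r.bytes_delivered for r in requests),
    }


# Invented sample log, for illustration only.
SAMPLE_LOG = [
    Request("u1", "FIFE", 120_000, False),
    Request("u1", "NPP", 40_000, False),
    Request("u2", "NPP", 40_000, True),
    Request("u3", "OTTER", 75_000, False),
    Request("u2", "FIFE", 120_000, True),
]

if __name__ == "__main__":
    for name, value in summarize(SAMPLE_LOG).items():
        print(name, value)
```

Because each measure is computed the same way from the same log every time, results from different periods can be compared directly, which is the consistency the panel asks for.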

Recommendation 4. To better serve its users, the DAAC should develop and implement a strong metrics program to track resource usage, evaluate the impact of data management decisions, and predict the outcome of future actions.

Finally, the development of a "marketing" plan may also help the ORNL DAAC to expand its user base, thus increasing its constituency. This then could prove valuable for budget purposes as well as for generating successful proposals to other government or nongovernment agencies.

User Working Group

The DAAC has a good relationship with its User Working Group (UWG). The UWG is an independent body; it helps the DAAC determine which data sets to acquire and what emphasis to give to existing data sets. For example, the current allocation of work between field campaigns, validation of remote sensing products, and ecosystem modeling (see "Data Priorities," below) was recommended by the UWG. The greatest concerns of the UWG include (1) increasing the size of the DAAC's user community; (2) competition for DAAC activities by the Earth Science Information Partners (ESIPs) of NASA's prototype federation; and (3) transferring the BOREAS data, which are currently being managed at Goddard Space Flight Center, to the DAAC so that they can be distributed to the broader community. With regard to the last of these, the BOREAS Project Office and the ORNL DAAC formally agreed several years ago that the BOREAS data would be transferred to the ORNL DAAC for archive and unrestricted distribution. However, neither the DAAC nor the UWG has been able to obtain the data, with the result that the transition is taking a longer time than expected.

Relationship with the Scientific Community

A strength of the ORNL DAAC is its willingness to work with the scientific user community to provide information needed for research. The DAAC is working toward involving scientists more closely in its operations, and an obvious opportunity for enhanced interaction exists at Oak Ridge National Laboratory, which hosts a large environmental sciences group. The DAAC could also enhance its visiting scientist program or recruit DOE postdocs to work with the data. Seeing scientists work with the data will help the DAAC understand better how the data are used and how it can better serve the needs of the scientific community.

Past experience with scientists who provide data to the DAAC has shown the DAAC the benefits of early involvement in the intensive field campaigns. If the DAAC has not been involved in any stage of the experiment, it will have a considerably more difficult time serving users when the data finally arrive. The DAAC has taken important strides to remedy this problem through its early involvement in planning data management for the LBA intensive field site investigation, and the panel applauds this strategy.

User Services

The panel got the impression that the user services group at the DAAC is dedicated and competent. Indeed, the DAAC prides itself on its ability to satisfy users' requests. If the DAAC is to meet the challenge of the EOS Land Validation Program, however, the user services group will have to place increased emphasis on providing standardized data sets for the development and intercomparison of biogeochemical models, and on providing data from a variety of platforms (chambers, buoys, towers, ships, aircraft, balloons, satellites) in self-consistent and accessible ways.

TECHNOLOGY

Hardware Availability

The DAAC's in-house computer suite appears to be a reliable resource for DAAC staff and users. Processing capacities and response times are ample for current needs, and the project is able to add on-line storage as it is required. The team appears to use adequate configuration management practices and to perform operations housekeeping, such as backups, consistently.

One user observed that network capacity for on-line delivery of data to customers was likely to become a problem; however, it is expected that most DAAC users will continue to use public networks, augmented by transportable media. The panel believes that developing a simple performance model for end-to-end servicing of customer requests would help the DAAC evaluate the net impact of response times at any point in the process. For example, the impact of high-speed retrieval capabilities for data to be delivered by Federal Express would have to be weighed against other services the DAAC might improve with the same funds.
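A performance model of the kind suggested above can be very simple. The Python sketch below treats a request as a sequence of stages and asks how much the end-to-end time improves when one stage is made faster. The stage names and durations are hypothetical, and the model assumes the stages run strictly one after another.

```python
def end_to_end_hours(stage_hours):
    """Total elapsed time for one request: the sum of its sequential stages."""
    return sum(stage_hours.values())


def saving_from_speedup(stage_hours, stage, factor):
    """Hours saved end to end if one stage becomes `factor` times faster."""
    faster = dict(stage_hours)
    faster[stage] = stage_hours[stage] / factor
    return end_to_end_hours(stage_hours) - end_to_end_hours(faster)


# Hypothetical stage times (hours) for a request shipped on physical media.
MEDIA_ORDER = {
    "order handling": 2.0,
    "archive retrieval": 1.0,
    "media writing": 1.0,
    "courier delivery": 24.0,
}

if __name__ == "__main__":
    total = end_to_end_hours(MEDIA_ORDER)
    for stage in MEDIA_ORDER:
        saved = saving_from_speedup(MEDIA_ORDER, stage, 2.0)
        print(f"{stage}: doubling speed saves {saved:.1f} h of {total:.1f} h")
```

With these invented numbers, doubling archive retrieval speed saves half an hour out of 28, whereas the courier stage dominates the total. A model of this sort lets the cost of faster retrieval be compared against other possible uses of the same funds.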
Significant changes in system load may occur when large volumes of time-series data are collected and delivered to customers on a regular basis. The DAAC will have to tune performance as resource scarcities shift, to anticipate sudden degradations in service as resources saturate, and to justify budgeting for increased capacity while current resources are not fully utilized. The outyear budgets presented to the review team did not show significant funds earmarked for additional system capacity.

The DAAC is not on the schedule for ECS delivery and will soon have to replace its hardware. It has no plans for an evaluation of costs and benefits of using ECS as opposed to continuing to use a relatively small, tailored system with ECS-compatible data definitions and interfaces. The DAAC should be prepared to make a case for its findings on either side of this issue.

Finally, the panel suggests that within the next four to six months, the DAAC verify its own readiness for the millennium rollover. Although the DAAC team was able to answer specific questions concerning potential year 2000 problems and appears to have avoided a significant data conversion cost, it does not appear to have done the methodical self-assessment and verification called for by U.S. government directives, which are warranted to ensure uninterrupted service.

User Interfaces

Developers showed justifiable pride in fielding easy-to-use user interfaces and evaluating them based on user behaviors. They also showed creativity in applying new technology to expedite the work of in-house staff as well as to facilitate timely, complete, and accurate capture of relevant data from scientists at the point of capture. Such tools, combined with prompt processing of the data upon receipt, should reduce costs and improve the fidelity of the descriptive data that accompany field samples.

Media Versus Web Distribution Strategy

Some DAAC staff voiced a preference for Web technology (i.e., string-based searches or browsers that can be used with many file structures) over traditional database management system technology. The panel counsels that such a change should not be made before more powerful Boolean search capabilities become available. As the DAAC's holdings grow, users may have greater need for the precise search capabilities afforded by structured databases than they do at present. Web browsers often return large numbers of irrelevant references that are difficult to eliminate with today's search tools.
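The difference between string-based and structured searching can be shown with a small Python sketch. The catalog below is entirely invented (titles, fields, and descriptions are placeholders, not DAAC holdings), and the two functions are simplified stand-ins for a free-text search and a Boolean query on structured fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    """A catalog entry with structured fields plus free text (hypothetical layout)."""
    title: str
    variable: str
    region: str
    first_year: int
    last_year: int
    description: str


# Invented catalog entries, for illustration only.
CATALOG = [
    DataSet("Basin precipitation grid", "precipitation", "Amazon",
            1972, 1992, "Gridded monthly and daily precipitation."),
    DataSet("Streamflow stations", "streamflow", "United States",
            1948, 1988, "Monthly streamflow; notes mention precipitation gauges."),
    DataSet("Prairie field campaign", "surface fluxes", "Kansas",
            1987, 1989, "Flux measurements with supporting precipitation records."),
]


def string_search(catalog, text):
    """Free-text match against titles and descriptions."""
    needle = text.lower()
    return [d.title for d in catalog
            if needle in d.title.lower() or needle in d.description.lower()]


def boolean_search(catalog, variable, region, year):
    """Structured query: variable AND region AND year within coverage."""
    return [d.title for d in catalog
            if d.variable == variable and d.region == region
            and d.first_year <= year <= d.last_year]


if __name__ == "__main__":
    print(string_search(CATALOG, "precipitation"))
    print(boolean_search(CATALOG, "precipitation", "Amazon", 1980))
```

In this toy catalog the string search for "precipitation" matches every entry, because the word appears somewhere in each description, while the structured query returns only the one data set whose variable, region, and period all match.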
MANAGEMENT

General Philosophy

The DAAC management team, led by Larry Voorhees, is well experienced in data management activities, and this is reflected in its ability to instill into the staff the requirements and importance of data management. In general, the basic data center functions of acquisition, quality control, archiving, and providing access to data appear to be well understood by the DAAC and carried out successfully. Several CD-ROMs based on current data holdings have been produced, and they demonstrate the DAAC's understanding of the value of and requirements for its data holdings by the research community. The strengths of the management activities are tied to the staff's dedication to the data, not the organization, and the general understanding of data center activities, which is due in part to their collocation with other Oak Ridge National Laboratory data centers.

In the panel's view, DAACs should be more than data centers because, in addition to the data center functions mentioned above, they are involved in active satellite programs. The ORNL DAAC, however, functions more like a data center than a DAAC. To become a DAAC in reality, the ORNL DAAC will need a clear identity and a stronger sense of its special mission (validating satellite data with ground- and aircraft-based measurements) within the EOS program. A vision and an implementation strategy for achieving this vision are critical. The implementation strategy should also describe the DAAC's participation in the EOS flight missions. Such participation in the flight missions will require a proactive attitude at the DAAC. In fact, the DAAC's experience with early involvement in upcoming field campaigns such as LBA will help it form similar relationships with the instrument teams. By thinking strategically, the DAAC will also be able to further improve the effectiveness of its data center operations.

Recommendation 5. The ORNL DAAC should articulate a vision of its mission within the EOS program and an implementation strategy with goals for fulfilling this mission. The strategy should influence every decision the DAAC makes, from participation in the relevant EOS flight missions, to targets for data acquisition, priorities for data ingest and preparation, staffing, development of a user base and constituency, and allocation of resources for current-year and projected spending. Such a strategy should help the ORNL DAAC become a DAAC in reality.

Personnel

The DAAC has 1 ECS liaison and funding for 14 FTEs, which supports 10 full-time and 13 part-time staff. Only a few FTEs are devoted to system development, indicating the DAAC's focus on operations. DAAC staff come from a variety of line organizations at Oak Ridge National Laboratory, but they are located together and report to Voorhees. The line managers supervise the staff, write annual performance reviews, and make decisions on promotions with input from Voorhees. Consequently, the DAAC has little opportunity to directly reward or promote outstanding efforts. Turnover appears to be higher than normal, perhaps because of uncertainties in long-term funding. Employees seem to recognize that their jobs may, and could, go away at the discretion of a NASA executive.

The DAAC has a good relationship with its ECS liaison. She is viewed as a team member by the DAAC, and her only ECS tasks are to keep track of the status of ECS deliveries to other DAACs and to make sure that the DAAC does not stray too far from the ECS standards.

In the panel's view, the ideal data center has a mixture of data management experts, computer programmers, and scientists familiar with the types of data stored at the center. Scientists familiar with the data have a first-hand understanding of the potential problems associated with experimental data sets, the scientific value of the data sets, and the type of analysis being done. The excellent work by Jonathan Scurlock (ORNL scientist) and Dick Olson (DAAC staff) on the global NPP data sets shows the advantage of having scientists familiar with the NPP data work with data management personnel. In general, however, scientists in the Environmental Sciences Division do not appear to be actively working with DAAC personnel on most of the biogeochemistry data stored by the DAAC, and the DAAC should aggressively pursue opportunities for collaboration (see "Relationship with the Scientific Community").

Budget

The DAAC's FY 1998 budget is approximately $2.6 million, and the budget is not expected to vary greatly over the next several years (Table 9.1). Staff costs are currently about 85% of the total DAAC budget, indicating the DAAC's focus on operations, rather than development. The DAAC practices full-cost accounting, so the figures given in Table 9.1 represent the true cost of the DAAC, including occupation of the facility and support services. Although the budget is small compared with the other DAACs, the UWG is concerned that the DAAC's user community is too small to justify the DAAC's resources.

Based on the annual budget for the program and the number of staff supported, it appears that the average cost per work year is somewhat high, even when the academic credentials of the staff are considered. The costs are driven by the overhead imposed by ORNL operations. If the DAAC grows in order to process more data sets, it should consider organizing its work so additional staff would not require the academic credentials and experience typical of the current labor mix.

TABLE 9.1. Total ORNL DAAC Costs (million dollars)(a)

Fiscal Year      1994  1995  1996  1997  1998  1999  2000  2001  2002
ORNL DAAC         0.8   2.7   2.1   2.2   2.6   2.3   2.3   2.5   2.7
ECS hardware      0     0     0     0     0     0.4   0     0     0
ECS software      0     0     0     0     0     0.1   0     0     0
ECS personnel     0     0.1   0.1   0.1   0     0     0.1   0.1   0.1
Total cost        0.8   2.8   2.2   2.3   2.6   2.8   2.4   2.6   2.8

(a) Budget numbers for FY 1994–1997 are actual values; numbers for FY 1998–2002 are projections, as of May 1998.
SOURCE: ESDIS.

The panel also suggests that the DAAC capture actual costs in a work breakdown structure that would permit isolation of costs for specific services performed. It would be useful to capture other performance indicators in order to analyze the high "cost per stored byte" that is the only available performance measure at present and to evaluate impacts of variables, such as source or type of data, and process changes, such as those for facilitating data capture that the staff has already initiated. The panel recognizes that the acquisition, quality control, and maintenance of the ORNL data sets (each of which is unique and requires unique management) is labor intensive and suggests that these costs be quantified as part of the performance measures.

The fact that the DAAC practices full-cost accounting provides a rare opportunity to estimate the true cost of operating a data center. The absence of hidden costs may partly explain the apparently high cost per stored byte.

Data Priorities

The ORNL DAAC archives and distributes biogeochemistry data sets associated with three major activities: (1) NASA intensive field campaigns, (2) EOS Land Validation Program, and (3) global terrestrial ecosystems process studies. Currently 40% of the DAAC's effort is associated with archiving and distributing the small, diverse data sets from the intensive field campaigns; and another 40% of the DAAC's effort goes toward the EOS Land Validation Program. The remaining 20% of the DAAC effort is concerned with global terrestrial ecosystem data sets.
The 40:40:20 allocation of effort was recommended by the DAAC's User Working Group, and it represents a significant increase in emphasis on the EOS Land Validation Program. The panel concurs with this change and suggests that in the future the DAAC should further increase its emphasis on the EOS Land Validation Program, if it can do so without jeopardizing its ability to meet the data needs of the broader biogeochemical community. Archiving and distributing data from NASA-funded field campaigns contribute to the overall objectives of the ORNL DAAC but should not be its primary focus. By expanding its role in the EOS Land Validation Program, the ORNL DAAC will strengthen its position in the overall EOS program.

ORNL DAAC AND THE EARTH SCIENCE ENTERPRISE

Relation to Oak Ridge National Laboratory

The ORNL DAAC is located within the Environmental Sciences Division of Oak Ridge National Laboratory. The Environmental Sciences Division focuses on five strategic areas, three of which are relevant to the DAAC: response of ecosystems to environmental change, integrated assessments, and environmental data management. The division's budget is $40 million per year, which is divided among about 75 activities. To the panel, the DAAC appeared to be lost in the Oak Ridge organization. If NASA funding is reduced, there seems to be little motivation for the Oak Ridge general contractor to perform this type of work in the future. Similarly, the DOE office at the laboratory views the ORNL DAAC as "work for others," which will cease to exist when funding ends.

Relation to ESDIS

The ORNL DAAC feels that ESDIS listens to its needs and concerns, but it is a small voice in a big system. Its main advocate within the NASA management structure is Diane Wickland, the DAAC's program manager at NASA Headquarters, who provides advice and guidance to the DAAC. The panel felt that without Wickland's influence, the DAAC's position with ESDIS would be precarious at best. Although the DAAC's main interaction with ESDIS is through William North, the DAAC felt it had a good relationship with Gregory Hunolt, former DAAC system manager. At the time of the review, Hunolt's departure had led to considerable uncertainty about the DAAC's future interactions with ESDIS.

As noted above, the ORNL DAAC is the smallest of the DAACs, but it has the crucial role of validating EOS instruments. Therefore, in the panel's view, it should have an equal voice at ESDIS, and appropriate consideration should be given to meeting the special needs of the DAAC, such as incorporating field-related key words into the ECS.

Recommendation 6.
ESDIS should devote greater attention to the importance of the ORNL DAAC to the success of the EOS program, support its activities as a full player in EOSDIS, and thereby help it become better integrated within the DAAC system.

Relation to Other DAACs

Although the ORNL DAAC works with the EROS DAAC on the EOS Land Validation Program, it apparently has little other interaction with the EOSDIS system. The DAAC perceives the weekly telephone conferences with other DAAC managers as focusing on ECS issues, which are not relevant to the ORNL DAAC, rather than on problems that DAACs face in common. It participates only because it is politically important to do so. The DAAC has great difficulty working within the EOSDIS system and would operate differently if it had a choice.

Relation to ECS Contractor

In the past, the DAAC had a difficult time working with the ECS contractor because ECS activities are contract driven, and many of the DAAC's requests to ECS are not included in the contract. For example, the DAAC has had a major problem with the ECS contractor in getting key words accepted in the metadata model. The ORNL DAAC has never been scheduled to receive the entire ECS because it does not manage satellite data. In the last year, the DAAC was removed from the delivery schedule (although it may receive selected components in the year 2000), so there is no longer a need to hold technical discussions with the ECS contractor.

Relation to Instrument Teams

Until now, the "instrument teams" for the ORNL DAAC were the principal investigators of the NASA intensive field campaigns and process studies. The DAAC has generally had a good relationship with these data providers and has positioned itself to become more involved in the planning of future studies, such as LBA and Fluxnet. The panel urges the DAAC to become similarly involved with the EOS instrument teams that need data from the ORNL DAAC to validate their instruments and algorithms. Given that MODIS is scheduled to be launched in a matter of months, it is imperative that working relationships between the ORNL DAAC and MODIS instrument and science teams be developed immediately.

SUMMARY

The ORNL DAAC has two important roles in NASA's Earth Science Enterprise: to facilitate in situ ecological science research and to validate remote sensing data from the EOS satellites.
The latter, involvement in an active satellite program, is what distinguishes a DAAC from a data center.

With regard to its data center functions, the ORNL DAAC is arguably the primary U.S. data center focused on ecological data. It has done pioneering work on the acquisition and distribution of biogeochemical data sets, and the careful stewardship of its holdings is impressive. Indeed, one of the greatest strengths of the DAAC lies in its commitment to data. The DAAC makes good use of its User Working Group to identify data sets to acquire, but it should also take advantage of the scientific expertise that resides at Oak Ridge National Laboratory and/or strengthen its visiting scientist program. A closer relationship with scientists would help the DAAC understand more about its holdings and how they are used. DAAC staff have also proven to be creative and innovative in providing tools that users need. To satisfy the evolving needs of its users, however, the DAAC must embark on near- and midterm strategic planning. A vision and an implementation plan for fulfilling this vision must be developed for the DAAC to fully succeed in its data center mission.

The need for a vision and implementation plan is even greater in the DAAC's other mission in the Earth Science Enterprise, the validation and calibration of EOS satellites. Unless in situ data from the ORNL DAAC are integrated successfully with the satellite data, the interpretation of satellite observations cannot be validated. Yet, plans for resolving the geolocation problem inherent in integrating these disparate data types have not been developed or even conceived, and the ECS metadata model does not accommodate key words needed to search and retrieve land- and aircraft-based data. Further, neither the DAAC, ESDIS, nor the instrument teams seem aware of the importance of involving the DAAC in the planning stages of the flight missions. Early involvement of the DAAC would help ensure that the in situ data needed for validation and calibration are available in a form that is suitable to the instrument teams. This issue must be resolved immediately because the space missions (e.g., Landsat 7 and the AM-1 platform) will be launched within a year. To get involved at this late date, the DAAC will have to become more aggressive with ESDIS and the instrument teams. It can no longer wait passively if it wishes to fulfill its special mission within the EOS program. Fortunately, the DAAC can draw on the experience of its successful involvement in the planning stages of the LBA intensive field campaign.
