D EOSDIS Lessons Learned
From the late 1970s through the mid-1980s NASA supported a series of pilot data system studies to develop publicly accessible electronic data systems. These included such discipline-based systems as the Space Physics and Astrophysics Network, the Pilot Climate Data System, the Pilot Planetary Data System (PPDS), the International Satellite Cloud Climatology Project (ISCCP), and the Pilot Land Data System. As NASA entered the age of launching great astronomical observatories in the late 1980s, the notion of specialized information processing and distribution centers emerged. These centers were organized around instrument spectral domains, such as the visible spectral data at the Space Telescope Science Institute at Johns Hopkins University, the Infrared Center at Caltech, the High Energy X-Ray Institute at the Marshall Space Flight Center, and the Upper Atmosphere Research Satellite instrument-processing teams.
As part of the congressional approval of the EOS mission in 1990, the NASA Earth Science Enterprise committed to supporting the development of a long-term, comprehensive data and information system (EOSDIS) whose products would be easily accessible to both the science research community and the broader public. Based on the information systems experience gained, the EOSDIS design would employ a distributed open architecture. In addition to its functional requirements for space operations control and product generation for EOS, the EOSDIS would be responsible for the archival, management, and distribution of all NASA Earth science mission instrument data (including EOS) during the mission life. The system would be organized as an integrated collection of distributed active archive centers (DAACs) providing the data services and interfaces with the user community. A common core infrastructure of hardware and software capabilities would constitute the EOSDIS Core System, which would be geographically distributed at the eight DAACs. Each DAAC would focus mainly on a particular Earth system component or discipline, such as atmospheres, oceans, land, snow and ice, hydrology, radiation and chemistry, and even socioeconomic influences.
NEED FOR THE EOSDIS
Four primary spacecraft make up the long-term measurement system of the EOS mission: Terra, launched in December 1999; Aqua, launched in May 2002; ICESat, launched in January 2003; and Aura, to be launched in 2004. In addition to processing the instrument data from these satellites, the EOSDIS is responsible for the archival and management of all NASA Earth science mission data products that predate EOS, as well as data from NASA instruments flown on foreign satellites. The NASA Earth Science Enterprise is responsible for assuring the long-term preservation of these data and has negotiated agreements with the operational agencies (NOAA and USGS) for their permanent retention. EOSDIS will support migration of the data to these agencies.
The EOSDIS performs flight operations for the above four EOS spacecraft; processes, archives, and distributes data from 17 instruments on six EOS spacecraft; and archives and distributes data from more than 40 instruments on more than 15 EOS and non-EOS spacecraft. The system today serves approximately 2 million users per year internationally, archives almost 4 terabytes per day, distributes about 2 terabytes per day, and maintains the current archive, which is larger than 3 petabytes and growing. In addition, the system supports 1,800 different data types, manages some of the nation’s largest spatial databases, interfaces with over 35 external systems, and depends on more than a million lines of code, with more than 60 commercial off-the-shelf products integrated with custom code deployed on a variety of vendor servers. This system is unprecedented in scope and scale for a NASA mission and is one of the largest, if not the largest, working civilian scientific data systems ever built.
What distinguishes the EOSDIS from the earlier space mission data systems described above is the massive volume of data that is ingested, processed into higher-level standard products, archived within hours to a few days of acquisition, and distributed to a broad user community on a routine basis.
Over the decade-long period of planning and implementation, the architecture and design of the EOSDIS have undergone nearly continuous evolution to incorporate new technologies and changing science requirements. In addition to managing a relatively large number of research instruments and satellites, EOSDIS provides scientific data products to a broad community of users, including other value-added providers, such as the Federation of Earth Science Information Partners; the partners are responsible for the development and distribution of specialized and enhanced products for small, focused user communities. In terms of space flight operations and control, the EOSDIS has demonstrated that it can manage and execute some of the most intricate orbital operations by aligning these spacecraft into trains of operational and research satellites that trail and underfly one another, separated in orbit by mere seconds to minutes. This arrangement provides near simultaneity of observations from complementary instruments on disparate spacecraft. Previously, management of such multi-instrument configurations and generation of data products from such measurements were possible only from a single satellite with a very large capacity. This capability has enriched the scientific mission capabilities at significantly reduced cost.
An additional contribution supported by the EOSDIS is the set of Pathfinder climate data studies, which draw on similar or nearly identical instruments flown on multiple operational and research spacecraft since the inception of high-resolution satellite remote sensing in the early 1970s, some with records spanning decades. The lessons learned from the EOSDIS itself, as well as from the experience gained in supporting science instrument processing teams, the core DAAC processing capabilities, and the Earth science information partner processing capabilities, afford ample examples for evaluating the advantages and drawbacks of producing various datasets. These examples should prove useful in the design of the NPOESS/NPP operational EDRs and CDRs.
EOSDIS SYSTEM PERFORMANCE
The EOSDIS has undergone numerous reviews by scientific and data system committees, as well as repeated budget reductions. As a result of implementing the committees' various recommendations, the current system configuration, scope, performance, and services have changed dramatically from the original concept in terms of functionality, capability, and scale of communities served. The history of the EOSDIS has been stormy in terms of functionality and the high expectations of various communities, and criticisms have been numerous. One frequent complaint concerns the high cost of the system compared with the data systems and operations costs for similar component products on other satellite systems. Another frequent criticism concerns the system's “one size fits all” approach to its customers.
While these arguments certainly have a large degree of validity, the system has nevertheless managed timely delivery of exceedingly demanding and scientifically credible data products. Today the system is routinely meeting its requirements in support of the EOS mission goals while reprocessing many of the datasets with improved algorithms and better calibration. It is not yet clear at this relatively early date in the expected lifetime of the EOS missions whether an information system organized around traditional dedicated mission data system approaches would have produced comparable performance more cost-effectively.
Another often overlooked aspect of the system is the set of synergistic capabilities afforded by the infrastructure resources and broader information science capabilities that can be brought to bear during such unanticipated events as fires, volcanic eruptions, hurricanes, and floods. The size and breadth of the system have made possible a host of services (e.g., commercial adoption of data format standards for Earth science, specialized tools for geographic systems and visualization, system interoperability, global directories, and high-speed broadband EOS network accessibility for its user community). More recently the introduction of online data pools providing the most popular data products has increased their accessibility. The development of the EOS Clearing House with open application programming interfaces enables development of user interfaces tailored to specific communities. In a related example of openness, MODIS provides L1 processing source code to direct broadcast users and has an open source code policy with respect to science algorithms. There has not yet been much demand for such software other than from direct broadcast stations.
Over the long term, the EOSDIS offers a large target during enterprise budget crunches and flight launch delays. Reductions in planned funding have forced scale-backs in the planned introduction of functionality and technological upgrades in system capabilities. Whether the traditional mission data approach with dedicated instrument or spacecraft systems, or some modification of the present system, will be more flexible in adapting services under such budget constraints is an open question.
LESSONS LEARNED FROM EOSDIS
The NASA EOS, and in particular the Aqua satellite and its system of instruments, affords a highly representative collection of instruments with numbers of spectral bands and spatial resolutions very similar to those being planned for the NPP. The classes of scientific products produced by the EOSDIS are forerunners of the environmental data records expected from NPP. Thus, the lessons learned from the development and conduct of the EOSDIS as it evolved in the decades leading to the current system offer NESDIS an unparalleled opportunity to benefit from that experience in planning the production of NPOESS/NPP CDRs. The following lessons learned from the EOSDIS illustrate some of the management philosophies that can be used to sustain system design architectures and to meet evolving customer needs over decades as technology, scientific requirements, and budgetary constraints change.
Science investigator-led processing systems. A programmatic change in early 1998 transferred the responsibility for most EOS data processing from the DAACs (and the EOSDIS Core System) to EOS science instrument teams and their facilities. This transfer, accomplished through a call for proposals, was a major reason for the success and timely delivery of the EOS standard data products, and accounted for the high degree of scientific community acceptance of these products. One reason for the scientists’ willingness to assume day-to-day involvement in operational data processing was that in many cases the principal investigators felt that they would be judged by their peers on the quality and timeliness of delivery of these products. This transfer did have significant implications in terms of budget allocations, delegation of management oversight, creeping science requirements growth, interface coordination, software configuration control, hardware and network resource growth, and security, to name just a few issues that had to be addressed.
Planned evolutionary upgrades. The EOSDIS has changed significantly in architecture, design, and implementation since its original planned configuration. Infusion of more recent but mature information technology has made it possible to support the scope of data products and services without compromising the functionality of the EOSDIS and the scope of the mission under the budgetary constraints imposed over the years. If anything, the functionality has expanded to support a much larger community than originally envisaged. In fact, a community of EOS partners has been established by NASA’s Earth Science Enterprise to participate in broadening the EOSDIS in many different ways: data and portal providers; algorithm product processors and producers; data services and distribution nodes for research and educational users; value-added providers; international and interagency centers; and low-cost direct broadcast reception for universities, state and local agencies, and commercial organizations. It is this transforming functional evolution that is changing the popular perception of the EOSDIS from a highly centralized, inflexible, cost-inefficient data and information system to one that has a more open and distributed architecture in terms of both science processing and user applications. No longer is the EOSDIS developed, maintained, and restricted by outdated requirements and design processes. It has adopted a limited set of open source architecture concepts to address current needs and capabilities as a natural evolution and to avoid having to define costly revised system versions or scrap systems and restart with a clean slate. Although not all of the source code is available for anyone to modify and share, some of the modules developed as part of EOSDIS have been reused by other organizations.
Program and project management. Creating widely acceptable CDRs from NOAA operational satellites will be as difficult a science challenge as managing a data and information system as complex as the EOSDIS. Garnering the full support of a diverse and broad representation of the science community, from the initial proposed concept through the plan, scope, and implementation of an approach, is critical to the success of this NESDIS undertaking. Unfortunately, in developing the EOSDIS the science community was not completely supportive from the start and was dissatisfied with the approach of an EOSDIS Core System with noncompetitively selected DAACs for the scientific processing of higher-level data products. Another concern of the science community was the size of the EOSDIS budget being appropriated to a single large contractor responsible for the system development. The lack of an effective interface between the science community and a large, centrally managed science information system developed by a large industrial contractor came as a culture shock. As a result various stakeholders found themselves engaged in conflict over priorities and requirements, with no realistic mechanism to reach closure among information technology system development teams, the science instrument teams, and external science communities. New systems must allow users to gain ownership of requirements. In the EOSDIS case, sponsoring workshops to reach community consensus and initiating processes that enabled users to prioritize requirements allowed stakeholders to feel more comfortable with the direction the system and project were taking.
User working groups. A valuable component of the DAAC activities has been the critical evaluation and directions provided to each DAAC by its User Working Group, appointed through NASA.
Incremental development. The project could have been more effective if it had pushed for early operational releases with incremental growth in functionality. The first operational release of the EOSDIS Core System (ECS) was delivered to support Landsat 7/Terra and provided a complete, end-to-end capability. It was deployed more than six years into the ECS contract and was subject to many problems. Some of the difficulties were due to ill-advised technology choices (e.g., Distributed Computing Environment) whose shortcomings would have become readily apparent sooner in an operational environment. Having a stable baseline of some core components would have made it easier to add capabilities. As evidence, the releases to support the Aqua and Aura missions have been delivered with progressively fewer problems while adding new functionality. The early deployment of truly core components would have allowed and even fostered the development of value-added components by the broader environmental science and engineering community and the external data and information system development community.
Technology reuse. Reuse of independently developed components is possible and has occurred within EOSDIS. Some examples are the EOS Data Gateway (reused from Version 0 EOSDIS); the Simple, Scalable, Script-Based Science Processor (a GSFC DAAC-developed component now in the production system); the Land Processes (EDC) DAAC billing and accounting system for Landsat 7 (borrowed from USGS); the SeaWiFS processing system, adapted to MODIS and the Ozone Monitoring Instrument Science Investigator-led Processing System (SIPS); and GSFC DAAC-developed Version 0 systems reused in Aura SIPSs at the Jet Propulsion Laboratory. This reuse was enabled by the maturation of a stable, baselined ECS and the implementation and publication of standard interfaces to its components.