8
The EOS Data and Information System
Does the proposed EOS Data and Information System (EOSDIS)
represent the appropriate approach to support this long-term data
collection and modeling effort?
The preeminent challenge to global change research is the synthesis
of diverse types of information from different sources. EOSDIS is a
pioneering effort in this regard: the intended scope of the system far
exceeds that of any existing civilian data management system. Nonetheless,
relevant experience for developing some aspects of EOSDIS exists in pilot
data programs in NASA and in the data programs of other disciplines
and agencies. If EOSDIS succeeds, it will be both a key ingredient in the
success of the U.S. Global Change Research Program (USGCRP) and a
substantial contribution to the field of data management. On the other
hand, there is no operational paradigm for the effective management and
dissemination of large scientific datasets under which data can be obtained
readily and quickly for analysis and research.
Nor is there any formula for success in such an endeavor. The EOS
instruments will be examples of advanced scientific and engineering con-
cepts, producing many simultaneous data sets, each with large amounts of
data. Significant advances in data management will be required so that
these data will be readily available and useful for modeling global change.
To prepare for this challenge, NASA proposes to commit a major fraction
of the EOS budget to the Earth-based component, including EOSDIS.
NASA should be commended for its early recognition of the importance of
the EOS Data and Information System (EOSDIS) to the success of EOS
and the USGCRP. This importance is best summarized in the words of the
EOS Science Steering Committee:
"The key to the Eos concept and to its ultimate success in meeting the needs
of the Earth science community is the data and information system. This system
must be the foundation upon which the rest of the mission is built; it will be
the means by which all Eos results are collected and communicated." [pp. 25-27,
From Pattern to Process: The Strategy of the Earth Observing System]
We agree that investing in the early development of EOSDIS is appro-
priate and necessary for the long-term success of the EOS data collection,
management, and modeling effort. Some previous NASA missions have
suffered from depleted budgets before the data processing and scientific
analysis phases were done, resulting in a poor scientific payoff. The EOS-
DIS program is beginning with a healthy commitment to data management
and analysis.
Investment does not guarantee success, however. The EOS program
will be large and complex, and many potential pitfalls will be faced in
the course of its implementation. While the importance and challenge of
EOSDIS are understood, it is not equally clear that the route to achieving
such a system and the resources and advance planning required to imple-
ment, maintain, and evolve it are fully appreciated. The management of
very large databases with provisions for indexing, browsing, visualization,
and other capabilities is a research issue. Current understanding of how
to meet this challenge is not mature. A program of research is needed to
guide the evolution of the proposed data management capabilities.
CHARACTERISTICS OF THE SYSTEM
EOSDIS will support a variety of scientific activities. According to
NASA plans, its major functions are:
mission planning, scheduling, and control;
instrument planning, scheduling, and control;
effective resource management;
communications;
computational facilities to support research;
production of standard and specialized data products; and
archiving and distribution of data and research results.
NASA currently plans to have EOSDIS operational well before the
launch of the first EOS spacecraft. The processing, archiving, and dis-
tribution of data are to be functional by 1994. Prior to the first launch,
EOSDIS plans are to exploit currently available data, enhance the data
acquisition and processing capabilities for ongoing missions, and correct
some long-standing deficiencies in the access to data. Development is to
begin immediately, building on the existing infrastructure.
NASA plans call for a network architecture that is open and distributed,
capable of evolving with advances in computing and networking. In the
terms used by NASA, "EOSDIS must adhere to a flexible, distributed,
portable, evolutionary design and operate prototypes in a changing exper-
imental environment." In our view, this is the right approach, but these
goals are easier to state than to accomplish. The challenge of achieving
them must not be underestimated.
In the development of EOSDIS, NASA has had the benefit of interac-
tions with other federal agencies and the external scientific community. In
the former case, the Interagency Working Group on Data Management for
Global Change has been a forum for discussing data, distribution, format,
access, cataloging, and related topics affecting interagency management of
data and their accessibility. Agencies participating include NASA, NOAA,
USGS, DOD, DOE, and EPA. In addition, the EOS Investigators Working
Group has organized a Science Advisory Panel for EOSDIS that is charged
to represent the scientific community associated with EOS in advising
NASA on matters related to EOS data production and scientific interfaces.
Evolution
The computer industry has experienced increasingly rapid technological
evolution in both components and architectures. EOSDIS must be flexible
enough to take advantage of inevitable advances in hardware and software
capabilities, particularly in areas such as high-performance computing, data
storage media, disk controllers, networking, and data base management.
How hardware and software technologies will change over the next 25 years
cannot be predicted, and traditional contract specifications for EOSDIS
hardware and software written today cannot remain unchanged over the
long term.
Consequently, NASA should not attempt to define total system specifi-
cations at the outset and then assume that they will not be altered through-
out the remainder of the program. First, the evolutionary approach should
rely heavily on experiments with prototype elements and include continu-
ing interactions with and testing by members of the research community.
Second, EOSDIS should have a system architecture sufficiently flexible to
accommodate changes and to implement them in an evolutionary manner.
Third, priorities for EOSDIS should be driven and determined by
the research, monitoring, and modeling that will have to be carried out in
answering fundamental questions about global change. When modifications
to original specifications must be made for budgetary, performance, or other
reasons, the global change research community should have a major role
in advising on priorities. A broad scientific input will assure that EOSDIS
priorities are based on research requirements.
Diversity of Data and Information
The success of EOSDIS, like that of the USGCRP, will ultimately be
judged on scientific results rather than by how many bits can be processed
for the fewest dollars. To achieve the objectives of the USGCRP, the
system must include datasets from a diverse array of space- and ground-based
sources, which poses significant challenges to its design.
Data Diversity
The demands of the USGCRP require that data and information
from EOS be merged with datasets from a variety of disciplines and
sources. EOSDIS must be able to cope with a wide spectrum of data and
information types. It cannot be focused simply on the data needs of NASA,
or even those of the United States. EOSDIS must provide the capability
for accessing and interpreting data and information from many agencies
domestically and from a number of other countries.
Many ancillary datasets, which will be needed to exploit the scientific
value of EOS data properly, are likely to be collected and held by agencies
other than NASA, such as NOAA and USGS. The USGCRP recognizes
this need, as do the participating agencies. In particular, NASA, NOAA,
NSF and the USGS, as agencies prominent in the collection and dissemi-
nation of data, have a special responsibility for working together to assure
greatest accessibility to all data and information relevant to understanding
global change. Some NOAA, NSF, and USGS centers are already primary
repositories of geoscience data, and they have substantial experience and
expertise in data archiving that could be of advantage to EOSDIS. The ob-
jective should be an integrated national system for processing, distributing,
archiving, and retrieving data and information about global change.
Human Interactions
While the needs for data and information by the human interactions
component are not yet well defined, the data and information involved with
this research will be sufficiently different from those customarily collected
and archived in earth remote sensing missions that special attention should
be paid to this field. As discussed in Chapter 5, NASA has an important
role for ensuring that EOSDIS is responsive to these requirements as they
evolve. We therefore encourage NASA to work with others, including
both the natural and social sciences research communities, to assure that
EOSDIS will contain data useful for human interactions research.
Conversely, EOSDIS could also provide the means for physical scien-
tists to obtain those human interaction datasets that might enhance their
scientific studies or that might help define the relevance of their research
to sociological, political, and industrial decisions.
A Distributed System
In a project of this scope, there are inevitably divergent views on the
proper balance between distributed and centralized responsibilities for data
management. The appropriate balance can only be determined through
experimentation and experience. A distributed system should be used to
take advantage of scientific expertise as well as computational facilities. This
basic requirement argues strongly against the development of a centralized
system.
Though distributed, EOSDIS should appear to users as a single in-
tegrated system. All users want "one-stop shopping" for their data. The
technical means are available to build a system that is distributed nationally
and even internationally so that the user needs only a single point of entry
to access all the data. EOSDIS planning within NASA seems to be heading
in this direction.
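The idea of a single point of entry over a distributed set of archives can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration only: the class names `Granule`, `ArchiveCenter`, and `SingleEntryCatalog`, the center names, and the sample holdings are assumptions, not part of any NASA design. The sketch simply shows a catalog that forwards one user query to every center and merges the answers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Granule:
    """One archived unit of data (hypothetical, for illustration only)."""
    dataset: str
    year: int
    center: str


@dataclass
class ArchiveCenter:
    """A single archive site that can search only its own holdings."""
    name: str
    holdings: list = field(default_factory=list)

    def search(self, dataset, start_year, end_year):
        return [
            g for g in self.holdings
            if g.dataset == dataset and start_year <= g.year <= end_year
        ]


class SingleEntryCatalog:
    """Single point of entry: one query is fanned out to every center."""

    def __init__(self, centers):
        self.centers = list(centers)

    def search(self, dataset, start_year, end_year):
        results = []
        for center in self.centers:
            results.extend(center.search(dataset, start_year, end_year))
        # Merge into one ordered answer so the user sees a single system.
        return sorted(results, key=lambda g: (g.year, g.center))


if __name__ == "__main__":
    center_a = ArchiveCenter("Center A", [Granule("AVHRR", 1985, "Center A")])
    center_b = ArchiveCenter("Center B", [Granule("AVHRR", 1988, "Center B"),
                                          Granule("GOES", 1988, "Center B")])
    catalog = SingleEntryCatalog([center_a, center_b])
    for granule in catalog.search("AVHRR", 1980, 1990):
        print(granule)
```

In this sketch the user never needs to know which center holds a given granule; a real system would also have to deal with network failures, differing formats, and access policies, none of which are modeled here.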
For example, NASA proposes to establish a network of EOSDIS
Distributed Active Archive Centers (DAACs). Seven have been selected
to date. Each DAAC would carry out routine production, distribution,
and archiving of EOS data and data products. In addition, NASA has
proposed that a number of Affiliated DAACs be located "outside of the
critical path for EOSDIS." Among other functions, these Affiliated DAACs
would provide access to important non-EOS data and services.
While planning documents apparently are still in a state of flux, we
strongly endorse the concept of distributed archive centers charged with
storing data and data products and making them available to users. We
are concerned, however, about two important issues: criteria for selection
of the centers and relationships between the DAACs and the Affiliated
DAACs.
We were unable to obtain clear descriptions of the criteria for the
selection of DAACs or of the selection process. The criteria should be
readily identifiable and publicly stated. It might be desirable for a joint
working group of NASA personnel and extramural scientists to define the
operational and scientific criteria.
The broader questions are: What is the total range of responsibilities
that can be defined for dealing with data issues before and during the EOS
missions? Which activities should be handled at DAACs, which elsewhere,
and how are sites for all activities to be selected?
Of the seven DAACs named thus far, five are at NASA centers.
NASA's stated objective is to build a national distributed data and information
system whose principal aim is scientific understanding of global
change. We believe that such a system is likely to benefit from involvement
by centers outside NASA, particularly in the academic community.
We recognize that some NASA centers have extensive experience on
which to build EOSDIS. It is natural that some will become critical EOSDIS
centers. Nevertheless, we believe it is important to establish an objective
process that includes peer review before DAACs are named and funded.
Such a procedure would optimize the effectiveness of a distributed EOSDIS
consistent with the priorities of the USGCRP.
Though some centers outside NASA have been identified in EOSDIS
planning documents as Affiliated DAACs, their role and status in the EOS
program have not been well established. It is also unclear whether they will
receive adequate support. In summary, the entire matter of the DAACs
needs study and clarification.
NASA'S DATA POLICY
NASA policy is that all data collected by the EOS program will be
archived in EOSDIS, and all EOSDIS data will be made available to the
research community at the incremental cost of distribution. EOS data
and information will be available to all users; the only distinction among
users will involve cost. There is to be no period of exclusive access for any
group, including the EOS principal investigators. Where EOS sensors make
site-specific observations, EOS will be an "acquire-on-demand" system as
opposed to a "process-on-demand" system. The data system is to provide
unprecedented ease of access to observations. NASA hopes to have a
common data policy for the entire international suite of data.
According to NASA's EOSDIS policy, users who agree to place the
results of their investigations in the open literature will pay only the nominal
incremental cost of reproducing and delivering the data requested. In
exchange for access at low cost, these users must agree, through the
stipulations of a standard "Research Agreement," to make available to
the research community the derived data, algorithms, and models at the
time of acceptance for publication. Low cost data are to be used only
for the researcher's bona fide research purposes. Data may be copied
and shared among other researchers provided that they are covered by a
Research Agreement or that the researcher who obtained the data is willing
to take responsibility for compliance. Commercial users of EOSDIS will
be charged market prices through an intermediate vendor.
We welcome NASA's policy of open distribution for research of all
EOSDIS data. Since the EOS program will be judged on its scientific
results, maximizing the scientific use of the data is the optimal strategy.
Moreover, because EOSDIS will be a repository for an extensive range of data
pertinent to global change, the success of the USGCRP also depends on the
openness of EOSDIS.
A number of impediments to accessing the data are currently limiting
the effective scientific use of various existing datasets for global change
research. In this regard, we would call attention to two impediments, related
to data management policies and insufficient resources, that require
immediate attention if EOSDIS is to be successful.
Landsat and Other Commercialized Datasets
Current policies that govern the use, distribution, and cost of the
Landsat and SPOT data make it difficult for the research community to
take advantage of this resource. When purchased from the commercial
remote sensing industry, the data are generally too expensive for most
research purposes.
Current legislation intended to protect the commercialization of Land-
sat remote sensing activities unfortunately prevents the inclusion of these
important data in EOSDIS unless the government purchases the data for
that purpose. In our view, Landsat data are sufficiently important to global
change research that means should be found to include them in EOSDIS,
whether by revising the Land Remote Sensing Commercialization Act, if
necessary, or by paying (again) for the data.
We also believe that it is in the interest of international research to
make all environmental data readily available to the global scientific com-
munity. Indeed, this is NASA's stated policy in regard to EOSDIS: scientists
anywhere in the world with appropriate telecommunications equipment will
have access to EOSDIS provided that they adopt the standard research
agreement. Only by such a strategy will the usefulness of the data be
maximized. Similarly, U.S. scientists should have access to relevant data in
foreign archives, and it is important that other nations be encouraged to
establish similar data policy assessments.
Preservation of Historic Datasets
Changes in technology and insufficient attention to maintenance of
irreplaceable historical datasets currently in various archives threaten to
limit their usefulness. In some cases, valuable data may be lost altogether.
For example, almost all the NOAA 1-km AVHRR data obtained prior
to 1986 have essentially become inaccessible because they are still stored in
an outdated system. Early Landsat data at the USGS's EROS Data Center
in Sioux Falls, South Dakota, and NOAA geosynchronous satellite data at
the University of Wisconsin are at risk of being lost forever because their
storage media are deteriorating. The success of global change research will
depend in part on the availability of a long time series of measurements.
The existing archived data are a most important resource in this regard. If
lost, they are not replaceable.
The USGCRP recognizes the need to preserve historical data and
includes some funds for the purpose in the FY 1991 budget. We underscore
the urgency of moving all relevant data to more secure storage media and
incorporating them into the EOSDIS as soon as possible.
RESEARCH AND PROTOTYPING NEEDS
Plans for EOSDIS emphasize that the system will maintain continuity
with current data systems because the current data centers provide a
heritage for the design and prototyping of EOSDIS. If this approach is
to succeed, those involved with EOSDIS development and implementation
must be committed to an evolutionary approach using system prototyping.
We are concerned that this commitment has not yet taken hold, and it may
be difficult to establish once a contract is written to procure design and
implementation services. Contracts normally include precise specifications
for deliverables, but in the case of EOSDIS a description of the system and
even its performance goals is premature.
EOSDIS must be an evolving entity, even over the entire life of EOS;
it is not a system that can be designed, implemented, and then left to
function unaltered throughout the remainder of the EOS program. NASA
appears to view EOSDIS as an evolving entity, but research will be required
to provide the basis for implementing EOSDIS.
The proposed structure of EOSDIS, the various entities involved, and
their interactions were evolving conceptually even during the brief period
that we were conducting our review. We regard this as a healthy sign for the
future of EOSDIS. To ensure that the evolution continues, there should be
a coordinated plan that incorporates a deliberate program of data system
research and prototype development.
Prototyping
In our view, the community of researchers, including those in the fields
of high-performance computing and data management, is not yet ready to
start designing certain aspects of EOSDIS. In many areas prototyping
efforts should be under way, with dedicated people directing them. A
number of areas where work is needed are listed in Table 1 and discussed
in Appendix C. The list is long, but it is difficult to shorten it because the
community collectively has not yet reached the level where the magnitude
of the problem is understood.
TABLE 1: Elements of EOSDIS
Requiring Prototype Development
Data visualization and the user interface
Browsing capability
Data formats and media
Accessibility of data and information
Cataloging
Search and query capabilities
Model and data interaction
Metadata* and data structures
Data reduction algorithms
Networking
* Metadata is defined in Appendix C as
information about data, such as documentation.
Questions still to be answered include how to get NASA, other agen-
cies, the DAACs, contractors, and independent scientists and computer
engineers working together in defining problem areas; analyzing require-
ments; and using the results to establish prototype approaches that are
likely to be effective; creating designs; working together on interfaces,
design, capabilities, and other aspects; bringing the creativity of comple-
mentary kinds of expertise together; and exploring alternative approaches
that conserve available human and financial resources.
Prototyping should address the challenge of the immense size of EOS
datasets, which will dwarf any previous experience. As part of the prototyping
activity, some EOSDIS centers should be co-located with institutions
with high-performance computing and global modeling capabilities. Such
centers, which should have the capability for large temporary data repos-
itories, could well be at research centers but not necessarily at EOSDIS
DAACs. The high-performance computing capabilities required to effec-
tively exploit EOS datasets are likely to greatly exceed current expectations.
Because of its scale, EOSDIS is likely to break new ground in the use of
computing technology in research.
Pathfinder Datasets
EOSDIS plans call for the creation of "Pathfinder Datasets" that will
incorporate currently existing data important to global change studies. The
data sets include derived data products, chosen on the basis of community
consensus. The procedures will include the validation and archiving of
derived products. Among the datasets under consideration are AVHRR,
GOES, TOVS, and others from earlier satellite missions. The selection of
the Pathfinder Datasets will be driven by global change science require-
ments.
We endorse this approach. There is a critical need that proto or pilot
datasets be made available to the scientific community. To challenge the
system effectively, such sets should cover a range of sizes typical of those
to be produced by EOS instruments, for example, from less than half a
terabit to significantly greater than a terabit.
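To give a feel for the dataset sizes mentioned above, the following sketch converts terabits to gigabytes and estimates how long a dataset would take to move over a link of a given speed. It assumes decimal prefixes (1 terabit = 10^12 bits, 1 gigabyte = 10^9 bytes), and the link rate in the usage example is an illustrative assumption, not a figure from EOSDIS plans.

```python
BITS_PER_TERABIT = 1e12   # decimal prefix assumed
BITS_PER_BYTE = 8
BYTES_PER_GIGABYTE = 1e9  # decimal prefix assumed


def terabits_to_gigabytes(terabits):
    """Convert a size in terabits to gigabytes (decimal prefixes)."""
    return terabits * BITS_PER_TERABIT / BITS_PER_BYTE / BYTES_PER_GIGABYTE


def transfer_hours(terabits, link_megabits_per_second):
    """Hours needed to move a dataset over a link, ignoring all overhead."""
    seconds = terabits * BITS_PER_TERABIT / (link_megabits_per_second * 1e6)
    return seconds / 3600.0


if __name__ == "__main__":
    # Half a terabit, the lower end of the range quoted in the text.
    print(terabits_to_gigabytes(0.5))  # 62.5
    # Illustrative assumption: a 1.544 Mbit/s line carrying one terabit.
    print(round(transfer_hours(1.0, 1.544), 1))
```

Under these assumptions, even the smallest pilot dataset in the suggested range is tens of gigabytes, which is one reason the pilot sets are a meaningful test of storage and networking prototypes.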
INDEPENDENT SCIENTIFIC ADVICE
We strongly believe that cooperative and constructive interaction
among scientists in the global change research community, on the one
hand, and software design engineers and software implementation pro-
grammers, on the other, will be crucial to the success of EOSDIS.
In the business community, it would be unthinkable for a software
firm to design a large package for management of bank records or airline
reservations without steady interaction with users and corporate executives.
The former can provide feedback on what works and what does not, and
the latter on what the system priorities are. In a project like EOSDIS,
individual scientists will be needed to work actively with the system as it
develops and to provide feedback. A committee of scientists dedicated to
the goals of USGCRP can provide an overview of the top priorities.
Experience gained from software development for serving the needs
of scientific communities has pointed strongly to the same conclusion.
Major software systems developed without steady involvement of users
have generally been failures; those that maintained advisory committees
of scientific users were almost guaranteed success (provided the design
and programming expertise was present) because potential problems were
caught in the design or early development.
To date, NASA has worked effectively with several committees that
consist partly or wholly of scientists, and all indications are that this has
been a positive and productive interaction. Such cooperation should con-
tinue throughout the development of EOSDIS, notwithstanding histori-
cal precedent and the fact that scientists and software engineers tend to
speak different languages and sometimes have trouble communicating. We
strongly emphasize the importance of maintaining a three-way structure in
the evolution of EOSDIS, with NASA and other agencies involved in the
USGCRP, research and instrument scientists, and the contractors work-
ing together as equal partners. Excluding any of those three groups from
being involved with software testing and decision making would be a major
mistake.
In our view, the EOS Investigators' Science Advisory Panel for EOS-
DIS has successfully focused the concerns of a broader research community,
and has articulated the specific requirements of scientific users in the con-
text of the EOS mission and the USGCRP (see, for instance, the EOSDIS
Science Advisory Panel's November 1989 report). EOSDIS planners have
been responsive to the advice of the Science Advisory Panel, and as a result
of its guidance a major redirection of EOSDIS design strategy has taken
place in the last few months.
Because the implementation of EOSDIS poses significant, continuing
challenges, we believe that EOSDIS must have an ongoing mechanism
for acquiring independent advice from the user community. The EOSDIS
Science Advisory Panel should be a long-term advisory element in the
agency's planning and implementation process. It should include some
scientists who are not EOS investigators but who are active in fields within
the range of scientific disciplines involved in global change research as
well as in research on data and information management. We therefore
recommend that the EOSDIS Science Advisory Panel continue to
perform its function throughout the EOSDIS procurement, design, and
development cycle to ensure that all the major scientific requirements are
effectively met.