Data for Science and Society: The Second National Conference on Scientific and Technical Data
Chapter 3

U.S. National Committee for CODATA
National Research Council
Interdisciplinary and Intersectoral Data Applications: A Focus on Environmental Observations


Administration Keynote Address:
Environmental Observation Data for Addressing National Priorities--the Challenge and the Promise

D. James Baker

     The National Oceanic and Atmospheric Administration (NOAA) basically is a data agency. We collect data and make data available to people, and we try to stay on top of all the issues related to data so we can do this. I am very proud of all the people in NOAA who are deeply engaged in data issues, particularly since we always labor under difficult budget constraints. I hope that one of the things that comes out of this conference is some better awareness, both here and outside, of the importance of data issues. NOAA understands very well the needs of all of its different stakeholders and the challenges to the data community. We are trying to respond to these challenges with varying degrees of success. Sometimes it works and sometimes it doesn't, but this conference should help us there.

     NOAA spends about $1 billion a year, out of a total of close to $3 billion, collecting environmental data from around the world and from space, and this is something that is growing in importance. In the past it was primarily Weather Service data that we collected and distributed in real time. We have developed a very sophisticated way of doing this, providing weather warnings and forecasts. Now, we are providing climate information. We have learned a lot of lessons about how to do this, but of course today we have requirements from ocean observing systems, from fishery management, from marine mammal protection, from oil spill response. Almost all of our programs both require data and require data to be made available to people in near real time.

     NOAA has national data centers, which have the responsibility to provide perpetual stewardship, archiving, and dissemination of environmental data. It is a very powerful constraint that we have. Not only do we have to collect and make these data available, but also we have the responsibility for always making sure that they are there. We have the ultimate responsibility for the long-term management of the data gathered, not only by NOAA but also by other federal agencies such as the National Aeronautics and Space Administration (NASA), other countries, and other research programs. I think most of you could speak to this better than I can, but federal scientific and technical agencies will have increasingly more data to manage, driven by societal needs and new technologies. We at NOAA certainly expect to have an order of magnitude more data to manage in about 5 years than we have today, and we have to make these data accessible. In order to do this, our 2001 budget has a specific line item for making data more easily accessible.
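As a rough illustration of what that expectation implies, the short Python sketch below works out the compound annual growth rate behind "an order of magnitude in about 5 years." It assumes steady year-over-year compounding, which the address does not state; the function name `projected_holdings` is hypothetical. Under that assumption, holdings would grow by roughly 58 percent a year.

```python
# Implied annual growth if data holdings grow tenfold in five years.
# Steady compounding is an assumption made for illustration only.
growth_factor = 10.0  # "an order of magnitude" (from the address)
years = 5             # "about 5 years" (from the address)

# Solve annual_factor ** years == growth_factor for annual_factor.
annual_factor = growth_factor ** (1.0 / years)
annual_growth_pct = (annual_factor - 1.0) * 100.0


def projected_holdings(current, years_ahead):
    """Project holdings forward in the same units as `current`."""
    return current * annual_factor ** years_ahead


print(f"Implied growth: about {annual_growth_pct:.0f}% per year")
```

Calling `projected_holdings(1.0, 5)` returns the tenfold figure again, which is a quick check that the annual rate is consistent with the five-year expectation.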

     Obviously, the growth of the Internet has had a huge impact on NOAA. Like all agencies, we are very much driven by the Internet and search engines and are finding new ways to interact with all of our different customers. We find not only that Internet requests have grown but also that the number of off-line requests doubled in the 1990s. What is interesting is that it is not just the number of requests for data, but the sophistication of the requests that increases. The users don't just want raw data; they want products. They require more complex data. They want easy access to related information, and they want all of this within hours or days of acquisition. So our users are becoming more demanding as we go forward, and this is something that is not going to change. In fact, user demand is going to get stronger.

     We have what we call a virtual data system in NOAA. Users can access our home page and link to any of our pages including our data systems.1 Users can get many of our data sets through the Internet. However, the quantity of data available from NOAA on the Internet is only about 1 percent of its total holdings. Every year, we ask Congress for more money to make the data more accessible, but we don't do as well as we do with other parts of our budget. It is easier for us to get money to forecast hurricanes than it is for us to get money to make radar data accessible to researchers, but if we don't do this, researchers cannot take the next steps, and we won't be able to make the next step of modernization of the Weather Service in providing these data.

     This is a critical issue of science and technology for the United States--the need to invest in research and development (R&D). The National Academies have been very strong in recent years, very successful in the past year's budget, in making the President understand that this issue was critical. I was also very glad to see that you have Representative Rush Holt on your agenda this afternoon because he is, I think, an example of the new generation of members of Congress who really understand these issues and can make a big difference. Basic support of R&D is absolutely critical. Last week we announced that this winter was the warmest winter on record and that the winter before was the second warmest and the winter before was the third warmest. This kind of announcement is possible only if we have access to current and past data.

     So one of the big efforts in which NOAA is engaging is to rescue old data and make them available. We try to rescue old data on obsolete media. We try to continue to convert these data into accessible media and formats, but you have to be very careful when you talk about obsolete media and formats. My wife works at the Library of Congress in the National Digital Library Program, where they are taking parts of the collection and making them digitally accessible and available, but they have to ask the question, What is going to be accessible 10 or 20 years from now? This is a good question.

     If you review history, it is clear some of the media storing data have been very stable. Clay tablets or papyrus are great and rag paper is good, but the paper that we have now can disintegrate almost before our eyes, although the systems for reading the data, namely human eyes and brains, are still fully workable. However, the old Landsat data from the early 1970s provide an example of a less successful storage medium. Right now we don't have the systems to read the tapes of data that were collected. The tapes are available, but if you needed the system to read the tapes you would have to go back and rebuild the system. Therefore, there is concern about the media that are used to store data, and what is going to be accessible in the future. What hardware and software are you going to use for the future? This is one of the additional complexities of the new technology that we have: How do we store and make data available?

     NOAA is conducting a Global Ocean Data Archeology and Rescue Project to find out what data are available, then rescue them and make sure they are available under the aegis of the Intergovernmental Oceanographic Commission. This has been a very important exercise because there are a lot of interesting oceanographic data around the world, historical deep-ocean data and coastal data, that are not very accessible. They are written down on pieces of paper that are deep in Russian archives somewhere, so we are sending people out to work on this. This has been a very successful activity. In the past 5 years, we have rescued the largest ocean profile database in the world, with a lot of data still to be rescued. So this is another important issue.

     In fact, Russia has a wealth of historical data going back almost 100 years. It has as many oceanographic stations as the rest of the world combined. Therefore, this offers a possibility to fill in a lot of the gaps in our data, and we have begun to digitize some of those data and make them available to researchers. Also, of course, there are a lot of data available now from other countries that we will be looking at as well.

     At the same time, you have to be careful about these historical data. One interesting thing about data from the 1930s and earlier is that there was a pretty strong belief in the oceanographic community at that time that the only effects were seasonal effects. Therefore, researchers only wrote down the month and the day that the data were collected, not the year. As such, we have at Scripps Institution of Oceanography about 20 years of surface temperature data that are identified only by month and day. So today when we look at the question of interannual variations, these data are not helpful. Again, however, you try to look at how they were stored, and then from that you can figure out in what year they were collected. This shows the importance of metadata, or data about data, so that you can really tell what is happening.
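The metadata point can be made concrete with a small Python sketch. Everything in it is hypothetical: the record type, the field names, and the sample values are invented for illustration and are not real measurements or an actual NOAA or Scripps format. It only shows why a record that carries month and day but no year cannot be used to study year-to-year variation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SurfaceTemperatureRecord:
    """One observation; `year` is optional, as in the older records."""
    month: int
    day: int
    temperature_c: float
    year: Optional[int] = None  # metadata that was often not written down


def usable_for_interannual_analysis(record: SurfaceTemperatureRecord) -> bool:
    """Year-to-year comparisons need to know which year a value belongs to."""
    return record.year is not None


def split_by_year_metadata(
    records: List[SurfaceTemperatureRecord],
) -> Tuple[List[SurfaceTemperatureRecord], List[SurfaceTemperatureRecord]]:
    """Separate records that carry a year from those that do not."""
    dated = [r for r in records if usable_for_interannual_analysis(r)]
    undated = [r for r in records if not usable_for_interannual_analysis(r)]
    return dated, undated


# Illustrative values only, not real measurements.
records = [
    SurfaceTemperatureRecord(month=3, day=14, temperature_c=15.2),
    SurfaceTemperatureRecord(month=3, day=14, temperature_c=16.1, year=1955),
]
dated, undated = split_by_year_metadata(records)
```

The undated record still supports seasonal averages, which is all the original collectors expected to need; only the dated one can contribute to an interannual comparison until the missing year is reconstructed from other evidence.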

     Let me talk a little bit about other agencies, because we are just one of many federal agencies working with large amounts of scientific and technical data. There are many different agencies engaged, including NOAA, NASA, the U.S. Geological Survey (USGS), the National Science Foundation, and the Environmental Protection Agency; we work particularly closely with NASA. We have a Memorandum of Understanding with NASA that we will provide the long-term archival stewardship of a very large volume of remotely sensed data that we get from NASA satellites. In fact, this Earth Observing System (EOS) series, of which NASA just launched its first satellite, is going to be the first one in which we will be doing that. The satellite is called Terra.2 I think Terra has two Rs for "earth," but you could also say "terra" for terabytes because we will be getting terabytes of data from this new satellite. It is important to note that NASA was one of the first agencies to recognize that as it puts a lot of money into new systems, it must also focus resources on data systems. The Earth Observing System Data and Information System (EOSDIS)3 is a very important aspect of this whole process, and it will have an active data system. We hope to be able to get the resources so that NOAA can take over this long-term stewardship. We are not there yet, but we are working on it.

     In addition, NASA, along with the Department of Defense, is a partner with NOAA in building and launching a series of operational polar orbiting satellites that provide weather information. NOAA provides all the archival and dissemination services for this effort. We expect to get from the NASA series of satellites a data set that is at least 15 years in length. It will be a basis for future climate variability studies, and it will have at least 5000 terabytes of data. This is a critical number to remember.

     With USGS, we have had a Memorandum of Understanding since 1992, in an international effort involving NASA, the European Space Agency, and others to develop a global map of the world. The map has 1-kilometer resolution and uses data from NOAA's Advanced Very High Resolution Radiometer, which looks at the radiation that comes from the surface of Earth. This project is being conducted in cooperation with the International Geosphere Biosphere Program. The National Academies have strongly supported putting together this map, and we are working on it. We are also working with USGS on the Earth Resources Observation System (EROS) Data Center, which is the national satellite land remote sensing data archive. You will be hearing more about this, but that system also has hundreds of terabytes.

     NOAA also works very closely on the executive order coordinating geographic data acquisition and access, the National Spatial Data Infrastructure.4 This is a program that promotes the use of computer maps as a consistent way to share geographic information between computer and Internet users. Once again, the National Academies have taken a very strong role in this effort.

     I will conclude my discussion on the domestic side by saying that we are currently discussing with the private sector the issues of ensuring that NOAA data and information systems are providing the highest-quality calibrated weather and climate information, not only for forecasts, but also for various financial instruments, derivatives or futures, which I think Tom Karl is going to talk a little bit about in his presentation today.5 NOAA's role is to provide accurate weather and climate data, and we consider the financial community an important customer, together with the private weather community. You will be hearing more from Ray Ban and others about that today.

     Internationally, NOAA is an active partner in the Committee on Earth Observation Satellites (CEOS),6 which is very much engaged with setting standards for the exchange of data. This is a major coordination body, and we see CEOS as a central part of international guidance on data issues.

     We are also working with other countries using bilateral agreements. We have a bilateral agreement with Japan on sharing data, as well as with China and Europe for sharing data from satellites.

     I can say to our Chinese colleagues attending today's conference that one of the very exciting things has been the cooperation between U.S. and Chinese meteorological satellites. When I visited the satellite operations center in Beijing in September, I was told by my Chinese colleagues that the U.S. policy of full and open data disclosure has greatly helped them in the development of their own satellite program, and they supported it strongly. Our satellite data are available to them, and they simply need to receive them.

     Let me conclude with some thoughts about the famous World Meteorological Organization (WMO) Resolution 40 relating to data exchange.7 This is about trying to work at the international level to ensure free access to environmental data. There has been increasing pressure from many foreign governments to recover costs from the operation of meteorological services--in other words, selling weather forecasts. Various governments have asked their meteorological agencies to sell their forecasts, and a number of European countries started to do this in the early 1990s. In fact, France started to withdraw data that had been internationally available on the basis that it needed these data in order to produce forecasts. It didn't want to make them available internationally.

     This was in direct contradiction to the principle that the United States has always held that access to meteorological data should be free and open, or at least full and open. Maybe you have to pay a small charge for access, but basically the data are subject to full and open availability. We also saw a possible interest on the part of other countries in doing similar things. So, we put together a very strong team including Joe Friday, who was then director of the National Weather Service. They helped us work on how we could share weather and climate data and products worldwide. There were about 4 years of negotiation, and an agreement was reached that says, "Basic weather data shall be shared internationally among all countries. Those data that may have some commercial interest can be copyrighted, and countries can work out the arrangements for copyright, but all data should be available for research and development." There are four main parts to this resolution. It is a very important resolution, and I am going to read just a summary of each of these parts because I think we will probably end up with resolutions like this for lots of other kinds of data. The experience we have had in the WMO can be very valuable for other fields.

     The resolution affirms the principle of sharing meteorological and related data on a free and unrestricted basis. It identifies a basic set of weather and climate data that is necessary for the protection of life and property, which must be exchanged on a mandatory basis. However, it allows for certain supplemental data to be exchanged with the provision that the data not be used in the country of origin for commercial purposes; in other words, a private weather company cannot take French data and sell them to French wine growers for a forecast unless it pays a fee for those data. Then there is a code of conduct for how this has to happen. It allows us to receive all of the weather and climate data internationally and then we distribute them to all U.S. interests. The only limitation is that the information cannot be used by U.S. interests commercially in the country from which it came.
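The rules summarized above can be sketched as a small decision function in Python. This is a simplification of the speaker's summary, not the text of WMO Resolution 40; the category names, the function `use_permitted`, its parameters, and the country codes are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class DataCategory(Enum):
    BASIC = "basic"                # exchanged on a mandatory basis
    SUPPLEMENTAL = "supplemental"  # exchanged with a commercial-use condition


def use_permitted(category, origin_country, use_country,
                  commercial, fee_paid=False):
    """Return True if the described use fits the summary given in the address.

    Assumed reading: basic data are unrestricted; supplemental data are
    restricted only for commercial use inside the country of origin,
    and that restriction is lifted when a fee is paid.
    """
    if category is DataCategory.BASIC:
        return True
    if not commercial:
        return True  # research and other non-commercial use stays open
    if use_country != origin_country:
        return True  # the limitation applies only in the country of origin
    return fee_paid


# The wine-grower example: a company sells forecasts built on French
# supplemental data to customers in France.
without_fee = use_permitted(DataCategory.SUPPLEMENTAL, "FR", "FR",
                            commercial=True)
with_fee = use_permitted(DataCategory.SUPPLEMENTAL, "FR", "FR",
                         commercial=True, fee_paid=True)
```

In this reading `without_fee` is `False` and `with_fee` is `True`, while the same data used commercially outside France, or used for research anywhere, remain permitted.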

     We have had to deal with a lot of conflicting interpretations of the resolution since 1995, the year it was put in place, but in fact, we have successfully kept international data flow free and open. We cannot really give you numbers on this now. We are collecting these numbers to see exactly how the resolution is working. However, we certainly have stemmed the exit of data from the global telecommunications systems, and we have put a stake in the ground about the U.S. position of free and open data exchange. This is a very, very important point. In fact, I understand that some of the European meteorological services may be changing their views now about exactly how they are going to handle their data.

     Another resolution along the same lines was introduced last year on the free and unrestricted exchange of hydrological data. I also should say that there are a number of pieces of legislation now being considered in Congress about the issue of what constitutes scientific data, including how to define intellectual property rights and what can be protected. These have huge implications for the scientific community, not just for environmental data exchange. Once again, Bill Wulf has been leading that effort. The National Academies have been out front on this issue, and the U.S. National Committee for CODATA has been a key player as well. The jury is still out on what is going to happen with the pending legislation, but I think it is very important for the scientific and the environmental data communities to realize that important things are happening, and we have to stay on top of this.

     Let me just conclude by saying that the economic value of global data and products is far too great to restrict access to the limited amount of environmental data already available. We have to make these data available. All of the issues that we are dealing with today--including increasing population, increasing pollution, increasing greenhouse gases going into the atmosphere--require us to share the best possible information that we have. We have to stay vigilant to make sure that data are available and to work with others to make sure that we have a free and open flow of data. I think that history is on our side, but as technology changes and demands increase this is something for us to follow.

     I can tell you that this administration is absolutely committed to promoting the availability and usefulness of all science and technology data. We have worked closely with the private sector. We will continue to do this, and we will continue to work with all of our partners to reach our mutual goals.


1 See the NOAA home page at <>.

2 See the Terra Web site at <> for additional information.

3 For additional information, see the EOSDIS home page at <>.

4 For additional information, please see the National Spatial Data Infrastructure home page at <>.

5 Tom Karl, see Chapter 6, "The Intergovernmental Panel on Climate Change and the Futures Markets: Common Climate Data Management Requirements," in these Proceedings.

6 See the Committee on Earth Observation Satellites' home page at <> for additional information.

7 For the text of WMO Resolution 40 regarding the policy and practice for exchange of meteorological and related data and products, see <>.

Copyright 2001 the National Academy of Sciences