Finding the Forest in the Trees: The Challenge of Combining Diverse Environmental Data

2 Impact Assessment Project for Drought Early Warning in the Sahel

This case study was selected by the committee as a prime example of a project that used and attempted to integrate disparate data from many sources, including sequential satellite sensor data. The multinational area of sub-Saharan Africa (the Sahel) included in this case study extends across millions of hectares of fragile arid and semiarid land. For hundreds of years the nomadic human and animal populations of this vast region have been subjected to periodic drought and famine. As uncontrolled growth of these populations has increasingly denuded and degraded these fragile grazing lands, the frequency and devastating effects of famine have increased.

Rural populations in arid and semiarid regions of Africa are especially vulnerable to the effects of drought. Severe food shortages resulted from the African droughts of 1972–73 and 1983–84, particularly in the sub-Saharan region. Consequently, international and domestic agencies have increasingly emphasized the importance of a drought monitoring program. As the United States sought to define meaningful ways to respond to the plight of the drought-affected sub-Saharan nations and to their requests for technical assistance, one of the proposals that emerged was to use Earth observation (remote sensing) systems to provide regular sequential data about the crop growth conditions in the region (Salby et al., 1991). It had previously been demonstrated in many areas of the world that data from satellite sensors could be used to derive two kinds of information that would be useful in crop yield prediction: the first, a time-sequential
inventory of quantity and type of clouds over the region (McDonald and Hall, 1980), and the second, a quick analysis of the extent and rate of growth of green vegetation over large areas. The major question posed by this proposal was whether data from Earth observation satellites could be combined with sparse data from other sources to design a system and methodology for credible crop modeling and yield predictions in this environment. Several U.S. government agencies therefore organized a pilot project to determine the feasibility of this approach. The project, which was conducted from 1979 to 1986, had three objectives: (1) detection and monitoring of droughts, (2) crop modeling condition assessment, and (3) prediction of crop yield potential (LeComte, 1994). The crops that were monitored included cowpeas, maize, millet, peanuts, and sorghum. These objectives were interrelated in that water is the most critical limiting factor for crop growth in Africa's Sahel and Horn regions. Precipitation is usually the only source of water for growing crops in this area. In fact, average depth to groundwater in the region is so great that groundwater as a source for crops is not a serious alternative. The development of a capability to determine precipitation and its variability over the region in near real time could be expected to improve significantly the assessment of crop and yield potentials, both spatially and temporally. The U.S. Agency for International Development provided most of the financial support, and the U.S. National Oceanic and Atmospheric Administration (NOAA) was the executing agency of the project (referred to as the "NOAA project" below). The Climate Impact Assessment Division (CIAD) of NOAA's Assessment and Information Services Center (AISC) was responsible for overall management.
Appropriate agencies from each of the participating Sahelian and Horn countries provided valuable input to and collaboration with the project. Major features of the NOAA project included the development of a plan of action or implementation plan and of a management structure and organization to implement the plan. Neither of these tasks was easy or clear-cut. In the formulation of each planning task, many assumptions had to be made, not the least of which was that adequate ground observation data could be obtained in a timely fashion. Another hope, if not an assumption, was that adequate in-country technical support for each of the participating sub-Saharan countries would be available with a minimum of constraints in the flow of essential information. Another feature, perhaps unique to this project, was the global perspective, which necessitated the integration of disparate global, national, and local data sources. This integration also required free flow of data and information across many international boundaries.
VARIABLES MEASURED AND SOURCES OF DATA

The methodology and variables used in the NOAA project are described in a technical report by LeComte et al. (1988). Ten African countries were included in the project, with each country generally having numerous crop growing areas. These different areas often had significant variations in environmental conditions. The project used ground measurements and satellite data for temperature and precipitation measurements. Rainfall was estimated for areas where data transmitted by the World Meteorological Organization's (WMO) Global Telecommunications System were not available or not reliable. In addition, 10-day rainfall data were supplied by the AGRHYMET (Agriculture-Hydrology-Meteorology Program) Center in Niamey, Niger. These measurements were summarized into an overall precipitation measurement for a given agroclimate region.

Meteorologists used cloud imagery data from the NOAA polar-orbiting satellites and the European "Meteosat" geostationary weather satellite to supplement rainfall reports. The meteorological satellite data were used throughout the project period to obtain cloud cover information, to monitor cloud movements, and to estimate precipitation patterns. Images over the entire sub-Saharan region were obtained every 3 to 6 hours from the Meteosat system. Unfortunately, this technique did not allow identification of local area convection. The Meteosat data were compared with the mean outgoing longwave radiation map published weekly by the Climate Analysis Center (LeComte et al., 1988). This was done by overlaying the two maps and drawing lines to identify discrepancies.

Historical data from the most recent 10 to 12 years were used to create baselines for normal rainfall and crop yield for each crop (millet and sorghum) within each of the agroclimate regions.
There were regions or cropping areas for which these baselines could not be developed because credible data were not available. The historical data were obtained from the NOAA National Climatic Data Center, from site visits to the sub-Saharan region, or through correspondence with local meteorologists or other scientists. For historical precipitation data, the available dates and quality of data varied by country. For some countries, such as Sudan, these data were very poor. This variability resulted in the use of different years to establish baseline precipitation norms for each country. Another difficulty with precipitation data for the region was that the data were reported at different spatial scales for different areas. The location of the Intertropical Discontinuity, its movement northward early in the season, and its subsequent retreat southward during the growing season were monitored daily. Determination of the Intertropical
Discontinuity's position depended primarily on station reports of dew point temperatures on the 1200 UTC (Coordinated Universal Time) surface analyses provided by the U.S. National Meteorological Center. The mean position of the Intertropical Discontinuity during the period from June to September has been found to correlate closely with cumulative rainfall and crop production in the sub-Saharan region (LeComte, 1994).

Advanced Very High Resolution Radiometer (AVHRR) data from the NOAA polar-orbiting satellites (NOAA, 1985) also were used to generate a Normalized Difference Vegetation Index (NDVI) in the later stages of the project (Tarpley et al., 1984). The NDVI was calculated by taking the difference between the reflectances measured in the near-infrared band (0.72 to 1.10 micrometers (µm)) and the visible band (0.58 to 0.68 µm) and dividing this difference by the sum of the reflectances of the two bands. The index was found to be correlated with the vigor and quantity of vegetative biomass (Tucker et al., 1985).

In general, there were serious discrepancies among data sources within a country. Subjective assessments and choices of data sources to use were made. Yield data especially varied greatly from country to country. Many interpolation algorithms were needed to integrate the wide variations in sparse ground observation data and data reported by different countries. Project planners had not anticipated that this set of problems related to yield estimates would be so serious. At least one subsequent review of the project criticized some of the methods used in the derivation of yield estimates (NRC, 1987). The committee found it difficult to make an adequate assessment of the methods used to arrive at yield estimates. The methods were not well documented, which led the committee to conclude that some aspects of yield estimate methods had been "ad hoc."
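The NDVI calculation described above can be illustrated with a short sketch. It assumes the usual sign convention (near-infrared minus visible), which the text does not spell out, and the function name and the sample reflectances are hypothetical rather than taken from the project.

```python
def ndvi(visible, near_infrared):
    """Normalized difference of two band reflectances.

    Assumes the conventional form (NIR - VIS) / (NIR + VIS).
    Returns None when both reflectances are zero, because the
    ratio is undefined in that case.
    """
    total = near_infrared + visible
    if total == 0:
        return None
    return (near_infrared - visible) / total


if __name__ == "__main__":
    # Illustrative reflectances only; these are not project data.
    print(ndvi(visible=0.08, near_infrared=0.40))
```

Green vegetation reflects strongly in the near-infrared and weakly in the visible band, so denser or more vigorous canopy pushes the index toward 1, while bare soil gives values near 0.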
NOAA obtained rainfall estimates and gridded crop model output from the EarthSat Corporation. This information was used in confirming a planting date and various stages of crop production in order to mesh precipitation values with crop growth. The committee was informed through its briefings that the methodology used to generate these values was proprietary to the EarthSat Corporation and not available to anyone else, including the NOAA staff, though the planting dates did correspond to values estimated by using data from other sources. In summary, an enormous volume of data was collected and summarized by the project, despite the generally spotty and sparse ground-based data. One of the greatest difficulties of the project was the lack of consistent and accurate data for the precipitation estimates and the yield models that were used. There were missing or unreliable data for such important parameters as precipitation, temperature, evaporation, date of planting of millet and sorghum, and areal extent of crops sown and harvested.
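The baseline comparison described in this section, a multi-year rainfall norm against which the current season is judged, might be sketched as follows. This is only an illustration under stated assumptions: the function names, the `min_years` threshold, and the rainfall figures are hypothetical, and the project's actual procedure was not documented in this form.

```python
from statistics import mean


def baseline_normal(seasonal_totals_mm, min_years=5):
    """Mean seasonal rainfall over the years that have data.

    Missing years are passed as None and skipped. If fewer than
    min_years remain (a hypothetical threshold), no baseline is
    returned, mirroring regions where no credible norm existed.
    """
    valid = [v for v in seasonal_totals_mm if v is not None]
    if len(valid) < min_years:
        return None
    return mean(valid)


def percent_of_normal(current_total_mm, normal_mm):
    """Current seasonal rainfall as a percentage of the baseline."""
    if normal_mm is None or normal_mm == 0:
        return None
    return 100.0 * current_total_mm / normal_mm


if __name__ == "__main__":
    # Made-up seasonal totals (mm) for ten years, two of them missing.
    history = [400.0, 350.0, None, 500.0, 450.0,
               300.0, None, 450.0, 400.0, 350.0]
    normal = baseline_normal(history)
    print(normal, percent_of_normal(300.0, normal))
```

Returning None rather than a number for data-poor regions keeps a weak baseline from being mistaken for a reliable one further down the processing chain.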
DATA MANAGEMENT AND INTERFACING

Before the initiation of the NOAA project, representatives from participating agencies met a number of times to formulate data management plans. These plans included procedures for relating relevant technical issues to data integration. Existing database technology and software, including some FORTRAN programs, the Statistical Analysis System (SAS) and Lotus 1-2-3, were used to process the data. No relational database or other procedures were developed specifically for the project. The committee was informed that data management requirements related to database interfacing probably did not significantly increase the overall cost.

A number of alterations were made to the data management plan during the project. Perhaps the most significant were made in response to the improvement in data acquisition, data handling and analysis technologies, and communications networking. Other changes were in response to difficulties encountered in obtaining desirable ground observations at the time they would have been most useful for real-time modeling of precipitation and yield predictions. If such a project were initiated today, the management and operations plan obviously would include much improved data acquisition, analysis, and communications systems. However, these new technologies would not eliminate the serious constraints imposed by the geographic and political boundaries. The remainder of this section identifies the most significant issues raised during the project regarding the management of data and database interfacing.

Timeliness of Data Acquisition

Time was a principal driver of the NOAA project.
The very nature and objectives of the project meant that timeliness was critical for data acquisition (for both ground and remotely sensed observations), data analysis and interpretation, and distribution of yield prediction information to the decision makers and policymakers. Temporal uncertainties of precipitation and dates of planting contributed to the difficulties of acquiring optimal yield estimates. Also, a fundamental precept of the project was that weather conditions and crop yield predictions had to be made available to the users of that information as near to real time as possible.

Accessibility of Data

Accessibility of data varied with the type of data. Accessibility was generally determined by technical or political constraints. Some satellite
images had to be distributed to users by mail or fax. It was necessary to perform extensive preprocessing of satellite data prior to their use in any models. This preprocessing was performed at the University of Missouri Cooperative Institute for Applied Meteorology. Data distribution generally was exceedingly slow, in part because of the "primitive" technology available at the time, especially in the sub-Saharan countries. Each country provided some information, but it was not always in computer format. Also, the commercially obtained data for planting dates had a number of restrictions on use.

Data Quality and Verification

The NOAA project did not explicitly deal with the quality of data in written documentation. However, data quality is in question because a number of subjective judgments were made in weighing sources. This occurred with respect to both yield and precipitation, probably more so with the former. Some effort was made to check for errors in data recording, however. The committee concludes that the measures of crop yield were not very accurate. The rank order of best to worst crop yields was probably correct, however. Periodic visits were made by personnel to the areas surveyed in an effort to verify and improve yield estimates.

Data Retention

Because the impetus for the NOAA project was the critical need for near-real-time crop yield predictions, it was designed to provide short-term data access, retrieval, and manipulation. Station rainfall and temperature data were loaded onto the AISC VAX 11/780 and stored for online access for 1 year. Cloud-cover data were entered based on visual interpretation of Meteosat visible and infrared bands of images sent to AISC. Project personnel decided not to archive all the data because retaining such records was not considered important in the initial planning.
Consequently, only some disks of data and related models are kept in the data center at the University of Missouri, and the data are not accessible electronically. The NOAA satellite data are archived at the National Climatic Data Center. Because different components of the project data reside in different locations, it is unclear how any subsequent crop modeling projects or activities might have access to and benefit from the data acquired during this project. The conclusion drawn by the committee is that little or no consideration was given to archiving the modeling data and supporting metadata for future use. It would be costly and an almost
impossible challenge to recreate the data sets used in the models relating crop yield to precipitation.

Data Documentation

Because of the quasi-operational nature of the NOAA project, documentation of the data was not given a high priority. For instance, a set of Lotus spreadsheet files describing 10-day precipitation summaries was created with latitude, longitude, station name, and date. These are available from the National Climatic Data Center. However, the methodology (e.g., type of rain gauge and its locations) used to generate each point in the spreadsheet was not documented. Another aspect that was not always well documented was the way in which missing data were handled. The committee was informed that there was a systematic effort to account for missing or uncertain data, but the specifics were not given. For example, it was noted that an analyst decided how heavily to weight the ground-station data with the satellite data, but an exact numerical basis for these weights was not in the reports provided to the committee.

Importance of the Crop Calendar

Familiarity with the crop calendar is of critical importance to the successful implementation of crop yield predictions in the arid and semiarid sub-Saharan region. For instance, essential elements of the crop calendar for sorghum and millet in the environment in which the project was operating included time of planting, length of growing season, critical times during the growing season when the crop is most vulnerable to moisture stress caused by drought, and date of harvest. Each of these points in the crop calendar depends on when the rains come. In this region the beginning of the rainy season, if it begins at all, varies greatly, and the entire season is generally of short duration.
Knowing the crop calendar and following the precipitation events throughout the stages of growth and development of the grain crops are an essential part of conducting a credible crop yield prediction program, which can be modified as environmental and growth conditions change with advancing maturity of the crop.

Definition of Users of Crop Modeling Results

Two primary groups of users were served: (1) international agencies concerned with providing economic and food resources to the region under surveillance and (2) those national agencies involved in decision
making related to production, internal and external trade, and processing and distribution of agricultural products. These users were clearly defined. Climate assessments were made at regular intervals of the growing season (preplanting, planting, growth and development, maturity, and harvest). Crop yield estimates were delivered to Africa during the harvest season within days of obtaining the data to maximize their use in those countries. The study was not designed to facilitate any significant access to data by users at a date much beyond harvest time or beyond the scope of the study.

Different user groups had different perceptions of the accuracy and detail that might be provided by the crop modeling project. Many ideas related to these perceptions had to be altered as the project progressed. The primary funding agency, USAID, found that the high cost of developing quantitative yield estimates in the work environment of the Sahel could not be justified, and it altered the work plan toward a much less labor-intensive approach to obtain more qualitative rather than quantitative yield prediction data.

Creation of New Algorithms and Data Management Procedures

To accommodate users' needs, new algorithms and data management procedures were developed. For example, coarse satellite cloud imagery was used to produce rainfall estimates in data-sparse areas. Reports were generated that were useful to analysts in the integration of these data with the available ground-based data. In addition, as discussed above, the NDVI was derived from the infrared and visible bands of the NOAA-9 AVHRR sensor to indicate growing conditions for 0.5° by 1° (latitude and longitude) grid cells across the region. Smoothed time series and regression models were used to integrate several components of the data. The data were typically accommodated in flat files.
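The text does not say which smoothing or regression methods were used. As a hedged illustration of the general idea only, the sketch below pairs a centered moving average with an ordinary least-squares line relating seasonal rainfall to yield. The function names and all numbers are hypothetical and do not reproduce the project's models.

```python
def moving_average(values, window=3):
    """Centered moving average; the ends use the points available."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Made-up 10-day vegetation index values for one grid cell.
    print(moving_average([0.10, 0.30, 0.20, 0.40, 0.35]))
    # Made-up seasonal rainfall (mm) and yield (t/ha) pairs.
    intercept, slope = fit_line([200.0, 300.0, 400.0, 500.0],
                                [0.4, 0.6, 0.8, 1.0])
    print(intercept, slope)
```

A single-predictor line of this kind corresponds to the simple premise, discussed elsewhere in this chapter, that yield can be predicted from precipitation. It inherits every weakness of the sparse input data.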
Many of the algorithms used were not as well documented as one would like. For example, the use of a combination of ground-station data and satellite cloud-image data to derive precipitation estimates for a given locale was left to a climate assessment expert to interpret. The interpolation algorithms, which were different from country to country, were not accessible to the committee.

Accommodation of Users' Needs for Crop Yield Estimates

Officials of USAID appeared to be satisfied with the results of the yield predictions, and so the members of the committee were left to wonder why the NOAA project was terminated. The NRC (1987) report that
was critical of the project yield estimates suggested that an alternative modeling procedure was in order. The approach suggested in that report would entail accounting for more of the underlying processes. However, the present committee concludes that it is unclear whether that approach would have been possible, given the constraints imposed in the project. One of the principal participants in the project thought that its high cost led to its termination. It is unlikely that an alternative modeling procedure would have alleviated these budgetary concerns.

Interfacing of Disparate Databases

As discussed above, the broad range of data used in the NOAA project included historical precipitation data used to establish normal ranges, satellite data to predict current rainfall, a mix of satellite-generated vegetation data, and historical and current yield data that came from several sources, including the countries surveyed. Yield estimates varied from country to country in quality and quantity. Important contributors to these discrepancies included the gross variations in and lack of supporting ground observation data, as well as the spotty nature of the precipitation. The lack of uniform spatial and temporal scales created numerous problems. The density of meteorological stations in the area, the number of years of historical data, and the quantity and quality of meteorological data reported varied from country to country. A considerable amount of manual (i.e., noncomputerized) intervention was required to meld these data for continuity and format in a meaningful way.

There was a simple premise that crop yield could be adequately predicted from precipitation data. A major problem inherent in this approach is the quantitative assessment of the two factors, crop yield and precipitation, especially in the sub-Saharan environment with its severe sparsity of data.
Although in the original design of the project the importance of data interfacing was recognized, the extent and methods of such interfacing were not well documented. It therefore was difficult for the committee to assess the degree to which data integration was successfully accomplished. Nevertheless, even under optimal conditions, effective data interfacing in a project of multinational dimensions is extremely difficult. Designers of the project sought to incorporate lessons learned from other crop yield experiments conducted over very large areas, especially from the federal interagency Large Area Crop Inventory Experiment (LACIE) and the Agriculture and Resources Inventory Surveys Through Aerospace Remote Sensing (AgRISTARS) program. In each of these experiments, estimation of yield was obtained from meteorological data,
and areal extent of crop species was obtained from the analysis of sensor data from Landsat satellites (McDonald and Hall, 1978; NASA, 1983).

Institutional and Political Constraints

Any group that sets out to accomplish the kinds of objectives defined in the NOAA project immediately faces the dilemma of identifying institutions in participating countries that are equipped with facilities and personnel with the essential interests and skills for implementing the requisite tasks. Institutional constraints in many cases can be related to the fact that in some countries there is no clear-cut boundary or definition of the appropriate government agency to be assigned responsibility for collaborating in a project on crop modeling. Further, once an agency has been assigned responsibility, the identification of personnel with the special knowledge and skills essential to the project may prove to be difficult. These problems unfortunately were endemic throughout the sub-Saharan region.

Of course, institutional constraints are not unique to developing countries. Difficulties can arise within U.S. institutions that become involved in projects in international environments that are dramatically different from those in which these institutions normally operate. U.S. government agencies may be constrained by their mandates, organizational structure, modus operandi, and personnel with limited experience and skills required for successful international cooperation. Insufficient coordination and communication among participating agencies within the United States posed a problem in this project, as it has in many others.

Finally, political instability and subsistence-level economies increase the likelihood that essential data will be incomplete or inaccessible. For example, during the period of the project the war in Chad was a significant problem.
However, even in those sub-Saharan nations not engaged in military conflicts, the generally low level of economic development and scientific and technical infrastructure magnified the problems associated with in situ data collection.

LESSONS LEARNED

Timeliness was essential for the realization of maximum utility of the results in the NOAA project. The objectives and scope were very broad and involved the interfacing of many disparate sources of data with wide variations in quality. As is frequently the case in such operational projects, formal data management was not given a very high priority. While data management was adequate to answer the immediate needs, little was done to organize the data for uses beyond the scope of the limited, short-range
objectives. If data sets from these kinds of projects are to make any contribution to future efforts, provision must be built into the data management plan to ensure that the data sets, including essential metadata, will be preserved, archived, and made accessible to potential users. The project involved the integration of several sources of precipitation data, and the "best professional judgment" was often used to combine the various values into a single number. All studies involve some degree of professional judgment. This project would have had greater value, however, if there had been more documentation describing the criteria agreed upon by the professionals who made these judgments. Such documentation is extremely important for anyone who wishes to use the data in subsequent studies. A number of quality control procedures were exercised during various stages of the project. As used in this report, "quality control" refers to error correction, or to establishing and maintaining the validity and integrity of data. Unfortunately, little documentation of these procedures was recorded. The development of precipitation data summaries involved the integration of multiple sources, and quality control is essential to this activity. However, the yield data had a multitude of problems in deriving credible results. One inherent problem was the reliance on each participating country to provide accurate, timely ground observation data. Many of these countries did not have adequate technology, logistical support, and trained personnel to provide essential data. There were substantial efforts at quality control through collaboration with participating government agencies, often with a less than desired degree of satisfaction. Perhaps improved Earth observation satellites will mitigate this problem in the future.
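The point about documenting how precipitation values were combined can be made concrete with a small sketch. It assumes a simple weighted average of a station value and a satellite estimate, which is only one possible form of the analysts' judgment. The class name, fields, and numbers are hypothetical; the idea shown is that the weight and the reason for it are stored with every blended value.

```python
from dataclasses import dataclass


@dataclass
class BlendedRainfall:
    """A combined estimate stored together with how it was derived."""
    value_mm: float
    station_mm: float
    satellite_mm: float
    station_weight: float
    analyst_note: str


def blend(station_mm, satellite_mm, station_weight, analyst_note=""):
    """Weighted average of station and satellite rainfall.

    station_weight is the analyst's weight on the ground station,
    between 0 and 1; the satellite estimate receives the remainder.
    """
    if not 0.0 <= station_weight <= 1.0:
        raise ValueError("station_weight must be between 0 and 1")
    value = (station_weight * station_mm
             + (1.0 - station_weight) * satellite_mm)
    return BlendedRainfall(value, station_mm, satellite_mm,
                           station_weight, analyst_note)


if __name__ == "__main__":
    # Made-up 10-day totals; the note records the analyst's reasoning.
    record = blend(30.0, 50.0, 0.75,
                   "gauge trusted; cloud estimate coarse")
    print(record)
```

Keeping the inputs and the weight in the same record lets a later user reproduce or revise the combined number without having to reconstruct the original judgment.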
It should be noted that an Earth-observing satellite may not last the lifetime of a project. In this project an effort was made to ensure that the values used from different satellite sensors were comparable. Any project of this magnitude that is so heavily dependent on sequential acquisition of data by Earth-orbiting satellites must include in its data management and operational plan the cross-calibration and verification of sensors from each successive spacecraft. Portions of the project data reside on archived tapes. For those data to remain useful, the redundant backups should be kept, and the data must be transferred periodically to more current media. Directories that describe these data sets and their accessibility should be available. Managers of research or assessment projects tend to believe that their projects could have accomplished much more if they had been adequately funded. This project was no exception, but its very nature and objectives gave it a special position among the case studies addressed by the committee. It was the only international study, and it dealt with a problem
that is fundamental to numerous environmentally fragile areas of the world, many of which seem to be perennially on the brink of disaster from famine. The project held the promise of developing and testing a set of options and methodologies for monitoring and providing near-real-time information about growing conditions and yield predictions of food grains for the inhabitants of a very large area across sub-Saharan Africa. Unfortunately, inadequate funding made it impossible to provide the kind of crop monitoring and early warning system that was envisioned, although the project was successful in demonstrating the potential benefits of using advanced technologies for such applications.

REFERENCES

LeComte, D.M. 1994. The NOAA/NESDIS impact assessment project for drought early warning in the Sahel. In Crop Modeling and Related Environmental Data: A Focus on Applications for Arid and Semiarid Regions in Developing Countries. P.F. Uhlir and G.C. Carter, eds. CODATA, Paris.
LeComte, D.M., F.N. Kogan, C.A. Steinhorn, and L. Lambert. 1988. Assessment of Crop Conditions in Africa. NOAA Tech. Memo. NESDIS AISC 13. NOAA, Washington, D.C.
McDonald, R.B., and F.G. Hall. 1978. LACIE: An experiment in global crop forecasting. Pp. 17–48 in Proceedings of the LACIE Symposium. JSC-14551. NASA Johnson Space Center, Houston, Tex.
McDonald, R.B., and F.G. Hall. 1980. Global crop forecasting. Science 208: 670–679.
National Aeronautics and Space Administration (NASA). 1983. Agriculture and Resources Inventory Surveys Through Aerospace Remote Sensing (AgRISTARS). Res. Rep. AP-J2-0393. NASA, Washington, D.C.
National Oceanic and Atmospheric Administration (NOAA). 1985. Hydrologic and Land Science Applications of NOAA Polar-orbiting Satellite Data. National Environmental Satellite, Data, and Information Service, Washington, D.C.
National Research Council (NRC). 1987. Final Report: Panel on the National Oceanic and Atmospheric Administration Climate Impact Assessment Program for Africa. Office of International Affairs. National Academy Press, Washington, D.C.
Salby, M.L., H.H. Hendon, K. Woodberry, and K. Tanaka. 1991. Analysis of global cloud imagery from multiple satellites. Bull. Am. Meteorol. Soc. 72(4): 467–480.
Tarpley, J.D., S.R. Schneider, and R.L. Money. 1984. Global vegetation indices from the NOAA-7 meteorological satellite. J. Clim. Appl. Meteorol. 23: 491–494.
Tucker, C.J., J.R.G. Townshend, and T.E. Goff. 1985. African land-cover classification using satellite data. Science 227: 369–375.