The National Information Infrastructure and the Earth
Sciences: Possibilities and Challenges
As with most areas of science, computer networks (especially wide area networks) have become indispensable components of the infrastructure required to conduct environmental research. From transport of data to support for collaboration, to access to remote information processing resources, the network (which I will use as synonymous with the Internet) provides many essential services. In this paper I discuss the various technical challenges in the use of information networks for the Earth sciences. However, the technical issues, though important, are not the essential point. The network is, after all, only a collection of wires, switches, routers, and other pieces of hardware and software. The most serious issue is the content carried across these networks and how it engenders changes in the way Earth scientists relate to data, to each other, and to the public at large. These changes have impacts that are far more profound than access to bandwidth or new network protocols.
Most of the discussions of the national information infrastructure (NII) have focused on technical details (such as protocols) and implementation (e.g., provision of universal access), with little discussion of the impacts of an NII on the scientific process. Instead, discussions of the interactions between technology and human activities focus almost exclusively on the positive aspects of networks and social interactions. For example, networks have been extolled as tools for an expanding sense of community and participatory democracy. However, technology does not have only positive effects; the impacts are instead far more subtle and often more extensive than they first appear. They may not appear for decades. In this paper I neither extol nor condemn the impacts of computer networks on the conduct of science. Rather, it is essential that we become aware of these impacts, both positive and negative. I show that networks do far more than simply move bits; they fundamentally alter the way we think about science.
Earth Science and Networks
Data and Earth Science
Before exploring the role of networks in Earth science, I first briefly discuss the role of data in environmental research. As my background is in oceanography, my comments focus on the ocean sciences, but these observations are generally applicable to the Earth sciences.
Unlike experimental sciences such as chemistry or physics, most Earth sciences cannot conduct controlled experiments to test hypotheses. In some cases, though, manipulations of limited areas such as lakes or small forest plots can be done. Other branches of science that depend heavily on observations, such as astronomy, can observe many independent samples to draw general conclusions. For example, astronomers can measure the properties of dozens of blue dwarf stars. However, Earth science (particularly those fields that focus on large-scale or global-scale processes such as oceanography) must rely on many observations collected under a variety of conditions to develop ideas and models of broad applicability. There is only one Earth.
Earth science thus is driven strongly by developments in observing technology. For example, the availability of satellite remote sensing has transformed our view of upper ocean biology. The spring bloom in the North Atlantic, the sudden "flowering" of phytoplankton (the microscopic plants of the ocean) that occurs over a period of a few weeks, was thought to be primarily a local phenomenon. However, satellite imagery of ocean color (which is used to infer phytoplankton abundance) has shown that this event covers the entire North Atlantic over a period of a few weeks. Here, a new observing technique provided an improved understanding of what was thought to be a well-known process.
There are instances where new observing systems have transformed our understanding of the Earth. For over 40 years, the State of California has supported regular sampling of its coastal waters to understand the relationship between ocean circulation and fisheries production. A sampling grid was designed based on our understanding of ocean processes at the time. When satellite images of sea surface temperature (SST) and phytoplankton abundance became available in the early 1980s, they revealed a complex system of "filaments" that were oriented perpendicular to the coast and sometimes extended several hundred miles offshore. Further studies showed that these filaments are the dominant feature of circulation and productivity of the California Current, yet they were never detected in the 40-year record. The original sampling grid had been too widely spaced. This example shows how our ideas can sometimes lead us to design observing systems that miss critical processes.
The interaction between ideas and observations occasionally results in more subtle failures, which may be further obscured by computing systems. A notable example occurred during the 1982–1983 El Niño/Southern Oscillation (ENSO) event. ENSO events are characterized by a weakening of the trade winds in the tropical Pacific, which results in a warming of the eastern Pacific Ocean. This shift in ocean circulation has dramatic impacts on atmospheric circulation, such as severe droughts in Australia and the Pacific Northwest and floods in western South America and southern California. The 1982–1983 ENSO was the most dramatic event of this century, with ocean temperatures 5°–6°F warmer than normal off southern California. This physical event strongly influenced ocean biology as well. Lower than normal salmon runs in the Pacific Northwest are associated with this major shift in ocean circulation.
The National Oceanic and Atmospheric Administration (NOAA) produces regular maps of SST based on satellite, buoy, and ship observations. These SST maps can be used to detect ENSO warming events. Because of the enormous volume of satellite data, procedures to produce SST maps were automated. When SST values produced by the satellites were higher than a fixed amount above the long-term average SST for a region, the computer processing system would ignore them and would use the long-term average value instead (i.e., the processing system assumed that the satellite measurements were in error). As there was no human intervention in this automated system, the SST fields continued to show "normal" SST values in the eastern tropical Pacific in 1982. However, when a NOAA ship went to the area in late 1982 on a routine cruise, the ocean was found to be significantly warmer than had ever been observed. An alarm was raised, and the satellite data were reprocessed with a revised error detection algorithm. The enormous rise in SST over much of the eastern Pacific was revealed. The largest ENSO event of the century had been hidden for several months while it was confidently predicted that there would be no ENSO in 1982.
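The failure mode described above can be made concrete with a short sketch. The function below is purely illustrative (the names and the anomaly threshold are invented, not NOAA's actual processing code), but it captures the logic: any satellite reading that deviates too far from the long-term average is presumed erroneous and silently replaced by that average.

```python
# Hypothetical sketch of an automated SST quality-control filter of the
# kind described above. Names and the threshold are illustrative only.
def qc_sst(satellite_sst, climatology, max_anomaly=2.5):
    """Replace satellite SST readings that deviate 'too far' from the
    long-term average with the climatological value itself."""
    cleaned = []
    for observed, normal in zip(satellite_sst, climatology):
        if abs(observed - normal) > max_anomaly:
            # The filter assumes a large anomaly is instrument error --
            # exactly the assumption that hid the 1982-1983 warming.
            cleaned.append(normal)
        else:
            cleaned.append(observed)
    return cleaned

# A genuine several-degree warming event is silently erased:
print(qc_sst([25.0, 30.5, 26.0], [25.2, 25.3, 25.8]))
# -> [25.0, 25.3, 26.0]
```

The middle reading, a real 5-degree anomaly, comes out of the filter looking perfectly "normal," which is precisely why the maps continued to show ordinary conditions until a ship visited the region.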
This episode reveals that the relationship between data and ideas has become more complex with the arrival of computers. The increasing volume and complexity of the data available for Earth science research have forced us to rely more heavily on automated procedures. Although this capability allows us to cope with the volume, it also relies on precise specification of various filters and models that we use to sort data in the computer. These filters may reflect our preconceived notions about what the data should actually look like. Although computers and networks apparently place more data into our hands more rapidly, the paradox is that there is increasing distance between the scientist and the actual physical process. This "hands-off" approach can lead to significant failures in the overall observing system.
As noted by Theodor Roszak [1], raw data are of little value without an underlying framework. That is, ideas come before data. There must be a context for observations before they can make sense. A simple stream of temperature readings will not advance science unless its context is defined. Part of this framework includes the ability to repeat the measurements or experiment. Such repeatability strengthens the claim of the scientist that the process under study is a general phenomenon with broad applicability. This framework also includes a historical context. Because of its strongly observational nature, Earth science depends on earlier field programs to develop new theories and understanding.
With the availability of networks and computers, the task of obtaining, managing, and distributing data has become critical. The information system has become part of the observing system. The way we collect, store, and retrieve our data becomes another set of filters, just like the sampling strategy or measurement technique. Often technical problems obscure the underlying science. That is, the information science issues can dominate the Earth science issues. I explore these technical issues in the next section.
Earth Science and Information Systems
The network and computational requirements for Earth science focus on the more obvious problems of bandwidth, accessibility, ease of use, and so on. I argue that although these issues are important, the profound shift in networks and computational systems has exacerbated the fundamental conflicts between information systems and the conduct of science while simultaneously obscuring these conflicts. Analogous to the analysis of television by Mark Crispin Miller [2], information technology has both revealed these problems and hidden them within the medium itself.
Earth science relies heavily on close interactions between many researchers from many disciplines. A single scientist cannot reserve an entire oceanographic research vessel for a cruise. Such expeditions require the work of many scientists. The study of problems such as the impacts of climate change on ocean productivity requires an understanding of the physical dynamics of both the atmosphere and the ocean, in addition to knowledge of ocean biology. Earth scientists must therefore develop effective mechanisms for sharing data.
Along with the need to share data and expertise among widely dispersed investigators, the characteristics of the data sets impose their own requirements. As Earth science moves toward an integrated, global perspective, the volumes of data have increased substantially. Although the dominant data sources continue to be Earth-observing satellites, field observations have also grown significantly. Sensors can now be deployed for longer time periods and can sample more rapidly. The number of variables that can be measured has increased as well. A decade ago, a researcher would come back from a cruise with a few kilobytes of data; today, a researcher will return with tens of gigabytes. However, these numbers are dwarfed by the data volumes collected by satellites or produced by numerical models. The Earth Observing System (EOS), which is planned by the National Aeronautics and Space Administration (NASA), will return over 1 terabyte per day of raw data.
The demands for near-real-time access to data have also appeared. Satellite-based communications to remote sampling sites have opened up the possibilities of having rapid access to observations, rather than waiting several months to recover environmental sensors. With more capable database servers, data can be loaded and made accessible over the network in hours to days after collection. This is in contrast to an earlier era when data were closely held by an individual investigator for months or even years. Although real-time access does open new areas for environmental monitoring and prediction, it does not necessarily address the need to accumulate the long, consistent, high-quality time series that are necessary for climate research. The pressures to meet everyday demands for data often distract scientists from the slower retrospective analyses of climate research.
As data become accessible more quickly, public interest increases. As with the comet impact on Jupiter in 1994, public interest in science and the environment can often far exceed the anticipated demand. The EOS Data and Information System (EOSDIS) was originally designed to meet the needs of several thousand Earth science researchers. Now it is being tasked with meeting the undefined needs of the much larger general public [3]. This presents many technical challenges to an agency that has little experience dealing with a potentially enormous number of inexperienced users.
Against this backdrop of new technical requirements, Earth science is facing a new set of technical challenges in addition to the continuing challenge of network bandwidth. The structure of the Internet is undergoing massive changes. With the impending departure of National Science Foundation funding for both the network backbone and the regional providers, there will be increasing emphasis on commercial customers. Interoperability and functionality of the network access points remain problematic. The possibility of balkanization of the network is real. The number of users has also expanded rapidly, and with the appearance of new applications such as the World Wide Web (WWW), the effective bandwidth has dropped dramatically.
As the science community becomes a smaller and smaller fraction of the network user community, network providers focus less on meeting scientific needs and more on meeting commercial needs. The search for bandwidth has become more intense. Telecommunication companies claim that new protocols (such as asynchronous transfer mode [ATM]) and new hardware (fiber optics) will usher in a new era of unlimited bandwidth. Some researchers claim that software "agents" will reduce the need for bandwidth by relying on intelligence at each node to eliminate the need for bulk transfers of data. Besides the technical hurdles to bringing such technologies to market, there are economic forces that work against such developments. In a process that is well known to freeway designers and transportation planners, bandwidth is always used to capacity when the direct costs of the bandwidth are not borne by the users. Funding for the network is hidden from the user so that any increase in personal use of network capacity is spread over every user. In reality, bandwidth is not free, though it is essentially free to the individual user. Even in the case of a totally commercial network, it is likely that the actual costs will be amortized over a broad customer base so that an individual user will have little incentive to use bandwidth efficiently. Although I pay a certain amount directly for the interstate highway system through my gasoline taxes, the actual costs are hidden in many other fees and spread over a broad range of the population, many of whom may never even use the highway system.
With the rise of commercial Internet providers such as America Online and NetCom, will this situation change? Will users pay the true costs of using the network as opposed to paying only a marginal cost? I would argue that this is unlikely on several grounds. First, many users currently have access to virtually free Internet services through universities and other public institutions; it will be difficult to break this habit. Second, the government, through its emphasis on universal access, is unlikely to completely deregulate the system so that rural users (who truly are more expensive to service) will pay significantly more than urban users. Third, and perhaps more compelling, network bandwidth is no different from any other commodity. Every second that the network is not used, revenue is lost. Thus it is in the network providers' interest to establish a pricing structure that ensures that at least some revenue is generated all the time, even if it is at a loss. Some revenue is always greater than no revenue, and the losses can be made up elsewhere in the system. This is a well-established practice in the long-distance telephone industry as well as in the airline industry. Off-peak prices are not designed to shift traffic away from congested peak periods; they are designed to generate usage during periods that would otherwise go unsold. They raise the "valleys" rather than lowering the "peaks."
The final threat to network bandwidth is the proliferation of new services. There is no doubt that as network bandwidth increases, new services become available. Early networks were suitable for electronic mail and other low-bandwidth activities. This was followed by file transfers and remote logins. The WWW dramatically increased the capabilities for data location and transfer. Just as the interstate highway system fosters the development of new industries (such as fast food franchises and overnight package delivery systems), so also has the Internet. As with highways, new services create new demand for bandwidth. Although considerable effort is being devoted within the NII to develop rational pricing strategies, it is more likely that the search for bandwidth will continue. It appears to be a law of networks that spare bandwidth leads to frivolous traffic.
As services and users proliferate, it has become more difficult for users to locate the data of interest. Search tools are extremely primitive, especially when compared with the tools available in any well-run library. Various "web crawlers" will locate far more irrelevant information than relevant information. For a user who does not know exactly where to find a specific data source, much time will be spent chasing down dead ends, or circling back to the beginning of the search. Although the relative chaos of the network allows anyone to easily
set up a data server, this chaos confounds all but the most determined user. There are no standard methods for defining the contents of a server, and without "truth in advertising" rules, servers that may superficially appear to be relevant may contain totally irrelevant information. The only users who have time to "surf" the network are those who have nothing else to do, as noted by Negroponte [4]. Once the appropriate server has been located, there is no consistent method for indexing and archiving data. Although data may be online, locating and accessing the relevant data sets often still requires a phone call.
With the increase in computer processor speeds and desktop mass storage, it has become increasingly important to match network speeds with these other components of the system. Following Amdahl's rule of thumb, in which each instruction cycle requires 1 bit of input/output, a 1,000-MIPS (million instructions per second) processor will require a 1-gigabit-per-second network interface. Assuming that processor performance continues to improve at a rate of 50 to 100 percent per year, we will have 1,000-MIPS machines in about 1 to 2 years. It is becoming commonplace for researchers to have 10- to 20-gigabyte disk storage systems on their desktops; in 2 to 3 years it will not be unusual for scientists to have 100-gigabyte disk subsystems. Network speeds are increasing at a far slower rate, and the present Internet appears to many users to be running slower than it did 3 to 4 years ago.
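The arithmetic behind that balance can be sketched in a few lines. The function name is invented for illustration; it simply applies the one-bit-of-I/O-per-instruction rule of thumb quoted above.

```python
# Back-of-the-envelope application of Amdahl's rule of thumb: one bit
# of input/output per instruction executed. Function name is illustrative.
def required_network_bps(mips):
    """Network bandwidth (bits per second) needed to keep a processor
    of the given MIPS rating fed, under the one-bit-per-instruction rule."""
    instructions_per_second = mips * 1_000_000
    return instructions_per_second  # one bit of I/O per instruction

# A 1,000-MIPS processor calls for roughly a 1-gigabit-per-second link:
print(required_network_bps(1000))  # -> 1000000000 (1 Gbit/s)
```

The point of the calculation is the mismatch: processor speeds double every year or two, while network capacity per user has, if anything, been falling.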
This imbalance between processor speed, disk storage, and network throughput has accelerated a trend that began many years ago: the decentralization of information. In an earlier era, a researcher might copy only small subsets of data to his or her local machine because of limited local capacity. Most data were archived in central facilities. Now it is more efficient for the individual scientist to develop private "libraries" of data, even if they are rarely used. Although the decentralization process has some advantages, it does increase the "confusion" level. Where do other researchers go to acquire the most accurate version of the data? How do researchers maintain consistent data sets? On top of these issues, the benefits of the rapid decrease in price/performance are more easily acquired by these small facilities at the "fringes" of the network. Large, central facilities follow mainframe price/performance curves, and they are generally constrained by strict bureaucratic rules for operation and procurement. They are also chartered to meet the needs of every user and usually focus on long-term maintenance of data sets. In contrast, private holdings can evolve more rapidly and are not required to service every user or maintain every data set. They focus on their own particular research needs and interests, not on the needs of long-term data continuity or access by the broader community.
Since small, distributed systems appear to be more efficient and provide more "niche" services than do central systems, it has become harder to fund such centers adequately (thus further reducing their services). This situation is similar to the "Commons" issue described by Hardin nearly 30 years ago [5]. For the centralized archive, each individual user realizes great personal benefit by establishing a private library while the cost is spread evenly over all of the users. It is therefore in the interest of the individual user to maximize individual benefit at the cost of the common facility. Not until the common facility collapses do we realize the total cost. The situation is similar in the area of network bandwidth.
The final two technical challenges facing Earth science and networks encroach into the social arena as well. The first is the area of copyrights and intellectual property. Although scientists are generally not paid for the use of their data sets or their algorithms, there is a strong set of unwritten rules governing use and acknowledgment of other scientists' data sets. With the network's rapid dissemination of information and the growing emphasis on Web browsers, this set of rules is being challenged. Data are being moved rapidly from system to system with little thought of asking permission or making an acknowledgment. Copying machines created a similar problem, and the publishing industry reacted vigorously. However, the rate at which a copying machine can replicate information is far slower than that of a computer network, and its reach is far smaller. As new data servers appear on the network, data are rapidly extracted and copied into the new systems. The user has no idea what the original source of the data is, nor any information concerning the data's integrity and quality.
Discussions over copyrights have become increasingly heated in the past few years, but the fundamental issues are quite simple. First, how do I as a scientist receive "payment" for my contributions? In this case, payment can be as simple as an acknowledgment. Second, how do I ensure that my contributions are not used incorrectly? For example, will my work be used out of context to support an argument that is false? With global, nearly instantaneous dissemination of information, it is difficult to prevent either nonpayment or improper use. After a few bad experiences, what incentive is there for a scientist to provide unfettered access?
The last technical challenge involves the allocation of resources. Network management is not for amateurs. The Internet requires experts for its operation. Web browsers are not easy to design, build, and maintain. Thus programming talent that was originally hired to engage in scientific analysis is spending a larger fraction of its time engaged in nonscientific activities. Although developments in the commercial field are simplifying these and associated tasks, they do cost money. With declining resources for science, one can ask whether this is a reasonable use of scarce resources. The pace of technological change is also increasing, so that the intellectual and technical infrastructure that was assembled 5 to 10 years ago is largely irrelevant. For example, it is becoming harder to hire FORTRAN programmers. Experts on ATM have not yet appeared.
The computer industry and Earth science researchers have generally focused on these challenges from a technological point of view. That is, bandwidth will increase through the provision of ATM services. Data location will be enhanced through new WWW services. Copyrights (if they are not considered absolutely irrelevant or even malevolent) can be preserved through special identification tags.
The technology optimists have decided that the problems of Earth science (and science in general) can be solved through the appropriate application of technology. That is, the fundamental problems of science will be addressed by "better, faster, cheaper" technical tools. On the science side of the issue, it appears that the scientific community has largely accepted this argument. As information systems approach commodity pricing, scientists acquire the new technology more rapidly in an attempt to remain competitive in an era of declining federal support. As put forth by Neil Postman [6], this is the argument of "technology." That is, the technology has become an end in itself. The fundamental problem of science is understanding, not the more rapid movement of data. Although we have seen that the link between understanding and observations is perhaps more closely entwined in the Earth sciences, we must be aware of the implications of our information systems for how we conduct science. It is to this point that I now turn.
The Hidden Impacts
A recent book by Clifford Stoll [7] provided a critical examination of the information infrastructure and where we are headed as a society. Although he makes many valid points, Stoll does not provide an analysis of how we arrived at this predicament. I draw on analyses of the media industries (especially television) by Postman and Miller that show some parallels between information systems and mass media. I do not argue that we should return to some pristine, pre-computer era. There is no doubt about the many positive aspects of information technology in the Earth sciences. However, it is worth examining all of its impacts, not just the positive ones.
Postman postulates that culture can be described as a set of "information control" processes [8]. That is, we have established mechanisms to separate important knowledge from unimportant knowledge (such as a school curriculum) and the sacred from the profane (such as religion). Even in the world of art and literature, we are constantly making judgments as to the value of a particular work of art. The judgments may change over time, but the process remains.
We are constantly inundated with information about our world, whether from the natural world or from our own creations. Somehow we must develop systems to winnow this information down to its essential elements. In the scientific world, we have established an elaborate set of rules and editorial procedures, most of which are not written down. For example, experiments that cannot be repeated are viewed as less valuable than those that can be. Ideas that cannot be traced to previous studies are viewed with more skepticism than those that can be. Occasionally, a new idea will spring forth, but it, too, must go through a series of tests to be evaluated by the scientific community. This "editorial" process essentially adds value to the information.
The second point is that the scientific community believes that there is an objective reality that is amenable to observation and testing. Although clearly science is littered with ideas that rested on an individual's biases and processes that were missed because they did not fit our preconceived notions, we still believe that the
natural world can be measured and understood in a predictable manner. Assumptions and biases can at least be described and perhaps quantified. Scientific knowledge is more than a set of opinions.
Through the scientific process, researchers add value to data. Useful data are separated from useless data and interpreted within a framework of ideas. Data are therefore placed in a structure that in turn can be described. Raw information that is presented out of context, without any sense of its historical origins, is of little use. Thus Earth science is not limited by lack of observations; rather, the correct types of observations are often missing (e.g., long, calibrated time series of temperature).
The process of adding value has arisen over the last several centuries. Publishing in peer-reviewed journals is but one example of how valid data are separated from invalid data, and how observations are placed within a larger framework for interpretation. Although data reports are often published, they are perceived to be of less importance than journal articles. Even "data" produced by numerical models (which are the products of our assumptions) are viewed as less valuable than direct measurements.
The Network and Science
There is no doubt that networks simplify many tasks for Earth science. However, there are many obvious problems, such as the separation of information from its underlying context, the difficulty in locating information of interest, and the lack of responsibility for the quality and value of a particular data set. Much as with television, it has become difficult to separate advertising from reality on the network. A recent discussion on the future of research universities in Physics Today [9] highlighted some troublesome issues associated with networks. Graduate students have become increasingly unwilling to use the library. If reference materials are not available online in digital form, then students deem them to be irrelevant. Although electronic searches can be helpful, this attitude is clearly misguided. First, most scientific documents will never be placed online because of the associated costs. Second, digital storage is highly ephemeral and can never be considered a true archive. There will always be a machine and software between the reader and the data, and these tools are always becoming obsolete. Third, digital searching techniques follow rigid rules, and present search methods are quite sparse and limited. Thus far, no search tool has shown the ability to find material that truly matches what the reader wants when the search itself was incorrectly specified. Such serendipitous discoveries are common in libraries.
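The rigidity of such rules is easy to demonstrate. The sketch below is invented for illustration (a two-document "corpus" and a naive exact-match search, not any real search tool): a correctly spelled query succeeds, while a single misspelling returns nothing, whereas a reader browsing library shelves would still find the right material.

```python
# Illustrative sketch of rigid, exact-match searching; the corpus and
# queries are invented for this example.
corpus = [
    "Sea surface temperature in the eastern tropical Pacific",
    "Phytoplankton blooms observed by satellite ocean color",
]

def rigid_search(query, documents):
    """Return only documents containing every query word verbatim."""
    words = query.lower().split()
    return [d for d in documents if all(w in d.lower() for w in words)]

# A correctly specified query finds the relevant document...
print(rigid_search("surface temperature", corpus))
# ...but one misspelled word defeats the search entirely:
print(rigid_search("surface temprature", corpus))  # -> []
```

A human browser recovers gracefully from a mis-specified question; a rule-following search program, as written, cannot.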
The network is becoming a place of advertising, with little true content and little personal responsibility. Home pages are proliferating that merely test what the network is capable of doing instead of using the network to accomplish a useful task. We have fixated on the technology, on the delivery system for information, rather than on the "understanding system" [10]. Speed, volume, and other aspects of the technology have become the goals of the system. Although these are useful, they do not necessarily solve the problems of collaboration, data access, and so on. In fact, they can distract us from the real problems. The issue of scientific collaboration may require changes in the promotion and tenure process that are far more difficult than a new software widget.
The emphasis on speed arises in part from the need to have short "return on investment." In such an environment, market forces work well in the development of flexible, responsive systems. For Earth science, this is a useful ability for some aspects of research. For example, development of new processing algorithms for satellite sensors clearly benefits in such an environment. However, this short-term focus is not sufficient. Long-term climate analysis, where the data must be collected for decades (if not centuries) and careful attention must be paid to calibration, will not show any return on investment in the short run. These activities will "lose" money for decades before one begins to see a return on the investment. In a sense, long time series are "common facilities," much like network bandwidth and central archives. They are the infrastructure of the science.
The Network and Television
The early days of television were filled with predictions about increased access to cultural activities, improved "distance" learning, and increased understanding between people. The global village was predicted to be just around the corner. However, the reality is quite different. Television may fill our screens with information, but the nature of the medium has not increased our understanding 11. Science programming is useful as a public relations tool (and such contacts with the public should be encouraged), but the medium is not suitable for scientific research.
As discussed above, one of the key activities of science is the structuring of information to sort valid from invalid. But computer networks increasingly encourage disjointed, context-free searches for information. Our metaphors resemble those of television and advertising, rather than those of science and literature. Home pages encourage rapid browsing through captivating graphics; long pages of text are viewed as a hindrance.
In the early days of television, the message was in front of the viewer, as discussed by Miller 12. It was clear what product was being advertised. Today, both the medium and its aesthetics are in front. The smooth graphics, the rapidly shifting imagery, the compelling soundtrack are the primary elements. The advertising and the product are in the background. Feelings about products rather than rational evaluations are transmitted to the viewer. This sense of the aesthetic has permeated other media, including film, and to some extent, newspapers. Presentation is more important than content.
The early days of computers were characterized by cumbersome, isolated, intimidating machines that were safely locked away in glass rooms. The computer was viewed as a tool whose role was clearly understood. Today, computers are ubiquitous. Most Earth scientists have at least one if not two computers in their offices, plus a computer at home. This demystification of computers has been accompanied by much emphasis on the interface. Computers are now friendly, not intimidating. Their design now focuses on smoothness, exuding an air of control and calm. As described in a biography of Steve Jobs 13, design of the NeXT workstation focused almost exclusively on the appearance of the machine. Most of the advances in software technology have focused on aesthetics, not on doing new tasks. These new software tools require more technical capability (graphics, memory, and so on) to support their sophisticated form. This emphasis on form (both hardware and software) violates the Bauhaus principle of form following function. The computer industry appears to focus more on selling a concept of information processing as opposed to selling a tool.
Postman 14 has described print as emphasizing logic, sequence, history, and objectivity. In contrast, television emphasizes imagery, narrative, presentation, and quick response. The question is, Where do computer networks sit with regard to print versus television? There is no doubt that networks are beginning to resemble television more than print. The process of surfing and grazing on information as though it were just another commodity reduces the need for thoughtful argument and analysis. Networks encourage the exchange of attitudes, not ideas. The vast proliferation of data has become a veritable glut. Unlike with television, anyone can be a broadcaster on the network; but to rise above the background noise, one must advertise in a more compelling manner than one's competitors.
Conclusions and Recommendations
Networks will continue to play an important role in the conduct of Earth science. Their fundamental roles of data transport and access cannot be denied. However, there are other effects as well that are the result of a confluence of several streams. First, the next-generation networks will be driven by commercial needs, not by the needs of the research and education community. Second, the sharp decrease in price/performance of most computer hardware and the shortened product life cycles have required the science community to acquire new equipment at a more rapid pace. Third, expected decreases in federal funding for science have resulted in greater emphasis on competitiveness. This confluence has caused scientists to aim for rapid delivery of information over the network. Without the regulations and impedance of traditional paper publishing, scientists can now argue in near real time about the meaning of particular results. "Flame" wars over electronic mail are not far behind. The community now spends nearly as much time arguing about the technical aspects of the information delivery system as it does in carrying out scientific research.
Networks allow us to pull information out of context without consideration of the framework used to collect and interpret the data. Ever-increasing network speeds emphasize the delivery of volume before content. If all information is equally accessible and of apparently equal value, then all information is trivial. Science is at risk of becoming another "consumable" on the network where advertising and popularity are the main virtues.
Long, thoughtful analyses and small, unpopular data sets are often overwhelmed in such a system. Similar processes are at work in television; the metaphors of the TV world are rapidly appearing in the network world.
One can successfully argue that Earth science is currently limited by the lack of data (or at least the correct data), but an equally serious problem is the inability to synthesize large, complex data sets. This is a problem without a technological solution. While information systems can help, they will not overcome this hurdle. Delivering more data at a faster rate to the scientist will obscure this fundamental problem. Indeed, technology may give the appearance of solving the problem when in reality it exacerbates it. As stated by Jacob Bronowski,
This is the paradox of imagination in science, that it has for its aim the impoverishment of imagination. By that outrageous phrase, I mean that the highest flight of scientific imagination is to weed out the proliferation of new ideas. In science, the grand view is a miserly view, and a rich model of the universe is one which is as poor as possible in hypotheses.
Networks are useful. But as scientists, we must be aware of the fundamental changes that networks bring to the scientific process. If our students rely only on networks to locate data as opposed to making real-world observations, if they cannot use a library to search for historical information, if they are not accountable for information that appears on the network, if they cannot form reasoned, logical arguments, then we have done them a great disservice.
The balance between market forces with their emphasis on short-term returns for individuals and infrastructure forces with their emphasis on long-term returns for the common good must be maintained. There is a role for both the private sector and the public sector in this balance. At present, the balance appears to be tilted toward the short term, and somehow we must restore a dynamic equilibrium.
1. Roszak, Theodore. 1994. The Cult of Information: A Neo-Luddite Treatise on High-Tech, Artificial Intelligence, and the True Art of Thinking. University of California Press, Berkeley.
2. Miller, Mark Crispin. 1988. Boxed In: The Culture of TV. Northwestern University Press, Evanston, Ill.
3. U.S. General Accounting Office. 1995. "Earth Observing System: Concentration on Near-term EOSDIS Development May Jeopardize Long-term Success," testimony before the House Subcommittee on Space and Aeronautics, March 16.
4. Negroponte, Nicholas. 1995. "000 000 111 Double Agents," Wired, March.
5. Hardin, Garrett. 1968. "The Tragedy of the Commons," Science 162:1243–1248.
6. Postman, Neil. 1992. Technopoly: The Surrender of Culture to Technology. Knopf, New York.
7. Stoll, Clifford. 1995. Silicon Snake Oil: Second Thoughts on the Information Superhighway. Doubleday, New York.
8. Postman, Technopoly, 1992.
9. Physics Today. 1995. "Roundtable: Whither Now Our Research Universities?" March, pp. 42–52.
10. Roszak, The Cult of Information, 1994.
11. Postman, Technopoly, 1992.
12. Miller, Boxed In, 1988.
13. Stross, Randall. 1993. Steve Jobs and the NeXT Big Thing. Atheneum, New York.
14. Postman, Technopoly, 1992.