Statistics and Physical Oceanography

5
VISUALIZATION

Scientific visualization has nearly become a cliché in recent years, as researchers apply increasingly sophisticated hardware and software tools to the task of data analysis. Techniques ranging from video animations of three-dimensional fields to simple two-dimensional line plots are often lumped under the term “visualization.” In a sense, any visual representation of data may be considered visualization. However, a more useful definition is more restrictive: visualization is the representation of data as a picture. This picture could consist of either static or evolving fields (animations).

The motivation for scientific visualization is the increasing availability and complexity of enormous observational data sets and numerical model output. Traditional line plots, tables of data, and other methods are inadequate to cope with the volume and complexity of these “data.” Suitable visualization, by presenting the data as a picture, can allow the researcher to detect relationships and patterns much more quickly. This “illustrative” approach conveys information about relationships between components of the image simultaneously, rather than relying on a “discursive” or sequential approach using tables of numbers, sentences, and so on. The truism about a picture being worth a thousand words is applicable for many studies. In an effort to deduce the underlying processes responsible for the relationships between various physical phenomena, visualization tools will play an important role as scientists examine multidimensional data sets.

USES OF VISUALIZATION

The volume of data that can be collected by oceanographers has increased dramatically over the past 10 years. Although satellite sensors are the usual example, data rates from in situ instrumentation have also increased. For example, data storage technology now allows moorings to collect samples more frequently and for longer time periods. New instrumentation, such as spectroradiometers, is being deployed on moorings to measure upwards of 50 variables. Typical data sets now range from hundreds of megabytes to a few gigabytes or more.

Although the sheer volume of data may require visualization tools, an equally compelling need for improved visualization tools is the multitude of variables that are now being measured. Advances in ocean instrumentation have greatly increased the variety of processes that may be measured. For example, probes can now measure oxygen nearly continuously, rather than relying on bottle samples at a few discrete depths. High-resolution spectrometers measure phytoplankton fluorescence with much greater accuracy, resolving many pigments rather than just chlorophyll. The search for relationships becomes increasingly difficult as more data sets are added, and so analysis tools that simplify this process are essential. The need to examine complex relationships is not driven simply by our ability to measure numerous variables; rather, the importance of understanding the interplay between biology, physics, and chemistry has driven the need for an interdisciplinary approach to data analysis.

Numerical models can now provide detailed three-dimensional views of the ocean. Such volumetric data are nearly impossible to analyze using traditional two-dimensional graphic techniques (see, e.g., Pool, 1992). The addition of the temporal dimension also requires animation tools that allow researchers to study model dynamics and evolution.

Visualization tools also play an important role in assessing model performance. For example, most model output has traditionally been discarded in an attempt to limit data volumes to manageable levels. However, specific events in model simulations often appear in just a few time steps, so the ability to retain model output at every time step is useful for model diagnostics. The resulting large quantities of model output place greater demands on visualization techniques, which must search through large volumes of data efficiently and make such events easy to identify.

CHALLENGES FOR VISUALIZATION

Visualization will continue to be important for oceanographic research as the ability to measure and model the ocean improves. Existing visualization tools, however, are inadequate for these tasks. Many of the deficiencies are implementation problems and have been described in numerous NASA and other federal government reports (Botts, 1992; McCormick et al., 1987). For example, existing visualization packages are generally expensive and difficult to learn. Packages are usually not extensible, so custom features cannot be added easily. Some tools cannot handle three-dimensional data sets or animations.

One of the more difficult challenges is visualizing evolving volumetric data, such as the output of an ocean circulation model. It is very difficult to “see” into the interior of such volumes using present technology.
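Because one cannot easily see into the interior of a volume, a common workaround is to examine two-dimensional slices of it. The sketch below is illustrative only; the array layout (time, depth, lat, lon) and the synthetic temperature field are assumptions, not taken from any particular model:

```python
import numpy as np

# Hypothetical model output: temperature on a (time, depth, lat, lon) grid.
# Shapes and values are synthetic, for illustration only.
nt, nz, ny, nx = 4, 10, 20, 30
rng = np.random.default_rng(0)
temp = rng.normal(15.0, 2.0, size=(nt, nz, ny, nx))

def horizontal_slice(field, t, k):
    """Two-dimensional map at time step t and depth level k."""
    return field[t, k, :, :]

def vertical_section(field, t, j):
    """Depth-longitude section at time step t along latitude row j."""
    return field[t, :, j, :]

surface = horizontal_slice(temp, t=0, k=0)   # (ny, nx) surface map
section = vertical_section(temp, t=0, j=10)  # (nz, nx) vertical section
```

Stepping such slices through time steps is one simple way to animate an evolving volume without true volume rendering.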
Most commercially available packages designed for such volumetric data can handle static objects, such as automobiles; for many packages, visualizing three-dimensional systems that evolve over time remains a difficult task. Such implementation deficiencies are slowly being addressed by software vendors and developers.

The most troublesome aspect of existing visualization tools is that most of them break the link between the underlying data and the image on the screen. Although a researcher may be able to produce a sophisticated animation of the evolution of an ocean eddy, it is generally not easy to go from the animation on the computer screen back to the numbers that the various colors represent. Because visualization is a tool for detecting previously unknown relationships, it is still necessary to obtain quantitative information about the nature of those relationships. For example, if one notes a possible relationship between phytoplankton concentration and the strength of a density front in an eddy, it is desirable to examine the quantitative aspects of this relationship. Thus there must be techniques for excising subsets of the actual data for use in other analysis packages, such as statistical and plotting tools. Present visualization packages do not have probes or cursors that allow the user to examine the quantitative values of a three-dimensional image at specific locations, nor do they have tools for graphically selecting subsets of visualized data (the equivalent of the “lasso” tool on the Macintosh).

Most earth science data are referenced to some system of Earth coordinates. Because there is no standard way to carry such information along with the data, existing visualization packages either define their own format for such ancillary information or else discard it. It is vital that researchers be able to overlay different data sets on a geographic basis. A common example is the comparison of satellite maps of sea surface temperature with ship observations along a transect across the map. Again, most visualization tools do not retain this link to the underlying data.

Visualization must include a link between the tools and an underlying database, and this link must operate in both directions. That is, the visualization tool should be able to query databases to locate the raw data of interest for analysis, as well as maintain a database of the various visualization operations that were used to create a new, analyzed product. For example, an animation of vector winds and sea surface temperature might be created by querying a database; the steps used to create the animation would be stored along with it. Visualization tools can create large amounts of analyzed data that may be difficult to recreate without some type of audit-trail mechanism.

Currently, visualization tools are used largely in an exploratory manner, rather than for presentation to the research community. The high cost of color printing often prohibits the use of color imagery, and there is no established method for distribution of video animations. Occasionally, special sessions are held at scientific meetings for presentations of videos, but this approach reaches only a small fraction of the community. New methods for disseminating visualizations must be established, as the existing print medium is not adequate.
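The data probe and “lasso”-style subset extraction described above can be sketched in a few lines once the geographic coordinates are kept attached to the data. The coordinate arrays and synthetic sea surface temperature field here are hypothetical stand-ins for a real gridded data set:

```python
import numpy as np

# Hypothetical gridded field with geographic coordinates attached.
lats = np.linspace(30.0, 40.0, 21)    # degrees north, 0.5-degree spacing
lons = np.linspace(-70.0, -60.0, 41)  # degrees east, 0.25-degree spacing
sst = 20.0 + 0.1 * lats[:, None] + 0.05 * lons[None, :]  # synthetic SST

def probe(field, lats, lons, lat, lon):
    """Return the data value nearest a queried geographic location --
    the cursor-to-data link most packages lack."""
    i = np.abs(lats - lat).argmin()
    j = np.abs(lons - lon).argmin()
    return field[i, j], (lats[i], lons[j])

def excise(field, lats, lons, lat_range, lon_range):
    """Extract a rectangular subset (a crude 'lasso'), keeping its
    coordinates so it can be passed to other analysis packages."""
    ii = (lats >= lat_range[0]) & (lats <= lat_range[1])
    jj = (lons >= lon_range[0]) & (lons <= lon_range[1])
    return field[np.ix_(ii, jj)], lats[ii], lons[jj]
```

Because the subset carries its own latitude and longitude arrays, it can be overlaid geographically on other data sets, such as ship observations along a transect.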
One approach would be to develop animation servers capable of storing and retrieving hundreds of video animations and other visualizations. For example, a research article might reference a video loop that is stored on the server, much as on-line library catalogs are stored now. With the planned increases in network capabilities, it would be possible to retrieve and view the animation on a local workstation. Such an animation could be an integral part of the paper and thus subject to peer review. If scientific visualization is made part of the publication process, it will no longer be just a tool for exploring data sets but a key component of scientific research and communication.

Lastly, color is often used in visualization to represent the underlying data. Most computer manufacturers have not invested in retaining color fidelity from device to device. For simple business graphics, variations in the shades of red from computer display to video tape to hard-copy printer may not be a serious concern. However, when color represents specific data values in scientific applications, maintaining an exact shade across the breadth of output devices is essential; this link to the data must also be maintained.

Visualization tools will likely increase in importance for oceanographic research as the volume and complexity of data continue to increase. However, more attention must be paid to using these tools for their quantitative value, and not just for their ability to present complex relationships. This requires that the tools retain the links to the data used in the visualization process.
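One way to keep the link between a rendered shade and the value it represents is to make the value-to-color mapping explicit and invertible, so a level read off any device can be traced back to a data value. A minimal sketch, assuming a fixed data range; the range and the 8-bit gray-level encoding are illustrative choices, not a standard:

```python
import numpy as np

# Assumed data range, e.g. sea surface temperature in degrees C.
VMIN, VMAX = 0.0, 30.0

def value_to_level(v):
    """Map a data value to an integer gray level 0..255.
    Out-of-range values are clipped to the ends of the ramp."""
    frac = (np.clip(v, VMIN, VMAX) - VMIN) / (VMAX - VMIN)
    return np.round(frac * 255).astype(int)

def level_to_value(level):
    """Invert the mapping: recover the data value a level represents
    (to within the quantization step of the 256-level ramp)."""
    return VMIN + (level / 255.0) * (VMAX - VMIN)
```

Because the mapping is a published function rather than a device-dependent lookup, the same data value yields the same level on every output device, and a reader can recover values from the image.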

OUTSTANDING STATISTICAL ISSUES

One issue that could benefit from input from the field of statistics is the question of what method to use to interpolate irregularly spaced data to a regular grid in a manner that preserves the statistics of the field of interest (cf. NRC, 1991b). For example, satellite data generally consist of high-resolution data within measurement swaths, separated by gaps of hundreds or thousands of kilometers in which there are no data. Most interpolation methods smooth the data and minimize spatial gradients, whereas it is desirable to retain as much of the full range of spatial scales as possible in the gridded fields.

Another issue to which statisticians could contribute is determining a method of identifying “interesting” events in the data that warrant more detailed analysis. With small data sets, this can be accomplished by simply examining all of the data with various graphical techniques. For large satellite data sets or numerical model output, it is highly desirable to develop automated methods of locating such features. This can be done (with some success) for specific events with easily characterized features, but it is difficult when features do not possess simple, concise characterizations.
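Both issues can be illustrated together: gridding scattered “swath” observations with a Gaussian-weighted average (whose smoothing scale damps exactly the spatial gradients one may wish to preserve), then flagging “interesting” grid points by gradient magnitude. All data and parameters below are synthetic illustrations, not a recommended method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic scattered "swath" observations of a smooth field plus noise.
x_obs = rng.uniform(0.0, 10.0, 300)
y_obs = rng.uniform(0.0, 10.0, 300)
z_obs = np.sin(x_obs) + 0.1 * rng.normal(size=300)

# Target regular grid.
xg = np.linspace(0.0, 10.0, 50)
yg = np.linspace(0.0, 10.0, 50)
X, Y = np.meshgrid(xg, yg)

def gaussian_grid(x, y, z, X, Y, scale=1.0):
    """Gaussian-weighted average of scattered data onto a grid.
    The weighting scale controls the smoothing: larger scales damp
    the very gradients one may wish to preserve."""
    d2 = (X[..., None] - x) ** 2 + (Y[..., None] - y) ** 2
    w = np.exp(-d2 / (2.0 * scale**2))
    return (w * z).sum(axis=-1) / w.sum(axis=-1)

Z = gaussian_grid(x_obs, y_obs, z_obs, X, Y, scale=0.5)

# Flag "interesting" grid points by gradient magnitude -- a crude
# automated detector that works only for easily characterized features.
gy, gx = np.gradient(Z, yg, xg)
grad = np.hypot(gx, gy)
events = grad > np.percentile(grad, 95)
```

Varying `scale` shows the trade-off directly: small scales retain sharp gradients (and noise), while large scales suppress both, which is why interpolation schemes that preserve the field's statistics remain an open question.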