Statistics and Physical Oceanography

1 OVERVIEW

INTRODUCTION

Purpose and Scope of This Report

Research in oceanography has historically been pursued to better understand the oceans as, for example, avenues to exploration, routes for commerce, theaters for military operations, and components in the weather system. Today this research is also done in conjunction with studies on major issues such as global climate, environmental change, and biodiversity, among many others. Statistical techniques have always been important in the analysis of oceanographic data. With the recent introduction of oceanographic observational mechanisms that yield much larger quantities of data than ever before, statistical considerations have gained even more prominence in oceanographic research contexts. Yet disciplinary distinctions have limited interactions across discipline boundaries in many national and global research areas (NRC, 1987, 1990a); traditional statistics and oceanography are not exceptions. To stimulate progress on important research questions now arising at this interface, more cross-disciplinary efforts between statistics and oceanography are needed. This report is thus presented to help encourage successful collaborations between statistics and oceanography that are focused on potentially fruitful cross-disciplinary research areas. The report was prepared in response to a request from the Mathematical Sciences Division of the Office of Naval Research for a cross-disciplinary report describing basic research questions in statistics and applied probability motivated by oceanographic applications. The request reflects ONR’s desire to call such questions to the attention of research statisticians and to develop stronger interactions between the statistics and oceanography research communities. A panel of five oceanographers and five statisticians was convened by the Committee on Applied and Theoretical Statistics of the National Research Council to produce the report.
The charge to the panel was to survey crossover areas between statistics and oceanography of greatest potential value (with respect to important oceanographic questions) and to recommend statistical research opportunities. The panel met in April 1992 and again in August 1992. It quickly became apparent that a comprehensive summary of statistical research opportunities addressing all disciplines of oceanography would exceed the project time and budget constraints. This report is therefore limited to a discussion of statistical research opportunities arising in physical oceanography. Lest the limited scope of this report be misconstrued as a statement of the unimportance of statistical analysis to biological, chemical, and geological oceanography, the panel emphasizes that there are numerous opportunities for statisticians to work in those disciplines as well. For example, recent interest in the carbon cycle has focused attention on the spatial and seasonal distributions of phytoplankton pigment concentration in the ocean. These data, obtained by satellite, exhibit all the challenges of sparsity and
incompleteness shared by the other data sets discussed in this chapter, and furthermore exhibit temporal and spatial correlation. An eventual question to address is the role of phytoplankton distribution in climate change, but first a quantitative analysis of the distribution itself is necessary. Factors such as bathymetry, nutrients, eddy kinetic energy, wind stress, cloud cover, meltwater formation, and Ekman upwelling are believed to be potential influences on the phytoplankton distribution, but the relationships are as yet unknown. Currently available data on many of these factors are sparse, and a great deal of spatial and temporal aggregation is necessary in order to assess such potential relationships. Future satellite observations are expected to ameliorate the data issues basic to the study of these important biological and chemical oceanographic processes, but the statistical problems discussed in Chapters 2 through 8 will remain the same. In physical oceanography, the development and application of statistical analysis techniques are somewhat more advanced than in other disciplines of oceanography. In large part, a greater need for sophisticated statistical techniques in physical oceanography has been driven by rapid technological advances over the past 30 years or so that have resulted in larger volumes of observational data spanning a broader range of space and time scales than are available in the other oceanographic disciplines. There has also been intensive development of a theoretical foundation to explain the observations. As a result of these two parallel efforts and recognition of the importance of physical oceanographic processes in many of today’s important global issues, there are many significant opportunities for applications of statistics, both where descriptive analyses of the observational data are needed and where there is a need to relate observations to theory.
Even the limited scope of physical oceanography presents a rather daunting task for those who would explore it, since the discipline encompasses a very broad range of topics. Input to the panel was sought and was generously provided by several outside experts (see the preface) to broaden the span of topics outlined in this report. It should be emphasized at the outset that statistical analyses of physical oceanographic data have not been developed in total isolation from developments in the field of statistics. On the contrary, statistical techniques are already used to an unusual degree of sophistication compared with their use in some other scientific disciplines, partly because of the need to develop techniques to understand the almost overwhelming quantity of observational data available. In this regard, physical oceanography has benefitted from the parallel development of techniques of statistical analysis in the field of atmospheric sciences, in which researchers also need to interpret the large volumes of atmospheric data available. Physical oceanographers are generally well versed in traditional and many modern statistical analysis techniques. In addition, several books and monographs have been written specifically on applications of statistical techniques in the atmospheric sciences and physical oceanography (e.g., Gandin, 1965; Thiebaux and Pedder, 1987; Preisendorfer, 1988; Daley, 1991; Ghil and Malanotte-Rizzoli, 1991; Bennett, 1992). Many statistical techniques tailored to specific analyses of oceanographic data have also been published in journal articles. This report consists of a collection of sections (Chapters 2 through 8) outlining research problems that the panel believes could serve as fruitful areas for collaboration between statisticians and oceanographers. In Chapter 9, the panel presents its conclusions, observations, and suggestions on encouraging successful collaborations between statistics and
oceanography. As noted above, physical oceanographic research encompasses a very broad range of topics. Not all of these subdisciplines are represented by the five oceanographers on the panel. This report should therefore be viewed as a compendium of research interests reflecting the viewpoints of the oceanographers on the panel. This somewhat parochial bias should be kept in mind when using this report to identify potential crossover areas between statistics and physical oceanography; there are likely many statistical research opportunities that have not been identified in the report. Notwithstanding these limitations, the panel believes that the report represents a good first step toward encouraging interaction between statisticians and physical oceanographers to the mutual benefit of both disciplines.

Oceanography—A Brief Sketch

The birth of oceanography as a science can be traced back to 1769, when Benjamin Franklin contributed significantly to scientific knowledge of the oceans by charting sea surface temperature in the North Atlantic and noting that the maximum flow of the Gulf Stream (which had been known to exist and had been used for navigation for a long time) occurred where surface temperatures began dropping rapidly for a ship traveling from the New World to the Old World. Further scientific surveys of the ocean were conducted during this same era by Captain James Cook, who set sail from England in 1772 with the primary goal of making a detailed map of the Pacific Ocean and learning the natural history of the Pacific region. Matthew Fontaine Maury is generally credited as the founding father of international oceanographic science. As a U.S. Navy officer, Maury published an atlas (Maury, 1855) based on a worldwide compilation of data taken from ship logbooks.
The culmination of this era of scientific exploration of the ocean was the historic voyage of the HMS Challenger funded in 1873 by Great Britain to collect detailed measurements of the physical, biological, and chemical characteristics of the world oceans. The 4-year expedition resulted in some 50 volumes of reports published between 1890 and 1895. The 20th century has witnessed a dramatic expansion of oceanographic research. At the beginning of the century, most of the deep ocean was thought to be relatively quiescent. Except for moderate seasonal variability, it was generally believed that the circulation near the surface of the oceans was relatively constant and large scale. Scripps Institution of Oceanography was founded in 1903 and the Woods Hole Oceanographic Institution was established in 1930. As a result of new technological developments, it became possible to measure physical, chemical, and biological characteristics from the sea surface to the ocean bottom. Dedicated research vessels set out to systematically map the three-dimensional physical, chemical, and biological characteristics of the world ocean on a coarse spatial grid. Although tremendous progress was made in the field of oceanography prior to World War II, it was still possible to summarize existing knowledge in all three disciplines (physical, biological, and chemical) in a single book (Sverdrup et al., 1942). The general description of the steady component of ocean circulation (defined to be the temporal mean) has changed surprisingly little since World War II. In contrast, the view of temporal variability has undergone a major paradigm shift over the subsequent half century. Although eddy-like characteristics of ocean currents were known to exist even by
Maury (1855), it was difficult to distinguish unresolved variability from measurement errors. Multiship surveys and repeated hydrographic surveys conducted beginning in the 1950s and moored current meter and surface drifter measurements beginning in the 1960s revealed considerable spatial structure and temporal variability that did not support the view of ocean currents as simple and large scale. Much of modern oceanographic research has focused on understanding the nature of the rich spatial and temporal variability through a proliferation of new measuring and modeling techniques. There has been a growing recognition of the importance of short space- and time-scale variability (turbulence) to the large-scale circulation, momentum transport, and heat transport and to the distribution of chemical and biological properties. Along with the rapid technological and theoretical developments over the past half century, oceanography has become progressively more specialized. It is no longer possible to summarize adequately the status of all disciplines of oceanography in a single book. Indeed, it is very difficult to summarize even a single discipline in one book. An excellent perspective on the post-World War II evolution of physical oceanography has been published by Warren and Wunsch (1981). A more popularized summary of several aspects of physical oceanography can be found in the Summer 1992 issue of Oceanus (Vol. 35(2)), which is dedicated to physical oceanography; dedicated issues on the other disciplines of oceanography can be found in the other 1992 issues of the magazine. A précis of physical oceanography is given in Chapter 1 of a National Research Council (NRC) report (NRC, 1988); also see NRC (1992b) for a state-of-the-science overview of all of oceanography. In simple terms, physical oceanography can be defined as the study of the physics of the circulation of the ocean on all space and time scales.
Research in physical oceanography includes studies of the details of turbulent mixing on scales of millimeters, the propagation of surface and internal waves with scales of centimeters to hundreds of meters, the dynamics of wind-forced and thermohaline-driven ocean currents (see, e.g., NRC, 1992b) on scales of kilometers to thousands of kilometers, and the transfers of momentum, heat, and salt within the ocean and across the air-sea interface. Because of the pressing importance of questions about global warming, there has been an increasing emphasis in recent years on the role of the ocean in the global climate. This has led to a quest for general understanding of the dynamics and long-term evolution of the coupled ocean-atmosphere system (see, e.g., Gill, 1982) and its interactions with the land, cryosphere, and biosphere. The need to quantify and forecast natural and anthropogenic changes in weather patterns and global climate, on the one hand, and the emergence of more easily accessible supercomputing power, satellite remote sensing, and other instrumentation technologies, on the other hand, are factors determining the direction of present and near-future research in physical oceanography. Computer models of large-scale ocean circulation and ocean-atmosphere coupling, of biogeochemical cycles, and of the global budgets of carbon dioxide and other greenhouse gases are becoming the desired results of much of present research. The input data for such models have intrinsic shortcomings because of concerns about data quality and coverage (in space and time). Much effort must therefore be devoted to improving the interpretation of measured quantities and their subsequent use in computer models. The constraints may be due to limited spatial and temporal resolution of the measurements of the observed fields, limited accuracy of the measured quantities, gaps in the data records, short data records, or
propagation of errors through different levels of data processing and analysis. As a result, the technological innovations available do not guarantee success unless considerable progress is made in utilizing the available data. This will necessarily involve the use of sophisticated statistical techniques for a wide variety of purposes, as summarized in this report. Collaborative research involving statisticians and physical oceanographers is desirable to fuel such progress and improvements. To provide statisticians with a brief sketch of the physical oceanographic community, the panel includes a few demographic items. It is not aware of any detailed demographic studies. The membership of the Ocean Sciences Section of the American Geophysical Union probably provides a fair representation of the community. In 1991, the section’s total membership was 4791, 84 percent of whom were regular members and 16 percent of whom were student members. About one-fourth of this membership was foreign. Of the remaining members, it is not known what percentage are actively involved in research, but the number is probably less than half. The total membership is certainly dominated by physical oceanographers; it also includes a substantial number of chemical oceanographers and smaller numbers of biological and geological oceanographers, most of whom are members of other professional societies. About a dozen U.S. universities offer graduate programs in physical oceanography. There are two civilian federal government oceanographic laboratories and several U.S. Navy-supported research and development laboratories involved in open-ocean physical oceanographic research. Private industry employs a relatively small fraction of the physical oceanographic community.
Most physical oceanographic research is published in the six primary journals in the field: Journal of Physical Oceanography, Journal of Geophysical Research-Oceans, Journal of Marine Research, Deep-Sea Research, Progress in Oceanography, and Journal of Atmospheric and Oceanic Technology. Fundamental results frequently appear in the Journal of Fluid Mechanics. Significant advances in physical oceanographic research are occasionally published in Science, Nature, and Geophysical Research Letters. Overviews of physical oceanographic research written for less specialized audiences are often published in Oceanography Magazine and Oceanus.

OCEANOGRAPHIC MODELING, DATA, AND NOISE

The Many Meanings of the Term “Model”

The term “model” has a variety of usages in oceanography, depending on the context. It can refer to modeling of data by statistical methods (e.g., curve fitting of one-dimensional data, surface fitting of multi-dimensional data, correlation and regression analysis, modeling of probability distributions, and so on). More typically, however, the term “model” connotes physical modeling on the basis of mathematical equations that govern fluid motion, mass conservation, heat conservation, and conservation of salt or other chemical tracers. Physical models range from purely analytical (i.e., explicitly solvable in closed form) to numerical (i.e., solvable on a computer), depending on the degree of approximation of the complete mathematical equations adopted. An introduction to the equations of fluid motion in the
rotating reference frame of Earth can be found in Pond and Pickard (1983); a more advanced discussion can be found in Pedlosky (1987) or Stern (1975). A brief overview is given here. The vector equation for momentum conservation based on Newton’s Second Law that relates the acceleration of a fluid parcel to the forces acting on the parcel is

∂v/∂t + (v·∇)v + 2Ω×v = −(1/ρ)∇p + g + ν∇²v,   (1.1)

where v is the three-dimensional vector velocity, ∇ is the vector gradient operator along the x, y, and z coordinate axes with respective velocity components u, v, and w, Ω is the angular velocity vector of the rotation of Earth, g is the gravitational acceleration, ρ is the water density, p is pressure, and ν is the molecular viscosity. The three components of this vector equation are referred to as the Navier-Stokes (N-S) equations, in honor of the physicist Claude L. M. H. Navier (1785–1836) and the mathematician Sir George Gabriel Stokes (1819–1903), who first formulated the molecular friction force in terms of the second derivatives of velocity along each of the three coordinate axes. The unknown quantities in the N-S equations are density, pressure, and the three components of velocity. Two additional equations are thus necessary to solve for the five unknowns. The first of these is the mass conservation equation,

∂ρ/∂t + ∇·(ρv) = 0,   (1.2a)

also known as the continuity equation. Seawater can generally be considered to be incompressible (i.e., the so-called total derivative ∂ρ/∂t + v·∇ρ, corresponding to the rate of change of density following a fluid parcel, is zero), in which case the continuity equation reduces to

∇·v = 0.   (1.2b)

The other equation necessary to solve for the five unknowns is the equation of state relating density to temperature T, salinity S, and pressure,

ρ = ρ(T, S, p).   (1.3)

This empirical relationship is based on laboratory studies of seawater. The dependence of ρ on T and S requires the addition of two more equations governing the conservation of T and S. These equations have the form
∂C/∂t + v·∇C = κC∇²C + QC,   (1.4)

where C could be either temperature or salt concentration, κC is the molecular diffusivity for C (analogous to the molecular viscosity ν in the N-S equations), and QC is a source or sink term to account for effects of heating and cooling. A source term is not necessary for salinity since all processes affecting salinity occur at boundaries (surface evaporation and precipitation, river runoff, freezing, and melting), and therefore enter the problem as boundary conditions. Temperature is also usually treated as a boundary condition, although, in a strict sense, the effects of solar heating can penetrate below the ocean surface. In total, then, there are seven equations for the seven unknowns u, v, w, p, ρ, T, and S. These equations must be solved subject to boundary conditions of no normal flow at material surfaces (the ocean bottom and lateral boundaries), as well as boundary conditions for the normal and tangential components of forces at the boundaries (e.g., surface wind stress, bottom drag, lateral drag, and atmospheric pressure forcing) and buoyancy fluxes (heat and salt) across the air-sea interface and at coastal boundaries. The equations themselves are deterministic in the sense that a particular solution is obtained for a given specification of the boundary and initial conditions. However, the boundary and initial conditions have a random character, which imparts randomness to the physical modeling. It is noteworthy that many of the methods used to determine the ocean circulation are based on measurements of various natural and anthropogenic chemical tracers. Examples include oxygen, carbon dioxide, silicate, and tritium. The concentrations of these tracers are coupled to the dynamic variables of the equations of motion (1.1) and (1.2a) (or (1.2b)) through conservation equations with exactly the form (1.4), with the term QC corresponding to sources or sinks of the chemical tracer of interest.
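The empirical equation of state (1.3) is, in practice, a complicated fit to laboratory data. As a rough illustration of how density responds to temperature and salinity, a linearized version can be sketched as follows; the reference values and expansion coefficients below are representative round numbers chosen for illustration, not the full empirical seawater formulas used in research:

```python
# Sketch of a linearized equation of state for seawater (illustration only;
# quantitative work uses the full empirical UNESCO/TEOS-10 relationships).
# rho = rho0 * (1 - alpha*(T - T0) + beta*(S - S0)); pressure dependence neglected.

RHO0 = 1027.0        # reference density (kg/m^3), a typical near-surface value
T0, S0 = 10.0, 35.0  # reference temperature (deg C) and salinity (psu)
ALPHA = 1.7e-4       # thermal expansion coefficient (1/deg C), representative
BETA = 7.6e-4        # haline contraction coefficient (1/psu), representative

def density(T, S):
    """Seawater density (kg/m^3) from the linearized equation of state."""
    return RHO0 * (1.0 - ALPHA * (T - T0) + BETA * (S - S0))

# Warm, fresh water is lighter; cold, salty water is denser.
rho_warm_fresh = density(20.0, 34.0)
rho_cold_salty = density(2.0, 36.0)
```

Even this crude sketch captures the sign of the T and S dependence that drives thermohaline circulation: cooling or salinification at the surface produces dense water that can sink.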
These tracers are used to infer indirectly the direction and, to some extent, the speed of deep ocean circulation where mean velocities are often too small to be measured directly. The equations of motion apply to the instantaneous velocity of the fluid. However, the nonlinear terms in the momentum equation (1.1) give rise to turbulent variability that is characteristically irregular in space and time. Because of this nonlinearity and the large range of spatial scales over which the ocean is energetic, it is not practical to solve the above equations explicitly. In particular, it is not possible to measure, and hence specify, the boundary and initial conditions at very fine spatial and temporal resolution. This, in effect, introduces additional noise-like or random character to the physical equations. The usual approach to addressing the turbulent character of oceanic variability is to parametrize the effects of turbulence in terms of large-scale observable quantities (typically the mean flow and its derivatives). As a consequence of the neglect of the detailed dynamics on small scales, the parametrized physical equations pertain to averages of the random dynamic variables. The simplest and most commonly used approach is to replace molecular viscosity ν and diffusivity κC with “eddy” or “turbulent” viscosity and diffusivity (also referred to as effective diffusion or mixing coefficients), as first suggested by Taylor (1915). The turbulent coefficients serve the same function as molecular coefficients but are much larger in magnitude to account for the effects of eddies smaller than those explicitly represented
within the model. These eddies transport momentum and chemical properties much more rapidly than does molecular diffusion. Horizontal mixing is about 10 orders of magnitude larger than molecular diffusion. Because vertical density stratification in the ocean inhibits vertical mixing, vertical mixing is only about 2 orders of magnitude larger than molecular diffusion. The detailed specification of turbulent mixing is not well understood because, unlike molecular diffusion, which is an intrinsic property of the fluid, turbulent mixing varies spatially and temporally and depends on the flow itself. Moreover, the particular choice of turbulent mixing coefficient depends critically on the spatial scales represented within the model. From coarsely spaced observations, it is even possible for turbulent transport to be counter-gradient (i.e., effectively a negative turbulent mixing coefficient, corresponding to energy transfer from eddies to the mean flow; see Starr, 1968). Such a situation is clearly nonphysical, and the turbulent mixing coefficient would presumably be non-negative with sufficiently close sample spacing. The equations of motion (1.1)-(1.4) (referred to as primitive equations) are very complex and are therefore not solvable in exact form. Various simplifications of the complete equations are employed in order to gain insight into the dynamics of fluid motion. A brief overview is given here; a more detailed summary can be found in Holland (1977). One class of simplifications concerns the treatment of vertical density stratification. The simplest models, referred to as barotropic models, consider the fluid density to be homogeneous. Next in complexity are layered models that divide the ocean into two or more distinct layers, in each of which the fluid density is considered homogeneous. The most complex models consider the fluid to be continuously stratified.
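The gap between molecular and turbulent mixing rates can be made concrete with the diffusive time scale t ≈ L²/κ. The sketch below uses representative order-of-magnitude coefficients (assumed values, not measurements) to show why molecular diffusion alone cannot account for the mixing observed over basin scales:

```python
# Back-of-envelope diffusive time scales t ~ L^2 / kappa, illustrating why
# eddy (turbulent) diffusivities must replace molecular ones in ocean models.
# Coefficient values are representative orders of magnitude only.

SECONDS_PER_YEAR = 3.156e7

kappa_molecular = 1.4e-7   # molecular thermal diffusivity of seawater (m^2/s)
kappa_horizontal = 1.0e3   # horizontal eddy diffusivity (~10 orders larger)
kappa_vertical = 1.0e-5    # vertical eddy diffusivity (~2 orders larger)

def diffusion_time_years(L, kappa):
    """Time scale (years) for diffusion to act over a length scale L (meters)."""
    return L**2 / kappa / SECONDS_PER_YEAR

# Spreading heat over 1000 km horizontally:
t_molecular = diffusion_time_years(1.0e6, kappa_molecular)  # absurdly long
t_eddy = diffusion_time_years(1.0e6, kappa_horizontal)      # decades, plausible
```

With these assumed values, molecular diffusion would need on the order of 10¹¹ years to act across a basin, while eddy diffusion needs only decades, which is why the parametrized coefficients dominate model behavior.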
Although a barotropic approximation is clearly unrealistic, many circulation aspects can be successfully modeled without the need for the more complex baroclinic layered or continuously stratified models. For both barotropic and baroclinic models, various approximations are employed to simplify the equations of motion. The simplest model is the geostrophic approximation, which neglects the nonlinear and acceleration (i.e., time-dependent) terms. The resulting steady-state, linearized equations can be solved analytically, and the geostrophic solution is surprisingly successful at describing the large-scale aspects of the circulation. The next level of complexity includes the acceleration term, which permits analytical wave solutions. Depending on the scales of interest, these waves can range from short capillary waves (wavelengths of millimeters) for which the restoring force is surface tension, to surface and internal gravity waves (wavelengths of tens of centimeters to hundreds of meters) for which the restoring force is gravity, to very long wavelength (tens to hundreds of kilometers) Kelvin or quasi-geostrophic Rossby waves, which arise from the restoring force provided by the latitudinal variation of the local vertical component of Earth’s angular velocity vector or horizontal gradients of bottom topography. The large-scale waves are the dynamical mechanism by which the large-scale circulation adjusts to time-dependent forcing such as the stress exerted by the wind blowing over the surface of the ocean. Although very illuminating, linear models of ocean circulation are not capable of producing accurate representations of detailed aspects of the circulation. In particular, the short spatial scales of many of the interesting features of the circulation (e.g., jetlike currents such as the Gulf Stream) result in strong gradients in the velocity field, which elevates the magnitude of the nonlinear terms to a level comparable to that of other terms in the
equations of motion. More complex classes of physical models thus include nonlinear effects. Analytical solutions are still possible for weakly nonlinear approximations and for a few special cases of strongly nonlinear approximations of the equations of motion. Numerical methods using a computer are necessary for more general solutions. Numerical models can be classified as either process-oriented (also referred to as mechanistic) or simulation models. Process-oriented models simplify the ocean basin geometry in order to focus on the physics of specific term balances in the mathematical equations. Simulation models attempt to represent the basin geometry more accurately and to reproduce or predict some aspects of the actual circulation for comparison with observations. Numerical solutions to the equations of motion are obtained on a space-time grid by approximating the derivatives in the equations by finite differences or by the use of Fourier transform techniques. At each grid point, solutions are obtained by stepping forward in time from the initial conditions according to the mathematical equations governing the fluid motion (e.g., O’Brien, 1986; Haltiner and Williams, 1980; NRC, 1984). Computational models of the climate, especially coupled ocean-atmosphere models, are being used to produce estimates of the climate changes expected to result from changes in radiative forcing. Although deterministic, these models are sufficiently chaotic to show variability that is in many respects similar to that observed in the climate of the real world. Thus, the analysis of model output and comparison with data (see Chapter 7), especially to detect trends, raises serious statistical questions.
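The grid-based time stepping described above can be illustrated with a deliberately minimal example: advection of a tracer by a constant current, discretized with a first-order upwind finite difference. All parameter values here are arbitrary illustrative choices, and real ocean models use far more elaborate schemes, grids, and boundary treatments:

```python
# Minimal illustration of numerical time stepping on a grid: 1-D advection of a
# tracer C by a constant current u, using a first-order upwind finite difference.
# This only shows the step-forward-in-time structure; it is not an ocean model.

u = 0.5          # current speed (m/s), assumed constant and positive
dx = 1000.0      # grid spacing (m)
dt = 1000.0      # time step (s); satisfies the CFL condition u*dt/dx <= 1
nx, nsteps = 50, 20

C = [0.0] * nx
C[10] = 1.0      # initial condition: a tracer spike at grid point 10

for _ in range(nsteps):
    C_new = C[:]
    for i in range(1, nx):
        # upwind difference: information travels with the flow (u > 0)
        C_new[i] = C[i] - u * dt / dx * (C[i] - C[i - 1])
    C = C_new

# With u*dt/dx = 0.5, the spike advects downstream (and numerically diffuses),
# while total tracer content is conserved away from the boundaries.
```

The numerical diffusion visible in this scheme is a toy analogue of the resolution-dependent errors discussed in the text: the solution depends not only on the physics but on the grid spacing, time step, and scheme chosen.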
The accuracy of a numerical solution depends critically on the spatial resolution of the grid and on the size of the time step, as well as on the particular parametrizations of the turbulent viscosity and the specifications of the boundary and initial conditions. There are thus many ways in which the mathematical equations governing the physics of the ocean can be solved numerically. In general, the most accurate simulations require very fine grid spacing and short time steps. In practice, spatial and temporal resolutions are limited by available computer time and memory. Disk storage capacity can also present a problem since the volume of model output can be very large. As discussed in Chapter 5, physical oceanographic research would benefit greatly from improved methods of visualization to examine the four-dimensional output of numerical models of ocean circulation. Besides the difficulties associated with the subjective choices of grid resolution and turbulent-viscosity parametrization, and the limited availability of computer resources, another major issue in physical modeling of the ocean is assessment of the accuracy of the solution. Owing to the underlying chaotic nature of ocean circulation (e.g., Ridderinkhof and Zimmerman, 1992), to numerical inaccuracies, and to inaccuracies in the specifications of boundary and initial conditions, numerical simulations can be expected to diverge fairly quickly from the actual circulation. One of the challenges of modern physical oceanography is development of techniques for comparing simulations from different numerical models with each other and with one or more independent observational data sets in order to evaluate the relative accuracies of various model simulations. It is unlikely that numerical simulations can ever be expected to depict the actual circulation exactly. There is currently no general agreement about what aspect of model simulation is most important.
For example, one measure of the accuracy of a model is how well it represents the mean circulation. Another measure of accuracy is how well higher-order statistics of the flow field
are reproduced (e.g., the variance of a particular variable or the covariance between two variables). As discussed in Chapter 7, data and model cross-comparison is another area in which the field of statistics may be able to make important contributions. It is noteworthy that, in contrast to physical modeling of atmospheric circulation, the detailed evolution of the actual ocean circulation is very poorly known because of a lack of observations. Global coverage of the ocean can be obtained only from satellite observations, but these are nonsynoptic (i.e., not simultaneous at all locations over Earth) and sample only surface conditions. Sparsely distributed in situ measurements or physical modeling (or both) are necessary to extrapolate the surface measurements from satellites to infer the ocean circulation at depth. Much of the present emphasis in physical modeling of the ocean is directed at developing methods of assimilating available observations (especially satellite observations) into the model solution at regularly or irregularly spaced time steps using statistical estimation, Kalman filtering, and generalized inverse techniques. Such methods have been in use in meteorology for some time. Recent reviews of oceanographic applications of data assimilation can be found in Ghil and Malanotte-Rizzoli (1991) and Bennett (1992). Successful assimilation of available data preserves some degree of similarity between numerical solutions and the actual circulation.

Diverse Definitions of the Term “Data”

Clarification is in order regarding oceanographic usage of the term “data.” In the field of physical oceanography, the term is used more liberally than in some other fields of science. The intent here is not to justify oceanographic use (or misuse) of the term, but rather to clarify the standard oceanographic jargon and the usage elsewhere in this report.
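Stepping back to the data-assimilation methods mentioned earlier, the core idea can be sketched with the textbook scalar analysis step of a Kalman filter, which blends a model forecast with an observation according to their error variances. The numerical values below are illustrative, not drawn from any real assimilation system.

```python
# Minimal sketch of a scalar data-assimilation update of the kind used
# when blending a model forecast with an observation (the textbook
# Kalman analysis step). All variances here are illustrative assumptions.

def analysis_update(forecast, obs, var_forecast, var_obs):
    """Blend a model forecast with an observation, weighting each by
    the inverse of its error variance."""
    gain = var_forecast / (var_forecast + var_obs)   # Kalman gain
    analysis = forecast + gain * (obs - forecast)
    var_analysis = (1.0 - gain) * var_forecast       # reduced uncertainty
    return analysis, var_analysis

# A trusted observation (small error variance) pulls the analysis
# strongly toward the observed value and shrinks the uncertainty.
state, var = analysis_update(forecast=15.0, obs=14.0, var_forecast=4.0, var_obs=1.0)
```

In operational assimilation the state, gain, and covariances are high-dimensional fields rather than scalars, but the weighting principle is the same.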
Unlike measurements in some fields of science, few, if any, oceanographic measurements are direct. The quantity of interest is typically sensed electronically as a voltage drop, the number of frequency oscillations of a quartz crystal, the number of rotations of a rotor, or a count of some other sort. These counts must be converted to the geophysical quantity that is of interest by a hierarchy of transformations, some of which may be nonlinear or irreversible. These transformations are often empirically based and could benefit from improved statistical formulations. At each level of transformation, the output of the previous transformation becomes the input for analysis or for a higher level of transformation. This input is then generally referred to as “data” and is typically treated as if all previous levels of transformation have been done correctly. In this context, then, even the output of a numerical ocean model forced by wind fields derived from in situ or satellite observations can be, and sometimes is, referred to as “data” by an investigator interested in analyzing the model output to study ocean dynamics. An important element of these multiple levels of transformation is that it becomes progressively more difficult, and sometimes even impossible, to quantify uncertainties in the output product. Multiple levels of transformation are characteristic of all oceanographic data but are especially pronounced for satellite data. In an effort to distinguish between different types of “data,” NASA defined a hierarchy of data levels in the early 1980s (see, e.g., Arvidson et
al., 1986; Dutton, 1989). The same definitions have subsequently been used for in situ observations, although some definitions of data level are not appropriate for some types of in situ data. A summary of the data levels follows:

Level 0: Raw instrument data at original resolution, time ordered, with any duplicates removed. For satellite observations, this level of data consists of the bits (possibly compressed for transmission) telemetered from the satellite to a ground receiving station, corrected for any telemetry errors. For in situ observations, this level of data might consist of volts or counts of some other type. Level-0 data are sometimes referred to as experimental data.

Level 1A: Reformatted or reversibly transformed level-0 data, located to a coordinate system (e.g., time, latitude, longitude, depth) and packaged with needed ancillary, engineering, and auxiliary data. Instrument counts from level-0 data have been converted to engineering units in level-1A data. In the case of in situ data, level-0 and level-1A may be the same.

Level 1B: Irreversibly transformed values of the instrument measurements. For satellite observations, this might consist of calibrated microwave antenna temperatures, infrared or visible radiances, or microwave normalized radar cross sections. For in situ observations, this level of data is typically the geophysical parameter of interest. In some cases, the data might be resampled to a new grid.

Level 2: Geophysical parameters at the measurement time and location. For satellite observations, level-2 data are obtained from a model function (typically derived empirically from some statistical analysis) applied to the level-1B data. For in situ observations, level-2 data may be the level-1B geophysical parameters corrected for any systematic errors or calibration adjustments (typically determined empirically from some statistical analysis).
Level 3: Geophysical parameters resampled onto a regularly spaced spatial, temporal, or space-time grid by some sort of averaging or interpolation.

Level 4 and above: No set definitions, but these generally refer to higher-level processing. An example would be a map of some statistical quantity such as the mean value or standard deviation of a lower-level data quantity. Another example would be higher-level wind fields derived from gridded fields of surface wind velocity (e.g., wind stress or the curl of the wind stress, both of which are used for studies of wind-forced ocean circulation). An extreme example is the output of a numerical ocean circulation model forced by wind fields derived from a level-3 wind product.
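The progression through these levels can be sketched for a single hypothetical measurement. The sensor, calibration coefficients, and grid spacing below are all invented for illustration; real processing chains are far more elaborate.

```python
# Sketch of one measurement moving through the data levels defined
# above. The sensor, calibration coefficients, and grid spacing are
# all hypothetical illustrations.

def counts_to_volts(counts):
    """Level 0 -> level 1A: reversible conversion of 12-bit ADC counts."""
    return counts * 5.0 / 4095.0

def volts_to_temperature(volts):
    """Level 1A -> level 1B: empirical calibration (a hypothetical
    linear fit), generally irreversible in practice."""
    return -2.0 + 10.0 * volts

def bin_average(depths, values, bin_m):
    """Level 2 -> level 3: average onto a regular depth grid."""
    bins = {}
    for z, v in zip(depths, values):
        bins.setdefault(int(z // bin_m), []).append(v)
    return {b: sum(vs) / len(vs) for b, vs in sorted(bins.items())}

# Raw counts at irregular depths -> calibrated temperatures -> 10 m bins
counts = [2048, 2100, 1900, 1800]
depths = [1.0, 4.0, 12.0, 18.0]
temps = [volts_to_temperature(counts_to_volts(c)) for c in counts]
gridded = bin_average(depths, temps, 10.0)
```

Note that after the calibration and binning steps, the original counts can no longer be recovered from the "data," which is why uncertainty becomes progressively harder to quantify at higher levels.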
Specific examples serve to clarify the need for multiple data-level definitions in oceanography. Virtually any oceanographic measurement could serve as an adequate example for this purpose. The following two examples (one a satellite measurement and the other an in situ measurement) were chosen rather arbitrarily:

• Example 1: Near-surface vector winds estimated by a satellite radar scatterometer. The basic quantity measured by the scatterometer is the power of the radar return. The measured return power is digitized, compressed, and telemetered to a ground receiving station along with a variety of necessary ancillary information (e.g., orbit altitude, satellite attitude, temperatures of the electronic components, and so on). The telemetry “data” are uncompressed and converted to engineering units “data” in ground-based processing. A quantity referred to as the normalized radar cross section (NRCS) is derived from the measured return power by normalizing by the power of the transmitted signal along with any necessary calibration adjustments determined from prelaunch calibration or from the ancillary information. Estimates of vector winds are constructed from NRCS “data” from two or more antenna look angles, collocated at approximately the same location on the sea surface. This requires both an empirically derived model function and a statistical method for solving the overdetermined problem of inverting the model function in a manner that is consistent with the noisy NRCS “data.” The result at this stage is individual vector wind “data” at the measurement locations.
Most oceanographic applications of scatterometer observations require gridded fields of vector winds or some higher-level wind product derived from Earth-located individual vector wind “data.” These fields are obtained by space-time averaging or interpolation and are generally referred to as “data” by investigators who analyze the wind fields or use them to force ocean circulation models.

• Example 2: Measurements of temperature and salinity by a conductivity-temperature-depth (CTD) profiler. A CTD (e.g., see p. 389 in Dickey, 1991) is lowered through the water column on a cable. Variations in voltage associated with changes in temperature and conductivity are measured at a high frequency from two separate sensors (a thermistor and a conductivity probe). These engineering unit “data” are converted to temperature and conductivity “data” through simple algorithms. The conductivity of seawater is a function of both temperature and salinity. Temperature effects are much greater than salinity effects and must therefore be removed from the conductivity measurements in order to estimate salinity. However, the response time of the thermistor is much longer than the response time of the conductivity probe because of the thermistor's thermal inertia. This difference in response time must be accounted for when using the thermistor measurements to remove the temperature component of conductivity variations. Salinity “data” compatible with the thermistor measurements are usually obtained by applying a low-pass filtering algorithm to effectively slow down the response of the conductivity probe. The resulting temperature and salinity “data” at closely spaced vertical intervals usually are then bin averaged and processed to reduce the data volume. It is also necessary to adjust the salinity and, to a lesser extent, the temperature estimates to account for periodic recalibrations of the two sensors.
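The response-matching step just described can be sketched with a simple first-order (exponential) recursive low-pass filter applied to the fast conductivity channel. The smoothing coefficient is an illustrative assumption; actual CTD processing uses filters tuned to the measured sensor response times.

```python
# Sketch of response-time matching: a first-order recursive low-pass
# filter applied to the fast conductivity channel so that it mimics
# the slower thermistor. The coefficient alpha is an illustrative
# assumption, not a real sensor characteristic.

def lowpass(samples, alpha=0.3):
    """First-order recursive filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out = [samples[0]]
    for x in samples[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out

# A step change in conductivity is smeared out over several samples,
# mimicking the slower response of the thermistor.
filtered = lowpass([0.0, 0.0, 1.0, 1.0, 1.0])
```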
The resulting vertical profiles of temperature and salinity “data” are useful for many oceanographic applications. Some applications require further processing of the temperature and salinity “data” to derive density, thereby yielding a vertical
profile of density. The density “data” may then be integrated vertically to estimate the so-called steric height of the sea surface (or any other isobaric surface) relative to an arbitrary reference level. Density profiles, steric height, and other higher-level “data” derived from the CTD temperature and salinity “data” are typically used to construct vertical sections or horizontal maps of the quantity of interest. These sections or maps are often referred to as “data” by investigators who analyze them or use them to force ocean circulation models or to verify ocean model output.

Because of the multiple scales characteristic of both spatial and temporal variability in the ocean, as discussed in Chapter 2, oceanographic data are commonly undersampled in several respects. One problem is aliasing, which arises as a consequence of practical considerations that often limit the sampling to spatial or temporal intervals longer than the shortest energetic space and time scales of variability of the quantity being measured. For example, time series constructed from satellite observations are limited by the time interval between repeated satellite orbits over a given location. As another example, temperature measurements from an instrument lowered through the water column are sampled discretely at a fixed rate that often does not adequately resolve variations on the vertical scales of millimeters to centimeters that are important to turbulent mixing. As a third example, lines of vertical profiles of temperature and salinity along hydrographic sections across an ocean basin are sometimes not sampled sufficiently often along the ship track to resolve the energetic 10- to 50-km mesoscale variability that is superimposed on the larger-scale 100- to 1000-km variability that may be the primary signal of interest.
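Aliasing from undersampling can be demonstrated in a few lines: a signal sampled below its Nyquist rate is indistinguishable from a lower-frequency signal. The frequencies and sampling rate below are arbitrary illustrations.

```python
# Sketch of aliasing: a 9 Hz cosine sampled at 10 Hz (below its
# Nyquist rate of 18 Hz) produces exactly the same samples as a
# 1 Hz cosine. Frequencies and rates are illustrative choices.
import math

def sample(freq_hz, rate_hz, n):
    """Sample cos(2*pi*freq*t) at the given sampling rate."""
    return [math.cos(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

fast = sample(9.0, 10.0, 20)   # under-sampled high-frequency signal
slow = sample(1.0, 10.0, 20)   # its low-frequency alias
```

From the samples alone, the energetic high-frequency variability is indistinguishable from (and contaminates) the low-frequency signal of interest, which is the essence of the sampling problem described above.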
The degree to which aliasing affects oceanographic data depends on the energy of the unresolved variability, be it of high frequency or short spatial scale, compared with the energy of the oceanographic signal of interest for the particular application of the data. Another common problem is the limited spatial or temporal resolution inherent in many oceanographic measurements because of limitations of the measurement process. For example, satellite data generally consist of instantaneous measurements effectively averaged over a relatively large spatial “footprint.” As another example, current meter measurements often consist of a time series of successive time averages at a fixed location. In some cases, the spatial or temporal averaging obscures signals in the quantity being measured that might be of interest for some studies. In other cases, time series may be uncomfortably short, important concomitant variables may not have been measured, or other factors may contaminate the records. For example, a change in instrumentation or recording sites can limit the amount of useful information contained in a data set. There may be gaps in the records, and the raw (level-0) data may not be readily accessible. Such processes often generate measurements that violate the assumptions of the simplest statistical theory; i.e., the data are typically not independent, not identically distributed, not stationary, non-Gaussian, or some combination of these. Especially problematic in this regard is serial dependence, which occurs at least to some extent in nearly all temporal oceanographic data. Collected data can also involve a sampling problem because of the fundamentally “red” spectral characteristics of ocean variability (i.e., the predominance of energy at the lowest frequencies). Most oceanographic data records are not long enough to resolve all of the
time scales of variability of the quantity of interest. This limits the frequency and wavenumber resolution of the measurements and the number of independent realizations of the physical process of interest. For example, the El Niño phenomenon that affects much of the ocean and the overlying atmosphere has a time scale of 3 to 5 years (cf. Ropelewski, 1992). Even a 30-year record (which is unusually long for physical oceanographic data) resolves only 6 to 10 realizations of this process, resulting in limited degrees of freedom for inferences about cause and effect (see, e.g., Davis, 1977; Chelton, 1983; Thiebaux and Zwiers, 1984; Barnett and Hasselmann, 1979).

An important example of unresolved variability is the secular trend of sea level rise (see, for instance, NRC, 1990b) associated with global warming (see also Baggeroer and Munk, 1992). The study of oceanic sea levels is further complicated by there being very few long data records and by the existence of other poorly understood signals in the data (for example, glacial rebound effects). The data also include long-period signals, such as the 18.6-year lunar tide. The processes responsible for changes in sea level need to be understood, especially in relation to possible global warming. If the oceans were to warm, thermal expansion of seawater would be reflected in increased sea levels, with obvious effects on human activity.

Coupled with the problem of limited record length is the problem that many oceanographic signals of interest are intermittent (i.e., non-stationary or non-homogeneous). For example, turbulent mixing in the ocean generally occurs in sudden bursts and spatially irregular patches. Another example is energetic wind events such as storms that vigorously force the ocean but occur only intermittently at a given location. As a consequence, it is difficult to characterize the statistics of ocean variability.
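The loss of degrees of freedom to serial dependence can be quantified with a standard rule of thumb. The AR(1)-based effective-sample-size formula below is a textbook approximation, not a result taken from this report, and the autocorrelation value is an illustrative assumption.

```python
# Sketch of how serial dependence reduces the effective number of
# independent observations. Uses the standard AR(1) rule of thumb
# n_eff = n * (1 - r) / (1 + r); the autocorrelation r = 0.6 is an
# illustrative assumption.

def effective_sample_size(n: int, lag1_autocorr: float) -> float:
    """Approximate effective degrees of freedom for an AR(1)-like series."""
    r = lag1_autocorr
    return n * (1.0 - r) / (1.0 + r)

# A 30-point record with strong persistence carries far fewer
# independent values than its nominal length suggests.
n_eff = effective_sample_size(30, 0.6)
```

Significance tests that assume 30 independent samples would therefore badly overstate the confidence of any inferred relationship.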
For some purposes, it is the intermittent events themselves that are of interest. In other applications, energetic intermittent events might be considered nuisances that skew the sample statistics (e.g., the mean value or variance) of interest. Techniques for analysis of non-Gaussian data (see Chapter 8) or estimation of robust statistics are therefore needed for many analyses of oceanographic data. These data provide the statistician and data analyst with many challenges. For example, work needs to be done on multivariate transfer functions, particularly with mixed spectra. Data such as these often contain large deterministic and periodic terms plus a non-deterministic part. This can cause serious problems of estimation. Short multivariate series for which the number of series is greater than the number of temporal observations pose a particular challenge because any standard estimate of the spectral matrix is singular. An example of this type of problem is spatial temperature series for which the assumption of spatial homogeneity is obviously not appropriate but for which, at least in some regions, spatial continuity might be reasonable. In many of these instances, estimates of uncertainty are inadequate or completely lacking.

Low Noise Is Good Noise

Oceanographic measurements often suffer from low signal-to-noise ratio, in some cases because the signal of interest has much smaller energy than other geophysical signals
in the data. For example, the sea level rise from global warming is much smaller than the energetic sea level variations of other oceanographic and non-oceanographic origin (see Chelton and Enfield, 1986). As another example, the visible radiances measured from a satellite for estimation of ocean chlorophyll concentration and investigation of the role of the ocean in the global carbon budget are dominated by atmospheric contamination from the scattering of sunlight from aerosol particles and atmospheric molecules; only about 20 percent of the measured radiances originate from the ocean (Gordon and Castano, 1987). A low signal-to-noise ratio may also arise because of the short record lengths typical of oceanographic data compared with the time scales of the signal of interest. Quantifying the signal-to-noise ratio and the auto- and cross-covariance functions of the signal and noise are important challenges in physical oceanography. A particularly difficult problem arises because low-frequency calibration drifts in the measuring devices are often as large in magnitude as the low-frequency signal of interest. For example, estimation of sea level rise from global warming is complicated by vertical crustal motion in the vicinity of many ocean tide gauges. As another example, estimation of low-frequency variations in bottom pressure is complicated by electronic drifts in the pressure gauge measurements. Because of the variety of sampling problems inherent in oceanographic data, the term “noise” is often used to refer to more than just the measurement error associated with inaccuracies in the observations.
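A minimal variance-based signal-to-noise estimate can be sketched as follows. It assumes the noise variance is known independently (e.g., from instrument specifications), which, as noted above, is itself often a difficult estimation problem; the sample values are invented.

```python
# Sketch of a variance-based signal-to-noise estimate. Assumes the
# noise variance is known independently (an optimistic assumption);
# the measurement values are illustrative.
import statistics

def snr(measurements, noise_variance):
    """Ratio of inferred signal variance to noise variance."""
    total_var = statistics.pvariance(measurements)
    signal_var = max(total_var - noise_variance, 0.0)
    return signal_var / noise_variance

ratio = snr([1.0, 3.0, 2.0, 4.0, 2.0, 3.0], noise_variance=0.5)
```

When the noise is serially correlated, as is typical of oceanographic records, this simple variance decomposition can badly mis-attribute energy between signal and noise, which is part of the challenge described in the text.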
Inadequately resolved contributions to a measurement from geophysical variability of the quantity of interest are generally referred to as “geophysical noise.” As discussed above, such unresolved geophysical variability can arise from use of a discrete sample interval (aliasing), from inherent spatial or temporal smoothing in the measurement (limited resolution), from finite record length (limited frequency or wavenumber resolution), from intermittency of energetic signals other than those of primary interest, or from low signal energy compared with the geophysical noise of other processes affecting the measured quantity. Although such geophysical noise is fundamentally different from that due to measurement errors, it has exactly the same effect as measurement errors from the point of view of data analyses. When there is a low signal-to-noise ratio, extraction of the signal of interest is especially difficult because typically the measurement noise and geophysical noise in the data are serially correlated.
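The robust statistics suggested earlier for intermittent, outlier-prone records can be sketched with the median and the median absolute deviation (MAD). The sample values are invented; the point is only that a single energetic burst drags the mean and variance while barely moving the robust estimates.

```python
# Sketch of robust location/scale estimates (median and median
# absolute deviation) for data containing energetic intermittent
# events that would skew the mean and variance. Sample values are
# illustrative.
import statistics

def mad(samples):
    """Median absolute deviation, a robust scale estimate."""
    m = statistics.median(samples)
    return statistics.median(abs(x - m) for x in samples)

# One storm-like burst barely moves the median but drags the mean:
quiet_plus_burst = [1.0, 2.0, 1.5, 2.5, 2.0, 50.0]
robust_center = statistics.median(quiet_plus_burst)
naive_center = statistics.fmean(quiet_plus_burst)
```

Whether the burst is signal or nuisance depends on the application, as the text notes; robust estimates simply make that choice explicit rather than letting rare events dominate the sample statistics.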