Below are the first 10 and last 10 pages of uncorrected machine-read text (when available) of this chapter, followed by the top 30 algorithmically extracted key phrases from the chapter as a whole.

Intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text on the opening pages of each chapter.
Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Do not use for reproduction, copying, pasting, or reading; exclusively for search engines.

OCR for page 87

v
Geostatistical Analysis of
Spatial Data
Noel Cressie
Iowa State University
5.1 Introcluction
AH data have (more or less) precise spatial and temporal labels associated
with them. That is, a measurement is obtained from a particular location
at a particular time, although that information may be lost by omission or
made less precise by aggregation. For most of this chapter, it is assumed
that only the data's spatial labels are important hence the term spatial
data.
As a discipline, spatial statistics has components of ah the classical areas
of statistics, such as design, statistical methods (including data analysis and
diagnostics), stochastic modeling, en cl statistical inference. Importantly, the
spatial labels form an integral part of a spatial statistical analysis. Geostatis-
tics is the area of spatial statistics that is concerned mostly with prediction
of unknown values at given locations (or of aggregations over given regions).
Typically, the prediction is based on univariate and bivariate distributions
of the spatial values, and these distributions (or appropriate moments of
them) are estimated from an initial analysis of the data.
The prefix "geo" in geostatistics originally implied statistics pertaining
to the earth (Matheron, 1963; see also Hart, 1954, who used the term difl:er-
ently from Matheron, in a geographical context). However, more recently,
geostatistics has been used to solve problems in agricultural engineering,
atmospheric science, ecology, forestry, hydrology, meteorology, remote sens-
ing, etc. Although it is Matheron's development of the area within mining
87

OCR for page 87

88
that is best known, a Soviet meteorologist, L. S. Gandin, independently de-
veloped a framework for inference that is virtually identical (Gandin, 1963~;
he chose the term "objective analysis" instead of "geostatistics."
Section 5.2 presents the basic ideas behind a geostatistical analysis, in-
cluding a brief discussion of splines and conditional simulation. The first
part of §5.3 gives several applications of geostatistics, and the second part
discusses recent advances and future directions.
5.2 Theory ant! Methods~of Geostatistics
Geostatistics is mostly concerned with spatial prediction, but there are other
important areas, such as mode! selection, effect of aggregation, and spatial
sampling design, that offer fruitful open problems; see §5.3.2. The emphasis
in this section will be on a spatial-prediction method known as kriging.
Matheron (1963) coined the term in honor of D. G. Krige, a South African
mining engineer (see Cressie, 1990, for an account of the origins of kriging).
5.2.1 The Variogram
First, a measure of the (second-order) spatial dependence exhibited by the
spatial data is needed. A model-based parameter (which is a function)
known as the variogram is defined here; its estimate provides such a measure.
Statisticians are used to dealing with the autocovariance function. It is
demonstrated here that the class of processes with a variogram contains the
class of processes with an autocovariance function, and that kriging can be
carried out on a wider class of processes than the one traditionally user! in
statistics.
Let {Z(s): s ~ D C R4) be a real-valued stochastic process defined on
a domain D of the a?-dimensional space R4, and suppose that differences of
variables lagged in-apart vary in a way that depends only on h. Specifically,
suppose
var(Z(s + h) - Z(s)) = 2'y(h)
for all s, s+ h ~ D; (5.1)
typically the spatial index s is two- or three-dimensional (i.e., d = 2 or
3). The quantity 2~), which is a function only of the difference between
the spatial locations s and s + h, has been called the variogram by Math-
eron (1963), although earlier appearances in the scientific literature can
be found. It has been called a structure function by YagIom (1957) in

OCR for page 87

89
probability and by Gandin (1963) in meteorology, and a mean-square`d dif-
ference by Jowett (1952) in time series. Kolmogorov (1941) introduced
it in physics to study the local structure of turbulence in a fluid. Nev-
ertheless, it has been Matheron's mining terminology that has persisted.
The varion:ram must satisfy the conditional negative semi-definiteness con-
~ v ~
~ ~ _ Ok Ok _ _ n .~_ _ ~ ~ ~ t~ =_·~_ ~ An__ _~ _1 1_
OlllOn, 4~-=1 ~j-=1 aiaj-~tSt—Sj) ~ U' IOr any nmle number OI SDallal 10-
~;~= ~ 1~. · ~ — 1
", .
k. ~ area rang Flora 1~. · ~ — 1
Cl,tllVl[O tOt · ~— B,.-~^J, ~1~ 1~11~111~1~ fat . ~—',...7k} satisfying
~'k_1 ai = 0. When 27(h) can be written as 27°(||h||)' for h ~ Rd. the
variogram is said to be isotropic.
Variogram models that depend only on a few parameters ~ can be used
as summaries of the spatial dependence and as an important component of
optimal linear prediction (kriging). Three basic isotropic models, given here
in terms of the semivariogram (half the variogram), are:
Linear mode! (valid in R3, ~ > T)
)(h; 0) = { cO + by h 76 0
where t} = (Combo)' cO > 0, be > 0;
Spherical mode} (valid in Ri, R2, and R3)
Ash; B) =
O h= 0
<`, co + Cst32 (~h~/aS) - 2 (~h~/aS)3] 0 < Ah < aS
~ co+Cs Ah > us
where ~ = (co,cS'aS), co > 0' cs > 0, as > 0;
Exponential model (valid in R4, ~ > 1)
Ash; l9) { CO + Ce t!—expel—~~ hit /ae )] h i O
where 6' = (cO,ce,ae), co > 0, ce > 0, ae > 0.
Another semivariogram mode! is the rational quadratic mode! Ovoid in
R4, ~ > I):
~ O h=0
7(h;~) = i co + i+~l~/ar h 7t 0
where ~ = (co~cr,a~), co ~ O. cr > 0, ar > 0

OCR for page 87

go
A semivariogram model that exhibits negative correlations caused by
periodicity of the process is the wave (or hole-effect) mode! (valid in Ri,
R2, and R3~:
~ O h=0
1~ co + Cu' ~ a'U siln(~lhil/aw ) h ~ 0
where ~ = (Co~cu~,atl~' co > 0, cu. > 0, au, > 0.
A further condition that a variogram mode] must satisfy is (Matheron,
1971)
2~(h)/~h~2 , O as Ah ~ oo.
In fact, the power semivariogram model,
7~; ~ { co+bp~h~> h 7{ 0
where 6, = (cO, bp, A), co > 0, bp > 0, 0 < ~ < 2,
is a valid semivariogram mode] in R4, ~ > 1.
When the process Z is anisotropic (i.e., dependence between Z(s) and
Z(s + h) is a function of both the magnitude and the direction of h)' the
variogram is no longer purely a function of distance between two spatial
locations. Anisotropies are caused by the underlying physical process evolv-
ing differentially in space. Sometimes the anisotropy can be corrected by a
linear transformation of the lag vector h. That is,
Ash) = 2'y°~Ah~), h ~ R3,
where A is a ~ x ~ matrix and 2~° is a function of only one variable.
Replacing (5.~) with the stronger assumption
cov(Z(s + h), Z(s)) = C(h) for all s, s + h ~ D (5.2)
and specifying the mean function to be constant, i.e.,
E(Z(s)) = ,u for all s ~ D,
(5.3)
defines the class of second-order (or wide-sense) stationary processes in D,
with covariance function Ct ). Time series analysts often assume (5.2) and
work with the quantity p(~)——C(~/C(O). Conditions (5.1) and (5.3) define

OCR for page 87

91
the class of intrinsically stationary processes, which is now shown to contain
the class of second-order stationary processes.
Assuming only (5.2),
Ache = C(O) - C(h),
(5.4)
that is, the semivariogram is related very simply to the covariance function.
An example of a process for which 2'y(~) exists but C(~) does not is a one-
dimensional standard Wiener process TWO: t > 0~. Here, 2^y~h) = the
(-oo < h < off), but cov(W(t), W(u)) = mint, u), which is not a function
of it—up. Thus, the class of intrinsically stationary processes strictly contains
the class of second-order stationary processes.
Now consider estimation of the variogram from data {Z(si): i = 1, . . ., n3.
Suppose these are observations on an intrinsically stationary process (i.e., a
process that satisfies (5.1) and (5.3~), taken at the n spatial locations {Si:
i = 1, . . ., n). Because of (5.3), var(Z(s + h)—Z(s)) = E(Z(s + h)—Z(s)~2.
Hence, the method-of-moments estimator of the variogram 2~(h) is
2~(h) _ ~ [Z(si) - Z(sj)42/~N(h)~, h ~ R4, (5.5)
N(h)
where the average in (5.5) is taken over N(h) = {(si,sj): s'—sj = h), and
~N(h)~ is the number of distinct elements in N(h). For irregularly spacer!
data, N(h) is usually modified to {(si, sj): si - sj ~ T(h)), where T(h) is a
tolerance region of Ret surrounding h. Other estimators, more robust than
(5.51, are given in Cressie and Hawkins (1980) and Cressie (1991, sec. 2.4~.
Parametric models, 2~;~), can be fit to the estimator (5.5) by various
means; as a compromise between efficiency and simplicity, Cressie (1985)
advocates minimizing a weighted sum of squares
~ , ~ ~
':- ~ 2~(h~k)) _ 1 ~ ~1\rth~k)
k ~ l 27(h~k);6~)
with respect to variogram mode] parameters B. The sequence hill, . . ., h(I{)
denotes the "lags" at which an estimator (5.5) was obtained, and which sat-
isfy range and replication conditions such as those given by Journel and
Huijbregts (1978, p. 194, eq. IIT.42~. Zimmerman and Zimmerman (1991)
summarize and compare several methods of variogram-parameter estima-
tion based on simulated Gaussian data. They find that the weighted-least-
squares approach usually performs well, and never does poorly, against
such competitors as maximum likelihood estimation (both ordinary and re-
stricted) and minimum norm quadratic unbiased estimation.

OCR for page 87

92
5.2.2 Kriging
For the purposes of this section, assume that the variogram is known; in
practice, variogram parameters are estimated from the spatial data. Suppose
it is desired to predict Z(So3 at some unsampled spatial location so using a
linear function of the data Z _ (Z(s13, . . ·, Z(s,,33':
n
Also) = ~ AiZ(si)
i=1
It is sensible to look for coefficients
uniformly unbiased and which minimize the mean-squared prediction error
E(Z(So)—Z(So)32. More generally, one could try to minimize E(~tZ(So)'p(z)~)
with respect to predictor p(Z3, where ]; is a loss function. For example, the
loss function proposed by Zellner (19863,
(5.6)
(ii: ~ = 1, . . . , n) for which (5.6) is
LtZ(so3' p(Z3] = b {expta(Z(so3 - p(Z33]—a(Z(so)—p(Z)3 - 1), ~ > 0,
allows overprediction to incur a different loss than underprediction. Mini-
mizing mean-squared prediction error results from using
[tZ(So), p(Z)] = 5tZ(So) _ p(Z)42, b > 0,
which is the squared-error loss function. In all that is to follow, squared-error
loss is used.
The uniform unbiasedness condition imposed on (5.63 is simply E(Z(So)3 =
,u = E(Z(So33' for all ~ ~ R. which is equivalent to
rt
~ Ai = 1.
i=1
(5.7)
Now, assuming (5.73, the mean-squared prediction error can be written in
two ways. If the process is second-order stationary'
n n n n
E(Z(so3 - ~ AiZ(Si332 = C(03 - 2 ~ Aic~si - so3 + ~ ~ Ai~jC(si—sj),
i=1 i=1 i=1 j=1
(5.8)
or' if the process is intrinsically stationary (a weaker assumption)'
n n n n
E(Z(so)—~ AiZ(si))2 = 2 ~ Ai7(si—so3—~ ~ Ai~j7(si - sj) . (5 9)
i=1 i=1 i=] j=l

OCR for page 87

93
Using differential calculus and the method of Lagrange multipliers, optimal
coefficients ~ = (~1,...,An)' can be found that minimize (5.9) subject to
(5.7~; they are
v
. ~ . .
~ = r-1 [A + (1 ~ irrl~.~)l]
and the minimized value of (5.9) (kriging variance) is
ak~sO) =,~r-i,,_ (i - 1'r-l~12
.
.
(5.10)
(5.11)
In (5.10) and (5.11), fly = [bust—sold. ..,~(s,,—so)]t, 1 = (1~.
is the n X n symmetric matrix with (i, j)th element nisi—Sj).
The kriging predictor given by (5.6) and (5.10) is appropriate if the
process Z contains no measurement error. If measurement error is present,
then a "noiseless version" of Z should be predicted; Cressie (1988) has details
on when and how this should be implemented.
Thus far, kriging has been derived under the assumption of a constant
mean. More realistically, assume
..,1)', and r
Z(s) = Its) + t(S), S ~ D,
(5.12)
where E(Z(s)) = Cost for s ~ D and [~) is a zer~mean, intrinsically station-
ary stochastic process with variers + h) - his)) = var(Z(s + h) - Z(s)) =
2~(h), h ~ R3. In (5.12) the "large-scale variation" p(~) and the "smallL-
scale variation" b(~) are modeled as deterministic and stochastic processes,
respectively, but with no unique way of identifying either of them. What
is one person's mean structure could be another person's correlation struc-
ture. Often this problem is resolved in a substantive application by relying
on scientific or habitual reasons for determining the mean structure.
Suppose ,u(s) = x(s)',`3, a linear combination of variables that could in-
clude trend-surface terms or other explanatory variables thought to influence
the behavior of the large-scare variation. Thus,
p
Z(S) = ~ Xj(s)>j + {(S), s ~ D,
j=0
(5.13)
where,l3 _ (,Bo,...,,dp)' are unknown parameters and [(a) satisfies (5.1)
and (5.3) with zero mean. Although the mode! has changed, the problem
of predicting Z(so) using an unbiased linear predictor (5.6) remains. The
uniform unbiasedness condition is now equivalent to the condition
>'X = xO,
(5.14)

OCR for page 87

94
where xo _ (xo(So), · . ., xp(so))' and X is an n x (p+ 1) matrix whose (i, j)th
element is z;_l(si). Then, provided (5.7) is implied by (5.14), minimizing
the mean-squared prediction error subject to (5.14) yields the universal krig-
ing predictor
ZU(SO) = ~uZ
where
(5.15)
Ad = Or- + x(x~r-ix)-~`xo - x~r-~)~; (5.16)
the (universal) krig;~ng variance is
basso) = or- - ~x~r-~ - xO)~¢x~r-ix)-~¢x~r-~ - xO). (5.17)
Another way to write the equations (5.14) and (5.15) is
Z(so) = van + v2xO, (5.18)
where vat (an n x 1 vector) and v2 (a (p + 1) x 1 vector) solve
Ivy + Xv2 = Z
X'v~ = 0.
(5.19)
Equations (5.18) and (5.19) are known as the dual-kriging equations, since
the predictor is now expressed as a linear combination of the elements of
(~',xO). From (5.19), it is clear that splice smoothing is equivalent in form
to universal kriging (see Watson, 1984, where the relationship between the
two prediction techniques is reviewed). Kriging has the advantage that in
practice the data are first used to estimate the variogram, so adapting to the
quality and quantity of spatial dependence in the data. Furthermore, kriging
produces a mean-squared prediction error, given by (5.17), that quantifies
the degree of uncertainty in the predictor. Cressie (1989b) presents these
two faces of spatial prediction along with 12 others, including disjunctive
kriging and inverse-distance-squared weighting.
5.2.3 Conditional Simulation of Spatial Data
Simulation of spatial data (Z(si): i = I, . . ., N) with given means {~(Si):
i = 1, . . ., N) and covariances (C(si, sj): 1 < i ~ j < N) can be carried out
in a number of ways, depending on the size of N and the sparseness of Rev,
the N x N symmetric matrix whose (i,j)th element is C(si,sj). One way

OCR for page 87

95
is to use the Cholesky decomposition EN = IN[N, where [N iS a lower-
triangular N x N matrix (e.g., Golub and Van loan, 1983, pp. 86-90~. Then
ZN—(Z(SI ), . . ., Z(SN)~' can be simulated by
ZN = AN + [NEN, (5.20)
where EN— (,u~s~),...,~(sN))', and EN is an N x 1 vector of simulated
independent and identically distributed random variables, each with zero
mean and unit variance. Other methods, including polynomial approx~ma-
tions, Fourier transforms, and turning bands, are presented and compared
in Cressie (1991, sec. 3.6~.
Now consider the simulation of values of {Z(s): s ~ D) conditional on
observed values Zn Call this conditionally simulated process {W(s): s ~
D), end suppose (V(s): s ~ D) is an unconditionally simulated process
with the same first and second moments as {Z(s): s ~ D). For example,
(5.20) might be used to simulate VN—- (V(SI),...,V(SN)~/, where N > n.
Consider conditional simulation at an arbitrary location so+ in D. Now
write
En en
En+1 = C' C(sr~+l, Sn+1 )
and notice that the two terms of the decomposition
Z(sn+~) = CnEn Zn + [Z(sn+l)—CnEn Zn] (5.21)
are uncorrelated. Hence, the conditional simulation
W(Sn+l ~ = C' En 1 Zn + [V(sn+l ~—C' En 1vn] ~ Sn+1 ~ D, (5.22)
has the same first two moments, unconditionally, as the process {Z(s): s ~
D) and W(si) = Z(si), i = 1,...,n. That is, unconditional simulation of
sample paths of V yields, through (5.22), conditionally simulated sample
paths of W.
It is apparent from (5.20) and (5.21) that when the ci's are Gaussian,
so too is the process {W(s): s ~ DI. However, this may not reflect the
reality of the conditional process when the original process (Z(s): s ~ D)
is "far from" Gaussian even though the first two moments match and the
two processes agree at the data locations. There is clearly a danger in using
conditional simulation uncritically.

OCR for page 87

96
5.3 Applications and Research Frontiers
A geostatistical analysis of spatial data has a "nonparametric" flavor to it,
in that inferences are based on properties of univariate and bivariate clis-
tributions of Z(s) and Ztu), which are estimated from the data. In other
words, assumptions are few, although often it helps to transform the data
so that they are Gaussian-like. In contrast, Markov-random field models,
or simultaneous spatial autoregressive models, have a very rigid structure
that is not so well adapted to problems of spatial prediction (kriging). Sec-
tion 5.3.1 shows how geostatistics has considerable flexibility in applications
across diverse scientific clisciplines.
5.3.1 Applications
.. . . . .
.. . . ... .
The strength of geostatistics over more classical statistical approaches is
that it recognizes spatial variability at both the "large scale" and the "small
scale," or in statistical parlance, it models both spatial trend and spatial
correlation. Trend-surface methods include only large-scale variation by
assuming independent errors.
Watson (19723 eloquently compares the trio
approaches and points out that most geological problems have a small-scare
variation, typically exhibiting strong positive correlation between data at
nearby spatial locations. The books by
David ~1977), Journel and Huijbregts
(1978), and Clark (1979) are all aimed at applications of geostatistics in the
mining industry.
The geostatistica] method has also found favor among soil scientists who
seek to map soil properties of a field from a small number of soil samples
at known locations throughout the field; soil pH in water, soil electrical
conductivity,-exchangeable potassium in the soil, and soil-water infiltration
are some of the variables that could be sampled and mapped.
Water erosion is of great concern to agriculturalists, since rich topsoil
can be carried away in runoff water. The greater the soil-water infiltration,
the less the runoff, resulting in less soil erosion and less stream pollution by
pesticides and fertilizers. Also, greater infiltration implies better soil struc-
ture, which is more conducive to crop growth. Cressie and Horton (1987)
describe how double-ring infiltrometers were placed at regular locations in a
field that had received four tillage treatments' moldboard, paraplow, chisel,
and no-till. From these data, the spatial relationships were characterized;
Gotway and Cressie (1990) used the resulting stochastic models to estimate
efficiently the tillage effects and to build a spatial analysis of variance table,

OCR for page 87

97
from which tilIage differences can be tested.
Kriging can be applied in geophysical problems that require accurate
mapping of the ocean floor. Data are slopes or depths and a variety of as-
sumptions are made about the large-scale and small-scale variations defined
by (5.12) (e.g., Shaw and Smith, 1987; Smith and Jordan, 1988; Gilbert,
1989; MaTinverno, 1989~. This area of investigation would benefit from geo-
statistical analyses that use the data initially to fit an appropriate variogram
mode} and then draw kriging maps based on the fitted variogram.
Applications of geostatistics abound in other areas, such as rainfall pre-
cipitation (e.g., Ord and Rees, 1979), atmospheric science (e.g., Thiebaux
and Pedder, 1987), acid deposition (e.g., Bilonick, 1985), and groundwater
flow (e.g., Clark et al., 1989~. Examples from groundwater flow and acid
deposition will now be used to illustrate the geostatistical method described
in §5.2.
Flow of Groundwater from a Proposed Nuclear Waste Site
In 1986 three high-level nuclear waste sites were proposed in the United
States (in Nevada, Texas, and Washington), thus prompting study of the
soil and water-bearing properties of their surrounding regions. The chosen
site will probably contain more than 6S,000 high-level waste canisters placed
about 30 feet apart in holes or trenches surrounded by salt, at a depth of
2,000 feet. However, leaks could occur, or the radioactive heat could cause
the tiny quantities of water in the salt to migrate toward the heat until
eventually each canister would be surrounded by about 6 gallons of water.
The chemical reaction of salt and water would create hydrochloric acid that
could slowly corrode the canisters. Eventually, the nuclear wastes could
reach the aquifer and sometime later contaminate the drinking water.
Therefore, the types of questions one might ask are: If a nuclear waste
site were to be designated for, say, Deaf Srn~th County, Texas, what are the
risk parameters for radionucTides contaminating the groundwater? Where
would they flow? How long would they take to get there? Here the direction-
of-flow question will be addresse~l; kriging will be used to draw ~ ~n~.i~1 anon
of potentiometric heads throughout the area of interest.
~ ~r~V^~^ -lot
Potentiometric heads in the West Texas/New Mexico region are shown
in Figure 5.1, and are given by Harper and Furr (1986~. They were measured
by driLing a narrow pipe into the aquifer and letting the water find its own
level in the pipe. Measurements are given in feet above sea level.
An anisotropic variogram model was fit to the data; in each of two

OCR for page 87

98
209.4 1 4
1S9.414
109.414
S9.414
9.~14
_
f ~
~se,1 L: _
.2611
. ~ ~J:
2891 ~ 2SS]
~ 134 - ~ ~
. 1 2~t
3510 - 1
~ ·~
-145.24 -85~24
~ t476
I'.~' ~ ~ t4O: - _
t364 htOORt
e ·~.
O _
teas ~or~cR
I'rt ~. I J. ~ ~
OLDHA" O it'' _
~ 22 - RANOALL Al
OEA, l - ITH t I I ~ · ~ I ~
r 255, t 2,5, BalSCO. |
BAILY
1 1 ~'
_
DALLA~t 5~AN = Oe~lL ·_E ~—"O - .
.102< 148.e l3' ~/—
IOC' ~
1030
CARSON e 1~
1408
aRA~
.~ct"
`RU t TRONG
_ 1437
.1, 13 '
~CELER
CASTRO SW15~1ER 1 2116 1 1 ~ ~rse 1 `_
_ .~8 1 2548 ~ 1 .___ ~ ~ 17"
35,S —
23S.
2t28 2S't6 2S44
coc~a.. 2739 1433 ·I.~.
er"
-4S.24 4.~. S4.~. 104.~' 154.~.
1828 1~ 106
IfLOYo 153C I ~ l,"
tcoc ~ 17'
1n94 ~ .~ - _~'
CROS.Y ~ st ~' 1588 |
~ 180 1 1
. , 1
FIGURE 5.~: Locations en c! levels of piezometric-head data in West
Texas/New Mex~co. Amarillo is located close to the "1527" well in Pot-
ter Country. Reprinted, by permission, from Cressie (1989a). Copyright
(~) l9S9 by American Statistical Association.
Orthogonal directions, values of ~ = (~, 82, 83) in
27

OCR for page 87

99
Doo f Snatch Co.
~41 _
31q2
2843
~44 _
_
946 -
616 -
'347 _
piezometric
heat (fit)
off
Buzz
la48 - ~ .
South ~ .
0.~ _
1
72. 0~
1
A;
144.~ 81;00 900 -63.00
-135 00
N ~—7f
FIGURE 5.2: Three dimensional view of krig~ng surface {Z(so): so ~ D),
from the northeast corner of D. Reprinted, by permission, from Cressie
(1989a). Copyright ~ 1989 by American Statistical Association.

OCR for page 87

100
Amarillo, Texas. However, Amarillans need not be concerned; a decision
was made in 1987 by the U. S. Congress to locate the nation's high-level
nuclear waste dump site in Nevada, probably at Yucca Mountain.
Acid Deposition and Network Design
It is generally accepted that an important factor in the relatively recent
increase of acid deposition is the emission of industrial by-products into
the atmosphere; the consequences for aquatic and terrestrial ecosystems
are potentially disastrous. Most fish populations in freshwater lakes are
very sensitive to changes in pH (ElFAC, 1969~. More fundamentally, such
changes could also adversely affect most other aquatic organisms and plants,
resulting in a disruption of the food chain. Acid deposition has also been
closely connected with forest decline (Pitelka and Raynal, 1989) in both
Europe and the United States.
In the United States, acid deposition results mainly from the atmospheric
alteration of sulfur and nitrogen air pollutants produced by industrial pro-
cesses, combustion, and transportation sources. Total acid deposition in-
cludes acid compounds in both wet and dry form. Dry deposition is the
removal of gaseous pollutants, aerosols, and large particles from the air by
direct contact with the earth (NAPAP, 1988~. Since dry deposition is ctif-
ficult to monitor, and attempts at any such monitoring are relatively new,
we focus on wet deposition here.
Wet deposition, or acid precipitation as it is commonly caned, is defined
as the hydrogen ion concentration in all forms of water that condenses from
the atmosphere and fans to the ground. Measurement of the total annual
amount of hydrogen ion is the end result of a very complicated process begin-
ning with the release of pollutants into the atmosphere. They might remain
there for up to several days and, depending on a variety of meteorological
conditions (e.g., cold fronts or wind currents), they may be transported large
distances. While in the atmosphere, the pollutants are chemically altered,
then redeposited on the ground via rain, snow, or fog.
A model for the spatial distribution of total yearly hydrogen ion (H+),
measured on the Utility Acid Precipitation Study Program (UAPSP) net-
work in 1982 and 1983, was developed by Cressie et al. (1990~. We present
their results for the 1982 data, including implications of the fitted model for
network design.
Figure 5.3 is a map of the eastern half of the United States, showing the
19 UAPSP monitoring sites. Their latitudes, longitudes, and annual acid

OCR for page 87

101
FIGURE 5..3: Monitoring sites of the UAPSP network for the years 1982 and
1983. The square denotes an optimally located additional site. Reprinted,
by permission, from Cressie et al. (1990~. Copyright ~ 1990 by Elsevier
Science Publishers, Physical Sciences and Engineering Division.
~ 2 (~)
5.83
4.29
2.74
,~
. _
\
_ _ _
E ~
\
_\
//~1O /
~32°
FIGURE 5.4: Median-polish surface obtained from the 1982 data. Units
on the vertical axes are in ,umoles H+/cm2. Reprinted, by permission, from
Cressie et al. (1990~. Copyright ~ 1990 by Elsevier Science Publishers,
Physical Sciences and Engineering Division.

OCR for page 87

102
depositions (in Emote H+/cm2) for 1982 (and 1983) are presented in Cressie
et al. (1990~. By grouping nearby sites, a 4 x 3 table of acid-deposition data
was constructed. The table was then median-polished (e.g., Emerson and
Hoaglin, 1983), from which a crude picture of the large-scale variation was
obtained; see Figure 5.4.
In the east-west direction there appears to be a positive linear trend,
reflecting higher acid-deposition levels in the east. However, in the north-
south direction, the trend is quadratic, with higher levels in the central
region and lower levels in the extreme north and extreme south.
The surface in Figure 5.4 was subtracted from the original data to ob-
tain residuals, Basil: i = 1,...,19~. Using great-arc distances to de-
fine distances between sites, an isotropic (robust) variogram estimator was
computed, to which a spherical variogram mode! was fit by weighs Ed least
squares (§5.2.1~. The fitted parameters were co = 0.608(,umoles H+/cm2~2,
cs = 2.041~`umoles H+/cm2~2, and Us = 361.210miles. Figure 5.5 gives a
graphical representation of the results.
Optimal spatial prediction (ordinary kriging) can be implemented on
the residual process b(~) through equations (5.6), (5.10), and (5.11~. A
predicted surface of acid-deposition levels can then be obtained by adding
back this kriging surface (of the residual process) to the median-polish sur-
face shown in Figure 5.4. This is called median-polish kriging by Cressie
(1986~. The mean-squared errors of prediction (median-polish-kriging vari-
ances) (~k(So): So ~ eastern United States) are given by (5.11) and will
now be used to choose the optimal location of a new site.
Let S—Use, . . ., sn) denote the existing network and let sp _ {Sn+] ~ . . .
sn+m) denote m > 2 potential new sites from which one will be chosen.
Define S+i _ S U {sit, i = n + 1, . . ., n + m to be augmented networks.
Then S+j is preferred if it predicts best the remaining m - 1 sites in So (on
the average).
Specifically, let Basso; S+i) denote the kriging variance for predicting the
acid-deposition level at so using the augmented network S+i, where i =
n + 1, . . ., n+ m. For illustration, define the objective function
72+m
V(sj) = Ad, Nisi; S+j)/(m—1), j = n + 1, . . ., n + m . (5.23)
i=n+1
Then the site in Sp that achieves min{V(s'): j = n + 1, . . ., n + m) will be
declared the optimal site to add. (Other criteria are considered in Cressie
et al., 1990.)

OCR for page 87

103
7
2 —
1
o
*
*
0 200 400 600
h
6300 1 000 1 200
FIGURE 5.5: Empirical variograms (robust) for median-polished residuals.
The superimposed dashed line indicates the weighted-least-squares fit. Units
on the vertical axes are in (,umoles H+/cm2~2; units on the horizontal axes
are in miles. Reprinted, by permission, from Cressie et al. (1990~. Copyright
~ 1990 by Elsevier Science Publishers, Physical Sciences and Engineering
Division.
Eleven potential sites (Minneapolis, Minnesota; Des Moines, Iowa; Jef-
ferson City, Missouri; Madison, Wisconsin; Springfield, Illinois; Altoona,
Pennsylvania; Charlottesville, Virginia; Charleston, West Virginia; Balti-
more, Maryland; Trenton, New Jersey; and Knoxville, Tennessee) were cho-
sen to improve geographic coverage of the existing network (of 19 sites).
From among these eleven sites, Baltimore (marked with a square on Fig-
ure 5.3) was chosen as the optimal site to add. Its associated average kriging
variance, given by (5.23), was 2.56(,umolesH+/cm2~2, compared to Min-
neapolis's 2.59 (the second smallest value); Charlottesville had the largest
value of 2.77.
5.3.2 Research Frontiers
Change of Support
The change-of-support problem remains a major challenge to geostatisti-
cians. Although data come as Z = (Z(s~),...,Z(sn))', inference may be
required for Z(B) —— ~~ {B Z~u)du. Kriging adapts very easily to ac-

OCR for page 87

104
commodate the change from point support so to block support B. For
example, in (5.10) and (5.11), fly is modified to ~(B) —— [~BjlB:(s~ -
U) flu, . . ., ~ iB 7(su—u) bull, and in (5.11) ak(~) has the extra term
~ iBiB 7(U—v) MU ~v. But in mining applications and emission com-
pliance, for example, the quantity of greatest interest is the conditional
distribution Pr(Z(B) ~ z ~ Z). Both disjunctive kriging (Matheron, 1976)
and indicator kriging (Journel, 1983) attempt to answer this question based
on bivariate distributional properties of the (possibly transformed) process.
The problem is important enough to pursue beyond these initial approaches.
Multivariate Spatial Data
Prediction of a value Z(sO) based on data Z and observations on other
processes is known as cohriging. The appropriate generalization of the vari-
ogram (5.1) is the cross variogram
var(Y(u)—Z(v)) = 2:yz~u, v),
(5.24)
where Y(u) and Z(v) are normalized to have the same units. CoLriging
equations for predicting Z(so) from Z and Y can be obtainer! in terms of
fizz, Gym, and Ye (Clark et al., 1989~. However, there is a dearth of models
for (5.24~; the basic requirement for a valid model is that its parameters can
be estimated from the partial realization (Z', Y') of the bivariate process.
Variogram Mode] Fitting and its Effect on Inferences
The variogram (5.1) has the property of conditional negative-definiteness.
Based on a nonparametric estimator 2~( ), say, current practice is to fit a
parametric mode! 2~; B], which is guaranteed to be conditionally neg;ative-
~ day ·, ~ . ~ . ~ ~ . · ~ . . ~ ~ ~ ~ ~ . ~ , ~ ~1
definite. Is there a way to find a nonparametr~c fit to 2 yL ) from the set ol all
conditionally negative-definite functions? If it can be found, its description is
not likely to be very parsimonious. Variogram-model choice should probably
balance the closeness of its fit to the data, with its predictive power. For
temporal data, Rissanen (1984, 1987) takes such an approach; however, his
being able to sequence the observations is important, since the accumulated
prediction errors form an integral part of his method. Development of a
spatial version is an area worth investigating. Now, having chosen a model
2~; B), what effect does the estimation of 49 have on inferences for Z(so)?
Zimmerman and Cressie (1991) have some initial results, but considerable
further research is needed to resolve this important problem.

OCR for page 87

105
Bibliography
[1] Bilonick, R. A., The space-time distribution of sulfate deposition in the
northeastern United States, Atmos. Environ. 17 (1985), 2513-2524.
[2] Clark, I., Practical Geostatistics, Applied Science Publishers, Essex,
England, 1979.
A] Clark, I., K. L. Basinger, and W. V. Harper, MUCK: A novel ap-
proach to co-kriging, pp. 473-493 in Proceedings of the Conference on
Geostatistical, Sensitivity, and Uncertainty Methods for Groundwater
Flow and Radionuclide Transport Modeling, B. E. Buxton, ea., Battelle
Press, Columbus, Ohio, 1989.
[4] Cressie, N., Fitting variogram models by weighted least squares, '7. Int.
Assoc. Math. Geol. 17 (1985), 563-586.
[5] Cressie, N., Kriging nonstationary data, A. Amer. Stat. Assoc. 81
(1986), 625-634.
[6] Cressie, N., Spatial prediction and ordinary kriging, Mathematical Ge-
ology 20 (1988), 405-421.
[7] Cressie, N., Geostatistics, Am. Stat. 43 (1989a), 197-202.
[8] Cressie, N., The many faces of spatial prediction, pp. 163-174, in Geo-
statistics, Vol. 1, M. Armstrong, ea., Kluwer, Dordrecht, l989b.
[9] Cressie, N., The origins of kriging, Mathematical Geology 22 (1990),
forthcoming.
[10] Cressie, N., Statistics for Spatial Data, John Wiley and Sons, New York,
forthcoming (1991~.
t11] Cressie, N., and D. M. Hawkins, Robust estimation of the variogram,
I, ]. Int. Assoc. Math. Geol. 12 (1980), 115-125.
t12] Cressie, N. A. C., and R. Horton, A robust/resistant spatial analysis of
soil-water infiltration, Water Resour. Res. 23 (1987), 911-917.

OCR for page 87

106
[13] Cressie, N., C. A. Gotway, and M. O. Grondona, Spatial prediction from
networks, Chemometrics and Intelligent Laboratory Systems 7 (1990),
251-271.
[14] David, M., Geostatistical Ore Reserve Estimation, Elsevier, Amster-
dam, 1977.
[15] Emerson, J. D., and D. C. Hoaglin, Analysis of two-way tables by medi-
ans, pp. 166-210, in Understanding Robust and Exploratory Data Anal-
ysis, D. C. Hoaglin, F. Mosteller, and J. W. Tukey, eds., John Wiley
and Sons, New York, 1983.
[16] European Iniand Fisheries Advisory Committee (ElFAC), Water qual-
ity criteria for European freshwater fish, Water Res. 3 (1969), 593-611.
[17] Gandin, L. S., Objective Analysis of Meterologicat Fields, Gidromete-
orologicheskoe ~z~atel'stvo (GIMIZ), Leningrad, 1963 (translated by
Israel Program for Scientific Translations, Jerusalem, 1965~.
[18] Gilbert, L. E., Are topographic data sets fractal?, Pure and Appl. Geo-
phys. 131 (1989), 241-254.
[19] Golub, G. H., and C. F. Van Loan, Matrix Computations, The Johns
Hopkins University Press, Baltimore, 1983.
t20] Gotway, C. A., and N. Cressie, A spatial analysis of variance applied
to soil-water infiltration, Water Resour. Res. 26 (1990), forthcoming.
[21] Harper, W. V., and J. M. Furr, Geostatistical Analysis of Potentiomet-
ric Data in the Wolfcamp Aquifer of the Palo Duro Basin, Texas, Tech-
nical report EMI/ONWI-587, Battelle Memorial Institute, Columbus,
Ohio, 1986.
t22] Hart, J. F., Central tendency in areal distributions, Economic Geogra-
phy 30 (1954), 48-59.
t23] Journel, A. G., Nonparametric estimation of spatial distributions,
IT. Int. Assoc. Math. Geol. 15 (1983), 445-468.
[24] Journel, A. G., and C. J. Huijbregts, Mining Geostatistics, Academic
Press, London, 1978.
t25] Jowett, G. H., The accuracy of systematic sampling from conveyer belts,
Appl. Stat. 1 (1952), 50-59.

OCR for page 87

107
t26] Kolmogorov, A. N., The local structure of turbulence in an incom-
pressible fluid at very large Reynolds numbers, Doklady Akademii Nauk
SSSR 30 (1941), 301-305. Reprinted in Turbulence: Classic Papers on
Statistical Theory, S. K. FriedIander and L. Topping, eds., Interscience,
New York, 1961, 151-155.
[27] Malinverno, A., Testing linear models of sea-floor topography, Pure and
Appl. Geophys. 131 (1989), 139-155.
[28] Matheron, G., Principles of geostatistics, Econ. Geol. 58 (1963), 1246-
1266.
[29] Matheron, G., The Theory of Regionalized Variables and its Ap-
plications, Cahiers du Centre de Morphologie Mathematique 5,
Fontainebleau, France, 1971.
t30] Matheron, G., A simple substitute for conditional expectation: The
disjunctive kriging, pp. 221-236, in Advanced Geostatistics in the Min-
ing Industry, M. Guarascio, M. David, and C. Huijbregts, eds., Reidel,
Dordrecht, 1976.
(31] National Acid Precipitation Assessment Program (NAPAP), Interim
Assessment, the Causes and Effects of Acidic Deposition, vols. T. r~,
Ill, IV, U. S. Government Printing Office, Washington, D.C., 1988.
t32] Ord, J. K., an<1 M. Rees, Spatial processes: Recent developments with
applications to hydrology, pp. 95-118 in The Mathematics of lIydrology
and Water Resources, E. H. Lloyd, T. O'Donnell, and J. C. Wilkinson,
eds., Academic Press, London, 1979.
t33] Pitelka, L. F., and D. J. Raynal, Forest decline and acid deposition,
Ecology 70 (1989), 2-10.
[34] Rissanen, J., Universal coding, information, prediction, and estimation,
IEEE Trans. Inf. Theory 30 (1984), 629-636.
[35] Rissanen, J., Stochastic complexity, ~7. R. Stat. Soc. B 49 (19S7), 223-
239.
[36] Shaw, P. R., and D. K. Smith, Statistical methods for describing
seafloor topography, Geophys. Res. I:,ett. 14 (1987), 1061-1064.

OCR for page 87

108
[37] Smith, D. K., and T. H. Jordan, Seamount statistics in the Pacific
Ocean, J. Geophys. Res. 93 (1988), 2899-2918.
[38] Thiebaux, H. J., and M. A. Pedder, Spatial Objective Analysis with
Applications in Atmospheric Science, Academic Press, London, 1987.
[39] Watson, G. S., Trend surface analysis and spatial correlation, Geol. Soc.
Amer., Special Paper 146 (1972), 39-46.
[40] Watson, G. S., Smoothing and interpolation by kriging and with splines,
]. Int. Assoc. Math. Geol. 16 (1984), 601-615.
[41] YagIom, A. M., Some classes of random fields in n-dimensionaI space,
related to stationary random processes, Theory Probab. Its Appl. 2
(1957), 273-320.
[42] ZeDner, A., Bayesian estimation and prediction using asymmetric Toss
functions, '7. Amer. Stat. Assoc. 81 (1986), 446-451.
t43] Zimmerman, D. L., and N. Cressie, Mean squared prediction error in
the spatial linear mode! with estimated covariance parameters, Ann.
Inst. Stat. Math. 43 (1991), forthcoming.
[44] Zimmerman-, D. L., and M. B. Zimmerman, A Monte CarIo comparison
of spatial semivariogram estimators and corresponding ordinary kriging
predictors, Technometrics 33 (1991), forthcoming.