Appendix B
Recent Research Accomplishments
and Related Opportunities
This appendix describes research achievements in the mathematical
sciences and outlines prospects and opportunities that these achieve-
ments open up. The richness and diversity of these achievements,
which span core mathematics and a wide range of applications, point
to the vigor, creativity, and depth and breadth of current research in
the mathematical sciences. The unification and cross-fertilization of
areas within core mathematics, increased reaching out to applications
(which often uncovers unusual and unexpected uses of mathematics),
and the growing role of the computer are all themes that are illus-
trated in these descriptions.
It should be emphasized that this list is only a selection that is not
intended to be complete or comprehensive, nor is it intended to be an
agenda for the future. Many important achievements and opportuni-
ties are not discussed for lack of space. If past patterns continue, a
number of new achievements that we cannot visualize now will open
up yet newer opportunities. It is interesting and significant to note
how many of the achievements described in this appendix were not
even suggested in the appendix "Ordering the Universe: The Role of
Mathematics" in the 1984 Report.
The committee would like to thank the following individuals for their
assistance in preparing this appendix:
W. Ballhaus, N. Breslow, R.L. Bryant, D.M. Burns, S. Childress, W.
Cleveland, R.R. Coifman, G.B. Dantzig, H. Flaschka, J. Geanakoplos,
J.G. Glimm, L. Gordon, L.F. Greengard, J. Harris, T. Holst, W-C. Hsiang,
A.M. Jaffe, A. Jameson, N. Jewell, D.S. Johnson, R.M. Karp, H. Kocak,
A. Kupiainen, H.B. Lawson, F.T. Leighton, C.E. Leith, G.L. Lieber-
man, A. Majda, A. Marden, B. Mazur, W. Murray, F.M. Odeh, C.S.
Peskin, P. Seymour, L.A. Shepp, T.C. Spencer, P. Switzer, M.S. Water-
man, S. Winograd, and J.A. Yorke.
The following topics are discussed:
1. Recent Advances in Partial Differential Equations
2. Vortices in Fluid Flow
3. Aircraft Design
4. Physiology
5. Medical Scanning Techniques
6. Global Change
7. Chaotic Dynamics
8. Wavelet Analysis
9. Number Theory
10. Topology
11. Symplectic Geometry
12. Noncommutative Geometry
13. Computer Visualization as a Mathematical Tool
14. Lie Algebras and Phase Transitions
15. String Theory
16. Interacting Particle Systems
17. Spatial Statistics
18. Statistical Methods for Quality and Productivity
19. Graph Minors
20. Mathematical Economics
21. Parallel Algorithms and Architectures
22. Randomized Algorithms
23. The Fast Multipole Algorithm
24. Interior Point Methods for Linear Programming
25. Stochastic Linear Programming
26. Applications of Statistics to DNA Structure
27. Biostatistics and Epidemiology
SYNOPSIS OF TOPICS
In the first section, "Recent Advances in Partial Differential Equa-
tions," the items discussed are formation of shocks in nonlinear waves,
recent advances in elliptic equations, free boundary problems, and
finally some remarkable advances in exactly solvable partial differen-
tial equations. "Vortices in Fluid Flow" (Section 2) continues some of
these general themes to discuss vortex motion in fluid flow, a phe-
nomenon of great importance in many applications, including the
accurate tracing of hurricanes, the study of blood flow through the
heart, the efficient mixing of fuel in internal combustion engines, air-
craft flight, and the manner in which radiotelescopes sense distant
galaxies through the motion of galactic jets.
"Aircraft Design" (Section 3) illustrates the use of computational fluid
dynamics, a technique that has matured so that it is now seen as the
primary aerodynamic design tool for any problem. Analogous com-
puter models are described in Section 4, "Physiology," which dis-
cusses computational fluid models of the heart and other organs.
Modern medical scanning techniques using X-rays or nuclear mag-
netic resonance depend critically on algorithms deriving from the
mathematics of the Radon transform. Recent progress in emission
tomography is based on some newly developed algorithms of a very
different sort that have probabilistic elements; these developments are
described in Section 5, "Medical Scanning Techniques." Finally, "Global
Change" (Section 6) discusses the key role played by computational
fluid dynamics in global circulation models that are used in the analy-
sis of climate change on a worldwide scale.
Section 7, "Chaotic Dynamics," shows how ideas of Poincaré on aperi-
odic orbits for ordinary differential equations, complemented by ideas
from topology, differential geometry, number theory, measure theory,
and ergodic theory, plus the ability of modern computing facilities to
compute trajectories, have led to a body of core mathematics that has
many interesting and important applications.
"Wavelet Analysis" (Section 8) outlines how classical ideas growing
out of Littlewood-Paley and Calderón-Zygmund theory have been
developed within core mathematics and then have led to new and
very efficient numerical tools for analysis of a wide variety of prob-
lems. Algorithms based on wavelet analysis promise to significantly
speed up communications and signal-processing calculations. The
discussion titled "Number Theory" (Section 9) centers on a classic
area of core mathematics that is actively and vigorously moving for-
ward, spurred on in part by the resolution of the Mordell conjecture in
the early 1980s. Advances of great significance for the future include
new results on the local-to-global problem in number theory and in
arithmetic algebraic geometry, and significant progress on Fermat's
last theorem. Section 10, "Topology," notes important advances in
major problems in low-dimensional topology, including remarkable
connections with Yang-Mills theory, and recent advances in knot the-
ory that involve a striking and unexpected connection with von Neu-
mann algebras and mathematical physics.
Section 11, "Symplectic Geometry," is devoted to important recent
developments in that field, including the use of nonlinear elliptic
equations to establish a form of the Heisenberg uncertainty principle,
the discovery of new kinds of symplectic structures, and a basic ad-
vance in the understanding of regions of stability for area-preserving
maps of the plane.
"Noncommutative Geometry" (Section 12) describes a broad spectrum
of very interesting developments involving a link between analysis
and geometry and how the ideas of differential geometry extend to a
noncommutative setting. This is an excellent example of cross-fertili-
zation between areas within core mathematics and the building of an
internal unification.
The availability of powerful computers is stimulating research in core
mathematics. Section 13, "Computer Visualization as a Mathematical
Tool," indicates how computer graphics can be used as a tool in mini-
mal surface theory and other areas of geometry to enhance under-
standing and provide experimental evidence stimulating conjectures.
The exposition in Section 14, "Lie Algebras and Phase Transitions,"
displays the rich and deep interaction between this topic in statistical
mechanics and a number of areas of mathematics, including especially
the Kac-Moody Lie algebras. "String Theory" (Section 15) discusses
another topic in physics that relies heavily on developments from core
mathematics. Section 16, "Interacting Particle Systems," indicates how
systems similar to those discussed in the context of phase transitions
can have applications in the study of biological systems, in image
processing, and for medical and defense purposes. "Spatial Statistics"
(Section 17) describes an area that addresses some overlapping prob-
lems and uses new statistical tools for handling data in multidimen-
sional arrays. Section 18, "Statistical Methods for Quality and Produc-
tivity," discusses problems, opportunities, and new methods for ad-
dressing important problems of national interest and significance.
"Graph Minors" (Section 19) surveys some recent results in graph
theory, which open up new avenues for research especially important
in the design of algorithms. Section 20, "Mathematical Economics,"
describes some important recent developments and discusses how
several parts of core mathematics, especially differential topology,
have played key roles in the analysis of general equilibrium theory for
incomplete markets, a new departure that is a better model for real
markets than the now classic model for complete markets.
The next group of topics has as a common general theme the search
for new and efficient algorithms. "Parallel Algorithms and Architec-
tures" (Section 21) concerns the design of algorithms to take advan-
tage of parallel architectures, a problem not only for computer scien-
tists but also for mathematicians working in large-scale computation.
Here the idea is to see how problems that seem to be inherently se-
quential can be parallelized. Section 22, "Randomized Algorithms,"
describes recent progress in the development of these new kinds of
algorithms. Such algorithms are useful in primality testing, with re-
sulting consequences for cryptography, in sorting and searching algo-
rithms, in the design of distributed computing systems, and in many
other areas. The subject of Section 23, "The Fast Multipole Algo-
rithm," is a new, very efficient algorithm for computing interactions
in many-particle systems. This algorithm will have many applications
in the modeling of high-powered electron beam devices and in mo-
lecular dynamics, which affects theoretical studies of chemical kinet-
ics.
The next two sections discuss recent advances in algorithms for nu-
merical optimization; Section 24 is devoted to the new and very im-
portant interior point methods for linear programming, which pro-
vide an alternative to the classic simplex methods and are beginning
to have a significant practical impact in the design of telecommunica-
tions networks and the solution of large-scale logistics planning and
scheduling problems. Section 25 discusses yet another approach,
stochastic linear programming, a technique that allows one to include
non-deterministic elements in the formulation and solution of a prob-
lem. Thereby real problems that involve uncertainties in future be-
havior or availability of resources can be better modeled.
Sections 26 and 27 discuss an array of applications of mathematics in
various additional areas. "Applications of Statistics to DNA Struc-
ture" includes as an application the statistical analysis of options for
cutting the DNA sequence to aid in the mapping process, and ana-
lyzing the evolutionary process at the genetic level. "Biostatistics and
Epidemiology" is devoted to the use of statistics in epidemiology,
including survival analysis, analysis of incidence rate and relative
risk, and deconvolution techniques for estimating infection rates and
incubation periods from observed data.
1. RECENT ADVANCES IN PARTIAL DIFFERENTIAL
EQUATIONS
An important trend of the last 15 years has been the great progress
made in understanding nonlinear partial differential equations (PDEs).
Many physical phenomena are described by partial differential equa-
tions, e.g., fluid flow, electromagnetic fields, gravity, and heat. Roughly
speaking, linear partial differential equations govern small vibrations
or small disturbances from equilibrium, while nonlinear equations
govern large disturbances. The real world is nonlinear. Since the
mid-1970s, understanding of nonlinear equations has grown much
deeper. Finally, in the last few years, some of the most important
equations from geometry, physics, and engineering have been suc-
cessfully studied. Many other equations are still too hard, and much
more work is needed. Among the important problems solved recently
are the following.
Formation of Shocks in Nonlinear Waves
In one space dimension, a small, smooth initial disturbance will be
propagated by any truly nonlinear wave equation into a shock after a
finite time. In more than four space dimensions, such shocks do not
have to form. In three dimensions, "most" wave equations lead to
shocks, but only after an exponentially long time. Moreover, an
important class of equations (those satisfying a natural geometric
property called the "null condition") does not build up shocks. Very
recently there have been significant advances for one of the most
important nonlinear equations, Einstein's equations for gravitational
waves. At large distances and after a long time, one has a detailed
picture of how gravitational waves behave. Very difficult and inter-
esting questions remain in the study of Einstein's equations. Of spe-
cial interest is the formation of black holes.
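The mechanism behind shock formation can be illustrated by the inviscid Burgers equation, the simplest genuinely nonlinear wave equation (a standard textbook model, offered here for illustration rather than one of the equations discussed above):

```latex
% Inviscid Burgers equation, a model problem for shock formation:
\[
  u_t + u\,u_x = 0, \qquad u(x,0) = u_0(x).
\]
% Each value of u is carried along the characteristic line
% x(t) = x_0 + t\,u_0(x_0).  Where u_0' < 0, faster parts of the wave
% overtake slower ones, the characteristics cross, and the gradient
% blows up at the first crossing time
\[
  t^{*} \;=\; \frac{-1}{\min_{x_0} u_0'(x_0)},
\]
% at which instant a shock forms.
```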
Elliptic Equations
Another important class of partial differential equations arises in
geometry, when one tries to construct surfaces with prescribed curva-
ture. These equations are called nonlinear elliptic differential equa-
tions. A general theorem on regularity of solutions of elliptic equa-
tions with boundary conditions was recently proved, making it pos-
sible to treat boundary conditions that arise inevitably in real prob-
lems. This result is a basic step forward in the analysis of partial
differential equations.
Important progress has been made also on some singular elliptic
equations, namely those having "critical nonlinearity." If the nonlin-
ear term in such an equation were made slightly weaker, then the
equation could be regarded as a small perturbation of a linear prob-
lem, but at the critical nonlinearity this becomes impossible. An out-
standing equation of this kind occurs in the Yamabe problem, in which
one is asked to deform a curved manifold until it has constant (scalar)
curvature. A complete solution to this problem has recently been
established.
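In the standard formulation (notation assumed here), the Yamabe problem asks for a conformal metric of constant scalar curvature on a compact Riemannian manifold, which reduces to solving a single elliptic equation:

```latex
% Writing the conformal metric as g~ = u^{4/(n-2)} g on (M^n, g),
% with R_g the scalar curvature of g, the requirement that g~ have
% constant scalar curvature becomes
\[
  -\,\frac{4(n-1)}{n-2}\,\Delta_g u \;+\; R_g\,u
  \;=\; \lambda\,u^{\frac{n+2}{n-2}}, \qquad u > 0.
\]
% The exponent (n+2)/(n-2) is exactly the critical exponent of the
% Sobolev embedding, which is why the equation cannot be treated as a
% small perturbation of a linear problem.
```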
Free Boundary Problems
An iceberg melting in the sea, the flow of oil and water through a
reservoir, and crystal growth are examples of free boundary problems
governed by partial differential equations. For the melting iceberg,
the temperature flow in the iceberg is governed by one parabolic
partial differential equation, the temperature flow in the water around
the iceberg by another, and the boundary between ice and water is
given by a third equation. The three equations are coupled. What
makes the problem very hard is the fact that the domains where the
differential equations are satisfied keep changing with time, and are
not known ahead of time. Recently proposed techniques have already
led to new regularity theorems for free boundary problems and prom-
ise further results.
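The melting iceberg is an instance of the classical Stefan problem; in one common formulation (notation assumed here), the three coupled equations mentioned above take the following form:

```latex
% Heat conduction in each phase:
\[
  \partial_t T_s = \kappa_s\,\Delta T_s \quad\text{(ice)}, \qquad
  \partial_t T_l = \kappa_l\,\Delta T_l \quad\text{(water)},
\]
% with T_s = T_l = T_m (the melting temperature) on the free boundary,
% whose normal velocity V_n is set by the Stefan condition
\[
  \rho L\,V_n \;=\; k_s\,\frac{\partial T_s}{\partial n}
               \;-\; k_l\,\frac{\partial T_l}{\partial n},
\]
% where L is the latent heat.  The unknown boundary appears in the
% statement of the problem itself, which is what makes it "free."
```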
Exactly Solvable Partial Differential Equations
Remarkably, a number of nonlinear PDEs can be solved exactly. These
equations admit stable solutions (solitons) that persist even after inter-
action with other solitons. Recently, equations for solitons have been
used to solve the Schottky problem, an outstanding open problem in
the theory of Riemann surfaces.
The method used to solve soliton equations may be illustrated by the
case of the Korteweg-deVries (KdV) equation, which describes the
propagation of water waves in a long, narrow channel. At a single
instant of time, we imagine the shape of the water wave to be frozen
and rigid. We then bombard the rigid shape with imaginary quan-
tized test particles. By studying how the test particles are scattered,
one can reconstruct the shape of the wave. Thus, the scattering data
provide an alternative description of the wave at a fixed time. Instead
of asking how the shape of the wave changes with time, we can there-
fore ask how the scattering data evolve with time. When rewritten in
terms of scattering data, the KdV equation becomes amazingly simple,
and the complete solution may be written down by inspection. In
particular, the highly stable behavior of solitons is explained for the
case of the KdV equation.
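For reference, in one common normalization the KdV equation and its one-soliton solution read:

```latex
% Korteweg-deVries equation (one standard normalization):
\[
  u_t + 6\,u\,u_x + u_{xxx} = 0,
\]
% with the one-soliton solution of speed c > 0:
\[
  u(x,t) \;=\; \frac{c}{2}\,
  \operatorname{sech}^2\!\Bigl(\frac{\sqrt{c}}{2}\,(x - ct)\Bigr).
\]
% Taller solitons travel faster, and two solitons emerge from a
% collision with their shapes intact, shifted only in phase.
```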
More recently, a number of physically interesting PDEs have been
solved completely by analogous methods, including the Kadomtsev-
Petviashvili (K-P) equation for weakly two-dimensional water waves,
and the sine-Gordon and nonlinear Schrödinger equations. Explicit
solutions of the K-P equation successfully predicted the results of
experiments in water tanks, and a combination of theoretical and
numerical analysis has been applied to model the behavior of a Jo-
sephson junction. Remarkable connections have been discovered be-
tween explicit solutions for nonlinear waves, exact solutions of statis-
tical mechanics problems in two dimensions, and the Jones polynomi-
als for knots, some of which are discussed below in sections on phase
transitions and topology.
2. VORTICES IN FLUID FLOW
Intense swirling or vortex motion is a primary feature of many prob-
lems, including the accurate tracing of hurricanes and studies of blood
flow through the heart, efficient fuel mixing in carburetors, aircraft
flight, and the manner in which radiotelescopes sense distant galaxies
through the motion of galactic jets.
Governing much of this flow is a complicated set of nonlinear partial
differential equations called the Navier-Stokes equations; these equa-
tions are derived from Newton's laws of motion and include the fric-
tional effects of viscosity. Intuition suggests that this frictional effect
is extremely small in air or rapidly moving water, and this is con-
firmed by experiments. The simpler partial differential equations
obtained when this coefficient vanishes are called Euler equations.
These are accurate enough for studying the movement and accumula-
tion of vortices.
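For reference, the incompressible Navier-Stokes equations, and the Euler equations obtained when the viscosity coefficient vanishes, are:

```latex
% Incompressible Navier-Stokes (u velocity, p pressure, rho density,
% nu the kinematic viscosity):
\[
  \partial_t u + (u\cdot\nabla)u
  \;=\; -\,\frac{1}{\rho}\,\nabla p + \nu\,\Delta u,
  \qquad \nabla\cdot u = 0.
\]
% Setting nu = 0 gives the Euler equations, the model used for the
% vortex computations described in the text.
```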
Recent ingenious large-scale computer simulations using these equa-
tions reveal unexpectedly that the sheets where vorticity typically
accumulates clump and concentrate in an amazing fashion. In re-
sponse to these discoveries, a new mathematical theory of "oscilla-
tions and concentrations" has developed using potential theory and
fractal (Hausdorff) measures. New kinds of "solutions" for the Euler
equations are being introduced. One outgrowth of this theory is an
explicit criterion to check whether numerical calculations for vortex
sheets actually converge to solutions of the Euler equations. Conver-
gence has been verified for many calculations of importance.
The vortex sheets in the applications just described involve fluid moving
at rapid speed but still much less than the speed of sound. Com-
pletely different phenomena transpire when the vortex sheets are
supersonic, as they are for the new space planes and for galactic jets in
astrophysics. One recent success in the alliance between large-scale
computing and modern mathematical theory is the discovery of a new
mechanism of nonlinear instability for supersonic vortex sheets. Recent
large-scale simulations have demonstrated that all supersonic vortex
sheets exhibit nonlinear instability, belying the predictions of stability
made in the 1950s and 1960s.
One of the most important problems in fluid dynamics, an extension
of the study of vortices, is the understanding of turbulence, which
occurs when the frictional effect is extremely small but not negligible.
Understanding turbulence requires the mathematical analysis of solu-
tions of the Euler and the Navier-Stokes equations in the limit of small
viscosity. This analysis is ongoing.
3. AIRCRAFT DESIGN
Within the last five years, full simulations of a whole aircraft have
appeared. Such a computation usually starts with steady Euler equa-
tions that accurately describe the flow outside the boundary layer.
Such flows are smooth until the Mach number, M, comes close to 1.
For Mach numbers in the transonic range (that is, less than but close
to 1), small shocks are generated from a typical airfoil that dramati-
cally increase the drag. It is a mathematical theorem that in almost all
cases such shocks cannot be avoided. Since the cruising efficiency of a
plane is roughly proportional to ML/D, where L is lift and D is drag, it
is imperative for manufacturers to design aircraft that minimize shocks.
Of course if M exceeds 1, there is no avoiding or even minimizing
shocks, and we have the inefficiency of the Concorde. In the past 15
years, great effort has been put into designing two-dimensional airfoil
cross-sections that at some cruising speed or range of cruising speeds
with M less than 1 have minimal shocks. When a wing cross-section is
chosen, the flow at design conditions is computed and compared with
wind tunnel results.
To extend the computation to the whole aircraft, new computational
capabilities have been added. The complex geometrical configura-
tions demand new methods not only for discretizing the equations but
also for handling the enormous volume of data. Currently the chal-
lenge is to resolve higher-dimensional shocks and vortex sheets to
predict viscous effects as described in the previous section. The most
useful end product of simulations is a determination of how surface
pressure varies with such parameters as Mach number and the angle
of attack. Varying parameters on the computer is much more eco-
nomical than doing enormous numbers of experiments in a wind tunnel.
The simple model provided by the Euler equations is remarkably good,
airplane flight being basically stable. But key elements are missing.
Despite considerable effort, there is still no good mathematical model
for the turbulent boundary layer, and when one is found it will in-
crease the size of the computation at least as much as by adding a
dimension. An ultimate goal of design is to pick an optimal pressure
distribution and then find the aircraft shape that corresponds to it.
Such inverse problems also increase drastically the computational needs.
The hope is that computer hardware speedups and algorithmic im-
provements will combine to make these goals achievable.
One area of particular note is in the design of aircraft engines. A
typical example is a turbomachinery compressor simulation where
instantaneous temperature contours are calculated. This computation
is based upon time-dependent Navier-Stokes equations. Simulations
show viscous wakes created by the blades and how some of the blades
chop or break these wakes into different pieces, creating an extremely
complex flow pattern. This flow pattern would be difficult or impos-
sible to describe and adjust without dependable mathematical models
coupled with computer simulations.
For very-high-altitude high-speed conditions, numerical simulations
are also being used for vehicle design. At these altitudes, air can
dissociate into its atomic constituents and even eventually ionize,
creating a situation that is extremely difficult to simulate in ground-
based experimental facilities. As a result, numerical flow simulations,
with the appropriate air chemistry models added, are currently being
used as an integral part of the design process for many high-speed or
atmospheric entry vehicles.
The best summary of the situation has been given by Goldhammer and
Rubbert:
The present state-of-the-art has progressed to the point where the design
engineer no longer considers Computational Fluid Dynamics (CFD) to be an
irritant imposed on him by a seldom seen researcher, but rather CFD is re-
garded as the primary aerodynamic design tool for any problem, and the wind
tunnel is treated as more of an evaluation and confirmation tool.
4. PHYSIOLOGY
In the realm of physiology mathematical modeling has come into its
own over the last ten years. Today there are computational models of
the heart, the kidney, the pancreas, the ear, and many other organs.
Many of these models rely on fluid dynamics.
Physiological fluid dynamics has a long and illustrious history. Le-
onardo da Vinci first described the vortices that form behind heart
valves and that enable the valves to avoid backflow by closing while
the flow is still in the forward direction. Leonhard Euler first wrote
down the partial differential equations for blood flow in arteries. With
the recent flowering of computer technology and numerical algorithms,
there is unprecedented opportunity to simulate the fluid dynamic
functions of the human body at a level of detail sufficient to be of use
in the understanding and treatment of disease.
For instance, blood flow in the heart is governed by coupled equations
of motion of the muscular heart walls, the elastic heart valve leaflets,
and the blood that flows in the cardiac chambers. Computer solutions
allow one to study both normal and diseased states, and lead to the
design of prosthetic devices such as artificial valves and artificial
hearts. The methods used have a very general character since they are
applicable to any problem in which a fluid interacts with an elastic
medium of complicated geometry. Among these are the flow of sus-
pensions, blood clotting, wave propagation in the inner ear, blood
flow in arteries and veins, and airflow in the lung. Like much of
computational fluid dynamics, this work pushes computer technology
22. RANDOMIZED ALGORITHMS
Almost from the beginning of the computer era, random number
generators have been applied to the simulation of complex systems
involving queueing and other stochastic phenomena and to the esti-
mation of multidimensional integrals and other mathematical quanti-
ties, using various sophisticated sampling techniques known collec-
tively as the Monte Carlo method.
A major factor in drawing the attention of computer scientists to the
wider uses of randomization was the discovery, around 1975, of two
efficient randomized algorithms for checking whether a number is
prime. Each of these algorithms is based on the concept of a witness
to the compositeness of a number. A simple illustration of this con-
cept is based on a theorem due to Fermat, which says that if n is a
prime number, then, for any integer m that is not a multiple of n,
m^(n-1) - 1 is a multiple of n. If this calculation is performed for some
m, and one does not get the result predicted by Fermat's theorem, then
n is composite (i.e., not prime); in this case, m is called a witness to the
compositeness of n. The tests mentioned are based on slightly more
complicated kinds of witnesses. The effectiveness of these tests stems
from theorems that show that, if n is composite, then most of the
integers between 1 and n -1 will serve as witnesses. An interesting
aspect of these tests is that they do not provide witnesses for primal-
ity, but this weakness was rectified in work that defined witnesses for
primality rather than compositeness, showing that if n is prime, most
randomly chosen numbers will bear witness to that fact. There are
many other randomized algorithms based on the abundance of wit-
nesses.
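The witness idea can be sketched directly (an illustrative plain Fermat test, not the stronger witnesses the text alludes to; the function names here are ours):

```python
import random

def fermat_witness(m, n):
    """True if m witnesses the compositeness of n: by Fermat's theorem,
    if n is prime and m is not a multiple of n, then m^(n-1) - 1 is a
    multiple of n, so a failed congruence proves n composite."""
    return pow(m, n - 1, n) != 1

def probably_prime(n, trials=20):
    """Randomized compositeness test based on the abundance of witnesses.

    For most composites a large fraction of bases are witnesses, so a
    composite n is detected with overwhelming probability.  (Rare
    Carmichael numbers fool the plain Fermat test; the algorithms from
    around 1975 use slightly more complicated witnesses to close
    this gap.)"""
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        m = random.randrange(2, n - 1)
        if fermat_witness(m, n):
            return False          # m proves n composite
    return True                   # no witness found: n is probably prime
```

For example, the single base 2 already witnesses the compositeness of 91 = 7 × 13, while no base can witness against the prime 97.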
Randomized techniques have also proved to be a very effective tool
for algorithm construction in the areas of sorting, searching, and
computational geometry. A simple illustration is the problem of list-
ing all intersections among a set of line segments in the plane. There
is a fairly obvious incremental algorithm that considers the segments
one at a time and reports the intersections of each new segment with
all the previous ones. If the segments are read in a particularly unfor-
tunate order then the run time of this algorithm will be excessively
long; however, it can be shown that if the segments are processed in a
random order, then with extremely high probability the algorithm
will be very fast.
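The incremental algorithm just described can be sketched as follows (a bare-bones version, with names of our choosing, that keeps only a flat list of earlier segments; the fast variants replace that list with a search structure whose expected cost stays small under random insertion order, a refinement this sketch omits):

```python
import random

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def segments_cross(s, t):
    """True if segments s and t properly cross (general position assumed)."""
    a, b = s
    c, d = t
    return (orient(a, b, c) != orient(a, b, d)
            and orient(c, d, a) != orient(c, d, b))

def intersections(segments, rng=random):
    """Randomized incremental enumeration of all crossing pairs.

    Each new segment is tested against all earlier ones; processing the
    segments in random order is what makes the refined versions of this
    algorithm fast with high probability."""
    order = list(segments)
    rng.shuffle(order)
    found = []
    for i, s in enumerate(order):
        for t in order[:i]:
            if segments_cross(s, t):
                found.append((s, t))
    return found
```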
In addition, randomization plays a crucially important role in the
design of distributed computing systems, in which many geographi-
cally dispersed computers connected by suitable communication links
work together as a single system. Such systems must cope with the
possibility that individual computers or communication links may fail
or may run asynchronously at different speeds, and must ensure that
the overhead of communication between processors will not become
an insurmountable obstacle. Randomization is particularly effective
in allocating computational tasks to the individual processors and in
choosing the communication paths along which data shall flow. It can
be shown in a variety of settings that random allocation of tasks to
processors and data to memory modules, together with randomized
routing of messages, yields near-optimal performance with high proba-
bility.
All the applications of randomization that we have mentioned depend
on the assumption that algorithms, or computer programs, have ac-
cess to a stream of independent random bits. More commonly, com-
puters use pseudorandom numbers that are generated from an initial
number, called the seed, by some purely deterministic iterative proc-
ess. These generators are typically subjected to certain statistical tests
in order to confirm that the streams of numbers they generate have
some of the properties of random sequences, even though they are
generated by a purely deterministic process. Currently, a deep line of
research into the properties of pseudorandom number generators is
being pursued. The goal of this research is to show that, as long as the
seed is random, the output of the generator cannot be distinguished
from a purely random sequence by any polynomial-time computa-
tional test whatsoever.
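A seeded deterministic generator and one simple statistical test can be sketched like this (the constants are those used by many C libraries; both functions are purely illustrative, and neither is anywhere near the cryptographic strength that the research described here aims at):

```python
def lcg(seed, n, a=1103515245, c=12345, m=2**31):
    """Classic linear congruential generator x_{k+1} = (a*x_k + c) mod m:
    a purely deterministic stream produced from a single seed."""
    out = []
    x = seed
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out

def monobit_bias(stream, m=2**31):
    """Crude statistical test: the fraction of outputs above the
    midpoint, which should be close to 1/2 for a random-looking stream.

    Passing simple tests like this one is necessary but nowhere near
    sufficient for indistinguishability from a truly random sequence."""
    big = sum(1 for x in stream if x >= m // 2)
    return big / len(stream)
```

The same seed always reproduces the same stream, which is exactly the sense in which such generators are deterministic rather than random.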
Finally, recent theoretical research has focused on a connection be-
tween pseudorandom generators and the concept of a one-way func-
tion, which is fundamental in cryptography. A one-way function is a
function that is easy to compute but hard to invert. It has been shown
that any one-way function can be used to construct a rigorously justi-
fied pseudorandom number generator. Unfortunately, researchers in
computational complexity theory have not yet determined whether
one-way functions even exist. This is one of the many important
problems remaining to be addressed.
23. THE FAST MULTIPOLE ALGORITHM
There are great opportunities for progress in algorithms dealing with
problems such as particle beams in plasma physics, underwater acous-
tics, molecular modeling, and even very important aspects of aerody-
namics. A basic calculation of central importance in these applica-
tions is the calculating of interactions in a many-particle system. These
interactions are often long-range, so all pairs of particles must be
considered. Because of the latter constraint, the direct calculation of
all interactions requires on the order of N^2 operations in a system with
N particles. We will refer to this calculation as the N-body potential
problem.
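The direct computation is simple to write down (a two-dimensional sketch using 1/r interactions, with names of our choosing; in the text's terms this is the order-N^2 baseline that the fast multipole method reduces to order N):

```python
import math

def direct_potentials(positions, charges):
    """Direct N-body sum: the potential felt at particle i is
    sum over j != i of charges[j] / r_ij.

    Every pair is visited once, so the work is N(N-1)/2 interactions,
    i.e. on the order of N^2 -- the cost that PIC methods, hierarchical
    solvers, and the fast multipole method are designed to beat."""
    n = len(positions)
    phi = [0.0] * n
    for i in range(n):
        xi, yi = positions[i]
        for j in range(i + 1, n):
            xj, yj = positions[j]
            r = math.hypot(xi - xj, yi - yj)
            phi[i] += charges[j] / r
            phi[j] += charges[i] / r
    return phi
```

For instance, two unit charges a distance 2 apart each feel a potential of 1/2.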
There have been a number of efforts aimed at reducing the computa-
tional complexity of the N-body potential problem. The oldest ap-
proach is that of particle-in-cell (PIC) methods, requiring on the order
of N log(N) operations. Unfortunately, they also require a mesh that
provides limited resolution and is inefficient when the particle distri-
bution is nonuniform. A more recent approach is that of the hierarchi-
cal solvers, which are gridless methods for many-body simulations,
having computational complexities also estimated to be of order
N log(N).
The Fast Multipole Method (FMM), which has recently been devel-
oped, requires an amount of work only proportional to N to evaluate
all pairwise interactions to within roundoff error, irrespective of the
particle distribution. Like the hierarchical solvers, the FMM is a di-
vide-and-conquer algorithm, based on the idea of clustering particles
on a variety of spatial length scales. The method is in fact based on a
combination of physics (multipole expansions), mathematics (approxi-
mation theory), and computer science, and its use in applications is
growing.
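The clustering idea can be illustrated with the zeroth-order ("monopole") term of a multipole expansion: a distant cluster of charges is replaced by its total charge placed at the charge-weighted center. The sketch below, with invented data, is not the FMM itself, which carries the expansion to higher order and applies it hierarchically, but it shows why far-field interactions compress so well.

```python
import numpy as np

rng = np.random.default_rng(0)

# A cluster of charges near the origin and a few distant targets.
src = rng.uniform(-0.5, 0.5, size=(200, 2))
q = rng.uniform(0.5, 1.5, size=200)
targets = np.array([[10.0, 0.0], [8.0, 6.0]])

def direct(targets, src, q):
    # Direct O(N*M) evaluation of phi(t) = sum_j q_j / |t - x_j|.
    d = np.linalg.norm(targets[:, None, :] - src[None, :, :], axis=2)
    return (q / d).sum(axis=1)

def monopole(targets, src, q):
    # Replace the whole cluster by its total charge at the
    # charge-weighted center: the zeroth-order multipole term.
    center = (q[:, None] * src).sum(axis=0) / q.sum()
    return q.sum() / np.linalg.norm(targets - center, axis=1)

exact = direct(targets, src, q)
approx = monopole(targets, src, q)
# Relative error scales roughly like (cluster radius / distance)^2.
rel_err = np.abs(approx - exact) / exact
```

One number summarizes 200 charges for every far-away target; the FMM gains its O(N) complexity by doing this, with controlled accuracy, on every length scale at once.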
There are several immediate industrial applications for the techniques
being developed. The payoff should be substantial and almost imme-
diate in the straightforward use of particle simulations. Simulations
of this type are performed in the modeling of high-powered electron
beam microwave devices (e.g., gyrotrons and free-electron lasers),
particle beams, controlled fusion devices, and so forth.
A second immediate industrial application is in molecular dynamics,
a technique for studying the properties of fluids (and other phases of
matter) by computer simulation. Once initial positions and velocities
are chosen for some number of representative particles, their trajecto-
ries are followed by integrating Newton's second law of motion. In
early simulations, only nonpolar fluids were considered, where the
amount of computation per time step is proportional to the number of
particles N. In polar fluids, the situation is quite different. A coulom-
bic term is added to the potential function and all pairwise interac-
tions need to be accounted for, a standard N-body calculation. The
FMM allows for the simulation of much larger chemical systems than
was previously possible. The study of a protein molecule in water, for
example, requires following the motion of tens of thousands of atoms
over tens of thousands of time steps. Real gains are possible in the
long term, beginning with detailed theoretical studies of chemical
kinetics.
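The time stepping itself is typically done with a symplectic scheme such as velocity Verlet. The sketch below integrates Newton's second law for a single particle, using a simple harmonic force as a stand-in for the molecular force field; after one full period the particle should return to its starting state.

```python
import numpy as np

def velocity_verlet(x, v, force, mass, dt, steps):
    # Velocity-Verlet integration of Newton's second law m x'' = F(x).
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity half-step average
        a = a_new
    return x, v

# A unit-mass particle in a harmonic well F(x) = -x has period 2*pi.
x, v = velocity_verlet(np.array([1.0]), np.array([0.0]),
                       force=lambda x: -x, mass=1.0,
                       dt=0.001, steps=6283)
```

In a polar-fluid simulation the force evaluation inside the loop is precisely the N-body sum that the FMM accelerates.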
24. INTERIOR POINT METHODS FOR LINEAR PROGRAMMING
Many problems in resource allocation can be modeled by what is
called the "linear programming" problem, in which one attempts to
optimize a linear function over the vertices of a multidimensional
polytope. The traditional Simplex algorithm for this problem, which
works by traveling along the boundary of the polytope, has had immense
value and influence during the 40 years since it was discovered. It has
a significant theoretical drawback, however: its running time can, in
pathological cases, grow as an exponential function of the number of
variables. Much more desirable, at least from a theoretical point of
view, would be an algorithm with polynomially bounded worst-case
running time.
In 1976, the first such algorithm, the Ellipsoid method, was discov-
ered. Its running time was O(n^4L^2), where n is the number of variables
and L is a measure of the number of bits needed to describe the input.
This algorithm has the additional desirable property that it applies to
more general "convex programming." Moreover, it does not require a
complete description of the convex body over which optimization is to
take place, but merely a "black box" subroutine that, given a point,
tells whether that point is in the polytope, and if it is not, will identify
a hyperplane that separates the point from the polytope. The Ellip-
soid method is thus applicable to a much broader class of problems
than is the Simplex method, and its existence has led to a wide variety
of polynomial-time algorithms for previously open problems. For
linear programming, however, researchers quickly discovered that its
improved worst-case running-time bound did not correlate with bet-
ter performance in practice.
Polynomial-time linear programming algorithms thus seemed an im-
practical theoretical nicety. In 1984 this changed with the introduction of
a new breed of polynomial-time linear programming algorithm, based
on a clever variant of an old idea. The idea, one that had been aban-
doned long ago as impractical, is to cut across the interior of the
polytope in searching for an optimum vertex, rather than traversing
the outside as does Simplex. The difficulty in making such an "inte-
rior point" approach work lies in finding the right direction and dis-
tance to travel at each step. The solution involves the use of projective
transformations and a logarithmic potential function to guide the search,
and yields a running time of O(n^3.5L^2). The theoretical improvement
over the Ellipsoid method running time was not the main story, however;
more important, this algorithm (along with several of its variants)
appears to be competitive with Simplex when implemented with
appropriate sparse matrix techniques. Moreover, it appears to have
substantial running-time advantages for large and/or degenerate in-
stances. Indeed, important practical applications have been found in
the design of large telecommunication networks and in the solution of
large-scale logistics planning and scheduling problems.
Since the first reports of the potential practicality of the approach,
there has been a torrent of research on interior point methods. Rela-
tions between this algorithm and earlier algorithms have been exten-
sively explored. For instance, the algorithm can be viewed as a type of
"logarithmic barrier function" algorithm, or even as an application of
Newton's method (in an appropriately transformed space). In the
limit of infinitesimal step length, it generates a vector field inside the
polytope, all of whose limiting trajectories go to the optimal vertex. In
this context, it can be viewed as attempting to follow such a trajectory
approximately, by taking short steps along tangent lines. This in turn
suggests variants in which one steps along curves that represent higher-
order power series approximations to the vector field. Other variants
concentrate on approximately following a particular trajectory, the so-
called "central trajectory." These latter have led to better and better
worst-case running times, with the current champion having a run-
ning time of O(n^2.5L^2), based on a clever use of recent developments in
the field of fast matrix multiplication.
New algorithms and insights continue to pour forth at a rapid rate,
and although it seems unlikely that this will lead to polynomial-time
solutions for the much harder NP-complete problems, there is much
hope that the interior point approach will greatly extend the range of
problems for which useful answers can be determined.
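The flavor of an interior point iteration can be conveyed by the primal affine-scaling method, a simplified relative of the projective algorithm described above: rescale so the current interior point looks central, step in the projected descent direction, and stop short of the boundary. The toy problem and starting point below are invented for illustration.

```python
import numpy as np

def affine_scaling(A, b, c, x, iters=80, gamma=0.9):
    # Primal affine scaling for  min c.x  s.t.  A x = b,  x > 0,
    # starting from a strictly feasible interior point x.
    for _ in range(iters):
        D2 = np.diag(x * x)
        # Dual estimate w solves (A D^2 A^T) w = A D^2 c.
        w = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ c)
        r = c - A.T @ w            # reduced costs
        dx = -D2 @ r               # descent step in the scaled space
        neg = dx < 0
        if not neg.any():
            break                  # no blocking component
        # Step a fraction gamma of the way to the boundary x = 0.
        alpha = gamma * np.min(-x[neg] / dx[neg])
        x = x + alpha * dx
    return x

# Toy LP: maximize x1 + 2*x2 s.t. x1 + x2 <= 4, x1 + 3*x2 <= 6
# (slacks x3, x4 added); the optimal vertex is (3, 1), value 5.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])
x0 = np.array([1.0, 1.0, 2.0, 2.0])   # strictly feasible interior start
x = affine_scaling(A, b, c, x0)
```

The iterates stay strictly inside the polytope and converge toward the optimal vertex; the projective and central-trajectory variants differ mainly in how the search direction and step length are chosen.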
25. STOCHASTIC LINEAR PROGRAMMING
Deterministic models for linear programming problems and their so-
lutions, ever since their first appearance 40 years ago, have been keep-
ing pace with the extraordinary advances that have taken place in
computing power. At the present time, industry and government
routinely solve models in which there is no uncertainty; that is, they
solve deterministic linear and mathematical programs for planning
and scheduling purposes, some involving many thousands of vari-
ables with a linear or nonlinear objective function and many thou-
sands of inequality constraints. These models assume, for example,
that the future technologies available to choose from are known with
certainty. As a result, the solutions obtained from
deterministic models are incomplete because they do not properly
take account of future contingencies. Although it is easy to reformu-
late the mathematical models to include uncertainty, the resulting size
of the mathematical systems to be solved becomes too enormous in
most practical applications. The bottleneck to solving stochastic pro-
grams has been, and remains, computing capability.
Therefore, despite the progress made, there remains an unresolved
class of decision problems of great importance: that of finding an
"optimal" solution to stochastic mathematical and linear programs.
Stochastic here means uncertainty about, for example, the availability
of resources, foreign competition, or the effects of possible political
upheavals. Since such problems concern the optimal allocation of
scarce resources under uncertainty, it is of fundamental importance to
include uncertainty in the problem setup. If such problems could be
solved in general, it would significantly advance our ability to plan
and schedule complex situations.
At the present time there is intense activity taking place in the United
States and elsewhere to solve certain relevant classes of stochastic
linear and nonlinear programs. Important new developments in
computer hardware are spurring this effort, particularly the availabil-
ity of multiple vector processor mainframes and parallel processors.
It is hoped that the combination of three techniques will bring about
important advances in the relatively near future: improved ways to
mathematically decompose large-scale problems into smaller ones,
improved techniques of Monte Carlo (importance) sampling, and the
use of parallel processors.
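The Monte Carlo sampling idea can be illustrated on one of the smallest stochastic programs, the newsvendor problem, where the sample-average approximation can be checked against a known analytic optimum. All numbers below are synthetic.

```python
import numpy as np

# Choose an order quantity q before random demand D is revealed;
# cost(q) = c*q - p*E[min(q, D)]. The analytic optimum is the
# (1 - c/p)-quantile of the demand distribution.
rng = np.random.default_rng(1)
c, p = 1.0, 4.0
demand = rng.exponential(scale=10.0, size=200_000)  # Monte Carlo sample

def expected_cost(q, demand):
    # Sample-average approximation of the expected cost.
    return c * q - p * np.minimum(q, demand).mean()

grid = np.linspace(0.0, 50.0, 501)
costs = [expected_cost(q, demand) for q in grid]
q_saa = grid[int(np.argmin(costs))]

# Analytic optimum for Exp(10) demand: the (1 - c/p)-quantile.
q_star = -10.0 * np.log(c / p)
```

Replacing the expectation by a sample average is exactly the step that, in large-scale stochastic linear programs, must be combined with decomposition and parallelism to become tractable.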
26. APPLICATIONS OF STATISTICS TO DNA STRUCTURE
A strand of DNA can be represented as a string of bases, A,C,G,T, that
carries the information governing the growth and function of an or-
ganism. Great effort has therefore been expended in determining
DNA sequences. Rapid sequencing methods were introduced in 1976
and were followed by an explosion in quantitative genetic informa-
tion. Today over 25 million letters of sequence have been determined,
in segments averaging length 1000, from a wide variety of organisms.
Improvements in sequencing technology continue to be made, and the
associated discoveries in basic biology are staggering.
Two kinds of maps are constructed for DNA: genetic maps and physi-
cal maps. Both types are generated from the use of restriction en-
zymes that cut DNA at specific patterns in the sequence, producing
fragments whose lengths can be measured with some degree of inher-
ent experimental error. It was suggested in 1980 that slight variations
in DNA sequence could produce differing restriction fragment lengths
that could be used as "markers": traits that could be approximately
mapped to specific locations on specific chromosomes. The amount of
data subsequently available has created a number of new statistical
problems.
Physical maps give the relative locations of identifiable and clonable
pieces of DNA. Availability of a physical map facilitates the complete
sequencing of a DNA strand. Given the mapped locations of a com-
plete library of clones, each having a length on the order of several
tens of thousands of nucleotides, a number of laboratories in coordi-
nation could then proceed simultaneously to sequence the individual
clones. We can expect statistical analysis of design options, such as
number and choice of cutting enzymes, to yield benefits in the map-
ping process. The process of physically locating clones along the
genome should be substantially facilitated by an understanding of the
design parameters and sources of variation inherent in the process.
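As a small illustration of the underlying computation, the sketch below cuts a DNA string at each occurrence of a recognition pattern (GAATTC is the real recognition site of the enzyme EcoRI, cleaved between the G and the first A) and reports the fragment lengths a gel would measure. The sequence itself is invented.

```python
def fragment_lengths(seq, pattern="GAATTC", cut_offset=1):
    # Cut the sequence at every occurrence of the recognition pattern;
    # cut_offset gives the cleavage position within the pattern.
    cuts, start = [], 0
    while True:
        i = seq.find(pattern, start)
        if i == -1:
            break
        cuts.append(i + cut_offset)
        start = i + 1
    bounds = [0] + cuts + [len(seq)]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]

frags = fragment_lengths("ACGTGAATTCTTAGGGAATTCCA")
```

Reconstructing a map from observed fragment lengths, measured with error and without knowing the fragment order, is the inverse problem that creates the statistical design questions discussed above.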
Obtaining DNA sequence data is only a first step in modern molecular
biology. The sequence is next subjected to extensive analysis, to relate
it to what is already understood about DNA sequences. Because
evolution operates to preserve biological features of importance, in-
cluding sequence features, these analyses can be very important in
understanding the function of specific portions of the sequence. Use
of computers to implement complex algorithms is often required;
mathematical analysis of algorithms is essential, both to ensure an
unambiguous, informative interpretation of the results and to ensure
that a programmed algorithm will complete its operations rapidly
enough.
The study of molecular evolution has developed with the ability to
read DNA and protein sequences. It is just now becoming possible to
sample the sequence for a gene within a defined population. This
opens up many new questions. How does molecular evolution pro-
ceed, in the long term and in the short term? Constructing evolution-
ary trees and determining rates of evolution can both be accomplished
with stochastic process models of molecular evolution. Some of the
most central work goes into identifying protein coding regions or
genes in DNA. Locating a gene of 600 letters that is spread out in
small segments along 10,000 or 20,000 letters of DNA is a daunting but
essential task, requiring sophisticated combinatorial and statistical
analysis.
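A toy version of this search, handling only the contiguous case, scans each reading frame for an ATG start codon followed in frame by a stop codon. Real gene finding must also cope with the coding region being split into small segments, which is what demands the sophisticated combinatorial and statistical machinery described above.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    # Return (start, end) index pairs of maximal open reading frames:
    # an in-frame ATG followed by the first in-frame stop codon.
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOPS and start is not None:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

# Invented example sequence containing one short reading frame.
orfs = find_orfs("CCATGAAATTTTGACCC")
```

Statistical scoring (codon usage, base composition) is then layered on top of such scans to separate true genes from chance open reading frames.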
The science of molecular biology is currently undergoing rapid devel-
opment. The anticipated quantity of DNA and protein sequence data
makes it an area ripe for mathematical development. The nature of
the science makes it necessary that mathematical and biological scien-
tists closely communicate. Both sciences will surely benefit from such
collaborative effort.
27. BIOSTATISTICS AND EPIDEMIOLOGY
Epidemiology concerns the distribution and determinants of disease
in human populations. It encompasses such varied subjects as the
worldwide geographic variation in disease incidence rates, the setting
of radiation exposure standards in the workplace, and the evaluation
of vaccine efficacy using randomized field trials.
Two distinct study designs, cohort and case-control, are used for much
of the research in chronic disease epidemiology. In cohort studies,
exposures and covariables are measured on a defined population of
disease-free persons, who are then followed forward in time to deter-
mine which ones develop or die from the disease of interest. The
methods and concepts of survival analysis, particularly the propor-
tional hazards regression model, have greatly affected the statistical
treatment of cohort data. They provide a mathematically rigorous
framework for elaboration of the key epidemiologic notions of inci-
dence rate and relative risk. Older epidemiologic techniques are given
a new interpretation in terms of classical statistical theory, while the
way is paved for the development of more flexible and powerful methods
of data analysis.
Case-control studies involve samples of diseased and nondiseased
persons whose history of exposure is known. The demonstration that
the exposure odds ratio calculated from case-control data approxi-
mates the ratio of disease incidence rates among exposed and nonex-
posed was of paramount importance in establishing the scientific
validity of this design. More recently, biostatisticians and economet-
ricians independently have developed methods for the analysis of
case-control and other data where the sample is selected on the basis
of the outcome of primary interest. Further elaboration of this meth-
odology is needed to handle more general stratified designs and for
situations where only partial information is available for a large number
of the sampled subjects.
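The central calculation is elementary; the counts in the 2x2 table below are invented purely to illustrate the arithmetic and the standard (Woolf) confidence interval for the odds ratio, which approximates the relative risk when the disease is rare.

```python
import math

# Hypothetical case-control counts.
a, b = 90, 110   # cases: exposed, unexposed
c, d = 60, 140   # controls: exposed, unexposed

odds_ratio = (a * d) / (b * c)

# Woolf's method: an approximate 95% confidence interval from the
# standard error of the log odds ratio.
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lo = math.exp(math.log(odds_ratio) - 1.96 * se)
hi = math.exp(math.log(odds_ratio) + 1.96 * se)
```

Logistic regression generalizes this calculation to adjust for covariables and stratified designs.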
Additional work is needed also on methods of analysis of epidemiol-
ogic data with dependent outcomes, such as arise in longitudinal studies
with repeated measurements on the same person over time or in ge-
netic studies of the patterns of disease in families. Better techniques
are needed for assessing the magnitude of measurement errors and to
correct for the tendency of such errors to dilute the observed associa-
tion between exposure and disease. Recent advances in statistical
computing and in the statistical theory of quasi-likelihood analysis
based on generalized estimating equations promise rapid advances in
this field.
Finally, AIDS, the major epidemic of our time, poses urgent challenges
to the biostatistician and epidemiologist. One problem is to estimate
the past, present, and future rates of infection with HIV so as to
determine the future number of AIDS cases. Another is to understand
better the patterns of HIV transmission within and between various
pools of susceptible individuals so as to be able to plan the optimal
organization of community resources for the prevention of further
infection. The collection of data related to AIDS is seldom routine and
often suffers from a lack of key pieces of information, so that studies
are rarely amenable to straightforward analysis. Mathematically, the
problem of estimating HIV infection rates, using data on AIDS inci-
dence rates and the distribution of time from infection to diagnosis,
can be viewed as an unusual type of deconvolution problem. It is
related to problems that occur in the field of image processing, where
rapid progress has been made. But the lack or poor nature of key
types of data makes it much more formidable.
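In sketch form, back-calculation treats observed AIDS diagnoses as the convolution of past HIV infections with the incubation-time distribution, and recovers infections by deconvolution. With the exact synthetic data below, simple forward substitution recovers the truth; real data are noisy and incomplete, which is why the practical problem requires regularization and is so much more formidable.

```python
import numpy as np

# Synthetic incubation distribution and infection curve.
incubation = np.array([0.05, 0.10, 0.20, 0.30, 0.35])  # P(diagnosed k years after infection)
true_infections = np.array([10.0, 30.0, 60.0, 80.0, 70.0, 40.0])

def convolve_counts(infections, incubation):
    # Forward model: diagnoses[t] = sum_k infections[t-k] * incubation[k].
    return np.convolve(infections, incubation)[:len(infections)]

diagnoses = convolve_counts(true_infections, incubation)

# Naive deconvolution by forward substitution; exact here, but badly
# ill-conditioned once the observed counts carry noise.
est = np.zeros_like(true_infections)
for t in range(len(est)):
    tail = sum(est[t - k] * incubation[k]
               for k in range(1, min(t, len(incubation) - 1) + 1))
    est[t] = (diagnoses[t] - tail) / incubation[0]
```

The ill-conditioning of this inversion, amplified by uncertainty in the incubation distribution itself, is the mathematical heart of the difficulty described above.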
NOTE
1. Goldhammer, M. I., and Rubbert, P. E., "C.F.D. in design: An airframe perspective,"
Proceedings of the 27th Aerospace Sciences Meeting, January 9-12, 1989, Publication Num-
ber 89-0092 (American Institute of Aeronautics & Astronautics, Washington, D.C.).