Appendix B
Recent Research Accomplishments and Related Opportunities

This appendix describes research achievements in the mathematical sciences and outlines prospects and opportunities that these achievements open up. The richness and diversity of these achievements, which span core mathematics and a wide range of applications, point to the vigor, creativity, and depth and breadth of current research in the mathematical sciences. The unification and cross-fertilization of areas within core mathematics, increased reaching out to applications (which often uncovers unusual and unexpected uses of mathematics), and the growing role of the computer are all themes that are illustrated in these descriptions.

It should be emphasized that this list is only a selection that is not intended to be complete or comprehensive, nor is it intended to be an agenda for the future. Many important achievements and opportunities are not discussed for lack of space. If past patterns continue, a number of new achievements that we cannot visualize now will open up yet newer opportunities. It is interesting and significant to note how many of the achievements described in this appendix were not even suggested in the appendix "Ordering the Universe: The Role of Mathematics" in the 1984 Report.

The committee would like to thank the following individuals for their assistance in preparing this appendix: W. Ballhaus, N. Breslow, R.L. Bryant, D.M. Burns, S. Childress, W. Cleveland, R.R. Coifman, G.B. Dantzig, H. Flaschka, J. Geanakoplos, J.G. Glimm, L. Gordon, L.F. Greengard, J. Harris, T. Holst, W-C. Hsiang,
A.M. Jaffe, A. Jameson, N. Jewell, D.S. Johnson, R.M. Karp, H. Kocak, A. Kupiainen, H.B. Lawson, F.T. Leighton, C.E. Leith, G.L. Lieberman, A. Majda, A. Marden, B. Mazur, W. Murray, F.M. Odeh, C.S. Peskin, P. Seymour, L.A. Shepp, T.C. Spencer, P. Switzer, M.S. Waterman, S. Winograd, and J.A. Yorke.

The following topics are discussed:

1. Recent Advances in Partial Differential Equations
2. Vortices in Fluid Flow
3. Aircraft Design
4. Physiology
5. Medical Scanning Techniques
6. Global Change
7. Chaotic Dynamics
8. Wavelet Analysis
9. Number Theory
10. Topology
11. Symplectic Geometry
12. Noncommutative Geometry
13. Computer Visualization as a Mathematical Tool
14. Lie Algebras and Phase Transitions
15. String Theory
16. Interacting Particle Systems
17. Spatial Statistics
18. Statistical Methods for Quality and Productivity
19. Graph Minors
20. Mathematical Economics
21. Parallel Algorithms and Architectures
22. Randomized Algorithms
23. The Fast Multipole Algorithm
24. Interior Point Methods for Linear Programming
25. Stochastic Linear Programming
26. Applications of Statistics to DNA Structure
27. Biostatistics and Epidemiology

SYNOPSIS OF TOPICS

In the first section, "Recent Advances in Partial Differential Equations," the items discussed are formation of shocks in nonlinear waves, recent advances in elliptic equations, free boundary problems, and finally some remarkable advances in exactly solvable partial differential equations.
"Vortices in Fluid Flow" (Section 2) continues some of these general themes to discuss vortex motion in fluid flow, a phenomenon of great importance in many applications, including the accurate tracing of hurricanes, the study of blood flow through the heart, the efficient mixing of fuel in internal combustion engines, aircraft flight, and the manner in which radiotelescopes sense distant galaxies through the motion of galactic jets.

"Aircraft Design" (Section 3) illustrates the use of computational fluid dynamics, a technique that has matured so that it is now seen as the primary aerodynamic design tool for any problem. Analogous computer models are described in Section 4, "Physiology," which discusses computational fluid models of the heart and other organs. Modern medical scanning techniques using X-rays or nuclear magnetic resonance depend critically on algorithms deriving from the mathematics of the Radon transform. Recent progress in emission tomography is based on some newly developed algorithms of a very different sort that have probabilistic elements; these developments are described in Section 5, "Medical Scanning Techniques." Finally, "Global Change" (Section 6) discusses the key role played by computational fluid dynamics in global circulation models that are used in the analysis of climate change on a worldwide scale.

Section 7, "Chaotic Dynamics," shows how ideas of Poincaré on aperiodic orbits for ordinary differential equations, complemented by ideas from topology, differential geometry, number theory, measure theory, and ergodic theory, plus the ability of modern computing facilities to compute trajectories, have led to a body of core mathematics that has many interesting and important applications. "Wavelet Analysis" (Section 8) outlines how classical ideas growing out of Littlewood-Paley and Calderón-Zygmund theory have been developed within core mathematics and then have led to new and very efficient numerical tools for analysis of a wide variety of problems. Algorithms based on wavelet analysis promise to significantly speed up communications and signal-processing calculations.

The discussion titled "Number Theory" (Section 9) centers on a classic area of core mathematics that is actively and vigorously moving forward, spurred on in part by the resolution of the Mordell conjecture in the early 1980s. Advances of great significance for the future include new results on the local-to-global problem in number theory and in arithmetic algebraic geometry, and significant progress on Fermat's last theorem.
Section 10, "Topology," notes important advances in major problems in low-dimensional topology, including remarkable connections with Yang-Mills theory, and recent advances in knot theory that involve a striking and unexpected connection with von Neumann algebras and mathematical physics.

Section 11, "Symplectic Geometry," is devoted to important recent developments in that field, including the use of nonlinear elliptic equations to establish a form of the Heisenberg uncertainty principle, the discovery of new kinds of symplectic structures, and a basic advance in the understanding of regions of stability for area-preserving maps of the plane.

"Noncommutative Geometry" (Section 12) describes a broad spectrum of very interesting developments involving a link between analysis and geometry and how the ideas of differential geometry extend to a noncommutative setting. This is an excellent example of cross-fertilization between areas within core mathematics and the building of an internal unification.

The availability of powerful computers is stimulating research in core mathematics. Section 13, "Computer Visualization as a Mathematical Tool," indicates how computer graphics can be used as a tool in minimal surface theory and other areas of geometry to enhance understanding and provide experimental evidence stimulating conjectures. The exposition in Section 14, "Lie Algebras and Phase Transitions," displays the rich and deep interaction between this topic in statistical mechanics and a number of areas of mathematics, including especially the Kac-Moody Lie algebras. "String Theory" (Section 15) discusses another topic in physics that relies heavily on developments from core mathematics. Section 16, "Interacting Particle Systems," indicates how systems similar to those discussed in the context of phase transitions can have applications in the study of biological systems, image processing, and for medical and defense purposes. "Spatial Statistics" (Section 17) describes an area that addresses some overlapping problems and uses new statistical tools for handling data in multidimensional arrays. Section 18, "Statistical Methods for Quality and Productivity," discusses problems, opportunities, and new methods for addressing important problems of national interest and significance. "Graph Minors" (Section 19) surveys some recent results in graph theory, which open up new avenues for research, especially important in the design of algorithms.
Section 20, "Mathematical Economics," describes some important recent developments and discusses how several parts of core mathematics, especially differential topology, have played key roles in the analysis of general equilibrium theory for incomplete markets, a new departure that is a better model for real markets than the now classic model for complete markets.

The next group of topics has as a common general theme the search for new and efficient algorithms. "Parallel Algorithms and Architectures" (Section 21) concerns the design of algorithms to take advantage of parallel architectures, a problem not only for computer scientists but also for mathematicians working in large-scale computation. Here the idea is to see how problems that seem to be inherently sequential can be parallelized. Section 22, "Randomized Algorithms," describes recent progress in the development of these new kinds of algorithms. Such algorithms are useful in primality testing, with resulting consequences for cryptography, in sorting and searching algorithms, in the design of distributed computing systems, and in many other areas. The subject of Section 23, "The Fast Multipole Algorithm," is a new, very efficient algorithm for computing interactions in many-particle systems. This algorithm will have many applications in the modeling of high-powered electron beam devices and in molecular dynamics, which affects theoretical studies of chemical kinetics.

The next two sections discuss recent advances in algorithms for numerical optimization. Section 24 is devoted to the new and very important interior point methods for linear programming, which provide an alternative to the classic simplex method and are beginning to have a significant practical impact in the design of telecommunications networks and the solution of large-scale logistics planning and scheduling problems. Section 25 discusses yet another approach, stochastic linear programming, a technique that allows one to include non-deterministic elements in the formulation and solution of a problem. Thereby real problems that involve uncertainties in future behavior or availability of resources can be better modeled.

Sections 26 and 27 discuss an array of applications of mathematics in various additional areas. "Applications of Statistics to DNA Structure" includes as an application the statistical analysis of options for cutting the DNA sequence (to aid in the mapping process) and analyzing the evolutionary process at the genetic level. "Biostatistics and Epidemiology" is devoted to the use of statistics in epidemiology,
including survival analysis, analysis of incidence rate and relative risk, and deconvolution techniques for estimating infection rates and incubation periods from observed data.

1. RECENT ADVANCES IN PARTIAL DIFFERENTIAL EQUATIONS

An important trend of the last 15 years has been the great progress made in understanding nonlinear partial differential equations (PDEs). Many physical phenomena are described by partial differential equations, e.g., fluid flow, electromagnetic fields, gravity, and heat. Roughly speaking, linear partial differential equations govern small vibrations or small disturbances from equilibrium, while nonlinear equations govern large disturbances. The real world is nonlinear. Since the mid-1970s, understanding of nonlinear equations has grown much deeper. Finally, in the last few years, some of the most important equations from geometry, physics, and engineering have been successfully studied. Many other equations are still too hard, and much more work is needed. Among the important problems solved recently are the following.

Formation of Shocks in Nonlinear Waves

In one space dimension, a small, smooth initial disturbance will be propagated by any truly nonlinear wave equation into a shock after a finite time. In more than four space dimensions, such shocks do not have to form. In three dimensions, "most" wave equations lead to shocks, but only after an exponentially long time. Moreover, an important class of equations (those satisfying a natural geometric property called the "null condition") do not build up shocks. Very recently there have been significant advances for one of the most important nonlinear equations, Einstein's equations for gravitational waves. At large distances and after a long time, one has a detailed picture of how gravitational waves behave. Very difficult and interesting questions remain in the study of Einstein's equations. Of special interest is the formation of black holes.

Elliptic Equations

Another important class of partial differential equations arises in geometry, when one tries to construct surfaces with prescribed curvature. These equations are called nonlinear elliptic differential equations.
A general theorem on regularity of solutions of elliptic equations with boundary conditions was recently proved, making it possible to treat boundary conditions that arise inevitably in real problems. This result is a basic step forward in the analysis of partial differential equations.

Important progress has been made also on some singular elliptic equations, namely those having "critical nonlinearity." If the nonlinear term in such an equation were made slightly weaker, then the equation could be regarded as a small perturbation of a linear problem, but at the critical nonlinearity this becomes impossible. An outstanding equation of this kind occurs in the Yamabe problem, in which one is asked to deform a curved manifold until it has constant (scalar) curvature. A complete solution to this problem has recently been established.

Free Boundary Problems

An iceberg melting in the sea, the flow of oil and water through a reservoir, and crystal growth are examples of free boundary problems governed by partial differential equations. For the melting iceberg, the temperature flow in the iceberg is governed by one parabolic partial differential equation, the temperature flow in the water around the iceberg by another, and the boundary between ice and water is given by a third equation. The three equations are coupled. What makes the problem very hard is the fact that the domains where the differential equations are satisfied keep changing with time, and are not known ahead of time. Recently proposed techniques have already led to new regularity theorems for free boundary problems and promise further results.

Exactly Solvable Partial Differential Equations

Remarkably, a number of nonlinear PDEs can be solved exactly. These equations admit stable solutions (solitons) that persist even after interaction with other solitons. Recently, equations for solitons have been used to solve the Schottky problem, an outstanding open problem in the theory of Riemann surfaces.

The method used to solve soliton equations may be illustrated by the case of the Korteweg-de Vries (KdV) equation, which describes the propagation of water waves in a long, narrow channel.
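For reference, in one standard normalization (the particular scaling below is a conventional choice, not one specified in this appendix), the KdV equation and its one-soliton solution read

\[
u_t + 6\,u\,u_x + u_{xxx} = 0, \qquad
u(x,t) = \frac{c}{2}\,\operatorname{sech}^2\!\left(\frac{\sqrt{c}}{2}\,(x - ct - x_0)\right),
\]

where the wave speed c > 0 also fixes the height and width of the solitary wave, so that taller solitons travel faster.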
At a single instant of time, we imagine the shape of the water wave to be frozen and rigid. We then bombard the rigid shape with imaginary quantized test particles. By studying how the test particles are scattered, one can reconstruct the shape of the wave. Thus, the scattering data provide an alternative description of the wave at a fixed time. Instead of asking how the shape of the wave changes with time, we can therefore ask how the scattering data evolve with time. When rewritten in terms of scattering data, the KdV equation becomes amazingly simple, and the complete solution may be written down by inspection. In particular, the highly stable behavior of solitons is explained for the case of the KdV equation.

More recently, a number of physically interesting PDEs have been solved completely by analogous methods, including the Kadomtsev-Petviashvili (K-P) equation for weakly two-dimensional water waves, and the sine-Gordon and nonlinear Schrödinger equations. Explicit solutions of the K-P equation successfully predicted the results of experiments in water tanks, and a combination of theoretical and numerical analysis has been applied to model the behavior of a Josephson junction. Remarkable connections have been discovered between explicit solutions for nonlinear waves, exact solutions of statistical mechanics problems in two dimensions, and the Jones polynomials for knots, some of which are discussed below in sections on phase transitions and topology.

2. VORTICES IN FLUID FLOW

Intense swirling or vortex motion is a primary feature of many problems, including the accurate tracing of hurricanes and studies of blood flow through the heart, efficient fuel mixing in carburetors, aircraft flight, and the manner in which radiotelescopes sense distant galaxies through the motion of galactic jets.

Governing much of this flow is a complicated set of nonlinear partial differential equations called the Navier-Stokes equations; these equations are derived from Newton's laws of motion and include the frictional effects of viscosity. Intuition suggests that this frictional effect is extremely small in air or rapidly moving water, and this is confirmed by experiments. The simpler partial differential equations obtained when this coefficient vanishes are called Euler equations. These are accurate enough for studying the movement and accumulation of vortices.
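For reference, the incompressible Euler equations referred to above take the standard form (constant density \(\rho\) is assumed here for simplicity)

\[
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\,\mathbf{u} = -\frac{1}{\rho}\,\nabla p, \qquad \nabla\cdot\mathbf{u} = 0,
\]

where \(\mathbf{u}\) is the velocity field and \(p\) the pressure; restoring the viscous term \(\nu\,\Delta\mathbf{u}\) on the right-hand side gives the Navier-Stokes equations.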
Recent ingenious large-scale computer simulations using these equations reveal unexpectedly that the sheets where vorticity typically accumulates clump and concentrate in an amazing fashion. In response to these discoveries, a new mathematical theory of "oscillations and concentrations" has developed using potential theory and fractal (Hausdorff) measures. New kinds of "solutions" for the Euler equations are being introduced. One outgrowth of this theory is an explicit criterion to check whether numerical calculations for vortex sheets actually converge to solutions of the Euler equations. Convergence has been verified for many calculations of importance.

The vortex sheets in the applications just described involve fluid moving at rapid speed but still much less than the speed of sound. Completely different phenomena transpire when the vortex sheets are supersonic, as they are for the new space planes and for galactic jets in astrophysics. One recent success in the alliance between large-scale computing and modern mathematical theory is the discovery of a new mechanism of nonlinear instability for supersonic vortex sheets. Recent large-scale simulations have demonstrated that all supersonic vortex sheets exhibit nonlinear instability, belying the predictions of stability made in the 1950s and 1960s.

One of the most important problems in fluid dynamics, an extension of the study of vortices, is the understanding of turbulence, which occurs when the frictional effect is extremely small but not negligible. Understanding turbulence requires the mathematical analysis of solutions of the Euler and the Navier-Stokes equations in the limit of small viscosity. This analysis is ongoing.

3. AIRCRAFT DESIGN

Within the last five years, full simulations of a whole aircraft have appeared. Such a computation usually starts with steady Euler equations that accurately describe the flow outside the boundary layer. Such flows are smooth until the Mach number, M, comes close to 1. For Mach numbers in the transonic range (that is, less than but close to 1), small shocks are generated from a typical airfoil that dramatically increase the drag. It is a mathematical theorem that in almost all cases such shocks cannot be avoided. Since the cruising efficiency of a plane is roughly proportional to ML/D, where L is lift and D is drag, it is imperative for manufacturers to design aircraft that minimize shocks. Of course if M exceeds 1, there is no avoiding or even minimizing shocks, and we have the inefficiency of the Concorde.
In the past 15 years, great effort has been put into designing two-dimensional airfoil cross-sections that, at some cruising speed or range of cruising speeds with M less than 1, have minimal shocks. When a wing cross-section is chosen, the flow at design conditions is computed and compared with wind tunnel results.

To extend the computation to the whole aircraft, new computational capabilities have been added. The complex geometrical configurations demand new methods not only for discretizing the equations but also for handling the enormous volume of data. Currently the challenge is to resolve higher-dimensional shocks and vortex sheets to predict viscous effects as described in the previous section. The most useful end product of simulations is a determination of how surface pressure varies with such parameters as Mach number and the angle of attack. Varying parameters on the computer is much more economical than doing enormous numbers of experiments in a wind tunnel.

The simple model provided by the Euler equations is remarkably good, airplane flight being basically stable. But key elements are missing. Despite considerable effort, there is still no good mathematical model for the turbulent boundary layer, and when one is found it will increase the size of the computation at least as much as by adding a dimension. An ultimate goal of design is to pick an optimal pressure distribution and then find the aircraft shape that corresponds to it. Such inverse problems also increase drastically the computational needs. The hope is that computer hardware speedups and algorithmic improvements will combine to make these goals achievable.

One area of particular note is in the design of aircraft engines. A typical example is a turbomachinery compressor simulation where instantaneous temperature contours are calculated. This computation is based upon time-dependent Navier-Stokes equations. Simulations show viscous wakes created by the blades and how some of the blades chop or break these wakes into different pieces, creating an extremely complex flow pattern. This flow pattern would be difficult or impossible to describe and adjust without dependable mathematical models coupled with computer simulations.

For very-high-altitude, high-speed conditions, numerical simulations are also being used for vehicle design. At these altitudes, air can dissociate into its atomic constituents and even eventually ionize,
creating a situation that is extremely difficult to simulate in ground-based experimental facilities. As a result, numerical flow simulations, with the appropriate air chemistry models added, are currently being used as an integral part of the design process for many high-speed or atmospheric entry vehicles.

The best summary of the situation has been given by Goldhammer and Rubbert:

The present state-of-the-art has progressed to the point where the design engineer no longer considers Computational Fluid Dynamics (CFD) to be an irritant imposed on him by a seldom seen researcher, but rather CFD is regarded as the primary aerodynamic design tool for any problem, and the wind tunnel is treated as more of an evaluation and confirmation tool.¹

4. PHYSIOLOGY

In the realm of physiology, mathematical modeling has come into its own over the last ten years. Today there are computational models of the heart, the kidney, the pancreas, the ear, and many other organs. Many of these models rely on fluid dynamics.

Physiological fluid dynamics has a long and illustrious history. Leonardo da Vinci first described the vortices that form behind heart valves and that enable the valves to avoid backflow by closing while the flow is still in the forward direction. Leonhard Euler first wrote down the partial differential equations for blood flow in arteries.

With the recent flowering of computer technology and numerical algorithms, there is unprecedented opportunity to simulate the fluid dynamic functions of the human body at a level of detail sufficient to be of use in the understanding and treatment of disease. For instance, blood flow in the heart is governed by coupled equations of motion of the muscular heart walls, the elastic heart valve leaflets, and the blood that flows in the cardiac chambers. Computer solutions allow one to study both normal and diseased states, and lead to the design of prosthetic devices such as artificial valves and artificial hearts.

The methods used have a very general character since they are applicable to any problem in which a fluid interacts with an elastic medium of complicated geometry. Among these are the flow of suspensions, blood clotting, wave propagation in the inner ear, blood flow in arteries and veins, and airflow in the lung. Like much of computational fluid dynamics, this work pushes computer technology
Almost from the beginning of the computer era, random number generators have been applied to the simulation of complex systems involving queueing and other stochastic phenomena and to the estimation of multidimensional integrals and other mathematical quantities, using various sophisticated sampling techniques known collectively as the Monte Carlo method.

A major factor in drawing the attention of computer scientists to the wider uses of randomization was the discovery, around 1975, of two efficient randomized algorithms for checking whether a number is prime. Each of these algorithms is based on the concept of a witness to the compositeness of a number. A simple illustration of this concept is based on a theorem due to Fermat, which says that if n is a prime number, then, for any integer m that is not a multiple of n, m^(n-1) - 1 is a multiple of n. If this calculation is performed for some m, and one does not get the result predicted by Fermat's theorem, then n is composite (i.e., not prime); in this case, m is called a witness to the compositeness of n. The tests mentioned are based on slightly more complicated kinds of witnesses. The effectiveness of these tests stems from theorems that show that, if n is composite, then most of the integers between 1 and n - 1 will serve as witnesses. An interesting aspect of these tests is that they do not provide witnesses for primality, but this weakness was rectified in work that defined witnesses for primality rather than compositeness, showing that if n is prime, most randomly chosen numbers will bear witness to that fact. There are many other randomized algorithms based on the abundance of witnesses.

Randomized techniques have also proved to be a very effective tool for algorithm construction in the areas of sorting, searching, and computational geometry. A simple illustration is the problem of listing all intersections among a set of line segments in the plane. There is a fairly obvious incremental algorithm that considers the segments one at a time and reports the intersections of each new segment with all the previous ones. If the segments are read in a particularly unfortunate order then the run time of this algorithm will be excessively long; however, it can be shown that if the segments are processed in a random order, then with extremely high probability the algorithm will be very fast.
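The Fermat witness test described above is simple enough to sketch directly. The following minimal Python sketch is our own illustration (the function names and the fixed number of random trials are assumptions, not details from the report): it checks whether m^(n-1) - 1 is a multiple of n for randomly chosen m and declares n composite as soon as a witness is found. As the text notes, the practical algorithms of the period rest on slightly stronger kinds of witnesses.

```python
import random

def is_fermat_witness(m: int, n: int) -> bool:
    """Return True if m witnesses the compositeness of n.

    By Fermat's theorem, if n is prime and m is not a multiple of n,
    then m**(n-1) - 1 is a multiple of n; violating that identity
    proves that n is composite.
    """
    return pow(m, n - 1, n) != 1

def probably_prime(n: int, trials: int = 20) -> bool:
    """Randomized Fermat test: try several random bases m.

    If any base is a witness, n is certainly composite.  If none is,
    n is declared 'probably prime' (this simple test can be fooled by
    rare composites; the stronger witnesses mentioned in the text fix that).
    """
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        m = random.randrange(2, n - 1)
        if is_fermat_witness(m, n):
            return False
    return True

if __name__ == "__main__":
    print(probably_prime(221))      # 221 = 13 * 17, composite: almost surely False
    print(probably_prime(104729))   # 104729 is prime: True
```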
In addition, randomization plays a crucially important role in the design of distributed computing systems, in which many geographically dispersed computers connected by suitable communication links work together as a single system. Such systems must cope with the possibility that individual computers or communication links may fail or may run asynchronously at different speeds, and must ensure that the overhead of communication between processors will not become an insurmountable obstacle. Randomization is particularly effective in allocating computational tasks to the individual processors and in choosing the communication paths along which data shall flow. It can be shown in a variety of settings that random allocation of tasks to processors and data to memory modules, together with randomized routing of messages, yields near-optimal performance with high probability.

All the applications of randomization that we have mentioned depend on the assumption that algorithms, or computer programs, have access to a stream of independent random bits. More commonly, computers use pseudorandom numbers that are generated from an initial number, called the seed, by some purely deterministic iterative process. These generators are typically subjected to certain statistical tests in order to confirm that the streams of numbers they generate have some of the properties of random sequences, even though they are generated by a purely deterministic process. Currently, a deep line of research into the properties of pseudorandom number generators is being pursued. The goal of this research is to show that, as long as the seed is random, the output of the generator cannot be distinguished from a purely random sequence by any polynomial-time computational test whatsoever.

Finally, recent theoretical research has focused on a connection between pseudorandom generators and the concept of a one-way function, which is fundamental in cryptography. A one-way function is a function that is easy to compute but hard to invert. It has been shown that any one-way function can be used to construct a rigorously justified pseudorandom number generator. Unfortunately, researchers in computational complexity theory have not yet determined whether one-way functions even exist. This is one of the many important problems remaining to be addressed.
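To make the idea of a seed-driven deterministic generator concrete, here is a small sketch of a linear congruential generator, one classical example of such an iterative process. The constants below are a commonly used textbook choice, not parameters taken from the report, and a generator this simple is exactly the kind whose output the distinguishing tests discussed above aim to tell apart from true randomness.

```python
def lcg(seed: int, modulus: int = 2**31 - 1, a: int = 48271, c: int = 0):
    """Linear congruential generator: a purely deterministic iteration.

    Starting from the seed, each output is computed from the previous one,
    so the same seed always reproduces the same 'random-looking' stream.
    """
    state = seed
    while True:
        state = (a * state + c) % modulus
        yield state / modulus   # scale into [0, 1)

stream = lcg(seed=12345)
sample = [next(stream) for _ in range(5)]
print(sample)   # identical on every run, because the process is deterministic
```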
23. THE FAST MULTIPOLE ALGORITHM

There are great opportunities for progress in algorithms dealing with problems such as particle beams in plasma physics, underwater acoustics, molecular modeling, and even very important aspects of aerodynamics. A basic calculation of central importance in these applications is the calculation of interactions in a many-particle system. These interactions are often long-range, so all pairs of particles must be considered. Because of the latter constraint, the direct calculation of all interactions requires on the order of N^2 operations in a system with N particles. We will refer to this calculation as the N-body potential problem.

There have been a number of efforts aimed at reducing the computational complexity of the N-body potential problem. The oldest approach is that of particle-in-cell (PIC) methods, requiring on the order of N log(N) operations. Unfortunately, they also require a mesh that provides limited resolution and is inefficient when the particle distribution is nonuniform. A more recent approach is that of the hierarchical solvers, which are gridless methods for many-body simulations, having computational complexities also estimated to be of order N log(N).

The Fast Multipole Method (FMM), which has recently been developed, requires an amount of work only proportional to N to evaluate all pairwise interactions to within roundoff error, irrespective of the particle distribution. Like the hierarchical solvers, the FMM is a divide-and-conquer algorithm, based on the idea of clustering particles on a variety of spatial length scales. The method is in fact based on a combination of physics (multipole expansions), mathematics (approximation theory), and computer science, and its use in applications is growing.

There are several immediate industrial applications for the techniques being developed. The payoff should be substantial and almost immediate in the straightforward use of particle simulations. Simulations of this type are performed in the modeling of high-powered electron beam microwave devices (e.g., gyrotrons and free-electron lasers), particle beams, controlled fusion devices, and so forth.
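As a baseline for what the fast multipole method accelerates, the following sketch (our own illustrative example, not code from the report) evaluates Coulomb-type potentials by the direct method: every pair of particles is visited, which is why the work grows on the order of N^2 and becomes prohibitive for large systems.

```python
import numpy as np

def direct_potentials(positions: np.ndarray, charges: np.ndarray) -> np.ndarray:
    """Direct O(N^2) evaluation of Coulomb-type potentials.

    positions: (N, 3) array of particle coordinates
    charges:   (N,) array of charges (or masses, for gravity)
    Returns the potential felt by each particle due to all the others.
    """
    n = len(charges)
    potentials = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                r = np.linalg.norm(positions[i] - positions[j])
                potentials[i] += charges[j] / r   # every pair is visited: ~N^2 work
    return potentials

# Tiny example: four unit charges at the corners of a square.
pos = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]])
q = np.ones(4)
print(direct_potentials(pos, q))
```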
A second immediate industrial application is in molecular dynamics, a technique for studying the properties of fluids (and other phases of matter) by computer simulation. Once initial positions and velocities are chosen for some number of representative particles, their trajectories are followed by integrating Newton's second law of motion. In early simulations, only nonpolar fluids were considered, where the amount of computation per time step is proportional to the number of particles N. In polar fluids, the situation is quite different. A coulombic term is added to the potential function and all pairwise interactions need to be accounted for, a standard N-body calculation. The FMM allows for the simulation of much larger chemical systems than was previously possible. The study of a protein molecule in water, for example, requires following the motion of tens of thousands of atoms over tens of thousands of time steps. Real gains are possible in the long term, beginning with detailed theoretical studies of chemical kinetics.

24. INTERIOR POINT METHODS FOR LINEAR PROGRAMMING

Many problems in resource allocation can be modeled by what is called the "linear programming" problem, in which one attempts to optimize a linear function over the vertices of a multidimensional polytope. The traditional Simplex algorithm for this problem, which works by traveling along the boundary of the polytope, has had immense value and influence during the 40 years since it was discovered. It has a significant theoretical drawback, however: its running time can, in pathological cases, grow as an exponential function of the number of variables. Much more desirable, at least from a theoretical point of view, would be an algorithm with polynomially bounded worst-case running time.

In 1976, the first such algorithm, the Ellipsoid method, was discovered. Its running time was O(n^4 L^2), where n is the number of variables and L is a measure of the number of bits needed to describe the input. This algorithm has the additional desirable property that it applies to more general "convex programming." Moreover, it does not require a complete description of the convex body over which optimization is to take place, but merely a "black box" subroutine that, given a point, tells whether that point is in the polytope, and if it is not, will identify a hyperplane that separates the point from the polytope. The Ellipsoid method is thus applicable to a much broader class of problems than is the Simplex method, and its existence has led to a wide variety of polynomial-time algorithms for previously open problems. For linear programming, however, researchers quickly discovered that its improved worst-case running-time bound did not correlate with better performance in practice.
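For concreteness, the linear programming problem described at the opening of this section can be written in one standard form (the notation is conventional, not the report's):

\[
\text{minimize } c^{T}x \quad \text{subject to } Ax \le b,\; x \ge 0.
\]

The constraints carve out a convex polytope, and when an optimum exists it is attained at a vertex; this is the geometric fact the Simplex method exploits by walking along the boundary from vertex to vertex.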
Polynomial-time linear programming algorithms thus seemed an impractical theoretical nicety. In 1984 this was changed with the introduction of a new breed of polynomial-time linear programming algorithm, based on a clever variant of an old idea. The idea, one that had been abandoned long ago as impractical, is to cut across the interior of the polytope in searching for an optimum vertex, rather than traversing the outside as does Simplex. The difficulty in making such an "interior point" approach work lies in finding the right direction and distance to travel at each step. The solution involves the use of projective transformations and a logarithmic potential function to guide the search, and yields a running time of O(n^3.5 L^2). The theoretical improvement over the Ellipsoid method running time was not the main story, however; more important, this algorithm (along with several of its variants) appears to be competitive with Simplex when implemented with appropriate sparse matrix techniques. Moreover, it appears to have substantial running-time advantages for large and/or degenerate instances. Indeed, important practical applications have been found in the design of large telecommunication networks and in the solution of large-scale logistics planning and scheduling problems.

Since the first reports of the potential practicality of the approach, there has been a torrent of research on interior point methods. Relations between this algorithm and earlier algorithms have been extensively explored. For instance, the algorithm can be viewed as a type of "logarithmic barrier function" algorithm, or even as an application of Newton's method (in an appropriately transformed space). In the limit of infinitesimal step length, it generates a vector field inside the polytope, all of whose limiting trajectories go to the optimal vertex. In this context, it can be viewed as attempting to follow such a trajectory approximately, by taking short steps along tangent lines. This in turn suggests variants in which one steps along curves that represent higher-order power series approximations to the vector field. Other variants concentrate on approximately following a particular trajectory, the so-called "central trajectory." These latter have led to better and better worst-case running times, with the current champion having a running time of O(n^2.5 L^2), based on a clever use of recent developments in the field of fast matrix multiplication.
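The flavor of cutting across the interior can be conveyed by a small sketch of the affine-scaling method, one of the simplest relatives of the 1984 projective algorithm (this simplification, the toy problem, and all names in it are our assumptions, not material from the report). For the standard form minimize c^T x subject to Ax = b, x >= 0, each iteration rescales the variables by their current values, moves in a feasible descent direction, and stops short of the boundary so the iterates stay interior.

```python
import numpy as np

def affine_scaling_lp(A, b, c, x0, tol=1e-8, max_iter=200):
    """Minimize c.x subject to A x = b, x >= 0, from a strictly feasible x0 > 0.

    Each step scales by D = diag(x), projects the objective onto the null
    space of A in the scaled coordinates, and takes 90% of the step that
    would first hit the boundary, so the iterates remain interior.
    """
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        D2 = np.diag(x**2)
        # Dual estimate w solves (A D^2 A^T) w = A D^2 c; reduced costs r = c - A^T w.
        w = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ c)
        r = c - A.T @ w
        dx = -D2 @ r                      # feasible descent direction (A dx = 0)
        if np.linalg.norm(x * r) < tol:   # scaled reduced costs near 0: (near-)optimal
            break
        neg = dx < 0
        if not neg.any():
            raise ValueError("problem appears unbounded below")
        alpha = 0.9 * np.min(-x[neg] / dx[neg])   # stay strictly inside x > 0
        x = x + alpha * dx
    return x

# Toy problem: maximize x1 + 2*x2 with x1 + x2 <= 4 and x2 <= 2,
# written in standard form with slack variables s1, s2.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 2.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])      # minimize -(x1 + 2*x2)
x0 = np.array([1.0, 1.0, 2.0, 1.0])       # strictly feasible interior start
print(affine_scaling_lp(A, b, c, x0))     # approaches the vertex (2, 2, 0, 0)
```

The projective and central-trajectory methods discussed in the text refine this basic scheme with a potential function and more careful control of the steps.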
New algorithms and insights continue to pour forth at a rapid rate, and although it seems unlikely that this will lead to polynomial-time solutions for the much harder NP-complete problems, there is much hope that the interior point approach will greatly extend the range of problems for which useful answers can be determined.

25. STOCHASTIC LINEAR PROGRAMMING

Deterministic models for linear programming problems and their solutions, ever since their first appearance 40 years ago, have been keeping pace with the extraordinary advances that have taken place in computing power. At the present time, industry and government routinely solve models in which there is no uncertainty; that is, they solve deterministic linear and mathematical programs for planning and scheduling purposes, some involving many thousands of variables with a linear or nonlinear objective function and many thousands of inequality constraints. These assume, for example, that knowledge about what future technologies will be available to choose from is known with certainty. As a result, the solutions obtained from deterministic models are incomplete because they do not properly take account of future contingencies. Although it is easy to reformulate the mathematical models to include uncertainty, the resulting size of the mathematical systems to be solved becomes too enormous in most practical applications. The bottleneck to solving stochastic programs has been (and is) calculating capability.

Therefore, despite the progress made, there remains an unresolved class of decision problems of great importance: that of finding an "optimal" solution to stochastic mathematical and linear programs. Stochastic here means uncertainty about, for example, the availability of resources, foreign competition, or the effects of possible political upheavals. Since such problems concern the optimal allocation of scarce resources under uncertainty, it is of fundamental importance to include uncertainty in the problem setup. If such problems could be solved in general, it would significantly advance our ability to plan and schedule complex situations.

At the present time there is intense activity taking place in the United States and elsewhere to solve certain relevant classes of stochastic linear and nonlinear programs. Important new developments in computer hardware are spurring this effort, particularly the availability of multiple vector processor mainframes and parallel processors.
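One standard way to make such a reformulation concrete is the classical two-stage linear program with recourse (the notation is conventional, ours rather than the report's): a first-stage decision x is fixed before the uncertain outcome ω is observed, and a corrective second-stage decision y is chosen afterward, with its expected cost entering the objective:

\[
\min_{x \ge 0}\; c^{T}x + \mathbb{E}_{\omega}\!\left[\,Q(x,\omega)\,\right]
\;\;\text{subject to}\;\; Ax = b,
\qquad
Q(x,\omega) = \min_{y \ge 0}\left\{\, q(\omega)^{T}y \;:\; W y = h(\omega) - T(\omega)\,x \,\right\}.
\]

Replacing the expectation by a sum over many scenarios is what makes the deterministic equivalent so enormous, and it is where the decomposition and Monte Carlo sampling techniques mentioned next come in.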
It is hoped that the combination of three techniques, namely improved ways to mathematically decompose large-scale problems into smaller ones, improved techniques of Monte Carlo (importance) sampling, and the use of parallel processors, will bring about important advances in the relatively near future.

26. APPLICATIONS OF STATISTICS TO DNA STRUCTURE

A strand of DNA can be represented as a string of bases, A, C, G, T, that carries the information governing the growth and function of an organism. Great effort has therefore been expended in determining DNA sequences. Rapid sequencing methods were introduced in 1976 and were followed by an explosion in quantitative genetic information. Today over 25 million letters of sequence have been determined, in segments averaging length 1000, from a wide variety of organisms. Improvements in sequencing technology continue to be made, and the associated discoveries in basic biology are staggering.

Two kinds of maps are constructed for DNA: genetic maps and physical maps. Both types are generated from the use of restriction enzymes that cut DNA at specific patterns in the sequence, producing fragments whose lengths can be measured with some degree of inherent experimental error. It was suggested in 1980 that slight variations in DNA sequence could produce differing restriction fragment lengths that could be used as "markers," traits that could be approximately mapped to specific locations on specific chromosomes. The amount of data subsequently available has created a number of new statistical problems.

Physical maps give the relative locations of identifiable and clonable pieces of DNA. Availability of a physical map facilitates the complete sequencing of a DNA strand. Given the mapped locations of a complete library of clones, each having a length on the order of several tens of thousands of nucleotides, a number of laboratories in coordination could then proceed simultaneously to sequence the individual clones. We can expect statistical analysis of design options, such as number and choice of cutting enzymes, to yield benefits in the mapping process. The process of physically locating clones along the genome should be substantially facilitated by an understanding of the design parameters and sources of variation inherent in the process.

Obtaining DNA sequence data is only a first step in modern molecular biology. The sequence is next subjected to extensive analysis, to relate it to what is already understood about DNA sequences.
Because evolution operates to preserve biological features of importance, including sequence features, these analyses can be very important in understanding the function of specific portions of the sequence. Use of computers to implement complex algorithms is often required; mathematical analysis of algorithms is essential, both to ensure an unambiguous, informative interpretation of the results and to ensure that a programmed algorithm will complete its operations rapidly enough.

The study of molecular evolution has developed with the ability to read DNA and protein sequences. It is just now becoming possible to sample the sequence for a gene within a defined population. This opens up many new questions. How does molecular evolution proceed, in the long term and in the short term? Constructing evolutionary trees and determining rates of evolution can both be accomplished with stochastic process models of molecular evolution.

Some of the most central work goes into identifying protein coding regions or genes in DNA. Locating a gene of 600 letters that is spread out in small segments along 10,000 or 20,000 letters of DNA is a daunting but essential task, requiring sophisticated combinatorial and statistical analysis.

The science of molecular biology is currently undergoing rapid development. The anticipated quantity of DNA and protein sequence data makes it an area ripe for mathematical development. The nature of the science makes it necessary that mathematical and biological scientists closely communicate. Both sciences will surely benefit from such collaborative effort.

27. BIOSTATISTICS AND EPIDEMIOLOGY

Epidemiology concerns the distribution and determinants of disease in human populations. It encompasses such varied subjects as the worldwide geographic variation in disease incidence rates, the setting of radiation exposure standards in the workplace, and the evaluation of vaccine efficacy using randomized field trials.

Two distinct study designs, cohort and case-control, are used for much of the research in chronic disease epidemiology. In cohort studies, exposures and covariables are measured on a defined population of disease-free persons, who are then followed forward in time to determine which ones develop or die from the disease of interest.
The methods and concepts of survival analysis, particularly the proportional hazards regression model, have greatly affected the statistical treatment of cohort data. They provide a mathematically rigorous framework for elaboration of the key epidemiologic notions of incidence rate and relative risk. Older epidemiologic techniques are given a new interpretation in terms of classical statistical theory, while the way is paved for the development of more flexible and powerful methods of data analysis.

Case-control studies involve samples of diseased and nondiseased persons whose history of exposure is known. The demonstration that the exposure odds ratio calculated from case-control data approximates the ratio of disease incidence rates among exposed and nonexposed was of paramount importance in establishing the scientific validity of this design. More recently, biostatisticians and econometricians independently have developed methods for the analysis of case-control and other data where the sample is selected on the basis of the outcome of primary interest. Further elaboration of this methodology is needed to handle more general stratified designs and for situations where only partial information is available for a large number of the sampled subjects.

Additional work is needed also on methods of analysis of epidemiologic data with dependent outcomes, such as arise in longitudinal studies with repeated measurements on the same person over time or in genetic studies of the patterns of disease in families. Better techniques are needed for assessing the magnitude of measurement errors and to correct for the tendency of such errors to dilute the observed association between exposure and disease. Recent advances in statistical computing and in the statistical theory of quasi-likelihood analysis based on generalized estimating equations promise rapid advances in this field.

Finally, AIDS, the major epidemic of our time, poses urgent challenges to the biostatistician and epidemiologist. One problem is to estimate the past, present, and future rates of infection with HIV so as to determine the future number of AIDS cases. Another is to understand better the patterns of HIV transmission within and between various pools of susceptible individuals so as to be able to plan the optimal organization of community resources for the prevention of further infection.
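The first of these problems has a simple mathematical skeleton, written here in standard back-calculation notation (our notation, not the report's): the expected AIDS incidence at time t is the convolution of past infection rates with the incubation-time distribution,

\[
a(t) = \int_{0}^{t} h(s)\, f(t-s)\, ds,
\]

where h(s) is the rate of new HIV infections at time s and f is the density of the time from infection to diagnosis; recovering h from observed incidence a and an estimate of f is the deconvolution problem described below.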
The collection of data related to AIDS is seldom routine and often suffers from a lack of key pieces of information, so that studies are rarely amenable to straightforward analysis. Mathematically, the problem of estimating HIV infection rates, using data on AIDS incidence rates and the distribution of time from infection to diagnosis, can be viewed as an unusual type of deconvolution problem. It is related to problems that occur in the field of image processing, where rapid progress has been made. But the lack or poor nature of key types of data makes it much more formidable.

NOTE

1. Goldhammer, M. I., and Rubbert, P. E., "C.F.D. in design: An airframe perspective," Proceedings of the 27th Aerospace Sciences Meeting, January 9-12, 1989, Publication Number 89-0092 (American Institute of Aeronautics & Astronautics, Washington, D.C.).