
2
Vitality of the Mathematical Sciences
The vitality of the U.S. mathematical sciences enterprise is excellent.
The mathematical sciences have consistently been making major advances
in both fundamental theory and high-impact applications, and recent decades have seen tremendous innovation and productivity. The discipline
is displaying great unity and coherence as more and more bridges are built
between subfields of research. Historically, such bridges serve as drivers for
additional accomplishments, as do the many interactions between the math-
ematical sciences and fields of application, and so the existence of several
striking examples of bridge-building in this chapter is a very promising sign
for the future. The programs of the National Science Foundation’s (NSF)
mathematical science institutes offer good evidence of this bridge-building,
and the large-scale involvement of graduate students and postdoctoral researchers at those institutes suggests that the trend will continue. New
tools for scholarly communications, such as blogs and open-access reposi-
tories of research, contribute to the excellent rate of progress at the research
frontiers. As shown in this chapter and in the separate report Fueling Innovation and Discovery: The Mathematical Sciences in the 21st Century, the
discipline’s vitality is providing clear benefits for diverse areas of science
and engineering, for industry and technology, for innovation and economic
competitiveness, and for national security.
Further down the road, toward 2025, stresses are likely, and these will
be discussed in Chapters 4 through 6. The focus of this chapter is to docu-
ment some recent advances as illustrations of the health and vitality of the
mathematical sciences. Indeed, the growth of new ideas and applications in
the mathematical sciences is so robust that inevitably it far exceeds the spectrum of expertise of a modest-sized committee. What appears below is just a sampling of advances, to give a flavor of what is going on, and it is not meant to be either comprehensive or proportionally representative. This
chapter is aimed primarily at the mathematical sciences community, and so
the examples here presume a base of mathematical or statistical knowledge.
The topics covered range from the solution of a century-old problem by
using techniques from one field of mathematics to solve a major problem in
another, to the creation of what are essentially entirely new fields of study.
The topics are, in order:
• The Topology of Three-Dimensional Spaces
• Uncertainty Quantification
• The Mathematical Sciences and Social Networks
• The Protein-Folding Problem and Computational Biology
• The Fundamental Lemma
• Primes in Arithmetic Progression
• Hierarchical Modeling
• Algorithms and Complexity
• Inverse Problems: Visibility and Invisibility
• The Interplay of Geometry and Theoretical Physics
• New Frontiers in Statistical Inference
• Economics and Business: Mechanism Design
• Mathematical Sciences and Medicine
• Compressed Sensing
THE TOPOLOGY OF THREE-DIMENSIONAL SPACES
The modest title of this section hides a tremendous accomplishment.
The notion of space is central to the mathematical sciences, to the physi-
cal sciences, and to engineering. There are entire branches of theoretical
mathematics devoted to studying spaces, with different branches focus-
ing on different aspects of spaces or on spaces endowed with different
characteristics or structures.1 For example, in topology one studies spaces
without assuming any structure beyond the notion of coherence or con-
tinuity. By contrast, in geometry one studies spaces in which, first of all,
one can differentiate, leading to notions such as tangent vectors, and,
second, for which the notion of lengths and angles of tangent vectors are
defined. These concepts were first introduced by Riemann in the 1860s
in his thesis “The hypotheses that underlie geometry,” and the resulting
structure is called a Riemannian metric. Intuitively, one can imagine that to
a topologist spaces are made out of rubber or a substance like taffy, while
1 Box 2-1 discusses the concept of mathematical structures.

BOX 2-1
Mathematical Structures
At various points, this chapter refers to “mathematical structures.” A mathemati-
cal structure is a mental construct that satisfies a collection of explicit formal rules
on which mathematical reasoning can be carried out. An example is a “group,”
which consists of a set and a procedure by which any two elements of the set can
be combined (“multiplied”) to give another element of the set, the product of the
two elements. The rules a group must satisfy are few: the existence of an identity
element, of inverses for each element of the set, and the associative property for
the combining action. Basic arithmetic conforms with this definition: For example,
addition of integers can be described in these terms. But the concept of groups is
also a fundamental tool for characterizing symmetries, such as in crystallography
and theoretical physics. This level of abstraction is helpful in two important ways:
(1) it enables precise examinations of mathematical sets and operations by strip-
ping away unessential details and (2) it opens the door to logical extensions
from the familiar. As an example of the latter benefit, note that the definition of a
group allows the combination of two elements to depend on the order in which
they are “multiplied,” which is contrary to the rules of arithmetic. With the explicit
recognition that that property is an assumption, not a necessary consequence,
mathematicians were able to define and explore “noncommutative groups” for
which the order of “multiplication” is significant. It turns out that many situations arising in nature are most naturally represented by a noncommutative group.
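In programming terms, the smallest noncommutative groups can be made concrete with permutations. The sketch below (the representation and names are our own, chosen for illustration) checks the group axioms and shows that the order of “multiplication” matters:

```python
# Permutations of {0, 1, 2} under composition form a noncommutative group.
# A permutation is represented as a tuple p, where p[i] is the image of i.

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))

identity = (0, 1, 2)
swap01 = (1, 0, 2)   # exchanges 0 and 1
cycle = (1, 2, 0)    # sends 0 -> 1, 1 -> 2, 2 -> 0

# Group axioms in action: composing with the identity changes nothing,
# and every element has an inverse (here, swap01 is its own inverse).
assert compose(swap01, identity) == swap01
assert compose(swap01, swap01) == identity

# Noncommutativity: the order of "multiplication" matters.
print(compose(swap01, cycle))  # (0, 2, 1)
print(compose(cycle, swap01))  # (2, 1, 0)
```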
It is possible, of course, to define a mathematical structure that is uninteresting
and that has no relevance to the real world. What is remarkable is how many in-
teresting mathematical structures there are, how diverse are their characteristics,
and how many of them turn out to be important in understanding the real world,
often in unanticipated ways. Indeed, one of the reasons for the limitless possibili-
ties of the mathematical sciences is the vast realm of possibilities for mathematical
structures. Complex numbers, a mathematical structure built around the square
root of –1, turn out to be rooted in the real world as part of the essential equa-
tions describing electromagnetism and quantum theory. Riemannian metrics, the
mathematical structure developed to describe objects whose geometry varies
from point to point, turn out to be the basis for Einstein’s description of gravitation.
“Graphs” (these are not the graphs used to plot functions in high school) consist-
ing of “nodes” joined by “edges,” turn out to be a fundamental tool used by social
scientists to understand social networks.
A striking feature of mathematical structures is their hierarchical nature—it
is possible to use existing mathematical structures as a foundation on which to
build new mathematical structures. For example, although the meaning of “prob-
ability” has long vexed philosophers, it has been possible to create a mathematical
structure called a “probability space” that provides a foundation on which realistic
structures can be built. On top of the structure of a probability space, mathemati-
cal scientists have built the concept of a random variable, which encapsulates
in a rigorous way the notion of a quantity that takes its values according to a
certain set of probabilities, such as the roll of a pair of dice—one gets a different
random variable depending on whether the dice are honest or loaded. There then
are certain broad classes of random variables, of which the most famous are
Gaussian random variables, the source of the well-known bell curve and which
provide the rigorous basis of many of the fundamental tools of statistics. These
different classes of random variables can be put together into structures known
as probabilistic models, which are an incredibly flexible class of mathematical
structures used to understand phenomena as diverse as what goes on inside a
cell, financial markets, or the physics of superconductors.
Mathematical structures provide a unifying thread weaving through and uniting
the mathematical sciences. For example, algorithms represent a class of math-
ematical structure; algorithms are often based on other mathematical structures,
such as the graphs mentioned earlier, and their effectiveness might be measured
using probabilistic models. Partial differential equations (PDEs) are a class of
mathematical structure built on the most basic of mathematical structures—
functions. Most of the fundamental equations of physics are described by PDEs,
but the mathematical structure of PDEs can be studied rigorously independent of
knowing, say, what the ultimate structure of space looks like at subnuclear scales.
Such study can, for example, provide insight into the structures of the potential
solutions of a certain PDE, explaining which phenomenology can be captured by
that PDE and which cannot. Computations involving PDEs involve yet another type
of mathematical structure, a “discretization scheme,” which transforms what is a
fundamentally continuous problem into one involving just a very large but finite set
of values. The catch is that finding the right discretization scheme is highly subtle,
and discretization is one example of the mathematical depth of the computational
wing of the mathematical sciences.
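As a minimal sketch of the discretization idea described above, the one-dimensional heat equation u_t = u_xx can be replaced by finite differences on a grid of n points; all names and parameter values below are illustrative, not drawn from the report:

```python
# A discretization scheme in miniature: the continuous heat equation
# u_t = u_xx becomes updates on a finite set of grid values.

def heat_step(u, dt, dx):
    """One explicit (forward Euler) time step with fixed boundary values."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + dt / dx**2 * (u[i-1] - 2*u[i] + u[i+1])
    return new

# Initial condition: a "hot spot" in the middle of a cold rod.
n, dx = 21, 0.05
u = [0.0] * n
u[n // 2] = 1.0

# Stability of this explicit scheme requires dt <= dx**2 / 2, one concrete
# example of why choosing the right discretization is subtle.
dt = 0.4 * dx**2
for _ in range(100):
    u = heat_step(u, dt, dx)

print(max(u))  # the peak has diffused and flattened
```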
to a geometer they are made out of steel. Although we have no direct visual
representation of spaces of higher dimension, they exist as mathematical
objects on the same footing as lower-dimensional spaces that we can di-
rectly see, and this extension from physical space has proved very useful.
Topological and geometric spaces are central objects in the mathematical
sciences. They are also ubiquitous in the physical sciences, in computing,
and in engineering, where they are the context in which problems are pre-
cisely formulated and results are expressed.
Over 100 years ago Poincaré initiated the abstract and theoretical study
of these higher-dimensional spaces and posed a problem about the three-
dimensional sphere (a unit sphere in a four-coordinate space whose surface
has three dimensions) that motivated topological investigations into three-dimensional spaces over the century that followed.2 The three-dimensional
sphere sits in ordinary four-dimensional coordinate space as the collection
of vectors of unit length: that is, the space
{(x, y, z, w) | x² + y² + z² + w² = 1}
It has the property that any closed path (one that starts and ends at the
same point) on the sphere can be continuously deformed through closed
paths, always staying on the sphere, so as to shrink it down to a point
path—that is, a path in which no motion occurs. Poincaré asked if this was
the only three-dimensional space of finite extent, up to topological equiva-
lence, with this property. For the next 100 years this problem and its gen-
eralizations led to enormous theoretical advances in the understanding of
three-dimensional spaces and of higher-dimensional spaces, but Poincaré’s
original problem remained unsolved. The problem proved so difficult, and
the work it stimulated so important, that in 2000 the Clay Mathematics
Institute listed it as one of its seven Millennium Prize problems in math-
ematics, problems felt to be among the hardest and most important in
theoretical mathematics.
This is purely a problem in topology. The condition about paths is on
the face of it a topological condition (since to make sense of it one only
needs the notion of continuity), and the conclusion is explicitly topological.
For nearly 100 years it was attacked in purely topological terms. Then in
2002, Grigory Perelman succeeded in answering this question in the af-
firmative. Resolving a central question that has been the focus of attention
for many decades is an exciting event. Here the excitement was accentuated
by the fact that the solution required powerful results from other parts of
theoretical mathematics, suggesting connections between them that had not
been suspected. Perelman invoked deep ideas from analysis and parabolic
evolution equations developed by Richard Hamilton. Briefly, Hamilton had
introduced and studied an evolution equation for Riemannian metrics that
is an analogue of the heat equation. Perelman was able to show that under
the topological hypotheses of Poincaré’s conjecture, Hamilton’s flow con-
verged to the usual metric on the 3-sphere, so that the underlying topologi-
cal space is indeed topologically equivalent to the 3-sphere. But Perelman’s
results applied to Riemannian metrics on every three-dimensional space and
gave a description of any such space in terms of simple geometric pieces,
which is exactly what William Thurston had conjectured some 20 years
earlier. Ironically, the ideas that led Thurston to his conjectured description
2 The convention in mathematics is that the surface of a basketball is a 2-sphere (because
it has two dimensions) even though it sits in three-dimensional space. The entire basketball
including the air inside is called a “ball” or a “3-ball” rather than a sphere.

have their source in other, more geometric parts of Poincaré’s work: his study of Fuchsian and Kleinian groups.
Perelman’s solution brings to a close a chapter of theoretical math-
ematics that took approximately 100 years to write, but at the same time
it has opened up new avenues for the theoretical study of space. This
mathematical breakthrough is fairly recent and it is too early to accurately
assess its full impact inside both mathematics and the physical sciences
and engineering. Nevertheless, we can make some educated guesses. Even
though this is the most abstract and theoretical type of mathematics, it may
well have practical import because spaces are so prevalent in science and
engineering. What Perelman achieved for the particular evolution equation
that he was studying was to understand how singularities develop as time
passes. That particular equation is part of a general class of equations,
with the heat equation being the simplest. In mathematics, various geo-
metric problems belong to this class, as do equations for the evolution of
many different types of systems in science and engineering. Understanding
singularity development for these equations would have a huge impact on
mathematics, science, and engineering, because the behavior of solutions
near singularities can be so important. Already, we are seeing the use of
the techniques Perelman introduced to increase the understanding in other
geometric contexts, for example in complex geometry. Time will tell how
far these ideas can be pushed, but if they extend to equations in other
more applied contexts, then the payoff in practical terms could well be
enormous.
UNCERTAINTY QUANTIFICATION
Central to much of science, engineering, and society today is the build-
ing of mathematical models to represent complex processes. For example,
aircraft and automotive manufacturers routinely use mathematical repre-
sentations of their vehicles (or vehicle components) as surrogates for build-
ing physical prototypes during vehicle design, relying instead on computer
simulations that are based on those mathematical models. The economic
benefit is clear. A prototype automobile that is destroyed in a crash test,
for example, can cost $300,000, and many such prototypes are typically
needed in a testing program, whereas a computer model of the automobile
that can be virtually crashed under many varying conditions can often be
developed at a fraction of the cost.
The mathematical modeling and computational science that underlies
the development of such simulators of processes has seen amazing advances
over the last two decades, and improvements continue to be made. Yet the
usefulness of such simulators is limited unless it can be shown that they are
accurate in predicting the real process they are simulating.

A host of issues arise in the process of ensuring that the simulators are
accurate representations of the real process. The first is that there are typi-
cally many elements of the mathematical model (such as rate coefficients)
that are unknown. Also, inputs to the simulators are often themselves
imperfect—for example, weather and climate predictions must be initiated
with data about the current state, which is not known completely. Further-
more, the modeling is often done with incomplete scientific knowledge,
such as incomplete representation of all the relevant physics or biology, and
approximations are made because of computational needs—for example,
weather forecasting models being run over grids with 100-kilometer edges,
which cannot directly represent fine-scale behaviors, and continuous equa-
tions being approximated by discrete analogues.
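One of the most basic UQ techniques, forward propagation of input uncertainty by Monte Carlo, can be sketched as follows. The “simulator” here is a toy stand-in (exponential decay with an uncertain rate coefficient), and the distribution assumed for that coefficient is purely illustrative:

```python
import math
import random
import statistics

def simulator(k, t=1.0):
    """Toy model standing in for an expensive computational simulator:
    exponential decay with rate coefficient k, evaluated at time t."""
    return math.exp(-k * t)

random.seed(0)

# The unknown rate coefficient is represented by a probability
# distribution rather than a single value (mean and spread assumed here).
samples = [simulator(random.gauss(2.0, 0.2)) for _ in range(10_000)]

# Summarize the induced uncertainty in the simulator's output.
mean = statistics.mean(samples)
sd = statistics.stdev(samples)
print(f"predicted output: {mean:.3f} +/- {sd:.3f}")
```

Real UQ problems replace this one-line model with a large simulation code and must also account for model discrepancy and data, but the pattern of pushing input distributions through the simulator is the same.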
The developing field for dealing with these issues is called Uncer-
tainty Quantification (UQ), and it is crucial to bringing to fruition the
dream of being able to accurately model and predict real complex processes
through computational simulators. Addressing this problem requires a
variety of mathematical and statistical research, drawing on probability,
measure theory, functional analysis, differential equations, graph and net-
work theory, approximation theory, ergodic theory, stochastic processes,
time series, classical inference, Bayesian analysis, importance sampling,
nonparametric techniques, rare and extreme event analysis, multivariate
techniques, and so on.
UQ research is inherently interdisciplinary, in that disciplinary knowl-
edge of the process being modeled is essential in building the simulator,
and disciplinary knowledge of the nature of the relevant data is essential
in implementing UQ. Thus, effective UQ typically requires interdisciplin-
ary teams, consisting of disciplinary scientists plus mathematicians and
statisticians.
This appreciation for the need for UQ has grown into a clamor; calls
for investment in UQ research can be found in any number of science and
engineering advisory reports to funding agencies. Good progress has been
made, and building the necessary capabilities for UQ is essential to the
reliable use of computational simulation. The clamor has also been noticed
by the mathematical and statistical societies. Interest groups in UQ have
been started by the Society for Industrial and Applied Mathematics and the
American Statistical Association. The two societies have also started a joint journal, the SIAM/ASA Journal on Uncertainty Quantification.
THE MATHEMATICAL SCIENCES AND SOCIAL NETWORKS
The emergence of online social networks is changing behavior in many
contexts, allowing decentralized interaction among larger groups and fewer
geographic constraints. The structure and complexity of these networks

have grown rapidly in recent years. At the same time, online networks are collecting social data at unprecedented scale and resolution, making social networks visible that in the past could only be explored via in-depth surveys. Today, millions of people leave digital traces of their personal social networks in the form of text messages on mobile phones, updates on Facebook or Twitter, and so on.
The mathematical analysis of networks is one of the great success stories of applying the mathematical sciences to an engineered system, going
back to the days when AT&T networks were designed and operated based
on graph theory, probability, statistics, discrete mathematics, and optimiza-
tion. However, since the rise of the Internet and social networks, the under-
lying assumptions in the analysis of networks have changed dramatically.
The abundance of such social network data and the increasing complexity of social networks are changing the face of research on social networks.
These changes present both opportunities and challenges for mathematical
and statistical modeling.
One example of how the mathematical sciences have contributed
to the new opportunity is the significant amount of recent work focused
on the development of random-graph models that capture some of the
qualitative properties observed in large-scale network data. The math-
ematical models developed help us understand many attributes of social
networks. One such attribute is the degree of connectivity of a network,
which in some cases reveals the small-world phenomenon, in which very distant
parts of a population are connected via surprisingly short paths.3 These
short paths are surprisingly easy to find, which has led to the success of
decentralized search algorithms.
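The small-world effect can be reproduced in a few lines: start from a ring lattice and add a handful of random shortcut edges in the spirit of the Watts-Strogatz model, then measure average path length by breadth-first search. Everything below is an illustrative sketch, not a model fitted to any particular data set:

```python
import random
from collections import deque

def ring_with_shortcuts(n, shortcuts, seed=1):
    """Ring lattice on n nodes plus a number of random shortcut edges."""
    rng = random.Random(seed)
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for _ in range(shortcuts):
        a, b = rng.sample(range(n), 2)
        adj[a].add(b)
        adj[b].add(a)
    return adj

def avg_path_length(adj):
    """Average shortest-path length over all node pairs, via BFS."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

n = 200
print(avg_path_length(ring_with_shortcuts(n, 0)))   # pure ring: about n/4
print(avg_path_length(ring_with_shortcuts(n, 20)))  # a few shortcuts shrink it sharply
```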
Another important direction is the development of mathematical models of contagion and network processes. Social networks play a funda-
mental role in spreading information, ideas, and influence. Such contagion
of behavior can be beneficial when a positive behavior change spreads from
person to person, but it can also produce negative outcomes, as in cascading
failures in financial markets. Such concepts open the way to epidemiological
models that are more realistic than the “bin” models that do not take the
structure of interpersonal contacts into account. The level of complexity in
influencing and understanding such contagion phenomena rises with the
size and complexity of the social network. Mathematical models have great
potential to improve our understanding of these phenomena and to inform
policy discussions aimed at enhancing system performance.
3 See http://www.nytimes.com/2011/11/22/technology/between-you-and-me-4-74-degrees.html?_r=2&hp.

THE PROTEIN-FOLDING PROBLEM
Knowing the shape of a protein is an essential step in understanding
its biological function. In Nobel-prize winning work, biologist Christian
Anfinsen showed that an unfolded protein could refold spontaneously
to its original biologically active conformation. This observation led to
the famous conjecture that the one-dimensional sequence of amino acids
of a protein uniquely determines the protein’s three-dimensional struc-
ture. That in turn led to the almost 40-year effort of quantitative scien-
tists in searching for computational strategies and algorithms for solving
the so-called “protein-folding problem,” which is to predict a protein’s three-dimensional structure from its primary sequence information. Some
subproblems include how a native structure results from the interatomic
forces of the sequence of amino acids and how a protein can fold so fast.4
Although the protein-folding conjecture has been shown to be incorrect
for a certain class of proteins—for example, sometimes enzymes called
“chaperones” are needed to assist in the folding of a protein—scientists
have observed that more than 70 percent of the proteins in nature still fold
spontaneously, each into its unique three-dimensional shape.
In 2005, the protein-folding problem was listed by Science maga-
zine as one of the 125 grand unsolved scientific challenges. The impact
of solving the protein-folding problem is enormous. It will have a direct
and profound impact on our understanding of life: how these basic units,
which carry out almost every function in living cells, play their roles at the
fundamental physics level. Everything about the molecular mechanism of
protein motions—folding, conformational changes, and evolution—could
be revealed, and the whole cell could be realistically modeled. Furthermore,
such an advance would have a great influence on novel protein design and
rational drug design, which may revolutionize the pharmaceutical industry.
For example, drugs might be rather accurately designed on a computer
without much experimentation. Genetic engineering for improving the
function of particular proteins (and thus certain phenotypes of an indi-
vidual) could become realistic.
Conceptually, the protein-folding problem is straightforward: Given the
positions of all atoms in a protein (typically tens of thousands of atoms),
one would calculate the potential energy of the structure and then find a
configuration that minimizes that energy. However, such a goal is techni-
cally difficult to achieve owing to the extreme complexity of the means
by which the energy depends on the structure. A more attractive strat-
egy, termed “molecular dynamics,” has a clear physical basis: One uses
4 Dill et al., 2007, The protein folding problem: When will it be solved? Current Opinion in Structural Biology 17:342-346.

Newton’s law of motion to write down a set of differential equations, called Hamilton’s equations, to describe the locations and speeds of all the atoms involved in the protein structure at any instant. One can then numerically solve the protein structural motion equations, which not only predict
low-energy structures of a protein but also provide information on protein
movements and dynamics. To achieve the goal, one typically discretizes the
time and approximates the differential equations by difference equations.
Then, a molecular dynamics algorithm such as the leapfrog algorithm is used to integrate the equations of motion. However, because the system is
large and complex, the time step in the discretization must be sufficiently
small to avoid disastrous conflicts, which implies that the computation cost
is extremely high for simulating even the tiny fraction of a second that must
be simulated.
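The leapfrog scheme can be illustrated on a far simpler system than a protein: a single particle in a harmonic well. This is a hedged sketch of the integrator itself, not of any molecular dynamics package, and all parameters are illustrative:

```python
# Leapfrog (velocity Verlet) integration on a toy system: a particle in a
# harmonic well, force = -k*x. Positions and velocities "leapfrog" over
# each other at half-step offsets, which keeps the energy from drifting
# over long simulations.

def leapfrog(x, v, force, dt, steps, m=1.0):
    v += 0.5 * dt * force(x) / m       # initial half-step kick
    for _ in range(steps):
        x += dt * v                    # drift: full position update
        v += dt * force(x) / m         # kick: full velocity update
    v -= 0.5 * dt * force(x) / m       # roll velocity back half a step
    return x, v

k = 1.0
x, v = leapfrog(1.0, 0.0, lambda x: -k * x, dt=0.01, steps=10_000)

# The energy 0.5*v**2 + 0.5*k*x**2 should stay near its initial value 0.5.
print(0.5 * v**2 + 0.5 * k * x**2)
```

For a real protein, `force` would come from an interatomic energy function over tens of thousands of atoms, and the stability constraint on `dt` is what makes the computation so expensive.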
Another strategy is based on the fundamental principle of statistical
mechanics, which states that the probability of observing a particular structural state is proportional to its Boltzmann weight, which is of the form P(S) ∝ e^(−E(S)/kT), where E(S) is the potential energy of structural state S, k is Boltzmann’s constant, and T is the temperature. Thus,
one can use Monte Carlo methods to simulate the structural state S from
this distribution. However, because of the extremely high dimensionality and
complexity of the configuration space and also because of the complex en-
ergy landscape, simulating well from the Boltzmann distribution is still very
challenging. New Monte Carlo techniques are needed to make the simula-
tion more efficient, and these new methods could have a broader impact on
other areas of computation as well.
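A minimal sketch of Metropolis Monte Carlo sampling from a Boltzmann distribution follows, with a toy one-dimensional double-well “energy landscape” standing in for a protein’s configuration space (all parameters are illustrative):

```python
import math
import random

def energy(x):
    """Toy double-well potential standing in for a folding landscape."""
    return (x**2 - 1.0)**2

def metropolis(n_steps, kT=0.2, step=0.5, seed=0):
    """Metropolis sampler targeting P(x) proportional to exp(-energy(x)/kT)."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if rng.random() < math.exp(-(energy(x_new) - energy(x)) / kT):
            x = x_new
        samples.append(x)
    return samples

samples = metropolis(100_000)
# The chain should spend most of its time near the two minima x = -1, +1.
mean_abs = sum(abs(s) for s in samples) / len(samples)
print(mean_abs)
```

The difficulty the text describes is that a protein’s landscape has an astronomical number of dimensions and deep, rugged wells, so naive samplers like this one mix far too slowly; that is what motivates research on new Monte Carlo techniques.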
Both molecular dynamics and Monte Carlo methods rely on a good
energy function E(S). Although much effort has been made and many
insights have been gained, it is still an important challenge to accurately
model interatomic interactions, especially in realistic environments—for
example, immersed in water or attached to a membrane—in a practical
way. The overall approach is still not as precise as would be desired. Given
the availability of a large number of solved protein structures, there is still
room for applying certain statistical learning strategies to combine infor-
mation from empirical data and physics principles to further improve the
energy function.
In recent years, great success has been achieved in computational protein structure prediction by taking advantage of the fast-growing protein
structure databases. A well-known successful strategy is called “homology
modeling,” or template-based modeling, which can provide a good approxi-
mate fold for a protein that has a homologous “relative”—that is, one with
sequence similarity >30 percent—whose structure has been solved. Another
attractive strategy successfully combines empirical knowledge of protein
structures with Monte Carlo strategies for simulating protein folds. The
idea is to implement only those structural modifications that are supported

by observed structural folds in the database. So far, these bioinformatics-
based learning methods are able to predict well the folding of small globular
proteins and those proteins homologous to proteins of known structure.
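The notion of sequence similarity that underlies homology modeling can be illustrated with a toy percent-identity calculation on two already-aligned sequences; real pipelines use full alignment algorithms and curated structure databases, and the sequences below are invented:

```python
def percent_identity(a, b):
    """Percent of matching positions between two aligned, equal-length
    sequences (a crude stand-in for real sequence-similarity scores)."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

target   = "MKTAYIAKQR"   # invented amino-acid sequences, one letter each
template = "MKSAYIGKQR"
print(percent_identity(target, template))  # 80.0
```

A template whose identity with the target exceeds roughly 30 percent, as the text notes, usually provides a good approximate fold.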
Future challenges include the prediction of structures of medium to
large proteins, multidomain proteins, and transmembrane proteins. For
multidomain proteins, many statistical learning strategies have been devel-
oped to predict which domains tend to interact with one another based on
massive amounts of genomic and proteomic data. More developments in
conjunction with structure modeling will be needed. There are other more
technical, but also very important, challenges, such as how to rigorously
evaluate a new energy function or a sampling method; how to more objec-
tively represent a protein’s structure as an ensemble of structures instead
of a single representative; how to evaluate the role of entropies; and how
to further push the frontier of protein design. The potential impact of solv-
ing the protein-folding problem in its current form is muted by a grander,
much more challenging issue. That is, how to come up with a description
of protein folding and structure prediction, as well as of function and
mechanism, that accords with a quantum mechanical view of reality. A
protein’s dynamic properties, given that they presumably conform to the
laws of quantum electrodynamics, may exhibit unexpected, counterintuitive
behavior unlike anything that we have ever seen or that can be predicted
based on classical physics, which is the prevailing viewpoint now owing
to computational limitations. For example, when a protein folds, does it
utilize “quantum tunneling” to get around (imaginary) classical energy
barriers that would cause problems for a molecular dynamics program?
This is a mathematical and algorithmic problem that has been largely left
to one side by biologists because it appears to be intractable. Some highly
novel, outside-the-box mathematical concepts will likely be required if one
is to overcome these limitations. An important challenge for mathemati-
cal biologists is to help discover additional protein properties, beyond the
handful identified thus far, that are nonclassical and thus counterintuitive.
Currently, statistical approaches have provided an indirect way of attack-
ing such problems and a way that is analogous to the statistical approaches
used by classical geneticists to work around their lack of molecular bio-
logical and cytological methods. In a similar way, statistical approaches,
applied in an evolutionary context, may augment our current arsenal of
experimental and theoretical methods for understanding protein folding
and predicting protein structure, function, and mechanisms.
THE FUNDAMENTAL LEMMA
The fundamental lemma is a seemingly obscure combinatorial identity
introduced by Robert Langlands in 1979, as a component of what is now

VITALITY OF THE MATHEMATICAL SCIENCES 47
Eugene Wigner to wonder what accounts for the unreasonable effective-
ness of mathematics in physics.
A more recent version of the same basic pattern is the Yang-Mills
theory. Here again the physicists were struggling to develop a mathematical
framework to handle the physical concepts they were developing, when in
fact the mathematical framework, which in mathematics is known as con-
nections on principal bundles and curvature, had already been introduced
for mathematical reasons. Much of the recent history of quantum field
theory has turned this model of interaction on its head. When quantum
field theory was introduced in the 1940s and 1950s there was no appro-
priate mathematical context. Nevertheless, physicists were able to develop
the art of dealing with these objects, at least in special cases. This line
of reasoning, using as a central feature the Yang-Mills theory, led to the
standard model of theoretical physics, which makes predictions that have
been checked by experiment to enormous precision. Nevertheless, there was
not then and still is not today a rigorous mathematical context for these
computations. The situation became even worse with the advent of string
theory, where the appropriate mathematical formulation seems even more
remote. But the fact that the mathematical context for these theories did
not exist and has not yet been developed is only part of the way that the
current interactions between mathematics and physics differ from previous
ones. As physicists develop and explore these theories, for which no rigor-
ous mathematical formulation is known, they have increasingly used ever
more sophisticated geometric and topological structures. In doing so, they
come across mathematical questions
and statements about the underlying geometric and topological objects in
terms of which the theories are defined. Some of these statements are well-
known mathematical results, but many turn out to be completely new types
of mathematical statements.
These statements, conjectures, and questions have been one of the main
forces driving geometry and topology for the last 20 to 25 years. Some of
them have been successfully verified mathematically; some have not been
proved but the mathematical evidence for them is overwhelming; and some
are completely mysterious mathematically. One of the first results along
these lines gave a formula for the number of rational curves of each degree
on a general hypersurface in complex projective four-dimensional space given
by a homogeneous degree-5 polynomial. Physics arguments produced a general
formula for the number of such curves, where the formula came from a com-
pletely different area of mathematics (power series solutions to certain ordi-
nary differential equations). Before the input from physics, mathematicians
had computed the answer for degrees 1 through 5 but had no conjecture for
the general answer. The physics arguments provided the general formula,
and this was later verified by direct mathematical argument.
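For reference, the first few of these degree-by-degree curve counts on the quintic hypersurface, as predicted by the physics formula of Candelas, de la Ossa, Green, and Parkes and later confirmed mathematically, are:

```latex
n_1 = 2875, \qquad n_2 = 609{,}250, \qquad n_3 = 317{,}206{,}375
```

These numbers are quoted from the mirror-symmetry literature rather than derived here; $n_1$ is the classical count of lines on a quintic threefold, known since the nineteenth century.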


48 THE MATHEMATICAL SCIENCES IN 2025
These direct mathematical arguments gave no understanding of the
original physics insight that connected the formula with solutions to an or-
dinary differential equation. Indeed, finding such a connection is one of the
central problems today in geometry. Many mathematicians work on aspects
of this problem, and there are partial hints but no complete understanding,
even conjecturally. This statement characterizes much of the current input
of physics into mathematics. It seems clear that the physicists are on the
track of a deeper level of mathematical understanding that goes far beyond
our current knowledge, but we have only the smallest hints of what that
might be. Understanding this phenomenon is a central problem both in
high-energy theoretical physics and in geometry and topology.
Nowadays, sophisticated mathematics is essential for stating many of
the laws of physics. As mentioned, the formulation of the standard model
of particle physics involves “gauge theories,” or fiber bundles. These have
a very rich topology. These topological structures are described by Chern-
Simons theories, index theory, and K-theory. These tools are also useful for
condensed matter systems. They characterize topological phases of matter,
which could offer an avenue for quantum computing.12 Here the qubits are
encoded into the subtle topology of the fiber bundle, described by Chern-
Simons theory. Recently, K-theory has been applied to the classification of
topological insulators,13 another active area of condensed matter physics.
String theory and mathematics are very closely connected, and research
in these areas often straddles physics and mathematics. One recent develop-
ment, the gauge gravity duality, or AdS/CFT, has connected general relativ-
ity with quantum field theories, the theories we use for particle physics.14
The gravity theory lives in hyperbolic space. Thus, many developments in
hyperbolic geometry, and black holes, could be used to describe certain
strongly interacting systems of particles. Thinking along these lines has
connected a certain long-distance limit of gravity equations to the equations
of hydrodynamics. One considers a particular black-hole, or black-brane,
solution of Einstein’s equations with a negative cosmological constant.
These black holes have long-distance excitations representing small fluc-
tuations of the geometry. The fluctuations are damped since they end up
being swallowed by the black hole. According to AdS/CFT, this system is
described by a thermal system on the boundary, a thermal fluid of quantum
interacting particles. In this formulation, the long-distance perturbations
are described by hydrodynamics, namely by the Navier-Stokes equation
12 A. Yu. Kitaev, 2003, Fault-tolerant quantum computation by anyons. Annals of Physics
303:2-30.
13 A.P. Schnyder, S. Ryu, A. Furusaki, and A.W.W. Ludwig, 2008, Classification of
topological insulators and superconductors in three spatial dimensions. Physical Review
B 78:195125.
14 J. Maldacena, 2005, The illusion of gravity. Scientific American October 24.

or its relativistic analog. The viscosity term in this equation produces
the damping of excitations, and it is connected with the waves falling into
the black hole. Computing the viscosity in the particle theory from first
principles is very difficult. However, it is very simple from the point of view
of Einstein’s equations, because it is given by a purely geometric quantity:
the area of the black hole horizon. This has been used to qualitatively
model strongly interacting systems of quantum particles. These range from
the quark-gluon fluids that are produced by heavy ion collisions (at the
Relativistic Heavy Ion Collider at Brookhaven National Laboratory or at
the Large Hadron Collider at Geneva) to high-temperature superconductors
in condensed matter physics.
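For theories with such a gravity dual, this geometric computation yields a universal value for the ratio of shear viscosity to entropy density, the celebrated Kovtun-Son-Starinets result:

```latex
\frac{\eta}{s} = \frac{\hbar}{4\pi k_B}
```

This value is remarkably small, which is why the quark-gluon fluids mentioned above behave as nearly perfect fluids.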
Many examples of AdS/CFT involve additional structures, such as
supersymmetry. In these cases the geometry obeys special constraints, giv-
ing rise to Sasaki-Einstein spaces, which are closely related to Calabi-Yau
spaces. This is merely one example of a more general trend of developing
connections between matrix models, algebraic curves, and supersymmetric
quantum field theory.
A recent development of the past decade has been the discovery of
integrability in N = 4 super-Yang-Mills. This four-dimensional quantum
field theory is the most symmetric quantum field theory. The study of this
highly symmetrical example is very useful since it will probably enable us
to find some underlying structures common to all quantum gauge theories.
Integrability implies the existence of an infinite-dimensional set of symme-
tries in the limit of a large number of colors. In this regime the particles of
the theory, or gluons, form a sort of necklace. The symmetry acts on these
states and allows us to compute their energies exactly as a function of the
coupling. The deep underlying mathematical structures are only starting to
be understood. Integrability in so-called (1 + 1)-dimensional systems has led
to the development of quantum groups and other interesting mathematics.
The way integrability appears here is somewhat different, and it is quite
likely that it will lead to new mathematics. A closely related area is the
computation of scattering amplitudes in this theory. A direct approach us-
ing standard methods, such as Feynman diagrams, quickly becomes very
complicated. On the other hand, there are new methods showing that
the actual answers are extremely simple and have a rich structure that is
associated with the mathematics of Grassmannians. This has led to another
fruitful collaboration.15
The connection between theoretical physics and mathematics is grow-
ing ever stronger, and it is supported by the emergence of interdisciplinary
centers, such as the Simons Center for Geometry and Physics at Stony
15 See, for example, A.B. Goncharov, M. Spradlin, C. Vergu, and A. Volovich, 2010, Clas-
sical polylogarithms for amplitudes and Wilson loops. Physical Review Letters 105:151605.

Brook, and math/physics initiatives at various universities, such as Duke
University, the University of Pennsylvania, and the University of California
at Santa Barbara.
NEW FRONTIERS IN STATISTICAL INFERENCE
We live in a new age for statistical inference, where technologies now
produce high-dimensional data sets, often with huge numbers of measure-
ments on each of a comparatively small number of experimental units.
Examples include gene expression microarrays monitoring the expression
levels of tens of thousands of genes at the same time and functional mag-
netic resonance imaging machines monitoring the blood flow in various
parts of the brain. The breathtaking increases in data-acquisition capa-
bilities are such that millions of parallel data sets are routinely produced,
each with its own estimation or testing problem. This era of scientific mass
production calls for novel developments in statistical inference, and it has
inspired a tremendous burst in statistical methodology. More importantly,
the data flood completely transforms the set of questions that needs to be
answered, and the field of statistics has, accordingly, changed profoundly
in the last 15 years. The shift is so strong that the subjects of contemporary
research now have very little to do with general topics of discussion from
the early 1990s.
High dimensionality refers to an estimation or testing problem in
which the number of parameters about which we seek inference is about
the same as, or much larger than, the number of observations or samples
we have at our disposal. Such problems are everywhere. In medical re-
search, we may be interested in determining which genes might be associ-
ated with prostate cancer. A typical study may record expression levels of
thousands of genes for perhaps 100 men, of whom only half have prostate
cancer and half serve as a control set. Here, one has to test thousands of
hypotheses simultaneously in order to discover a small set of genes that
could be investigated for a causal link to cancer development. Another ex-
ample is genomewide association studies where the goal is to test whether
a variant in the genome is associated with a particular phenotype. Here
the subjects in a study typically number in the tens of thousands and the
number of hypotheses may be anywhere from 500,000 to 2.5 million. If we
are interested in a number of phenotypes, the number of hypotheses can
easily rise to the billions and trillions, not at all what the early literature
on multiple testing had in mind.
In response, the statistical community has developed groundbreak-
ing techniques such as the false discovery rate (FDR) of Benjamini and
Hochberg, which proposes a new paradigm for multiple comparisons and
has had a tremendous impact not only on statistical science but also in the

medical sciences and beyond.16 In a nutshell, the FDR procedure controls
the expected ratio between the number of false rejections and the total
number of rejections. Returning to our example above, this allows the stat-
istician to return a list of genes to the medical researcher assuring her that
she should expect at most a known fraction of these genes, say 10 percent,
to be “false discoveries.” This new paradigm has been extremely successful,
for it enjoys increased power (the ability to make true discoveries) while
simultaneously safeguarding against false discoveries. The FDR methodol-
ogy assumes that the hypotheses being tested are statistically independent
and that the data distribution under the null hypothesis is known. These
assumptions may not always be valid in practice, and much of statistical
research is concerned with extending statistical methodology to these chal-
lenging setups. In this direction, recent years have seen a resurgence of
empirical Bayes techniques, made possible by the onslaught of data and
providing a powerful framework and new methodologies to deal with some
of these issues.
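The Benjamini-Hochberg procedure itself is simple enough to sketch in a few lines of code. The following minimal Python rendering (the function name and toy p-values are illustrative, not from any particular study) sorts the p-values, finds the largest rank k whose p-value falls below q·k/m, and rejects the k smallest:

```python
def benjamini_hochberg(pvals, q=0.10):
    """Benjamini-Hochberg step-up procedure.

    Returns the indices of the hypotheses rejected at FDR level q.
    Assumes independent tests, as in the original 1995 procedure.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank                  # largest rank passing the threshold
    return sorted(order[:k])          # indices of rejected hypotheses

# Toy example: a few strong signals among many nulls.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.5, 0.7, 0.9]
print(benjamini_hochberg(pvals, q=0.10))   # rejects the six smallest
```

Note the step-up character of the rule: hypothesis 5 (p = 0.06) is rejected even though p-values 3 and 4 fail their individual thresholds, because a later rank passes.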
Estimation problems are also routinely high-dimensional. In a genetic
association study, n subjects are sampled and one or more quantitative
traits, such as cholesterol level, are recorded. Each subject is also measured
at p locations on the chromosomes. For instance, one may record a value
(0, 1, or 2) indicating the number of copies of the less-common allele
observed. To find genes exhibiting a detectable association with the trait,
one can cast the problem as a high-dimensional regression problem. That
is to say, one seeks to express the response of interest (cholesterol level) as
a linear combination of the measured genetic covariates; those covariates
with significant coefficients are linked with the trait.
The issue is that the number n of samples (equations) is in the thou-
sands while the number p of covariates is in the hundreds of thousands.
Hence, we have far fewer equations than unknowns, so what shall we
do? This is a burning issue because such underdetermined systems arise
everywhere in science and engineering. In magnetic resonance imaging, for
example, one would like to infer a large number of pixels from just a small
number of linear measurements. In many problems, however, the solution is
assumed to be sparse. In the example above, it is known that only a small
number of genes can potentially be associated with a trait. In medical im-
aging, the image we wish to form typically has a concise description in a
carefully chosen representation.
In recent years, statisticians and applied mathematicians have devel-
oped a flurry of highly practical methods for such sparse regression prob-
lems. Most of these methods rely on convex optimization, a field that has
16 As of January 15, 2012, Google Scholar reported 12,861 scientific papers citing the
original article of Benjamini and Hochberg.

registered spectacular advances in the last 15 years. By now there is a
large literature explaining (1) when one can expect to solve a large
underdetermined system by L1 minimization and when it is not possible and
(2) when accurate statistical estimation is possible. Beyond the tremendous
technical achievements, which have implications for nearly every field of
science and technology, this modern line of research suggests a complete
revision of those metrics with which the accuracy of statistical estimates
is evaluated. Whereas classical asymptotic approaches study the size of
errors when the number of parameters is fixed and the sample size goes to
infinity, modern asymptotics is concerned with situations in which both the
number of parameters p and the number of observations n tend to infinity, perhaps in
a fixed ratio, or with p growing at most polynomially in n. Further, the
very concept of errors has to be rethought. The question is not so much
whether asymptotic normality holds but whether the right variables have
been selected. In a sea of mostly irrelevant variables, collected because data
are now so readily acquired in many contexts, what is the minimum signal-
to-noise ratio needed to guarantee that the variables of true importance can
be identified?
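In the special case of an orthonormal design, the L1-penalized least-squares (lasso) solution has a closed form: each ordinary least-squares coefficient is soft-thresholded toward zero, which is exactly what produces exact zeros and hence variable selection. A minimal sketch (the coefficient values are invented for illustration):

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: the proximal map of the L1 norm."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_orthogonal(beta_ols, lam):
    """Lasso solution when the design matrix has orthonormal columns:
    each OLS coefficient is simply soft-thresholded at lam."""
    return [soft_threshold(b, lam) for b in beta_ols]

# OLS estimates: two strong effects buried among small noise coefficients.
beta_ols = [2.5, -0.1, 0.05, 1.8, -0.07]
print(lasso_orthogonal(beta_ols, lam=0.2))   # noise coefficients set to 0
```

For general designs there is no closed form, and one resorts to iterative schemes such as coordinate descent, each step of which applies this same soft-thresholding operator.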
Accurate estimation from high-dimensional data is not possible unless
one assumes some structure such as sparsity, discussed above. Statisticians
have studied other crucial structures that permit accurate estimation, again,
from what appear to be incomplete data. These include the estimation of
low-rank matrices, as in the famous Netflix problem, where the goal is to
predict preferences for movies a user has not yet seen; the estimation of
large covariance matrices or graphical models under the assumption that
the graph of partial correlations is sparse; or the resolution of X-ray diffrac-
tion images from magnitude measurements only. The latter is of paramount
importance in a number of applications where a detector can collect only
the intensity of an optical wave but not its phase. In short, modern statistics
is at the forefront of contemporary science since it is clear that progress will
increasingly rely on statistical and computational tools to extract informa-
tion from massive data sets.
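To convey the low-rank idea concretely, here is a toy alternating-least-squares sketch for rank-1 matrix completion; the real Netflix-scale methods use nuclear-norm relaxations or regularized factorizations, and the data below are made up:

```python
def rank1_complete(M, mask, iters=200):
    """Alternating least squares for rank-1 completion: fit
    M[i][j] ~ u[i]*v[j] using only the observed entries (mask),
    alternately solving for u with v fixed and vice versa."""
    m, n = len(M), len(M[0])
    u = [1.0] * m
    v = [1.0] * n
    for _ in range(iters):
        for i in range(m):
            num = sum(M[i][j] * v[j] for j in range(n) if mask[i][j])
            den = sum(v[j] * v[j] for j in range(n) if mask[i][j])
            u[i] = num / den
        for j in range(n):
            num = sum(M[i][j] * u[i] for i in range(m) if mask[i][j])
            den = sum(u[i] * u[i] for i in range(m) if mask[i][j])
            v[j] = num / den
    return [[u[i] * v[j] for j in range(n)] for i in range(m)]

# Exactly rank-1 data (outer product of [1,2,3] and [4,5,6]) with two
# entries hidden; the 0s stand for unobserved values.
M    = [[4, 5, 0], [8, 10, 12], [0, 15, 18]]
mask = [[True, True, False], [True, True, True], [False, True, True]]
R = rank1_complete(M, mask)
print(round(R[0][2], 3), round(R[2][0], 3))   # hidden entries, ~6.0 and ~12.0
```

Because the observed entries form a connected pattern and the underlying matrix is exactly rank 1, the completion is unique and the iteration recovers it.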
Statistics has clearly taken on a pivotal role in the era of massive data
production. In the area of multiple comparisons, novel methodologies have
been widely embraced by applied communities. Ignoring statistical issues
is a dangerous proposition readily leading to flawed inference, flawed sci-
entific discoveries, and nonreproducible research. In the field of modern
regression, methods and findings have inspired a great number of communi-
ties, suggesting new modeling tools and new ways to think about informa-
tion retrieval. The field is still in its infancy, and much work remains to be
done. To give one idea of a topic that is likely to preoccupy statisticians
in the next decade, one could mention the problem of providing correct
inference after selection. Conventional statistical inference requires that a

model be known before the data are analyzed. Current practice, however,
typically selects a model after data analysis. Standard statistical tests and
confidence intervals applied to the selected parameters are in general com-
pletely erroneous. Statistical methodology providing correct inference after
massive data snooping is urgently needed.
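A tiny simulation makes the danger concrete: test the largest of many null z-statistics as if it had been chosen in advance, and the nominal 5 percent error rate is wildly exceeded. The setup below is purely illustrative:

```python
import random

def naive_selection_error_rate(m=100, trials=2000, seed=0):
    """Simulate m null z-statistics, select the largest, and test it
    with a naive two-sided 5 percent z-test as if it had been
    prechosen. Ignoring the selection step inflates the error rate
    far above the nominal level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        zmax = max(rng.gauss(0.0, 1.0) for _ in range(m))
        if zmax > 1.96:               # naive 5 percent threshold
            hits += 1
    return hits / trials

print(naive_selection_error_rate())   # far above the nominal 0.05
```

With 100 null statistics per trial the selected maximum exceeds 1.96 in roughly nine trials out of ten, so a naive test declares a "discovery" almost every time even though nothing is there.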
ECONOMICS AND BUSINESS: MECHANISM DESIGN
Mechanism design is a subject with a long and distinguished history.
A typical problem is to design games that create incentives to produce a
desired outcome (such as revenue or social welfare maximization).
Recent developments highlight the need to add computational consider-
ations to classic mechanism-design problems.
Perhaps the most important example is the sale of advertising space on
the Internet, which is the primary source of revenue for many providers of
online services. According to a recent report, online advertising spending
continues to grow at double-digit rates, with $25.8 billion having been spent
in online advertising in the United States in 2010.17 The success of online
advertising is due, in large part, to providers’ ability to tailor advertisements
to the interests of individual users as inferred from their search behavior.
However, each search query generates a new set of advertising spaces to be
sold, each with its own properties determining the applicability of different
advertisements, and these ads must be placed almost instantaneously. This
situation complicates the process of selling space to potential advertisers.
There has been significant progress on computationally feasible mecha-
nism design on many fronts. Three highlights of this research so far are the
following:
• Understanding the computational difficulty of finding Nash equi-
libria. Daskalakis, Goldberg, and Papadimitriou won the Game
Theory Society’s Prize in Game Theory and Computer Science in
2008 for this work.
• Quantifying the loss of efficiency of equilibria in games that do
not perfectly implement a desired outcome, which is referred to as
the price of anarchy. While the initial successes concerned games
such as load balancing and routing, recent work relevant to online
auctions also has great potential.
• Enabling computationally feasible mechanism design by developing
techniques that approximately implement a desired outcome.
17 Data from emarketer.com. Available at http://www.emarketer.com/PressRelease.aspx?R=1008096. Accessed March 14, 2012.
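For readers new to the area, the canonical example of a mechanism with good incentive properties is the sealed-bid second-price (Vickrey) auction, a simplified relative of the generalized second-price auctions used to sell search advertising. A minimal sketch (bid values invented):

```python
def vickrey_auction(bids):
    """Sealed-bid second-price auction: the highest bidder wins but
    pays the second-highest bid. This pricing rule makes truthful
    bidding a dominant strategy. Requires at least two bids."""
    ranked = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner = ranked[0]
    price = bids[ranked[1]]
    return winner, price

print(vickrey_auction([4.0, 7.5, 6.0]))   # bidder 1 wins and pays 6.0
```

The computational difficulties discussed above arise when such mechanisms must allocate many interdependent items (ad slots per query) essentially instantaneously, where exact optimization is infeasible.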

MATHEMATICAL SCIENCES AND MEDICINE
The mathematical sciences contribute to medicine in a great many
ways, including algorithms for medical imaging, computational methods
related to drug discovery, models of tumor growth and angiogenesis, health
informatics, comparative effectiveness research, epidemic modeling, and
analyses to guide decision making under uncertainty. With the increasing
availability of genomic sequencing and the transition toward more wide-
spread use of electronic health record systems, we expect to see more prog-
ress toward medical interventions that are tailored to individual patients.
Statisticians will be deeply involved in developing those capabilities.
To illustrate just one way in which these endeavors interact, consider
some mathematical science challenges connected to the diagnosis and plan-
ning of surgery for cardiac patients. One of the grand challenges of com-
putational medicine is how to construct an individualized model of the
heart’s biology, mechanics, and electrical activity based on a series of
measurements taken over time. Such models can then be used for diagnosis
or surgical planning to lead to better patient outcomes. Two basic math-
ematical tasks are fundamental to this challenge. Both are much-studied
problems in applied mathematics, but they need to be carefully adapted to
the task at hand.
The first of these tasks is to extract cardiac motion from a time-varying
sequence of three-dimensional computerized tomography (CT) or magnetic
resonance imaging (MRI) patient images. This is achieved by solving the
so-called deformable image registration problem, a problem that comes up
over and over in medical imaging. To solve this problem—to effectively
align images in which the principal subject may have moved—one needs to
minimize a chosen “distance” between the image intensity functions of im-
ages taken at different times. Unfortunately, this problem is ill-posed: there
are many different maps that minimize the distance between the two im-
ages, most of which are not useful for our purposes. To tease out the appro-
priate mapping, one must choose a “penalty function” for the amount of
stretching that is required to bring the successive images into approximate
alignment. Finding the right penalty function is a very subtle task that relies
on concepts and tools from a branch of core mathematics called differential
geometry. Once a good penalty function has been applied, the work also
requires efficient computational algorithms for the large-scale calculations.
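The structure of the registration problem, a data-fidelity term plus a penalty that rules out unreasonable deformations, can be illustrated in one dimension, with integer shifts standing in for general deformations. Everything below is a made-up toy, not the clinical algorithm:

```python
def register_shift(fixed, moving, max_shift, lam):
    """Toy 1-D 'registration': find the integer shift of `moving` that
    minimizes squared intensity distance to `fixed` plus a quadratic
    penalty lam*s**2 on the shift size (a stand-in for the smoothness
    penalties used in deformable image registration)."""
    n = len(fixed)
    best_cost, best_s = float("inf"), 0
    for s in range(-max_shift, max_shift + 1):
        dist = sum((fixed[i] - moving[i + s]) ** 2
                   for i in range(n) if 0 <= i + s < n)
        cost = dist + lam * s * s
        if cost < best_cost:
            best_cost, best_s = cost, s
    return best_s

fixed  = [0, 0, 1, 5, 1, 0, 0, 0]
moving = [0, 0, 0, 0, 1, 5, 1, 0]   # same profile, shifted by two samples
print(register_shift(fixed, moving, max_shift=3, lam=0.1))   # 2
```

Real deformable registration replaces the single shift parameter with a smooth map in three dimensions and replaces the brute-force search with large-scale optimization, but the trade-off between image mismatch and deformation penalty is the same.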
The second mathematical task is to employ the extracted cardiac mo-
tion as observational data that drive the solution of an inverse problem. In
this case, the inverse problem is to infer the parameters for the bioelectro-
mechanical properties of the cardiac model based on the motion observed
externally through the imaging. The cardiac biophysical model draws on
another area of mathematics—partial differential equations—and must

bring together multiple physics components: elasticity of the ventricular
wall, electrophysiology, and active contraction of the myocardial fibers.
The full-blown setting of this problem is analogous to a “blind
deconvolution” problem, in the sense that neither the model nor the source
is fully known. As such, this presents enormous difficulty for the inversion
solvers; as in the image registration case, it requires careful formulation and
regularization, as well as large-scale computational solvers that are tolerant
of ill-conditioning. Recent research18 is following a hybrid approach that
interweaves the solution of the image registration and model determination
problems.
COMPRESSED SENSING
The story of compressed sensing is an example of the power of the
mathematical sciences and of their dynamic relationship with science and
engineering. As is often the case, the development of novel mathematics can
be inspired by an important scientific or engineering question. Then, math-
ematical scientists develop abstractions and quantitative models to solve
the original problem, but the conversion into a more abstract setting can
also supply insight to other applications that share a common mathematical
structure. In other words, there is no need to reinvent the wheel for each
instantiation of the problem.
Compressed sensing was motivated by a great question in MRI, a medi-
cal imaging technique used in radiology to visualize detailed internal struc-
tures. MRI is a wonderful tool with several advantages over other medical
imaging techniques such as CT or X-rays. However, it is also an inherently
slow data-acquisition process. This means that it is not feasible to acquire
high-quality scans in a reasonable amount of time, or to acquire dynamic
images (videos) at a decent resolution. In pediatrics for instance, the impact
of MRI on children’s health is limited because, among other things, children
cannot remain still or hold their breath for long periods of time, so that it
is impossible to achieve high-resolution scans. This could be overcome by,
for example, using anesthesia that is strong enough to stop respiration for
several minutes, but clearly such procedures are dangerous.
Faster imaging can be achieved by reducing the number of data points
that need to be collected. But common wisdom in the field of biomedical
imaging maintained that skipping sample points would result in informa-
tion loss. A few years ago, however, a group of researchers turned signal
processing upside down by showing that high-resolution imaging was pos-
18 H. Sundar, C. Davatzikos, and G. Biros, 2009, Biomechanically-constrained 4D estimation of myocardial motion. Medical Image Computing and Computer-Assisted Intervention (MICCAI):257-265.

sible from just a few samples. In fact, they could recover high-resolution
pictures even when an MRI is not given enough time to complete a scan. To
quote from Wired, “That was the beginning of compressed sensing, or CS,
the paradigm-busting field in mathematics that’s reshaping the way people
work with large data sets.”19
Despite being only a few years old, compressed sensing algorithms are
already in use in some form in several hospitals in the country. For example,
compressed sensing has been used clinically for over 2 years at Lucile
Packard Children’s Hospital at Stanford. This new method produces sharp
images from brief scans. The potential for this method is such that both
General Electric and Philips have medical imaging products
in the pipeline that will incorporate compressed sensing.
However, what research into compressed sensing discovered is not just
a faster way of getting MR images. It revealed a protocol for acquiring in-
formation, all kinds of information, in the most efficient way. This research
addresses a colossal paradox in contemporary science, in that many proto-
cols acquire massive amounts of data and then discard much of it, without
much or any loss of information, through a subsequent compression stage,
which is usually necessary for storage, transmission, or processing purposes.
Digital cameras, for example, collect huge amounts of information and then
compress the images so that they fit on a memory card or can be sent over
a network. But this is a gigantic waste. Why bother collecting megabytes of
data when we know very well that we will throw away 95 percent of it? Is
it possible to acquire a signal in already compressed form? That is, can we
directly measure the part of the signal that carries significant information
and not the part of the signal that will end up being thrown away? The
surprise is that mathematical scientists provided an affirmative answer. It was
unexpected and counterintuitive, because common sense says that a good
look at the full signal is necessary in order to decide which bits one should
keep or measure and which bits can be ignored or discarded. This view,
although intuitive, is wrong. A very rich mathematical theory has emerged
showing when such compressed acquisition protocols are expected to work.
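The flavor of such recovery can be conveyed with a greedy toy algorithm, matching pursuit, even though the guarantees in this theory are usually proved for convex (L1-minimization) programs. The matrix, signal, and names below are all invented for illustration:

```python
def matching_pursuit(A, y, steps):
    """Greedy sparse recovery from m < n linear measurements y = A x:
    repeatedly add the column of A (assumed unit-norm) most correlated
    with the current residual, and subtract its contribution."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    r = list(y)                        # residual starts at y
    for _ in range(steps):
        corr = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        j = max(range(n), key=lambda j: abs(corr[j]))
        x[j] += corr[j]
        for i in range(m):
            r[i] -= corr[j] * A[i][j]
    return x

# Two measurements of a length-4 signal with a single nonzero entry:
# y = A x_true for x_true = [0, 0, 3, 0].
A = [[1.0, 0.0, 0.6, 0.8],
     [0.0, 1.0, 0.8, -0.6]]
y = [1.8, 2.4]
print(matching_pursuit(A, y, steps=1))   # recovers x_true
```

Even this crude greedy step recovers the sparse signal exactly here; the mathematical theory referred to above characterizes when L1-based convex programs succeed for realistic measurement ensembles and sparsity levels.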
This mathematical discovery is already changing the way engineers
think about signal acquisition in areas ranging from analog-to-digital
conversion to digital optics and seismology. In communication and electronic
intelligence, for instance, analog-to-digital conversion is key to transducing
information from complex radiofrequency environments into the digital
domain for analysis and exploitation. In particular, adversarial communi-
cations can hop from frequency to frequency. When the frequency range
is large, no analog-to-digital converter (ADC) is fast enough to scan the
19 Jordan Ellenberg, 2010, Fill in the blanks: Using math to turn lo-res datasets into hi-res samples. Wired, March.

full range, and surveys of high-speed ADC technologies show that they
are advancing at a very slow rate. However, compressed sensing ideas
show that such signals can be acquired at a much lower rate, and this has
led to the development of novel ADC architectures aiming at the reliable
acquisition of signals that are in principle far outside the range of current
data converters. In the area of digital optics, several systems have been
designed. Guided by compressed sensing research, engineers have more
design freedom in three dimensions: (1) they can consider high-resolution
imaging with far fewer sensors than were once thought necessary, dramati-
cally reducing the cost of such devices; (2) they can consider designs that
speed up signal acquisition time in microscopy by orders of magnitude,
opening up new applications; and (3) they can sense the environment with
greatly reduced power consumption, extending sensor life. Remarkably, a
significant fraction of this work takes place in industry, and a number of
companies are already engineering faster, cheaper, and more-efficient sen-
sors based on these recently developed mathematical ideas.
Not only is compressed sensing one of the most applicable theories
coming out of the mathematical sciences in the last decade, but it is also
very sophisticated mathematically. Compressed sensing uses techniques of
probability theory, combinatorics, geometry, harmonic analysis, and op-
timization to shed new light on fundamental questions in approximation
theory: How many measurements are needed to recover an object of inter-
est? How is recovery possible from a minimal number of measurements?
Are there tractable algorithms to retrieve information from condensed
measurements? Compressed sensing research involves the development of
mathematical theories, the development of numerical algorithms and com-
putational tools, and the implementation of these ideas into novel hard-
ware. Thus, progress in the field involves a broad spectrum of scientists and
engineers, and core and applied mathematicians, statisticians, computer sci-
entists, circuit designers, optical engineers, radiologists, and others regularly
gather to attend scientific conferences together. This produces a healthy
cycle in which theoretical ideas find new applications and where applica-
tions renew theoretical mathematical research by offering new problems
and suggesting new directions.