1
Problem Definition and History

INTRODUCTION

This report identifies the major scientific and technical challenges in four fields of science and engineering that are critically dependent on high-end capability computing (HECC), and it characterizes the ways in which they depend on HECC. The committee thinks of its study as a gaps analysis, because it looked at the science and engineering “pull” on computing rather than the technology “push” enabled by computing. This perspective complements the more typical one, in which a new or envisioned computing capability arises and is followed by studies on how to profit from it.

It is generally accepted that computer modeling and simulation offer substantial opportunities for scientific breakthroughs that cannot otherwise—using laboratory experiments, observations, or traditional theoretical investigations—be realized. At many of the research frontiers discussed in this report, computational approaches are essential to continued progress and will play an integral and essential role in much of twenty-first century science and engineering. This is the inevitable result of decades of success in pushing those frontiers to the point where the next logical step is to characterize, model, and understand great complexity.

It became apparent during the committee’s analysis of the charge, especially task (b), that “high-end capability computing” needs to be interpreted broadly. A 2005 report from the National Research Council (NRC, 2005) defines capability computing as “the use of the most powerful supercomputers to solve the largest and most demanding problems.”1 The implication is that capability computing expands the range of what is possible computationally. In computationally mature fields, expanding that range is generally accomplished by such steps as increasing processing power and memory and improving algorithmic efficiency. The emphasis is on the most powerful supercomputers. But in other fields, computationally solving the “largest and most demanding problems” can be limited by factors other than availability of extremely powerful supercomputers. In those situations, capability computing is still whatever expands the range of what is possible computationally, but it might not center around the use of “the most powerful” supercomputers.

1. Quote from p. 279 of the NRC report.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.





Thus, in this report, capability computing is interpreted as computing that enables some new science or engineering capability—an insight or means of investigation that had not previously been available. In that context, high-end capability computing is distinguished by its ability to enable science and engineering investigations that would otherwise be infeasible.

HECC is also distinguished by its nonroutine nature. As the term “capability computing” is normally used, this corresponds to computing investments that are more costly and risky than somewhat time-tested “capacity computing.” This report retains that distinction: HECC might require extra assistance—for example, to help overcome the frailties encountered with a first-time implementation of a new model or software—and entail more risk than more routine uses of computing. But the committee’s focus on increasing scientific and engineering capabilities means that it must interpret “computing investments” to include whatever is needed to develop those nonroutine computational capabilities.

The goal is to advance the fields. Progress might be indicated by some measure of computational capability, but that measure is not necessarily just processing power. Targeted investments might be needed to stimulate the development of some of the components of HECC infrastructure, and additional resources might be necessary to give researchers the ability to push the state of the art of computing in their discipline. These investments would be for the development of mathematical models, algorithms, software, hardware, facilities, training, and support—any and all of the foundations for progress that are unlikely to develop optimally without such investments, given the career incentives that prevail in academe and the private sector.
Given that the federal government has accepted this responsibility (see the section below on history), it is faced with a policy question: How much HECC infrastructure is needed, and of what kind? To answer this, the committee sought and analyzed information that would give it two kinds of understanding:

1. An understanding of the mix of research topics that is desirable for the nation. Each field explicitly or implicitly determines this for itself through its review of competing research proposals and its sponsorship of forward-looking workshops and studies. Federal policy makers who define programs of research support and decide on funding levels are involved as well.

2. An understanding of the degree to which nationally important challenges in science and engineering can best be met through HECC. Answering the question about what kind of infrastructure is needed requires an understanding of a field’s capacity for making use of HECC in practice, not just potential ways HECC could contribute to the field. In that spirit, this report considers “HECC infrastructure” very broadly, to encompass not just hardware and software but also training, incentives, support, and so on.

This report supplies information that is needed to gain the insights described in (1) and (2) above. It also suggests to policy makers a context for weighing the information and explains how to work through the issues for four disparate fields of science and engineering. The report does not, however, present enough detail to let policy makers compare the value to scientific progress of investments in HECC with that of investments in experimental or observational facilities. Such a comparison would require estimating the cost of different options for meeting a particular research challenge (not across a field) and then weighing the likelihood that each option would bring the desired progress.
Computational science and engineering is a very broad subject, and this report cannot cover all of the factors that affect it. Among the topics not covered are the following, which the committee recognizes as important but which it could not address:

• The current computing capabilities available for the four fields investigated and specific projections about the computations that would be enabled by a petascale machine.
• The need for computing resources beyond today’s emerging capabilities and desirable features of future balanced high-end systems.
• Pros and cons associated with the use of community codes (although they are discussed briefly in Chapter 6).
• The policies of various U.S. government funding agencies with respect to HECC.
• The ability of the academic community to build and manage computational infrastructure.
• Policies for archiving and storing data.

HISTORY OF HIGH-END COMPUTING

The federal government has been a prime supporter of science and engineering research in the United States since the 1940s. Over the subsequent decades, it established a number of federal laboratories (primarily oriented toward specific government missions) and many intramural and extramural research programs, including an extensive system for supporting basic research in academia for the common good. Until the middle decades of the twentieth century, most of this research could be classified as either theoretical or experimental.

By the 1960s, as digital computing evolved and matured, it became widely appreciated that computational approaches to scientific discovery would become a third mode of inquiry. That idea had, of course, already been held for a number of years—at least as early as L.F. Richardson’s experiment with numerical weather prediction in 1922 (Richardson, 1922) and certainly with the use of the ENIAC in the 1940s for performing ballistics calculations (Goldstine and Goldstine, 1946). By the 1970s, the confluence of computing power, robust mathematical algorithms, skilled users, and adequate resources enabled computational science and engineering to begin contributing more broadly to research progress (see, for instance, Lax, 1982).
In its role of furthering science and engineering for the national interest, the federal government has long accepted the responsibility for supporting high-end computing, beginning with the ENIAC. Clearly, the ENIAC would not be considered a supercomputer today (nor would the Cray-1, to choose a cutting-edge technology from the late 1970s), but high-end computing is commonly defined as whatever caliber of computing is pushing the state of the art at any given time. Similarly, today’s teraflop2 computing is becoming fairly routine within the supercomputing community, and some would no longer consider computing at a few teraflops as being at the high end. Petascale computing will be the next step,3 and some are beginning to think about exascale computing, which would represent a further thousandfold increase in capability beyond the petascale.

2. The prefix “tera-” connotes a trillion, and “flop” is an acronym for “floating point operations.”
3. Los Alamos National Laboratory announced in June 2008 that it had achieved processing speeds of over 1 petaflop/s for one type of calculation; see the news release at http://www.lanl.gov/news/index.php/fuseaction/home.story/story_id/13602. Accessed July 18, 2008.

Science and engineering progress over many years has been accompanied by the development of new tools for examining natural phenomena. The invention of microscopes and telescopes four centuries ago enabled great progress in observational capabilities, and the resulting observations have altered our views of nature in profound ways. Much more recently, techniques such as neutron scattering, atomic force microscopy, and others have been built on a base of theory to enable investigations that were otherwise impossible. The fact that theory underpins these tools is key: Scientists needed a good understanding of (in the cases just cited) subatomic nature before they could even imagine such probes, let alone engineer them.

Computational tools are analogous to observational tools. Computational simulation enables us to explore natural or man-made processes that might be impossible to sense directly, be just a hypothetical creation on a drawing board, or be too complex to observe in adequate detail in nature. Simulations are enabled by a base of theory sufficient for creating mathematical models of the system under study, which also provides a good understanding of the limitations of those models. Richardson’s experiment in weather prediction could not have been developed without a mathematical model of fluid flow, and Newton’s laws were necessary in order to run ballistics calculations on the ENIAC.

But the theoretical needs are much deeper than just the understanding that underpins mathematical models. For instance, Richardson’s projections did not converge because the necessary numerical analysis did not yet exist, and so the approximation used was inappropriate for the task in ways that were not then understood. More recently (since the 1970s), the development of ever more efficient algorithms for the discretization of differential equations and for the solution of the consequent linear algebra formulations has been essential to the successes of computational science and engineering. In fact, it is generally agreed that algorithmic advances have contributed at least as much as hardware advances to the increasing capabilities of computation over the past four decades. Historically, there has been a strong coupling between the development of algorithms and software for scientific computing and the fundamental mathematical understanding of the underlying models.
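The kind of stability constraint that the numerical analysis of Richardson’s day had not yet formalized can be made concrete with a short sketch. The following code is purely illustrative (it is not from the report, and the function name and parameters are our own): it advances the one-dimensional linear advection equation u_t + a·u_x = 0 with a first-order upwind scheme, which is stable only when the Courant number C = a·Δt/Δx is at most 1.

```python
# Illustrative sketch only (not from the report): the stability constraint for
# an explicit upwind discretization of 1-D linear advection, u_t + a*u_x = 0.
# The update is stable only when the Courant number C = a*dt/dx satisfies C <= 1.
import numpy as np

def advect(courant, nx=200, steps=100):
    """Advance a square pulse `steps` time steps with first-order upwinding.

    Returns the maximum absolute value of the solution, which stays bounded
    by 1 when `courant` <= 1 and grows explosively when `courant` > 1.
    """
    x = np.linspace(0.0, 1.0, nx, endpoint=False)
    u = np.where((x > 0.25) & (x < 0.5), 1.0, 0.0)  # square pulse, max = 1
    for _ in range(steps):
        # Upwind update for a > 0 with periodic boundaries:
        #   u_i  <-  u_i - C * (u_i - u_{i-1})
        u = u - courant * (u - np.roll(u, 1))
    return float(np.max(np.abs(u)))

print(advect(0.9))  # constraint satisfied: solution stays bounded
print(advect(1.5))  # constraint violated: solution blows up
```

The reason is visible in the update itself: for C ≤ 1 each new value is a convex combination of two old values, so the maximum can never grow; for C > 1 the high-frequency components of the solution are amplified at every step, and the computation diverges no matter how the grid is refined.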
For example, beginning in the 1940s mathematicians such as von Neumann, Lax, and Richtmyer (see, for example, von Neumann and Goldstine, 1947; Lax and Richtmyer, 1956) were deeply involved in investigating the well-posedness properties of nonlinear hyperbolic conservation laws in the presence of discontinuities such as shock waves. At the same time they were also developing new algorithmic concepts to represent such discontinuous solutions numerically. The resulting methods were then incorporated into simulation codes by the national laboratories and industry, often in collaboration with the same mathematicians. This interaction between numerical algorithm design and mathematical theory for the underlying partial differential equations has continued since that time, leading to the methods that make up the current state of the art in computational fluid dynamics: high-resolution methods for hyperbolic conservation laws; projection methods and artificial compressibility methods for low-Mach-number fluid flows; adaptive mesh refinement methods; and a variety of methods for representing sharp fronts.

A similar connection can be seen in the development of computer infrastructure and computational science. The first computers were developed for solving science and engineering problems, and the earliest development efforts in a variety of software areas, from operating systems to languages and compilers, were undertaken to make these early computers more usable by scientists. Over the last 30 years, mathematical software packages such as LINPACK and its successors for solving linear systems have become standard benchmarks for measuring the performance of new computer systems.
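To give a concrete (and again purely illustrative) sense of what a LINPACK-style measurement does, the sketch below times a dense solve through NumPy, which dispatches to LAPACK, the lineage of libraries that succeeded LINPACK, and converts the conventional (2/3)n³ operation count for LU factorization into a flop rate. The function name and parameters are hypothetical, not part of any real benchmark suite.

```python
# Illustrative sketch (names are hypothetical, not from any benchmark suite):
# time a dense n-by-n solve Ax = b and report a LINPACK-style flop rate,
# using the conventional (2/3)*n^3 operation count for LU factorization.
import time
import numpy as np

def linpack_style_rate(n=1000, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    t0 = time.perf_counter()
    x = np.linalg.solve(a, b)   # LU factorization + triangular solves via LAPACK
    elapsed = time.perf_counter() - t0
    flops = (2.0 / 3.0) * n**3  # leading-order operation count for LU
    # Check the answer, as real benchmarks do: a fast but wrong solve counts for nothing.
    residual = np.linalg.norm(a @ x - b) / np.linalg.norm(b)
    return flops / elapsed, residual

rate, residual = linpack_style_rate()
print(f"{rate / 1e9:.2f} Gflop/s, relative residual {residual:.1e}")
```

The design reflects why such benchmarks endure: the workload is well defined, the operation count is agreed upon, and a residual check ties the measured speed to a verifiably correct answer.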
One strain of current thinking in the area of computer architecture is moving toward the idea that the algorithm is the fundamental unit of computation around which computer performance should be designed, with most of the standard algorithms for scientific applications helping to define the design space (Asanovic et al., 2006).

Much of the technology for doing computing (e.g., for computational fluid dynamics) was subsidized by national security enterprises from the 1960s through the 1980s. To a large extent, those enterprises have become more mission focused, and so we can no longer assume that all the components of future HECC infrastructure for science and engineering generally, such as algorithm development and visualization software, will be created in federal laboratories. This heightens the need to examine the entire HECC infrastructure and consciously determine what will be needed, what will be the impediments, and who will be responsible.

CURRENT STATE OF HIGH-END CAPABILITY COMPUTING

If HECC is thought of as whatever level of computing capability is nonroutine, it follows that the definition of capability computing is inextricably bound to the capabilities of a given field and a particular science or engineering investigation. Computational science and engineering is a systems process, bringing together hardware, software, investigators, data, and other components of infrastructure to produce insight into some question. Capability computing can be nonroutine due to limitations in any of these infrastructure components. Many factors constrain our ability to accomplish the necessary research, not just the availability of processing cycles with adequate power. Emphasizing one component over the others, such as providing hardware alone, does not serve the real needs of computational science and engineering. For instance, one recent HECC fluid dynamics simulation produced 74 million files; just listing all of them would crash the computer. A systems approach to HECC will consider these sorts of practical constraints as well as the more familiar constraints of algorithms and processing speed. Some scientific questions are not even posed in a way that can map onto HECC, and creating that mapping might be seen as part of the HECC infrastructure.

Typically, simulations that require HECC have complications that preclude an investigator simply running them for a long time on a dedicated workstation. Overcoming those complications often requires nonroutine knowledge of a broad range of disciplines, including computing, algorithms, data management, visualization, and so on. Therefore, because the individual investigator model might not suffice for HECC, mechanisms that enable teamwork are an infrastructure requirement for such work.
This is consistent with the general need in computational science and engineering to assemble multidisciplinary teams.

While a number of federal agencies provide high-end hardware for general use, other aspects of HECC infrastructure are handled in a somewhat ad hoc fashion. For instance, middleware (e.g., data management software) and visualization interfaces might be created in response to specific project requirements, with testing and documentation too incomplete for them to truly serve as general-purpose tools, and their creators might be graduate students who then move to other facilities. Data repositories and software might not be maintained beyond the lifetime of the particular research grant that supported their development. “Hardening” of software, development of community codes, and other common-good tasks might not be done by anyone. In short, many of the components of HECC infrastructure are cobbled together, resulting in a mix of funding horizons and purposes. While the DOE national laboratories, in particular, have succeeded in developing environments that cover all the components of HECC infrastructure, more often the systems view is lacking. Thus, the United States may not have the best environments for enabling computational advances in support of the most pressing science and engineering problems.

This report does not prognosticate about future computing capabilities, a task that would be beyond the study’s charge and the expertise of the committee.4 The committee relies on the following vision of the next generation (NSF, 2006, p. 13):

    By 2010, the petascale computing environment available to the academic science and engineering community is likely to consist of: (i) a significant number of systems with peak performance in the 1-50 teraflops range, deployed and supported at the local level by individual campuses and other research organizations; (ii) multiple systems with peak performance of 100+ teraflops that support the work of thousands of researchers nationally; and, (iii) at least one system in the 1-10 petaflops range that supports a more limited number of projects demanding the highest levels of computing performance. All NSF-deployed systems will be appropriately balanced and will include core computational hardware, local storage of sufficient capacity, and appropriate data analysis and visualization capabilities.

4. NRC (2005) is a good source for exploring this topic.

HECC challenges abound in science and engineering, and the focus of this report on four fields should not be taken to imply that those particular fields are in some sense special. The committee is well aware of the important and challenging opportunities afforded by HECC in many other fields.

REFERENCES

Asanovic, K., Ras Bodik, Bryan Christopher Catanzaro, et al. 2006. The Landscape of Parallel Computing Research: A View from Berkeley. Technical Report No. UCB/EECS-2006-183. Available at http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf. Accessed July 18, 2008.

Goldstine, H.H., and A. Goldstine. 1946. The Electronic Numerical Integrator and Computer (ENIAC). Reprinted in The Origins of Digital Computers: Selected Papers. New York, N.Y.: Springer-Verlag, 1982, pp. 359-373.

Lax, Peter (ed.). 1982. Large Scale Computing in Science and Engineering. National Science Board.

Lax, P.D., and R.D. Richtmyer. 1956. Survey of the stability of linear finite difference equations. Communications on Pure and Applied Mathematics 9: 267-293.

NRC (National Research Council). 2005. Getting Up to Speed: The Future of Supercomputing. Washington, D.C.: The National Academies Press.

NSF (National Science Foundation). 2006. NSF’s Cyberinfrastructure Vision for 21st Century Discovery. Draft Version 7.1.
Available at http://www.nsf.gov/od/oci/ci-v7.pdf. Accessed July 18, 2008.

Richardson, Lewis Fry. 1922. Weather Prediction by Numerical Process. Cambridge, England: Cambridge University Press.

von Neumann, J., and H.H. Goldstine. 1947. Numerical inverting of matrices of high order I. Bulletin of the American Mathematical Society 53(11): 1021-1099.