1 Introduction

Supercomputers are systems that provide significantly greater sustained performance than is available from contemporary mainstream computer systems. In applications such as analysis of intelligence data, weather prediction, or climate projection, supercomputers enable the generation of information that would not otherwise be available or that could not be generated in time to be actionable. Supercomputing can also accelerate scientific research in important areas, such as physics, biology, or medicine. It can augment experimentation or replace it with simulation, thus reducing the cost and increasing the accuracy and repeatability of experimentation in science and engineering. Further, supercomputing has the potential to suggest entirely novel experiments that can revolutionize our perspective of the world. It enables faster evaluation of design alternatives, thus improving the quality of engineered products.

The value of supercomputers derives from the problems they solve, not from the innovative technology they showcase. The technology must be motivated by the application requirements. Historically, better performance was achieved using faster logic and more parallelism, that is, by performing many computations and data accesses concurrently. Today, although there is some promising device research, parallelism is still the primary approach.

The performance of supercomputers should be measured in terms of the time required to solve problems of interest. Some problems, such as searches for patterns in data, can be broken down into subproblems that can be solved independently and the results easily combined later.
Thus, a collection of PCs that are intermittently available and are connected by a low-speed network such as the Internet can exhibit supercomputing performance, as shown by the example of SETI@home.1 For such problems, a computational grid can replace a conventional supercomputer.2 However, many important problems requiring high-performance computing, such as the modeling of fluid flows, do not admit that kind of decomposition. While these problems can be solved using parallelism, dependencies among the subproblems necessitate frequent exchange of data and partial results, requiring significantly better communication (higher bandwidth and lower latency) between the computation and data storage loci than that achieved by network-connected PCs.3

1 "SETI@home: The Search for Extraterrestrial Intelligence." Available at <setiathome.ssl.berkeley.edu>.
2 A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities (I. Foster and C. Kesselman, 1999, "Computational Grids," The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann).
3 Although the grid does not replace supercomputers for many high-end applications, grid computing enhances our ability to solve the kinds of problems for which large amounts of computation and often-unique data are essential. In addition, the grid provides the infrastructure within which tightly coupled supercomputers reside. It enables efficient access to remote or specialized computation resources and efficient exchange of results between collaborating scientists.
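The contrast between independently solvable subproblems and tightly coupled ones can be made concrete with a small sketch. This is a hypothetical illustration, not drawn from the report: a pattern search over a large string is split into chunks that worker processes scan with no communication between them, which is why loosely coupled machines suffice for this class of problem. The chunk sizes, pattern, and worker count here are illustrative only.

```python
# Sketch of an "embarrassingly parallel" decomposition: each worker
# counts pattern matches in its own chunk; results are summed at the end.
# No data is exchanged between workers during the computation, so the
# chunks could just as well be farmed out to intermittently available PCs.
from multiprocessing import Pool

def count_matches(args):
    chunk, pattern = args
    # Scan this chunk independently of all other chunks.
    return sum(1 for i in range(len(chunk) - len(pattern) + 1)
               if chunk[i:i + len(pattern)] == pattern)

def parallel_count(data, pattern, nworkers=4):
    step = max(1, len(data) // nworkers)
    # Overlap chunk boundaries by len(pattern) - 1 characters so that a
    # match straddling a boundary is seen by exactly one worker.
    chunks = [data[i:i + step + len(pattern) - 1]
              for i in range(0, len(data), step)]
    with Pool(nworkers) as pool:
        return sum(pool.map(count_matches, [(c, pattern) for c in chunks]))

if __name__ == "__main__":
    data = "abracadabra" * 1000
    print(parallel_count(data, "abra"))
```

A fluid-flow solver, by contrast, cannot be split this way: each subdomain needs its neighbors' boundary values at every time step, so the speed of the interconnect, not just the number of workers, bounds performance.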
THE FUTURE OF SUPERCOMPUTING: AN INTERIM REPORT

This report focuses mostly on the latter kind of problem and, hence, on "one machine room" systems. To achieve high performance on problems of interest, such supercomputers need not only the ability to perform operations at a high rate but also support for high-bandwidth, low-latency internal communication, large memories, and high-performance I/O subsystems. They also need suitable software, both high-quality systems software such as compilers and operating systems and well-tuned applications software. To maintain focus, this report does not address networking (i.e., the external communication requirements of supercomputers) except to note its importance, nor does the report address special-purpose systems such as signal processors.

STUDY CONTEXT

Much has changed since the 1980s, when a variety of agencies invested in developing and using supercomputers and when the High Performance Computing and Communications Initiative (HPCCI), which bridged and built on these agency efforts, was conceived, and the 1990s, when the HPCCI evolved into a broader and more diffuse program of computer science research support.4 More recently, federally supported high-performance computing research has deemphasized computer architecture research and begun to emphasize the networked grid for high-performance computing, an emerging industry interest,5 as a unifying concept. Whereas early investments in high-performance computing research were shown to have had trickle-down benefits for mainstream computing,6 recent trends cloud the picture for such benefits. Meanwhile, there is increasing evidence of and concern about technical leadership in Japan, where the industry has benefited from sustained government investments.7 Concern about the diminishing U.S.
ability to meet national security needs led to a recommendation in 2000 that DOD subsidize a Cray computer development program as well as invest in relevant long-term research.8 CSTB convened the Committee on the Future of Supercomputing, sponsored jointly by the DOE Office of Science and DOE's Advanced Simulation and Computing (ASC) Program, to assess the state of supercomputing in the United States, including the characteristics of relevant systems and architecture research in government, industry, and academia and the characteristics of the relevant market. Specific questions of interest to both the sponsors and Congress are listed in Box 1.1.

4 The proliferation of PCs and the rise of the Internet commanded attention and resources, diverting attention and effort from research in high-end computing. There were, however, efforts into the 1990s to support traditional high-performance computing. See, for example, NSF, 1993, From Desktop to Teraflop: Exploiting the U.S. Lead in High Performance Computing, NSF Blue Ribbon Panel on High Performance Computing, Arlington, Va.: National Science Foundation, August.
5 Barnaby J. Feder. 2000. "Supercomputing Takes Yet Another Turn." The New York Times, November 20, p. C4.
6 Computer Science and Telecommunications Board (CSTB), National Research Council. 1995. Evolving the High Performance Computing and Communications Initiative to Support the Nation's Infrastructure. Washington, D.C.: National Academy Press. This report noted the time-machine quality of high-performance systems.
7 Japanese support for indigenous capabilities has led to U.S. allegations of dumping, the most recent of which were resolved by an arrangement featuring U.S. resale of Japanese machines by a U.S. vendor (see William M. Bulkeley, 2001, "Outlook Improves for U.S. Supercomputer Access," The Wall Street Journal, March 2, p. B6). Meanwhile, U.S.
vendors have long complained about controls on high-performance computer exports (see Ted Bridis, 2001, "Study Suggests Easing Limits to Export High-Performance Computers Overseas," The Wall Street Journal, June 8, p. B5).
8 Defense Science Board. 2000. Report of the Defense Science Board Task Force on DOD Supercomputing Needs. Washington, D.C.: Office of the Under Secretary of Defense for Acquisition and Technology, October 11.
Box 1.1 Concerns of the Sponsor and of Congress

Issues of concern to the Department of Energy:

· How should the nation approach research and development of the highest end computers to ensure its future leadership in science and technology?
· What is the economic model (e.g., commercial investment, particle accelerator, or submarine) for high-performance computers?
· How do we allocate the investment between scientific applications, mathematical algorithms, system software, hardware architectures, hardware engineering, and so on?
· What is the current state of the art of supercomputing in the United States and the rest of the world?
· What/who are the requirements drivers of supercomputing? What are supercomputing's "gold nuggets" and how might they be exploited?
· Does (or should) open source software have a role to play?
· What are the shortfalls and how do we address them over the next 3, 5, 10, and 20 years? What are the costs of the solutions?
· What should be the U.S. supercomputing vision? What role should (or can) the government play?

Questions of particular interest to the Senate's Energy and Water Development Appropriations Committee, which funds the Department of Energy:

· What are the mission requirements driving the ASC program? How much capacity is needed, and when is it needed over the next 10 years?
· What is the maximum capability required in the top ASC platform, and when over the next 10 years?
· Was the NNSA wise to abandon custom-designed chips and vector architecture for much cheaper commodity-chip-based, massively parallel systems?
· What level of customization is needed for the various government interests in supercomputing (for example, weapons design, molecule modeling and simulation, cryptanalysis, bioinformatics, climate modeling)?
· How effective are the current or planned ASC platforms in addressing the requirements of the program?
· Are there alternative architectures, interconnect technologies, systems software and tools, or other approaches that will improve the performance of future ASC platforms?
· If so, can industry supply the required alternative architectures and software? That is, is industry properly motivated to do this? Or must government fundamentally lead the development of these alternatives?
· Is the current ASC approach the most cost-effective and efficient manner of achieving the desired capability and capacity?
· Finally, as they relate to the ASC mission requirements, what are the costs and benefits of investing more heavily in capacity now and deferring acquisition of capability machines, so as to take advantage of the falling price per teraflop?

SOURCE: Daniel Hitchcock, DOE; Jose L. Munoz, DOE; and Clay Sell, Senate Energy and Water Development Appropriations Committee; presentations to the committee, March 6, 2003.

Many of the questions in the second list are to be addressed by the JASONs' study, mentioned elsewhere in this report.
ABOUT THIS INTERIM REPORT

This interim report focuses on stage setting and context, for example, the history and current state of supercomputing and the socioeconomic context, together with some identification of issues being addressed by the committee. The committee expects that its understanding of these issues will change as it collects more data and deepens its analysis for the final report. The short time it had to prepare the interim report did not allow the committee to develop findings or recommendations. The presentations to the committee and the collective knowledge and experience of committee members have, however, enabled it to come to a preliminary understanding of some important issues, which are outlined in this interim report. However, the report does not document the detailed evidence that supports these views. The final report will provide the needed depth of information and analysis. Since the committee appreciates the desire to use this interim report to inform budgetary discussions for FY 2005, it shares some of its initial views in this preliminary form.

Chapter 2 outlines the history and current state of supercomputing. The importance of continuity is summarized in Chapter 3. Chapter 4 discusses the need for research and innovation. Chapter 5 addresses the role of the government in ensuring the future health of supercomputing.