4 Future National-Scale Needs
Pages 64-82



From page 64...
... This chapter discusses previous approaches to assessing needs, which were based primarily on floating-point performance, and makes recommendations for how to think about the more complex, multidimensional requirements for computing and data systems in the future.

4.1 THE STRUCTURE OF NSF INVESTMENTS AND THE BRANSCOMB PYRAMID

For the past 30 years, National Science Foundation (NSF)
From page 65...
... , hybrid architectures (e.g., Cray XK7 for Oak Ridge National Laboratory's Titan or the Intel Xeon Phi nodes on Texas Advanced Computing Center's [TACC's] Stampede)
From page 66...
... In What Ways Does the Branscomb Pyramid Misrepresent the State of Computing Today? There are several, and each is important and addressed in this report:
• Compute power is not simply measured.
From page 67...
... For example, in the 2011 report National Science Foundation Advisory Committee for Cyberinfrastructure: Task Force on Campus Bridging,2 there is this finding: The cyberinfrastructure environment in the US is now much more complex and varied than the long-useful Branscomb Pyramid. As regards computational facilities, this is largely due to continued improvements in processing power per unit of money and changes in CPU architecture, continued development of volunteer computing systems, and evolution of commercial Infrastructure/Platform/Software as Service (cloud)
From page 68...
... The modern interpretation also needs to recognize that there are significant pressures to build out the resources that are affordable rather than those that are needed. While budget realism is essential, it can lead to an acquisition process dominated by cost considerations rather than one driven by the science requirements.
From page 69...
... These needs must not be forgotten when provisioning computing resources.

4.2 DATA-INTENSIVE SCIENCE AND THE NEEDS FOR ADVANCED COMPUTING

The current generation of advanced computing infrastructure focuses largely on meeting the requirements of workflows for simulation science that has fueled advances across many disciplines over the past two decades.
From page 70...
... It can be difficult to accommodate different types of applications on the same system -- for example, long-running applications using a few nodes can fragment the available nodes, making it impossible to schedule the resources for a tightly coupled application to run efficiently. In the short term, providers of computing resources can develop different service models that match modern science workflows, possibly by adaptively partitioning the computing system into groups of nodes that support development and tuning of applications, real-time applications, long-running applications requiring only a few nodes, and highly parallel applications (adaptive to support requirements that change with time, such as the need to use an entire system to run a single tightly coupled parallel application)
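The report does not prescribe a mechanism for such adaptive partitioning. As a rough sketch of the idea only, the Python fragment below (with hypothetical partition names, node counts, and demand fractions, none of which come from the report) splits a machine's nodes among workload classes and leaves the remainder as a single pool for tightly coupled capability jobs. In practice this policy would live in the resource manager (for example, Slurm partitions and reservations) rather than in code like this.

    # Rough sketch only: hypothetical partition names, counts, and demand
    # fractions. Real systems would express this policy through the resource
    # manager (e.g., Slurm partitions/reservations), not application code.

    def partition_system(total_nodes, demand):
        """Split a machine into node groups for different workload classes.

        `demand` maps a workload class to the fraction of nodes it currently
        needs; any leftover nodes form a single capability pool so that one
        tightly coupled job can still use most of the machine.
        """
        allocation = {}
        remaining = total_nodes
        for workload, fraction in demand.items():
            nodes = min(remaining, int(total_nodes * fraction))
            allocation[workload] = nodes
            remaining -= nodes
        allocation["capability"] = remaining
        return allocation

    # Example: rebalance as the demand profile changes during the day.
    demand = {"development": 0.05, "real_time": 0.10, "long_running_small": 0.15}
    print(partition_system(10_000, demand))
    # {'development': 500, 'real_time': 1000, 'long_running_small': 1500, 'capability': 7000}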
From page 71...
... More generally, as data accrue from experiments and simulations, and as data from multiple experiments and simulations are integrated, scientific discoveries are increasingly being made from the accumulated and integrated data using advanced computing. This is sometimes known as the "fourth paradigm" of scientific discovery, because it supplements discovery paradigms based on theory, experiment, and simulation.4 Further, there are additional opportunities for scientific insights at the interfaces of each of these paradigms of discovery.
From page 72...
... Understanding the software frameworks that are enabled within the various cloud services and then mapping scientific workflows onto them requires a high level of both technical and scientific insight. Moreover, these new services enable a deeper level of collaboration and software reuse that are critical for data-intensive science.
From page 73...
... Today, with growing demand for computing and constrained budgets, it has become especially important to understand the relative benefits and risks of different technical approaches for the science portfolio. This section describes some of the challenges NSF will face in developing science requirements for advanced computing.
From page 74...
... Past experience has shown that although a procurement can be completed in several years, large systems sometimes take as long as 10 years from initial concept to full availability to users. A rolling decadal roadmapping process could help inform users about plans for the upgrade and replacement of existing systems and, more generally, the performance characteristics of expected future systems.
From page 75...
... The science community rapidly adopts these new "providers," such as Dropbox, until a new and improved service appears on the market. Along with the challenges of a changing scientific and technical landscape, any requirements process must recognize that there will always be gaps.
From page 76...
... A process that relies on documented science objectives and assessment of the progress made toward achieving these objectives, rather than simply statements that greater computational capacity will improve understanding of a specific scientific process or phenomenon, can help improve future decisions. For example, such an assessment might show that the ability to run an ensemble of 1,000 short-term weather forecasting models will improve the quality of the forecasts by a specific percentage.
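The report does not specify how such a percentage would be determined. One common, and highly idealized, statistical rationale is that the standard error of an ensemble-mean forecast falls roughly as 1/sqrt(N) for independent members, so an assessment might quantify the benefit of a larger ensemble along the lines sketched below; the ensemble sizes are purely hypothetical and are not taken from the report.

    # Illustrative only: assumes the standard error of an ensemble-mean
    # forecast scales as 1/sqrt(N) for independent members; real forecast
    # skill depends on far more than ensemble size. Ensemble sizes are made up.
    import math

    def standard_error_reduction(n_small, n_large):
        """Percentage reduction in ensemble-mean standard error when growing
        an ensemble from n_small to n_large independent members."""
        return 100.0 * (1.0 - math.sqrt(n_small / n_large))

    print(f"{standard_error_reduction(100, 1000):.0f}% smaller standard error")
    # roughly 68% smaller standard error with 1,000 members instead of 100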
From page 77...
... Balancing its primary science mission with the need to operate infrastructure will require constant assessment by NSF, as noted in the recent decadal survey of the ocean sciences.9

4.5 ROADMAPPING

The Department of Energy (DOE) has created a roadmap for future advanced scientific computing research systems that provides research

9 National Research Council, Sea Change: 2015-2025 Decadal Survey of Ocean Sciences, The National Academies Press, Washington, D.C., 2015.
From page 78...
... It may be necessary to develop separate road

10 For example, the end-to-end challenges in managing massive research data are considered in NASA Earth Science Technology Office/Advanced Information Systems Technology (ESTO/AIST) Big Data Study Roadmap Team, "NASA Earth Science Research in Data and Computational Science Technologies," September 2015, http://ieee-bigdata-earthscience.jpl.
From page 79...
... [Table residue from Figure 4.2: columns for several ASCR systems, with rows for node processors (Intel Knights Landing many-core CPUs; multiple IBM Power9 CPUs with multiple Nvidia Volta GPUs; AMD 64-bit Opteron with Nvidia Kepler GPUs; Intel Ivy Bridge; PowerPC A2; Intel Haswell CPUs in a data partition), system size (from more than 2,500 to more than 50,000 nodes), system interconnect (Aries, Gemini, 5D Torus, Dual Rail EDR InfiniBand), and file system (7.6 PB to 150 PB, 168 GB/s to 1 TB/s, Lustre or GPFS).] FIGURE 4.2 Advanced Scientific Computing Research (ASCR)
From page 80...
... Relevant measures include memory size and bandwidth, data size and bandwidth, interconnect bandwidth and application sensitivity to interconnect latency, integer and floating-point performance, and long-term data storage requirements. Some of this information could be gathered by tools designed for this purpose, applied to an application running on a current system, reducing the burden on the computational scientists.
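As an illustration only (the report names no specific tool), a minimal Python sketch using the psutil package could sample a few of these measures while an application runs on a current system. The command line shown is hypothetical, and measures such as interconnect bandwidth and floating-point performance would require hardware counters (for example, perf or PAPI) that are not covered here.

    # Sketch of a lightweight tool that samples coarse resource-usage
    # characteristics (peak memory footprint, CPU time, bytes read/written)
    # around a run of an existing application. Requires the psutil package.
    import time
    import psutil

    def characterize(cmd):
        """Run `cmd` and return coarse resource-usage characteristics."""
        proc = psutil.Popen(cmd)
        peak_rss, cpu, io = 0, None, None
        while proc.poll() is None:                    # sample until the process exits
            try:
                peak_rss = max(peak_rss, proc.memory_info().rss)
                cpu = proc.cpu_times()                # user/system CPU time so far
                if hasattr(proc, "io_counters"):      # available on Linux/Windows
                    io = proc.io_counters()           # bytes read/written so far
            except psutil.NoSuchProcess:
                break                                 # process ended between samples
            time.sleep(0.1)
        return {
            "peak_rss_bytes": peak_rss,
            "cpu_user_s": cpu.user if cpu else None,
            "cpu_system_s": cpu.system if cpu else None,
            "read_bytes": io.read_bytes if io else None,
            "write_bytes": io.write_bytes if io else None,
        }

    # Hypothetical application and arguments:
    print(characterize(["./my_simulation", "--steps", "1000"]))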
From page 81...
... The list below is targeted at parallel high-performance computing applications, but the approach can be applied to other areas; some items include examples relevant to some data science applications.
From page 82...
... If a shorter list were desired, the data in item 2 (application performance characteristics), combined with the number of SUs (service units) required, would provide valuable guidance in setting requirements for production computing systems.
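As a purely illustrative example of how item 2 data and an SU count might combine, the sketch below assumes 1 SU equals 1 node-hour and uses made-up job sizes; actual service-unit accounting and charge factors vary by center and are not specified in the report.

    # Purely illustrative: assumes 1 SU = 1 node-hour and made-up job sizes;
    # actual service-unit accounting and charge factors vary by center.

    def required_sus(nodes_per_run, hours_per_run, runs_per_year, overhead=1.2):
        """Node-hours needed per year, with a margin for failed or repeated runs."""
        return nodes_per_run * hours_per_run * runs_per_year * overhead

    # For example, a 512-node, 6-hour job executed 200 times per year:
    print(f"{required_sus(512, 6, 200):,.0f} SUs/year")   # 737,280 SUs/year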

