As Dr. Colella’s discussion suggests, in addition to the technical challenges, there are a variety of nontechnical barriers to progress in algorithms. These topics are discussed in subsequent chapters.

Software Issues

In extrapolating technology trends, it is easy to forget that the primary purpose of improved supercomputers is to solve important problems better. That is, the goal is to improve the productivity of users, including scientists, engineers, and other nonspecialists in supercomputing. To this end, supercomputing software development should emphasize time to solution, the major metric of value to high-end computing users. Time to solution includes time to cast the physical problem into algorithms suitable for high-end computing; time to write and debug the computer code that expresses those algorithms; time to optimize the code for the computer platforms being used; time to compute the desired results; time to analyze those results; and time to refine the analysis into an improved understanding of the original problem that will enable scientific or engineering advances.

There are good reasons to believe that lack of adequate software is today a major impediment to reducing time to solution and that more emphasis on investments in software research and development (as recommended by previous committees, in particular, PITAC) is justified. The main expense in large supercomputing programs such as ASC is software related: In FY 2004, 40 percent of the ASC budget was allocated for application development; in addition, a significant fraction of the acquisition budget goes, directly or indirectly, to software purchase.[70] A significant fraction of the time to solution is spent developing, tuning, verifying, and validating codes. This is especially true in the NSA environment, where new, relatively short HPC codes are frequently developed to solve newly emerging problems and are run once. As computing platforms become more complex, and as codes become much larger and more complex, the difficulty of delivering efficient and robust codes in a timely fashion increases. For example, several large ASC code projects, each involving tens of programmers, hundreds of thousands of lines of code, and investments from $50 million to $100 million, had early milestones that proved to be too aggressive.[71] Many supercomputer users feel

[70] Advanced Simulation and Computing Program Plan, August 2003.

[71] See Douglass Post, 2004, “The Coming Crisis in Computational Sciences,” Workshop on Productivity and Performance in High-End Computing, February; and D. Post and R. Kendall, 2003, “Software Project Management and Quality Engineering Practices for Complex, Coupled Multi-Physics, Massively Parallel Computation Simulations: Lessons Learned from ASCI,” DOE Software Quality Forum, March.


