7
Supercomputing Abroad

The committee devoted most of its attention to supercomputing in the United States. A subcommittee made a visit to Japan in March 2004, but there were no visits to other countries. However, most committee members have significant contact with supercomputing experts in other countries, and there is considerable literature about supercomputing activities abroad. A very useful source is a recent survey by the Organisation Associative du Parallelisme (ORAP)1 in France. The committee drew on all those sources when it considered the state of supercomputing and its future abroad.

Supercomputing is an international endeavor and the research community is international. Many countries have significant supercomputing installations in support of science and engineering, and there is a significant exchange of people and technology. However, the United States clearly dominates the field. Of the TOP500 systems in June 2004, 255, or 51 percent, are installed in the United States, which also has 56 percent of the total compute power of the systems on that list. The next country, Japan, has 7 percent of the systems and 9 percent of the total compute power. As Figure 7.1 shows,2 this situation has not changed significantly in the last

1  ORAP. 2004. Promouvoir le Calcul Haute Performance 1994-2004.

2  In contrast to the data presented in Chapter 3, Figure 3.7, which are based on manufacturer, the data in Figure 7.1 present the percent of worldwide supercomputing systems that are installed in a given country regardless of manufacturer. This figure was generated at the TOP500 Web site.




Getting Up to Speed: The Future of Supercomputing

FIGURE 7.1 TOP500 by country.

decade: No particular trend emerges, except the progressive broadening of the “other” category, indicating the progressive democratization of supercomputing, attributable to the advent of relatively low cost commodity clusters. The dominance is even more striking when one looks at manufacturers: 91 percent of the TOP500 systems are manufactured in the United States (see Figure 3.7). Many of the remaining systems use U.S.-manufactured commodity parts. The software stack of supercomputing systems used worldwide (operating systems, compilers, tools, libraries, application codes, etc.) was also largely developed in the United States, with significant contributions from researchers in other countries.

However, this is no reason for complacency. Since late 2001, the system that heads the TOP500 list has been the Earth Simulator (ES), installed in Japan. Even more important than being the most powerful system, the ES, because of its use of custom vector processors, achieves higher sustained performance on application codes of interest than many of the other top-performing machines. While the ES is likely to lose its top position on the TOP500 list soon,3 it is likely to continue providing significantly better performance than competing systems on climate codes and the other applications it runs. (At present, the ES is 5 to 25 times faster than large U.S. systems on the various components of climate models used at NCAR.)

IDC estimates that in the last few years, the North American, European, and Asian-Pacific regions each purchased about one-third of the total dollar value of the capability systems sold.

Another trend that has been much in evidence in recent years is the ability of many countries to build top-performing systems using commodity parts that are widely available. This reduces the usefulness of export restrictions and enables many countries to reduce their dependence on the United States and its allies for supercomputing technology. China is vigorously pursuing a policy of self-sufficiency in supercomputing.

Next, the committee presents highlights of supercomputing activities in various countries.

JAPAN

The committee found both similarities and differences in supercomputing in Japan and in the United States.4

Similarities

In many areas the issues and concerns about HPC are broadly similar in Japan and in the United States.

HPC continues to be critical for many scientific and engineering pursuits. Many are common to the United States and Japan, for example, climate modeling, earthquake simulation, and biosystems. However, Japan does not have the kind of defense missions, such as stockpile stewardship, that have historically been drivers for U.S. supercomputing.

The HPC community is small in both countries relative to the science and engineering community overall and may not have achieved a critical mass; in both countries it is hard to attract top young researchers with the needed skills in simulation and high-performance computing.

3  On September 29, 2004, IBM announced that the Blue Gene/L system, which is being assembled for LLNL, had surpassed the performance of the Earth Simulator according to the standard Linpack benchmark. On October 26, 2004, Silicon Graphics announced that the Columbia system installed at NASA Ames had surpassed the Earth Simulator. As a result, it is expected that the Earth Simulator will lose the top spot on the November 2004 TOP500 list.

4  The subcommittee participated in a 1-day joint NAE–Engineering Academy of Japan forum and visited six supercomputing sites in Japan (see Appendix B for a complete list of sites and participants).

The committee had a lively technical exchange at the 1-day joint forum, where its members learned of several Japanese research projects with which they had not been familiar. More international collaboration on research would clearly be beneficial to both countries.

The commercial viability of traditional supercomputing architectures with vector processors and high-bandwidth memory subsystems is problematic. Commodity clusters are increasingly replacing such traditional systems and shrinking their market. It has become harder to identify attractive payoffs for investments in the development of vector architectures. The large investments needed to continue progress on custom HPC systems, as well as the opportunity costs, are increasingly difficult to justify. However, at least one large company in Japan (NEC) continues to be committed to traditional vector architectures.

Continuity is a problem in both countries. The ES project was officially proposed in 1996 and started in 1997,5 at a time when Japan’s economy and politics were different. In the current Japanese economic and political climate, it has become harder to allocate significant funds on a continuous basis for large, innovative projects in HPC. Similar pressures exist in the United States.

HPC usage is also constrained in both countries by the lack of suitable software and by the difficulty of using less expensive machines with lower memory bandwidth.

Differences

There were some notable differences between the United States and Japan.

Traditional supercomputer architectures (vector, pseudo vector, etc.) play a larger role in Japan. Top NEC, Fujitsu, and Hitachi machines are still the mainstay of academic supercomputing centers and national laboratories. As a result, there is more reliance on vendor-provided software than on third-party or open source software, which is less available.
However, the trend is toward increased use of clusters and open source software.

Also, since Japan does not have a military rationale for HPC, it has to be justified on the basis of its ultimate economic and societal benefits for a civil society.

The Earth Simulator

The Earth Simulator was developed as a national project by three government agencies: the National Space Development Agency of Japan (NASDA), the Japan Atomic Energy Research Institute (JAERI), and the Japan Marine Science and Technology Center (JAMSTEC). The ES (see Figure 7.2) is housed in a specially designed facility, the Earth Simulator Center (approximately 50 m × 65 m × 17 m). The fabrication and installation of the ES at the Earth Simulator Center of JAMSTEC was completed at the end of February 2002. The ES is now managed by JAMSTEC, under the Ministry of Education, Culture, Sports, Science and Technology (MEXT).

5  See <http://www.es.jamstec.go.jp/esc/eng/ES/birth.html>.

FIGURE 7.2 Earth Simulator Center. This figure is available at the Earth Simulator Web site, <http://www.es.jamstec.go.jp/esc/eng/ES/hardware.html>.

The system consists of 640 processor nodes, connected by a 640 by 640 single-stage crossbar switch. Each node is a shared memory multiprocessor with eight vector processors, each with a peak performance of 8 Gflops. Thus, the total system has 5,120 vector processors and a peak performance of 40 Tflops. Most codes are written using MPI for global communication and OpenMP or microtasking for intranode parallelism. Some codes use HPF for global parallelism. As shown in Figure 7.3, the sustained performance achieved by application codes is impressive: The ES achieved 26.58 Tflops on a global atmospheric code; 14.9 Tflops on a three-dimensional fluid simulation code for fusion written in HPF; and 16.4 Tflops on a turbulence code.

The ES, with its focus on earth sciences, was one of the first mission-oriented projects of the Science and Technology Agency.6 Although the U.S. NCAR is also mission-oriented for earth sciences, it is perceived that in the United States “mission-oriented” usually implies “national security.” The ES also might turn out to be a singular event: MEXT officials with whom the committee met stated that as of March 2004 there were no plans to build topical supercomputing centers in support of Japanese priority science areas (biotechnology, nanotechnology, the environment, and IT), nor were there plans to build a second ES. Tetsuya Sato, Director-General of the Earth Simulator Center, has plans for another very powerful system and is trying to marshal the necessary support for it. Plans for research on technology for an ES successor with 25 times the performance of the ES were recently announced.7

6  MEXT took over the ES after the merger of the Ministry of Education and the Science and Technology Agency.

7  According to an article in the August 27, 2004, issue of the newspaper Nihon Keizai, MEXT will request ¥2 billion (about $20 million) in FY 2005 to fund an industry-university-government collaboration on low-power CPU, optical interconnect, and operating system. Participants include NEC, Toshiba, and Hitachi.

FIGURE 7.3 2002 Earth Simulator performance by application group.

The launch of the Earth Simulator created a substantial amount of concern in the United States that this country had lost its lead in high-performance computing. While there is certainly a loss of national pride because a supercomputer in the United States is not first on a list of the world’s fastest supercomputers, it is important to understand the set of issues that surround that loss of first place. The development of the ES required a large investment (approximately $500 million, including the cost of a special facility to house the system) and a commitment over a long period of time. The United States made an even larger investment in HPC under the ASC program, but the money was not spent on a single platform. Other important differences are these:

The ES was developed for basic research and is shared internationally, whereas the ASC program is driven by national defense and may be used only for domestic missions.

A large part of the ES investment supported NEC’s development of its SX-6 technology. The ASC program has made only modest investments in industrial R&D.

ES uses custom vector processors; the ASC systems use commodity processors.

The ES software technology largely comes from abroad, although it is often modified and enhanced in Japan. For example, a significant number of ES codes were developed using a Japanese-enhanced version of HPF. Virtually all software used in the ASC program has been developed in the United States.

Surprisingly, the Earth Simulator’s number one ranking on the TOP500 list is not a matter of national pride in Japan. In fact, there is considerable resentment of the Earth Simulator in some sectors of the research community in Japan. Some Japanese researchers feel that the ES is too expensive and drains critical resources from other science and technology projects. Owing to the continued economic crisis in Japan and the large budget deficits, it is becoming more difficult to justify government projects of this kind.

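The Earth Simulator hardware figures quoted above are related by simple arithmetic, and a short Python sketch makes the relationship explicit. It multiplies out the node and processor counts and expresses each quoted sustained result as a fraction of peak. The function and variable names are illustrative assumptions; only the numbers come from the text. Note that the product is 40.96 Tflops, which the text rounds to 40 Tflops.

```python
# Figures quoted in the text for the Earth Simulator (ES).
NODES = 640                      # processor nodes
PROCESSORS_PER_NODE = 8          # vector processors per node
PEAK_GFLOPS_PER_PROCESSOR = 8.0  # peak per vector processor

# Sustained results quoted in the text, in Tflops.
SUSTAINED_TFLOPS = {
    "global atmospheric code": 26.58,
    "fusion fluid simulation (HPF)": 14.9,
    "turbulence code": 16.4,
}


def peak_tflops(nodes: int, procs_per_node: int, gflops_per_proc: float) -> float:
    """Aggregate peak performance in Tflops (1 Tflops = 1,000 Gflops)."""
    return nodes * procs_per_node * gflops_per_proc / 1000.0


def fraction_of_peak(sustained: float, peak: float) -> float:
    """Sustained performance as a fraction of peak."""
    return sustained / peak


total_processors = NODES * PROCESSORS_PER_NODE
es_peak = peak_tflops(NODES, PROCESSORS_PER_NODE, PEAK_GFLOPS_PER_PROCESSOR)

print(f"{total_processors} processors, peak {es_peak:.2f} Tflops")
for name, sustained in SUSTAINED_TFLOPS.items():
    print(f"{name}: {fraction_of_peak(sustained, es_peak):.0%} of peak")
```

Under these assumptions the three quoted codes run at roughly 65, 36, and 40 percent of peak, respectively.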
Computing time on the Earth Simulator is allocated quite differently from the way it is done by NSF in the U.S. supercomputer centers. Most projects are sponsored by large consortia of scientists, who jointly decide which projects are of most interest to the science community. The director has a discretionary allocation of up to 20 percent that can be used, for example, to bring in new user communities such as industry or to support international users. Japanese private sector companies are permitted to use the resources of the government-funded supercomputer. (For example, auto manufacturers recently signed a memorandum of understanding for use of the ES.) The machine cannot be accessed remotely, although that policy may change within Japan. Collaborators must be on site to run on the ES. They may not use the machine unless they can demonstrate on a small subsystem that their codes scale to achieve a significant fraction of peak performance. Because of the custom high-bandwidth processors used in ES and the user selection policy, the codes running on the ES achieve, on average, a sustained performance that is 30 percent of the peak. Thus the system is used to advantage as a capability machine, but at the political cost of alienating scientists who are unable to exploit that capability.

There are several international collaborations being conducted at the ES, including a joint effort between NCAR and the Central Research Institute of the Electric Power Industry (CRIEPI), which involves porting and running the NCAR CCSM on the ES, and a joint research effort with scientists from the California Institute of Technology in earthquake simulation.8

Other Japanese Centers

Other large supercomputer installations in Japan are found in university supercomputer centers, in national laboratories, and in industry. In the June 2004 TOP500 list, Japan appears again in 7th place with a Fujitsu system at the Institute of Physical and Chemical Research (RIKEN); in 19th place with an Opteron cluster at the Grid Technology Research center at the National Institute of Advanced Industrial Science and Technology (AIST); in 22nd place with a Fujitsu system at the National Aerospace Laboratory (JAXA); and also further down the list. Japanese manufacturers are heavily represented. Commodity clusters are becoming more prevalent.

The university supercomputer centers were until recently directly funded by the government. Funding was very stable, and each center had a long-term relationship with a vendor.
The centers have been managed mostly as “cycle-shops” (i.e., centers that do not advance the state of the art but, rather, maintain the status quo) in support of a research user community. For example, at the University of Tokyo center, the main applications are climate modeling and earthquake modeling. There appears to be less software development and less user support than the NSF centers provide in our country.

Since April 1, 2004, universities in Japan have been granted greater financial autonomy. Funds will be given to a university, which will decide how to spend the money. Universities are being encouraged to emulate the American model of seeking support from and fostering collaboration with industry. This change could have a dramatic effect on existing university supercomputing centers because the government will no longer earmark money for the supercomputer centers.

8  See <http://www.es.jamstec.go.jp/esc/images/journal200404/index.html> for more information on the Caltech research project at the ES.

There is widespread concern on the part of many in Japan regarding the quality of students. Both industry and government agencies (such as JAXA) expressed concern that students have no practical experience. Universities have been encouraged to provide more practical training and decrease the emphasis on academic study. JAXA has a comprehensive 2- to 3-year program to train graduate students before hiring them; a constraint to participation is that the students are not paid while training.

CHINA

China is making significant efforts to be self-sufficient in the area of high-performance computing. Its strategy is based on the use of commodity systems, enhanced with home-brewed technology, in an effort to reduce its dependence on technologies that may be embargoed. China had little or no representation on the TOP500 list until recently. It reached 51st place in June 2003, 14th in November 2003, and 10th in June 2004. The accumulated TOP500 performance has been growing by a factor of 3 every 6 months since June 2003. Today, China has a cumulative performance roughly equal to that of France, making it the fifth largest performer.

The top-listed Chinese system has a peak performance of 11 Tflops. It is a cluster of 2,560 Opteron multiprocessors (640 four-way nodes) connected by a Myrinet switch. The system was assembled and installed at the Shanghai Supercomputing Center by the Chinese Dawning company.9 This company markets server and workstation technologies developed by the Chinese Academy of Science (CAS-ICT), the National Research Center for Intelligent Computing Systems (NCIC), and the National Research Center for High Performance Computers.

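Two figures in the China discussion can be checked with a few lines of Python: the effect of tripling every 6 months, and the per-processor peak implied by the top-listed system. This is only an illustrative sketch. The names are assumptions, the 2,560 figure is treated as a processor count (640 nodes times four), and the two-period span assumes the June 2003 to June 2004 interval mentioned in the text.

```python
def compound_growth(factor_per_period: float, periods: int) -> float:
    """Total growth after a number of equal periods at a fixed factor."""
    return factor_per_period ** periods


# Aggregate TOP500 performance: a factor of 3 every 6 months (from the text).
# Assumption: June 2003 to June 2004 spans two 6-month list periods.
growth_over_one_year = compound_growth(3, 2)

# Top-listed Chinese system (figures from the text).
NODES = 640
PROCESSORS_PER_NODE = 4   # "four-way nodes"
PEAK_TFLOPS = 11.0

# Assumption: the quoted 2,560 is the total processor count.
processors = NODES * PROCESSORS_PER_NODE
peak_gflops_per_processor = PEAK_TFLOPS * 1000.0 / processors

print(f"Growth over one year: {growth_over_one_year}x")
print(f"{processors} processors, about {peak_gflops_per_processor:.1f} Gflops peak each")
```

A factor of 3 per list compounds to a factor of 9 over a year, and the quoted peak works out to about 4.3 Gflops per processor.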
Another top-ranked system (in 26th place) is the DeepComp 6800, a 1,024-processor Itanium cluster with a Quadrics QsNet interconnect that is used by the CAS-ICT. The system was assembled by the Chinese Lenovo Group Limited.10 CAS-ICT is the main shareholder of Lenovo, an important PC manufacturer.

China is also developing its own microprocessor technology: The Dragon Godson microprocessor is a low-power, MIPS-like chip; the Godson-II runs at 500 MHz and consumes 5 W. Dawning has announced plans to build clusters using this microprocessor.

9  See <http://www.dawning.com.cn>.

10  See <http://www.legendgrp.com>.

EUROPE

Collectively, the European Union countries had 113 of the TOP500 systems as of June 2004; this amounts to 23 percent of the TOP500 listed systems and 19 percent of their total compute power. However, it is not clear that one should treat the European Union as a single entity. In the past, the European Union made significant coordinated investments in HPC research: The 1995-1998 Fourth EU Framework Program for Research and Technological Development11 included €248 million for high-performance computing and networking (HPCN). However, HPC is not identified as a separate area in the Fifth or Sixth Framework Programs.12 The thematic areas are life sciences, information technology, nanotechnology, aeronautics and space, food quality and safety, sustainable development, and citizens and governance. While some of the funding under these headings supports the use of supercomputing systems, it is quite clear that HPC is driven in Europe by national policies rather than EU initiatives.

United Kingdom

The United Kingdom is the largest European supercomputer user, with two large academic centers: the Edinburgh Parallel Computing Center (EPCC) and the CSAR consortium at Manchester. Recently, it announced a large e-science initiative with a total budget of £213 million. The budget funds a national center at Edinburgh, nine regional centers, and seven centers of excellence. The e-science vision promoted by this initiative is similar to the cyberinfrastructure vision promoted by the Atkins report;13 it includes significant funding for supercomputers as part of a grid infrastructure. Some U.K. users have recently moved from vector systems to commodity-based systems.
The European Center for Medium-Range Weather Forecasts, which was a major Fujitsu user, now has an IBM Power 4-based system that was ranked 6th on the June 2004 TOP500 list. The center’s operational forecasts are carried out in ensembles of up to 100 simultaneous runs, which require large computing capacity rather than capability. (NCAR in the United States has also moved to an IBM system, but unwillingly, as a result of the antidumping case against NEC; see Box 8.1.) On the other hand, many weather and climate centers, including the U.K. Meteorology Office and DKRZ, the German HPC Center for Climate and Earth System Research, prefer to use custom SX-6 systems with 120 and 192 processors, respectively. EPCC was a heavy Cray T3E user and now hosts the 18th place system (owned by the HPCx consortium), also Power 4-based; CSAR deploys large shared memory machines with Origin and Altix processors.

11  The Fourth Framework Program is available online at <http://europa.eu.int/comm/research/fp4.html>.

12  The Sixth Framework Program, the current program, is available online at <http://europa.eu.int/comm/research/fp6/pdf/how-to-participate_en.pdf>.

13  Daniel E. Atkins. 2003. Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure. January.

An interesting aspect of U.K. HPC is the use of long-term contracts for procurements. Both EPCC and CSAR have 6-year service contracts with their platform suppliers that include an initial platform delivery and a 3-year refresh. Plans are made to allow such contracts to be extensible for up to 10 years, with periodic hardware refresh; 2-year extensions can be granted subject to a “comprehensive and rigorous review.”14

Germany

Germany has almost as many listed supercomputers as the United Kingdom. Many of the systems are hosted in regional centers that are locally funded by provincial authorities and by federal programs. There are three national centers: HLRS at Stuttgart, NIC at Jülich, and LRZ at Munich. The centers at Stuttgart and Munich host several large custom systems: a 48-processor NEC SX-6 at Stuttgart and a 1,344-processor Hitachi SR8000-F1 and a 52-processor Fujitsu VPP700/52 vector supercomputer at Munich.

France

France has fallen behind Germany and the United Kingdom in supercomputing. The largest French supercomputer is operated by the French Atomic Energy Commission (CEA-DAM) and is 28th on the TOP500 list.
It supports the French equivalent of the ASC program and is similar to (but smaller than) the ASC-Q system at LANL. Unlike the DOE centers, the French center is partly open and supports a collaboration with French industrial partners and other agencies (power, EDF; space, ONERA; engines, SNECMA; and turbines, TURBOMECA). France’s next two largest systems are industrial and commercial (petroleum, Total-Fina ELF; and banking, Société Générale). France has two academic supercomputing centers: CINES (55 people, yearly budget of about €10 million) and IDRIS (44 people, yearly budget of about €1 million).

14  U.K. Engineering and Physical Sciences Research Council. 2004. A Strategic Framework for High-End Computing, May, <http://www.epsrc.ac.uk/Content/Publications/Other/AStrategicFrameworkForHighEndComputing.htm>.

Spain

Spain recently announced its plan to build a 40-Tflops cluster system in Barcelona using IBM Power G5 technology. The Spanish government will invest €70 million in the National Centre for Supercomputing over 4 years. This will significantly enhance the compute power available in that country.

APPLICATION SOFTWARE

Generally, the type of research performed in these various centers is similar to the research performed in the United States; similar software is being used, and there is significant sharing of technology. However, both in Japan and in Europe there seem to be more targeted efforts to develop high-performance application software to support industry. Japan’s Frontier Simulation Software Project for Industrial Science is a 5-year program to develop parallel software in support of industrial applications, funded at about $11 million per year. The expectation is that the program, once primed, will be able to support itself from revenues produced by commercial software use. In joint university/industry projects, it is anticipated that university-developed software will be available through open source licensing, although industry-developed software will probably be proprietary.

Various European countries, in particular France, have significant programs with industrial participation for the development of engineering codes.
For example, the French SALOME project aims at the development of a large open source framework for CAD and numeric simulation; currently available code is distributed and maintained by the French Open Cascade company. EDF, EADS (aerospace) and other French companies are partners in the project. DARPA invested in similar projects as part of the SCI program, but that support seems to have disappeared. Furthermore, from the committee’s visits to DOE sites, members got the clear impression that there are no incentives for the transfer of codes developed at those sites to industrial use and no significant funding to facilitate the transfer.