

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.



2
PRINCIPLES FOR THE EVALUATION OF SUPERCOMPUTERS

An evaluation of supercomputers may be undertaken for any of several purposes: to choose the machine that best satisfies stated requirements, to allocate machine resources efficiently among many users, to estimate the capability of solving a given problem within constraints of time and cost, or to evaluate alternatives for the design of new supercomputer systems.

THE CONTEXT FOR EVALUATION

As suggested by the definition of the word "evaluate," the purpose of computer evaluation is to determine the value of a computer (and, by implication, of the computer system of which it is a major part). However, no evaluation method can determine the value of a computer in the abstract; it is a fundamental principle of computer evaluation that the value of a computer depends on the context in which it is used. The connotation of "context" can be understood through three corollaries of this principle:

Corollary 1: The value of a computer is application dependent.
Corollary 2: The value of a computer is site dependent.
Corollary 3: The value of a computer is time dependent.

Different applications having comparable computational complexity perform differently on the same computer, sometimes by an order of magnitude or more, so the validity of an evaluation is limited to the applications studied. Further, because different computer center sites with similar hardware and software have different application mixes, the validity of an evaluation of a computer for one site or installation may not, and usually does not, extend to others. Finally, applications are continually being refined through changes in programming, algorithms, mathematical methods, and the scientific or engineering principles that are used, so the validity of an evaluation may degrade over time. In summary, the value of a computer is a dynamic characteristic that depends on the changing requirements of the users.

CRITERIA FOR EVALUATION

Figure 2-1 shows a relevance tree for some of the criteria that affect the value of a computer. Qualitative and quantitative criteria are distinguished and discussed briefly below to emphasize that evaluation rests on many factors. One of these factors--performance--is then singled out as the main topic of the report.

Qualitative Criteria

Obsolescence

The value of a computer is in part dependent on its position in its life cycle. If only quantitative measures such as performance and cost are considered in the evaluation, a system can appear to have high value even though it is obsolescent; however, the system may have a relatively short remaining life, so the cost per productive year may actually be quite high. Obsolescence is an issue not only for the acquisition of computers but also for their design. For example, the main-memory capacity of supercomputers has increased dramatically in recent years because of the precipitous decline in the cost of random-access memory chips. Thus, to design a supercomputer without a very large real-address space would be an instance of obsolescent design.

Compatibility

A policy in wide use among users of large-scale computers is that a system incompatible with the systems already installed must provide a performance gain commensurate with the cost of conversion. Compatibility is an issue not only in acquisition of existing systems but also in the design of new systems: departing from the details of a previous design will often permit higher performance to be achieved, but at the cost of conversion efforts by the users.

FIGURE 2-1 A relevance tree for supercomputer evaluation. (The tree divides evaluation criteria into qualitative criteria: obsolescence, compatibility, and standards; and quantitative criteria: performance, cost, and productivity.)

Standards

Various standards of performance and usage are necessary to computer operations. These standards may pertain to hardware characteristics, interfaces, operating systems, languages, and protocols for communication. Conformity with two kinds of standards, official and de facto, affects the value of supercomputers. De facto standards connote those products that are in use by such a large percentage of a user community that they take on the character of a standard without official promulgation. Examples of de facto standards occur in the minicomputer, mainframe, and supercomputer markets, wherever a given product series enjoys a major market share. Operating systems, such as UNIX* and VMS, and languages, such as FORTRAN and Ada, have become de facto standards. Conformity with de facto standards increases the value of a computer because of the large set of user-generated applications and system software available to it and the ease of collaboration with a large set of users. In addition, both kinds of standards facilitate the portability to new machines of operating systems, applications packages, and large programs.

Quantitative Criteria

Performance

Performance evaluation is usually conducted by analytical modeling, simulation, and measurement of the running time of selected codes (commonly called "benchmarking"). Performance criteria for highly complex problems include problem solution time and execution rate; the criterion for large volumes of problems each having low complexity is throughput, that is, the number of problems that can be solved in a given time. In either instance, it is the requirements of the users that determine the appropriate criteria.

Costs

A complete list of costs begins with hardware, software, personnel, operations, and system maintenance. It goes on to include such things

*UNIX is a trademark of AT&T Information Systems.

as development of application programs, networking, training, consulting, obsolescence, upgrading, conversion of programs, and documentation. Costs are often underestimated by including only those of acquisition and installation.

Productivity

Productivity is defined as output divided by input, where output includes all results perceived as useful by users and input includes all costs to produce the output. Productivity is not adequately measured by the performance and cost of the system alone: the performance and cost of the users themselves must be evaluated. For example, the U.S. Department of Energy high-energy physics community considers "quality of life" to be of equal merit with system performance in assessing the value of a system. Quality of life includes such things as fast response time in an interactive environment, powerful debugging aids, high-speed and easy-to-use graphics, powerful languages, and optimizing compilers.

MATCHING COMPUTATIONAL AND APPLICATION CHARACTERISTICS

The task of evaluating supercomputers is conceptually one of matching the characteristics of the computational environment with those of the intended applications. Table 2-1 characterizes various computing resources according to the technologies used in computing environments (processing, storage, input-output, and communications) and the mode of access to those technologies (centralized, distributed, and personal). Centralized resources are those that are shared by a large number of users, typically hundreds to thousands. Distributed resources are usually shared by tens to hundreds of users, although in principle they are accessible by all users on the network. Personal resources are typically not shared at all. It is the whole environment, not just centralized processing, that must be evaluated in matching resources to requirements. In the chapters that follow, the emphasis is on quantitative evaluation of the system performance of centralized resources.
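The productivity ratio described above can be sketched in a few lines of Python. This is not from the report itself; the cost categories and figures below are illustrative placeholders only, chosen to echo the cost list given in the text.

```python
def productivity(useful_output: float, costs: dict) -> float:
    """Productivity = output / input: total useful output divided by
    the sum of all costs incurred to produce it."""
    total_cost = sum(costs.values())
    return useful_output / total_cost

# Hypothetical cost breakdown; units and values are illustrative only.
example_costs = {
    "hardware": 10.0,
    "software": 4.0,
    "personnel": 5.0,
    "operations": 1.0,
}
ratio = productivity(100.0, example_costs)  # 100 units of output over 20 units of cost
```

The point the text makes is visible in the structure: omitting cost categories from the dictionary (counting only acquisition and installation, say) inflates the apparent productivity.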
THE CYCLE OF PERFORMANCE EVALUATION Another principle of supercomputer evaluation is that the performance evaluation process is typically cyclic in nature, that is, it may require several iterations through the process before a satisfactory

TABLE 2-1 Computing Resources by Technologies and Operating Environments

                  Operating Environments
Technologies      Centralized        Distributed        Personal
----------------  -----------------  -----------------  --------------------
Processing        Large-scale        Minicomputers      Micro and personal
                  computers                             computers
Storage           Common file        Local disk         Floppy and hard
                  system             systems            disks
Input-output      High-speed         Medium-speed       Terminals, work
                  graphics and       printers           stations, and slow
                  plotters                              printers
Communications    Site networks,     Site networks,     Site networks,
                  LANs(a), and       LANs, and WANs     LANs, and WANs
                  WANs(b)

(a) LAN = local-area network
(b) WAN = wide-area network

result is obtained. Figure 2-2 illustrates the major steps in the cycle. We begin with the systems of interest, either existing or proposed supercomputers, and generate either empirical or analytical models of them. In the past these models have been largely empirical, consisting of relationships drawn from experience in the category of "rules of thumb." For example, a simple empirical rule for estimating scalar performance in floating-point operations per unit time is 1/(12*tc), where tc is the scalar cycle time. This rule has been found to provide an estimate typically accurate within a few tens of percent. However, the rule does not rest on a theory to explain why it is correct or how to improve on it. To move performance evaluation toward a science it is necessary to go beyond rules of thumb to analytical models that have both explanatory and predictive power, and better accuracy, through the identification of explanatory variables and their relationships.

The type of experiments used in this cycle depends on whether the system of interest is an existing computer, in which case its performance can be measured, or a proposed computer, in which case its performance must be simulated. The raw data obtained from experiments can be combined to give several metrics of performance. These metrics include time metrics, such as problem solution time and interactive response time; rate metrics, such as millions of instructions per second, millions of floating-point operations per second, and throughput; and cost metrics, such as the product of the number of processors and the solution time for parallel processors. Depending on the observed performance, the system may be modified and the cycle repeated. Evaluation usually requires many iterations of this cycle to satisfy optimization criteria for either selection or design of a supercomputer system. This iterative process is discussed further in Chapter 4.
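The rule of thumb and the cost metric mentioned above are simple enough to state as code. The following Python sketch is not part of the original report; the 12.5-nanosecond cycle time in the example is purely illustrative.

```python
def estimated_scalar_flops(cycle_time_s: float) -> float:
    """Rule-of-thumb estimate of scalar performance in floating-point
    operations per second: 1 / (12 * tc), where tc is the scalar cycle
    time in seconds. Typically accurate within a few tens of percent."""
    return 1.0 / (12.0 * cycle_time_s)

def parallel_cost(num_processors: int, solution_time_s: float) -> float:
    """Cost metric for a parallel processor: the product of the number
    of processors and the solution time."""
    return num_processors * solution_time_s

# Example: a 12.5-ns scalar cycle time (an illustrative value) gives
# roughly 6.7 million floating-point operations per second.
mflops = estimated_scalar_flops(12.5e-9) / 1e6
```

Note that the rule is purely empirical: nothing in it explains the factor of 12, which is exactly the limitation the text raises in arguing for analytical models.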

FIGURE 2-2 The cycle of performance evaluation. (The cycle connects supercomputers, existing or proposed; models, empirical or analytical; experiments, measurements or simulations; and metrics of time, rate, and cost.)