

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 95
A Core CS&E Research Agenda for the Future

Core CS&E research is characterized by great diversity. Some core research areas are fostered by technological opportunities, such as advances in microelectronic circuits or optical-fiber communication. Such research generally involves system-building experiments. The successful incorporation of the remarkable advances in technology over the past several decades has been largely responsible for making computer systems and networks enormously more capable, while reducing their cost to the point that they have become ubiquitous. For other research areas, computing itself provides the inspiration. Complexity theory, for example, examines the limits of what computers can do. Computing-inspired CS&E research has often provided the key to effective use of computers, making the difference between the impossible and the routine.

The diversity of technical interests within the CS&E research community, of products from industry, of demands from commerce, and of missions among the federal research-funding agencies has created an intellectual environment in which a broad range of challenging problems and opportunities can be addressed. Indeed, the subdisciplines of CS&E exhibit a remarkable synergy, one that arises because the themes of algorithmic thinking, computer programs, and information representation are common to them all; Box 3.1 provides illustrative examples. Narrowing the focus to a few research topics to the exclusion of others would be a mistake. Thus the description of

COMPUTING THE FUTURE

promising research areas that follows below should not be regarded as definitive or exclusive.

As the saying goes, precise predictions are difficult, particularly about the future. Nevertheless, the committee is confident that technology-driven advances will be sustained for many more years and that computing and CS&E will continue to thrive on the philosophy,

well stated by Alan Kay, that the best way to predict the future is to invent it.

Major qualitative and quantitative advances will continue in several technological dimensions: processor capabilities and multiple-processor systems; available bandwidth and connectivity for data communications and networking; program size and complexity; management of multiple types, sources, and amounts of data; and number of people who use computers and networks.

For all of these dimensions, change will be in the same direction: systems will become larger and more complex. Coping with such change will demand substantial intellectual effort and attention from the CS&E research community, and indeed in many ways the overall theme of "scaling up" for large systems defines a core research agenda. (See also Box 3.2.)

Parts of the following discussion incorporate, augment, and extend key recommendations from recent reports that have addressed various fields within CS&E: the 1988 CSTB report The National Challenge in Computer Science and Technology; the 1989 Hopcroft-Kennedy report, Computer Science: Achievements and Opportunities; the 1990 Lagunita report, Database Systems: Achievements and Opportunities; and the 1989 CSTB report Scaling Up: A Research Agenda for Software Engineering.2

PROCESSOR CAPABILITIES AND MULTIPLE-PROCESSOR SYSTEMS

As noted in Chapter 6, future advances in computational speed are likely to require the connection of many processor units in parallel; Box 3.3 provides more detail. This trend was recognized in the Hopcroft-Kennedy report, which advocated research in parallel computing as described in Box 3.4.

Computing performance will increase partly because of faster processors. Advances in technology, the good fit of reduced-instruction-set computing (RISC) architectures with microelectronics, and optimizing compiler technology have permitted processor performance to rise steeply over the past decade.
Continued improvements are pushing single-processor performance toward speeds of 10^8 to 10^9 instructions per second and beyond. Even larger gains in performance will be achieved by the use of multiple processors that operate in parallel on different parts of a

demanding application. The Hopcroft-Kennedy report, written in 1986-1987, described a goal of 10-fold to 100-fold speedups.3 But since that time, technological advances have made this goal far too modest. Today, it is plausible to aim for increases in speed by factors of 1000 or more for a wide class of problems. Apart from this point, the Hopcroft-Kennedy outline remains generally valid.

Vector supercomputers have gained speed, more from multiple (10 to 100) arithmetic units than from increases in single-processor performance. Massively parallel supercomputers (1,000 to 100,000 very simple processors) have now passed vector supercomputers in peak performance. Workstations with multiple (2 to 10) processors are becoming more common, and similar personal computers will not be far behind.

The ability of parallel systems to handle many demanding computing problems has been demonstrated clearly during the period since the Hopcroft-Kennedy report was written. It had not been clear that linear speedups are practically achievable by using processors in parallel; indeed on some problems they are not.4 The practical difficulty in exploiting parallel systems is that their efficient use generally requires an explicitly parallel program, and often a program that is tailored for a specific architecture. Although such programs are generally more difficult to write than are sequential programs, the investment is often justified. On well-structured problems in scientific computing, visualization, and databases, results have been obtained that would not otherwise be affordable.5 Insights into possibilities for programming and architectures are provided by the study and instrumentation of problems that are less regular or algorithmically more difficult and that "push the envelope" of parallel systems. Parallel computing will be a primary focus of the high-performance computing systems component of the HPCC Program.
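
The gap between an ideal linear speedup and what a particular program achieves can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, and the model used, Amdahl's law, is a standard simplification that the report itself does not name. It assumes that a fixed fraction of the work is inherently serial and it ignores communication and architecture-specific costs.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Measured speedup: time on one processor divided by time on many."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, processors: int) -> float:
    """Fraction of the ideal (linear) speedup actually obtained; 1.0 is linear."""
    return speedup(t_serial, t_parallel) / processors


def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Upper bound on speedup under Amdahl's law (an assumed, simplified model).

    serial_fraction is the share of the work that cannot be parallelized.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


if __name__ == "__main__":
    # Hypothetical serial fractions, chosen only to show the trend.
    for fraction in (0.0, 0.001, 0.01):
        bound = amdahl_speedup(fraction, 1000)
        print(f"serial fraction {fraction}: bound on 1000 processors = {bound:.1f}")
```

Under this simplified model, a program that is 1 percent serial is limited to roughly a 91-fold speedup on 1,000 processors, and reaching about 500-fold requires a serial fraction near 0.1 percent.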

Distributed computing, another focus of the Hopcroft-Kennedy report, is perhaps a more pressing concern now than when that report's recommendations were formulated.6 Computing environments have been evolving from individual computers to networks of computers. Seamless integration of heterogeneous components into a coherent environment has become crucial to many applications. Customers are increasingly insisting on the freedom to buy their computer components, software as well as hardware, from any vendor on the basis of price, performance, and service and still expect these various components to operate well together. Such pressure from customers has hastened the movement toward "open system" architectures. Businesses are becoming more dispersed geographically, yet more integrated logically and functionally. Distributed computer systems are indispensable to this trend.

As computing penetrates more and more sectors of society, reliability of operation becomes ever more important. Many applications (e.g., space systems, aircraft, air-traffic control, factory automation, inventory control, medical delivery systems, telephone networks, stock exchanges) require high-availability computing. Distributed computing can foster high availability by eliminating vulnerability to single-point failures in software, hardware, electric service, the labor pool, and so on.

What intellectual problems arise in parallel and distributed computing? As discussed at length in the Chapter 6 section "Systems and Architectures," parallel and distributed computing systems are capable of nondeterministic behavior, producing different results depending on exactly when and where different parts of a computation happen. Unwanted conditions may occur, notably deadlock, in which each of two processes waits for something from the other. These complications are exacerbated when the system must continue to operate correctly in the presence of hardware, communication, and software faults.

Sequential programming is already difficult; the additional behavioral possibilities introduced by concurrent and distributed systems make it even harder to assure that a correct or acceptable result is produced under all conditions. New disciplines of parallel, concurrent, and distributed programming, together with the development and experimental use of the programming systems to support these disciplines, will be a high priority of and a fundamental intellectual challenge for CS&E research for at least the next decade.

DATA COMMUNICATIONS AND NETWORKING

Compared with copper wires, fiber-optic channels provide enormous bandwidths at extremely attractive costs. A 1000-fold increase in bandwidth completely changes technology trade-offs and requires a radically different network design for at least three reasons.
One reason is that the speed of transmission, bounded by the speed of light, is about the same whether the medium of transmission is copper wire or optical fiber. Current computer networks are based on the premise that transit time (i.e., the time it takes for a given bit to travel from sender to receiver) is small compared to the times needed for processing and queuing.7 However, data can be entered into a gigabit network so fast that transit time may be comparable to or even longer than processing and queuing time, thereby invalidating this premise. For example, a megabyte-size file can be queued in a gigabit network in ten milliseconds. But if the file is transmitted coast to coast, the transit time is about twice as long.
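
The arithmetic behind this example can be checked with a short calculation. The sketch below is a back-of-the-envelope model, not a network simulation. The path length (4,000 km) and the propagation speed in fiber (about two-thirds the speed of light in vacuum) are assumed values that do not come from the text, and the function names are hypothetical.

```python
# All constants are illustrative assumptions, not figures from the report.
FILE_BITS = 8 * 10**6        # a "megabyte-size" file, taken as 10^6 bytes
LINK_RATE_BPS = 10**9        # a gigabit-per-second link
DISTANCE_M = 4.0e6           # assumed coast-to-coast path length (4,000 km)
SIGNAL_SPEED_MPS = 2.0e8     # assumed signal speed in fiber (about 2/3 of c)


def serialization_time(bits: float, rate_bps: float) -> float:
    """Seconds needed to place all bits of a message onto the link."""
    return bits / rate_bps


def transit_time(distance_m: float, speed_mps: float) -> float:
    """Seconds for one bit to travel from sender to receiver."""
    return distance_m / speed_mps


def bits_in_flight(rate_bps: float, transit_s: float) -> float:
    """Bits already sent when the first bit reaches the far end."""
    return rate_bps * transit_s


if __name__ == "__main__":
    t_send = serialization_time(FILE_BITS, LINK_RATE_BPS)
    t_transit = transit_time(DISTANCE_M, SIGNAL_SPEED_MPS)
    print(f"time to enter the network: {t_send * 1e3:.1f} ms")
    print(f"transit time:              {t_transit * 1e3:.1f} ms")
    print(f"bits in flight:            {bits_in_flight(LINK_RATE_BPS, t_transit):.0f}")
```

With these assumed values the file enters the network in 8 milliseconds (which the text rounds to ten), the transit time is 20 milliseconds, and about 20 million bits are in the link before the first bit reaches the receiver.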

Under these conditions, millions of bits will be pumped into the cross-country link before the first bit appears at the output.

A second reason is that current networks operate slowly enough that incoming messages can be stored temporarily or examined "on the fly." Such examinations underlie features such as dynamic route computation, in which the precise path that a given message takes through a network is determined at intermediate nodes through which it passes. In a gigabit network the volume of data is much larger and the time available to perform "on-the-fly" calculations is much smaller, perhaps so much so that store-and-forward operation and dynamic routing may not be economically viable design options.

A third reason is that the underlying economics are very different. In current networks, channel capacity (i.e., bandwidth) is expensive compared with the equipment that allows many users to share the channel. Sharing the channel (i.e., "multiplexing," or switching among many users) minimizes the idle time of the channel. But fiber optics is based on the transmission of light pulses (photons) rather than electrical signals. The technology for switching light pulses is immature compared with that for switching electrical signals, with the result that switching devices for fiber optics are relatively more expensive than channel capacity.

As noted in Chapter 6, a complete understanding of networks based on first principles is not available at this time. Today's knowledge of networking is based largely on experience with and observation of megabit networks. Gigabit networking thus presents a challenging research agenda, one that is an important focus of the HPCC Program. Consider the following kinds of research problems that arise in the study of gigabit networking:

Network stability (i.e., the behavior of the flow of message traffic) is particularly critical for high-speed networks.
A network is an interconnected system, with many possible paths for feedback to any given node. A packet sent by one node into the network may trigger at some indeterminate point in the future further actions in other nodes that will have effects on the originating node. The inability to predict just when these feedback effects will occur presents many problems for system designers concerned about avoiding catastrophic positive feedback loops that can rapidly consume all available bandwidth. This so-called delayed-feedback problem is unsolved for slower networks as well, but our understanding for slower networks is at least informed by years of experience.

Network response is another issue that depends on empirical understanding. In particular, the fiber-based networks of the future

will transmit data much more rapidly, the computer systems interconnected on these networks will operate much more quickly, the number of users will be much larger, and computing tasks may well be dispersed over the network to a much greater degree than today. All of these factors will affect the behavior of the network.

Network management itself requires communication between network nodes. Gigabit networks will involve significant quantities of this "overhead" information (e.g., routing information), primarily because there will be so many messages in transit. Thus, fast networks require protocols and algorithms that will reduce to an absolute minimum the overhead involved in the transmission of any given message.

Network connections will have to be much cheaper. Scaling a network from 100,000 connections (the Internet today) to 100 million connections (the number of households in the United States, and the ultimate goal of many networking proponents for which the National Research and Education Network may be a first step) will require radical reductions in the cost of installing and maintaining individual connections. These costs will have to drop by orders of magnitude, a result possible only with the large-scale automation of operations, similar to that used in the telephone network today.

SOFTWARE ENGINEERING

The problems of large-scale software engineering have been the focus of many previous studies and reports, in particular the Hopcroft-Kennedy report (Box 3.5) and the CSTB report Scaling Up (Table 3.1). Nevertheless, large-scale software engineering remains a central challenge, as discussed in Box 3.6.
The committee recommends continuing efforts across a broad front to understand large-scale software engineering, concurs with the research agendas of the Hopcroft-Kennedy and CSTB reports, and wishes to underscore the importance of two key areas, reengineering of existing software and testing.

Reengineering of Existing Software

Large-scale users of computers place great emphasis on reliability and consistency of operation, and they have enormous investments tied up in software developed many years ago by people who have long since retired or moved on to other jobs. These users often recognize that their old software systems are antiquated and difficult to maintain, but they are still reluctant to abandon them. The reason is

that system upgrades (e.g., converting an air traffic control system that might be written in PL/1 to a more modern one written in Ada) present enormous risks to the users who rely daily on that system. The new system must do exactly what the old system did; indeed, a new system may need to include bugs from the old system that previously necessitated "work-arounds," because the people and other computer systems that used the old system have become accustomed to using those work-arounds. In many cases the current operating procedures of the organization are only encoded in (often undocumented) programs and are not written down or known completely by any identifiable set of people.

Thus effective reengineering requires the ability to extract from code the essentials of existing designs. New technologies that support effective and rapid upgrade within operational constraints would

have enormous value to software engineering, especially in the commercial world. Such technologies could include graphical problem-description methodologies that provide visual representations of program or data flow or automated software tools that make it easier to extract specifications from existing code or to compare different sets of specifications for contradictions or inconsistencies.

TABLE 3.1 The "Scaling Up" Agenda for Software Engineering Research

Perspective
  Short Term (1-5 years): Portray systems realistically: view systems as systems and recognize change as intrinsic. Study and preserve software artifacts.
  Long Term (5-10 years): Research a unifying model for software development, for matching programming languages to applications domains and design phases. Strengthen mathematical and scientific foundations.

Engineering practice
  Short Term (1-5 years): Codify software engineering knowledge for dissemination and reuse. Develop software engineering handbooks.
  Long Term (5-10 years): Automate handbook knowledge, access, and reuse, and make development of routine software more routine. Nurture collaboration among system developers and between developers and users.

Research modes
  Short Term (1-5 years): Foster practitioner and researcher interactions.
  Long Term (5-10 years): Legitimize academic exploration of large software systems in situ. Glean insights from behavioral and managerial sciences. Develop additional research directions and paradigms: encourage recognition of review studies, contribution to handbooks.

SOURCE: Reprinted from Computer Science and Technology Board, National Research Council, Scaling Up: A Research Agenda for Software Engineering, National Academy Press, Washington, D.C., 1989, p. 4.


Testing

As noted in Box 3.2, testing is a severe bottleneck in the delivery of software products to market.8 Moreover, while program verification, proofs of program correctness, and mathematical modeling of program behavior are feasible at program sizes on the scale of hundreds of lines, these techniques are inadequate for significantly larger programs. One reason is sheer magnitude. A second, more important reason is the inherent incompleteness, if not incorrectness, of large sets of specifications. Program verification can show that a program conforms to its specifications or that the specifications contain inadvertent loose ends, but not that the specifications describe what really needs to be done.

Thus theories and practical methods of software testing that are applicable to real-world development environments are essential. Some relevant questions are the following:

How can competent test cases be generated automatically?
How can conformity between documentation and program function be achieved?
How can requirements be tested and verified?

INFORMATION STORAGE AND MANAGEMENT

The Lagunita report described an important and far-reaching agenda for database research (Box 3.7). The committee believes that the Lagunita research agenda remains timely and appropriate, and also commends for attention:

Data mining and browsing techniques that can uncover previously unsuspected relationships in data aggregated from many sources. Box 3.8 describes some database research questions that are motivated by commercial computing.

Systems architectures, data representations, and algorithms to exploit heterogeneous, distributed, or multimedia databases on scales of terabytes and up.

Multimedia databases will be especially useful to modern businesses, most of which make substantial use of text and images; document and image scanning, recognition, storage, and display are at the core of most office systems. Current networks, databases, tools, and programming languages do not handle images or structured text very well. Image searches, in particular, usually depend on keyword tags assigned to images manually and in advance. Distributed databases maintained at different nodes are increasingly common. Integrating data residing in different parts of the database (e.g., in different companies, or different parts of the same company) will become more important and necessary in the future. Thus research on multimedia and distributed databases would have a particularly high payoff for commercial computing.

RELIABILITY

Reliability, informally defined here as the property of a computer system that the system can be counted on to do what it is supposed to do, is an example of a research area that potentially builds on, or is a part of, many other areas of CS&E. Distributed systems provide one promising method for constructing reliable systems. Assuring that a program behaves according to its specification is one of the first requirements of software engineering. Large (terabyte-scale) databases will need to be maintained on line and to be accessible for periods longer than the time between power failures, the time between media failures (disk crashes), and the lifetime of data formats and operating software.

As computing becomes a crucial part of more and more aspects of our lives and the economy, the reliability of computing correctly comes into question (Box 3.9). The following technical problems are often relevant to decisions regarding whether computers should be used in critical applications:

As failures in telephone and air traffic control systems have demonstrated, errors can propagate catastrophically, causing service

outages out of proportion to the local failures that caused the problem. The problems of ensuring reliability in distributed computing are multiplying at least as rapidly as solutions.

Software systems, particularly successful ones, usually change enormously with time. Yet almost everything in the system builder's tool kit is aimed at building static products. Existing techniques do not contemplate making a system so that it can change and evolve without being taken off line. Upgrading a system while it runs is an important challenge. A related challenge is doing large computations whose running time exceeds the expected "up time" of the computer or computers on which the computation is executed. Ad hoc methods of checkpointing and program monitoring are known, but their use may introduce debilitating complications into the programs.

Nonexpert users of computers need graceful recovery from errors (rather than cryptic messages, such as "Abort, Retry, or Fail?") and automatic backup or other mechanisms that insulate them from the penalties of error.

USER INTERFACES

User interfaces, one dimension of a subfield of CS&E known as human-computer interaction, offer diverse research challenges.9 The keyboard and the mouse remain the dominant input devices today.

Talking to a computer is in certain situations more convenient than typing, but the use of speech as an input medium poses many problems, some of which are listed in Box 3.10. If it is to cope with situations of any complexity, a computer must be able to interpret imprecisely or incompletely formulated utterances, recognize ambiguities, and exploit feedback from the task at hand. Analogous problems exist in recognizing cursive handwriting, and even printed matter, in which sequences of letters are merged or indistinct. The experimental DARPA-funded SPHINX system, described more fully in the Chapter 6 section "Artificial Intelligence," is a promising start to solving some of the problems of speech recognition, and Apple Computer expects to bring to market in the next few years (and has already demonstrated) a commercial product for speech recognition called "Plaintalk" based on SPHINX.

Recognition of gestures would also increase the comfort and ease of human-computer interaction. People often indicate what they want with gestures: they point to an object. Touch-sensitive screens can provide a simple kinesthetic input in two dimensions, but the recognition of motions in three dimensions is much more difficult. Pen-based computing, i.e., the use of a pen to replace both the keyboard (for the input of characters) and the mouse (for pointing), is another form of gesture recognition that is enormously challenging and yet has the potential for expanding the number of computer users considerably. Indeed, the ability to recognize handwritten characters, both printed and cursive, will enable computers to dispense entirely with keyboards, making them much more portable and much easier to use.

The primary output devices of today adhere to the paper metaphor; even the CRT screen is similar to a sheet or sheets of paper on

which two-dimensional visual objects (e.g., characters or images) are presented, albeit more dynamically than on paper. People can, however, absorb information through, and have extraordinary faculties for integrating stimuli from, different senses.

Audio output systems can provide easily available sensory cues when certain actions are performed. Many computers today beep when the user has made a mistake, alerting him or her to that fact. But sounds of different intensity, pitch, or texture could be used to provide much more sophisticated feedback. For example, an audio output system could inform a user about the size of a file being deleted, without forcing the user to check the file size explicitly, by making a "clunk" sound when a large file is deleted and a "tinkle" sound when a small one is deleted.

Touch may also provide feedback. Chapter 6 (in the section titled "Computer Graphics and Scientific Visualization") describes the use of force feedback in the determination of molecular "fitting": how a complex organic molecule fits into a receptor site in another molecule. But the use of a joystick is relatively unsophisticated compared to the use of force output devices that could provide resistance to the motion of all body parts.

Three-dimensional visual output provides other interesting research issues. One is the development of devices to present three-dimensional visual output that are less cumbersome than the electronic helmets often used today. A second issue cuts across all problem domains and yet depends on the specifics of each domain: many appealing examples of "virtual reality" displays have been proposed and even demonstrated, but conceiving of sensible mappings from raw data to images depends very much on the application. In some cases, the sensible mappings are obvious.
A visual flight simulator simulates the aircraft dynamics in real time and presents its output as images the pilot would see while flying that airplane. (Along the lines of the discussions above, the sounds, motions, control pressures, and instruments of the simulated aircraft may also be presented to the pilot. These simulations are so realistic that an air-transport pilot's first flight in a real aircraft of a given type may be with passengers.10) But in other cases, such as dealing with abstract data, useful mappings are not at all obvious. What, for example, might be done with the reams of financial data associated with the stock market?

SUMMARY AND CONCLUSIONS

The core research agenda for CS&E has been well served in the past by the synergistic interaction among the computer industry,

the companies that are the eventual consumers of computer hardware, software, and services, and the federal research-funding agencies. As a result, CS&E research exhibits great diversity, a diversity that is highly positive and beneficial. In turn, this diversity allows a broad range of challenging problems and opportunities to be addressed by CS&E research. Thus, although the committee cannot escape its obligation to address priorities and to provide examples of research areas that it believes hold promise, it must be guarded in its judgment of what constitutes today's most important research.

That said, the committee believes that major qualitative and quantitative advances in several dimensions will continue to drive the evolution of computing technology. These dimensions include processor capabilities and multiple-processor systems, available bandwidth and connectivity for data communications and networking, program size and complexity, the management of increased volumes of data of diverse types and from diverse sources, and the number of people using computers and networks. Understanding and managing these changes of scale will pose many fundamental problems in computer science and engineering, and using these changes of scale properly will result in more powerful computer systems that will have profound effects on all areas of human endeavor.

NOTES

1. The definition of which subareas of CS&E research constitute the "core" is subject to some debate within the field.
For example, the Computer Science and Technology Board report The National Challenge in Computer Science and Technology (National Academy Press, Washington, D.C., 1988) identified processor design, distributed systems, software and programming, artificial intelligence, and theoretical computer science as the subfields most likely to influence the evolution of CS&E in the future, noting that "the absence of discussion of areas such as databases does not mean that they are less important, but rather that they are likely to evolve further primarily through exploitation of the principal thrusts that [are discussed]" (p. 39). In its own deliberations, the committee included such areas in the "core" of CS&E, motivated in large part by its belief that their importance is likely to grow as CS&E expands its horizons to embrace interdisciplinary and applications-oriented work.

2. Computer Science and Technology Board, National Research Council, The National Challenge in Computer Science and Technology, National Academy Press, Washington, D.C., 1988; John E. Hopcroft and Kenneth W. Kennedy, eds., Computer Science: Achievements and Opportunities, Society for Industrial and Applied Mathematics, Philadelphia, 1989; Avi Silberschatz, Michael Stonebraker, and Jeff Ullman, eds., "Database Systems: Achievements and Opportunities," Communications of the ACM, Volume 34(10), October 1991, pp. 110-120; Computer Science and Technology Board, National Research Council, Scaling Up: A Research Agenda for Software Engineering, National Academy Press, Washington, D.C., 1989.

3. Hopcroft and Kennedy, Computer Science: Achievements and Opportunities, 1989, p. 72.

4. A simple simulation argument shows in general that super-linear speedup on homogeneous parallel systems (i.e., systems that connect the same basic processor many times in parallel) is not possible. Super-linear speedup would involve, for example, applying two processors to a problem (or to a selected class of problems) and obtaining a speedup larger than a factor of two. In addition, for many interesting applications it turns out that even linear speedup is impossible even when the machine design should "in principle" allow linear scaleup. Both the limitations of real machines and the issues of what scaling implies for problems from the physical world make even linear scaleup impossible for many real problems.

5. For example, parallel processors are emerging as effective search engines for terabyte-size databases. Automatic declustering of data across many storage devices and automatic extraction of parallelism from nonprocedural database languages such as SQL are demonstrating linear speedup and scaleup. Teradata, Inc., has demonstrated scaleups and speedups of 100:1 on certain database search problems.

6. Distributed computing refers to multiple-processor computing in which the overall cost or performance of a computation is dominated by the requirements of communicating data between individual processors, rather than the requirements of performing computations on individual processors. Parallel computing refers to the case in which the requirements of computations on individual processors are more important than the requirements of communications.

7. Transit time is important to gigabit networks because the arrival of messages at a given node is a statistical phenomenon.
If these messages arrive randomly (i.e., if the arrival times of messages are statistically independent), the node can be designed to accommodate a maximum capacity determined by well-understood statistics. However, if the arrival times of messages are correlated, the design of the node is much more complicated, because "worst cases" (e.g., too many messages arriving simultaneously) will not be smoothed out for statistical reasons. In gigabit networks, the network-switching and message-queuing time for small files will be much smaller than the transit time. The result is that the end-to-end transmission time for all messages will cluster around the transit time, rather than spread out over a wide range of times as in the case of lower-speed networks.

8. The impact of software testing on product schedules has been known for a long time. In 1975, Fred Brooks noted that testing generally consumed half of a project's schedule. See Frederick Brooks, The Mythical Man-Month, Addison-Wesley, Reading, Mass., 1975, p. 29.

9. Human-computer interaction is a very broad field of inquiry, some other areas of which are discussed in Box 2.8 in Chapter 2. Human-computer interaction is highly interdisciplinary, drawing on insights provided by fields such as anthropology, cognitive science, and even neuroscience to develop ways for computer scientists and engineers to maximize the effectiveness of these interactions.

10. For example, the flight simulator for the A320 Airbus is sufficiently sophisticated that pilots can receive flight certification based solely on simulator training. See Gary Stix, "Along for the Ride," Scientific American, July 1991, p. 97.
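
The simulation argument mentioned in note 4 can be illustrated with a small sketch. It is not taken from the report: it assumes an idealized model in which a parallel program is a list of steps, each step runs at most p unit-cost operations at once on identical processors, and a single processor simulates the program by running every operation one after another. The names `parallel_time`, `simulated_time`, and `max_speedup` are hypothetical. Under those assumptions the simulated sequential time is at most p times the parallel time, so the speedup cannot exceed p.

```python
from typing import List

# Hypothetical model: a schedule is a list of steps, and each step lists the
# unit-cost operations that the identical processors execute concurrently.
Schedule = List[List[str]]


def parallel_time(schedule: Schedule) -> int:
    """Time on the parallel machine: one time unit per step."""
    return len(schedule)


def simulated_time(schedule: Schedule) -> int:
    """Time for one processor that runs every operation of every step in turn."""
    return sum(len(step) for step in schedule)


def max_speedup(schedule: Schedule, p: int) -> float:
    """Ratio of simulated sequential time to parallel time (never more than p)."""
    if not schedule:
        raise ValueError("schedule must contain at least one step")
    if any(len(step) > p for step in schedule):
        raise ValueError("a step uses more than p processors")
    t_p = parallel_time(schedule)
    t_sim = simulated_time(schedule)
    # Each step contributes at most p operations, so t_sim <= p * t_p.
    assert t_sim <= p * t_p
    return t_sim / t_p


# Two processors, three steps; the second processor is idle in the middle step.
example: Schedule = [["a0", "b0"], ["a1"], ["a2", "b2"]]
print(max_speedup(example, p=2))  # a value no greater than 2
```

Because the best sequential program is at least as fast as this step-by-step simulation, the ratio computed here is an upper bound on the achievable speedup in this simplified model. The model deliberately ignores memory hierarchy and communication costs, which are among the limitations of real machines that the note refers to.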