Part II
The Science, Engineering, and Technologies

In Part II we take up the scientific, engineering, and technological thrusts that were identified and summarized in Part I as most likely to influence the evolution of the field in the next decade and beyond. Here we provide more detail about their potential uses as well as some of the associated problems and prospects. Again, we remind the reader to keep in mind two important points: first, there are at least four technologies relevant to the link between computers and productivity: computer technology, communication technology, semiconductor technology, and packaging and manufacturing technology. We have focused on the first. Second, we are not presenting here an exhaustive taxonomy of all the computer subfields and their capabilities, but rather what we consider the most promising. The absence of discussion on areas such as databases does not mean that they are less important, but rather that they are likely to evolve further primarily through exploitation of the principal thrusts that we do discuss, such as those in multiprocessors and intelligent systems.
6
Machines, Systems, and Software

One of the biggest tasks facing computer designers is the development of systems that exploit the capabilities of many computers in the form of what are called multiprocessor and distributed systems. Related to current and future hardware systems is the task of developing the associated software. These three topics are discussed below.

MULTIPROCESSOR SYSTEMS

Multiprocessor systems strive to harness tens, hundreds, and even thousands of computers to work together on a single task, for example, to solve a large scientific problem or to understand human speech and transcribe it to text. These systems involve many processors located close to each other and with some means for communicating with one another at relatively high speeds.

The effort to build multiprocessor systems is the consequence of a powerful economic trend. Generational advances in VLSI design have made it possible to fabricate powerful microprocessors relatively cheaply; today, for example, a single silicon chip capable of executing 1 million instructions per second costs less than $100 to make. But the cost of manufacturing a very high-speed single processor, using a number of chips and other high-speed components, has not declined correspondingly. As a result, the world's fastest computers, which perform only hundreds of times faster than a single microprocessor, cost hundreds or thousands of times as much. Consequently, linking many inexpensive processors makes sound economic sense and raises the possibility of scalable computing, i.e., using only as many processors as are needed to perform a given task.

Another important incentive to build multiprocessor systems is related to physical limits. Advances in the speed of large, primarily single-processor machines have slowed down: after averaging a tenfold increase every 7 years for more than 30 years, progress is now at the rate of a threefold increase every 10 years (Kuck 1986). These superprocessors are approaching limits dictated by conflicts between the speed of light and thermal cooling: they must be small for information to move rapidly among different circuits of the machine, yet they must be large to permit dissipation of the heat generated by the fast circuits. Multiprocessors are expected to push these limits higher because they tackle problems through the concurrent operations of many processors. If the recent advances in superconductivity result in effectively raising these limits for superprocessors, they will also raise them for multiprocessors, and thus the relative advantages of multiprocessors over single processors will remain the same.

The key technological problems related to the creation of useful multiprocessor systems include: (1) the discovery and design of architectures, i.e., ways of interconnecting the processors so that the resulting aggregates compute desirable applications rapidly and efficiently; (2) finding ways to program these large systems to perform their complex tasks; and (3) solving the problem of reliability, i.e., minimizing failure of performance within a system in which the probability of individual component failures may be high.
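The reliability problem (3) can be made concrete with a little arithmetic: even when each processor is individually dependable, the chance that some component of a large aggregate has failed grows rapidly with the number of processors. A minimal sketch, in which the per-component failure probability is an assumed illustrative figure rather than a measured one:

```python
# Probability that at least one of n independent components has failed,
# given a per-component failure probability p over some fixed interval.
def prob_any_failure(n, p):
    # P(no failure) = (1 - p)^n, so P(at least one failure) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p) ** n

# A single processor failing with probability 0.001 is very dependable;
# a thousand of them taken together are not.
p = 0.001
for n in (1, 100, 1000):
    print(n, round(prob_any_failure(n, p), 3))
```

With these illustrative figures, roughly 10 percent of 100-processor aggregates and 63 percent of 1,000-processor aggregates would contain a failed component over the interval, which is why a large multiprocessor must be designed to keep computing in the presence of failures.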
Current experimental work in the multiprocessor area includes exploration of different processor-communication architectures, design of new languages, and extension of popular older languages for multiprocessor programming. Theoretical work includes the exploration of the ultimate limitations of multiprocessors and the design of new algorithms suited to such systems.

The potential uses of multiprocessors are numerous and significant. The massive qualitative increase in computing power expected of multiprocessors promises to make these systems ideally suited to large problems of numerical and scientific computing that are characterized by inherent parallelism, e.g., weather forecasting, hydro- and aerodynamics, weapons research, and high-energy physics. Perhaps more surprisingly, conventional transaction-oriented computing tasks in banks, insurance companies, airlines, and other large organizations can also be broken down into independent subtasks (organized by account or flight number, for example), indicating that they can be managed by a multiprocessor system as well as, and potentially more cheaply than, by a conventional mainframe computer. In short, the fact that current programs are sequential is not a consequence of their natural structure in all cases, but rather of the fact that they had to be written sequentially to fit the sequential constraint of single-processor machines.

Most promising of all, multiprocessors are viewed by many computer scientists as a prerequisite to the achievement of artificial intelligence applications involving the use of machines for sensory functions, such as vision and speech understanding, and cognitive functions, such as learning, natural language understanding, and reasoning. This view is based on the large computational requirements of these problems and on the recognition that multiprocessor systems may imitate in some primitive way human neurological organization: human vision relies on the coordinated action of millions of retinal neurons, while higher-level human cognition makes use of more than a trillion cells in the cerebrum.

Traditional supercomputers, which rely on one or a handful of processors running at very high speeds, have already demonstrated their utility in several important applications. The proven capabilities of supercomputers will be greatly multiplied if the potential of multiprocessors is realized, leading to the tantalizing possibility of ultracomputers, which will harness together large numbers of superprocessors to yield mind-boggling computational power. Such power, in turn, could be used to expand scientific capabilities, for example, through computational observatories, computational microscopes, computational biochemical reactors, or computational wind tunnels.
In these applications, massive-scale simulations would be performed to address previously unsolved scientific problems and to chart unexplored intellectual territory.

DISTRIBUTED SYSTEMS

Distributed systems are networks of geographically separate computers: collections of predominantly autonomous machines controlled by individual users for the performance of individual tasks, but also able to communicate the results of their computations with one another through some common convention. If a multiprocessor system can be likened to several horses pulling a cart with a single destination as their goal, then a distributed system can be likened to a properly functioning society of individuals and organizations, pursuing their own work under their own planning and decision schemes, yet also engaging in intercommunication toward achieving common organizational or individual goals.

Multiprocessor systems link many computers primarily for reasons of performance, whereas distributed systems are a consequence of the fact that computers and the people who use them are scattered geographically. Networking, which makes it possible for these scattered machines to communicate with one another, opens up the possibility of using resources more efficiently; more important, it connects the users into a community, making it possible to share knowledge, improve current business, and transact it in new ways, for example, by purchase and sale of information and informational labor. Networking also raises new concerns about job displacement; the potential for the invasion of privacy; the dissemination and uncritical acceptance of unreliable, undesired, and damaging information; and the prospect of theft on a truly grand scale. These are reminders that technology, like any innovation, carries with it risks as well as benefits and that safeguards must be provided to protect against such incursions. Devising appropriate safeguards is itself an urgent topic of theoretical systems research.

Distributed systems have emerged naturally in our decentralized industrial society. Their emergence reflects the proliferation of computers, especially the spread of personal desktop computers, and the appetite of users for more and more information. It also reflects the demands of the marketplace, in which users operate sometimes as individuals, at other times as members of an organization, and at still other times for interorganizational purposes.
Distributed systems rely on a range of communication technologies and approaches, including telephone and local area networks, long-haul networks, satellite networks, and cellular, packet radio, and optical fiber networks, to connect computers and move information as necessary. Distributed systems are at the basis of modern office automation. They are evident in computer networks such as the ARPANET, which has facilitated the exchange of ideas and information within the nation's scientific community. Perhaps most significant, distributed systems have begun to transform national and international economic life, creating the beginnings of an information marketplace geared to the exchange of information. Electronic mail enables communities of users to annotate, encapsulate, and broadcast messages with little or no handling of paper and makes it possible to send people-to-people or program-to-program messages. Customers from every part of the country can tap into centralized resources, including bibliographic databases, electronic encyclopedias, or any of a growing number of specialized financial, legal, news-gathering, and other information services. Geographically separated individuals can pool resources for joint work: a manual for a new product can be assembled with input from the technical people on one coast and from the marketing staff on the other; a proposal can be circulated widely for comments and rebuttals from many contributors electronically. Homebound and disabled individuals can participate more actively in the economy, liberated by networking from the constraints of geographical isolation or physical handicap. The technology of distributed systems, through enhanced and nationwide network access, could have a major and unique impact on the future economy.

The limitations of today's distributed systems, whether they form a small local system in a building or a much larger corporate communication system, inhibit the growth of an information marketplace. They do so because the systems communicate at a rather low level, with the only commonly understood concepts being typed characters and symbols. They also are often heterogeneous, made up of machines from a variety of manufacturers, which employ a number of different hardware and software conventions. Except at the lowest level of communicating characters, there are no universally accepted standard conventions (protocols), although such shared communication regimes represent the first level of software needed for providing a higher level of commonality of concepts among such separate machines.
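What such a higher-level shared convention might look like can be sketched in miniature. In the sketch below, two programs agree in advance on a structured record format; the field names, and the use of a present-day textual notation (JSON) for concreteness, are illustrative assumptions rather than any actual standard:

```python
import json

# A tiny shared convention: sender and receiver agree in advance that
# an invoice is a record with exactly these (illustrative) fields.
INVOICE_FIELDS = {"invoice_number", "account", "amount_cents"}

def encode_invoice(invoice):
    # Sender's side: turn the structured record into plain characters.
    if set(invoice) != INVOICE_FIELDS:
        raise ValueError("record does not follow the invoice convention")
    return json.dumps(invoice)

def decode_invoice(text):
    # Receiver's side: recover the structure, rejecting any message
    # that does not follow the agreed convention.
    record = json.loads(text)
    if set(record) != INVOICE_FIELDS:
        raise ValueError("message does not follow the invoice convention")
    return record

wire = encode_invoice({"invoice_number": "A-17",
                       "account": "42-9",
                       "amount_cents": 129900})
print(decode_invoice(wire)["amount_cents"])  # receiver sees structure, not mere characters
```

A machine that lacks the convention sees only the string of characters in `wire`; only the programmed agreement on fields and their meanings lets the two sides share the concept of an invoice.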
As a result, if two computer programs are currently to understand each other in order, for example, to process an invoice, they must be specifically programmed to do so. Such understanding is not shared by any other machines unless they are similarly programmed. The greater the number of machines participating in a distributed system, the greater the agreement needed on common programming. Such agreement is not easy to arrive at, in part because system heterogeneity makes it difficult to implement. Consequently, one of the major problems ahead is the development of common and effective distributed system semantics, that is, languages and intelligent software systems that will help machines communicate with and understand one another at levels higher than
the communication of characters, and whose use is simple and fast enough to win acceptance among a large number of users.

Despite these obstacles, the number of interconnected systems is increasing because of their great utility. The ongoing proliferation of these computer networks provides a test bed for research and a powerful incentive for developing an understanding of their underlying principles. Beyond the need for distributed system semantics, other important aspects include the development of innovative and robust system architectures that can survive computer and communication failures without unacceptable losses of information, the development of network management systems to support reliable network service on an ever growing scale, and the creation and evaluation of algorithms specifically tailored to distributed systems. Finally, on the software side of these systems, the problems associated with programming large and complex computer systems must be better understood.

SOFTWARE AND PROGRAMMING

Computers are general-purpose tools that can be specialized to many different tasks. The collections of instructions that achieve this specialization are called programs or, collectively, software. It is software that allows a single computer to be used at various times (or even simultaneously by several users) for such diverse activities as inventory and payroll computations, word processing, solving differential equations, and computer-assisted instruction. Programs are in many ways similar to recipes, game rules, or mechanical assembly instructions. They must express rules of procedure that are sufficiently precise and unambiguous to be carried out exactly by the machine that is executing them, and they must allow for error conditions resulting from bad or unusual combinations of data.

Unlike other products, the essence of software is in its design, which is inherently an intellectual activity. This is so because producing many instances of a program involves straightforward duplication rather than extensive fabrication and assembly. Accordingly, the cost of producing software is dominated by the costs of designing it and making certain that it defines a correct procedure for performing all of its desired tasks, costs that are high because of the difficulties inherent in software design.

Creating software involves devising representations for information, called data structures, and procedures, called algorithms, to carry out the desired information processing. One of the major difficulties
associated with this task is that there are generally many different combinations of data structures and algorithms that can perform a desired task effectively. For example, in a machine vision system that inspects circular parts, a circle could be represented by its center and radius (a data structure of two numbers for every part) or by a few thousand points that approximate the circumference; the first data structure is more economical but does not by itself permit deviations from circularity to be represented, as does the second. In carrying out the artful process of data structure and procedure selection, the programmer must often pay equal attention to both large and small software parts, like an architect who must design a house, down to all its windows, doors, doorknobs, and even bricks. Furthermore, it is difficult for programmers to anticipate during design all the circumstances that might conceivably arise while a program is being executed. Indeed, another difficulty involves the illusion that software, because it is the stuff of design, is infinitely malleable and can therefore be easily changed for improvement or to meet new demands. Unfortunately, software designs are often so complex and variations among different parts so subtle that the implications of even a small change are hard to anticipate and control.

These factors make it fundamentally hard to specify and design software. They also make software difficult to test by anticipation of all the failure circumstances that may accidentally arise during actual operation. For these reasons, software development is often quite costly and time consuming. Beyond design, and after a program is put to use, modifications are often required to repair errors, add new capabilities, or adapt it to changes in other programs with which it interacts. This activity is called software maintenance, a misleading term since it involves continued system design and development rather than the traditional notion of fending off the ravages of wear and age. Such maintenance can amount to as much as 75 percent of life-cycle cost (Boehm 1981).

In the 1960s the cost of computing was dominated by the cost of hardware. As the use of computers became more sophisticated, the cost of hardware dropped, and the salaries of programmers increased, software costs came to dominate the cost of computing (OECD 1985). The problem has been aggravated by an apparent shortage of good software professionals and limited productivity growth. The overall annual growth of programming productivity is at best 5 percent (OECD 1985). The increase in software costs is taking place throughout the field, from small programs on personal computers to life-critical applications and supercomputing. The cost increase is fueling efforts to improve software engineering through development of tools to partially automate software development and techniques for reusing software modules as parts of larger pieces of software. A lot of work is needed to increase productivity in software development by a factor of ten, considered a critical milestone in industry.

Another serious software problem has to do with people's perceptions and expectations. Hardware and software sound similar, and people are frequently appalled that as the former gets cheaper by some 30 percent per year, the latter stubbornly resists productivity improvements. Such a comparison, however, reflects a misunderstanding of the nature of the software development process; a brief recounting of the development of MACSYMA, one of the earliest knowledge-based programs, suggests why. To develop a research prototype of that program, a mathematical assistant capable of symbolic integration, differentiation, and solution of equations, took 17 calendar years and some 100 person-years. In terms of the number of moving parts and their relationship to one another, the program's complexity was comparable to that of a jumbo jet, whose design and development cost more than 100 times as much. Most people can intuitively grasp the difficulties of constructing complex physical systems such as jumbo jets. But the complexities and design difficulties in the more abstract world of software are less obvious and less appreciated.

Despite all these difficulties, software development has seen significant progress. In the early days of programming, it was often a triumph to write a program that successfully computed the desired result. There was little widespread systematic understanding of program organization or of ways to reason about programs.
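Such reasoning about programs can be made concrete by counting the work a candidate algorithm performs before committing to it. A minimal sketch, assuming the simple task of searching a sorted table: scanning takes work proportional to the table's length, while repeatedly halving the search range takes work proportional to its logarithm:

```python
def linear_search_steps(table, key):
    # Scan the table front to back, counting comparisons made.
    steps = 0
    for value in table:
        steps += 1
        if value == key:
            break
    return steps

def binary_search_steps(table, key):
    # Repeatedly halve the search range of a sorted table,
    # again counting comparisons made.
    steps, lo, hi = 0, 0, len(table) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if table[mid] == key:
            break
        elif table[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

table = list(range(100_000))
print(linear_search_steps(table, 99_999))   # 100000 comparisons
print(binary_search_steps(table, 99_999))   # 17 comparisons
```

Analysis of this kind lets the two candidates be compared, and a theoretical limit recognized, before a line of the surrounding system is written.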
Algorithms and data structures were originally created in an ad hoc fashion, but regular use and research led to an increased fundamental understanding of these entities for certain problem domains: we can now analyze and compare the performance of several proposed algorithms and data structures and, for several kinds of problems, we often know in advance theoretical limits on performance (see Chapter 8). Sound theories have also contributed to the construction of certain classes of software systems: for example, in the early 1960s the construction of a compiler (a program that translates programs written in a higher-level language to machine-level programs) was a significant achievement for a team of programmers; such systems are now constructed routinely (with largely automated means by a much smaller group) for traditional single-processor machines.

In today's computing environment, escalating demands for overall computer performance become escalating demands for software capability and software size. Development of large systems requires the coordination of many people, the maintenance and control of many versions of the software, and the testing and remanufacture of new versions after the system has been changed. The problems associated with these activities became a focus of computer science research in the mid-1970s, through techniques of modular decomposition (Parnas 1972) and organization of large teams of programmers (Baker 1972). At that time, a distinction was made between programming-in-the-small and programming-in-the-large to call attention to the difference between the problems encountered by a few people writing simple programs and the problems encountered by large groups of people constructing and managing sizable assemblies of modules.

Beyond these more or less pure software systems that deal only with information, there are other, even more complex, highly distributed systems that often interact with physical processes, such as the U.S. telecommunications, air traffic control, transportation, process control, energy, air defense, strategic offense, and command-control-communication and intelligence systems. These supersystems, as they have been called (Zraket 1981), grow over a period of decades from initially limited objectives to evolutionarily mature end states that are generally unpredictable at the start. The need to create such supersystems and other software with sufficient reliability for effective use presents a set of software design and development problems that are not addressed by the techniques of either programming-in-the-small or programming-in-the-large.
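The idea of modular decomposition can be sketched in a few lines: each module exposes a small interface and hides its internal data structure, so that the structure can later be changed without disturbing the module's clients. The names below are illustrative, not drawn from any particular system:

```python
class SymbolTable:
    """A module in the information-hiding sense: clients see only
    define and lookup, never the representation behind them."""

    def __init__(self):
        # Hidden design decision: a hash table today. It could be
        # replaced tomorrow by a sorted array or a search tree
        # without changing a single line of client code.
        self._entries = {}

    def define(self, name, value):
        self._entries[name] = value

    def lookup(self, name):
        # Returns None when the name has not been defined.
        return self._entries.get(name)

# Client code depends only on the interface.
table = SymbolTable()
table.define("x", 3)
print(table.lookup("x"))  # 3
```

In a large team, each group owns a handful of such modules; the narrow interfaces are precisely what make sizable assemblies of modules manageable.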
Accordingly, two new tasks for software research are: (1) to develop better techniques for designing software, especially software to be embedded in very complex, real-time application systems, and (2) to use emerging artificial intelligence techniques for the development of tools that will help software developers manage the complexity of such software. Other directions and opportunities for future software progress include: (3) effective ways of improving the productivity of the software development process, e.g., through automation of software design, reuse of existing software components, or new types of software architectures; (4) ways to reason about the correctness of software, including the task specification process; (5) infrastructural tools and resources, such as electronic software distribution systems; and (6) addressing the new multiprocessor software problems that will inevitably arise from the new multiprocessor architectures discussed above.
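The multiprocessor software problem of item (6) can be sketched in miniature: split one large task into independent subtasks, run them on as many workers as are available, and combine the partial results. In the sketch below a thread pool stands in for the processors of a real multiprocessor; the decomposition, not the particular concurrency mechanism, is the point:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # One worker's independent subtask: sum the squares of its chunk.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers):
    # Scalable computing in miniature: use only as many workers as
    # the task warrants, each on its own slice of the data.
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

data = list(range(10_000))
# The partitioned computation must agree with the sequential one.
assert parallel_sum_of_squares(data, 8) == sum(x * x for x in data)
print(parallel_sum_of_squares(data, 8))
```

Making such decompositions correct, efficient, and tolerant of worker failures for irregular problems, rather than for a conveniently separable sum, is the substance of the multiprocessor programming problem described above.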