Read "Roles of Industry and the University in Computer Research and Development" at NAP.edu

« Previous: OVERVIEW

Page 19 Cite

Suggested Citation:"A REVIEW OF NINE CHARACTERISTIC SUBAREAS OF COMPUTER SCIENCE." National Research Council. 1982. Roles of Industry and the University in Computer Research and Development. Washington, DC: The National Academies Press. doi: 10.17226/10453.

Page 20 Cite

Page 21 Cite

Page 22 Cite

Page 23 Cite

Page 24 Cite

Page 25 Cite

Page 26 Cite

Page 27 Cite

Page 28 Cite

Page 29 Cite

Page 30 Cite

Page 31 Cite

Page 32 Cite

Page 33 Cite

Page 34 Cite

Page 35 Cite

Page 36 Cite

Page 37 Cite

Page 38 Cite

Page 39 Cite

Page 40 Cite

Page 41 Cite

Page 42 Cite

Page 43 Cite

Page 44 Cite

Page 45 Cite

Page 46 Cite

Page 47 Cite

Page 48 Cite

Page 49 Cite

Page 50 Cite

Page 51 Cite

Page 52 Cite

Page 53 Cite

Page 54 Cite

Page 55 Cite

Page 56 Cite

Page 57 Cite

Page 58 Cite

Page 59 Cite

Page 60 Cite

Page 61 Cite

Page 62 Cite

Page 63 Cite

Page 64 Cite

Page 65 Cite

Page 66 Cite

Page 67 Cite

Page 68 Cite

Page 69 Cite

Page 70 Cite

Page 71 Cite

Page 72 Cite

Page 73 Cite

Page 74 Cite

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Pa I I: A VIEW OF NINE C =~ Sag ~ Came DIVE

INTRODUCTION The papers in Part II sketch some historical relationships between basic research and industrial practice in computing. m e areas surveyed have been chosen as characteristic; they are not comprehensive. For a more thorough (albeit slightly outdated) technical survey, the reader may wish to consult the Computer Science and Engineering Research Study (COSERS) report, What Can Be Automated? (edited by Bruce W. Arden, MIT Press, 1980~. Paper 2 of Part II, about integrated circuits in general and microprocessors in particular, examines currents in these most characteristic and exploding fields of hardware technology. Basic research in computer science has played a minor role here. Without hardware there would be no computing, but without applications there would be no hardware. m is consideration motivates Paper 7, which considers large-scale scientific and engineering computation. m is field, whose needs inspired the first computers, is an area of computing where research and applications go hand-in-hand. Systems software, the subject of Paper 1, has become to computing what government is to civilization: an indispensable component that pervades and shapes other productive activity. Systems software ideas have come from research institutions as well as from industry, often as much in response to the possibilities of technology as to the needs of users. A degree of theoretical maturity is beginning to be attained in some areas, as illustrated by Paper 4, which shows how one kind of systems software has passed from the hands of practical inventors, through the theoreticians, into codified academic training. It took several decades for computer science to gain legitimacy as a discipline. By now, it is well enough established to have developed a pure as well as an applied wing. Paper 3 describes the way in which this theory influences the computer science curricula, and thus influences the course of industrial development into which graduates are drawn. Another university-dominated field, pure in the sense that it has traditionally been unconcerned with immediate practicality, is artificial intelligence (Paper 5~. Despite years of industrial inattention, and considerable early overselling, this lively branch of academic research has contributed important techniques to modern computing practice. 21

22 The most visible industrial computing falls under the rubrics of "scientific. computing, such as airframe design and refinery scheduling, and "data processing,. such as accounting and reservations systems. Organized computer science has always been concerned with the former activity (Paper 7) and with general systems issues (Paper 9) but rarely with data processing per se (Paper 8~. The application of computers to very large scale integrated {VLSI) circuit chip design (Paper 2) is a research area of growing importance. Moreover, Japanese success has heightened awareness of another area of computing of potential importance to industry, namely robotics (Paper 61.

Paper 1 SYSTEMS SOFTWARE: THE PROBLEMS KEEP CHANGING I NTRODUCTION Much of the history of software systems research reflects the efforts of academic codifiers to find stable, unifying generalization in a field continually revolutionized by changes in hardware technology. The following somewhat oversimplified account of major states in the development of hardware will emphasize the constantly shifting priorities with which systems researchers have had to contend. 1. Era of Poor Reliability. The earliest computers were of very limited capability by today's standards, and operated correctly only for short intervals. The programming approaches used in this period had to reflect this unreliability, which among other things discouraged all attempts to construct complex operating systems. The computer user's whole emphasis was to squeeze a maximum of useful numerical computation out of a limited system that might fail at any moment. The "systems" that existed were simply subroutine libraries directly supporting the mix of problems to which a given machine was dedicated. 2. Era of Very High Hardware Costs. The first generation of commercially produced computers was substantially more reliable than the early experimental machines described above. m ese second- generation products could be expected to run reliably for periods of a few hours--long enough to run several jobs. However, substantial computing equipment was still very expensive and thus available only at a relatively few favored centers, whose main aim was to ensure efficient use of their high-priced machinery. External data storage equipment, consisting largely of magnetic tape drives, was still rudimentary. m is hardware environment encouraged the development of multiprogrammed batch operating systems (like the pioneering University of Manchester ATLAS system, which also introduced the notion of circular memory) that aimed to facilitate the efficient flow of work through an expensive hardware configuration. Although the ATLAS system was a university operating system product, such products were rare in this early period. Indeed, the limited availability and high cost of equipment, and also the batch nature of the systems that could most readily be constructed, inhibited university experimentation with operating systems at all but a few favored centers. Accordingly, the 23

24 batch systems characteristic of this period were generally developed by large industrial users and by computer manufacturers. 3. Era of Large Economies of Scale. Hardware costs had been declining from the first days of computers, but initially these economies could more readily be captured by large than by small computers. Grosch's Law, i.e., the statement that the performance of a machine is proportional to the square of its cost, emphasizes this phenomenon. Thus at first, the time-division multiplexing, or time sharing, of a single large computer was the most economical way to use hardware. Since falling hardware costs had already made programmer productivity an important issue, this favored the development of large, complex interactive time-sharing systems. Universities played an important role in this development, the MIT work on CTSS being particularly significant. The Manchester ATLAS work and the somewhat . . . later MIT work on MULTICS also explored the memory protection and segmentation schemes needed to support highly dynamic shared use of large machines. Semantic structures for parallel processing, including such formal notions as "process, Semaphore, n and ~monitor, n emerged from this period of academic involvement in interactive systems building. Nevertheless, the commercially successful, widely used time-sharing systems were developed by industry, largely by computer manufacturers, experimental university systems being generally not stable enough, well-documented enough, or sufficiently well maintained to support widespread use. 4. The Coming of Large Random Access Storage Devices. Processing of large volumes of commercial data had been central to computer sales from the time of the first commercially available machines. Fairly elaborate libraries and tape-sorting procedures were built up to facilitate the processing of these data on the tape-based systems originally available. However, the development of large-capacity random access devices (discs) changed the context of systems design radically. Data bases of a hitherto infeasible complexity became possible, and large interactive systems became practical for the first time. Systems of this type spread rapidly to universities as interactive time-sharing systems, and commercially as online reservations systems and other interactive query systems. Data base systems soon became the fastest growing commercial applications of computing. However, for almost the first decade of their development, data base systems failed to attract much interest. Although data base systems have subsequently become a recognized area of research, it is fair to say that their early development was spurred entirely by applications pressures, and that the initial research contribution to these systems, either by universities or by applications pressures, was negligible. Developers in large commercial firms, manufacturers, and software houses had brought data base systems to a fairly advanced practical state before most university research had become particularly familiar with the meaning of the term "data base." Modern research in data bases includes adoption of notions drawn from set theory, from language theory, from multiprocessing control, and from distributed systems and artificial intelligence, and it is beginning to have a significant impact on industrial practice.

25 5. Era of Multiprocessors and Minicomputers. Continuing declines in the cost of hardware, especially the fact that the introduction of monolithic memory chips caused the cost of memory to drop substantially, began to make small computers relatively more advantageous than they had been before, thus undermining Grosch's Law. This technological change tended to favor the use of inexpensive minicomputers for small applications, and also the combination of multiple small computers into higher performance systems. A new type of systems software, namely operating systems of simplified structure able to support single users or small groups of users, became appropriate. The UNIX system developed at Bell Laboratories is the best-known system of this type; however, universities also experimented with minicomputer software, producing a few successful systems, of which the UCS Pascal system is the most widely used. Multiprocessor configurations are favorable for applications, such as air traffic control and airline reservation systems, that must be highly reliable and available without interruption. Manufacturers and software groups concerned with such applications have studied the problems of ultrareliable computing and the way in which multiprocessor configurations could be used to solve them. Since the university computing environment does not demand high reliability and most university researchers are unfamiliar with applications demanding such reliability, universities have thus far played only a small role in these investigations. 6. Era of Distributed Computing. The economic factors favoring the use of smaller computers have continued to gain in strength. Currently, it is possible to produce a reasonably powerful micro- computer, with a megabyte or more of monolithic memory, for just a few thousand dollars. Systems of this sort provide enough computing power for many small-scale applications and are ideal working tools for the individual university or industrial researcher. Such researchers are generally interested in relatively light experimental computing and have reason to prefer microprocessor-based interactive computing environments decoupled from the load-conditioned shifts in response time to which large time-shared systems are subject. However, microcomputer-based work station systems are most useful if they can also be used to access larger systems, on which major data bases can be stored, and to which heavier computations can occasionally be relegated. Since communication costs have been declining steadily (though by no means as rapidly as computing costs), it has become feasible to provide this kind of computing environment by linking moderately powerful intelligent terminals or personal computers together in a communications network, which also includes a few big data base machines or scientific ~number-crunchers. n Relatively elaborate communication software and work station software systems emphasizing very high reliability file handling must be combined to support heterogeneous distributed systems of the sort we have described. The communications software needed for this was pioneered by the ARPANET effort, which was largely the work of nonmanufacturer software research laboratories with some participation by university groups. This effort demonstrated many important packet-switching

26 techniques and concepts that were subsequently adopted in commercial systems. m e University of Hawaii's ALOHANET project, which became the basis of the Xerox Corporation's Ethernet product, represents another case in which university concepts in communication software were successfully transferred to industry. THE ROLE OF RESEARCH The history that we have recounted emphasizes the fact that the set of problems faced by software systems designers has changed steadily and radically over the whole period examined, owing to constant changes in the underlying technology. Given these changes, the task of the software designer has been to develop systems that could exploit new hardware developments as these appeared; he has had to deal with daily realities, rather than with abstract principles. To make research meaningful in a subject area whose ground rules are constantly changing is difficult. It is all too easy for an intricate piece of systems analysis to "miss the boat" because technological development changes the value of some assumed parameter drastically. Nevertheless, software research has a number of real accomplishments to its credit: 1. Such fundamental systems approaches as multiprogramming, time sharing, and virtual memory were all initially demonstrated at universities. Industrial software research groups originated the important notions of virtual machines, relational data bases, and packet-switching networks. 2. University and industrial research has given us at least some degree of understanding of the factors governing system performance. For example, the notion of "working set" forms the basis of our present understanding of the behavior of demand-paging systems. Performance analysis is an area in which it has even proved fruitful to apply formal mathematical techniques such as probability and queuing theory. However, our understanding of system performance remains fragmentary, in part because technological changes have steadily changed the aspects of performance to which theoretical attention could most appropriately and usefully be directed. 3. Research has codified and clarified the terminology useful for thinking about the structure of complex systems. Examples of this are the notions semaphore, monitor, process, transaction, and deadlock. Research has also begun to contribute important structural protocols around which complex systems can be organized. Note, however, that many of the system-building techniques that are still most widely used even for constructing complex systems, e.g., the resource preallocation and ordering techniques used to prevent deadlock in operating systems, and the detection of deadlock by location of cycles in a "wait-for graph, stem from the work of systems developers rather than from formally organized research. To be successful, software research needs to walk a relatively narrow path. On the one hand, it must aim to precipitate some relatively unitary, technology-independent concept out

27 of the confusing welter of practical applications. On the other hand, it must not abstract so much or go so far beyond what is immediately practical that contact with current practice is effectively lost. Software research efforts that deviate in either direction from this path may fail to contribute substantially to developing practice. Demonstration systems lacking unitary focus have often failed to leave anything behind. m is remark applies especially to systems developed at universities, which lack the sales organization that could bring a state-of-the-art system to wide public notice. Other problems that university software systems face is that their definitive completion and release tends to be long delayed and that ordinarily the developers of these systems do not dispose of large enough resources for the attractiveness of their systems to be enhanced by systematic and substantial additions of functionality. With few exceptions, such systems have failed because they have been competing with much larger industrial developments on unfavorable ground. On the other hand, it is not hard to cite software research efforts that have failed by projecting overabstract notions that reflected reality in a manner that was too distant or one-sided. For example, research prototypes systems based upon a concept of hierarchical structuring and data and control abstraction have been constructed, and held up as ideas. Practitioners have remained unconvinced, since many of these attempts have led to unacceptable degradations of performance. Modular decomposition is an idea on which software researchers and practical systems developers have generally tended to agree, but, for efficiency reasons, practical use of this idea has never been as extensive or as rigorous as researchers would advocate. Formal program verification remains a research ideal, but its impact on practice has thus far been close to nil. University work on computer architectures during the last decade has also had limited impact on industrial practice. Various novel architectural suggestions, including architectures directly supporting high-level languages, data flow machines, and capability machines, have been proposed, but commercial development has not followed. While future development of such machines remains a possibility, it should be noted that successful earlier ideas such as time sharing and packet switching were quickly followed up commercially. WHAT THE UNIVERSITY ROLE SHOULD BE University computer science departments involved in software systems research have attempted to play several roles. They have tried to 1. demonstrate new and fundamental systems software concepts; 2. contribute by codification and analysis to the conceptual undertaking of existing systems; 3. invent improved algorithms and protocols around which improved systems could be structured; and 4. demonstrate systems concepts by building and disseminating new systems.

28 The recent attempts of university researchers to demonstrate fundamentally new systems concepts (role 1) have been disappointing, especially when compared to earlier periods in which the university impact was large. The increased number of issues that a system design must address may be partly accountable. In data base systems, computer networking, and small operating systems, substantial industrial efforts have generally been required to make new ideas and approaches entirely credible. Per Brinch Hansen's work on Concurrent PASCAL illustrates this pessimistic remark. In many ways this work, which built on such fundamental concepts as monitor and process and incorporated them into an interesting parallel processing language, was excellent. However, the small and relatively inefficient system built to illustrate these ideas ignored too many aspects of the overall operating system problem to be convincing. By contrast, UNIX, although less clearly structured, reached a level of efficiency and functionality that allowed it to achieve wide acceptance. The invention of improved systems algorithms and protocols is an area in which universities can be expected to lead. Some quite interesting inventions of this sort, having applications to compiler optimization and data communication schemes, have already been published. Although the practical impact of this work is still limited, it can be expected to grow in the future, especially as such new fields as robotics, which promises to make particularly extensive use of complex algorithms, develop in importance. One can safely predict that conceptualization and codification (role 2) will continue to be a role in which the university contribution will be predominant. However, this role is secondary in the sense that it records, preserves, and disseminates the results of prior research, rather than produces anything fundamentally new. University researchers are not likely to be content with so subsidiary a role, especially since, to play it well, they would have to make far more strenuous efforts to stay in touch with developing field practice than are now common. Roles 2 and 3 are particularly congenial for universities. The efforts they require can be undertaken by individual researchers or small groups, producing results that are often available within a short time and often discreet enough to be immediately suitable for publication. By contrast, university researchers attempting to play role 4 often find themselves on unfavorable ground. Nevertheless, attempts continue, in part because efforts to play role 1 often inspire them. This is only to be expected. New systems ideas and methods require credible demonstration; credible demonstration requires the construction of a system. However, not just any system will do if the work is to be taken seriously. While production of a commercial system should never be taken as the prime criterion of technical success, an effort to demonstrate the viability of a new software concept must at least address itself to the major problems addressed by commercial systems. This is what MULTICS, System R. and Ethernet development did, but what most university systems, built with limited resources and perhaps faced with obstacles inherent in institutional sociology, do not do.

29 These difficulties would seem to confine university systems research efforts to roles 1, 2, and 3. But here there arises the danger of increasing the already undesirable separation of university researchers from essential real-world concerns. Systems research divorced from systems building can easily come to concern itself with unreal problems. Moreover, even if the problems studied are well chosen, a university researcher working in a relatively ~costless" environment may lose sight of the cost factors that must nonntrain realistic solution and thus may impair the interest of his work as seen by industry. Some encouragement for university systems building efforts does come from the continuing decline in computation data storage and communication costs. As already noted, this strengthens the tendency to construct small specialized text processing, data base, process control, and personal computer systems. These systems will generally be simpler than the centralized time-sharing, multiprogramming, multiprocessing systems associated with large computers today. Their relative simplicity will make it somewhat easier for university-sized projects to produce interesting systems. Here, however, the question of contact with real field requirements becomes all-important. As systems grow to be more specialized and application-dependent, high- quality human factors design comes to be a central issue. The sales arms of commercial organizations alert them immediately to field requirements of this sort. University researchers tend to disdain predilections of applications-oriented end users, which may blind them to the issues that small-systems designers will need to face. If this tendency can be overcome, the availability of powerful small machines may make it possible for university software developers to affect developments to affect practice more directly; otherwise commercial software developers and practitioners with backgrounds in a variety of application areas will call the tune. At any rate, it would certainly be unfortunate for university researchers to ignore the views of this latter group.

Paper 2 I NTEGRATED CIK:UITS: A DEVELOPMENTAL AREA DOMINATED BY INDUSTRY The fabrication of electronic circuits is the best-known example of innovation and industrial effectiveness today. Our ability to put tens or even hundreds of thousands of computing and logic elements onto small chips that can be produced in large quantities by highly specialized but moderately sized laboratory-factories has opened many new markets for electronics. m is very successful line of work furnishes a prime example of a heavily development-oriented area of technological innovation that has been dominated by industry. Industrial organizations have the capital necessary for activity in this area, have had an immediate interest in its outcome, and were the ~ , I, however, the problems faced in perfecting complex chip designs have encouraged universities to build up VLSI design automation activities. first to confront its problems. More recently, THE PROCESS TECHNOLOGY A brief review of the process by which integrated circuits are produced will help us understand why this vital area of the computer industry has been so very strongly development-oriented. Integrated circuits are manufactured by introducing microscopic quantities of carefully chosen impurities into a pure crystal of a semiconductor, typically silicon. m e impurities introduced, and their macroscopic geometric pattern, determine the electrical properties and the switching behavior of the circuits that result. m e manufacturing process is largely a photolithographic one, in which a silicon wafer 4 to 5 inches in diameter is coated with a photosensitive mask. Exposed areas are then etched, and carefully controlled quantities of impurities are introduced into them; unexposed areas are unperturbed. Through a series of steps involving exposure to several successive masks, followed by processing related to each mask, numerous microscopic transistors are formed in small rectangular regions on the wafers. Each quarter-inch-square region comes to contain 10,000 to 300,000 transistors. On completion of the wafer processing, the wafer is cut along the region boundaries to form separate "chips" about 1/2 square inch in size. One wafer can yield about 100 to 200 such chips. Since a fixed number of masking steps are involved, roughly 5 to 10 depending on the precise technology 30

31 used, the cost of a chip is more or less independent of the number of transistors it contains. Hence one aims to put the greatest possible number of transistors on a chip. The success of this effort, which has doubled circuit densities each year for over a decade, has resulted in tremendous increases in processing capability per chip, while the cost per chip in volume production has been more or less constant. Think smalls has been the guiding slogan of the astonishing series of technological triumphs that have characterized the microelectronics industry. Major advances in photolithography have been required to sustain this development. Channel widths have dropped to 1 or 2 microns, from 10 to 12 microns a decade ago. New improvements, which it is hoped will lead to a still more refined submicron technology, promise further improvements. However, since the physics of visible and even of ultraviolet light constrains minimum feature sizes to something in the range of 0.1 to 0.5 microns, the industry is approaching the limits of optical techniques. In attempts to attain higher density, some - manufacturers are experimenting with elector beam machining in place of exposure to light through a mask. The line widths that can be attained in this way are much finer than those attainable by conventional means, but processing is a good deal more costly, since the electron beam exposes just one point on a wafer at a time, and thus must draw a pattern serially. Currently fabricated chips generally use a MOS (metal oxide semiconductor) technology, though limited but significant use is also made of other integrated circuit technologies. MOS has moved into the forefront because it is generally denser and simpler to fabricate than other processes, characteristics that are crucial to the success of VLSI. The speed disadvantage that MOS suffers with respect to other processes is not significant for low-end computing devices, and MOS speeds are improving relative to those of other technologies. The competing, faster bipolar microprocessors tend to be more expensive and to require more power and are used primarily in more demanding applica- tions such as real-time control or in large scientific mainframes, where speed is of the essence. The dominance of MOS is noteworthy because MOS developed several years later than some of the technologies that compete with it. Its success contradicts the common observation that in high technology the later developments often fail to dominate, or even to survive, because investment in earlier developments faces them with an overwhelming competitive handicap. DEVELOPMENTS LEADING TO VLSI Both the bipolar and the MOS technologies derive from the original invention of the transistor by Shockley, Bardeen, and Brattain of Bell Laboratories, which carried forward lines of development originating in projects (ca. 1952) of the inventors Lillienfield and Hell, both expatriates from Germany. Shockley's 1939 notes show a grasp of the principles of MOS field- effect transistors (FETS) , some 19 years before Teszner demonstrated a

32 working FET at CFTH in France. However, although Shockley and Brattain were collaborating at Bell Laboratories as early as 1939, the war interrupted their work on transistors, and they did not return to intensive work on the transistor until 1945, when they were joined by Bardeen, Pearson, and others in a solid-state laboratory at Bell Laboratories. During the war, notable research on semiconductors had also been performed at Purdue by Lark-Horvitz, but this research terminated with the end of the war before yielding a solid-state amplifier. The invention of the transistor conventionally dates from December 23, 1947, when a point-contact transistor amplifier was successfully constructed at Bell Laboratories. Bipolar transistor technology developed rapidly in the following decade. The bipolar transistor was joined by the fledgling field-effect transistor in 1958. Two major breakthroughs occurred as the 1950s ended and the 1960s began. In 1958-1959, Noyce invented the planar transistor, which could be fabricated by a simple sequence of diffusion and masking steps. In 1958 and 1959, Kilby at Texas Instruments demonstrated that a complete integrated device could be fabricated from planar transistors, resistors, and capacitors formed together out of semiconductor materials on a common substrate. These two inventions opened the path to what is now the integrated circuit. Today a series of about 10 to 20 steps involving masking, diffusion, and implantation suffice to produce about 200 chips on a single wafer, each chip having upwards of 10,000 transistors. Formerly, the fabrication of a complete device of this complexity would have meant that each transistor had to be individually and tediously connected by wires running from point to point, with only some limited application of automation speeding up to the wiring process. Between 1959 and the construction of the first microprocessor, the Intel 4004, many more developments in integrated circuit technology occurred, but none of these were so fundamental as the original inventions of the transistor, the planar transistor, and the integrated circuit, all three of which are achievements of industry. THE DESIGN AUTOMATION 80TTLENECK In the 1960s it became apparent that rapidly improving integrated circuit technology would soon permit several hundred to several thousand circuits to be fabricated on a single chip. As noted, the interconnections between chips that were previously made on printed circuit boards could now be made on the chip itself. This promised great reductions in cost, but gave rise to a serious debugging problem. A design error on a printed circuit board can be repaired by patching in a discrete wire, after which debugging can proceed immediately. By contrast, if an error occurs in an integrated circuit chip design, the whole chip has to be refabricated after the design error is removed, and this can take from days to months (depending on the availability of a processing line, priority of the various efforts sharing it, any special requirements that a given chip may impose, etch. Thus successful integrated circuit chip fabrication calls for a

33 first-class design automation system, using which, a designer can debug his circuits before he commits the design to silicon fabrication. If many different chips are needed, a design automation system efficient enough to handle a large number of such designs is required. The semiconductor industry has invested large sums of money in attempts to develop such design automation systems, but these efforts have so far met with only modest success. Some microelectronics producers have reacted to this difficulty by surrendering some of the density and performance otherwise possible, specifically by restricting the chips to a few allowed diffusion patterns, and allowing designers to specify the subsequent metallization layers only. This so-called "gate array" approach is still the only one that can be used to turn out large numbers of functionally distinct chips without manual intervention. Another approach is to sidestep the design automation bottleneck by producing a few carefully designed microprocessor chips only; customization is then attained by changing the program that these microprocessors execute. THE FIRST MICROPROCESSOR The first microprocessor was an evolutionary device rather than one involving any totally new concept. The first such processor is generally considered to be the Intel 4004. Marcian E. Hoff is credited with leading the development effort that led to this very significant device; Hoff and others at Tntel hold patents on it. In the years immediately preceding the development of the 4004, several companies had moved toward large-scale integration of control functions in computer-like devices. Typical LSI products were single- chip devices for very simple calculators, and a few multichip sets for somewhat more complex calculators. Of course, the history of these calculator chips showed the usual technical difficulties and abortive starts. For example, an early effort in 1964 and 1965 by Victor Comptometer Corporation developed an LSI type of calculator that was never put into production because of problems encountered in building satisfactory chips. Between 1965 and 1970, several companies did succeed in producing workable LSI devices that found their way into handheld and desktop calculators. me Sharp Company made important advances in devices design during this period and took an early lead in electronic calculators. The commerical success of these calculator chips set the stage for the development of true microprocessors. When Hoff initiated work on the Tntel 4004 in 1969, intelligent controller chip designs generally involved an embedded, unchangeable program, usually fabricated using an on-chip ROM (read-only memory) or a logic array. The Busicom Company had just contracted with Intel to produce an intelligent controller for a calculator family that they intended to market. Hoff's response was to design a microprocessor chip set having a very broad capability rather than the narrow set of capabilities that had typically been associated with calculator chips. The instruction set chosen for Hoff's microprocessor may have been influenced by many sources, including several outside of Tntel, but the

34 design of this chip is clearly original to Intel. Hoff's team created working breadboards of the 4004 chip set in 1969 and 1970 and were able to convince Busicom of the viability of this new product approach before 4004 chips actually became available in quantity. By mid-1971, Intel was marketing the 4004 chip set as a product. The motivation for putting together a microprocessor instead of following the more restricted approach conventional in the late 1960s is interesting in itself. m e market for microprocessors did not exist in 1970, and its potential could at best be estimated crudely. A good deal of risk was involved in committing major resources to chip development for a microprocessor having limited computational power. While the new chip was clearly going to be useful for building hand calculators, the market there was highly competitive and volatile. Beyond the calculator market, the uses of the 4004 were highly speculative. Designers of electronic control equipment, which represented another potential market for the 4004 chip, were used to thinking in terms of discrete devices, including mechanical or analog devices, for control purposes. Most designers were not trained to design with microprocessors and generally did not understand how to use software to accomplish functions that they were accustomed to realize in other ways. Nor was it clear at first that the 4004 would be cost effective in control applications because its initial price was rather high. Nevertheless, Hoff believed that there was a very large potential for a primitive device that was itself a computer, provided the cost was low enough and the computational power was great enough. He succeeded in getting the 4004 into production, and the resulting demand showed that the development was well worth the risk. Without detracting from his credit as a pioneer, it is also fair to remark that if Hoff had not developed the 4004, someone else would probably have constructed some other microprocessor within a few years: the capability of doing so was becoming widespread and the idea behind the 4004 grew naturally out of prior technological advances. Subsequent to its introduction of the 4004, Intel enhanced this product greatly to create a line of microprocessors constituting a major product family within the company. By 1975, Intel's product line was dominated by the 8080, a third-generation 8-bit micro that had several times the computational power of the 4004. Although initially the price of the 8080 was upwards of $2000, by 1980 the price had dropped to less than $10. Total sales of this device, which is currently supplied by several producers, now number in the tens of millions. In retrospect, we can remark that the microprocessor development required mainly a strong development team and an integrated circuit facility. m ough substantial, its research aspect represents only a small fraction of the total effort that was required to make the product a practical success. A similar study carried out in a university or in an industrial research laboratory might well have stopped at the paper design or at the breadboard level. The key to the success of the 4004 project is that a product actually appeared and that through its availability and low cost this product created a market that had not existed.

35 THE NEW VLSI BOTTLENECK AND THE AWAKENING OF UNIVERSITY INTERESTS Up to the middle of the 1970s, most chips were designed with little computer assistance. {Memory chips, because of their highly repetitive patterns, are an exception to this statement.) The typical industrial electronic chip designer drafted a design by hand, checked his design by hand, and then digitized the design manually for computer input and generation of photographic masks. Many of the most successful products of the mid-1970s were produced in this relatively primitive fashion. However, since chip complexity has increased tenfold in the last few years, hand design has become infeasible, and extensive computer-aided design and layout aids have become necessary. University contributions to this increasingly important area supplement the large industrial effort that has been devoted to these tools. Design automation aids in common use today allow full-chip designs to be constructed as compositions of prestored subdesigns. mese aids are able to elaborate preliminary design sketches into fully detailed drawings, to route interconnections between specified points, to replicate predesigned cells in specified areas of a chip, and to produce logic arrays automatically from high-level specifications of a control function. Some of the design tools currently being used in industry reflect university contributions. In particular, the computerized graphics systems used to replace hand sketching techniques were stimulated in significant ways by university research, in particular by experimental graphics systems developed at Utah and MIT. m e pioneers in the application of this graphic software to VLSI design systems were either academics or industrial researchers who had access to the work done in universities. (However, as the economic importance of these tools has grown, commercial systems, into which large investments have gone, have taken the lead, and university work has fallen behind.) Until quite recently, virtually no universities had state-of-the- art, on-campus facilities for manufacturing integrated circuits. This meant that design engineers emerging from universities generally had little practice in chip design. Stanford and Berkeley were exceptions, but even at these institutions chip fabrication specialists were not computer designers, and education in computer hardware design was organized on a conventional basis, academic designers developing systems from commercially available chips, but not designing new chips themselves. This situation is now changing. An important and active VLSI design center developed by Carver Mead at CalTech has shown the feasibility of university involvement in VLSI education. Mead got his start in VLSI research by working closely with semiconductor companies such as Intel and observing that their essentially manual methods were about to run out of steam. He therefore began to study more systematic ways of producing chips, and also began to sensitize other university researchers to the importance of this problem. In connection with his other efforts, he co-authored a significant text presenting VLSI techniques to university computer designers, thereby furthering university access to this area. Mead's demonstration of feasibility, his useful book, and a greatly increased flow of government and industrial funding into university VLSI laboratories have all helped to increase university interest in this

36 field of research dramatically. Several universities are now in the process of creating semiconductor fabrication facilities, and still more are teaching chip design but having student projects fabricated off-campus by cooperating industrial establishments. Student papers on VLSI design, once totally unknown, are becoming a regular part of professional meetings. However, significant adverse factors still constrain the ability of universities to make significant contributions (other than educational contributions) to this area. Universities still do not have easy access to the most advanced technology; disciplined functioning of a chip fabrication line requires large numbers of technicians, as well as the commitment of substantial resources to the development and maintenance of software. (However, some of this access may be available through cooperation with design automation groups in industry, including the major industrial research laboratories.) Universities have traditionally been strong in theoretical areas and in areas spanning theory and application. These may turn out to be the aspects of VLSI technology where they can most readily make impor- tant contributions. For example, universities have already contributed significantly to the modeling of semiconductors, and are more likely to continue productive work in this area than are most industrial labora- tories. Though Bell Laboratories and IBM are exceptions, most semicon- ductor firms sponsor very little of this kind of research. Work on algorithms and data structures for VLSI design is another area in which university efforts can easily surpass those of industry. Little work of this kind has been done thus far at universities, largely because university researchers have not generally been aware of the problems that need solving, but such awareness is now growing very rapidly. Activity in this area has already begun at MIT, Stanford, Carnegie- Mellon, and other schools, and this work should swiftly become a valuable supplement to the more pragmatic efforts of industry. To summarize, universities have done best and are likely to continue to do best in areas that lie within reach of the talents of their faculty and students and that permit work within the constraints imposed by the relatively modest capital investments that universities can justify. Universities can initiate important streams of work, but when specific lines of research and development come to stand at the focus of industrial interest, universities cannot expect to match the manpower and capital investment that industry is able to mobilize. The university's real advantage is that it can pursue basic questions more readily and consistently than industry can. University researchers are free to examine many different lines of investigation in an environment protected from deadline-generated pressures. How- ever, university groups lack the focus that product concerns gives to industry. A desirable ingredient, conspicuously missing in the past, has been greater involvement of the university community in the practical activities of the VLSI industry. Today there is almost a rush for involvement. This may be both healthy and unhealthy: Overall valuable new research perspectives should result, but some institutions may prematurely commit to on-site semiconductor fabrication, and serious problems may develop if this financially substantial commitment becomes too burdensome.

Paper 3 I~ORETICAL "SEA=H: ITS PURPOSES AD INFL==E Theoretical computer science is concerned with discovering the inherent power and limitations of computing techniques. This agenda proceeds in both an abstract vein of existence and nonexistence results for various models of computation and a concrete vein of algorithms and general methods for particular classes of problems. Both universities and industry contribute to the development of theory, but this is an area in which the universities lead. Good theory codifies, synthesizes, and transmits ideas. It lives most comfortably in universities and plays a dominant role in the advanced curriculum of computer science. The agenda and style of theory contributes both directly and indirectly to industrial development. Theory Provides a common body of understanding that practitioners automatically employ in attacking technical tasks. Practical computing would be different indeed without ideas instilled by theory, ideas such as recursion, assignment, tree, grammar, critical section, divide-and- conquer, NP-completeness, and many more. mese words are so fully familiar to every practitioner that it is hard to realize that only a couple of decades ago none were in common use in the field and some did not exist. Theory pervades education and thereby invades industry. Industry . counts on fresh college graduates to know the concepts referred to in the previous paragraph, to have a ready stock of techniques, and to understand the power and limitations of these techniques. In effect, education injects theory into the industrial milieu as a free good. Theory solves certain Problems that would otherwise be out of reach. m is truth is particularly obvious in the domain of applied mathematics--witness theoretically based developments such as linear programming, fast Fourier transform (FFT), or finite element techniques. Automata theory has given us fast search techniques suitable for bibliographic retrieval. Formal language theory has given us automatic consistency tests for language designs. Algebra has given us new cryptographic techniques. Complexity theory has given us combinatoric algorithms to help with geometric layout. Theory gives insight into the capabilities of computers. Nonexistence results advise practitioners not to waste time in quests such as absolutely optimal compilation or general verification of 37

38 programs. Finer theoretical analysis often tells how closely, or in what cases, some of these goals might actually be achieved. Proven bounds give objective standards of performance for algorithms and point out opportunities for improvement. An interesting recent example is the Ziv-Lempel result that certifies the efficacy of dictionary-style data compression. Knowing definitively that one class of statistical methods for text compression is good whether or not we know the statistics of the source, we can now attack new problems ' in data compression with a degree of confidence similar to that with which we approach sorting. Theory suggests generalizations that simplify computing. m e conventional core of programming language technology is strongly informed by theory: environments and binding, functional abstraction and composition, and nested scope and activity of variables are matters upon which common agreement has' been reached after refining in the theoreticians' fire. In other equally important matters, no comprehensive theory exists (e.g., global scopes, libraries, dynamic storage allocations, and asynchrony). Industrial software applications that fit within the ground covered by well-understood theory--and there are many--can be promptly and reliably implemented. Furthermore, second-order tools (program generators) suggest themselves in these areas and can be built with confidence. THE THEORETICIAN'S PLACE The world of computer science is in general an artificial one, divorced from concern with most physical laws, which are most decidedly not digital. In this respect, computer science has more in common with humanly constructed fields like accounting than with hard science: many of its artifacts merely reflect convention. Thus theoretical computer science, in pursuing mathematical ' truth, can touch only on certain core aspects of computing, which pervade, but by definition cannot embrace, the mutable, artificial world of systems built by or used in industry. Theory cannot prescribe how systems should work, but it can illuminate inherent costs and trade-offs that may be considered in designing them. Some prevalent modes of theoretical research engender isolation. Considerable work in theoretical computer science has been devoted to excessively inward-looking corners of the field whose raison d'etre is partial fulfillment of Ph.D. requirements. Moreover, even good work often goes unsung beyond the main audience of peers for whom many theoreticians play. And potentially interesting questions from outside that group may go unnoticed because they are unstylish or not worded in the accepted jargon. The industrial theorist is better protected from these perils than his university colleague. Isolation is abetted by computer science curricula. Hard and deep science has tended to disappear from standard computer science curricula. A typical bachelor in computer science graduates almost illiterate in science in general, and is much less able to understand

39 the hard sciences (except for elementary discrete mathematics) than other science or engineering graduates are to comprehend computer science. With this preparation, it is no wonder that much graduate research tends to look inward, and to touch other disciplines only in simplistic fashion. For this reason, industry may weigh the extracurricular or hobby-like experience of a research candidate heavily: a ~pure" computer scientist is often seen as a dubious investment. Direct applications of theory tend to be invisible. The nature of software products is such that even a central, critical algorithm supplied by theoretical research may be completely submerged, since in most cases (outside of applied mathematics) the basic solubility of the problem with which it deals is easily perceived. What theory contributes may ~only" be efficiency, the difficulty and importance of achieving which is hard to appreciate unless one happens to have tried to do it oneself. Who but an insider would imagine, for example, that most fast pattern-matching routines are direct and conscious applications of the equivalence of deterministic and nondeterministic finite automata? Especially when diffuse, the role played by theoretical insight in devising particular pieces of software will often be hard to perceive. Only the original programmer will know which of any of a hundred minor algorithmic choices in a piece of software were shaped by theoretical generalizations. Theory is not a journeyman trade. This truism explains industry's widespread unwillingness to hire and inability to exploit people to do theory. Save for an exceptionally insightful few, being "good at theory" suggests native brightness more than it does immediate utility. A theoretician will do industry little good except as he evinces a lively interest in seeking challenges from and propagating results within the firm. Unlike a cobbler, a theoretician will serve industry ill by minding his own last. He must be something of a showman, for his product does not speak for itself, and something of a sponge, for his questions must bear on those of others. CONVERSION OF THEORY INS O PRACTICE We have argued that theory provides much of the common body of understanding that practitioners automatically employ in attacking new technical task. For lack of a better phrase we may call this immediate recognition that causes us to do things right without thinking about them Permanent knowledge. n Codification and transmission of permanent knowledge is the theoretician's responsibility. Let us examine some concrete examples of how theoretical work ideas in computer science have, or have not, impinged on practice, with special reference to the mechanisms of technology transfer.

40 Steady Progress: Formal Language Theory in Programming Languages Programming languages, notably FORTRAN, came first, well before any recognized applicable theory. (For a more extended discussion of compiler technology, see Paper 4.) That is not to say that early compiler technology was unscientific. Quite to the contrary, the seminal FORTRAN project led by John Backus at IBM had a strong research flavor, but the goal was a working demonstration, not scholarly publication. FORTRAN provided an overwhelming existence proof; it became obvious to most people in the field that they could in principle do the same without further instruction, but only by dint of mighty effort. m e science that followed immediately upon FORTRAN was largely derivative, seeking generality of syntax and semantics by obvious analogy to conventional mathematical notation. However, when ALGOL 60 syntax was described in a notation much like that which Chomsky had recently brought to linguistics, and when LISP was defined by McCarthy in terms from the theory of recursive functions, it became clear that programming languages could be investigated and implemented systematically. m e injection of a simple idea borrowed by theoretically knowledgeable scientists from other domains and the creation of a coherent mode of discourse was all it took to start the . explosion. University researchers enthusiastically took up formal syntax during the 1960s, but commercial purveyors of compilers, while aware of the field, were essentially uninfluenced by it except as a descriptive tool. In the commercial world, everybody "knew" you could do better by hand methods than by formalized techniques. Only a few groups tried to exploit the new knowledge. Massachusetts Computer Associates, for instance, developed and used precedence parsing and reduction analysis for their contract software business. In this exploratory period, various grammar models were tried with different trade-offs between simplicity and generality. Eventually, sufficient expertise developed that various research groups undertook to-prove they could write production compilers with their new techniques. GE's FORTRAN Y. for example, convincingly demonstrated the practicality of so-called LL parsing. Later, the so-called LALR model gained ascendancy. With the technology now fully established, new languages like CHILL and ADA have benefited from analysis of their grammars while the languages were being defined. Grudging Awareness: Theoretical Algorithms in Optimizing Compilers Optimization and flow analysis remained outside of fashionable theoretical research for 15 years, until they were perceived as pragmatically significant examples of the combinatoric problems that came in the 1970s to dominate the subject of algorithms. Individual practitioners began to introduce formal techniques into this corner of combinatoric algorithms over a decade ago, but in that whole period the subject of optimizing compilers remained outside the domain of

41 permanent knowledge, and even today, most optimizing compilers are undertaken mainly as prodigious, largely heuristic, projects. Moreover, modest compiler efforts have not seen optimization as a necessary goal. Nevertheless a transmissible framework now exists, and code optimization may be expected to join syntatic analysis as a well-established part of programming language technology. Scant Contact: Formal Semantics in Languages The history of formal semantics illustrates a different story. The primary groundwork of the theory dates to the 1930s. Its specific application to accurate definition of the meaning of programming languages began in the early 1960s. By now, a great deal of theory has developed in this area. Nevertheless, the influence of this theory on practice has been minimal. No catalyzing event like the ALGOL 60 report has yet occurred to demonstrate its utility to the world at large. Certain derivative ideas, such as the virtue of applications languages, have gained wide currency, and numerous ex post facto unofficial formal descriptions have been published. An attempt to tighten and debug the informal definition of DOD's ADA language by formal (and automated) techniques is in progress today. But overall this theory remains an academic rather than a pragmatically influential subject. Contagious Spread: Cryptography and FFT Spurred by the NBS data encryption standard, cryptographic theory burst upon computer science theoreticians and practitioners almost simultaneously. Natural curiosity concerning a type of activity customarily carried out in deepest secrecy drew the attention of the scientific and popular press to the theorists' work. Important theoretical innovations, notably public-key cryptosystems, entered the general consciousness almost immediately. The contagious spread of these cryptographic ideas is not unique. The FFT (fast Fourier transform) caught on even faster upon its rediscovery by Tukey and Cooley 20 years ago, and knowledge of radix exchange sort, then Hoare's quicksort, exploded similarly. mese techniques speeded up particular computing processes dramatically, and, more importantly, they impressed the community at large with the importance of considering the complexity of computing processes. Saturation: NP-Completeness Originally promulgated by Cook, the importance of the idea of NP-Completeness was hammered home when Karp published a portfolio on NP-complete problems with something for everybody. NP-Completeness has taken its place as a prime concept concerning computing.

42 HOW IDEAS SPREAD The preceding discussion suggests that the fruits of theory are imprinted in permanent knowledge by catalyzing events, such as the publication of the Algol report or of the quicksort algorithm, but that the ultimate influence of the ideas thus introduced may greatly overshadow the events that initially make them known. m e contributions of theory have diverse provenances: the FET came from outside the computer science community, Algol from a committee, quicksort from an individual, and cryptography from an underground bureaucracy. When the time is right, ideas can spread more rapidly than the printed word if there is a network ready and eager to propagate them. The network did not operate as well for formal semantics: Scott's results were transmitted quickly and with respect but not with fanfare, and not into the general consciousness. What accounts for the difference? First and foremost, one observes that the catalyzing events were publications on significant demonstration of an idea. The network advertised them, but did not constitute their prime publication. Second, though the most pragmatically successful theoretical contributions were in one sense breakthroughs, in another sense they were continuous advances--they did not require a totally new way of looking at the world. _ _ _ , . _ (Even FORTRAN, so remote syntactically from machine language, was transparently equivalent to it, and FORTRAN was comfortably like the mathematics for which machines were being used at the time.) Third, regardless of any step toward abstraction--often the most important contribution of a theoretical idea--applicability to known problems was demonstrated. In short, it is immediately applicable rather than pure theory that spreads fastest. A further proviso is that there be a theoretician to popularize it; it is difficult to find examples where a popularizer, not himself theoretically inclined, has succeeded in spreading useful theoretical gospel. Before Popularity We should not conclude, however, that theory of value, or theory that pays off, comes first and foremost from theorists with a bent for popularization, or that only work from which a dramatic development can be anticipated pays off. Far from it. Popularization is only the visible part of the iceberg of theory. m e clarity of the Algol report derived from a hidden prior decade of theoretical contributions. How did an astute organizer (Bauer) know that this was the time to act, or achieve such unity of purpose among a convention, or happen to cultivate a m omas Jefferson (Naur) to draft this report? Could it be done again for semantic description, or for data base technique, or for command languages? The most we can assert is a platitude: Good results are bound to emerge from creative people of vision and purpose.

Paper 4 COMPILER WRITING: MATURATION OF AN AREA OF PRAGMATIC IMPORTANCE Like most other branches of computer science, the initial development of compiler technology was spurred directly by pragmatic pressure, specifically the intolerable cost of programming major scientific applications. Out of the first compiler writing efforts, and especially the distinguished FORTRAN effort led by John Backus at IBM Research, there precipitated a seminal bag of tricks and inventions: table-driven methods of syntax analysis, stacks as an identified data structure, hashing as a technique for searching symbol tables, register allocation as a combinatorial problem, and global program analysis as an issue--that soon began to attract the attention of computer theorists. By now, theoretical contributions in the compiler area have become substantial enough to govern practice to a significant degree. The first subarea in which manageable theoretical questions could _ _ Useful models of language structure appeared quite early in the work of the linguist No am Chomsky and were rapidly adapted to the computer context (Backus and Naur). m ese models were seen to organize the parsing process, to have an interesting algebraic/combinatorial flavor, and to make contact with the theory of decidability that earlier work in mathematical logic had established. As a consequence, they attracted the attention of many theoretically inclined academic computer scientists. be cleanly put was that of syntax analysis. In retrospect, it is possible to feel that concentration of effort in this area was at times excessive. Nevertheless, this work served well to organize our understanding of the parsing process. . . . In particular, it clarified the mild restrictions that must be imposed to make a grammar efficiently passable, and it led to the development of schemes that allow fairly intuitive descriptions of language syntax to be transformed automatically into parsing programs. While this theoretical work was in full swing at universities, industrial researchers were opening up new lines of investigation focused on other phases of the compilation process. Efficiency of compiler-produced code had from the start been regarded as a key issue by industrial compiler writing groups. Much attention had been devoted to this point in the pioneering IBM FORTRAN effort, and this emphasis had continued to influence such subsequent developments as the Computer Sciences Corporation UNIVAC 1108 FORTRAN compiler and IBM' s later FORTRAN H compiler, developed for the System 370. mis work identified 43

44 various important code-optimization issues and began to develop techniques for dealing with them. First of all, code optimization was seen not as a way of compensating for bad programming, but as a way of removing those wholesale inefficiencies that could result from naive program translation when a programmer used a language like FORTRAN appropriately. To remove these inefficiencies, a code-optimizer program had to do various things. Redundant computations had to be identified and suppressed; computations unnecessarily repeated within high-frequency loops had to be identified and moved to less frequently traversed program regions. Effective assignment of computed quantities to machine registers was seen to have a strong impact on efficiency, but to find favorable patterns of assignment it was necessary to work out the "live regions of computed quantities, which is to say the points at which they were used for the last time before becoming either superseded or irrelevant. Quite early on, it was realized that this information could most readily be calculated by regarding a program as a kind of abstract graph whose nodes were elementary program fragments and whose edges represented the control transitions that could occur during execution. Analysis of such "program" graphs led systems developers to a general concern with the analysis of paths through graphs, to an interest in the large graph-theoretic literature inherited from mathematics, and to attempt to find graph notions that would facilitate global program analysis, in part by giving a precise definition to the informal notion of "program region" with which code-motion procedures necessarily worked. A pragmatically successful but still precise notion of this kind was found in 1961 by V.A. Vyssotsky at Bell Laboratories and was brought to light by John Cooke, who at the time played a key role in a large IBM development group that was designing a new scientific super- computer that was to be supported by high-efficiency FORTRAN compiling techniques. Papers by Cocke and Allen of IBM and Kildall of the Naval Post-Graduate School brought this graph-theoretical approach to the attention of university computer scientists. Though the techniques performed well for most program graphs encountered in practice, relatively easy theoretical analysis showed that conceivable worst cases would require an amount of time that increases quadratically with the size of the program being analyzed. This observation spurred efforts, notably by Ullman at Princeton, Graham and Wegman at the University of California at Berkeley, and Tarjan at Stanford, to find algorithms with improved worst-case behavior. By now, these investigations, which build on a battery of high-efficiency graph algorithms of the Cocke-Allen type, have led to algorithms that perform with close to optimal efficiency in all cases. These sophisticated algorithms are starting to find their way into industrial compiler practice. Moreover, the availability of high- performance global optimization techniques is starting to change the basic flavor of industrial compiler code-generation techniques. To generate really good code for a particular machine, one needs to take advantage of the peculiarities of its instruction set, favoring instruc- tions that the machine executes with particularly high efficiency. In earlier attempts to exploit such efficient instructions, compiler

45 writers introduced larger and larger masses of "special casing" code, an approach that proved to be liable to subtle and hard-to-find errors. In practice, such loosely organized collections of optimizations were often gradually removed from compilers containing them: maintenance groups, imperfectly familiar with the special-case optimizations being applied, frequently preferred to excise complex malfunctioning optimizations rather than to attempt to effect repairs. An improved approach, which is just now developing and which is very directly based on the availability of high-performance global optimization techniques, resorts to long lists of disparate special cases. One writes a relatively naive code generator that, if unsupported, would produce quite inefficient machine code and then relies on a global optimizer of known properties to clean up the resulting inefficiencies. The optimizer may also be used to glean contextual information to guide' the choices that the code generator will make. m ese techniques seem likely to allow previously ad hoc approaches to quality code generation to be systematized. The important problem of register allocation also serves to illuminate the manner in which practice and theory have interacted in the writing of compilers. It is well known that an effective assignment of quantities to registers can often double a code's execution speed; this is one of the most important techniques by which programs written directly in machine language attain efficiency. Since Backus's first IBM FORTRAN compiler, this problem has been studied intensively by a variety of techniques, some ad hoc, others having at least some fragmentary theoretical basis. Early on, register allocation was seen to be related to the classical mathematical problem of graph coloring: assigning a minimum number of colors to the nodes of a graph, subject to the restriction that no pair of nodes connected directly by an edge may be given the same color. Since this graph coloring problem is also related to the famous four-color conjecture, an extensive mathematical literature concerning it had accumulated long before computers existed. m e relationship between graph coloring and register alloction is as follows. A calculated quantity in a program is said to be live at a point in a program if it may have to be used after this program point is reached and passed, and two quantities are said to conflict if they are ever live at the same time. Quantities that fail to conflict can be packed into the same register, but quantities that conflict can only be packed into separate registers. If we view the set of all conflicts associated with a program as a kind of abstract graph, the conflict Graph of the program, then the register allocation problem is seen to be isomorphic with the problem of coloring the nodes of this conflict graph using a number of colors close to the number of registers available on the machine for which code is to be compiled. The first significant theoretical results to arise from the connection between register allocation and graph coloring were discouraging: once Cocke's pioneering theoretical work on NP-complete problems established the existence of a broad class of combinatorial algorithmic problems, all equivalent in difficulty and none likely to be amenable to solutions that were at all efficient, it was immediately

46 realized that the general graph coloring problem belonged to this class, so that no generally efficient algorithm for it would be expected. This negative news at first shifted attention back to more ad hoc register allocation schemes involving program-related combinatorial structures other than the conflict graph. Recently, however, work at IBM has vindicated the coloring approach by showing experimentally that the conflict graphs actually associated with programs are generally sparse enough in structure to be colorable rapidly by a very straightforward coloring algorithm, provided that the computer for which register allocation is being performed has enough (i.e., more than a dozen) registers. It appears likely that further developments of this approach will yield register allocation procedures that are general and effective enough to standardize practice in this subarea of compiler design. The sustained technical efforts that have gone into the analysis and improvement of the table-searching technique known as hashing provide another interesting albeit peripheral illustration of the manner in which theory has helped the technology of compiler writing to mature. Although very simple probabilistic arguments suffice to make the general advantages of this technique perfectly clear, it can be implemented in many variants. Quite strenuous probabilistic analysis is required to clarify the relative advantages of these many variants. The invention of the hashing technique triggered extensive subsequent investigations of the statistical behavior of various hashing methods. As a result of this work, which involved powerful combinatorists such as Alan Konheim of IBM and Donald Knuth of Stanford, a compiler writer who needs to use a hash-table organization as one component of his overall design can now consult a standard reference in which available methods are listed and rated for efficiency in various typical situations. Here we see theory contributing not broad unifying concepts, but precise analytic information on the performance of a significant system component, which is of course one of its most characteristic functions in a mature discipline. In summary, our picture of compiler writing, a very important subarea of software building, is that of clever but unsystematic art progressively being replaced by substantially more systematic, efficient, and well-analyzed techniques. A corollary benefit is the ability to handle more of the aspects of compiler building automatically or semiautomatically. m e problems to be solved have more often been recognized by industrial research groups than by universities, and in many cases initial solutions have come from industry. However, university research has played a vital role in refining this work, in making it generally available, and in preparing by analysis for advance to an improved level of sophistication. As this body of techniques matures, the production of efficient, trustworthy compilers will become a considerably more routine matter than it has previously been.

Paper 5 ARTIFICIAL INTELLIGENCE: SUSTAINED UNIVERSITY PIONEERING WINS INDUSTRIAL ACCEPTANCE INTRODUCTION Artificial intelligence (AI) is the branch of computer science that collects and studies the so-called Tweak methods" that problem solvers can use when "strong," algorithmic methods are unknown or impractical. The truly stunning future perspectives opened by AI research have attracted some of the most imaginative and eloquent thinkers about the meaning and potential of the computer revolution. Weir enthusiasm, the press's nose for dramatic stories, and the public's natural curiosity about the possibility of fabricating intelligence make AI unusually visible and controversial among the subfields of computer science. Because of the excitement that naturally surrounds it, AI has prospered, attracting funds and students aplenty e It has contributed more than its share of the seminal ideas in modern programming practice. Yet at the same time it draws fire, for its record of overly rosy claims and unmet promises. The long-term goal of AI is to develop a theory of intelligence, or intelligent behavior, as exhibited in man and machine. Individual researchers have their own individual styles and preferences, so AI research encompasses many different activities. Broadly speaking, there are three research methodologies in AI: experimental, psychological, and theoretical. 1. Experimental--By constructing and analyzing computer programs that use weak methods to solve problems, AI researchers can study the characteristics of these methods experimentally. Once a problemrsolving method is represented by a working program, its scope and limits can be studied by varying parameters of the program or by varying characteristics of the problem to which it is applied. 2. Psychological--By studying human problem-solving behavior, AI researchers can gain insight into heuristic methods. men by constructing programs that embody those methods, they construct psychological models that can be tested and refined. 3. Theoretical--By formalizing key concepts of problems and problem-solving methods, AI researchers can examine the logical properties of heuristic methods, such as the consistency and completeness of inference calculi, or the efficiency of search techniques. 47

48 An essential ingredient of artificial intelligence research is a body of heuristics, or judgmental rules that it has collected. These provide good guesses--justifiable but not infallible guides for problem solvers and algorithmic methods known for some classes of problems--and should be used when appropriate. However, algorithms can be more expensive to use than heuristics. Indeed, efficient algorithms are completely lacking for many problems studied by AI researchers, from the problem of answering homework questions to that of finding answers to life and death questions of medical diagnosis and therapy. The structure of AI is far from rigid, but research has focused on a few major topics: · How to encode knowledge (both common-sense knowledge and specialized expertise) for efficient use by programs. · How to control the use of knowledge for making plausible inferences. · How to recognize objects in visual scenes. · How to control robot "actuators" (vehicles, mechanical manipulators, etc.) · How to create and use plans of action. · How to modify or extend a program's knowledge base (e.g., as a result of learning from experience, learning by being told, rote learning, concept formation, or learning by other means). · How to understand and generate English-language sentences. · How to prove theorems efficiently, and how to use theorem provers as inference engines for problem solving (especially question answering). · How to model human problemrsolving methods. . These questions, and variations of them, have motivated AI research since the beginning. They can all be seen as instances of the central question of AI: How can we give computers those capabilities that are taken as evidences of intelligence when displayed by humans? Many of these themes come together in current investigation into Expert systems. n In an expert system, a serious attempt is made to codify judgmental methods in some specialized field--for example, chemical synthesis. An expert system typically comprises a loosely organized and easily modifiable collection of rules, a more or less specialized reasoning program that uses the rules as needed, and a mechanism for explaining any outcome that may be reached. WHERE IS AI RESEARCH DONE? Named before computer science itself, AI remained in the hands of just a dozen or so researchers (those named in Computer and Thought, Feigenbaum and Feldman, editors, McGraw-Hill, 1964, and a few others) until the late 1960s. With a few exceptions, AI research to date has been carried out at universities, especilly CMU, MIT, and Stanford, which enjoy massive federal sponsorship. Though industrial interest in AI is growing

49 rapidly, little support for or output in this field came historically from computer manufacturers. However, this picture has begun to change. A recent tabulation (ACM SIGART, April 1978) listed 20 U.S. corporate AI research locations. m e ARPANET has benefited from sustained Advanced Research Projects Agency and Office of Naval Research funding and from good intersite communications facilities--and a national computing resource for AI research in medically related problem areas has been serving a geographically distributed research community since 1975. m is Sumex-AIM resource, funded by the Division of Research Resources of the National Institutes of Health and located at Stanford, gives individual researchers access to a large, interactive computing environment at only the cost of a terminal and modem. m us, even though applications of AI to medicine and the medical sciences require large capital outlays, access to the equipment and to colleagues can be provided to qualified researchers anywhere in the country, without relocating them. Artificial intelligence has always been the most speculative area of computer science. From the start, its focus was on building programs that were out of the reach of conventional programming methodology. In the course of doing this, researchers sometimes have been driven to invent new programming concepts that then become important in other areas of computer science. Examples are alpha-beta search, functional programming, time sharing, list processing, and garbage collection. Until recently, the impact of Al on computer science and industrial computing has been largely through a direct transfer of programs. So far, these byproducts of AI research, of which MIT's MAC SYMA is another example, have been more important than the more direct transfer of products of AI research. Nevertheless, some direct products of AI research enjoy significant use today. Stanford's Dendral program has been repeatedly used by chemical companies to aid in the analysis of organic compounds. Some firms have adopted, or are experimenting with, techniques drawn from work with expert systems: notably DEC for configuring computers and Schlumberger for oil processing. On the other hand, industrial use of computer-aided decision and robots seems to owe little to AI research, despite the historical interest of AI researchers in these topics. It appears that these industrial efforts depend primarily on classical science and engineering. This should not be surprising: one can hardly expect methods suggested by the weak Al paradigm to regenerate the codified wisdom of the centuries on the spur of the moment. Only by assiduously appropriating that wisdom (as MACSYMA and SYNCHEM 2 have attempted to do) can AI researchers hope to succeed in fields of hard science or technology. POTENTIAL FOR TECHNOLOGY TRANSFER Frugal or efficient use of machines and systems capable of handling production-scale problems has seldom been a central concern of Al research. As a result, there often arises a certain tension between 1

50 its values and style and those of industrial development. Although industry's willingness to experiment with AI techniques is now growing very rapidly, considerable skepticism remains in industry about the complexity, size, and usability of AI programs. In contrast, the Japanese are investing $100 million in the fifth-generation computer project with the expectation that the difficulty of using AI techniques can be overcome. The majority of this committee believes, however, that crash funding of this area does not make sense. The academic AI community has recently become more conscious of the potential interest in AI within industry. Given this motivation, we believe that many methods drawn from Al research will be tried, and they will prosper or not according to the laws of economics. CONCLUSIONS Without knowing the nature of intelligence, how far can one push current AI techniques for representing knowledge for use by programs and for programming machines to use that knowledge? Researchers in AI are looking at the limits of what can be done. Industry may be expected to use the practicable results to build programs that succeed in real-world, complex environments. Modest demonstrations are in progress; we see no significant institutional barriers to their maturing.

Paper 6 ROBOTICS: THE INTERPLAY OF INDUSTRIAL AND ACADEMIC ACTIVITY OPENS A MAJOR NEW FIELD OF RESEAE~H EARLY HISTORY ffl e history of robotics, a subject area that is only now coming to the forefront of interest in computer science, is worth examining for the interplay that it exhibits between direct marketplace concerns and far-reaching research goals inspired by artificial intelligence. The intelligent robot is an old dream of mankind, robots having played a role in fiction since ancient days. The second-century Chinese general Chu-ko-Liang was reputed to have constructed robot donkeys and horses for use in his military transport operations, and in the thirteenth century, the English monk Roger Bacon was rumored to have built a talking bronze head that served him as a personal oracle. Robots are also prominent actors in twentieth-century science fiction. However, the development we shall trace first began to take on substantial flesh in the middle to late l950s. By that time, the availability of advanced servomechanism theory, the increasing sophistication and falling costs of electronics, and the fundamentally new capability provided by the stored program computer began to tempt engineers concerned with industrial automation to look at generalized, computer-controlled mechanical devices of broad potential applicability. The numerically controlled machine tool was one outcome, and the first robot manipulators another result of the climate of innovation to which the confluence of these three technological streams led. Viewed narrowly, industrial robots are simply computer-controlled machine tools specialized for the manipulation of work pieces. Seen in this light, they might be considered close relatives of numerically controlled machine tools, the most sophisticated of which are also regulated by stored programs, but which serve for cutting blank stock rather than for the manipulation and assembly of preformed parts. However, since the general environment of parts assembly is far more varied and complex than that of parts cutting, robot manipulators require programs that are more sophisticated than the simple geometric routines that suffice for numerically controlled machine tools. Hence robotics proper begins with the construction of the robot manipulators rather than with the simpler numerical machine tool technology. The loci of innovation were at first purely industrial, universities not being active in the earliest years of robotics. Two main companies, 51

52 Unimation and AMF (American Machine and Foundry) dominated developments in these years. Joseph Engelberger, George Devol, and Maurice Dunn led the early technical work at Unimation and are still active today. An aspect of the early history worth noting is that Unimation was from the start a specialized company whose future wan strongly conditioned by the need to make a success of the robot manipulators they were developing. This circumstance concentrated the attention of Unimation's management and technical developers on robotic. and did in fact lead to success. Tb survive in the relatively adverse technical environment of these pioneering days was not easy, since among other things, computers were still quite expensive and no clear market for robot devices had been established. By contrast, AMF, another pioneering company, was a large organization with much higher inertia and a much more limited focus on and commitment to robotics. In consequence of this, AMF's initial efforts soon fell by the wayside, and Rudy Malenkovic, the technically successful pioneer of robotics at AMF, moved to the Ford Motor Company, where his work concentrated more narrowly on the application of robots to automobile assembly lines. mis bit of history illustrates the critical importance of major innovations of small, fast-moving, specifically committed companies. Although Unimation survived, the years from 1956 to 1970 were lean ones for robotics. The general technological context in which robotics research was constrained to operate was insufficiently~favorable to allow any dramatic commercial success. Compared to the price of labor, the price at which robot assembly devices could be offered was simply too high. The technology was viewed with suspicion by many members of its potential customer base. Most robot-oriented companies other than Unimation simply went under or moved on to other activities. At this point, however, academic activity in robotics began to become significant. Marvin Minsky, realizing the technical depth of the robotic language/vision/manipulator control problem, persuaded AMF to give him a robot for use at the MIT artificial intelligence laboratory. m e MIT work with robots was undertaken as part of their general exploration of the broad area of artificial intelligence, and thus drew inspiration from the same very far-reaching goals as the MIT work on scene analysis, game playing, semantic nets, etc. At about the same time, John McCarthy, Jerome Feldman, and m omas Binford at Stanford began work on robotic hand-eye systems. m e most successful early outcomes of this university research lay in the areas of computer vision, geometrical reasoning and modeling, and general AI-like schemes for planning robot motions. Direct research on the manipulation problems of more immediate interest to potential industrial users of robots (e.g., the problems of grasp planning, force-guided motion control, work to close tolerances, etc.) was by contrast limited. There was, however, some significant university work on design of small, fast manipulators, especially that of Victor Scheinemann, who was associated with both Stanford and MIT before moving on to Unimation; commercial versions of the Scheinemann arm are now being offered by Unimation. Real-time software for control of the kinematic chains constituting manipulators of this type was also developed at Stanford and MIT by Roth, Piper, and Paul at Stanford and Horn at MIT based on

53 prior work of Hartenberg and Uicker. The MIT/Stanford kinematic control software was of direct practical importance, since it allowed the users of robots to plan robot motions in normal Cartesian coordinates, rather than involving them constantly with the complicated geometry of motions planned in joint-angle terms. Other significant robotic research efforts were also undertaken at the Stanford Research Institute, which concentrated on problems of locomotion and computer vision, and at Edinburgh, where problems of computer vision, but more significantly some of the basic problems of robot assembly, were also studied. The work at SRI on a robot rover that navigated in a complex room environment became well-known and helped enlarge the general view of what robotic techniques might accomplish. m e Edinburgh work on the assembly problem also demonstrated some interesting technical points, but unfortunately the Edinburgh manipulator hardware was only suitable for solving toy assembly problems that industry could not regard as realistic. This group's lack of suitable industrial and governmental contacts and sponsors provided a critical obstacle to the transfer of their ideas into serious industrial practice. It is also worth noting that some of the early university work on robots was motivated by enthusiastic expectations of immediate progress and generally reflected the hope that short intense efforts would suffice to reach goals that still have not been attained. m e Stanford hand-eye work had the goal of assemblying a Heathkit radio, which it was hoped would become possible within two years of program inception. Fifteen years later, an assembly operation of this complexity is still well beyond the state of robotic art. Since much of the early university research was funded by the Advanced Research Projects Agency of the Department of Defense, reports at that time had it that early fielding of robot troops or robot-run military vehicles was hoped for. m is, too, requires technological capabilities that we are still far from having. Nevertheless, in spite of the failure of their most ambitious expectations, these university efforts did lay a technical base and they did train initial groups of researchers that supplied manpower for the rapid growth in robotics that began during the early 1970s. THE SECOND DECADE By the early 1970s, inexpensive minicomputers were readily available, and much cheaper microcomputers were obviously on the horizon. These dramatic strides in electronic technology clearly promised to remove one significant cost factor, the cost of computing power for manipulator control, that had impeded the application of robot technology. This fact, obvious to industrial groups both in the United States and abroad, triggered an expansion in the level of industrial robotic research, with major companies such as IBM, Cincinnati Milacron, Texas Instruments, Westinghouse, GE, and GTE all becoming involved in robotic research and development. Japan became a major focus of robotic activity at this time, with Hitachi, Fujitsu, and the powerful research group at the

54 National Electronic Laboratory all building up robotic research groups. Olivetti in Italy, ASEA in Sweden, Volkswagen in Germany, and Renault in France all entered robotic research as well. While a part of this expanded activity was simply inspired by hopes that the use of robotic techniques could alleviate some of the internal productivity problems that the companies involved were facing, at least a few of the companies building up robotics groups (including Olivetti, Fujitsu, IBM, and Cincinnati Milacron) saw robotics as a potential opportunity for major expansion of existing product lines. m e level of robotic research at universities and university-related research institutions also rose substantially. The Electronics Directorate at the National Science Foundation began to fund university work in this area, and increased funding soon became available under the more specialized NSF Research Applied to National Needs (RANN) and Productivity Technology programs. The NSF funding allowed the Stanford robotics efforts to expand to three-dimensional modeling systems. A significant robotics activity was also built up at MIT's Charles Stark Draper Laboratory. The Draper work focused on the problems of industrial assembly, specifically the mating of parts that must be fitted together to very close tolerances. Here the inventiveness of a very able group of mechanical engineers contributed an outstanding mechanical device, the so-called remote center compliance device, which made it possible for a robot manipulator to mate parts that had to be fitted to closer assembly tolerances than the maximum geometric precision of the manipulators themselves. NSF-funded work at the University of Rhode Island demonstrated that computer vision could be practically and successfully combined with robot manipulation. m e Rhode Island work concentrated on a single problem of great industrial importance, namely that of picking up parts made available to a robot in disorganized tote bins. This well-chosen concentration made it possible for the Rhode Island work to succeed much more markedly than other groups whose research aims were perhaps broader but whose focus was also more diffuse. During this same period, industrial robotics pitched at a number of specialized applications began to achieve real commercial success. Concentrated attention to one specific application area--automobile spot welding--made it possible for this operation to be performed reliably and well. In this application a heavy tool (a welding gun) must be positioned to within a few hundredths of an inch, with specified orientation, at prespecified points on an automobile frame. When the tool reaches a significant point, it is activated and a weld made. m e first successful welding applications required the auto frame being welded to remain stationary, but, as computing power increased, the more complex task of making welds on a body moving along an assembly line was mastered. Other similar applications--for example, paint spraying and die casting--were also studied in depth and opened important new markets for the sale of robots. During the 1970s, various industrial and university research groups began experimenting with sensor-equipped robots. As already noted, the Stanford, SRI, and Rhode Island work involved the combination of computer vision and robot manipulation systems. Other work at

55 Stanford, IBM, and elsewhere involved the use of more or less sophisticated tactile sensors. m e use of such sensors demands control software of much greater sophistication than that needed for robots whose control is fixed and purely geometric. For fixed geometric control, simple lists defining the path that a manipulator is to traverse suffices but in the presence of sensors, one needs software that can deal with many sensed conditions, choosing among alternative actions and reacting immediately via interrupt-handling software to detected external events. Accordingly, the development of more sophisticated sensors pushed roboticists, perhaps for the first time, into active concern with the sophisticated software and programming language issues that had been central to other branches of computer science. Out of this involvement with software design questions came the control languages that are being sold with the more sophisticated of the commercial arms available today. For example, the Unimate model 250 and 500 manipulators (which are largely based upon Scheinemann's Stanford and MTT work) are programmed in a language called VAL, which is a direct derivative of the experimental AI robotic language developed during the early 1970s at Stanford. The relatively advanced AML language supporting the IBM line of robot manipulators also reflects the Stanford influence, several key members of the IBM group having been trained at Stanford. The importance of this Stanford work can be seen by contrasting the Stanford-descended manipulators with some of the other robot manipulators being sold today that still make use of more primitive software concepts that derive from pre-1970s research. For example, the Cincinnati-Milacron manipulators and some of the other robot manipulators being sold today still make use of more primitive software concepts that derive from pre-1970s research, and are programmed in a language reminiscent of the APT machine-tool programming language. The expansion of robotic research and development activity that characterized the 1970s has continued and intensified during the first years of the present decade. Westinghouse Corporation has recently established a separate division within their research and development organization to develop robots for internal use. This initiative was undertaken after a visit of the Westinghouse board chairman to Japan made him aware of the relatively advanced assembly techniques being used at Hitachi. Robots for electric motor assembly are of particular interest to Westinghouse. General Electric has announced its intention to undertake a major reorganization of its manufacturing operations, with greatly increased use of robotics. General Motors has stated that it aims to increase its population of working robots substantially. The level of university robotics-related research is also growing rapidly. A major new activity, organized as a Robotics Institute, has begun at Carnegie-Mellon University. The Carnegie-Mellon group combines the interests and talents of the university's computer science, mech- anical engineering, electrical engineering, and industrial engineering departments, and can be expected to bring a broad spectrum of pragmatic and theoretical talents, covering algorithm, software, and hardware design, to the ambitious work that they have undertaken. The University of Florida has also established a Robotics Institute, and the powerful

56 MIT group is undergoing major expansion. Many other universities are starting to involve themselves in robotics, and are actively seeking industrial connections that can ease their entry into this field. Robotics has become a matter of interest to the press and general public and is seen as a vital key to gains that will ensure the ability of the United States to compete internationally. Access to the flow of funds critical for continued robotic research and development is assured by the limited but very real practical successes outlined above. Although science fiction, perhaps abetted by the speculations of some artificial intelligence enthusiasts, has conditioned the public to expect spectacular events, incremental progress based upon complex and strenuous research efforts is more likely to characterize the coming decade. Commercially successful new applications of robotics will become possible at those points where the most advanced concepts projected by research laboratories can be cut back to yield more limited, but well-engineered, reliable devices that answer the immediate needs of industry. Industrial acceptance is, of course, critically conditioned by cost, so that robot systems limited in their kinematic and sensory complexity and in their demand for computing power will come into wide use before more flexible but expensive systems. Another factor bound to condition the rate at which industry demands robot equipment will be the need to rework the existing industrial environment of fixtures, -~ _ to better adapt them to robot-based - ~ - nart feeding, and transport devices production styles. But, as the cost of robots falls, and as these inherently universal devices grow in adaptability and reliability, broad industrial acceptance seems inevitable. CURRENT RESEARCH EMPHASES To manage the technical problems of robotics, very challenging research will have to be undertaken, and complex, expensive developmental activities mounted as well. m is will require both extensive university research efforts and major industrial developments. Here we perceive an area in which well-structured university-industry cooperation could accelerate the growth of a very challenging technology. To make a robot manipulator useful commercially involves a significant software effort, which must at the very least provide for real-time manipulator control, rapid handling of sensor-generated interrupts, and complex geometric computations. These requirements will increase significantly as multiarmed robot systems come to be employed. In robotics, computer science confronts the kinematic and dynamical reality of three-dimensional space, a circumstance that has already begun to involve robotics researchers in many fields whose relationship to the pragmatic requirements of computer science was previously marginal. Among these newly critical subject areas, computational algebra and geometry, Lagrangian dynamics, and the theory of friction and of elasticity may all be listed. All this implies that the complex of issues that researchers in robotics need to face is extraordinarily broad, so that practical progress in this field is likely to be more

57 dependent on advanced research than is the case for other computer application fields. Since little of this scientific material has until now been part of the computer science curriculum, we can also expect the requirements of robotics to encourage a substantial revision and mathematical deepening of the curriculum that university computer science departments will have to offer. Although robotics research seems certain to touch upon a par- ticularly broad range of technologies and scientific disciplines, we can gain some understanding of the areas likely to be of greatest significance over the next decades by surveying the near-term requirements of industrial robotics. These include the following: 1. In-depth studies of important current applications. Robot spotwelding has become routine, and attention is now turning to the more complex physical problems associated with continuous arc welding, where proper control of welder robots requires some understanding of the thermodynamics of the liquid-solid arc pool. Ways of specializing robots to work in environments that are hazardous or inaccessible to humans, e.g., high-purity clean rooms, deep-sea environments, nuclear reactors, and space, also require detailed study and will sometimes raise complex dynamical and other problems. 2. Improvement of visual, tactile, and other robot sensors. Current computer vision software is of limited reliability and quite expensive computationally. Much faster and more stable picture- processing algorithms and devices are required. To produce these will require penetrating theoretical research, as well as the development of specialized high-performance VLSI chips whose logic will have to embody the best algorithms that research can make available. Vision systems that are easily reprogrammable for a wide range of applications are particularly desirable, but at present it is not at all clear how these can be created. Tactile sensing plays a particularly important role in dextrous manual assembly. The subtlety of the human tactile sense is far from being matched by the relatively crude tactile sensors currently avail- able with robot manipulators. It seems clear that greatly improved sensors will be required if complex assemblies, especially of fragile and deformable parts, are to be attempted, and if more sophisticated methods of grasping are to be developed. m is has been recognized as an important research issue, and work on improved tactile sensors under way now should yield considerably improved sensors within a few years. Better sensors will in turn call for more sophisticated software to manage them, a consideration that emphasizes the importance of improved programming techniques to the general progress of robotics. It is also important to develop improved proximity sensors able to give advance warning of impending collisions. One will probably never be willing to set robot arms into rapid motion in an environment that is at all unpredictable. His makes it plain that development of better proximity sensors can contribute substantially to the productivity of robot systems, and also to their flexibility. Such systems will in turn demand software able to react appropriately to warnings that they supply.

58 3. Force-controlled motion primitives. The motion-control primitives supplied with today's robot systems are purely geometric in character, but cannot, as they stand, be used to cause a robot arm to move smoothly while maintaining contact with a curved surface of unknown shape. The ability to do this is essential for the logically flexible adaptation of a manipulator to an environment whose whole geometry is not known in precise detail. Force-controlled motions play an essential role in manual assembly, and the demonstrated advantages of devices like the Draper Laboratories' remote center compliance device point clearly to their importance for robot manipulators as well. Near-term research and development efforts to make motion primitives of this kind available in the commonly used robot-programming languages are therefore likely. 4. Improved robot-programming techniques. A large existing industrial assembly manual literature gives detailed directions for producing a great many common manufactured items. Finding some way of translating these manuals automatically into robot assembly programs would be ideal, but unfortunately, this far exceeds the capability of today's robotics programming languages. For anything close to the language of standard industrial assembly manuals to be accepted as robot control input, much more sophisticated languages than those now coming into use will be required. m e compilers for such languages will have to incorporate knowledge of the part and subpart structure of partly assembled manufactured objects. They will also have to incorporate a routine capable of planning the way in which such objects can be grasped, moved without collision through a cluttered environment, and inserted into a constrained position within a large assembly. This level of programming sophistication only becomes feasible if a robot system can either maintain a detailed model of the environment with which it is dealing through a whole complex sequence of manipulations, or acquire and refresh such a model through visual analysis of the scene before it. Although no robotic language with nearly this degree of sophistication has actually been produced, such languages have at least been projected, e.g., in the work on AUTOPASS at IBM, and its Stanford, MIT, and Edinburgh relatives AL, LAMA, and RAPT, respectively. It should be noted that the implementation of languages of this sophistication will require solution of many levels of fairly complex mathematical and geometric problems. One basic problem of this kind is that of planning collision-free motions of three-dimensional bodies through obstacle-filled environments. This problem, studied by researchers at Caltech, MIT, New York University, and elsewhere, has by now been brought to a preliminary stage of solution, but from the practical point of view this work merely reveals the complexity of computations that motion planning can involve and the importance of seeking much more efficient motion-planning schemes. The work carried out by Fahlman at MIT is also suggestive of possibilities for more advanced robot activity-planning software. Working within a simulated world of blocks, Fahlman constructed a program that could combine geometric knowledge of the collection of blocks given it with an understanding of the final assembly desired, to produce a fully sequenced assembly plan. This demonstration program was even capable of using some of the blocks available to it to construct fixtures useful in the assembly of the remaining blocks.

59 5. Improved manipulator hardware. m e essential elements of a complete robot manipulator subsystem are-a manipulator arm, the grippers and sensors with which it is furnished, and the computer circuitry that controls it. University groups can be expected to contribute substan- tially to the design of new hardware and software control schemes and improved grippers and sensors, but the high costs associated with the development of a new mechanical hardware are likely to make this a matter for industry rather than universities. It may be necessary to include features supporting advanced control concepts such as that of force-guarded motion directly in the basic mechanical structure of a manipulator, a possibility that argues for the importance of close industry-university collaboration in the continued development of robot manipulators. SOME CONCLUSIONS Advances in robotics can certainly contribute to increased U.S. productivity. However, the research issues that will need to be faced in developing robot technology are extremely broad; geometry, dynamics, elasticity theory, materials design, and electronic and software science are all involved. University researchers will find many deep issues to ponder, and industrial development groups will have many large systems to build. In order to realize the great potential of robotics, it will be particularly important to combine the abilities of these two communities and to encourage them to work closely together; this will give industry the mathematical skills that the field demands, and assist universities to find the capital resources and engineering capacity that they will need. As the potential of robot technology is realized over the next few decades through the mastery of successive practical tasks, and as the cost of robot manipulators and their controls continues to fall, economic pressures will increasingly favor wide robot deployment. The immediate technical steps that will lead in this direction are the improvement of manipulators and grippers, the close study of numerous significant applications, and the development of improved sensors and their integration into standard systems. Universities will contribute deeper investigation of the complex mathematical and programming problems of robotics. A word of caution is in order. AS the robot development works itself out, our ability to deal with its social consequences is likely to be challenged. If we fail to respond adequately to this challenge, social unrest and latter-day Luddite tendencies may become a bigger inhibition to the wide deployment of robots than technological difficulties, which in time will surely be overcome. Society will have to decide what it wants to do with the new industrial capabilities that robotics research is creating.

Paper 7 SC IENTIFIC COMPUTING: A SCIENTIFIC GROUP OF MIXED BACKGROUND ATTEMPTS TO COPE WITH A BROAD SPECTRUM OF PROBLEMS I NTRODUCTION Scientific computing is computing in the service of some scientific undertaking external to computer science itself. The sciences that have traditionally been large consumers of computing services are those in which important (though perhaps simplified) models of physical or chemical processes are well enough established for their application to be worth the expenditure of large amounts of computer time. Computation serves (1) to work out the results predicted by theoretical models for which closed-form solutions are not available {perhaps because the models being studied are inherently complex); (2) to manage the large volumes of data sometimes needed to make findings significant; and (3) to present complex, multidimensional measured or predicted data in graphic or other forms that enhance the scientist's ability to comprehend their overall meaning. Computation has come to play an essential role in the interpretation of data, in the assessment of the adequacy of existing models of phenomena, and in the generation of new models. Large-scale computation is increasingly important to scientists and engineers working in such fields as high-energy and condensed matter physics, energy production, geology, meteorology, structural engineer- ing, aerodynamics, chemistry, physiology, and biochemistry. m e results of scientific computation enter, directly and indirectly, into industrial planning and practice. Computation is more and more an integral part of new scientific instruments, and increasingly the raw data gathered by these instruments are processed computationally before the instrument user ever sees them. In some cases, for example, in computerized tomographic equipment, the raw data are totally unintelli- gible before they have been processed. The cost of the computations that scientific models require is often proportional to the second, third, or even higher power of the complexity of the problem being attacked and the fineness of the detail that computation using a theoretical model is required to produce. Therefore only massive increases in computing power can be translated immediately into the ability to handle increasingly ambitious and refined applications. It is for this reason that the scientific computing community demands ever larger and more powerful machines. 60

61 Such demand has traditionally spurred innovation in large machine architectures. DEMANDS ON THE "COMPUTING SCIENTIST. To deal sucessfully with large computations that relate to a scientific field requires a high degree of familiarity with the field itself. The approaches suitable in such computations will reflect the major con- ceptual structures of the field to which the computation belongs, and extensive understanding of the physics or chemistry of the models for which computations are being set up may be needed to understand which of many possible numerical schemes is most likely to produce stable, significant results. An effective computational approach will often rest upon a deep-lying analytical simplification or a clever segrega- tion of the terms appearing in an equation, the terms being treated by different computational strategies. Only when these analytical approaches have been pushed to their limit will a large-scale computa- tional attack on the problems that remain be economically justified. Brute-force number-crunching, which tends to drive computation costs up rapidly, may not be competitive with methods making use of clever approximations, expansion, or mathematical transformations. Thus the practitioner of scientific computation needs to be familiar with all the analytic paths that define the structure of the subject with which he is working, be it hydrodynamics, plasma physics, materials science, neurophysiology, or aircraft design. However, subject-area expertise is only one of the qualifications that the successful practitioner of scientific computing must have. Scientific computing always works with approximate, incomplete representations of the numerical analysis for certification of the stability and accuracy of the computations under- taken. The numerical issues involved are often subtle. If a computa- tion uses an inappropriate numerical scheme, perhaps because some detail (whose significance easily escapes the notice of a physicist or chemist unused to thinking of the numerical properties of a familiar equation) is badly chosen, numerical instabilities that cause large expensive computations to produce nothing but meaningless numerical noise may result. Worse still, plausible-seeming but erroneous output can be produced. Thus the practitioner of large-scale scientific computing must be expert both in a science and in numerical technique. Moreover, in addition to these fields, the computing scientist increasingly needs to be familiar with the sophisticated data structures and combinatorial methods developed by recent computer science research. For example, dissection methods, adaptive numerical approaches, and other subtle geometric techniques have come to play an important role in computa- tional studies of fluid and gas flows. Setting up a combinatorially complex computation may require use of large symbolic manipulation systems, like the MIT MACSYMA system. Moreover, scientific codes often ~ ~ ~ The Crowing likelihood that the largest scientific computations will be carried out on highly parallel computers requiring novel sorts of programming, as well as the probable grow very complex procedurally. _ . . . . . .

62 use of specialized VLSI hardware devices such as array processors and Fourier transform chips, seems certain to increase the level of computer science expertise demanded of practitioners of scientific computation. Finally, where analytical models of the type common in the physical sciences are not available, statistical models will often serve as a substitute. Statistical models are, of course, a primary tool for medical, economic, and social scientists. Computation in these fields therefore requires an extensive knowledge of the principles of statistics, of the mathematical properties of statistical modeling techniques, and of existing statistical packages. This requirement and the need to use efficient computational procedures (when they exist) make it necessary for practitioners of scientific computing to have multiple scientific citizenship, i.e., to maintain a solid standing in an external scientific discipline while at the same time acquiring close familiarity with the specialized tech- niques of numerical analysis and the much larger and more amorphous mass of computer-related detail and programming methodology that they also need to apply. m e fact that they need a high level of expertise in some external science means that most practitioners of large-scale computing are trained as physicists, chemists, physiologists, etc., rather than as computer scientists. It has generally proved easier for scientists to learn computing than for computer specialists to acquire these more mature sciences, perhaps because the numerical analysis required for scientific computing no longer forms any substantial part of the computer science curriculum, and the external science never did. However, practitioners of scientific computing who wish to operate with the best techniques available are coming to require more formal training in computer science than was formerly necessary, and the need to invest time in these studies is making it harder for them to maintain contact with their original scientific disciplines. In the absence of adequate computer science training, a scientist engaged in carrying out a major calculation may be misled by the ease with which a language like FORTRAN can be used into ignoring the subtle algorithmic issues that must be faced in constructing efficient, numerically sound programs. STRENGTHS AND WEAKNESSES IN THE ORGANI ZATION OF SCIENTIFIC CO - =ING m e fact that they must draw on the results of many types of scientific, numerical, and computer science research, plus the fact that practi- tioners of scientific computing are typically trained in scientific fields with well-established scientific literatures, makes this community particularly sensitive to the importance of improved methods, and particularly likely to use the latest algorithms and numerical methods, provided only that awareness of these techniques is present. The pitfalls inherent in the fact that scientific computing is generally conducted by researchers having very limited training in formal computer science are somewhat alleviated by the fact that much advanced scientific computing is conducted at sites where contact with trained computer scientists is easily established, for example, at

63 high-energy physics laboratories and major medical research centers attached to universities. Many such centers of large-scale scientific computing, especially the high-energy physics laboratories, attract numerical analysts from academia during summers and sabbaticals, partly because they provide access to the most advanced large-scale computing resources. Visits of this type furnish opportunities for numerical analysts to become aware of problems affecting major scientific compu- tations, and for applications-oriented scientists to become aware of the latest advances in technique devised by the academic researchers. However, the frequency and breadth of this interaction are limited by the geographic remoteness of many of the large laboratory centers of scientific computing from the nearest strong university computer science departments. In some cases, contact is also hampered by the fact that laboratory computing tends to be dominated by very strongly mission-focused scientific groups having only limited sympathy for generalized research in scientific computing, whose payoff is less direct and may lie several years in the future. Laboratory computing centers have also tended to fall behind academic ones in providing the first-class interactive software development tools to which the academic computer scientists have become accustomed. Often the old-fashioned batch-oriented computing environment of a large scientific computing center may strike an academic researcher as being unattractive, if not unworkable. In some cases, especially in the case of high-energy physics or medical centers carrying out large programs of applied scientific computation, a scientific computing center's ability to obtain the numerical and software consultation it needs may be hampered by an overwhelming ratio of computer users to trained specialists in scientific computation. Even if a common university connection links such a center to a university department, the typical department will have only a small handful of faculty members specializing in numerical analysis, and the number of applied statisticians available to a medical research establishment containing hundreds-of medical researchers can be equally limited. Although the ability of a computer science depart- ment to deal with a heavy demand for applied advice can sometimes be _ . . . . . stretched by using advanced graduate students as application advisors, a small department will soon find itself unable to cope with the requirements of a much larger scientific computing group. Industry's access to the results of academic research is less secure than that of governmental and nonprofit research laboratories. Although industrial research laboratories often attempt to maintain contact with academic work, for example, by sponsoring seminars and sometimes by recruiting academic researchers to spend longer periods in industry, the narrow focus of some industrial work, industry's relative indifference to academic publications, and occasional corporate reluctance to approve publication because of proprietary concerns all act to inhibit the university numerical analysts' willingness to invest effort at industrial sites. The relationship of scientific-computation-related pursuits to the broader field of computer science is also a source of some difficulties. AS we have noted, the relative weight of numerical analysis within the

64 world of computation that it once dominated has become quite small. Forced into a position of administrative dependence on such newer research fields as compiler writing, operating system design, data base theory, and artificial intelligence, workers in scientific computation find that they can no longer command a computing environment ideal for their requirements. m e decentralized small computers that most of their software-oriented colleagues prefer may be useless toys for numerical analysts interested in the problems characteristic of large scientific computations. Even where a large computer is available, the scheduling mechanisms that evolve at an academic center may give decisive preference to small interactive jobs and relegate all substantial computations to overnight or weekend processing. This, and the reluctance of the large mission-oriented laboratories to set aside significant blocks of time for mission-independent research, means that the aggregate of computing resource available to university investi- gators of basic numerical techniques has not improved over the past decade. Indeed, the computational resources available for this work are probably less powerful in relation to the scope of problems being attacked today than they were 10 or 20 years ago. Similar problems exist in industry. If scientific computing is only a minor part of the activity of an industrial laboratory, it may lack the managerial support needed for real expertise to be attained. A factor compounding the isolation of numerically oriented computer scientists within the larger body of their departmental colleagues is the fact that typical computer science training no longer provides the mathematical background needed to comprehend the issues that numerical analysts study. For this, familiarity with such mathematical subjects as ordinary and partial differential equations, eigenvalue theory, and linear and nonlinear optimization theory, to say nothing of various fields of physics and of other sciences, is required. However, mathe- matical training emphasizes discrete and combinatorial rather than continuous models. This means that most computer scientists would need to learn several additional branches of mathematics before they could really comprehend the characteristic problems of scientific computation. Another factor tending to isolate scientific-computation-oriented researchers from their colleagues in other branches of computer science lies in the simple fact that FORTRAN has remained the standard for numerical computation but is regarded with indifference if not disdain by most other computer scientists. Manufacturers have not followed that trend, as seen in the software made available by CDC, CRAY, and DEC's VMS for the VAX computer. Most computer scientists, however, use the UNIX software with the VAX, and here no capable FORTRAN is provided. Indeed, numerical concerns have come to occupy so marginal a position in computer science that even language and machine designers tend not to consider the provision of real arithmetic capabilities to be much of an issue. Even for compiler specialists, the problems of FORTRAN optimization have little appeal as compared to such newer issues as the analysis of encapsulated languages or the design of parallel processing features. The difference in emphasis perpetuates the isolation of numerically oriented specialists within computer science generally, often preventing their colleagues, some of whom might otherwise con-

65 tribute significant new algorithmic techniques to the enterprise of scientific computing, from comprehending the problems that need to be solved. A sociological consequence of the isolation on which we have commented in the last few paragraphs is that computer scientists who begin to devote a substantial portion of their research energy to the problems characteristic of scientific computing often begin to lose contact and credibility with a broader class of peers. As such contact is lost, the ability of a practitioner of scientific computing to generate publications of the sort that his nonnumerical colleagues will find appealing may be hampered. THE OVERALL HEALTH OF SCIENTIFIC COMPUTING As compared to many other of the areas surveyed in this report, research in scientific computing, that is to say, research in the several areas of mathematics, numerical analysis, and algorithm design that serve to advance scientific computing, is a university-dominated area. The results of this research are applied and disseminated by an informally recruited group drawn from scientific disciplines other than computer science itself. The main problem is that this unfocused form of recruiting is only marginally capable of producing a flow of people sufficient for the needs of laboratories and industry. The separate research disciplines that support the work of the practitioner of scientific computing are by comparison flourishing. The general literature on algorithms of potential use to scientific computation has grown impressively and is continuing to grow rapidly. Though small, the core numerical analysis research community is sound and productive. We cannot, however, be so sanguine concerning the activity of scientific computation itself. The difficulties that those contributing to this activity find in mastering the many subjects whose results they need to use are compounded by the lack of any well- identified intellectual home for research and training in scientific computing. m ough a well-trained physicist or chemist certainly has the intellectual background to master almost any topic in computer science, the rapid growth of the computer science literature makes it increasingly difficult to acquire a well-rounded view of this sprawling field without extensive formal training. The coming age of large-scale parallel computation, and the probability that specialized very-high- performance VLSI chips will need to be integrated into the scientific computing environment, will accentuate these difficulties. Moreover, even though researchers active in scientific computation at universities with powerful computer science departments may find it feasible to acquire bits and pieces of numerical analysis and of computer science by osmosis from colleagues as this information is needed, the link between computer science and research laboratories carrying out major scientific computations is growing weaker, and the link with industry weaker still. m us, unless training in scientific computing is progressively strengthened as the demands on its practitioners continue to mount, there is reason to fear that industry's ability to carry out

66 large scientific computations effectively will be injured, and that major laboratories may also suffer, though to a lesser extent. Finally, the scientific computing community may find itself unable to contribute adequately to the complex of architectural decisions concerning highly parallel machines, distributed microprocessor systems, and large vector processors, that will need to be made during the next few years. Two initiatives would help to alleviate the problems we have described. The first would be to accelerate and structure training in numerical scientific computing by establishing a few model training programs specifically oriented to this field at one or two universities with strong numerical analysis groups and a tradition of large-scale scientific computing. The second is to ensure access for university scientific computing groups to the most powerful computers available at any given moment. Such access is particularly necessary if these groups are to contribute the new techniques and algorithms needed to optimize the use of new machines likely to flow out of the present technological ferment.

Paper 8 RESEAP~H IN DATA PROCESSING: THE PRI - CY OF PRACTICE Data processing can most appropriately be defined as that branch of computing that is principally concerned with data-intensive applications, i.e., situations in which a data base is the central organizing component for the application as a whole and in which the amount of computation per unit of data is typically small. Histori- cally, the term data processing has been associated with Business applications,. such as basic accounting systems. More generally, data processing is concerned with the use of the computer to support information systems in which the computer serves principally as a repository of information to support the operations, the control, or the planning of an organization or enterprise. The characteristics of data processing systems have changed over time, evolving from isolated programs based on files having rather elementary structure to integrated large-scale information systems based on shared data bases. But the importance of this activity to the field of computing as a whole has remained unchanged. Data processing accounts for by far the greatest proportion of actual computer usage; as organizations become more complex and the demands on them more exigent, data processing software will have to continue to evolve and advance. However, despite the prominence of this pragmatic subfield of computer science, comparatively little research (as opposed to product development) has focused on the problems characteristic of data processing. Very little of this research is classifiable as basic, and of the little research devoted to this field, even less can be counted as truly effective. A quick overview of the development of the data processing field will confirm this assertion. Major milestones include the following: 1. Advances in programming languages, especially COBOL, and more recently a number of higher-level or procedural languages such as NOMAD. 2. m e evolution of the concept of a data base management system and of the technology underlying it, together with such ancillary notions as query languages and data dictionaries. 3. The emergence of decision support systems combining facilities for retrieval analysis and manipulation as a means of providing decision makers with tools to support them. 4. A variety of application development systems, ranging from transaction processors to automatic documentation facilities, which 67

68 simplify and streamline the process of building data processing application programs. By and large, these major advances have not resulted from research programs conducted either in universities or in industrial research centers. Rather, as detailed below, they have grown out of field practice. That is, advances in data processing have typically been generated by practitioners with very immediate problems seeking incrementally improved but still feasible techniques for coping with them. For example, COBOL evolved from earlier programming languages for business applications. Mark IV and a number of file management systems are descendants of a sequence of systems developed by a number of commercial organizations, and most current data base management systems (especially IMS and the CODASYL family of systems) were initially developed by user organizations rather than by farsighted hardware or software systems vendors. The leaders in data processing innovation tend to be those groups and organizations most directly and broadly in contact with real field requirements. As noted, research concerned with several subareas of data processing is not entirely absent. In particular, research in data base management systems has been pursued at a number of universities, although, as already noted, the initial identification of this whole field must be credited to industrial and commercial users. Subse- quently, several universities (most notably the University of Michigan and the University of Pennsylvania, but others as well) have built up a substantial tradition of data base research. Similarly, a number of industrial research laboratories (most notably the IBM San Jose Laboratory and the Computer Corporation of America) have contributed to this area. By now, formally organized data management research has grown to the extent that it now has its own journal, the ACM Transactions on Database Systems. Several annual conferences for the presentation of research results have been organized, and a thriving research community has grown up. However, with a few exceptions, the impact of this research on data base management systems used in the field has been quiet and small. The most noteworthy case in which theoretically minded researchers have contributed techniques of real practical interest is in the development of the B*-tree and of various other closely related data access structures. Though its history can be traced back even further, the B*-tree as currently known traces back to initial work by Bayer and McCreight, one from the industrial and one from the academic research community. The superiority of the techniques they introduced was recognized at once, and the B*-tree idea was rapidly and widely disseminated to become the file structure of choice in modern data base systems. Another data management research theme, worked out by theoretically minded researchers during the 1970s, is that of the relational data base model. This idea, first proposed by Cod d of IBM research in 1970, seeks to simplify the user's view of the data base and the access languages with which he utilizes and manipulates it. However, the first relational systems are just now beginning to appear

69 on the market; the full impact of this research will probably not be seen for another year or two, and its pragmatic value remains to be proven. Research beyond the two efforts cited, some of which is having a measureable impact on the world of operational data base systems, could be adduced, for example, studies of the design of new query languages, methods for modeling the performance of data base access methods and file structures, and studies of the formal semantics of data base systems. Some of this work can even be characterized as relating to the deeper semantics of data base systems. However, what has charac- terized a trend in recent research in data base management is its distance from the world of applications. It is as though the research community has seized on the concept of data base management as an important issue worthy of study in its own right, and then proceeded to investigate it from purely aesthetic points of view having little to do with the problems that concern data base users. Increasingly ratified, much of this research has lost its vitalizing contact with the practical world of data processing that motivated data base systems in the first place. Other research programs originally inspired by the pragmatic concerns of the practical software builder can claim even less success when measured in terms of impact on applied research and development. Much of the enthusiastic academic work in automatic programming, i.e., the numerous attempts to develop systems with embedded expertise that would enable data applications programs to be built more easily, seems to have miscarried. Despite major funding from the Defense Advanced Research Projects Agency, this research did not achieve any major results in the area that it was addressing (although some potentially important groundwork was laid for subsequent research in natural language processing and formal program specification techniques). A second area of research concerning which a similarly pessimistic conclusion seems justified is found in the area of decision-support systems. Little has been transferred from this research to commercial environments. The main value of this research seems to have been in promoting the notion of decision-support systems as an important direction for data processing and information systems. However, the concept of decision-support systems did not in fact originate in research centers such as the MIT Sloan School of Management and the Wharton School, both of which have been active in this area. Rather, the decision-support notion was initially promulgated by innovative user organizations that were seeking to provide decision makers with more ready access to information. The role played by the research groups was the narrower one of identifying, conceptualizing, formal- izing, and disseminating this innovation, which, like many other innovations in the data processing field, grew directly out of the pressures of practice. We can summarize the preceding observations as follows. Research in data processing exhibits a different pattern from that observed in most other areas of science and many other areas of computing. The research laboratories are not in the position of making independent discoveries that the rest of the world then adopts. Rather, practi-

70 tioners are the driving force in the field in terms of identifying the new problems worthy of solution, finding innovative solutions for them, and applying ideas once they have been codified by the research community. The role played by the research community is that of an intermediary: researchers formulate perspectives covering the field as a whole and advertise new and important intellectual trends. When research laboratories working in data processing become disconnected from the practical world of data processing and lose touch with practi tioners, who are both the source and the consumers of the researchers' general formulations, their work falls prey to an occupational hazard: it becomes artificial and irrelevant to practice. Such research can often be the most appealing "academically," in the narrow sense of precision and formality. However, it is often research that bears little fruit and is destined to have little impact. These observations suggest that for the conduct of effective - research in data processing, close working relationships between the practical world of data processing and the research community, including personnel transfer, consulting, etc., are an absolute necessity. The vital advantage of a software builder in the field is that direct pressures tell him what commercial end users want and need to get the job done.

Paper 9 1 SOFTWARE DEVELOPMENT THE PROBLEM The history of software development over the past 30 years has led to the gradual and painful realization that programming a computer is inherently difficult. m at realization is now deeply embedded in the consciousness of computer scientists and is one factor that separates the professional from the amateur programmer. A computer program is one of the most complex artifacts ever created by mankind. Even a simple, several-hundred-line program can contain as astronomical number of control paths; many programs contain hundreds of thousands of lines, each of which must be exactly correct or the program will eventually fail. Experience has demonstrated the enormous difficulty of precisely specifying, designing, implementing, and validating programs, and in managing programming projects. m e problem of efficiently producing reliable software has been recognized for many years by industry, universities, and government funding agencies. ~ ~ ~ Although significant resources have been expended on research in this area, the results--as measured by improved programmer productivity and program reliability--have been disappointing: · The productivity of software implementors has been increasing at less than 3 percent per year. · Typical software released from the quality control operations of military contractors contains one or two bugs per hundred lines of code and one or two serious bugs per thousand lines of code. RESEAL STY In one important sense, virtually all research in computer science is related to software development. Research results in any area of computing usually imply the ability to produce better software in that area. For example, programmers familiar with research results in data bases or compiler design can implement programs in those areas more rapidly and with fewer errors than someone without that knowledge. Often research results can be encapsulated in software packages, such 71

72 as compilers or data base managers, so that even programmers unfamiliar with the results can use them in programs. A strong case can be made that most of the productivity and reliability gains over the past 30 years have come from this source--that is, from advances in the science of programming. Our main concern here, however, is with software tools, programming methodologies, management approaches, software development systems, and computing environments, all of which can also lead to improved productivity of programmers and improved reliability of programs--this time from advances in the art and the engineering of programming. Any attempt to list the concepts of this type that have had major effects on productivity is certain to be controversial, but would undoubtedly yield a short list. One such list is as follows: 1. High-level languages, starting with FORTRAN in the late 1950s, evolving through several generations of new languages, and leading to present languages such as PASCAL and ADA. The resultant productivity increase is due to the fact that while a programmer can generate the same number of lines of debugged code a day in a high-level language as in a lower level one, one line of code in the higher level language is equivalent to many lines in the lower level language. 2. Interactive computing environments, starting with time sharing in the early 1960s, and leading to video editors and text processing in the 1970s, with the promise of widespread use of personal computer work stations, and integrated development environments in the 1980s. The productivity increase is due to faster and easier interaction between the programmer and the computer. 3. Programming methodologies such as top-down design and structured programming, large portions of which have become part of the programming culture and are often used without much thought as to their origin. The productivity increase is due to increased discipline in the design and coding process. 4. Management methodologies such as the use of written require- ments, specifications, designs, and test plans. The productivity increase is due to increased discipline in the programming project and increased control by the project manager. Several lessons can be inferred from this brief history: 1. Important research in software development has been done in both universities and industry. (FORTRAN was developed in industry, PASCAL at a university, time sharing at universities, and the personal computer work station in industry.) 2. The major research results have been obtained by groups who were living through real problems in development software. Time sharing was developed by universities who needed an environment for student programming; management methodologies were developed by industrial groups who were having difficulties in managing large software projects. 3. Major portions of the research in this area-are experimental in nature and require the availability of experimental state-of-the-art

73 hardware, which may appear at the time to be uneconomical. (Initially, time sharing was thought to be a wasteful use of resources, the original work on personal computer work stations was done when such work stations were too expensive to be routinely supplied to all programmers.) CONCLUSIONS Al CONNATIONS Research in software development has been disappointingly slow in yielding results that significantly improve programmer productivity and program reliability. Nor is there any reason to expect that imminent breakthroughs will significantly increase the historic 3 percent annual productivity improvement. Nevertheless, research in this area is of utmost importance to the national economy. Andres Grove, president of Intel, has estimated that to program all the microprocessor applications that will exist in 1990 would require over a million programmers, if we were forced to use today's programming technology. It is crucial that research be conducted with the goal of making important advances in that technology. The best use of funding resources in the 1980s may be for the following: 1. Increased exposure of both academic and industrial computer science researchers to real-world problems of software development. 2. Continued revitalization of experimental computing facilities both in industry and in universities. 3. Support of significant experiments with promising software productivity tools as these emerge. Important results are most likely to come from researchers who understand the real-world problems and have the experimental facilities necessary to solve them. History suggests that advances in software development naturally occur in certain kinds of environments sites that face pressing problems in constructing software that they require and that have the resources (both in computational ability and in personnel) to devote to the problem. Rather than mounting major formal research projects to solve the "software problems we would be better advised to seek out these kinds of environments, provide them with the needed resources, and allow natural processes to take over.

Next: APPENDIX: INDUSTRIAL ATTITUDES TOWARD RESEARCH »

Roles of Industry and the University in Computer Research and Development (1982)

Chapter: A REVIEW OF NINE CHARACTERISTIC SUBAREAS OF COMPUTER SCIENCE

Welcome to OpenBook!

Get Email Updates