1
What Is Experimental Computer Science and Engineering?

Computer science is younger than most academic disciplines, and its partitioning into experimental and theoretical components has occurred more recently still. As used in this report, experimental computer science and engineering (ECSE) refers to the building of, or the experimentation with or on, nontrivial hardware or software systems. ECSE as a field is a constituent of the intellectual revolution begun by the invention of the electronic computer in the 1940s and its subsequent commercialization.

Although the building of the first (experimental) computers fits the definition given above, the field did not emerge until perhaps the mid-1960s as an identifiable research discipline distinct from either numerical computation or what has since become known as theoretical computer science. With the need for ECSE faculty greatly exceeding the supply, the Feldman report1 in 1979 identified the limited availability of computer hardware as the principal constraint on the production of Ph.D.s in the field. The National Science Foundation (NSF) initiated the highly successful Coordinated Experimental Research (CER) program as a remedy.

1. Feldman, Jerome A., and William R. Sutherland. 1979. "Rejuvenating Experimental Computer Science," Communications of the ACM (September):497–502.





Today, many schools on the Forsythe list2 have one or more experimentalists, and more than half of the members of some departments are ECSE faculty.

Experimental computer science and engineering programs at U.S. universities differ significantly. Some schools, typically those with large groups of experimentalists, are quite able to build and maintain the faculty and infrastructure necessary to conduct significant experimental research. Highly trained graduates are produced, and technology is created that is valuable to national competitiveness as well as important for its scholarly content. Junior faculty are mentored to be successful at promotion time. At these schools, the tenure and promotion process seems to accommodate the particular characteristics (enumerated below) of experimental research that complicate an academic ECSE career.

However, a much larger number of schools—perhaps characterized by their smaller size or accidents of their development—have few experimentalists and little experimental activity under way. These schools include some that otherwise have strong reputations. Many such schools present a career environment that new assistant professors in ECSE often perceive as difficult or hostile. Many ECSE faculty at these "nonexperimental" schools—untenured and tenured alike—described for the committee the difficulties of creating and maintaining research environments appropriate for their needs; further, they reported a strong belief that promotion required them to do theoretical research as assistant professors in order to gain the respect of senior faculty and produce enough publications to meet the tests of the "paper counters" within the department, college, and university.

To the extent that these latter views are valid, there are several major implications. In the absence of a fair and balanced academic reward system for ECSE faculty, promising and talented experimental computer scientists and engineers may well forsake academic life in disproportionate numbers, leaving an academic community unduly weighted toward theoretical work and increasingly irrelevant to computing practice. Such a shift would serve students poorly, since a balanced education includes instruction in the state-of-the-art technologies that are essential for productive careers. Such instruction is most effectively provided by faculty engaged in cutting-edge experimental research.

2. The set of academic departments offering a doctorate in computer science or computer engineering is known in the field as the "Forsythe list" after the late George Forsythe, a professor of computer science at Stanford University, who originally compiled it.

Moreover, because academic research in ECSE is critical to the continued technological preeminence of the United States in computer hardware and software, the competitive advantages that derive from such research may be dulled if that research infrastructure is weakened.

ORGANIZATION AND SCOPE OF THIS REPORT

The remainder of this chapter sets experimental computer science and engineering in context, defines it, describes defining characteristics of its research methodologies, and then elaborates on the need to conduct ECSE research at universities. Chapter 2 describes the requirements of academic career development in ECSE, along with the infrastructure and support needs of experimental computer scientists and engineers. Chapter 3 deals with the educational dimensions of academic ECSE. Chapter 4 addresses both the philosophy of evaluating ECSE research and some of the practicalities related to such evaluation. Chapter 5 describes the committee's judgment about the characteristics of a positive environment for academic ECSE. Chapter 6 describes the special needs and concerns of non-doctorate-granting and less recognized institutions. Chapter 7 provides the key findings and recommendations of this report.

The scope of this report was limited by the committee's charge and the resources available to the committee. In particular, no systematic attempt was made to survey the entire computer science and engineering community or to characterize all institutions in which ECSE research is undertaken. Rather, the committee developed its insights through its own deliberations and its informal, although extensive, contacts with other experimental computer scientists and engineers. As a check on its insights and conclusions, the committee made substantial use of several informal surveys sent to the ECSE community with the cooperation of the Computing Research Association (CRA);3 these surveys are referred to collectively throughout this report as "the CRA-CSTB survey." Appendix A describes the survey instruments in more detail. Appendix B provides a quantitative comparison between two modes of publication: journals and conferences.

3. The Computing Research Association is a service organization for the computer science and engineering (CS&E) research community. It is supported primarily by academic CS&E departments, whether doctorate granting or not.

The committee did not examine in detail analogous problems in other disciplines. It may be that many of the problems, real or perceived, discussed in this report are also encountered by biotechnologists, materials scientists, and other academic researchers in scientific or engineering disciplines closely linked to practice. Some comparisons with other fields are given later in this chapter.

COMPUTER SCIENCE AND ENGINEERING

To set the context for defining experimental computer science and engineering in detail (see following section), it is necessary to consider the field of computer science and engineering (CS&E) as a whole. Defining computer science and engineering is not a trivial matter, and multiple definitions exist. Two are considered here from the point of view of experimentation.

A recent CSTB report, Computing the Future, characterizes CS&E as a field.4 Although it does not explicitly identify CS&E's experimental components, experimentation is indeed an important aspect of many of the field's subdisciplines. Computing the Future defines the essential intellectual content of the field in the following terms:

    [C]omputer scientists and engineers focus on information, on the ways of representing and processing information, and on the machines and systems that perform these tasks. (p. 19)

It then identifies the key intellectual themes of the field as algorithmic thinking, information representation, and computer programs. Furthermore, the report cites accomplishments in five subdisciplines: (1) systems and architectures; (2) programming languages, compilers, and software engineering; (3) artificial intelligence; (4) computer graphics and user interfaces; and (5) algorithms and computational complexity. Although each of these subdisciplines has an experimental and a theoretical component, all but the last have been domains of intensive experimental research.

Another description of the field, "Computing as a Discipline" (also known as the Denning report), takes a different approach to defining the field, one that has both strengths and flaws with respect to experimental research.5 The report enumerates subareas of the field, an approach that is both inclusive and respectful of natural partitionings of the discipline (Box 1.1).

4. Computer Science and Telecommunications Board. 1992. Computing the Future. National Academy Press, Washington, D.C.

5. Denning, Peter, Douglas E. Comer, David Gries, Michael C. Mulder, Allen Tucker, Joe Turner, and Paul R. Young. 1989. "Computing as a Discipline," Communications of the ACM 32(1):9–23.

BOX 1.1 A Taxonomy of Computer Science

Algorithms and data structures
Programming languages
Computer architecture
Numeric and symbolic computation
Operating systems
Software engineering
Databases and information retrieval
Artificial intelligence and robotics
Human-computer interaction

SOURCE: Denning, Peter, Douglas E. Comer, David Gries, Michael C. Mulder, Allen Tucker, Joe Turner, and Paul R. Young. 1989. "Computing as a Discipline," Communications of the ACM 32(1):9–23.

The report then classifies the content of each of the subareas via "three basic processes—theory, abstraction, and design—that are used by the disciplinary subareas to accomplish their goals." However, this tripartite classification ignores the role of experimentation. For example, either by dictionary definition or common usage in the field, performance evaluation—the activity of understanding how well hardware or software systems perform—would not be described as theory, abstraction, or design. It is experimentation, and it is an important subdiscipline of ECSE. This and other experimental topics cannot be found in the taxonomy. Indeed, constructing artifacts, a defining property of ECSE as explained below, is not included in the trinity. The Denning report must continually append to the design component the phrase "and implementational issues" to make the connection to constructing artifacts. A formulation recognizing an experimental and a theoretical component in each topic area might have been more descriptive of the field.

For the purposes of this report, CS&E is defined by the above-quoted "representing and processing information" definition from Computing the Future. The topic areas of Computing the Future or the Denning report, broadly construed, can serve as a high-level decomposition of the field. Further, most phenomena in CS&E are being studied by using both theoretical and experimental methodologies.

Partitioning the field into experimental and theoretical topics is therefore difficult.6 To the extent that any research meets the definition of ECSE in the next section, the findings of this report apply.

DEFINING CHARACTERISTICS OF ECSE

Experimental computer science and engineering was defined in the opening paragraph of this chapter as "the building of, or the experimentation with or on, nontrivial hardware or software systems." Although sufficient for an introduction, this one-sentence definition is not precise enough to describe the field exactly. Better definitions would use terms such as computational artifact that presuppose an understanding of the field that would obviate the need for a definition. Moreover, they would assume an appreciation of subtle distinctions such as the differences between the nature of experimentation in ECSE and experimentation in physics or biology. Thus, a detailed description of ECSE's characterizing features is presented before returning to the matter of a succinct definition at the end of this section. In addition, because subtle distinctions can be appreciated only after the overall context is understood, the discussion of artifacts is presented in a general fashion, with additional fine (but still important) points raised in the section immediately following this.

Experimental computer science and engineering is defined here in terms of six essential characteristics that, if not unique individually, collectively define a unique field of intellectual depth directed toward understanding diverse phenomena. Perhaps the most critical property is that ECSE focuses on computational artifacts such as hardware or software systems. However, to introduce artifacts properly requires that the synthetic nature of the discipline be treated first. After its synthetic nature is discussed and artifacts are introduced, four properties of experimental research are covered—the complexity of artifacts, their dependence on technology, the universality of the phenomena, and the nonreliance on theory.

6. In Computing the Future (1992, p. 194), the definition of theoretical computer science is discussed in "A Note on Terminology." Although the assertion that theory is usually construed too narrowly is certainly true, the revised definition, "all nonexperimental work in CS&E intended to build mathematical foundations and models for describing, explaining, and understanding various aspects of computing," is somewhat vague because experimentation could equally be defined as "all nontheoretical work ...." In fact, formulating models for describing, explaining, and understanding various aspects of computing in whatever form is fundamental to the field as a whole, whether the research is experimental or theoretical in nature.

ECSE Is a Synthetic Discipline

Experimental computer science and engineering shares with other branches of CS&E the fact that it is largely a synthetic discipline. That is, the phenomena studied by most practitioners have been created by a person rather than being "given" by nature.7 There are fundamental truths, just as there are in mathematics, but computers and information processing are entirely the creations of human beings. This synthetic quality comes into contact with physical phenomena at the extremes of the field (e.g., in metal-oxide semiconductor (MOS) technology for chips or in properties of light reflectance in graphics), but generally the subject matter is synthetic.

With few direct physical constraints, the practitioner has wide latitude to be creative. This possibility of being very imaginative and then implementing one's ideas in a working computer system is one of the exciting rewards of ECSE. However, the synthetic property also introduces complications into the field that may not be encountered by researchers in other fields. Examples described below include difficulties in conveying the intangible qualities of very creative research, complications in assessing the contribution embodied in an artifact, and the interrelatedness of experimental systems. The less constrained quality of this synthetic discipline can be at once liberating to the imagination and sometimes at odds with the traditional assumptions of academic career development in the sciences. In later chapters, this point is discussed further.

ECSE Focuses Primarily on Artifacts

An artifact in ECSE is an instance or implementation of one or more computational phenomena. Because the phenomena being studied are processes, algorithms, mechanisms, and the like that manipulate and transform information, the artifact will embody such manipulations and transformations.

7. Computer science and engineering is synthetic in the sense of being man-made. Another sense of "synthetic"—the composition of parts or elements—also applies to the field, as it does to other engineering disciplines, to describe its focus on the organizing principles of systems. The man-made sense is the more significant in terms of characterizing the field. See Simon, Herbert. 1969. The Sciences of the Artificial. MIT Press, Cambridge, Mass. Additionally, the committee recognizes the philosophical debate over whether mathematics is invented or discovered. However, a resolution or a detailed discussion of that subject is not appropriate for this report.

Examples of artifacts are hardware systems (such as computers) or software systems (such as text editors). The artifact can be the subject of study, the apparatus with which to conduct the study, or both. It often embodies a substantial portion of the intellectual contribution of experimental research, and its creation represents a significant intellectual effort.

The term artifact is often used here and in the field synonymously with electronic hardware or software systems, but it should be construed much more broadly. Thus, in addition to hardware systems such as computers, chips and circuit boards, and software systems, including compilers, editors, expert systems, computer-aided design (CAD) tools, and so on, experimentalists would likely include graphic images and animations, robots, certain hard-to-construct data files including multiprocessor execution traces, test and benchmark suites such as the International Symposium on Circuits and Systems (ISCAS) suite, structural descriptions such as the Utah Tea Pot, and so on.8 Other things that experimentalists build and study might be classified as artifacts by some, but not all practitioners would agree. Programming languages, architectures, protocols, and methodologies, such as object-oriented programming, the spiral approach to software development, and domino logic, are examples. The definitional issue in some cases derives from the fact that certain artifacts (e.g., an interpreter for Pascal) are implementations of abstractions (e.g., the Pascal programming language), which in turn could be thought of as implementations of some still more abstract concepts (e.g., procedural imperative programming languages). What is the concept and what is the instance? It may depend on how abstractly one thinks; Table 1.1 illustrates one mapping between ideas and artifacts. However, it is unnecessary for the purposes of this report to be perfectly definitive, because the problems outlined below pertaining to the creation and study of artifacts apply to both the narrow and the broad interpretations.

Artifacts are central to ECSE because the phenomena studied—processes, algorithms, mechanisms, and the like—transform information, describe particular behaviors in response to inputs, and generally are very complex in terms of the total number of constituent parts. Such characteristics almost always overwhelm our ability to understand them by direct analysis. They are simply too complex.

8. The ISCAS suite is a set of circuits designed and used to test computer-assisted tools for chip design. The Utah Tea Pot is a geometric description of a china teapot with lid that is used as a standard picture for graphics-rendering algorithms.

TABLE 1.1 Mapping Ideas to Artifacts

Idea: Extra registers to speed context saving on procedure calls
Artifact: Microprocessor chip with "register window" and a compiler that generates code to use them

Idea: Avoiding global clock synchronization to speed simulation of parallel machines
Artifacts: A "conservative" simulation that advances individual clocks independently, but only if all earlier events have been completed; an "optimistic" simulation that advances individual clocks independently but "rolls back the clock" when it turns out that clock advance was inappropriate

Idea: Better software engineering methodology
Artifacts: Tools to enforce the methodology; building programs using this methodology

Moreover, what is often important is the interaction of the parts (i.e., their dynamic behavior). Both the large number of program or hardware states and the temporal characteristics of the interaction exacerbate the problem of predicting how well a given computational idea will perform on the basis of a purely logical or theoretical analysis. Consequently, the processes, algorithms, and/or mechanisms must be implemented so that the behavior of the system and the interaction of the components can be observed in action.

Artifacts serve three easily identifiable roles in ECSE research, although there are probably others. The somewhat cumbersome but perhaps suggestive names, proof of performance, proof of concept, and proof of existence, will be used for these roles. (Examples are given in Box 1.2, and further discussion of these roles follows.)

Proof of Performance

An artifact acting in the proof-of-performance role provides an apparatus or testbed for direct measurement and experimentation. The artifact exists or can be constructed, and the results produced are usually quantitative. This is perhaps the most typical artifact of ECSE research. A good example of an artifact in a proof-of-performance role is the peephole code optimizer.9

9. Fraser, Christopher W. 1979. "A Compact, Machine-Independent Peephole Optimizer," pp. 1–6 in Proceedings of the Sixth ACM Symposium on the Principles of Programming Languages, Association for Computing Machinery, New York.
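Before turning to the examples in Box 1.2, the parallel-simulation entries in Table 1.1 can be made concrete with a minimal Python sketch. The class, channel, and event names below are invented for illustration and are not drawn from any particular system; the sketch shows only the "conservative" rule, under which a logical process advances its clock to the next pending event only when no neighbor can still deliver an earlier one. An "optimistic" simulator would advance anyway and roll back if that advance turned out to be wrong.

    import heapq

    # Minimal sketch of the "conservative" clock-advance rule from Table 1.1.
    # A logical process (LP) may handle its next pending event only when no
    # neighbor can still send it an event with an earlier timestamp.
    class LogicalProcess:
        def __init__(self, name):
            self.name = name
            self.clock = 0.0
            self.pending = []          # heap of (timestamp, event description)
            self.channel_clocks = {}   # neighbor name -> latest timestamp seen

        def receive(self, neighbor, timestamp, event):
            self.channel_clocks[neighbor] = timestamp
            heapq.heappush(self.pending, (timestamp, event))

        def safe_time(self):
            # Nothing earlier than the minimum channel clock can still arrive.
            return min(self.channel_clocks.values(), default=float("inf"))

        def try_advance(self):
            if self.pending and self.pending[0][0] <= self.safe_time():
                self.clock, event = heapq.heappop(self.pending)
                return event           # safe: advance the local clock
            return None                # block and wait; an optimistic simulator
                                       # would proceed and roll back if wrong

    lp = LogicalProcess("LP1")
    lp.receive("LP0", 5.0, "packet arrives")
    print(lp.try_advance(), lp.clock)  # handles the event at simulated time 5.0

Which rule performs better on real workloads is precisely the kind of question that, as Table 1.1 suggests, cannot be settled on paper; the simulators must be built and measured.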

BOX 1.2 Examples of Artifacts in Different Roles

Proof of Performance

The original RISC prototypes were built to demonstrate the performance and implementation advantages of what became known as reduced instruction set computers. These implementations were critical to verifying the claimed advantages of the approach.

The Sprite operating system was designed to demonstrate several improved techniques for implementing a parallel and distributed operating system, such as the efficiency of new methods of caching and the effectiveness of log-structured file systems.

Proof of Concept

The Cosmic Cube was one of the first successful multiprocessor computers using off-the-shelf microprocessors. This machine demonstrated that a range of open problems in building this type of multicomputer were surmountable and that the resulting machines could be used as computational engines in several important scientific problems.

The Geometry Engine demonstrated that it was possible to use very large scale integration (VLSI) technology to build a much lower cost implementation of what had been very expensive 3-D graphics hardware.

Ethernet demonstrated the feasibility of building a local area network with good performance and low cost.

Information hiding demonstrated improvements in the modularity of programs when the internal structure of those programs was completely concealed from and inaccessible to programmers not responsible for that structure.

The cut-copy-paste desktop metaphor demonstrated its usefulness in thousands of applications for nonprogrammers by drawing close analogies between computer manipulations of information and more familiar scissors-and-tape operations on paper-based text.

Proof of Existence

The ALTO was the first personal computer/workstation to unify several technologies: a bit-mapped screen, a mouse, a local computer, and a local area network. This integrated machine was truly different from computers that had been built before, and it had a profound impact on the development of both workstations and personal computers.

Simula was the first language to introduce the ideas of encapsulation and data abstraction. As such, it created a new set of language features and demonstrated their usefulness. The ideas of Simula have had a profound impact on language design.

Fraser observed that compilers (i.e., programs that translate high-level computer languages into the binary instructions that are executed by computers) often generate redundant instructions, such as loading data into a register when the register already contains those data. This redundancy is the result of the compiler's strategy of translating high-level language statements one at a time. He conjectured that by examining the generated instructions a few at a time (i.e., through a "peephole"), a program optimizer could eliminate many redundancies and thus speed up processing time. Fraser added an optimizer to an existing compiler and discovered that enormous improvements were possible. Indeed, his optimizer was so successful at removing unnecessary instructions that the time saved in not having to perform other types of processing on those instructions exceeded the time needed to perform the optimizations (i.e., the savings exceeded the cost, and the optimization introduced "negative overhead").

Proof of Concept

An artifact acting in the proof-of-concept role demonstrates by its behavior that a complex assembly of components can accomplish a particular set of activities, behavior that could not be argued simply by logical reasoning or abstract argument from first principles. To illustrate by analogy how this role helps ECSE researchers understand complex systems, imagine that frogs did not exist but were being created for the first time. Could a credible case for a frog be made without exhibiting one? The image-processing capability of a frog eye might be describable; the muscle structure of the mouth and tongue, and its actuation by the nervous system, could be comprehensible; and the mechanisms of the nervous system and brain could be explained. Yet using logical argument alone to convince a skeptic (the standard of science) that this collection of mechanisms and processes could catch a passing fly would be inconceivable. The dynamic behavior of this system, whose interacting parts can exhibit a multitude of states or configurations, is too complex to be defended convincingly by direct analysis. Proof in terms of a demonstration is necessary. The working system, the artifact, is a witness "proving" that the concepts in at least one configuration are correct.

Experimental computers are good examples of proof-of-concept artifacts. A key problem to be solved in parallel computing is how the processors can avoid long delays in accessing or referencing memory. Solutions to this problem have motivated many different computer designs serving proof-of-concept roles. One design uses caches (i.e., small, fast local memories that hold copies of recently referenced data).
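Returning briefly to the proof-of-performance example, the sketch below shows in toy form the kind of redundancy a peephole pass removes. The three-field instruction format is invented purely for illustration; it is not Fraser's machine-independent optimizer.

    # Toy peephole pass: slide a two-instruction window over the code and drop
    # a LOAD whose register already holds the value just stored to the same
    # address -- the kind of redundancy described above.
    def peephole(instructions):
        optimized = []
        for inst in instructions:
            if (optimized
                    and inst[0] == "LOAD"
                    and optimized[-1][0] == "STORE"
                    and optimized[-1][1:] == inst[1:]):   # same register, same address
                continue                                  # redundant load: drop it
            optimized.append(inst)
        return optimized

    code = [("STORE", "r1", "x"),   # x := r1
            ("LOAD", "r1", "x"),    # redundant: r1 already holds x
            ("ADD", "r1", "r2")]
    print(peephole(code))           # the redundant LOAD disappears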

Compressing data before transmission, for example, does not pay off when the computers that perform the compress/decompress operations are so slow that it would be faster simply to transmit the data uncompressed; however, when computer speeds improve faster than data transmission rates, then the compress-transmit-decompress approach might pay off. In the 1960s, the high cost of individual logic gates made logic gate minimization an important problem in circuit design; the advent of integrated circuits made gates so cheap that minimization became unnecessary. Then in the early 1980s, when chip area was at a premium, area minimization became a hot theoretical topic, only to have the subsequent submicron feature sizes of semiconductors render this a problem of negligible importance. The use of a cutting-edge technology for a project potentially subjects it to the hazards posed by such technologies (e.g., instability, errors, and delays), whereas using a stable, mature technology carries with it the risk of obsolescence before the project is finished.

Reliance on technology is probably obvious for many areas of ECSE, such as hardware and architecture, graphics, and communications. Indeed, it is the advances in technology that often open new research opportunities in ECSE. Examples include small disks and redundant arrays of independent disks, very large scale integration (VLSI) technologies, and reduced instruction set computers (RISCs).

Software is critical too, often in the form of CAD tools or other facilities to aid in managing complexity. Experimental software artifacts require significant software technology infrastructure, although this is not widely appreciated. Such software takes different forms, including developmental tools such as advanced programming languages or "tool kits," subsystems for performing sophisticated kinds of analysis such as dependence analysis or symbolic expression transformers, and standard "parts" such as symbol tables, YACC, and window systems. Such dependence on technology means that nearly all experimental software systems rely on many components (modules and subsystems) that are peripheral to the specific experiment, with a corresponding increase in system complexity. It is not possible or wise for the experimentalist to create all of this software anew, and yet for the experiment to be "complete" these components are essential. So the ECSE researcher must acquire this peripheral software and build interfaces between it and the rest of the system. The quality of the resulting experiment depends on the availability and quality of this software as much as it relies on the ingenuity of the experimentalist.

The other side to the relationship between ECSE and technology is the fact that in recent years, ECSE research has increasingly been responsible for technological advancement in the computer field.

Whereas product development teams for the large computer manufacturers may have been responsible for most of the innovation in the 1960s and 1970s, many widely known computer advancements of the 1980s trace their origins to ECSE research. Examples are known to the average user—RISC processors, networking, window systems, UNIX, relational databases—but many more are hidden inside systems, making them faster, more efficient, or more functional, or they are part of the technological infrastructure that supports the rapid innovation so characteristic of the computer field.

Computing Artifacts Are Universal

A fifth characteristic of ECSE concerns the fact that computers are malleable and versatile. Unlike other machines, computers are universal, which means that within broad limits, whatever one machine can do, all machines can do. Whereas a washing machine only washes and a coffee grinder only grinds, virtually all computers are capable of word processing, circuit optimization, processing spreadsheets, simulating galaxies, and so on. The speed of the processor, its memory size, or the suitability of its peripheral devices (e.g., its monitor) may make the performance of such tasks impractically slow or cumbersome, but they are possible in principle.

Although this property is extremely convenient in many respects, it introduces a serious complication: there is no a priori limit on the functionality of computers, which leads to ever-expanding expectations for the capability of artifacts. Because there is no reason in principle that the functionality of a previous artifact cannot be incorporated into or used in conjunction with an artifact currently under development, the expectation generally is that it must be. A familiar computing application, document preparation, illustrates this phenomenon, but it occurs widely in more technical situations: the original text editors, which simply allowed the easy creation of files of characters, soon became "word processors" with the addition of sophisticated formatting and laser printing capabilities. Then spell-checking was added, as were the creation and incorporation of graphics. Tools now check for grammatical errors and poor writing style. All of these systems do something that their predecessors did not, such as treating the text as a document with footnotes (formatting), or as English words (spell-checking), or as meaningful English sentences (grammar checkers), or as collections of lines, regions, and planes (graphics).

Such demands for increasing functionality result in a steady increase in the complexity of new computer systems, and they stand in sharp contrast to the notion of improved utility, in which a new word processor might simply "do better" that which was done before (e.g., a text editor that makes smaller files of compressed text or allows more sophisticated text substitutions).

There is a second consequence as well. Expectations for increasing functionality can affect other systems that serve similar purposes. For example, spell-checking is a standard feature of most word processors today. This leads to expectations that other systems in which text editing is used (e.g., slide presentation systems) should also incorporate spell-checking. However, because slide presentation systems may use an internal representation of words that is different from that used by word processors, this conceptually simple addition to a slide presentation system may be difficult to implement.

ECSE Is Not Strongly Coupled to Theoretical Computer Science

Experimental computer science and engineering does not depend on an elaborate and formalized theoretical foundation in the same way that, for example, experimental physics can draw on theoretical physics. In physics, the interplay and coupling between experiment and theory are rather tight. Theoretical explanations are found for experimental phenomena and then evaluated on their ability to predict other phenomena. Experiments in physics are designed on the basis of a theory that predicts the phenomena to be observed.

"Theory" in computer science is by tradition very close to mathematics. That is, theoreticians in computer science tend to prove theorems, and the standards for demonstrating correctness are very similar to those traditionally used in mathematics. A good deal of modeling work, which in other engineering disciplines might be considered theoretical in nature, is conducted by experimentalists. In other words, good experimentalists do create models and test (reject or accept) hypotheses, all of which might be considered theoretical work but for accidents of history. However, the complexity of the systems built in ECSE and of the underlying models and theories means that experimental implementation is necessary to evaluate the ideas and the models or theories behind them. Consequently, an "experiment" in ECSE usually does not verify any prediction from theoretical computer science or rely heavily on a model developed by theoreticians, although as noted above, good experimental work is grounded in testable models and hypotheses.

Experiments are most often conducted to validate some informal thesis derived from a computational model that is informed but not rigorously specified by theory and that may have been developed expressly for the experiment. A useful analogy might be that ECSE is today where experimental aeronautical engineering was when most of the information used in the design of airframes came from wind tunnel studies rather than computational fluid dynamics.

Although many of the subareas within CS&E are studied theoretically, research in theoretical computer science is coupled to experimental work only in certain specialized topics in which the idealized problem aligns well with the practical problem. For example, language theory underpins the parsing component of compilation, and complexity theory underpins data encryption. Experimentalists do use theoretical techniques in the conduct of their work. For example, rough estimates of algorithmic complexity are routinely made, and the recognition that a problem is NP-complete directs experimentalists to examine heuristic solutions or redirects the attack toward alternate approaches.12 These tools are valuable and reinforce the theoretical component in the academic curriculum.

Theoretical work occasionally motivates experimental work. A particularly nice example is Manber's application of techniques from exact string matching algorithms to create extremely fast and powerful approximate string matching software.13 Although the artifact's new ideas can be cast in theoretical terms, the key accomplishments of the work have been largely experimental—the engineering required to achieve high performance, the experiments on large databases, the performance characteristics of the program, and so on.

Experimental work may motivate theoretical work in CS&E. For example, the first routing protocol used in the Arpanet attempted to use load-sensitive routing based on a distributed routing algorithm. In practice, the algorithm led to oscillatory behavior owing to properties of the protocol, the method used for measuring load, and the selection of parameters. In retrospect, computer scientists were able to explain theoretically the reasons for the oscillations, but it was implementation and experimentation that led them to identify and address the question.

12. An NP-complete problem is one that is believed to require a very long time to solve as its input grows. More specifically, so-called computationally tractable problems can be solved in "polynomial time" (i.e., in a time that increases as a function of input size no faster than some power of the input size). NP-complete problems are believed to be computationally intractable (i.e., the solution of an NP-complete problem is believed to require a time that increases more rapidly than polynomially).

13. Wu, S., and U. Manber. 1992. "Fast Text Searching Allowing Errors," Communications of the ACM 35(10):83–91.
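To illustrate what "approximate string matching" means in the Manber example above, the following sketch uses the textbook dynamic-programming recurrence to report every text position at which some substring ends within k edits of a pattern. It is far simpler, and far slower, than the bit-parallel technique of Wu and Manber; it is included only to show the kind of answer such software computes.

    # Textbook O(len(pattern) * len(text)) approximate substring search.
    # Reports each text position where a substring ending there is within
    # k edits (insertions, deletions, substitutions) of the pattern.
    def approx_matches(pattern, text, k):
        m = len(pattern)
        col = list(range(m + 1))   # edit distances against an empty text prefix
        matches = []
        for j, ch in enumerate(text):
            prev_diag, col[0] = col[0], 0      # a match may start anywhere
            for i in range(1, m + 1):
                cost = 0 if pattern[i - 1] == ch else 1
                prev_diag, col[i] = col[i], min(col[i] + 1,        # text char is extra
                                                col[i - 1] + 1,    # pattern char unmatched
                                                prev_diag + cost)  # match or substitute
            if col[m] <= k:
                matches.append(j)              # an approximate match ends here
        return matches

    # Prints the text positions where an approximate match ends.
    print(approx_matches("survey", "surgery and surveys", 2))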

The complexity of computational systems makes experimentation critical, because it is often infeasible to anticipate all the important interactions and behaviors. Overall, ECSE is not tightly coupled to, or heavily reliant on, theoretical computer science, although the two intermingle at points along their boundary. Experimental exploration is crucial to understanding the terrain of the field and seems to be a precondition to building permanent foundations. One conclusion to draw is that contrary to a widely held assumption (within the field14 as well as outside it), physics is not a good model for the relationship between experimentation and theory in CS&E.

The fact that experimentation and theory are today largely independent areas with little interplay introduces the possibility that a computer science or engineering faculty member might not be well acquainted with research methodologies in the "other" specialty.15 That possibility raises serious concerns about how professional accomplishments are to be evaluated for the purposes of promotion (see Chapter 4).

A Succinct Definition of Experimental Computer Science and Engineering

With the foregoing background, a succinct definition of ECSE can be formulated: ECSE involves the creation of, or the experimentation with or on, computational artifacts. Artifacts, which are implementations of one or more computational phenomena, generally take the form of hardware or software systems, but the term should be broadly construed. Artifacts are usually complex in terms of both the number and the integration of components, and their creation often requires considerable intellectual effort. An artifact can be the subject of a study, the apparatus for the study, or both. ECSE, being in its early exploratory stage, is not well supported by theory; therefore, experimentation carries different connotations in ECSE than it has in physics, biology, or medicine.

14. One would imagine that this point is known at least to computer science faculty, but in the testimony received by the committee it was asserted by a departmental chair that "experimentalists are verifying the predictions of the theory of PRAMs—parallel random access machines." They are not. Indeed, no experiment has been proposed. Because most theorems about PRAMs assert properties of asymptotic behavior, it is unclear what prediction a finite set of experiments could confirm.

15. Not incidentally, this fact also explains how some departments of computer science can have thriving intellectual programs without a significant representation of experimentalists.

MORE ON ARTIFACTS

Artifacts and the computational phenomena they embody are fundamental to CS&E. Yet perhaps because they have only recently (since World War II) become the subject of academic study and because they are not well treated in terms of their scientific content in secondary schools, college science "literacy" classes, or the popular scientific press, artifacts may not be as widely understood as, say, biological phenomena. This section elaborates further on artifacts and their roles in ECSE.

Artifacts often have an "all-or-nothing" quality to them: either all of the components are functioning, and the device works and is suitable for experimentation, or one or more of the components are not working, and the device cannot be used. Examples include computers, robots, operating system kernels, and the like. Most highly integrated artifacts do not run until they are essentially complete. This all-or-nothing property does not concern the question of whether the working parts have "bugs," nor does it concern the "add-on" components that many artifacts require to be complete. Rather, it concerns the fact that in highly integrated systems the basic "operating cycle" may rely on a large number of components. An analogy would be an aircraft: before a test flight, the airframe and all of the power and control elements must be operational, although the galley need not be.

Because computation is synthetic, human creativity can produce phenomena never before imagined; such phenomena are often explained better by demonstration than by description. Early computers themselves (e.g., Atanasoff's machine and ENIAC16) were proofs of existence. Yet much of what is embodied in the proof-of-existence role concerns conveying intangibles. Much of the significance of the mouse or animated graphic images is knowable only through nonverbal channels. It is virtually impossible to write about such intangibles, and so knowledge about them is not easily archived.

Computer graphics is a research area that relies heavily on artifacts to convey intangibles. The channel is visual perception, of course. In instances where the subject concerns a single image, the artifact, namely a program on a graphic workstation, creates a still picture.

16. These were among the first computers developed in the early days of electronic computing.

However, for dynamic images, an artifact—either the program running on a graphic workstation or a film of the image sequence—is essential to illustrating what has been accomplished. It is obvious, for example, that the demonstration of a flight simulator can convey information beyond that provided in a paper about the simulator. Seeing the image develop in time is simply more powerful, more convincing, and more inclusive than describing what it would look like.17 Another instance from the world of graphics would be more "natural" light reflectance in a graphic image. Notice that the "intangible" need not always concern matters of perception. Because "better" could be a matter of personal taste, reasonable observers could disagree on whether this use of a proof-of-existence artifact is a contribution or not.

Proof-of-concept artifacts can sometimes be employed to illustrate the creation of hardware or software using a new methodology. An example is the classic work of Parnas on information hiding.18 Here, the artifact is used chiefly to exhibit the properties claimed for the methodology. A human uses the methodology to produce an artifact, and the artifact is assessed. In CAD systems, where an artifact assists in the production of other artifacts, the methodology may be carried out by a combination of human and computer activity. Synthesis systems that produce circuit layouts for silicon chips are an example. The challenge with this use of artifacts often is determining the proper metrics with which to assess the artifact so that inferences can be drawn about the methodology.

Simulation is a powerful methodology, and ECSE researchers use simulation extensively. Occasionally it is asked, Why create the artifact at all? Why not simply simulate it? Such questions often apply to computers and chips, both of which are expensive to build physically. Generally, the answer is that the simulator must account for the behavior of so many parts that a simulated artifact runs much more slowly than a real artifact would: an instruction-level simulation of a computer may run 1,000 times more slowly than the computer itself, and chips can require hours to simulate a few nanoseconds of activity. Thus, creating the artifact is essential to getting any quantity of experience with it. Usually, in the first few seconds that these hardware artifacts run, they exhibit more behavior than during the months of simulation used in their design.

17. For example, computer simulations of airplanes in flight are used to train pilots to fly real aircraft, but mere verbal descriptions of how to fly airplanes would clearly not suffice.

18. Parnas, David. 1972. "On the Criteria to Be Used in Decomposing Systems into Modules," Communications of the ACM 15(12):1053–1058.
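The cost of simulation noted above is easy to see in code. Even a toy instruction-level simulator, such as the sketch below for a made-up accumulator machine (the three-instruction set is invented purely for illustration), spends a fetch, a decode, a dispatch, and bookkeeping on the host for every single simulated instruction, which is one reason such simulators commonly run orders of magnitude more slowly than the hardware they model.

    # Toy instruction-level simulator for a made-up accumulator machine.
    # Each simulated instruction costs a fetch, a decode/dispatch, and state
    # updates on the host machine -- many host operations per modeled one.
    def simulate(program, memory):
        acc, pc = 0, 0
        while pc < len(program):
            op, arg = program[pc]          # fetch
            if op == "LOAD":               # decode and dispatch
                acc = memory[arg]
            elif op == "ADD":
                acc += memory[arg]
            elif op == "STORE":
                memory[arg] = acc
            else:
                raise ValueError(f"unknown opcode: {op}")
            pc += 1                        # per-instruction bookkeeping
        return memory

    mem = {"x": 2, "y": 3, "z": 0}
    prog = [("LOAD", "x"), ("ADD", "y"), ("STORE", "z")]
    print(simulate(prog, mem))             # {'x': 2, 'y': 3, 'z': 5}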

The performance of artifacts serving a proof-of-performance role is measured in order to understand how well they perform, but the precise numerical values that result from such measurement are rarely "constants" in the same way that the melting point of zinc is a constant. The reason is that different artifacts serving the same function can be expected to have different performance characteristics. For example, had Fraser used a different compiler or a different suite of test programs, his numbers would have been different, because different host compilers will generate code with different redundancy characteristics, and different programs will be optimized by different amounts. Nor is the relative property of "negative overhead" an outcome that must necessarily occur in any reproduction of Fraser's experiment, although this could indeed be the case. The significance of his measurements was, among other things, that they quantified for compiler writers the effectiveness of limited-context optimizations, showing such things as sensitivity to peephole size or optimization type.

COMPARISONS WITH OTHER FIELDS

Experimental computer science and engineering shares much with other fields of engineering. Engineering disciplines are ultimately concerned with the creation of artifacts (e.g., computer systems, airplanes, power plants, and automobiles) that provide significant practical utility and functionality for human users. A considerable amount of engineering research is devoted to improving artifacts; thus, aerospace engineers try to build better planes, civil engineers try to build better roads, and experimental computer scientists and engineers try to build better computing systems. However, despite the importance of the end user, a great deal of engineering research is devoted to improving these artifacts in ways that are not necessarily obvious to the end user (e.g., the artifact is made easier to manufacture). Economics matters to engineering, because artifacts that are useful to human beings must also be affordable and practical to construct. Thus, one type of achievement in any engineering discipline is the design of an artifact that consumes significantly fewer resources to provide the same functionality as its predecessor. In all of these cases, then, engineers try to create better artifacts (i.e., artifacts that offer greater utility or functionality in the computational setting, or the same utility for a smaller investment of resources). Because what is better is at root a human judgment, engineering—in ECSE as in other fields—often involves a degree of creativity and insight into how an artifact can be made better.

ECSE also has marked similarities to other specific fields. Most obviously, ECSE shares the synthetic property with other parts of CS&E and with other fields such as mathematics. The universality of computers influences research in theoretical computer science, although in quite a different way than it does in ECSE. Yet there are similarities in other dimensions as well. For example, materials science and biotechnology are technical disciplines that share with ECSE a close coupling with a rapidly changing technology. This is not to say that other areas of science and engineering do not profit from advances in technology. They clearly do. However, the advances at the frontiers of the discipline are not tied so closely to technology on a day-to-day basis as they are in these fields.

An important difference between ECSE and other engineering disciplines is the previously discussed difference between "functionality" and "utility." Experimental computer scientists and engineers are subject to the demands for ever-increasing functionality. By contrast, an aircraft designer strives for greater utility through improved fuel economy, greater safety, quieter operation, and so on, but greater functionality is never at issue. An airplane carries passengers through the air. It need not, for example, navigate through city traffic—a task reserved for taxis because of obvious physical constraints. Although the distinction between greater functionality and greater utility may be a matter of degree rather than one of kind (indeed, flying cars have been built), designers of computing artifacts nearly always have greater functionality as a design option, principally because no physical constraints prevent it.

WHY UNIVERSITIES SHOULD PERFORM ECSE RESEARCH

Because industrial products grow quite naturally out of the artifacts of ECSE, it could be asked, Why shouldn't ECSE be the exclusive province of industry? The committee has identified several reasons to conduct ECSE research in universities.

Perhaps the most significant reason to conduct ECSE research at universities is to ensure adequate intellectual diversity. For 10 technical concepts to reach maturity and be injected into the technology base, "a thousand flowers must bloom." As explained above, it is difficult to predict how well processes, algorithms, and mechanisms work in practice and how well they work together. Artifacts must be built to understand their behavior. Of these research prototype systems, perhaps only 1 in 10 is worthy of advanced development.

Of the advanced development systems, perhaps only 1 in 10 is worthy of product development and production. The numbers are not exact, of course, but rather are intended to suggest how commercial benefits result from the fittest survivors in a diverse environment of technical competition. A concept survives when its technical merit justifies the increased costs of the next level of development, perhaps a factor of 10 greater than the previous level. Although few research artifacts become products, far more of them influence the research community, and ideas from them become incorporated in some way in subsequent research.

If performing experimental research at universities adds diversity, expanding the number of universities doing experimental research in CS&E further enriches the intellectual environment. The theory community in computer science is strengthened by a diversity of participants, and a glance at any list of participants at the conference on Foundations of Computer Science and the Symposium on the Theory of Computing will reveal a wide variety of institutional homes for those participants. Although it may be unrealistic to expect a comparable diversity in ECSE (because of the infrastructure and resource constraints discussed in Chapter 2), it would be clearly undesirable for only a few academic institutions to perform ECSE research. Individual institutions develop their own research styles and foci, and if the institutions doing such research are too few, important avenues of investigation may be overlooked. Many universities have become more competitive in ECSE research over the past dozen years through programs such as the NSF's Coordinated Experimental Research program, the Defense Department's University Research Instrumentation program, and corporate equipment donations. The success of these programs demonstrates that direct intervention can expand the number of competitive schools.

An additional reason for performing ECSE research in the university environment is the enrichment it gives to the educational mission. Faculty engaged in research keep the curriculum fresh and vitalize the content of design courses. They can and must offer graduate students advanced seminars in their research area, which prepares some to conduct research on the topic and educates others. For lower-level classes, it is frequently possible to select examples from problems encountered in research that can enliven the topic for both students and teacher. However, the most valuable benefit may be that conducting research keeps a faculty member current. Given its strong connections to technology, CS&E evolves rapidly, and a degree program can quickly become antiquated and obsolete without continual refreshment.

Still another reason is that as a discipline, CS&E has close ties to industry. The majority of undergraduate computer science majors find employment in industry; about half of the new Ph.D.s in CS&E do so as well.19 Without exposure to ECSE in their formal education, students would be even less prepared than they are now to engage in meaningful careers in industry. As an associate professor at a large private university noted in response to the CRA-CSTB survey (see Appendix A):

The practical aspects of compiler optimization are passed on verbally or in the occasional "engineering oriented" compiler text, while texts that focus on different grammars and parsing algorithms are considered to be more "pure." Likewise, practical software testing methodologies are often given far less discussion than impractical but theoretically elegant methods. In architecture texts, issues such as signal delays, noise, loading, power, and cost usually take a back seat to the intellectually important issues of organization, instruction sets, and theoretical or simulated performance analysis. This bias is a contributing factor to the common complaint among industrial employers that graduates have to be retrained because they have no practical experience. Obviously, academe cannot and should not recast itself into a training ground for industrial employers, but an increased amount of practical and genuine experimentation would be a benefit to the discipline.

Finally, a strong experimental component to the research and teaching programs of CS&E departments is a necessary aspect of reaching out to other academic disciplines. As articulated in Computing the Future, the future of the discipline demands in part an attention to problems with relevance to society or to other intellectual domains.

None of this argues that universities and industrial research laboratories are equivalent research environments. Industrial laboratories such as those at AT&T Bell Laboratories generally have better resources and fewer distractions, and their efforts are funded by their parent corporations specifically in the hope and expectation that they will lead to competitive advantages in the marketplace. Sometimes they offer unique advantages, such as proprietary technology or access to profiles of (the parent company's) customers' work loads. Universities offer different advantages, including the enthusiasm and imagination of graduate students and a wide freedom to select topics for study. Despite these differences, both settings have produced important experimental research ideas in recent years.

19. Gries, David, and Dorothy Marsh. 1992. "The 1990–1991 Taulbee Survey," Computing Research News 4 (January):8.