statistics. The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling.
software engineering. (1) The application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software. (2) The study of approaches as in (1).
statistical software engineering. The interdisciplinary field of statistics and software engineering specializing in the use of statistical methods for controlling and improving the quality and productivity of the practices used in creating software.
The above definitions describe the islands of knowledge and experience that this report attempts to bridge. Software is a critical core industry that is essential to U.S. national interests in science, technology, and defense. It is ubiquitous in today's society, coexisting with hardware (micro-electronic circuitry) in our transportation, communication, financial, and medical systems. The software in a modern cardiac pacemaker, for example, consists of approximately one-half megabyte of code that helps control the pulse rate of patients with heart disorders. In this and other applications, issues such as reliability and safety, fault tolerance, and dependability are obviously important. From the industrial perspective, so also are issues concerned with improving the quality and productivity of the software development process. Yet statistical methods, despite the long history of their impact in manufacturing as well as in traditional areas of science, technology, and medicine, have as yet had little impact on either hardware or software development.
This report is the product of a panel convened by the Board on Mathematical Sciences' Committee on Applied and Theoretical Statistics (CATS) to identify challenges and opportunities in software development and implementation that have a significant statistical component. In attempting to identify interrelated aspects of statistics and software engineering, it enunciates a new interdisciplinary field: statistical software engineering. While emphasizing the relevance of applying rigorous statistical and probabilistic techniques to problems in software engineering, the panel also points out opportunities for further research in the statistical sciences and their applications to software engineering. Its hope is that new researchers from statistics and the mathematical sciences will thus be motivated to address relevant and pressing problems of
software development and also that software engineers will find the statistical emphasis refreshing and stimulating. This report also addresses the important issues of training and education of software engineers in the statistical sciences and of statisticians with an interest in software engineering.
At the panel's information-gathering forum in October 1993, 12 invited speakers described their views on topics that are considered in detail in Chapters 2 through 6 of this report. One of the speakers, John Knight, pointed out that the date of the forum coincided nearly to the day with the 25th anniversary of the Garmisch Conference (Randell and Naur, 1968), a NATO-sponsored workshop at which the term "software engineering" is generally accepted to have originated. The particular irony of this coincidence is that it is also generally accepted that although much more ambitious software systems are now being built, little has changed in the relative ability to produce software with predictable quality, costs, and dependability. One of the original Garmisch participants, A.G. Fraser, now associate vice president in the Information Sciences Research Division at AT&T Bell Laboratories, defends the apparent lack of progress by recalling that prior to Garmisch, there was no "collective realization" that the problems individual organizations were facing were shared across the industry—thus Garmisch was a critical first step toward addressing issues in software production. It is hoped that this report will play a similar role in seeding the field of statistical software engineering by indicating opportunities for statistical thinking to help increase understanding, as well as the productivity and quality, of software and software production.
In preparing this report, the panel struggled with the problem of providing the "big picture" of the software production process, while simultaneously attempting to highlight opportunities for related research on statistical methods. The problems facing the software engineering field are indeed broad, and nonstatistical approaches (e.g., formal methods for verifying program specifications) are at least as relevant as statistical ones. Thus this report tends to emphasize the larger context in which statistical methods must be developed, based on the understanding that recognition of the scope and the boundaries of problems is essential to characterizing the problems and contributing to their solution. It must be noted at the outset, for example, that software engineering is concerned with more than the end product, namely, code. The production process that results in code is a central concern and thus is described in detail in the report. To a large extent, the presentation of material mirrors the steps in the software development process. Although currently the single largest area of overlap between statistics and software engineering concerns software testing (which implies that the code exists), it is the panel's view that the largest contributions to the software engineering field will be those affecting the quality and productivity of the processes that precede code generation.
The panel also emphasizes that the process and methods described in this report pertain to the case of new software projects, as well as to the more ordinary circumstance of evolving software projects or "legacy systems." For instance, the software that controls the space shuttle flight systems or that runs modern telecommunication networks has been evolving for several decades. These two cases are referred to frequently to illustrate software development concepts and current practice, and although the software systems may be uncharacteristically large, they are arguably forerunners of what lies ahead in many applications. For example, laser printer software is witnessing an order-of-magnitude (base-10) increase in size with each new release.
Similar increases in size and complexity are expected in all consumer electronic products as increased functionality is introduced.
Central to this report's theme, and essential to statistical software engineering, is the role of data, the realm where opportunities lie and difficulties begin. The opportunities are clear: whenever data are used or can be generated in the software life cycle, statistical methods can be brought to bear for description, estimation, and prediction. This report highlights such areas and gives examples of how statistical methods have been and can be used.
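As a minimal sketch of the description-estimation-prediction progression, the following works through invented inter-failure-time data from a hypothetical system test; the data values, the exponential assumption, and the constant-rate assumption are all illustrative, not drawn from any study cited in this report.

```python
# Hypothetical illustration: inter-failure times (in CPU hours) observed
# during system test. All numbers are invented for illustration.
import statistics

inter_failure_hours = [3.2, 5.1, 2.4, 8.0, 6.3, 11.5, 9.7, 14.2, 12.8, 19.6]

# Description: summarize the observed failure process.
mean_gap = statistics.mean(inter_failure_hours)
sd_gap = statistics.stdev(inter_failure_hours)

# Estimation: under the simplifying assumption that gaps are exponentially
# distributed, the maximum-likelihood estimate of the failure rate is 1/mean.
rate = 1.0 / mean_gap  # failures per CPU hour

# Prediction: expected failures in the next 40 CPU hours of test,
# assuming the rate stays constant (a strong assumption for maturing
# software, whose failure rate typically declines as faults are removed).
expected_failures = rate * 40.0

print(f"mean gap          = {mean_gap:.2f} CPU hours (sd {sd_gap:.2f})")
print(f"estimated rate    = {rate:.3f} failures per CPU hour")
print(f"expected failures = {expected_failures:.1f} in next 40 CPU hours")
```

In practice a reliability growth model with a decreasing rate would be more defensible than the constant-rate assumption used here; the point is only that each stage of the life cycle that produces data admits this same three-step statistical treatment.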
Nevertheless, the major obstacle to applying statistical methods to software engineering is the lack of consistent, high-quality data in the resource-allocation, design, review, implementation, and test stages of software development. Statisticians interested in conducting research in software engineering must acknowledge this fact and play a leadership role in providing adequate grounds for the resources needed to acquire and maintain high-quality, relevant data. A statement by one of the forum participants, David Card, captures the serious problem that statisticians face in demonstrating the value of good data and good data analysis: "It may not be that effective to be able to rigorously demonstrate a 10% or 15% or 20% improvement (in quality or productivity) when with no data and no analysis, you can claim 50% or even 100%."
The cost of collecting and maintaining high-quality information to support software development is unfortunately high, but arguably essential—as the NASA case study presented in Chapter 2 makes clear. The panel conjectures that use of adequate metrics and data of good quality is, in general, the primary differentiator between successful, productive software development organizations and those that are struggling. Traditional manufacturers have learned the value of investing in an information system to support product development; software development organizations must take heed. All too often, as a release date approaches, all available resources are dedicated to moving a software product out the door, with the result that few or no resources are expended on collecting data during these crucial periods. Subsequent attempts at retrospective analysis—to help forecast costs for a new product or identify root causes of faults found during product testing—are inconclusive when speculation rather than hard data is all that is available to work with. But even software development organizations that realize the importance of historical data can get caught in a downward spiral: effort is expended on collection of data that initially are insufficient to support inferences. When data are not being used, efforts to maintain their quality decrease. But then when the data are needed, their quality is insufficient to allow drawing conclusions. The spiral has begun.
As one means of capturing valuable historical data, efforts are under way to create repositories of data on software development experiments and projects. There is much apprehension in the software engineering community that such data will not be helpful because the relevant metadata (data about the data) are not likely to be included. The panel shares this concern because the exclusion of metadata not only encourages thoughtless analyses, but also makes it too easy for statisticians to conduct isolated research in software engineering. The panel believes that truly collaborative research must be undertaken and that it must be done with a keen eye to solving the particular problems faced by the software industry. Nevertheless, the panel recognizes the benefits of collecting data on software development experiments and projects. As is pointed out in more detail in Chapter 5, one of the largest impacts the statistical community
can have in software engineering concerns efforts to combine information (NRC, 1992) across software engineering projects as a means of evaluating the effects of technology, language, organization, and the development process itself. Although difficult issues are posed by the need to adjust appropriately for differences in projects, the inconsistency of metrics, and varying degrees of data quality, the availability of a data repository at least allows for such research to begin.
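A minimal sketch of what combining information across projects could look like is a fixed-effect, inverse-variance pooling of per-project estimates; the project names, effect estimates, and standard errors below are invented for illustration, and a real analysis would need random-effects or covariate adjustment to handle the between-project differences noted above.

```python
# Hypothetical illustration: pooling estimated defect-density reductions
# (percent) reported by three projects that adopted the same inspection
# technique. All names and numbers are invented.
projects = {
    "switching": (18.0, 4.0),  # (estimated % reduction, standard error)
    "billing":   (11.0, 6.0),
    "embedded":  (25.0, 9.0),
}

# Fixed-effect, inverse-variance weighting: projects whose estimates are
# more precise (smaller standard error) receive proportionally more weight.
weights = {name: 1.0 / se**2 for name, (_, se) in projects.items()}
total_w = sum(weights.values())

pooled = sum(weights[n] * est for n, (est, _) in projects.items()) / total_w
pooled_se = (1.0 / total_w) ** 0.5  # standard error of the pooled estimate

print(f"pooled reduction = {pooled:.1f}% (SE {pooled_se:.1f})")
```

Note that this fixed-effect pooling assumes all three projects estimate a single common effect; when metrics are inconsistent or projects differ systematically, that assumption fails, which is precisely why the metadata discussed above are indispensable.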
Although this report serves as a review of the software production process and related research to date, it is necessarily incomplete. Limitations on the scope of the panel's efforts precluded a fuller treatment of some material and topics as well as inclusion of case studies from a wider variety of business and commercial sectors. The panel resisted the temptation to draw on analogies between software development and the converging area of computer hardware development (which for the most part is initially represented in software). The one shortcoming the panel is confident this report avoids is oversimplification of the problem domain itself.