3
Summary of Closing Session

The workshop finished with a brainstorming session, the goal of which was to identify the important questions and issues raised during the workshop and to determine those that merit further investigation by the committee. The participants reviewed the following aspects of software certification: applications, lessons learned from failures and successes, scope, techniques, process-based and product-based methods, and obstacles to effective certification.

Certification has traditionally been applied to safety-critical infrastructure that is procured or regulated by the government. Participants wondered whether it should be applied more widely. Some suggested that even consumer software was an appropriate target for certification, as it has a substantial impact on the quality of life, while others advocated certification of operating systems and development platforms. Some participants cautioned that overly strict certification requirements could have negative impacts, such as the expenditure of resources on the production of ultimately unhelpful documents. At the same time, it was noted that regulation and associated certification processes exist to defend and protect those who have no other voice in the product development process, and that it would be unfortunate if truly high-quality software were confined to the domain of nuclear reactors and airplanes. The committee was encouraged to host a panel of representatives from areas where certification is generally considered effective (such as avionics) to discuss the potential for broader applicability of certification.

It was also suggested that the committee study certification failures. Failures may include certification of a device or system that failed to achieve the goals for its dependability and certification whose costs caused the project to fail. The National Security Agency’s Orange Book1 was suggested as an example of a certification effort with, at best, mixed outcomes. According to one panelist, “Systems were evaluated and blessed at C2 [a particular level of assurance determined in the Orange Book]; people thought that it meant something, but it didn’t.” Participants suggested that the committee attempt to get honest assessments from a few people who would be candid about what went wrong. It was also suggested that the committee investigate certification successes. The FAA, for example, has been certifying software for quite some time and may be able to offer lessons learned from what has worked and what has not.

Participants disagreed about the appropriate scope of certification. Some thought it should focus on one or two narrow attributes, while others argued that it should encompass all realistic requirements in a common context in order to address the fact that requirements frequently interfere with one another. There did appear to be some consensus among panelists that certification processes should focus on the desired properties rather than on the manner in which the artifact is constructed. It was observed, however, that some properties that are not directly observable may require secondary evidence. There was substantial interest in certifying components for composability. There was also interest in certification processes that can be applied to a software component as it evolves.

There was some discussion of appropriate techniques for certification. Several participants contended that formal methods have a role to play in certification and that although there are limits to what they can achieve, “we should continue the long march of formal methods.” It was suggested that scale and adoptability should be the primary aims of continued research in this area, with a focus on “small theorems about large programs.” There was also controversy over the maturity of certification techniques and of software development in general. Some felt that generally reliable software can be achieved using existing software development techniques but that certification could be used to “raise the bar.” Others emphasized the need for more research on the use of tools and techniques to increase software quality.

The question of process-based versus product-based certification was discussed. There was some consensus that a combination of the two is necessary and that “process evidence is only secondary to product evidence.” The new European standards ESARR 62 and SW013 were offered as examples of the combination approach. It was suggested that formal methods could offer support for “baking the evidence of process into the artifact itself,” using such techniques as proof-carrying code.

Panelists warned of several obstacles to effective certification. For example, standards that mandate adherence to untestable criteria are unlikely to be effective, as are certification processes that lack accountability. Standards that provide insufficient connection to the end goal can increase the cost of creating software without increasing its actual dependability. Standards may be too technology dependent, mandating methods of constructing (or not constructing) software. They may mandate the production of enormous documents that have little if any impact on the dependability of the software, or the measurement of “trivia” that is easily measurable but offers only a false sense of security.

Finally, participants briefly touched on some additional considerations, and the session concluded with suggestions for the committee in the preparation of its final report. There was some support for the notion that software development companies should be required to publish defect rates. It was noted that software products are often licensed without the “fitness-for-purpose” requirements that apply to other goods. The education of both professionals and consumers was a recurring topic: “We should make tools that give people best practices demos, so people can tell how to write good software, and can tell the difference. Consumer pressure [to produce dependable systems] is not as powerful as people had hoped.” The committee was encouraged “not to be hidebound by current practices” and to take a longer-range view about what people involved in the development of complex software systems need to learn and know. What are the criteria for a good study in the area of certifiably dependable systems? What should the committee do to ensure that such a study provides value to the community and capitalizes on the wisdom and experience of the community at large?

It was suggested that the committee’s final report should, at least in part, be positive. Participants observed that the community tends to be negative but that much has been achieved. It was suggested that clear recognition of the problems and acknowledgment that they are being worked on are equally important. Achieving certifiably dependable systems is a long-term goal, with many unknowns and the potential for significant changes in the way things are done. Changes will take a long time to implement, and so the goal should be to set out a road map. It is important to clearly distinguish between what can be done now and what is desirable but requires technology that does not yet exist.

1  More formally known as the DOD Trusted Computer System Evaluation Criteria, DOD 5200.28-STD, December 26, 1985.

2  The Eurocontrol Safety Regulatory Requirement (ESARR 6) provides a set of guidelines to ensure that risks associated with operating any software in safety-related ground-based air traffic management systems are reduced to a tolerable level. This guideline was developed by the European Organisation for the Safety of Air Navigation.

3  SW01 is part of a larger set of civil aviation guidelines and standards called CAP 670 established by the Civil Aviation Authority in the United Kingdom. SW01 deals specifically with the design and assessment of ground-based services, including air traffic control systems.
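The “baking the evidence into the artifact itself” idea mentioned in the discussion of process- versus product-based certification can be illustrated with a deliberately simple sketch (not from the workshop; all names and the checked property are hypothetical). An untrusted producer ships an artifact together with checkable evidence, and the consumer runs a small trusted checker against that evidence rather than trusting the producer or its process:

```python
# Toy sketch of the proof-carrying idea: the artifact travels with a
# certificate that a small trusted checker can verify independently.
# Here the "artifact" is a sorted list and the "certificate" is the
# permutation relating it to the original input.

def produce(xs):
    """Untrusted producer: returns a sorted copy of xs plus a certificate
    (the index of the input slot each output element came from)."""
    certificate = sorted(range(len(xs)), key=lambda i: xs[i])
    output = [xs[i] for i in certificate]
    return output, certificate

def check(xs, output, certificate):
    """Small trusted checker: verifies the claimed property -- that output
    is a sorted permutation of xs -- using only the shipped certificate."""
    if sorted(certificate) != list(range(len(xs))):
        return False  # certificate is not a permutation of input indices
    if any(output[k] != xs[i] for k, i in enumerate(certificate)):
        return False  # output does not match the permuted input
    return all(output[k] <= output[k + 1] for k in range(len(output) - 1))

data = [3, 1, 2]
out, cert = produce(data)
assert check(data, out, cert)                  # genuine evidence verifies
assert not check(data, [1, 2, 9], [1, 2, 0])   # tampered artifact is rejected
```

The design point, as with proof-carrying code proper, is that the checker is far smaller and simpler than the producer, so the consumer's trust rests on the checker and the evidence rather than on the producer's development process.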