Trust in Cyberspace
Committee on Information Systems Trustworthiness, National Research Council (1999) 352 pages   6 x 9
    









3

Software for Networked Information Systems

Introduction

Background

Computing power is becoming simultaneously cheaper and more dispersed. General-purpose computers and access to global information sources are increasingly commonplace on home and office desktops. Perhaps most striking is the exploding popularity of the World Wide Web. A Web browser can interact with any Web site, and Web sites offer a wide variety of information and services. A less visible consequence of cheap, dispersed computing is the ease with which special-purpose networked information systems (NISs) can now be built.

An NIS built to support the activities of a health care provider, such as a medium-sized health maintenance organization (HMO) serving a wide geographic area, is used as an illustration here and throughout this chapter. HMO services might include maintenance of patient records, support for administration of hospitals and clinics, and support for equipment in laboratories. The NIS would, therefore, comprise computer systems in hospital departments (such as radiology, pathology, and pharmacy), in neighborhood clinics, and in centralized data centers. By integrating these individual computer systems into an NIS, the HMO management would expect both to reduce costs and to increase the quality of patient care. For instance, although data and records (such as laboratory test results, x-ray or other images, and treatment logs) previously might have traveled independently, the information now can be transmitted and accessed together.

In building an NIS for an HMO, management is likely to have chosen a "Web-centric" implementation using the popular protocols and facilities of the World Wide Web and the Internet. Such a decision would be sensible for the following reasons:

• The basic elements of the system, such as Web servers and browsers, can now be commercial off-the-shelf (COTS) components and, therefore, are available at low cost.

• A large, growing pool of technical personnel is familiar with the Web-centric approach, so the project will not become dependent on a small number of individuals with detailed knowledge of locally written software.

• The technology holds promise for extensions into consumer telemedicine, whereby patients and health care providers interact by using the same techniques as are commonly used on the rest of the Internet.

Clearly, the HMO's NIS must exhibit trustworthiness: it must engender feelings of confidence and trust in those whose lives it affects. Physicians must be confident that the system will display the medical record of the patient they are seeing when it is needed and will not lose information; patients must be confident that physician-entered prescriptions will be properly transmitted and executed; and all must be confident that the privacy of records will not be compromised. Achieving this trustworthiness, however, is not easy.

NIS trustworthiness mechanisms basically concern events that are not supposed to happen. Nonmalicious users living in a benign and fault-free world would be largely unaffected were such mechanisms removed from a system. But some users may be malicious, and the world is not fault free. Consequently, reliability, availability, security and all other facets of trustworthiness require mechanisms to foster the necessary trust on the part of users and other affected parties. Only with their failure or absence do trustworthiness mechanisms assume importance to a system's users. Users seem unable to evaluate the costs of not having trustworthiness mechanisms except when they experience actual damage from incidents (see Chapter 6 for an extended discussion). So, while market forces can help foster the deployment of trustworthiness mechanisms, these forces are unlikely to do so in advance of directly experienced or highly publicized violations of trustworthiness properties.

Although the construction of trustworthy NISs is today in its infancy, lessons can be learned from experience in building full-authority and other freestanding, high-consequence computing systems for applications such as industrial process control and medical instrumentation. In such systems, one or more computers directly control processes or devices whose malfunction could lead to significant loss of property or life. Even systems in which human intervention is required for initiating potentially dangerous events can become high-consequence systems when human users or operators place too much trust in the information being displayed by the computing system.1 To be sure, there are differences between NISs and traditional high-consequence computing systems. An intent of this chapter is to identify those differences and to point out lessons from high-consequence systems that can be applied to NISs, as well as unique attributes of NISs that will require new research.

The Role of Software

Software plays a major role in achieving the trustworthiness of an NIS, because it is software that integrates and customizes general-purpose components for some task at hand. In fact, the role of software in an NIS is typically so pervasive that the responsibilities of a software engineer differ little from those of a systems engineer. NIS software developers must therefore possess a systems viewpoint,2 and systems engineers must be intimately familiar with the strengths (and, more importantly, the limitations) of software technology.

With software playing such a pervasive role, defects can have far-reaching consequences. It is notoriously difficult to write defect-free software, as the list of incidents in, for example, Leveson (1987) or Neumann (1995) confirms. Beyond the intrinsic difficulty of writing defect-free software, there are constraints that result from the nature of NISs. These constraints derive from schedule and budget; they mean that a software developer has only limited freedom in selecting the elements of the software system and in choosing a development process:

• An NIS is likely to employ commercial operating systems, purchased "middleware," and other applications, as well as special-purpose code developed specifically for the NIS. The total source code size for the system could range from tens to hundreds of millions of lines. In this setting, it is infeasible to start from scratch in order to support trustworthiness.

1 This is a particularly dangerous state of affairs, since designers may assume that system operation is being monitored, when in fact it is not (Leveson, 1995).

2 Once succinctly stated as, "You are not in this alone." That is, you need to consider not only the narrow functioning of your component but also how it interacts with other components, users, and the physical world in achieving system-level goals. Another aspect of the "systems viewpoint" is a healthy respect for the potential of unexpected side effects.


• Future NISs will, of necessity, evolve from the current ones. There is no alternative, given the size of the systems, their complexity, and the need to include existing services in new systems. Techniques for supporting trustworthiness must take this diversity of origin into account. It cannot be assumed that NISs will be conceived and developed without any reuse of existing artifacts. Moreover, components reused in NISs include legacy components that were not designed with such reuse in mind; they tend to be large systems or subsystems having nonstandard and often inconvenient interfaces. In the HMO example, clinical laboratories and pharmacies are likely to have freestanding computerized information systems that exemplify such legacy systems.

• Commercial off-the-shelf software components must be used to control development cost, development time, and project risk. A commercial operating system with a variety of features can be purchased for a few hundred dollars, so development of specialized operating systems is uneconomical in almost all circumstances. But the implication is that achieving and assessing the trustworthiness of a networked information system necessarily occur in an environment including COTS software components (operating systems, database systems, networks, compilers, and other system tools) with only limited access to internals or control over their design.

• Finally, the design of NIS software is likely to be dictated—at least, in part—by outside influences such as regulations, standards, organizational structure, and organizational culture. These outside influences can lead to system architectures that aggravate the problems of providing trustworthiness. For example, in a medical information system, good security practices require that publicly accessible terminals be logged off from the system after relatively short periods of inactivity so that an unauthorized individual who happens upon an unattended terminal cannot use it. But in emergency rooms, expecting a practitioner to log in periodically is inconsistent with the urgency of emergency care that should be supported by an NIS in this setting.
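The tension described in the last item can be made concrete with a small sketch of an inactivity-timeout policy keyed by terminal location. Everything in it is hypothetical: the location names, the timeout values, and the function name are assumptions chosen for illustration, not recommendations drawn from the text.

```python
# Hypothetical inactivity-timeout policy, keyed by terminal location.
# All location names and timeout values are illustrative assumptions.
from datetime import datetime, timedelta

TIMEOUTS = {
    "public_lobby": timedelta(minutes=2),
    "clinic_office": timedelta(minutes=10),
    "emergency_room": None,  # no automatic logoff: urgency overrides the rule
}

def should_log_off(location, last_activity, now):
    """Return True if the terminal has been idle longer than its limit."""
    limit = TIMEOUTS[location]
    if limit is None:
        return False
    return now - last_activity > limit

if __name__ == "__main__":
    idle_since = datetime(1999, 1, 1, 12, 0)
    check_time = idle_since + timedelta(minutes=5)
    for place in TIMEOUTS:
        print(place, should_log_off(place, idle_since, check_time))
```

The `None` entry is where the outside influence shows up in the design: the exemption trades protection against unattended terminals for uninterrupted access during emergency care.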

Fortunately, success in building an NIS does not depend on writing software that is completely free of defects. Systems can be designed so that only certain core functionality must be defect free; defects in other parts of the system, although perhaps annoying, become tolerable because their impact is limited by the defect-free core functionality. It now is feasible to contemplate a system having millions of lines of source code and embracing COTS and legacy components, since only a fraction of the code has to be defect free. Of course, that approach to design does depend on being able to determine or control how the effects of defects propagate. Various approaches to software design can be seen as providing artillery for attacking the problem, but none has proved a panacea. There is still no substitute for talented and experienced designers.
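A minimal sketch of the idea of a small core that limits the impact of defects elsewhere is given below. It assumes, for illustration only, that the sole path from a large, untrusted component to the rest of the system runs through one short check; the drug names, dose limits, and function names are made up.

```python
# A small "core" check that bounds the effect of defects in a larger,
# untrusted component. Drug names and limits are made-up values.
MAX_DAILY_DOSE_MG = {"drug_a": 400, "drug_b": 50}

def core_check(drug, dose_mg):
    """The only code that must be right: accept nothing outside known limits."""
    limit = MAX_DAILY_DOSE_MG.get(drug)
    return limit is not None and 0 < dose_mg <= limit

def submit_prescription(untrusted_component, request):
    """Run the untrusted component, but let nothing past the core check."""
    try:
        drug, dose_mg = untrusted_component(request)
        accepted = core_check(drug, dose_mg)
    except Exception:
        return None  # a crash or malformed result is contained here
    return (drug, dose_mg) if accepted else None
```

The sketch only works to the extent that the assumption holds: if the untrusted component can reach the pharmacy by some other route, the check limits nothing, which is the propagation question raised above.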

Development of a Networked Information System

The development of an NIS proceeds in phases that are similar to the phases of development for other computerized information systems:

• Decide on the structure or architecture of the system.

• Build and acquire components.

• Integrate the components into a working and trustworthy whole.

The level of detail at which the development team works forms a V-shaped curve. Effort starts at the higher, systems level, then dips down into details as individual software components are implemented and tested, and finally returns to the system level as the system is integrated into a cohesive whole.

Of the three phases, the last is the most problematic. Development teams often find themselves in the integration phase with components that work separately but not together. Theoretically, an NIS can grow by accretion, with service nodes and client nodes being added at will. The problem is that (as illustrated by the Internet) it is difficult to ensure that the system as a whole will exhibit desired global properties and, in particular, trustworthiness properties. On the one hand, achieving a level of connectivity and other basic services is relatively easy. These are the services that general-purpose components, such as routers, servers, and browsers, are designed to provide. And even though loads on networks and demands on servers are hard to predict, adverse outcomes are readily overcome by the addition or upgrade of general-purpose components. On the other hand, the consequences of failures or security breaches propagating through the system are hard to predict, to prevent, and to analyze when they do occur. Thus, basic services are relatively simple to provide, whereas global and specialized services and properties—especially those supporting trustworthiness—are difficult to provide.

System Planning, Requirements, and Top-Level Design

Planning and Program Management

A common first step in any development project is to produce a planning document and a requirements document. The planning document contains information about budget and schedules. Cost estimation and scheduling are hard to do accurately, so producing a planning document is not a straightforward exercise. Just how much time a large project will require, how many staff members it will need (and when), and how much it will cost cannot today be estimated with precision. The techniques that exist, such as the constructive cost model (COCOMO) (Boehm, 1981), are only as good as the data given them and the suitability of their models for a given project. Estimation is further complicated if novel designs and the implementation of novel features are attempted, practices common in software development and especially common in leading-edge applications such as an NIS.
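As a rough illustration of how a model such as COCOMO turns a size estimate into effort and schedule figures, the sketch below applies the Basic COCOMO equations from Boehm (1981): effort = a * KLOC^b person-months and schedule = c * effort^d months. The coefficients are the published Basic COCOMO values; the function name, the project sizes, and the choice of development mode are assumptions made for illustration.

```python
# Basic COCOMO (Boehm, 1981): effort and schedule from estimated size.
# Coefficients (a, b, c, d) for each development mode.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc, mode="organic"):
    """Return (effort in person-months, schedule in months)."""
    if kloc <= 0:
        raise ValueError("size must be positive")
    a, b, c, d = COEFFICIENTS[mode]
    effort = a * kloc ** b
    schedule = c * effort ** d
    return effort, schedule

if __name__ == "__main__":
    # Hypothetical size estimates, in thousands of lines of code.
    for size in (50, 100, 200):
        effort, months = basic_cocomo(size, "embedded")
        print(f"{size} KLOC: {effort:.0f} person-months, {months:.1f} months")
```

Because the exponent b exceeds 1, a size estimate that is off by a factor of two moves the effort estimate by somewhat more than a factor of two, which is one way to see that the model is only as good as the data given it.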

Although every attempt might be made to employ standard components (e.g., operating system, network, Web browsers, database management systems, and user-interface generators) in building an NIS, the ways in which the components are used are likely to be sufficiently novel that generalizing from past experiences with the components may be useless for estimating project costs and schedules. For example, it is not hard to connect browsers through a network to a server and then display what is on the server, but the result does not begin to be a medical records system, with its varied and often subtle trustworthiness requirements concerning patient privacy and data integrity. The basic services are even farther from a complete telemedicine system, which must be trusted to correctly convey patient data to experts and their diagnoses back to paramedical personnel. All in all, confidence in budget and schedule estimates for an NIS, as for any engineering artifact, can be high only when the new system is similar to systems that already have been built. Such similarity is rare in the software world and is likely to be even rarer in the nascent field of NIS development.

The difficulties of cost estimation and scheduling explain why some projects are initiated with unrealistic schedules and assignments of staff and equipment. The problem is compounded in commercial product development (as opposed to specialized, one-of-a-kind system development) by marketing concerns. For software-intensive products, early arrival in the marketplace is often critical to success in that marketplace. This means that software development practice becomes distorted to maximize functionality and minimize development time, with little attention paid to other qualities. Thus, functionality takes precedence over trustworthiness.

A major difficulty in project management is coping with ambiguous and changing requirements. It is unrealistic to expect correct and complete knowledge of requirements at the start of a project. Requirements change as system development proceeds and the system, and its environment, become better understood. Moreover, software frequently is regarded (incorrectly) as something that can be changed easily at any point during development, and software change requests then become routine. The effect of the changes, however, can be traumatic and lead to design compromises that affect trustworthiness.

Another difficulty in project management is selecting, tailoring, and implementing the development process that will be used. The Waterfall development process (Pressman, 1986), in which each phase of the life cycle is completed before the next begins, oversimplifies. So, when the Waterfall process is used, engineers must deviate from it in ad hoc ways. Nevertheless, organizations ignore better processes, such as the Spiral model (Boehm, 1988; Boehm and DeMarco, 1997), which incorporates control and feedback mechanisms to deal with interaction of the life-cycle phases.

Also contributing to difficulties in project management and planning is the high variance in capabilities and productivity that has been documented for different software engineers (Curtis, 1981). An order-of-magnitude variation in productivity is not uncommon between the most and the least productive programmers. Estimating schedules, assigning manpower, and managing a project under such circumstances are obviously difficult tasks.

Finally, the schedule and cost for a project can be affected by unanticipated defects or limitations in the software tools being employed. For example, a flawed compiler might not implement certain language features correctly or might not implement certain combinations of language features correctly. Configuration management tools (e.g., Rochkind, 1975) provide other opportunities for unanticipated schedule and cost perturbation. For use in an NIS, a configuration management tool not only must track changes in locally developed software components but also must keep track of vendor updates to COTS components.
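The dual tracking requirement can be pictured as a comparison of two component manifests that record both locally developed and COTS components. The sketch below is hypothetical throughout: the manifest layout, the component names, and the function name are assumptions, and it does not describe any particular configuration management tool.

```python
# Hypothetical component manifest: name -> (origin, version tuple),
# where origin is "local" or "cots". Illustrative only.
def changed_components(baseline, current):
    """Return {name: (old_version, new_version)} for every component whose
    version differs between two manifests; None marks added or removed."""
    changes = {}
    for name in sorted(set(baseline) | set(current)):
        old = baseline.get(name, (None, None))[1]
        new = current.get(name, (None, None))[1]
        if old != new:
            changes[name] = (old, new)
    return changes

if __name__ == "__main__":
    baseline = {"records_ui": ("local", (1, 4)), "web_server": ("cots", (2, 0))}
    current = {"records_ui": ("local", (1, 5)), "web_server": ("cots", (2, 1))}
    print(changed_components(baseline, current))
```

Treating local and vendor components uniformly in one manifest is a design choice of the sketch; the point is only that a vendor update is a change to the system's configuration just as a local edit is.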

None of these difficulties is a new revelation. Brooks, in his classic work The Mythical Man-Month (Brooks, 1975), noted similar problems more than two decades ago. It is both significant and a cause for concern that this book remains relevant today, as evidenced by the recent publication of a special 20th anniversary edition. The difficulties, however, become even more problematic within the context of large and complex NISs.

Requirements at the Systems Level

Background

There is ample evidence that the careful use of established techniques in the development of large software systems can improve their quality. Yet many development organizations do not employ techniques that have been known for years to contribute to success. Nowhere is this refusal to learn the lessons of history more pronounced than with respect to requirements documents.

Whether an NIS or a simple computer game is being implemented, a requirements document is useful. In special-purpose systems, it forms a contract between the customer and the developer by stating what the customer wants and thereby what the developer must build. In projects aimed at producing commercial products, it converts marketing and business objectives into technical terms. In the development of large systems, it serves as a vehicle for communication among the various engineering disciplines involved. And it also serves as a vehicle for communication between different software engineers responsible for developing software, as well as between the software engineers and those responsible for presenting the software to the outside world, such as a marketing team.

It is all too common, however, to proceed with system development without first analyzing and documenting requirements. In fact, requirements analysis and documentation are sometimes viewed as unnecessary or misdirected activities, since they do not involve creating executable code and are thought to increase time to market. Can system requirements not be learned by inspecting the system itself? Requirements derived by such a posteriori inspections, however, run the risk of being incomplete and inaccurate. It is not always possible to determine a posteriori which elements of an interface are integral and which are incidental to a particular implementation. In the absence of a requirements document, project staff must maintain a mental picture of the requirements in order to respond to questions about what should or could be implemented. Each putative requirements change must still be analyzed and negotiated, only now the debate occurs out of context and risks overlooking relevant information. Such an approach might be adequate for small systems, but it breaks down for systems having the size and complexity of an NIS.

The System Requirements Document

The system requirements document states in as much detail as possible what the system should (and should not) do. To be useful for designers and implementers, a requirements document should be organized as a reference work. That is, it should be arranged so that one can quickly find the answer to a detailed question (e.g., What should go into an admissions form?). Such a structure, more like a dictionary than a textbook, makes it difficult for persons unfamiliar with the project to grasp how the NIS is supposed to work. As a consequence, requirements documents are supplemented (and often supplanted) with a concept of operations (Conops) that describes, usually in the form of scenarios (so-called "use cases"), the operation of the NIS. A Conops for the example HMO system might trace the computer operations that support a patient from visiting a doctor at a neighborhood clinic, through diagnosis of a condition requiring hospitalization, admission and treatment at the hospital, discharge, and follow-up visits to the original clinic. Other scenarios in the Conops might include home monitoring of chronic conditions, emergency room visits, and so forth. The existence of two documents covering the same ground raises the possibility of inconsistencies. When they occur, it is usually the Conops that governs, because the Conops is the document typically read (and understood) by the sponsors of the project.

Review and approval of system requirements documents may involve substantial organizational interaction and compromise when once-independent systems are networked and required to support overall organizational (as opposed to specific departmental) objectives. The compromises can be driven more by organizational dynamics than by technical factors, a situation that may lead to a failure to meet basic objectives later on. That risk is heightened in the case of the trustworthiness requirements, owing (as is discussed below) to the difficulty of expressing such requirements and compounded by the difficulty of predicting the consequences of requiring certain features. In the case of the HMO system, for example, advocates for consumer telemedicine might insist on home computer access to the network in ways that are incompatible with maintaining even minimal medical records secrecy in the face of typical hackers. Anticipating and dealing with such a problem require predicting what sorts of attacks could be mounted, what defenses might be available in COTS products, and how attacks will propagate through an NIS whose detailed design might not be known for several years. Making the worst-case assumption (i.e., all COTS products are completely vulnerable and all defenses must be mounted through the locally developed software of the NIS) will likely lead to unacceptable development costs. Similar situations arise for other dimensions of trustworthiness, such as data integrity or availability.

Notation and Style

Requirements documents are written first in ordinary English, which is notorious for imprecision and ambiguity. Most industrial developers do not use even semiformal specification notations, such as the SCR/A7 tabular technique (Heninger, 1980). The principal reason for using natural language (in addition to the cynical observation that without ambiguity there can be no consensus) is that, despite significant R&D investment in the 1970s (Ross, 1977), no notation for system-level requirements has shown sufficiently commanding advantages to achieve dominant acceptance.

Finally, many—if not most—software developers are forced to lead "unexamined lives." The demand for their services is so great that they must move from one project to the next without an opportunity for reflection or consideration of alternatives to the approaches they used before. The paradoxical result of this situation is that the process of developing software, which has had revolutionary impact on many aspects of society and technology, is itself quite slow to change.

One common strategy for coping with the problems inherent in natural language is to divide the requirements into two classes: criteria for success (often called "objectives" or "goals") and criteria for failure (sometimes called "absolute requirements"). The criteria for success can be a matter of degree: situations where "more is better" without clear cutoff points. The criteria for failure are absolute—conditions, such as causing a fatality, that render success in other areas irrelevant. In the HMO example, a criterion for success might be the time needed to transfer a medical record from the hospital to an outpatient facility—quicker is better, but unless some very unlikely delays are experienced, the system is acceptable. A criterion for failure might be inaccessibility of information about a patient's drug allergies. If the patient dies from an allergic reaction that could have been prevented by the timely delivery of drug allergy data, then nothing else the system has done right (such as the smoothness of admission, proper assignment of diagnostic codes, or the correct interfacing with the insurance carrier) really matters.
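One way to picture the two classes is as a two-stage evaluation: absolute requirements act as gates, and objectives are scored only if every gate is passed. The sketch below is a minimal illustration under that reading. The field names, the 60-second reference point, and the scoring formula are hypothetical and are not taken from any requirements notation.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    allergy_data_accessible: bool   # criterion for failure if False
    record_transfer_seconds: float  # criterion for success: quicker is better

def evaluate(obs, reference_seconds=60.0):
    """Return ("fail", 0.0) if an absolute requirement is violated,
    otherwise ("ok", score), where a higher score is better."""
    if not obs.allergy_data_accessible:
        # Nothing else the system did right matters in this case.
        return ("fail", 0.0)
    score = reference_seconds / (reference_seconds + obs.record_transfer_seconds)
    return ("ok", score)

if __name__ == "__main__":
    print(evaluate(Observation(True, 5.0)))
    print(evaluate(Observation(False, 1.0)))
```

The asymmetry is deliberate: the score has no cutoff point, while a single violated gate discards the score entirely.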

It is often posited that requirements should state what a particular criterion is but not how that criterion should be achieved. In real-world systems development, this dictum can lead to unnecessarily convoluted and indirect formulations of requirements. The issue is illustrated by turning to building codes, which are a kind of requirements document. Building codes distinguish between performance specifications and design specifications. A performance specification states, "Interior walls should resist heat of x degrees for y minutes." A design specification states, "Interior walls should use 5/8-inch Type X sheetrock." Performance specifications leave more room for innovation, but determining whether they have been satisfied is more difficult. Design specifications tend to freeze the development of technology by closing the market to innovations, but it is a simple matter to determine whether any given design specification has been fulfilled. More realistic guidance for what belongs in a requirements document is the following: If it defines either failure or success, it belongs in the requirements document, no matter how specific or detailed it is.


72 trust in cyberspace

    











A distinction is sometimes made between functional requirements and nonfunctional requirements. When this distinction is made, functional requirements are concerned with services that the system should provide and are usually stated in terms of the system's interfaces; nonfunctional requirements define constraints on the development process, the structure of the system, or resources used during execution (Sommerville, 1996). For example, a description of expected system outputs in response to various inputs would be considered a functional requirement. Stipulations that structured design be employed during system development, that average system response time be bounded by some value, or that the system be safe or secure exemplify nonfunctional requirements.

Nonfunctional requirements concerning execution theoretically can be translated into functional requirements. Doing that translation requires knowledge of system structure and internals. The resulting inferred functional requirements may concern internal system interfaces that not only are unmentioned in the original functional requirements but also may not yet be known. Moreover, performing the translation invariably will involve transforming informal notions, such as "secure," "reliable," or "safe," into precise requirements that can be imposed on the internals and interfaces of individual modules. Formalizing informal properties at all and decomposing systemwide global properties into properties that must be satisfied by individual components are technically very challenging tasks—tasks often beyond the state of the art (Abadi and Lamport, 1993; McLean, 1994).

Where to Focus Effort in Requirements Analysis and Documentation

The process of requirements analysis is complicated by the fact that any NIS is part of some larger system with which it interacts. An understanding of the application domain itself and mastery of a variety of engineering disciplines other than software engineering may be necessary to perform requirements analysis for an NIS. Identification of system vulnerabilities is one process for which a broad understanding of the larger system context (including users, operators, and the physical environment) is particularly important. Techniques have been developed to deal with some of these issues. Modeling techniques, such as structured analysis (Constantine and Yourdon, 1979), have been developed for constructing system descriptions that can be analyzed and reviewed by customers. Rapid prototyping tools (Tanik et al., 1989) offer a means to answer specific questions about the requirements for a new system, and prototyping is today a popular way to determine user interface requirements. Systematic techniques have been developed for determining application requirements by either interviewing application experts or observing the actions of potential users of the system (Potts et al., 1994).

Interviews conducted in the 1970s with experienced project managers revealed their skepticism about making significant investments in system-level requirements documents (Honeywell Corporation, 1975). Those veterans of large-scale aerospace and defense projects believed that any significant efforts regarding requirements should be directed to the level of subsystems or components. They argued that system-level requirements documents were seldom consulted after detailed component-level requirements were written. Change—sometimes significant change—in system-level requirements was quite common and rendered obsolete a system-level requirements document.

Changes in requirements originate from a variety of sources:

• The outside environment may change—the example HMO could merge, restructure, or be affected by new statutory or regulatory forces.

• The advent of new technology could generate a desire for the enhanced capability that the technology provides. This factor would be amplified for the HMO's NIS by the current rapid development pace of Internet-related technology (so-called Internet time) and the false perception that components and features can be added to an NIS with relative ease.

Requirements errors are the most expensive to fix, because they typically are not found until significant resources have been invested in system design, implementation, and, in some cases, testing and deployment. The high cost of repairing such errors would then justify expending additional resources on systems requirements analysis and documentation. But that argument is incomplete, for it presumes that the additional expenditures could prevent such errors. Published (Glass, 1981)3 and unpublished (Honeywell Corporation, 1975) studies of requirements errors indicate that errors of omission are the most common. Experienced program managers, who have internalized the experience of unpleasant surprises resulting from combinations of inputs and internal states (or other phenomena that were thought to be impossible), understand that no amount of effort is likely to produce a complete requirements document.

Resources expended in requirements analysis and documentation are, nevertheless, usually well spent. The activity helps a system's developers to better understand the problem they are attacking. Design and coding decisions are thus delayed until a clearer picture of needs and constraints has emerged. It is not the documentation but the insight that is the important work product. Conceivably, other techniques could be developed for acquiring this insight. However, systems requirements documents serve also for communication within a project team as well as with customers and suppliers; any alternative technique would have to address this need as well.

3 This reference contains the classic "Reason for Error" entry in a trouble report: "Insufficient brain power applied during design."

Doing a bad job at requirements analysis actually can have harmful long-term repercussions for a development effort. Requirements analysis invariably goes astray when analysts are insufficiently familiar with the anticipated uses of the system being contemplated or with the intended implementation technology. It also can go astray when analysts become grandiose and formulate requirements far in excess of what is actually needed. Finally, inevitable changes in context and technology mean that requirements analysis and documentation should be an ongoing activity. To the extent possible, requirements should be determined at the outset of development and updated as changes occur during development. In practice, requirements analysis and documentation mostly occur early in the process.

Top-level Design

The trustworthiness of a system depends critically on its design. Once the system's requirements and (optionally) the Conops are approved, the next step is development of a top-level design. This document is often called an "architecture" to emphasize just how much detail is being omitted. During development of the top-level design, basic types of technology are selected, the system is divided into components and subsystems, and requirements for each component are defined. This process has been called "programming in the large," to distinguish it from writing code, or "programming in the small" (DeRemer and Kron, 1976).

Components are building blocks for integration, and subsystems are clusters of components that are integrated first as a group, with the resulting assemblage then integrated into the whole. For software that is being developed (as opposed to purchased), the size of a component or subsystem is determined by the number of lines of code, the programming language used, and the complexity of the algorithms involved. A rough rule of thumb is that a component (or "module") is a body of software that can be fully grasped⁴ by one or two programmers. Using the same principle, a subsystem is a body of code that can be fully grasped by a team of three to five programmers, which happens also to be the maximum size group that can be supervised effectively by a team leader.

⁴That is, some member of the team can answer any question about the subsystem; it is not necessary (or even desirable) that every member of the team be able to answer every question.

There exist no generally accepted notations for top-level design. Most designs are described using diagrams. Such diagrams rarely have precisely defined semantics, so they are not always helpful for determining whether a top-level design includes all the necessary functions or satisfies all of its requirements.

A dependency analysis (Parnas, 1974) should be performed on the top-level design, where a dependency is defined to exist between components A and B if the correct operation of A depends on the correct operation of B. The results of a dependency analysis are captured in a dependency diagram.⁵ Experienced designers attempt to move functions among components to eliminate cycles in the dependency diagram. In a cycle, the correctness of one component depends directly or indirectly on the correctness of another, and the correctness of the second depends directly or indirectly on the correctness of the first, thereby forming a circular relationship. Where a cycle exists, all components in the cycle must be integrated and tested as a unit. In the extreme case, so-called "big bang" integration, all components are integrated at one time; that process seldom has a positive outcome. At present there is no scientific foundation for determining, analyzing, or changing dependency relationships among components in large-scale systems.
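The grouping rule in the preceding paragraph (components on a common cycle must be integrated and tested as a unit) can be illustrated with a short sketch. It treats the dependency diagram as a directed graph and finds its strongly connected components using Tarjan's algorithm, which is one standard way to do this and not a method prescribed by this chapter. The component names are hypothetical, chosen only to echo the HMO example.

```python
def integration_units(depends_on):
    """Group components into units that must be integrated and tested together.

    depends_on maps each component to the components whose correct
    operation it relies on. Components that lie on a common dependency
    cycle end up in the same unit (a strongly connected component).
    """
    # Collect every component mentioned as a key or as a dependency.
    nodes = set(depends_on)
    for deps in depends_on.values():
        nodes.update(deps)

    index = {}      # discovery order of each visited component
    lowlink = {}    # smallest discovery index reachable from the component
    stack, on_stack = [], set()
    units = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in depends_on.get(node, ()):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:
            # node is the root of a strongly connected component
            unit = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                unit.add(member)
                if member == node:
                    break
            units.append(frozenset(unit))

    for node in sorted(nodes):
        if node not in index:
            visit(node)
    return units


def cycles(depends_on):
    """Return only the units that contain a genuine dependency cycle."""
    return [
        unit for unit in integration_units(depends_on)
        if len(unit) > 1 or any(c in depends_on.get(c, ()) for c in unit)
    ]


# Hypothetical top-level design: records_db and audit_log depend on each other.
hmo = {
    "records_ui": ["records_db", "auth"],
    "records_db": ["audit_log"],
    "audit_log": ["records_db"],
    "auth": [],
}
cyclic_units = cycles(hmo)
```

Here `cyclic_units` contains the single unit made up of `records_db` and `audit_log`, the pair a designer would try to separate by moving functions between components. The sketch says nothing about how dependencies are discovered in the first place, which remains the hard part.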

Many would argue that interface determination and design are the essence of system design (Lampson, 1983). Therefore, an important output of the top-level design activity is precise specifications for the system's interfaces. These specifications define the formats and protocols for interactions between components and subsystems. A rigorous interface description is particularly important when the interface being defined is between subsystems implemented by different teams.⁶ The definition of interfaces and the determination of which interfaces are sufficiently important to warrant control by project management are, like the rest of top-level design, more an art than a science.

⁵As with the top-level design itself, there exist no generally accepted notations for such diagrams, nor do there exist widely used tools to support the development of dependency diagrams.

⁶There is an element of program management lore called Conway's Law whose essence is that the human organization of a software project and the technical organization of the software being produced will be congruent. The law was originally stated as, "If you have four teams working on a compiler, you get a four-pass compiler." A more general formulation is that "a system's structure resembles the organization that produces it" (Raymond and Steele, 1991).


Despite the innovative design concepts that have appeared in the literature in areas such as object-oriented design (Meyer, 1988) and architectural description languages (Garland and Shaw, 1996), still no comprehensive approach to the design and analysis of NISs exists. Important challenges remain in design visualization, in design verification, and in design techniques that accommodate long-term evolution, COTS components, and legacy components, as well as in tool support for the creation and analysis of designs. Among the most critical issues are design verification and design evolution, since assuring that a design will continue to implement the necessary trustworthiness properties, even as the system evolves, is central to building an NIS. Moreover, because top-level design occurs relatively early in the life cycle, detection of defects during the top-level design stage has great leverage.

Perhaps the greatest design challenges concern techniques to compose subsystems in ways that contribute directly to trustworthiness. NISs are typically large and, therefore, they must be developed and deployed incrementally. Significant features are added even after an NIS is first deployed. Thus, there is a need for methods to identify feature interactions, performance bottlenecks, omitted functionality, and critical components in an NIS that is being developed by composition or by accretion.

There exists a widening gap between the needs of software practitioners and the ability to evaluate software technologies for developing moderate- to large-scale systems. To evaluate a technology, benefits and costs must be documented, risks enumerated and assessed, and necessary enhancements or modifications identified and carried out. The expense of building such systems renders infeasible the traditional form of controlled scientific experiment, where the same system is built repeatedly under controlled conditions but using differing approaches. One might, instead, attempt to generalize from the experiences gained in different projects. But to do so and reach a sound conclusion requires understanding what aspects of a system interact with the technology under investigation. Some advantages would probably accrue if software developers simply documented their practices and experiences. This activity, however, is one that few programmers find appealing and few managers have the resources to support.

Critical Components

A critical component is one whose failure would result in an undetected and irrecoverable failure to satisfy a trustworthiness requirement. Experienced designers attempt to produce top-level designs for which the number of components that depend on critical components is not constrained but the critical components themselves depend on as few other components as possible. This strategy achieves two things: it enables developers to use freestanding tests and analyses to build trust in the critical components, and it permits an orderly integration process in which trusted components become available early. Unless the critical components come from vendors with impeccable credentials, development teams generally prefer, wherever feasible, to implement the critical components themselves. That way, all aspects of the design, implementation, and verification of critical components can be strictly controlled. There are two risks in pursuing this approach. One is that the criticality of a component has been overlooked, a danger that is increased by the lack of a scientific basis to assess the criticality of components. A second is that it may not be feasible to implement a critical component in-house or, for a vendor-provided critical component, it may not be possible to obtain sufficient information to be convinced of that component's trustworthiness.⁷
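One way to check a top-level design against the strategy just described is to compute, for each critical component, the full set of components it relies on directly or indirectly; the smaller that set, the more a freestanding test of the critical component can establish. The sketch below is illustrative only. The function names and the component names are assumptions, and deciding which components are critical is left to the designer, since no scientific basis for that judgment exists.

```python
def transitive_dependencies(component, depends_on):
    """Return every component that `component` relies on, directly or indirectly."""
    seen = set()
    pending = list(depends_on.get(component, ()))
    while pending:
        dep = pending.pop()
        if dep not in seen:
            seen.add(dep)
            pending.extend(depends_on.get(dep, ()))
    # A component that sits on a cycle is not reported as its own dependency.
    seen.discard(component)
    return seen


def trust_footprint(critical, depends_on):
    """Map each critical component to the set of components it must trust."""
    return {c: transitive_dependencies(c, depends_on) for c in critical}


# Hypothetical design: many components use access_control,
# but access_control itself relies only on crypto_lib.
design = {
    "records_ui": ["access_control", "records_db"],
    "billing": ["access_control"],
    "access_control": ["crypto_lib"],
    "records_db": ["access_control", "disk_driver"],
    "crypto_lib": [],
    "disk_driver": [],
}
footprint = trust_footprint(["access_control"], design)
```

In this hypothetical design the footprint of `access_control` is just `crypto_lib`, so trust in the critical component can be built by testing and analyzing two components in isolation before anything else is integrated.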

The Integration Plan

Once the basic structure of the system has been established, the integration plan is produced. Ideally, the plan involves two activities:

1. Integration of components into subsystems that reside on single network nodes; and

2. Connection of network nodes into subsystems that perform definable functions and whose behavior can be observed and evaluated, followed by the connection of the subsystems into the final NIS.

The essence of the integration process is progress toward a completely operational system on a step-by-step basis. Observed defects can be localized to the last increment that was integrated: if one build passes its tests and the next build fails its tests, then the most likely sources of difficulty are those components that turned the first build into the second. Working in this manner, the integration team should not have to revisit previously integrated components or subsystems during the integration process. And this process avoids a cycle of "fix and test and fix again" that could continue until time, money, or management patience runs out. Note that for the integration process to be successful, the top-level design must exhibit proper dependency relationships between components. An integration plan thus can serve another purpose: to force the detailed analysis of a top-level design. Top-level designs lacking straightforward integration plans are likely to be ambiguous, incomplete, or just plain wrong.

⁷In the case of a browser, which might well be a critical component in an NIS, this situation is ameliorated by Netscape's recent decision to release the Netscape Navigator source code. A development team now can examine the code and possibly eliminate unwanted functionality.
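The step-by-step argument above can be made concrete with a small sketch. It assumes the top-level design is available as a mapping from each component to the components it depends on, derives an order in which every component is integrated after its dependencies, and reports the component added by the first failing build. The standard-library `graphlib` module (Python 3.9 or later) does the ordering; the helper names, the component names, and the one-component-per-build granularity are all assumptions made for illustration, since there is no accepted theory of what should constitute a build.

```python
from graphlib import CycleError, TopologicalSorter


def build_sequence(depends_on):
    """Return cumulative builds, each adding one component to the previous build."""
    # static_order() yields every component after the components it depends on.
    order = list(TopologicalSorter(depends_on).static_order())
    return [order[: i + 1] for i in range(len(order))]


def localize_defect(builds, passes_tests):
    """Return the component added by the first failing build, or None."""
    for build in builds:
        if not passes_tests(build):
            return build[-1]
    return None


def has_straightforward_plan(depends_on):
    """A design with a dependency cycle admits no one-at-a-time ordering."""
    try:
        build_sequence(depends_on)
        return True
    except CycleError:
        return False


# Hypothetical design and a stand-in test oracle that fails
# whenever the (hypothetically defective) records_db is present.
design = {
    "records_ui": ["access_control", "records_db"],
    "records_db": ["disk_driver"],
    "access_control": [],
    "disk_driver": [],
}
builds = build_sequence(design)
suspect = localize_defect(builds, lambda build: "records_db" not in build)
```

Here `suspect` is `records_db`, the component that turned a passing build into a failing one. A design containing a cycle makes `has_straightforward_plan` return `False`, which mirrors the observation that designs lacking straightforward integration plans deserve closer analysis.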

Integration skills today are developed only through experience. There is essentially no theoretical basis for deciding what should constitute a build, nor has the problem received serious scientific examination. System integration continues to be practiced as a craft that is passed along through apprenticeship. The drift of university computer science research from emphasizing large experimental systems projects (such as Multics, C.mmp, and Berkeley UNIX) toward undertaking smaller engineering efforts is of particular concern. Looking back at the master's and Ph.D. thesis topics at the Massachusetts Institute of Technology (as an example) during the Multics era, it is striking how many concern software that had to be integrated into the larger system in a planned and disciplined manner. The shrinking of this skills base in orderly integration is further exacerbated by the reward system of the personal computer market. Financial benefits flow principally to authors of the freestanding application or component (the so-called "killer app") that attracts large numbers of consumers or is selected for use in information systems assembled from COTS components. This latter case involves a different set of skills from those required to design, implement, and integrate a large system from scratch.

Project Structure, Standards, and Process

Other branches of engineering rely heavily on controlling the development process to ensure the quality of engineering artifacts. The Software Engineering Institute's Capability Maturity Model (CMM) is a step in that direction for software design and development (see Box 3.1). As with requirements definition and analysis, there is considerable anecdotal evidence and some experimental evidence that having a systematic process in place contributes to the quality of software systems that an organization develops. There is, however, little evidence that any one process can be distinguished from another, nor is there evidence that different characteristics of development processes are correlated with product quality.

Rigorous, repeatable processes are sometimes thought to result when software development standards are imposed on organizations. Such standards typically prescribe overall process structure, documents to be produced, the order of events, techniques to be used, and so on. A recent study found 250 different standards that apply to the engineering of software, yet the authors of the study found that the standards were largely ineffective and concluded that software technology is too immature to standardize (Pfleeger et al., 1994).


BOX 3.1

The SEI Capability Maturity Model for Software

The Software Engineering Institute's (SEI) Capability Maturity Model (CMM) for software was first introduced in the late 1980s. The current version, version 1.1, was introduced in 1993.¹ According to the SEI: "The Capability Maturity Model for Software (SW-CMM or CMM) is a model for judging the maturity of the software processes of an organization and for identifying the key practices that are required to increase the maturity of the processes. The SW-CMM is developed by the software community with stewardship by the SEI" (Paulk et al., 1993).

The CMM defines a maturity framework that has five levels: (1) initial, (2) repeatable, (3) defined, (4) managed, and (5) optimizing. The five levels are carefully defined and based on key process areas (KPAs). The KPAs are, as the name suggests, the most important aspects of software processes. At CMM level 2, for example, requirements management is a KPA.

It is important to understand that the CMM is intended only to measure maturity. It is not a software development process standard or a mechanism for assessing specific software development techniques. It also is not a means of achieving high levels of either productivity or software quality (although some users report that both tend to improve after higher CMM levels have been achieved). Rather, the CMM aims to assess the ability of an organization to develop software in a repeatable and predictable way. Thus, an organization possessing a high CMM level will not necessarily develop software more quickly or of better quality than an organization having a lower level. The higher-ranked organization will, however, develop software in a more predictable way and will be able to do so repeatedly.

After a careful analysis, an assessed organization is rated at one of the five levels of the CMM framework. Attainment of some specified minimum CMM level is sometimes required to bid on certain government contracts. (The practice seems to be becoming more common within the Department of Defense.) Whether having such a minimum CMM level ensures higher-quality work is not clear, but it has succeeded in making corporate management aware of the importance of software development processes.

A second benefit of the CMM has been reported by organizations seeking to improve their ratings. The staff of such organizations become more conscious of the software technology they are using and how it can be improved. Esprit de corps tends to be generated when the entire staff is involved in a single process-improvement goal.

Although there is no specific intention that higher CMM rankings will be associated with higher quality or productivity, there is some evidence that more mature processes do yield those advantages. Watts and his colleagues document a variety of benefits and important lessons they observed at Hughes Aircraft after moving from CMM level 2 to level 3 (Watts et al., 1991). Dion reports increased productivity and large cost savings at Raytheon after it moved from level 2 to level 3 (Dion, 1993). And Motorola, which observed the development performance of 34 different projects with roughly equal numbers of projects rated at each CMM level (Diaz and Sligo, 1997), has reported reduced cycle time, reduced defect rates, and improved productivity as CMM level increased.

However, a recent paper by McGarry, Burke, and Decker (1997) is less favorable in discussing the correlation between CMM level and software development metrics based on data from more than 90 projects within one organization (a part of Computer Sciences Corporation). The results of the study were mixed, and in most cases improvements were not correlated with CMM level.

Impacts of process improvement have also been surveyed. Brodman and Johnson (1996) report survey data in the form of return on investment to industry. Their results document a wide variety of benefits associated with achieving higher CMM levels. Lawlis, Flowe, and Thordahl (1995) investigated the effect of CMM level on software development cost and schedule. They found a positive correlation between CMM level and cost and schedule performance. Another survey reporting positive results of using the CMM has been published by Herbsleb and Goldenson (1996).

The actual CMM assessment process has also been studied. Kitson and Masters (1993) identify which KPAs are major factors affecting CMM ratings, thereby suggesting areas of weakness in industrial software practice.

Although many successes of the CMM have been reported, the CMM itself has also been criticized. Bollinger and McGowan (1991) raised a number of important questions about the practical benefits of an initial version of the CMM in the context of government contracting. Their concerns were mainly with the relative simplicity of the assessment process and the fact that CMM levels would be used for rating government contractors. The criticisms of Bollinger and McGowan were addressed by the developers of the CMM in Watts and Curtis (1991). More recently, Fayad and Laitinen (1997) criticized aspects of the CMM ranging from the cost of assessment to the fact that a single assessment scheme is used for organizations of all sizes. Although these criticisms have merit, they do not appear to be fundamental flaws in the CMM concept.

¹The CMM of the Software Engineering Institute is available online at <http://www.sei.cmu.edu/technology/cmm.html>.

Barriers to Acceptance of New Software Technologies

The high costs associated with adopting new software technologies make managers less likely to do so. The concern is that, despite claimed benefits, problems might arise in using the new technology and these problems might lead to missed deadlines or budget overruns. Sticking with technology that has been used before—the conservative course—reduces the risks.

Managers' fears are well founded in many cases, as many new software technologies do not work when tried on industrial-scale problems. Things that work well in the laboratory are not guaranteed to work well in practice. All too often, laboratory assessments of software technology are based on experiences with a few small examples. The need to investigate the scaling of a new technology is common to all branches of engineering but, as already discussed, the expense of performing large-scale software experiments makes such experiments infrequent. To assess a new software technology, the technology should be observed in full-scale development efforts. Any research program that aspires to relevance should include plans for compelling demonstrations that the resultant technology is applicable to industrial-scale problems and that its benefits justify the costs of learning and applying it.

Many new software technologies are also tool-intensive. They try to improve software development practices by replacing or supplementing human effort. Testing an interactive application that employs a graphical user interface, for example, requires the manipulation of complex software structures, the management of extensive detail, and the application of sophisticated algorithms. All of this could be undertaken by hand, but having computers perform as much of the work as possible is preferable. Yet software tools are notoriously expensive to develop because, although the essence of a new idea might be relatively simple to implement, providing all the basic services that are needed for practical use is neither simple nor inexpensive. In addition, learning to use new software tools takes time. The result is one more barrier to the success of any new software technology.

Findings

1. Although achieving connectivity and providing basic services are relatively easy, providing specialized services—especially trustworthy ones—is much more difficult and is complicated by the decentralized and asynchronous nature of NISs.

2. Project management, a long-standing challenge in software development, becomes even more problematic in the context of NISs because of their large and complex nature and the continual software changes that can erode trustworthiness.

3. Whereas a large software system cannot be developed defect free, it is possible to improve the trustworthiness of such a system by anticipating and targeting vulnerabilities. But to determine, analyze, and, most importantly, prioritize these vulnerabilities requires a good understanding of how the software interacts with the other elements of the larger system.

4. It seems clear from anecdotal evidence that using any methodical and tested technique for the capture and documentation of requirements—no matter what its shortcomings—is better than launching directly into design and implementation.

5. No notation for system-level requirements has shown sufficiently commanding advantages to become dominant.

6. System-level trustworthiness requirements typically are first characterized informally. The transformation of the informal notions into precise requirements that can be imposed on system components is difficult and often beyond the current state of the art.

7. NISs generally are developed and deployed incrementally. Thus, techniques are needed to compose subsystems in ways that contribute directly to trustworthiness.

8. There exists a widening gap between the needs of software practitioners and the problems that are being attacked by the academic research community. In most academic computer science research today, researchers are not confronting problems related to large-scale integration and students do not develop the skills and intuition necessary to develop software that not only works but also works in the context of software written by others.

9. Although systematic processes may contribute to the quality of software systems, specific processes or standards that accomplish this goal have not been demonstrated.

10. Since the investment of resources needed for a large software development project is substantial, managers are reluctant to embrace new software technologies because they entail greater risks.

Building and Acquiring Components

Component-level Requirements

It is useful to distinguish between two kinds of component-level requirements: allocated or traceable requirements, which devolve directly from system requirements, and derived requirements, which are consequences of the system architecture. In the HMO system, for example, there might be an overall trustworthiness requirement that medical records must be available 24 hours a day, 7 days a week. One way to meet that need would be to replicate records on two different servers; the data management software then has the derived requirement of ensuring the consistency of the data on the two servers. The requirement is "derived" because it results not so much from an interpretation or clarification of the original trustworthiness requirement but rather from the architectural strategy, replication, being used to satisfy the trustworthiness requirement.

A common practice is to insist that all requirements at the component level be testable. That is, each requirement must be accompanied by some experiment for assessing whether that requirement is satisfied. These tests must be chosen with care because, in actual practice, cost and schedule pressures drive a development team toward making sure their component passes the tests as a first priority. If a test is not chosen carefully and described unambiguously, then a component that does not satisfy the spirit or even the letter of the actual requirements statement might be deemed acceptable.

The relationship between the requirements, which capture intent, and a test, which determines acceptance, is especially problematic for nonfunctional requirements in support of trustworthiness concerns. Continuing with the HMO medical record example, the test may check that the two copies of the medical record are synchronized within so many seconds of a change having been made, that the failure of the primary server is detected by the switchover logic within so many seconds, that switchover is accomplished in so many seconds, and so on. The problem is that the list of tests is not equivalent to the requirement being tested (i.e., availability 24 hours a day, 7 days a week). For example, the tests do not take into account simultaneous or cascading failures (e.g., primary fails while secondary is running backup, secondary fails immediately after switchover, synchronization request comes in at just the wrong time as switchover is being initiated, and so on). There are thus circumstances in which the component or subsystem will pass its tests but fail to satisfy the intent of the requirement.
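The gap between a list of tests and the requirement behind it can be shown with a toy model. The sketch below is purely illustrative: the class names, the synchronous replication scheme, and the switchover rule are all assumptions, and timing (the "within so many seconds" part of each test) is deliberately left out. Two tests in the style described above pass, yet a cascading failure that neither test exercises leaves the records unavailable.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.records = {}


class ReplicatedRecords:
    """Toy model of medical records replicated on a primary and a secondary."""

    def __init__(self):
        self.primary = Server("primary")
        self.secondary = Server("secondary")

    def write(self, patient_id, record):
        # Synchronous replication: update every server that is currently up.
        for server in (self.primary, self.secondary):
            if server.up:
                server.records[patient_id] = record

    def switch_over(self):
        # Promote the secondary when the primary is down and the secondary is up.
        if not self.primary.up and self.secondary.up:
            self.primary, self.secondary = self.secondary, self.primary

    def read(self, patient_id):
        self.switch_over()
        if not self.primary.up:
            raise RuntimeError("records unavailable")
        return self.primary.records[patient_id]


def test_copies_synchronized():
    store = ReplicatedRecords()
    store.write("p1", "x-ray")
    return store.primary.records == store.secondary.records


def test_switchover_on_primary_failure():
    store = ReplicatedRecords()
    store.write("p1", "x-ray")
    store.primary.up = False
    return store.read("p1") == "x-ray"


def scenario_cascading_failure():
    """Secondary fails right after switchover; neither test above covers this."""
    store = ReplicatedRecords()
    store.write("p1", "x-ray")
    store.primary.up = False
    store.switch_over()
    store.primary.up = False  # the newly promoted server now fails too
    try:
        store.read("p1")
        return True
    except RuntimeError:
        return False
```

Both test functions return `True`, so the component would be accepted, while `scenario_cascading_failure()` returns `False`: the records cannot be read, and the availability requirement is not met.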

Detailed, component-level requirements for user interfaces are difficult to write. So-called storyboards, which show display configurations for various inputs, outputs, and states of the system, can be hard to follow. However, the popularity of graphical user interfaces has led to the development of tools that enable designers to rapidly prototype user interfaces. Generally speaking, prototyping is sensible in requirements analysis and can even serve as an executable requirements document. But the cost of building prototypes can be high, thereby preempting other higher-payoff forms of requirements analysis. For example, devoting too much effort to prototyping a user interface can lead to software in which an elaborate user interface surrounds a poorly thought-out core.

Component Design and Implementation

To project managers, component design and implementation are the least visible of the phases. A large number of activities are proceeding in parallel, the staff are focused on their individual tasks (perhaps ignoring the global view), and the tasks themselves are highly technical. All conspire to make it extremely difficult to measure progress or even to make anecdotal observations of status. While there is an extensive literature on the problem of demonstrating that a component satisfies its specification, there is considerably less literature devoted to determining whether a component-level specification properly reflects or contributes toward satisfying system requirements.

For code written in traditional languages (such as C) running on a single node, and interacting in limited and controlled ways with users and other software, the craft of programming has evolved into a generally accepted process. As practiced within the aerospace, defense, and other large-scale computing system development communities (but not necessarily in commercial practice) over the last two decades, that process consists of roughly the following steps:

• Review the component requirements document for sanity.

• Prepare a component design in some notation, often called "pseudocode." (Pseudocode is usually a mixture of programming language statements and some less detailed notation, not excluding natural language.)

• Conduct an organized inspection of the component design ("structured walkthrough") with an emphasis on the logic flow.

• Write component test scripts or test drivers to exercise the component after it has been written.

• Write the component in some appropriate higher-level language ("source code").⁸

• Conduct a structured walkthrough of the source code.

• Compile the component into executable form.

• Exercise the component ("unit test" or "level 1 testing") using the test scripts or drivers.

• Release the component to the integration process.

⁸This and the preceding step are often reversed, and the test drivers are not written until after the component is. The order given in the text is preferable because the detailed design and coding of a test driver force implementers to rigorously analyze and understand component-level requirements.

This process, and ones like it, have been synthesized from the wreckage of expensive failures, and a significant percentage, if not a majority, of experienced practitioners would caution that any of these steps is omitted at one's peril. One variation is to repeat the cycle frequently, making very small changes at each iteration. This approach was used successfully in the Multics project (Clingen and van Vleck, 1978) and has long been part of the program management lore in high-consequence real-time systems.
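The ordering of the middle steps, with the test driver written before the component, can be sketched in miniature. Everything specific below is hypothetical: the component (`normalize_record_id`), its requirement (record identifiers are at most eight ASCII digits, zero-padded to eight), and the use of Python's `unittest` module as the driver framework. The point is only that writing the driver first forces each clause of the requirement to be stated precisely.

```python
import unittest


# Step 1: the test driver, written from the (hypothetical) component
# requirements before the component itself exists.
class RecordIdDriver(unittest.TestCase):
    def test_pads_short_ids(self):
        self.assertEqual(normalize_record_id("42"), "00000042")

    def test_ignores_surrounding_whitespace(self):
        self.assertEqual(normalize_record_id(" 7 "), "00000007")

    def test_rejects_non_digits(self):
        with self.assertRaises(ValueError):
            normalize_record_id("12a4")

    def test_rejects_overlong_ids(self):
        with self.assertRaises(ValueError):
            normalize_record_id("123456789")


# Step 2: the component, written against the same requirements.
def normalize_record_id(raw):
    text = raw.strip()
    if not (text.isascii() and text.isdigit()) or len(text) > 8:
        raise ValueError(f"invalid record id: {raw!r}")
    return text.zfill(8)


# Step 3: exercise the component ("unit test") using the driver.
def run_unit_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RecordIdDriver)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Calling `run_unit_tests()` runs the driver and returns whether the component passed. The inspections and walkthroughs in the process have no counterpart in code and are not modeled here.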

Today's turnover rate among software personnel somewhat reduces the effectiveness of the component-development process just described. Software development is still typically learned through apprenticeship. Yet personnel shortages, the potential financial rewards and short life cycles of start-up companies, and the deterioration of corporate loyalty as a result of downsizing and restructuring make it less likely that a junior practitioner will witness a complete project life cycle, much less several projects conducted in the same organization. Ultimately, this will impede the development of an adequate skill base in critical areas, like synthesis and analysis of design, integration, or structuring of development organizations.

The above component development process is predicated on starting with a modular design. Achieving modularity is intellectually challenging and costly; it requires management and design discipline. In addition, modular systems often are larger and slower. So there is a tension between system modularity and cost (along a variety of cost dimensions); it can be hard to know when system modularity is needed and when it is not worth the cost. Moreover, certain NIS building blocks—mobile code and Web browsers with helper applications, for example—compromise the advantages of modular design by permitting unrestricted interactions between different software components.

Programming Languages

Modern programming languages, such as C++, Java, and Ada, include compile-time checks to detect a wide range of possible errors. The checks are based on declaring or inferring a type for each object (i.e., variables and procedures) and analyzing the program to establish that objects are used in ways consistent with their types. This kind of automated support is especially helpful for detecting the kinds of errors (such as passing arguments that overflow a corresponding parameter) so successfully used by attackers of operating system and network software. Ever more expressive type systems are a continuing theme in programming language research, with considerable attention being directed recently at the representation of security properties using types (Digital Equipment Corporation, 1997). Success would mean that compile-time checks could play an even bigger role in supporting trustworthiness properties.
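The idea of attaching a bound to a type, so that an oversized argument is rejected rather than silently overflowing its parameter, can be sketched in Python. Two assumptions apply. First, the names `BoundedText` and `set_username` and the 16-character limit are hypothetical. Second, Python enforces the bound at run time; the compile-time checking described above is approximated here only if a separate static checker (for example, mypy) is run over the annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundedText:
    """A string type whose maximum length is part of its definition (hypothetical)."""

    value: str
    max_len: int = 16  # assumed bound, chosen for illustration

    def __post_init__(self) -> None:
        # Reject any argument that would overflow the declared bound.
        if len(self.value) > self.max_len:
            raise ValueError(
                f"argument of length {len(self.value)} exceeds bound {self.max_len}"
            )


def set_username(name: BoundedText) -> str:
    """Accepts only BoundedText, so every caller must pass a length-checked value."""
    return f"user={name.value}"
```

A call such as `set_username(BoundedText("alice"))` succeeds, whereas constructing `BoundedText("x" * 17)` raises `ValueError`. A static checker would additionally flag `set_username("alice")` as a type error before the program runs.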

Modern programming languages also contain features to support modularity and component integration. Ada, for example, provides type checking across separate compilations; Ada also integrates component linking with compilation, so that statements whose validity depends on the order in which compilation occurs can be checked. Other modern languages provide equivalent features. At the other end of the spectrum, scripting languages (Ousterhout, 1998) (such as Visual Basic and TCL) are today attracting ever-larger user communities. These languages are typically typeless and designed to facilitate gluing together software components. The preponderance of COTS and legacy components in a typical networked information system assures the relevance of scripting languages to the enterprise.

Also of interest to NIS developers are very-high-level languages and domain-specific languages, which provide far-higher-level programming abstractions than traditional programming languages do. The presence of the higher-level abstractions enables rapid development of smaller, albeit often less efficient, programs. Moreover, programming with abstractions that have rich semantics and powerful operations reduces the opportunity for programming errors and permits more sophisticated compile-time checking.

There is much anecdotal and little hard, experimental evidence concerning whether the choice of programming language can enhance trustworthiness. One report (CSTB, 1997) looked for hard evidence but found essentially none. Further study is needed and, if undertaken, could be used to inform research directions in the programming language community.

Systematic Reuse

Systematic reuse refers to the design and implementation of components specifically intended for instantiation in differing systems. It is one of the most sought-after goals in software research, because it offers the potential for substantial software productivity improvements.9 Moreover, components intended for reuse can be more intensely scrutinized, since the higher cost of analysis can be amortized over multiple uses. The current economic emphasis on short-term results, however, serves to inhibit the acceptance of any method of systematic reuse that requires (as appears inevitable) up-front investment.

9It is worth noting that the infamous year 2000 problem would be far easier to address if a small number of date packages had been reused in date-sensitive applications. There would still be the problem of database conversion, though, once the date format is changed.
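The point about reusing a small number of date packages can be made concrete with a sketch of one shared date component. Everything here is an assumption made for illustration: the function names, the `YYMMDD` input format, and the pivot of 50 used to expand two-digit years.

```python
from datetime import date

PIVOT_YEAR = 50  # assumed window: 00-49 map to 20xx, 50-99 map to 19xx


def expand_year(two_digit_year: int, pivot: int = PIVOT_YEAR) -> int:
    """Expand a two-digit year to four digits using a single shared rule."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year")
    century = 2000 if two_digit_year < pivot else 1900
    return century + two_digit_year


def parse_yymmdd(text: str) -> date:
    """Parse a 'YYMMDD' string into a full date via the shared windowing rule."""
    if len(text) != 6 or not text.isdigit():
        raise ValueError("expected six digits in YYMMDD form")
    return date(expand_year(int(text[0:2])), int(text[2:4]), int(text[4:6]))
```

If every date-sensitive application called a component like `parse_yymmdd`, a change to the windowing rule would be made and analyzed in one place. Stored data in the old format would still need conversion.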

Certain commercial vendors, such as SAP, whose R/3 enterprise-applications software (Hernandez, 1997) has captured one-third of the worldwide client-server market for business systems, claim to have solved the systematic reuse problem in a cost-effective manner for large classes of applications. R/3 is an integrated software package that includes interwoven reusable components for all the major functions of a commercial enterprise, from order entry and accounting through manufacturing and human resources. In addition, R/3 is built to use a COTS operating system along with COTS database management systems, browsers, and user-interface software. Other commercially driven attempts at providing components or infrastructure for systematic reuse include the C++ standard template library (STL) (Musser and Saini, 1996), common object request broker architecture (CORBA),10 component object model (COM) (Microsoft Corporation and Digital Equipment Corporation, 1995), distributed component object model (DCOM) (Brown and Kindel, 1998), and JavaBeans (Hamilton, 1997).

There is always a tension between the pressure to innovate and the stability associated with components intended for reuse. That tension is particularly acute for COTS components, for which the addition of new features and time to market are such strong forces. New features are usually accompanied by new bugs; careful analysis of components enhances stability but delays product release. Moreover, when bugs in COTS components do get fixed, the fixes are often bundled in a release that also introduces new features. The COTS component user must then choose between living with a bug and migrating to a release that may be less stable due to new bugs.

Commercial Off-the-Shelf Software

The Changing Role of COTS Software

Success for a COTS software component often leads to deployment in settings never intended. A component might start as an interesting piece of software at the periphery of trustworthiness concerns and ultimately become a critical component in some NIS. In 1994, it would have been absurd to suggest that a bug in a Web browser could kill someone. Yet in the HMO system we are using as an example, a Web-based telemedicine application could allow precisely that outcome. That software can be used for tasks not envisioned by its developers is a double-edged sword, especially if COTS development practices cause developers to compromise trustworthiness for other requirements.

10CORBA 2.0 was introduced by the Object Management Group (OMG) in December 1994. Additional information is available online at <http://www.omg.org>.

COTS software development practices in the personal computer (PC) era arose in a technical and economic environment that tended to ignore trustworthiness. PC operating systems and applications ran on isolated desktops; the consequences of failure were limited to destruction of perhaps valuable, but certainly not life-critical, data. Failures had no way of propagating to other machines. Therefore, an organizational and programming culture arose that was very accepting of errors and malfunctions, epitomized by the notorious shrink-wrap license whose primary feature is a total disclaimer of responsibility by the developer.

This climate was amplified by economic conditions of the early PC era. Software was purchased separately rather than being bundled with a leased computer, as in the mainframe era. Consequently, there was less financial leverage for dissatisfied customers to affect vendor, and therefore developer, attitudes. A customer's financial leverage was limited to consuming vendor resources in calls to telephone help-lines, which could be ignored by inept or uncaring vendors,11 and refusing to purchase other software or the next revision of the malfunctioning product from that vendor. The latter option is reduced by the diminishing diversity of the marketplace, the need to exchange data with other users, and the investment the customer may have in data that can be processed only by the product in question.

As the PC market exploded, visionary entrepreneurs realized that market share was the dominant factor in corporate survival and personal financial success. Market share is heavily influenced by market entry time. Specifically, the first product to reach a market has the greatest opportunity both to gain market share and to establish the de facto standard upon which the software industry currently operates. Another influence on market share is the richness of features and user interface, which impresses users and reviewers in the technical press. Something must be sacrificed, and it has been trustworthiness aspects such as robustness and security.

One way to reduce time to market is to reduce the time spent in testing. By making early releases (beta test versions) available to interested users and by freely distributing incremental updates to production software, vendors enlist the help of the user community in finding errors. From a societal perspective, the PC software industry's attitude toward errors was relatively unimportant, since the worst consequence of PC software errors was the time lost by individuals trying to reconstruct destroyed work or otherwise get their PCs to do their bidding. But today, COTS software is moving toward being a business of providing components (and possibly critical components) for NISs that can be high consequence, either because they were explicitly designed that way or because people assign to them a level of trust that their designers never intended.

11This situation is changing. A vendor, albeit of hardware, has recently settled a class action suit requiring an increase in warranty and support coverage (Manes, 1998). Similar actions against software vendors are likely to follow from this precedent.

General Problems with COTS Components

The use of COTS components presents special problems for the responsible developer of an NIS. COTS software typically is full of features that vary in quality and are a source of complexity. The complexity, in turn, means that specifications for COTS components are likely to be incomplete, and users of those components will discover features by experimentation. Being conservative in exploiting these discoveries is prudent—semantics not documented in an accompanying written specification may or may not have been intended and consequently may or may not persist across releases. Moreover, wise developers learn to avoid the more complex features of COTS components because these are the most likely to exhibit surprising behavior and their behavior is least likely to remain stable across releases. When these features cannot be avoided, encapsulating components with wrappers, effectively narrowing their interfaces, can protect against undesirable behaviors.
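The wrapper technique mentioned above can be sketched as follows. The class `CotsViewer` is a hypothetical stand-in for a feature-rich COTS component; its `run_macros` and `plugin` options, the wrapper name, and the size limit are all assumptions made for illustration.

```python
from typing import Optional


class CotsViewer:
    """Stand-in for a feature-rich COTS component (hypothetical)."""

    def render(self, text: str, run_macros: bool = True,
               plugin: Optional[str] = None) -> str:
        out = text.upper() if run_macros else text
        return f"[{plugin}] {out}" if plugin else out


class ViewerWrapper:
    """Narrowed interface: exposes only plain rendering, with input checks."""

    MAX_CHARS = 1000  # assumed limit, chosen for illustration

    def __init__(self, component: CotsViewer) -> None:
        self._component = component

    def render_plain(self, text: str) -> str:
        if not isinstance(text, str) or len(text) > self.MAX_CHARS:
            raise ValueError("input rejected by wrapper")
        # The complex features (macros, plugins) are always switched off.
        return self._component.render(text, run_macros=False, plugin=None)
```

The rest of the NIS would call only `render_plain`, so a change in macro or plugin behavior across releases of the underlying component cannot reach callers through this interface.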

The COTS developer's reliance on customer feedback12 as a significant, or even primary, quality assurance mechanism can lead to uneven quality levels in different subsystems or functionality in a single COTS product. Press coverage is not guaranteed to be accurate and may not convey the implications of the problem being reported.13 For example, security vulnerabilities in components such as Web browsers, which are used directly by the public, receive widespread coverage, as do ultimately inconsequential (and unsurprising) exploits, such as the use of large numbers of machines on the Internet to "break" cryptographic algorithms by brute-force searches. Feedback from customers and the press, by its very nature, occurs only after a product has been distributed. And experience with distribution of bug fixes clearly indicates that many sites do not, for a variety of reasons, install such upgrades, thereby leaving themselves vulnerable to attack through the now highly publicized methods.14 Reliance on market forces to select what gets examined and what gets fixed is haphazard at best and is surely not equivalent to performing a methodical search for vulnerabilities prior to distribution.

12Handling calls to customer-support telephone help-lines is sometimes claimed to be a significant portion of COTS software costs. The committee was unable to explore the veracity of this claim. However, the use of customer feedback in place of other quality control mechanisms does allow a software producer to externalize costs associated with product testing.

13See, for example, the February 1997 coverage of the Chaos Computer Club demonstration of a supposed security flaw in Microsoft's Internet Explorer.

Finally, using COTS software in an NIS has the advantages and disadvantages that accompany any form of "outsourcing." COTS components can offer rich functionality and may be better engineered and tested than would be cost-effective for components developed from scratch for a relatively smaller user community. But an NIS that uses COTS components becomes dependent on a third party for decisions about a component's evolution and the engineering processes used in its construction (notably regarding assurance). In addition, the NIS developer must track new releases of those COTS components and may be forced to make periodic changes to the NIS in response to those new releases. It all comes down to a trade-off between cost and risk: the price of COTS components can be attractive, especially if the functionality they provide is a good match for what is needed, but the risk of ceding control may or may not be sensible for any given piece of an NIS.

14Even when administrators diligently apply security-bug fixes, the fixes can then be lost when a crashed system is restored from backup media. Since such restorations are often done in a crisis atmosphere, the need to perform the additional update step is easily overlooked in the rush to restore service.

Interfacing Legacy Software

Legacy software refers to existing components or subsystems that must be retained and integrated more or less unchanged into a system. Legacy software is used when developing an NIS because reusing an existing system is cheaper and less risky than completely reimplementing it, especially given the migration costs (training, rebuilding online records) associated with deploying a replacement system. In the HMO example, it would be very likely that the Clinical Laboratory or Pathology departments had been operating for decades with freestanding computerized systems. Incorporating such a freestanding system into an NIS poses special problems:

• NIS designers must recognize that they might be dealing with an operational subsystem that performs critical functions and cannot be rendered inactive for days or even hours.

• The legacy subsystem might not have been designed with networking in mind,