2
Accept Uncertainty:
Attack Risks and Exploit Opportunities
INNOVATION, PRECEDENT, AND DYNAMISM
Experience shows that there is a correlation between the degree of software system precedent,
routinization, and stability, on the one hand, and the ability to deliver results with predictable cost,
schedule, and operational test and evaluation (OT&E) success, on the other. Many of the Department of
Defense’s (DoD’s) information technology (IT) systems are precedented, as are significant portions of
mission systems. These include the office automation and back-office systems for business operations
that are increasingly conventionalized in both commercial and national security contexts. There are
precedents for such systems in numerous institutions and environments. Such conventions enable the
DoD to build on wide internal experience, other government experience, and commercial experience,
reducing the uncertainty associated with predicting the outcomes of particular design decisions. This
happens when similar decision points have been experienced in other settings, experience was gained,
and it has been possible to transfer that experience to new projects that are sufficiently similar. For
precedented development efforts, managers can project plans further into the future of a development
process with higher accuracy. They can focus more closely on optimizing cost, schedule, and other factors
while managing the various tradeoffs involved. For these routine systems, the DoD benefits when
it can adjust its practices to conform to government and industry conventions, because it is then able to
build more directly on precedent and also exploit a broader array of more mature market offerings.
The largest producibility challenges for the DoD come from its need for unprecedented, innovative
systems that can be rapidly adapted. The mission of the DoD requires it to constantly move forward in
advancing the capability of its systems. The committee uses the term “unprecedented” to refer to systems
concepts, designs, or capabilities that are not similar enough to the existing base of experience to benefit
from fully following an established pattern. As a result, development efforts may involve greater risk
(see next section). This report calls these innovative and agile projects software-intensive innovative
development and reengineering/evolution (SIDRE) efforts and focuses much of its attention on them. It must be
recognized, however, that most unprecedented systems designs, including very-large-scale interlinked
systems, generally incorporate significant portions that are themselves precedented and possibly also
associated with established commercial or open-source ecosystems.
4 CRITICAL CODE: SOFTWARE PRODUCIBILITY FOR DEFENSE
Precedented Systems
Put simply, engineering practices and technology choices for precedented systems (such as stable
back-office systems capabilities) are guided by convention (such as commercial best practices for such
systems), while engineering practices and technology choices for unprecedented systems and components
are guided by processes for mitigating engineering risk. In fact, there is a constant pace of innovation
even for seemingly established functional capabilities, such as back-office systems, with some areas
of innovation and other areas that are more guided by convention.
Following precedent may require engaging in a process of adapting previously unique business
practices to reflect more “standard” or “conventional” operations practices that are more readily supported
within mainstream systems and ecosystems, particularly when there is correspondence with
modern back-office systems in the commercial world. This adaptation of functional goals to achieve con-
sistency with normative practice is an explicit part of the commercial requirements engineering process.
The DoD and other government agencies may struggle more because they may find it more difficult to
compromise, and for many good reasons. But the extent to which the DoD can find commonalities (and
avoid unnecessary differentiation) with other government agencies creates opportunities for major cost
reduction, risk reduction, and process simplification.
Unprecedented Systems
As noted in the previous chapter, the need to develop unprecedented systems is a consequence of
the highly complex and rapidly evolving operational environment within which the DoD must oper-
ate and execute its mission. Complexity is increasing, as is the difficulty of the threats and challenges.
Highly capable information technology is now ubiquitous worldwide, and adversaries have ready access
to cutting-edge technology. Mission and deployment priorities are constantly shifting. The DoD must
collaborate extensively with other agencies, nongovernmental organizations (NGOs), coalition partners,
and others in constantly changing configurations over which the DoD has no control. Operational
decisions are derived from a broad diversity of inputs. Command-and-control models must adapt to
rapidly evolving threats. Success in this environment depends on systems designed for flexibility, agility,
and robustness, but it also requires flexibility, agility, and robustness in the process by which systems
are developed and continue to evolve. There is much less opportunity to rely on precedent and much
greater requirement to undertake a process of ongoing innovation. This process of innovation entails
acceptance of certain categories of risks. (See Box 2.1 for details.)
Commercial best practices have also evolved for developing unprecedented systems. Air traffic
control, telecommunications switches, middleware (such as from IBM and Oracle), operating systems
(such as from Apple and Microsoft), and large-scale web applications (such as from Google, Facebook,
and Amazon) have been developed under commercial best practices with varying degrees of success.
However, some of these large-scale, unprecedented systems emerged over a period of years from market
opportunity without a specification-driven need, while others did not. Besides business savvy, the main
critical success/failure factors in these situations have involved the ability to assess potentially disruptive
technologies and competitor strengths, and the corporate agility to adapt to change.1
This chapter addresses the processes and practices by which these risks can be understood and
addressed in the engineering of systems. A principal conclusion is that a well-managed incremental
(iterative) process, supported by appropriate evaluation and measurement approaches, can more reliably
lead to successful outcomes even when there are significant engineering risks. On the other hand,
attempts to produce innovative or unprecedented systems using familiar linear (“waterfall”) processes
can very often lead to unhappy surprises: late-breaking negative feedback regarding early design
commitments that, when learned at a late stage in the process, can be very costly to revise. That is, what
appears to be a “safe,” conservative decision to follow the most basic process is in fact a dangerous
decision that can drastically increase programmatic risk and the possibility of total project failure.2
1 Michael Cusumano and David B. Yoffie, 1998, Competing on Internet Time: Lessons from Netscape and Its Battle with Microsoft, New
York: The Free Press. See also Clayton M. Christensen, 1997, The Innovator’s Dilemma: The Revolutionary Book That Will Change the
Way You Do Business, New York: Harper Business. See also Robert L. Glass and P. Edward Presson, 2001, ComputingFailure.com:
War Stories from the Electronic Revolution, Upper Saddle River, NJ: Prentice Hall PTR.
The key features of a well-managed incremental process for innovative systems are (1) measurements
that are informative and relevant, and (2) process feedback loops that are relatively short, with potential
major reductions in programmatic risk.
For many of the innovative systems at the heart of the DoD’s software producibility challenge, the
details of the future requirements are not—and in many cases cannot be—fully understood. Thus the
need to innovate is made more challenging by the simultaneous need to be agile as requirements necessarily
evolve over time.
MANAGING RISK AT SCALE
There are attempts to manage innovative software development following process patterns more
appropriate to precedented systems and to established predictable engineering disciplines. One consequence
is that linear development processes are inappropriately used despite the presence of high
engineering risk (and requirements risk also), with the result that those engineering risks are
unnecessarily transformed into increasing project risks. A second consequence is that there is unjustified
emphasis on achieving excessive precision at the outset regarding functionality desired by the user,
choices of infrastructure platforms, and possibly also economic tradeoffs in various complex dimensions
of quality. This drive for excessive precision in these areas can yield a surfeit of specifications and other
early design artifacts, which may in fact give only false comfort—and lead to downstream scrap and
rework.
This is because these process patterns do not account for the engineering risks and uncertainties
inherent in developing innovative software, where there are no laws of physics and materials to constrain
solutions to particular structural patterns. In precedented software, the structural patterns derive from
established software ecosystems and from the body of precedent. In innovative SIDRE systems, these
patterns are lacking, which is both advantageous, in that opportunity is afforded for innovation and
creativity, and also disadvantageous, in that greater levels of uncertainty must be addressed.
Modern governance approaches for larger systems must account for the management of uncer-
tainty. At scale, they must exploit collaboration among distributed teams and in rich supply chains for
which there is a continuous negotiation of scope, quality, and resources to balance the opportunities in
delivering more value with the uncertainties inherent in software development cost and scope targets.
That is:
It is important to treat scope, plans, and resources as variables (not frozen baselines) and explicitly manage
the variances in these variables until they converge on acceptable levels to commit a project/product to full-scale
production.
Fortunately, recent DoD and NRC studies3 have resulted in some very initial steps, as evidenced in
2 Some program managers sarcastically refer to an inappropriately used linear (waterfall) process model as the “requirements,
delay, surprise” process model. Fred Brooks’s recent book, The Design of Design (Boston: Addison-Wesley, 2010), succinctly con-
cludes, “The Waterfall Model is wrong and harmful; we must outgrow it.” This point was also made in Fred P. Brooks, 1987, “No
Silver Bullet—Essence and Accidents of Software Engineering,” Information Processing 20(4):10-19.
3 Assessment Panel of the Defense Acquisition Performance Assessment Project, 2006, Defense Acquisition Performance
Assessment; see also NRC, Richard W. Pew and Anne S. Mavor, eds., 2007, Human-System Integration in the System Development Process:
A New Look, Washington, DC: National Academies Press. Available online at http://www.nap.edu/catalog.php?record_id=11893.
Last accessed August 20, 2010; and National Research Council (NRC), 2008, Pre-Milestone A and Early-Phase Systems Engineering:
A Retrospective Review and Benefits for Future Air Force Acquisition, Washington, DC: National Academies Press. Available online at
http://www.nap.edu/catalog.php?record_id=12065. Last accessed August 20, 2010.
Box 2.1
Programmatic, Engineering, and Systems Risk
Programmatic Risk
Programmatic or project risks pertain to the successful completion of engineering projects with respect
to expectations and priorities for cost, schedule, capability, quality, and other attributes. A principal influ-
ence on programmatic risk is the process by which engineering risks are identified and addressed. This
applies particularly to engineering risks related to architecture and ecosystems choices, quality attributes,
and overall resourcing. With innovative projects, programmatic risk can be reduced through use of iteration,
incremental engineering, and modeling and simulation (as used in many engineering disciplines). Program-
matic risks that derive from overly aggressive functional or quality requirements, where engineering risks
are not readily mitigated, are often best addressed through moderation on the “value side,” for example,
through scoping of functional requirements. Indeed, for ambitious and innovative programs—those char-
acterized as “high risk, high reward”—for identifying and sorting engineering risks, it is often most effective
to focus as early as possible on architecture. Once overall scope of functionality is defined, architecture
risks may often dominate the detailed development of functional requirements.
A well-known example of negative consequences of unmitigated programmatic risks is the FBI Vir-
tual Case File (VCF) project.1 The project is documented in the IEEE Spectrum: “The VCF was supposed to
automate the FBI’s paper-based work environment, allow agents and intelligence analysts to share vital
investigative information, and replace the obsolete Automated Case Support (ACS) system. Instead, the FBI
claims, the VCF’s contractor, Science Applications International Corp. (SAIC), in San Diego, delivered 700,000
lines of code so bug-ridden and functionally off target that this past April [2005], the bureau had to scrap
the US $170 million project, including $105 million worth of unusable code. However, various government
and independent reports show that the FBI—lacking IT management and technical expertise—shares the
blame for the project’s failure.”2
Eight factors that contributed to the VCF's failure were noted in a 2005 Department of Justice audit.
These included “poorly defined and slowly evolving design requirements; overly ambitious schedules; and
the lack of a plan to guide hardware purchases, network deployments, and software development for the
bureau. . . .” Finally, “Detailed interviews with people directly involved with the VCF paint a picture of an
enterprise IT project that fell into the most basic traps of software development, from poor planning to bad
communication.” (Today, 5 years later, the program has been scrapped yet again.)
Supply chain risk is an area of engineering risk that is growing in significance and that often develops
into programmatic risk. This is evident in the DoD’s increasingly complex and dynamic supply-chain struc-
ture, with particular emphasis on concerns related to assurance, security, and evolution of components and
systems infrastructure. This risk can be mitigated through techniques outlined in Chapters 3 and 4 related
to architecture design, improved assurance and direct evaluation techniques, multi-sourcing, provenance
assessment, and tracking and auditing of sourcing information. Supply chain risk is particularly challenging
for infrastructure software and hardware, because of the astonishingly rapid evolution of computing tech-
nologies, with commercial replacement cycles typically every 3 to 5 years. In the absence of careful planning,
this means that early ecosystem commitments can potentially create programmatic risks in downstream
system-refresh cycles, when shifts in vendor strategy may create unanticipated incompatibilities. Indeed,
vendors may often institute incompatible changes in interface specifications in their ecosystems in order to
force user organizations to stay current with evolving technology. Thus, the advantage of “riding the curves”
with off-the-shelf infrastructure must be weighed against the loss of control by program managers. But this
is not a simple tradeoff, since savvy architects have ways to structure systems to increase the possibility of
“having it both ways” in many cases.
1 Ben Bain, 2009, “FBI Pushes Back Completion Date for Sentinel File System,” November 10, 2009, Federal Computer
Week. Available online at http://fcw.com/Articles/2009/11/10/FBI-Sentinel-IG-report.aspx. Last accessed August 20, 2010.
2 Harry Goldstein, 2005, “Who Killed the Virtual Case File?” IEEE Spectrum 42(9):24-35. Available online at
http://spectrum.ieee.org/computing/software/who-killed-the-virtual-case-file. Last accessed August 20, 2010. See also James
C. McGroddy and Herbert S. Lin, eds., 2004, A Review of the FBI’s Trilogy Information Technology Modernization Program,
Washington, DC: National Academies Press. Available online at http://www.nap.edu/catalog.php?record_id=10991. Last
accessed August 20, 2010. See also the subsequent NRC letter report, 2004, Letter Report to the FBI, James C. McGroddy
and Herbert S. Lin, eds., Washington, DC: National Academies Press. Available online at
http://www.nap.edu/catalog.php?record_id=11027. Last accessed August 20, 2010.
Related to supply chain is another kind of programmatic risk, derived from conflicting business incen-
tives. This kind of risk is often present in projects focused on enhancing interoperation among systems.
With interoperation, for example, data from sensors for one system can be used to enhance situation
awareness models in another system. Indeed, one of the strong arguments for net-centric approaches is the
benefit of broad sharing of sensor data to enhance situation awareness and better inform tactical decision
making.3 Despite the natural drivers for interlinking, there are risks and difficulties. A critical system risk, for
example, relates to security—poor architectural decision making at the outset could mean that successful
attacks could result in amplified consequences, due to the larger scale of the overall system.
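The network-effect arithmetic behind Metcalfe’s Law (footnote 3) can be checked with a few lines of Python. This is a minimal sketch and not part of the original report: the function name complete_graph_edges is a hypothetical helper, and counting the edges of a complete graph is only one possible proxy for the “value” of a network.

```python
def complete_graph_edges(nodes: int) -> int:
    """Number of distinct links when every node connects to every other node."""
    return nodes * (nodes - 1) // 2


# Doubling the node count should roughly quadruple the edge count.
for n in (10, 20, 40, 80):
    before = complete_graph_edges(n)
    after = complete_graph_edges(2 * n)
    print(f"nodes {n:>3} -> {2 * n:>3}: edges {before:>5} -> {after:>5}, ratio {after / before:.2f}")
```

Under this proxy the ratio approaches 4 from above as the network grows, which is the sense in which the footnote calls the effect super-linear. The same growth applies to the number of paths an attacker can exploit, which is one way to read the security concern raised above.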
There are also programmatic risks relating to potential conflicts in the business and mission interests of
the organizations responsible for the entities being interlinked. Vendors, for example, who are competing
at the level of ecosystems may see interlinking as an opportunity for competitors to benefit from network
effects associated with acceptance of the ecosystem by users—that is, interlinking with competitor systems
may be perceived as a threat to investment in ongoing ecosystems enhancement. There are also circum-
stances under which DoD contractors may also see enhanced interoperation (and open architectures, more
generally) as a threat to lock-in and an enhancement to opportunities of competitors.
Engineering Risk
Engineering risks pertain to uncertainties and consequences of particular choices to be made within
an engineering process. High engineering risk means that outcomes of immediate project commitments
are difficult to predict and consequently raise programmatic risk. Engineering risks can relate to many
different kinds of decisions—most significantly architecture, quality attributes, functional characteristics,
infrastructure choices, and the like. Except in the most routinized cases, much of the practice and tech-
nology of software engineering is focused not only on system capability, team productivity, and resource
requirements for development, but also on the reduction of the engineering risks that unavoidably arise
in unprecedented developments.
As DoD and commercial systems evolve with more and more functionality delivered in software, sys-
tems engineering and software engineering techniques are intersecting, leading to a critical area for new
research and the advancement of practice. The challenge is three-fold. First, traditional decision support
techniques need to be enhanced to address the diverse kinds of software engineering risks. Second, there
is need for modeling, simulation, prototyping, and other “early validation” techniques for many different
kinds of software engineering decisions, for example, those related to architecture, requirements, eco-
system choice, tooling and language choice, and many others. Third, system engineering models must be
developed through which appropriate “credit” can be given in earned value models for activities that miti-
gate identified engineering risks. This entails addressing a range of challenges related to measurement and
process design. Development of techniques to meet these challenges would benefit commercial industry
as well as the DoD.
3 Metcalfe’s Law is an observation on network effects, stating that the “value” of a telecommunications network grows with
the square of the number of nodes in the network: when the number of nodes in a complete graph doubles, the number
of edges roughly quadruples. Of course, there are other ways scale influences “value” that may make actual value greater
or less than quadratic. But regardless of the model, it is clear that the effects are super-linear. This observation explains the
forces that drive the coalescing of separate networked systems into aggregates, including the internetworking initiatives of
the 1970s that coalesced diverse computer networks into the Internet.
An example of the identification and resolution of engineering risk is described in a workshop report
produced by this study committee.4 This case study concerns the internal ecosystem at Amazon.com.
Amazon evolved from a relatively straightforward e-commerce model into a highly complex aggrega-
tion of sellers and buyers, with a business model that depended in part on realizing the synergies of
this aggregation and the growth in scale. “Amazon builds almost all of its own software because the
commercial and open source infrastructure available now does not suit Amazon.com’s needs.” When
it became clear that initial architectural approaches needed to be enhanced, a critical decision point
was reached, with developers at Amazon facing a choice between a “goal of building the ‘perfect’ sys-
tem (the ‘right’ system) whatever the cost” and a very different and “more modest goal of building a
smaller, less ambitious system that works well and can evolve.” Developers recognized that trying to
build the best possible system was a long, difficult, and potentially error-prone process. It also forced
anticipation of a comprehensive set of potential downstream business models. The developers instead
adopted an approach designed to support evolution and rapid organic growth. It necessarily embodied
fewer assumptions regarding the business model, but it was designed to be adaptive and robust. This
led to greater emphasis on infrastructure performance, scalability, and reliability, with a focus on imple-
mentation ideas such as redundancy, feedback, modularity, and loose coupling, under rubrics such as
“purging,” “spatial compartmentalization,” and “apoptosis.” This was the model that led to Amazon’s
rapidly growing venture into cloud computing and associated services.
Systems Risk
Systems risks pertain to the potential hazards—operational risks, mission risks, deployment chal-
lenges, and so on—associated with the deployment of a system. What are the kinds of failures, and
what kinds of hazards do they create? For example, cascading failures have been experienced in tele-
communications and power utilities. These are large-scale system failures resulting from unwanted
positive feedback of local failures triggering failures elsewhere, leading to more global failures with
the corresponding hazards. That is, the hazard of a single system failing can often be associated with
a much larger aggregate of systems, often spread across a wide geography. The consequences of a
single local failure thus extend well beyond the immediate locality of the failure. The hazard is at a
much greater scale.
A case study in systems risk is the Toyota Prius, a “highly computerized car,”5 that relies on software
programs to manage the various applications and features of the vehicle. This complex system is really
an amalgam of simpler subsystems interoperating with each other across a network fabric. Drivers of
the 2010 Prius had reported brake malfunctions, which would later be attributed to a glitch in the soft-
ware controlling the car’s brakes. It is unclear from reports from Toyota and in the press whether the
“software glitch” was an algorithmic fault faithfully encoded into the software or a fault in the software
encoding or the software infrastructure. Regardless, the repair of the fault was accomplished through
software updates: Toyota later issued a software patch for the brake problem.6 In February 2010, Ford
also resolved a braking issue through a software upgrade.7
4 NRC, 2006, Summary of a Workshop on Software Intensive Systems and Uncertainty at Scale, Washington, DC:
National Academies Press. Available online at http://www.nap.edu/catalog.php?record_id=11936. Last accessed
August 20, 2010.
5 Stephen Manning and Tom Krisher, 2010, “More Trouble for Toyota as Regulators Launch Investigation of Prius
Brake Problems,” Associated Press, February 4, 2010. Available online at http://autos.ca.msn.com/news/canadian-press-
automotive-news/article.aspx?cp-documentid=23387474. Last accessed August 20, 2010.
6 David Millward, 2010, “Toyota Offers UK Prius Owners Brake Software Upgrade,” Telegraph.co.uk, February 8,
2010. Available online at http://www.telegraph.co.uk/motoring/news/7189917/Toyota-offers-UK-Prius-owners-brake-
software-upgrade.html. Last accessed August 20, 2010.
7 David Bailey, 2010, “Ford offers fix for Fusion hybrid brake glitch,” Reuters.com, February 4, 2010. Available
online at http://www.reuters.com/article/idUSTRE61369I20100205. Last accessed August 20, 2010.
the DoD’s revision of DoD Instruction 5000.02 and the recent Congressional Weapon System Acquisition
Reform Act that establish such convergence as DoD acquisition policy.4 However, the policy does not
provide detail about how such convergence can be achieved, particularly in the software arena.
Finding 2-1: Modern practice for innovative software systems at all levels of scale is geared toward
incremental identification and mitigation of engineering uncertainties, including requirements
uncertainties. For defense software, the challenge is doing so at a larger scale, and in ways that are
closely linked with an overall systems engineering process.
Innovation and agility are related. Innovation is the ability to create new systems concepts to address
emerging challenges and opportunities. Innovation can be in concept, functionality, architecture and
design, performance, and so on. Innovative functionalities have migrated into software realizations
because of the special characteristics of software. By improving our capability to manage uncertainty, we
are able to accelerate the delivery of more capable systems and reduce costs. The environments of defense
mission needs and of computing technology are both rapidly changing and often in unpredictable ways.
This creates uncertainty, particularly when systems must be designed to anticipate these changes in
mission and technology over periods of many years. Not all changes can be anticipated, which implies
that not only must architecture and design be forward-looking, but also that ongoing process must be
agile in facilitating ongoing innovation in response to changing needs and opportunities. What does it
mean to “manage uncertainty,” and what are good characterizations and, where possible, measurements
of the various dimensions of uncertainty?
It is often stated as a matter of principle that we must measure something if we are to manage it.
However, in the history of software engineering, the principal “measurables” have been time, effort,
lines of code produced, and defects found and fixed. These are only approximate surrogates for the
attributes of progress that matter in complex development projects, such as identification and resolution
of engineering risks, assurance with respect to quality attributes, manifestation of critical functional
features, ability to support future evolution, and so on. The former set of measurables (time, effort,
lines of code, and defects) is perhaps more useful for linear or waterfall developments, but it is of diminishing value
for innovative and agile projects. In these projects, not only must engineering risks be identified and
resolved, but also observable attributes must be created to provide evidence—and reduce the possibility
of “going into denial” regarding challenging engineering risks. This issue is elaborated in the section
below on earned value concepts.
From the perspective of quantitative measurement, the uncertainty of software producibility can be
understood through the following observations:
1. Best practices for software development resource estimates employ empirical, parametric models.
These models typically have 20 to 30 input parameters and produce a probability distribution of
outcomes.
2. The variance of the distribution of outcomes is a good measure of the uncertainty.
3. Managing this uncertainty means reducing the variance in the distribution of outcomes as estimates
to complete are recalculated on a periodic basis.
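The three observations above can be illustrated with a minimal sketch. It assumes a toy two-parameter model (real parametric models have 20 to 30 inputs), and the coefficients 3.0 and 1.1, the input ranges, and all function names are hypothetical values chosen only for illustration. Sampling the uncertain inputs yields a distribution of effort outcomes, and narrowing the input ranges as a project learns more reduces the variance of that distribution.

```python
import random
import statistics


def simulate_effort(size_range, multiplier_range, trials=10_000, seed=0):
    """Sample a toy parametric model: effort = 3.0 * size**1.1 * multiplier.

    The model form and constants are illustrative assumptions, not a
    calibrated estimation model.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        size = rng.uniform(*size_range)              # size in KSLOC (hypothetical input)
        multiplier = rng.uniform(*multiplier_range)  # combined cost-driver multiplier
        samples.append(3.0 * size ** 1.1 * multiplier)
    return samples


def summarize(samples):
    """Return the mean and sample variance of the simulated outcomes."""
    return statistics.mean(samples), statistics.variance(samples)


# Early in the project the inputs are poorly known, so the ranges are wide.
early = simulate_effort(size_range=(60, 140), multiplier_range=(0.7, 1.6))
# After a few iterations the same inputs are known more precisely.
later = simulate_effort(size_range=(90, 110), multiplier_range=(0.9, 1.2))

early_mean, early_var = summarize(early)
later_mean, later_var = summarize(later)
print(f"early estimate: mean={early_mean:.0f}, variance={early_var:.0f}")
print(f"later estimate: mean={later_mean:.0f}, variance={later_var:.0f}")
```

Under these assumptions, the later estimate has a much smaller variance than the early one, which is the sense in which periodic re-estimation can show whether uncertainty is actually being reduced.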
Process agility is needed to respond rapidly and effectively to changing circumstances. But often
those changing circumstances are in the form of late-breaking (in a development process) understanding
of key design commitments and emerging engineering risks. Thus effectiveness at the management
of uncertainty is one key to process agility in systems definition and conceptualization, as well as
realization.
4 The 2010 National Defense Authorization Act (NDAA) language (Section 804) is also evidence of progress. National Defense
Authorization Act (NDAA) of Fiscal Year 2010, Pub. L. No. 111-84, 111th Congress (2009). Available online at
http://www.wifcon.com/dodauth10/dod10_804.htm. Last accessed August 20, 2010.
Technology cannot overcome poor governance or management. However, in the past two decades,
considerable progress has been made in assessing organizational capability to enact process models
to manage complex software developments. This understanding has been packaged in a variety of
ways, such as the capability maturity model, ISO 9000, and the spiral development model. Even well-
managed software development can be hamstrung by poor selection of technology, but it can also be
enhanced significantly by the judicious use of technology. Software-intensive systems are complex. Tools
are needed to help manage complexity, track changes, maintain configuration control, and enforce the
integrity of the architecture—these help teams avoid mistakes often driven by that complexity. Indeed,
modern software teams at all levels of scale rely much more intensively on tooling for process support
than at any point in the past.
Software is distinctive in that it permits rapid iterations, aggressive prototyping, simulation, and
modeling, along with other techniques that can afford early validation with respect to many critical
acceptance criteria. Improved software infrastructure and practices also enable agility, as do improved
means to measure and assess software quality and other attributes. The governance and management
process for unprecedented systems can better exploit these unique software capabilities. This means, in
particular, the aggressive use of iterative risk-managed processes and the definition of suitable earned
value measures related to validation of requirements and architecture, team collaboration, and continuous
integration. It also means that platform automation support for measurement, resource estimation,
variance reduction, and change propagation must mature. Another recent study from the National
Research Council has assessed the potential, primarily from a management perspective, for the DoD
to more widely employ incremental and iterative processes to support risk-managed development of
SIDRE systems.5 The recommendations of this study are generally in harmony with the recommendations
of this report, which focuses more on technological enablers and on attendant research and
technology-development challenges.
Earned Value Management and Unprecedented Systems
Earned value management (EVM) is “a means of determining the financial health of a project by
measuring whether the work completed to date is in line with the budget and schedule planning.” One
of the goals of using EVM is to get early warning of potential problems. EVM tracks planning, progress,
cost, earned value (the planned cost of actual progress), and variance in cost and schedule.
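The quantities that EVM tracks can be made concrete with a minimal sketch of the conventional calculations. The class name, field names, and dollar figures are hypothetical; the formulas are the standard cost and schedule variances and their ratio forms.

```python
from dataclasses import dataclass


@dataclass
class EvmSnapshot:
    """Cumulative EVM quantities at one status date (same currency units)."""
    planned_value: float  # PV: planned cost of the work scheduled to date
    earned_value: float   # EV: planned cost of the work actually completed
    actual_cost: float    # AC: actual cost of the work completed

    @property
    def cost_variance(self) -> float:
        # Negative means the completed work cost more than planned.
        return self.earned_value - self.actual_cost

    @property
    def schedule_variance(self) -> float:
        # Negative means less work was completed than scheduled.
        return self.earned_value - self.planned_value

    @property
    def cost_performance_index(self) -> float:
        return self.earned_value / self.actual_cost

    @property
    def schedule_performance_index(self) -> float:
        return self.earned_value / self.planned_value


# Hypothetical status, in thousands of dollars.
snapshot = EvmSnapshot(planned_value=500.0, earned_value=400.0, actual_cost=450.0)
print(snapshot.cost_variance)               # -50.0: over cost
print(snapshot.schedule_variance)           # -100.0: behind schedule
print(snapshot.schedule_performance_index)  # 0.8
```

Every one of these numbers depends on how earned value is credited, which is exactly the measurement that becomes difficult for unprecedented systems.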
Although the technique is seemingly straightforward, the application of EVM for innovative and
unprecedented software-intensive systems poses challenges. In particular, assessing and measuring
actual progress is difficult. Conventional EVM systems make several assumptions, namely: (1) the
relationship between resources and progress is linear, (2) the effort needed to meet certain goals is
predictable at the outset, (3) progress is easily and accurately measurable, and (4) the expected outcome, as
articulated in requirements, is well understood. None of these assumptions applies in the case of
software-intensive unprecedented system development efforts where the level of uncertainty changes
the governance process from planning and tracking a straightforward production sequence of related
tasks to an emerging discovery process that requires continuous steering.
Extending EVM to SIDRE software requires some significant changes in how EVM assessment
and measurement strategies are applied. In particular, EVM in this context needs to be adapted from
tracking conformance to planned expenditures to steering toward planned value creation. For this to
happen, significant improvements are needed in our ability to value software assets. For example, a
major, unfinished software asset is no more than an option to guide further investments that, with some
remaining risk, will lead to a finished product that creates realized value for an organization. In other
words, software systems present very weak observables.6
5 NRC, 2010, Achieving Effective Acquisition of Information Technology in the Department of Defense, Washington, D.C.: National
Academies Press. http://www.nap.edu/catalog.php?record_id=12823. Last accessed August 20, 2010.
But such a product contains additional value in the form of flexibility to be better adapted, through
additional investments, to its evolving operating conditions. Placing value on adaptation flexibility is
essential for reasoning about investments in modular design architectures, such as those produced by
the application of product-line approaches. Unfortunately, except at the level of overall ecosystems and
vendor components, the value of most software design assets may be apparent only within a project and
may change according to architectural choices. Thus, valuation of these assets is difficult and risky. We
have neither the models we need to perform such valuation, nor adequate approaches to develop and
validate estimates needed as inputs to such models (e.g., of uncertainties about future conditions).
There is perhaps the potential to calibrate models over time based on past experience, though calibrations
are always vulnerable to invalidation as operating conditions change. Nevertheless, some kind
of approach to valuation (not only accounting for costs but also for value created, even if in the form
of options) is important to managing iterative or other development processes to optimize for value
created, rather than merely for conformance to predicted cost flow streams.
Time-Certain Development and Feature Prioritization
The fact that (particularly SIDRE) software development effort and duration cannot be estimated
precisely means that it is unwise to try to lock a software project into simultaneously fixed budget,
schedule, and feature content (as has been found in many fixed-price, fixed-requirements software development
contracts). The concept of time-certain development recommended in the Defense Acquisition
Performance Assessment (DAPA) report and elsewhere7 avoids this problem by fixing duration as the
independent variable and treating feature content as a dependent variable. This is basically the same concept as
the agile practice of timeboxing, but it needs more success conditions for large, mission-critical projects
for which, as time is running out, it is difficult (and time-consuming) to determine which features to
drop, and how to drop them without adverse side effects.
The critical success conditions for large-project time-certain development are to prioritize the features
in advance and to modularize the architecture to make it relatively easy to add or drop
borderline-priority features. In evolutionary development, this does not mean that the features will never be
available, but that they will be deferred to a later increment. Prioritizing features in multi-stakeholder
situations is never easy, but it becomes easier if the decision is just to determine what features are most
needed in the next increment.
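A minimal sketch of the time-certain idea follows, assuming that features have already been given a consensus priority and a rough effort estimate. The feature names, numbers, and the simple greedy selection rule are hypothetical illustrations, not a prescribed method: duration (capacity) is fixed, and feature content is whatever fits in priority order, with the rest deferred to a later increment.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    name: str
    priority: int        # 1 = highest priority (consensus ranking)
    effort_weeks: float  # rough effort estimate


def plan_increment(
    features: List[Feature], capacity_weeks: float
) -> Tuple[List[Feature], List[Feature]]:
    """Fill a fixed-duration increment in priority order; defer what does not fit."""
    selected, deferred = [], []
    remaining = capacity_weeks
    for feature in sorted(features, key=lambda f: f.priority):
        if feature.effort_weeks <= remaining:
            selected.append(feature)
            remaining -= feature.effort_weeks
        else:
            deferred.append(feature)  # deferred to a later increment, not dropped
    return selected, deferred


backlog = [
    Feature("core messaging", priority=1, effort_weeks=4),
    Feature("operator display", priority=2, effort_weeks=6),
    Feature("report export", priority=3, effort_weeks=5),
    Feature("audit logging", priority=4, effort_weeks=2),
]
selected, deferred = plan_increment(backlog, capacity_weeks=12)
print([f.name for f in selected])  # ['core messaging', 'operator display', 'audit logging']
print([f.name for f in deferred])  # ['report export']
```

Note that this greedy rule lets a small lower-priority feature fill capacity left over after a larger one is deferred. Whether that is acceptable depends on how cleanly the architecture allows borderline features to be added or dropped.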
A most significant side effect of feature prioritization is that it produces a consensus ranking of the
relative value of the system’s features. This provides the beginning of a way to reason about project risk,
as the key quantity in risk management is an item’s risk exposure, defined as the product of the probability
of loss and the size of the loss; the latter is known at least relatively from the feature prioritization.
6 This is in the sense of the traditional OODA (observe, orient, decide, act) loop, which underlies iterative processes and
incremental development. That is, as iterations and increments of effort yield results, future iterations and increments necessarily
build on those results. The challenges of software measurement and evaluation (as addressed throughout this report) relate to
the “observe” part of the loop, the process-related challenges relate to the “orient” and “decide” parts of the loop, and many of
the architecture/design and programming challenges relate to the “act” part of the loop.
7 This includes, most significantly, the NRC report on Achieving Effective Acquisition of Information Technology in the Department of
Defense. NRC, 2010, Achieving Effective Acquisition of Information Technology in the Department of Defense, Washington, D.C.: National
Academies Press. Available online at http://www.nap.edu/catalog.php?record_id=12823. Last accessed August 20, 2010.
Evidence-Based Software Engineering and Risk Probability
Another major trend in software engineering and project management is to shift from schedule-
based milestones (“The contract says that the Preliminary Design Review (PDR) will be held on April
1, so that’s when we’ll hold the PDR, whether we have a design or not.”) to event-based milestones
(“We’ll finish all the design features by June 1, so we’ll hold the PDR then.”). However, such reviews
often fail because there is no way to tell from all the Unified Modeling Language (UML) diagrams and
PowerPoint charts whether the design will scale up, handle crisis conditions, meet critical timelines, or
be buildable within the available budget and schedule.
This has led to the current trend toward evidence-based milestones, and evidence-based software
and systems engineering in general. This approach places responsibility on developers not only to create
artifacts for review such as operational concepts, requirements, designs, plans, budgets, and schedules,
but also to produce evidence that if a software system were built to the design, it would satisfy the
requirements, support the operational concept, and be buildable within the budgets and schedules in
the plan. This evidence would then be reviewed by independent experts, and shortfalls in evidence
would be treated as uncertainties or probabilities of loss. As with the relative sizes of loss determined
from requirements prioritization above, these probabilities are generally known only relatively, but they
can be combined with the relative sizes of loss to produce at least relative risk exposure quantities for
use in risk management.
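As a minimal sketch of how relative probabilities and relative sizes of loss might be combined, the following ranks a handful of risk items by relative risk exposure (probability of loss multiplied by size of loss). The item names and all numbers are hypothetical. In practice the probabilities would come from expert review of evidence shortfalls and the loss sizes from feature prioritization, and only the relative ordering should be read as meaningful.

```python
def risk_exposure(probability_of_loss: float, size_of_loss: float) -> float:
    """Risk exposure = probability of loss x size of loss."""
    return probability_of_loss * size_of_loss


# item -> (relative probability of loss, relative size of loss on a 1-10 scale);
# all values are illustrative assumptions.
risk_items = {
    "scalability under load": (0.6, 9),
    "crisis-mode handling": (0.3, 10),
    "timeline compliance": (0.5, 7),
    "report formatting": (0.8, 2),
}

# Rank items from highest to lowest relative exposure.
ranked = sorted(
    risk_items,
    key=lambda name: risk_exposure(*risk_items[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {risk_exposure(*risk_items[name]):.1f}")
```

Here a likely but low-value shortfall ("report formatting") ranks below a less likely but high-value one, which is the kind of comparison that relative exposure is meant to support.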
Actually, evidence-based software and systems engineering has been practiced many times and
has been a consistently performed and high-payoff corporate practice at leading companies such as
AT&T since the 1980s.8 However, such evidence is usually asked for in contract data item descriptions
(DIDs) in optional appendices, where it is one of the first things to go when resources become strained.
Making appropriate evidence a first-class deliverable not only would ensure its development, but also
would make it an element of earned value management, in that it thus would have to be planned for
and its progress tracked with respect to the plans. The evidence should be parsimonious and focus on
enabling of action—rather than on the massing of “read-never” program documentation. These points
are summarized in the following findings and recommendations.
Finding 2-2: The prescription in DoD Instruction 5000.02 for the use of evolutionary development
needs to be supplemented by the development of related guidance on the use of such practices
as time-certain development, requirements prioritization, evidence-based milestones, and risk
management.
Finding 2-3: Extensions to earned value management models to include evidence of feasibility and
to accommodate practices such as time-certain development are necessary conditions to enable successful
application of incremental development practices for innovative systems.
As noted throughout this report, the DoD would benefit from investing effort in developing
improved quantitative measures related to diverse software attributes such as quality, productivity,
architecture compliance, architecture modularity, process performance, and many others. But DoD
practices must also recognize that existing metrics do not fully reveal critical attributes of systems and
process status and that expert judgment also has a critical role, particularly with respect to architecture,
design, and many quality attributes associated with SIDRE systems. Evidence-based software and systems
engineering approaches are being increasingly applied to address achievement of critical SIDRE
attributes and need to be better institutionalized into DoD acquisition practice.
8 Joseph F. Maranzano, Sandra A. Rozsypal, Gus H. Zimmerman, Guy W. Warnken, Patricia E. Wirth, and David M. Weiss, 2005,
“Architecture Reviews: Practice and Experience,” IEEE Software 22(2):34-43.
Finding 2-4: Research related to process, measurement, architecture, and assurance can contribute to
the improvement of measurement practice in support of both routine management of engineering
risks and value assessment as part of earned value management.
For example, keys to developing cost-effective evidence involve determination of feature and attribute
priorities, assessment of candidate evidence-generation capabilities (modeling, simulation, prototyping,
benchmarking, exercises, early working versions, citations of relevant previous experience), and
measurement of progress toward thorough evidence generation.9 Some initial steps in this direction are
provided in a report by Boehm and Lane.10
Recommendation 2-1: The DoD should take aggressive actions to identify and remove barriers to
the broader adoption of incremental development methods, including iterative approaches, staged
acquisition, evidence-based systems and software engineering, and related methods that involve
explicit acknowledgment and mitigation of engineering risk.
There are different kinds of barriers that can be addressed through combinations of established
best practice and emergent improved practice derived from technology and other improvements. These
potentially surmountable barriers include (1) improved measurement and associated technology, (2)
architecture validation using models, simulation, prototyping, etc., (3) program manager training and
evaluation of perceived career risks (see findings below), (4) accretion of an accessible experience base
and other shared resources that can facilitate sound decision making, and (5) acceptable shifts of early-
stage emphasis for innovative systems from detailed functional requirements to concurrent engineering
of requirements, architecture, process definition, and evidence of their compatibility and feasibility.
Similar barriers exist in commercial industry, of course. These are accentuated in the DoD because of its
particular challenges of arm’s-length contractual relationships, high assurance requirements, potential
presence of adversaries in the systems development activity, and other barriers.
MANAGING REQUIREMENTS AND ARCHITECTURE
Software development complexities tend to increase nonlinearly as systems scale up in complexity,
features, and quality goals. The challenge for the DoD is that its requirements must be addressed
at unusual levels of scale, complexity, interconnection, and security, and with life-critical mission requirements. This
challenge is exacerbated by the fact that the DoD is not sufficiently exploiting known techniques for the
management of complex and evolving requirements. These techniques have been a focus of research
for many years, but the known techniques are not widely employed on DoD applications—techniques
including spiral development, joint application development, agile development, etc. The resulting
difficulties are well known.11,12
There is widespread agreement that the requirements-delay-surprise (linear) approach to software
development is not effective for innovative systems. The committee proposes more extensive use of
an incremental, risk-assessment-driven approach. It is important to appreciate that, for incremental
approaches to succeed, there needs to be forward-looking up-front investment in the overall system and
process design. This enables problems to be decomposed in such a way that engineering risks can be
identified, initial architecture models developed, and overall programmatic risk minimized. If this is
9 This concept of evidence generation is different from but analogous to the discussion of evidence-based assurance in Chapter 4.
10 Barry Boehm and Jo Ann Lane, 2010, Evidence-Based Software Processes, Proceedings, 2010 International Software Process Conference,
Springer, Berlin.
11 NRC, 2010, Achieving Effective Acquisition of Information Technology in the Department of Defense, Washington, D.C.: National
Academies Press. Available online at http://www.nap.edu/catalog.php?record_id=12823. Last accessed August
20, 2010.
12 Barry Boehm and Richard Turner, 2003, Balancing Agility and Discipline: A Guide for the Perplexed, Boston: Addison-Wesley.
just as cost and resources, schedule and work breakdown, quality (with respect to various attributes), and
other overall constraints are variables subject to ongoing negotiation. An example is the prioritization of
requirements and use of scope as a dependent variable in time-certain development. Even when scope
is treated as a seriously scrutinized and controlled variable in DoD projects, a collaborative, open, and
honest management style between customer and contractor, supported by constant effort to improve
measurement and observation capability, has proven to be another necessary ingredient for success.
These are the foundations of modern agile governance that demand executable capability demonstrations
over time. Modern agile governance of software delivery means managing uncertainty through
steering. In a healthy software project, each successive phase of development produces an increased
level of understanding in the evolving plans, specifications, and completed solution, because each phase
furthers a sequence of executable capabilities and the team’s knowledge of competing objectives. At any
point in the life cycle, the precision of the subordinate artifacts should be in balance with the evolving
precision in understanding, at compatible levels of detail and reasonably traceable to each other.
ESTIMATIONS, CONTRACTING, AND ITERATIVE DEVELOPMENT
The DoD operates within a complicated federal procurement and acquisition process. From the
outset the government typically awards to the contractor with a viable technical solution and the lowest
cost. For software-intensive systems there is a conventional wisdom that aggressive bids have driven
many programs to diminished probabilities of success.16 Although the reasons for cost overruns and
delays are complex, the choice of evaluation criteria in this process is undoubtedly a factor. The government
and contractors need to establish rigorous processes to ensure that we have a basis for size estimates
that have sound derivation from comparable systems, as well as thoughtful scaling factors to account
for degree of engineering risk, overall complexity and scale, and maturity of the various contributing
technologies and ecosystems. For innovative SIDRE systems, the variances in a sound estimate can rise
quite dramatically. For all systems, there is also the complication of the rapid evolution of the underlying
software technologies, which tends to reduce commensurability with historical comparables. The use of
evidence-based proposals and independent expert review is also helpful at the source selection stage.
Estimates
An additional difficulty is the lack of a rational standard by which the cost estimates are judged.
While there are well-used metrics for hardware, a uniform set of standards for software development
is lacking, although there are candidate models such as SEER-SEM, True S, or COCOMO. Also, analyzing
comparable probabilities of success should be a key element for awards. This analysis must avoid
conflating engineering risk with programmatic risk and instead account for process plans (and earned
value credit models) that acknowledge the reality of the engineering risk and indicate how it can be
mitigated (as outlined above). Product-line and framework efforts provide significant challenges to
development estimation, as do commercial, open-source, and vendor infrastructure and services. These
outsourced products and services, loosely considered as COTS (“commercial off the shelf”), although
often they may not be commercial or off the shelf, require further adaptation to the estimation models to
account for the costs of assimilating the product/service, including integration, configuration, ongoing
upgrade (and consequent adaptation to the subject system), licensing, sourcing risks, and other factors.
For example, many conventional commercial components and services have refresh cycles, which may
range from months to many years. The period of these cycles (and the extent of likely incompatibility)
can often be anticipated on the basis of industry standard practices, but it nonetheless needs to be
understood, along with the prospects for “support engagements” from vendors to address urgent and
critical issues should they arise.
16 See reports from the Government Accountability Office, including GAO, 2004, Defense Acquisitions: Stronger Management
Practices Are Needed to Improve DoD’s Software-Intensive Weapon Acquisitions, GAO-04-393, Washington, DC: U.S. Printing Office.
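To give a sense of what a parametric estimation model of the kind mentioned above looks like, here is a minimal sketch of the Basic COCOMO equations from Boehm's 1981 formulation. The choice of this simplest variant is an assumption made for illustration, and the function and variable names are hypothetical. The sketch is uncalibrated and omits the cost drivers, scale factors, and COTS assimilation costs that the text says a usable estimate must account for.

```python
# Basic COCOMO (1981) coefficients per development mode:
# effort = a * KSLOC**b (person-months), schedule = 2.5 * effort**d (months).
COEFFICIENTS = {
    "organic": (2.4, 1.05, 0.38),
    "semi-detached": (3.0, 1.12, 0.35),
    "embedded": (3.6, 1.20, 0.32),
}


def basic_cocomo(ksloc: float, mode: str):
    """Return (effort in person-months, schedule in months) for a size in KSLOC."""
    a, b, d = COEFFICIENTS[mode]
    effort = a * ksloc ** b
    schedule = 2.5 * effort ** d
    return effort, schedule


organic_effort, organic_schedule = basic_cocomo(100, "organic")
embedded_effort, embedded_schedule = basic_cocomo(100, "embedded")
print(f"organic:  {organic_effort:.0f} person-months over {organic_schedule:.1f} months")
print(f"embedded: {embedded_effort:.0f} person-months over {embedded_schedule:.1f} months")
```

Even in this crude form, the same nominal size yields very different effort depending on the assumed development mode, which suggests why scaling factors for engineering risk and complexity matter so much when bids are compared.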
Recommendation 2-2: The DoD should take steps to accumulate high-quality data regarding project
management experience and technology choices that can be used to inform cost estimation models,
particularly as they apply to innovative software development projects.
The upgrade currently under way to the reporting quantities and guidance for the DI-MGMT-81739
and DI-MGMT-81740 Software Resource Data Reports is a good example.
Contracting
There are a variety of contracting structures that are available to government program managers.
Choices are generally made on the basis of goals regarding incentives for the performer. For example,
the cost-plus-award-fee (CPAF) paradigm tends to front-load the incentives for performance where the
product is primarily a set of artifacts that define the design, but that do not necessarily provide functional
capability. It will be important to ensure that future contracts provide a balanced incentive for early
development of functioning products, as well as early evaluations of performance and robustness.
An iterative process for software development requires a somewhat iterative or, more precisely,
an incremental contract with the customer, very much following the concept of a spiral model of software
engineering.17 For a company to respond to a request for proposals (RFP) with some accuracy, it
generally must have experience with multiple similar projects on the basis of which it can estimate with
confidence the resources and risks associated with building and testing a particular system. Some companies
frequently offer a fixed-price bid as well, perhaps for as many as half of their projects, although
the preferred contract is not a fixed price but rather an agreement on the general estimated figure for
the cost and delivery schedule in chunks, with more specificity for the critical initial deliveries, some
agreed-upon process for continued negotiation around time and schedule for changes, and pricing of
later parts of the system as more of it is built and delivered.
A Scenario for SIDRE Incremental Development
One possible way to combine improvement in the precision of estimation with mitigation of early-
stage engineering risk (architecture, scope, hazard analysis) would be for a software customer to start a
project with an initial scoping and prototyping engagement, lasting a few weeks, depending on the size
or complexity of the system. This can serve to determine the scope of requirements, assess architecture
alternatives, identify constituent ecosystems, and address other potential sources of up-front engineering
risk. This affords both the customer and the bidder opportunity to develop more precise (but still
crude) estimates of the cost and time potentially required for the project.
This scoping and prototyping phase can be used to identify what are the essential features of the
system that the customer must have, what are the lower-priority features, and what are the features or
functions that must be built first for the work to proceed. The company (and the customer) can then
generate estimates for this first phase of the development work. This can be viewed as developing an
immature design, but through a mature design process that will eventually lead to a well-validated
mature design. As this initial phase of the work nears its completion milestone, systems engineers,
architects, and requirements engineers can then work on a more detailed plan for the next milestone,
including more specific plans for value measurement that would be used to enhance a baseline overall
earned value model. The company, if it has experience with similar projects, can use historical data to
adjust its estimates and add buffer time in the schedule, which will also add costs for manpower.
17 Barry Boehm, 1986, “A Spiral Model of Software Development and Enhancement,” ACM SIGSOFT Software Engineering Notes 11(4):14-24.
In short, the scenario is for the customer to pay for an upfront scoping and prototyping exercise,
agree to a general budget and timeframe, and then pay in increments as the work proceeds and changes.
It is critical for the development team to work extremely closely with the customer, for example through
weekly or biweekly project updates and by sharing information regarding architecture, features,
and quality attributes of the evolving system in frequent increments. This also affords opportunity for
risk mitigation regarding validation for critical requirements, enabling operational acceptance and providing
evaluators an opportunity to mitigate the engineering risks they face regarding various kinds
of evaluation criteria. Regarding budget, it enables the customer to achieve a budget target with essential
features of the system completed and to establish options regarding additional features or quality
improvements earlier in the lifecycle, thus facilitating negotiations regarding lower-priority features or
bug-fixing time later on in the project.
The committee notes that real-world project experience has shown time and again that it is the early
phases that make or break a project. It is therefore of paramount importance to have a strong start-up
team for the early planning and architecture activities. If these early phases are done right with good
teams, projects can more often be completed successfully with (stable) nominal teams of capable developers
evolving the applications into the final product. If the early planning and architecture phases are
not performed adequately, however, then programmatic risks escalate dramatically—even tremendous
expertise may not succeed in overcoming the consequences of early bad decisions.
The committee also notes that for the largest and most complex systems, and also for many of the
more innovative systems, the DoD has a strong and direct interest in architecture definition in the early
project phases. DoD interests in architecture bear on longer-range issues such as interoperability, flexibility,
and shifts in quality attributes as infrastructure and associated ecosystems evolve. This implies
that the DoD must have capability to assess architectural decisions at the early stages as part of the
overall process. There is a challenge in finding the right balance—on the one hand, contractors must
fully “buy in” to architecture designs with respect to owning responsibility for outcomes, but, on the
other hand, the DoD and contractors must be able to collaborate in refactoring or adapting architectures
when required.
Finding 2-5: Architectural expertise is becoming dramatically more important for the DoD, its advisors,
and its contractors. There will be significant and immediate benefits from advances in the state
of technical support for architecture.
Recommendation 2-3: Update procurement, contracting, and governance methods to include an early
and explicit architecture phase that reduces the predominant uncertainties in software-intensive
systems.
Technical support for architecture includes architecture development, modeling, simulation, evaluation
of quality attributes (such as performance and security), evaluation of structural attributes (such
as code compliance and modularity), and techniques for adaptation. This also includes capture of
architectural experience to support building on experience.
Recommendation 2-4: Define architectural leadership roles for major SIDRE projects and provide
program managers with channels for architectural expertise.
With respect to risk management, if a project is structured in short cycles or milestones, such as
every 4 or 8 weeks,18 then estimates and teams can be adjusted to try to make up time on the schedule.
For example, if part of the system is proving to be more difficult than planned to build or to test, then
it may be possible to restructure the work plan to enable switching of people from different tasks without
running afoul of Brooks’s Law.19 Mitigating risk from a contractual perspective requires shortening
development cycles, system testing intervals, and the intervals between feedback opportunities with the
customer. Although this would vary based on scale, it would typically change release cycle times from units
of months to units of weeks. This is predicated on identifying the most useful observables to support
effective decision making in the feedback loop implemented in project iterations. Project managers should
also be identifying early on what parts of the system have high engineering risk, such as complex components
that are different from systems they have built successfully in the past.
18 Agile cycles are typically 30 days, with a deliberate commitment for schedule-driven milestones to provide the dominant
constraining structure in the management of process.
The use of evolutionary acquisition as emphasized in DoD Instruction (DoDI) 5000.02 implies the
need for continuing architectural adjustments to accommodate changing priorities, independently
evolving external interfaces, new releases of COTS products, and termination of support of older COTS
releases. This will be discussed next.
REALIZING DOD SOFTWARE BENEFITS VIA DOD INSTRUCTION 5000.02
AND EVOLUTIONARY ACQUISITION
As discussed above, recent DoD policy in DoDI 5000.02 has established the concept that “evolutionary
acquisition” is the recommended way to acquire DoD systems, but the policy does not provide
detail about how successful evolutionary acquisition can be achieved, particularly in the software arena,
and in a way that is compatible with the concepts of incremental iterative development. The issue is
that evolutionary acquisition requires “a militarily useful and supportable operational capability” (DoD
Instruction 5000.2, p. 13, 2.c.) at each iteration, whereas incremental iterative development does not (and
should not) require operational capability at every iteration. This is because the iterations in incremental
iterative development may be focused on discharging particular engineering risks rather than on manifesting
operational capability. Further, DoD projects currently preparing to apply evolutionary acquisition
find that much of the available acquisition infrastructure (contract forms, exhibits, and data item
descriptions for reviews and audits, work breakdown structures, requirements, design, test, milestone
pass/fail criteria, progress payments, award fees, etc.) is still oriented around a model of single-step
development to prespecified full-system requirements, with portions pre-allocated to software.
The usual result is a hardware-driven functional-hierarchy system architecture that is incompatible
with preferred layered, service-oriented software architectures, and accompanying hardware-oriented
work breakdown structures that encourage software suboptimization20 and translate into management
structures that hinder rapid software adaptation to change.21 Further, projects are often unaware that
there are several forms of evolutionary acquisition and choose a form that is poorly matched to their
project situation. Some initial work has been done to determine the various forms of evolutionary acquisition
and to provide top-level criteria for choosing among them, as shown in Box 2.2.
This top-level guidance is a good first step, but it needs to be supplemented by considerably more detailed
guidance and associated methods and tools to ensure its successful application on DoD projects.22 What is
most sorely needed at this point is an elaboration of the necessary guidance to ensure early software
participation in systems engineering and criteria for evaluating whether adequate evidence of software
feasibility has been produced at major DoD acquisition milestones. A particular need is for guidance on
stabilizing the current increment of evolutionary development while concurrently evolving the software
and system architecture and plans to enable stabilized development of the next increment.23
19 Brooks’s Law states that adding people to troubled software projects only puts them further behind schedule. See Fred Brooks,
1975, The Mythical Man-Month: Essays on Software Engineering, Reading, MA: Addison-Wesley.
20 See the current revision of MIL-STD-881.
21 Barry Boehm, A. Winsor Brown, Victor Basili, and Richard Turner, “Spiral Acquisition of Software-Intensive System of Systems,”
CrossTalk, May 2004: 4-9.
22 Specific practices for incremental iterative development are discussed in several studies, including DSB, 2009, Report of the
Defense Science Board Task Force on Department of Defense Policies and Procedures for the Acquisition of Information Technology, Washington,
DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics. Available online at
http://www.acq.osd.mil/dsb/reports/ADA498375.pdf. Last accessed August 20, 2010. See also NRC, 2010, Achieving Effective Acquisition of
Information Technology in the Department of Defense, National Academies Press, Washington, DC. Available at
http://www.nap.edu/catalog.php?record_id=12823. Last accessed August 20, 2010. The practices are also elaborated in Congressional language
in the National Defense Authorization Act 2010, Section 804. National Defense Authorization Act (NDAA) of Fiscal Year 2010,
Recommendation 2-5: Develop the technical and management infrastructure necessary to simultaneously
support stabilized, high-assurance development of the current evolutionary increment while
concurrently evolving the plans and specifications for stabilized development of the next high-assurance
increment.
INTRINSIC DOD SOFTWARE EXPERTISE: BEING A SMART CUSTOMER
The Current State of DoD Software Expertise
It is widely acknowledged, including within the DoD, that the department does not have sufficient
organic personnel with the software expertise to meet its needs for today’s more software-intensive programs.24
Although the DoD develops some software internally, the committee’s focus here is on access to
expertise that is needed for the DoD to be effective as a savvy and outstanding customer for software.
This includes the expertise to effectively purchase the larger and less precedented systems as well as the
precedented systems for which sensitivity to issues such as the choice of ecosystem is key. The necessary
expertise includes understanding of process, architecture, requirements, and assurance. It also includes
understanding of the trajectories and adoption trends for both the major commercial ecosystems and
any involved DoD-intrinsic software ecosystems. The DoD faces challenges in attracting and retaining
software and systems engineering personnel and also in keeping up to date the skills of the personnel
it does have.25 Commercial industry also faces challenges because demand for software expertise is
high, and the competition for top project managers and top architects can be particularly fierce because
these two skills are both critical to success and the people who have them are few.
Challenges Particular to the DoD
The defense environment poses further challenges, notably the difficulty in competing with industry
to hire the most capable software architects and other experts. This is not simply a matter of salaries. For
instance, the committee notes that many software engineers and architects become frustrated
and discouraged working within the constraints of the DoD acquisition process and with the tendency
toward calcification of their “hands-on” skills that made them valuable to the DoD acquisition process
in the first place.26 Especially in recent years, the DoD has not shown the desire or ability to develop
Pub. L. No. 111-84, 111th Congress (2009). Available online at http://www.wifcon.com/dodauth10/dod10_804.htm. Last accessed
August 20, 2010.
23 See related discussion in Chapter 4.
24 “The quantity and quality of software engineering expertise is insufficient to meet the demands of government and the
defense industry.” Excerpted from presentation by Kristin Baldwin, 2008, “DoD Software Engineering and System Assurance,”
January 15, 2008, p. 4. Available online at http://www.acq.osd.mil/se/briefs/2008-01-15-SSA-Boeing-Interchange.pdf. Last accessed
August 18, 2010.
25 Matthew Weigelt, 2009, “Officials Wants Their Own Software Engineering Experts, But They Don’t Want to Disregard
Industry’s Experts,” Federal Computer Week. Available online at http://fcw.com/Articles/2009/07/09/DOD-IT-systems-engineers-outsourcing.aspx.
Last accessed August 20, 2010.
26 The committee did note that federally funded research and development centers (FFRDCs) and labs can give
technical staff the opportunity to take breaks from direct support and move to programs under acquisition. These breaks enable
staff to pursue research and reconnect with the “hands-on” skills that made them valuable to the DoD acquisition process in
the first place. This keeps their skills current and allows them to cycle back to another acquisition activity with fresh thoughts and
approaches to developing DoD capabilities.
Box 2.2
Software Risks
The phrase “software risks” often appears in discussions regarding software development projects
and software-intensive systems engineering projects. It suggests danger and something that should be
prevented or avoided by project managers. But in fact there are different kinds of risks, and not all of them
involve danger. Indeed, some have an appropriate and sometimes valuable place in any innovative engineering
process and its management. Most importantly, by acknowledging and managing the various categories
of risks early in the process, particularly engineering risks but also system risks, overall risks related to both
the engineering process and the product it develops are reduced. Differences in software risks characterize
the difference between the development of precedented (routine) capabilities and unprecedented (innovative)
system capabilities. Differences among the kinds of software risks also characterize the difference between
“critical systems” and other systems.
Risk, generally speaking, is the product of the probability of occurrence of a consequence and the degree
of severity or cost of that consequence. Risk can be reduced by reducing the probability or by lessening the
extent or severity of the consequence or both. There are different types of risk; for software, the categorization
is in terms of programmatic, system, and engineering risk. (Box 2.1 describes each form of risk in detail.) There
are often tradeoffs and interactions among these risks.
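The product form of risk described above can be made concrete with a small calculation. The following Python sketch is illustrative only: the `Risk` class, the `mitigate` helper, the risk name, and all of the numbers are hypothetical and are not drawn from the committee's analysis. It shows exposure falling when the probability, the consequence, or both are reduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One risk item; every value used below is an illustrative assumption."""
    name: str
    probability: float  # chance that the consequence occurs, between 0 and 1
    consequence: float  # severity or cost if it occurs, in arbitrary cost units

    def exposure(self) -> float:
        # Risk expressed as the product of probability and consequence.
        return self.probability * self.consequence


def mitigate(risk: Risk, probability_factor: float = 1.0,
             consequence_factor: float = 1.0) -> Risk:
    """Return a copy of the risk with probability and/or consequence scaled."""
    return Risk(risk.name,
                risk.probability * probability_factor,
                risk.consequence * consequence_factor)


baseline = Risk("display update rate misses operational need", 0.4, 10.0)
less_likely = mitigate(baseline, probability_factor=0.5)
less_severe = mitigate(baseline, consequence_factor=0.5)
both = mitigate(baseline, probability_factor=0.5, consequence_factor=0.5)

for label, r in [("baseline", baseline), ("less likely", less_likely),
                 ("less severe", less_severe), ("both", both)]:
    print(f"{label}: exposure = {r.exposure():.2f}")
```

A calculation of this kind is at most a bookkeeping aid. Point estimates of probability are often unreliable for software, so the inputs deserve at least as much scrutiny as the result.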
An example is response time. To illustrate the differences and interactions, consider an example relating
to a decision regarding response time of a system, for example, how frequently the tracks of enemy
and friendly units are updated on a display. A longer response time may enable designers to employ precedented
infrastructure and other architectural elements, yielding a more predictable engineering process.
That is, from the perspective of the planning phases, a mostly linear plan to engineer the product is more
likely to yield a successful outcome. In other words, programmatic risks (or project risks) are low.
The long response time may, however, create operational difficulties due to insufficient timeliness. This
is a kind of system risk—the possibility of a system failing to accomplish its mission. That is, while there
may be low risk in producing a system with a long response time, it may be less likely to be operationally
valuable. More generally, system risks can pertain to a wide range of hazards and suitability factors in operations,
such as performance, security, usability, valid functionality, and integration and interoperation. System
risks can also include “long-tail” risks—events with high consequence and low (perceived) likelihood. In
this latter case it can be difficult to assess how much effort should be applied to mitigate the risk.
Suppose, on the basis of up-front user studies, it is decided to require a guarantee of a specific short
response time. This would certainly reduce the system risk related to suitability of the response time. But the
short response time may preclude use of the commodity infrastructure and, in the absence of validated alternatives,
create uncertainties in the engineering phases of the project regarding architectural choices. The
resulting uncertainties and consequences created within the engineering process are engineering risks.
Which is the correct architectural choice to make? If the answer is not known until the system is put
into an operational environment for test and evaluation, then the uncertainty persists for a longer period,
more engineering investment is made prior to resolution of the uncertainty, and more rework is required
should the choice need to be revised. Additionally, when one possibility is eliminated, uncertainty may
remain regarding the choice among the remaining candidate options, and further effort may be required
to resolve this choice. This adds to engineering risk, and it may add to project risk as well if there is insufficient
allowance in cost and schedule for rework in the project budget. In many cases, the costs of unwinding
previous bad decisions become prohibitive, and as a consequence the mismatched architecture (or
other aspects of the system design) becomes a legacy infliction that is constantly worked around, adding
to downstream costs and risks.
Evaluation of architectural alternatives through full development and operational tests is rarely required,
however. Techniques such as architectural modeling and simulation, for example, would enable
the architectural alternatives to be evaluated earlier in the process and at lower cost, lowering engineering
risk. (See Chapter 5 for a discussion of the associated research challenges.)
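The cost argument in the preceding paragraphs can be sketched numerically. This is a minimal illustration under stated assumptions, not a method from the report: the function name, the probability of a wrong choice, the monthly spend, the resolution times, and the cost of a modeling study are all hypothetical, and the model assumes that everything spent before the uncertainty is resolved must be redone if the choice proves wrong.

```python
def expected_rework_cost(p_wrong: float, spend_per_month: float,
                         months_until_resolved: float) -> float:
    """Expected cost of redoing work built on a wrong architectural choice.

    Simplifying assumption: all spending that precedes resolution of the
    uncertainty is lost if the choice turns out to be wrong.
    """
    return p_wrong * spend_per_month * months_until_resolved


P_WRONG = 0.3          # hypothetical chance that the initial choice is wrong
SPEND_PER_MONTH = 2.0  # hypothetical engineering spend, arbitrary cost units
MODELING_COST = 1.0    # hypothetical cost of a modeling and simulation study

# Uncertainty resolved only at operational test and evaluation (18 months in).
late = expected_rework_cost(P_WRONG, SPEND_PER_MONTH, months_until_resolved=18)

# Uncertainty resolved by early modeling (3 months in), plus the study's cost.
early = MODELING_COST + expected_rework_cost(
    P_WRONG, SPEND_PER_MONTH, months_until_resolved=3)

print(f"late resolution:  {late:.1f}")
print(f"early resolution: {early:.1f}")
```

Under these assumed numbers, early evaluation costs less overall even after paying for the study. The comparison could reverse if modeling were expensive relative to the work at risk.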
The probabilistic models for risk assessment have limitations. A software manager may find it tempting,
when considering the mathematical characterization of risk as the product of consequence and probability,
to develop mathematical models for probabilities. This is sometimes useful but also can be dangerous, since
probabilistic models often fail us in software. For example, a security vulnerability could be perceived as
high consequence, but very low likelihood, and so may be left unaddressed. But once the vulnerability
becomes known to adversaries (e.g., as a zero-day vulnerability), then the probability can rise dramatically,
and with it the extent of risk. Unfortunately, probabilistic models fail also because of aspects other than
security. The possibility of intermittent problems such as deadlocks, for example, can change quite dramatically
with changes in processor, communication, or storage infrastructure. Additionally, traditional models
of redundancy as a means to reduce risk are most effective when event probabilities are not coupled. But
this proves to be a dangerous expectation in the engineering of software.1
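The point about coupled event probabilities can be illustrated with a deliberately simple model. The sketch below is an assumption made for illustration, not a model taken from the report or from the cited study: with some probability a shared cause (for example, a common misreading of the specification) fails both redundant versions at once, and otherwise each version fails independently. The function name and all numbers are hypothetical.

```python
def joint_failure_probability(p_independent: float,
                              p_common_mode: float) -> float:
    """Probability that two redundant versions both fail on the same input.

    Assumed model: with probability p_common_mode a shared cause fails both
    versions together; otherwise each fails independently with probability
    p_independent.
    """
    return p_common_mode + (1.0 - p_common_mode) * p_independent ** 2


# Fully uncoupled failures: redundancy gives the familiar squared improvement.
uncoupled = joint_failure_probability(p_independent=0.01, p_common_mode=0.0)

# A small common-mode probability dominates the joint failure probability.
coupled = joint_failure_probability(p_independent=0.01, p_common_mode=0.005)

print(f"uncoupled joint failure probability: {uncoupled:.7f}")
print(f"coupled joint failure probability:   {coupled:.7f}")
print(f"ratio: {coupled / uncoupled:.1f}")
```

In this assumed setting, a common-mode probability of only half a percent makes simultaneous failure roughly fifty times more likely than the independence calculation suggests.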
On the other hand, in systems engineering efforts where software is embedded as part of a cyber-physical
system, there are abundant probabilistic models for faults in attached physical components, and
these models may have dependencies on other probabilistic models relating to aspects of the operating
environment. In these cases, the familiar engineering mathematics for reliability must be employed, and the
results of these analyses will inform the design of software to support tolerance or containment of errors
resulting from faults in the attached components.
Credit for engineering risk reduction. The Apollo moon missions of the 1960s had systems risk (hazard)
related to delivering and returning astronauts safely. This risk could be mitigated through various safety
mechanisms. In general, there may be little correlation between system risks and the other kinds of risks,
especially when the systems risk derives primarily from the context of operational use—system risks may
be much more dependent on characteristics of the operating environment than on precedent regarding
engineering decisions. But in the case of the Apollo missions, there was also considerable engineering risk,
particularly early in the process when basic decisions were being made and experimentation and prototyping
were being done to achieve early validation (i.e., prior to operational use) of the decisions made.
The experience of the prior Mercury and Gemini missions created precedent for many design considerations
and so served to discharge certain engineering risks. In addition to relying on hard-won experience
with prior systems, the principal approaches to mitigation of engineering risks involve incremental
development, prototyping, and modeling and simulation. These methods reduce the cost of consequences
through the early feedback and response they afford.
For innovative projects, efforts to resolve engineering risks can be a significant component of overall
project progress, and therefore in an earned value measurement regime there need to be ways to “give
credit” for identification and discharge of critical engineering risks. This can be a challenge: How, for
example, can the value to Apollo of the experience of Mercury and Gemini be weighed? Or, at a much
smaller scale, how can the value of the agile practice of ongoing refactoring be assessed at a time when
the costs are incurred? The refactoring practice enables teams to retain ongoing control over architectural
decisions and to enhance the potential for architecture-level adaptation on the basis of future needs. But
the benefits associated with the refactoring costs may appear only in later cycles, perhaps several months
later, and until then the return on the refactoring investment may be difficult to assess despite the long-term
value to the project.
1 See, e.g., Susan Brilliant, John Knight, and Nancy Leveson, 1989, “The Consistent Comparison Problem in N-Version
Programming,” IEEE Transactions on Software Engineering 15(11):1481-1485.
and retain top technical experts within its own ranks, both military and civilian, except in very
particular circumstances.27 Additionally, as discussed below, it has historically proven challenging for
those software experts who have remained within the DoD to maintain strong technical currency on an
ongoing basis. Indeed, the committee believes that the extent of software expertise within the DoD is
shrinking both relative to that of the commercial sector, and perhaps also in absolute terms.
The false perception that software and IT generally are reaching a plateau may lead to the erroneous
conclusion that the DoD can fully delegate such leadership to its supply chain. This is inconsistent
with the reality of the rapid ongoing growth of software technology (as elaborated in Chapter 1) and
the essential and growing importance of successful early architecture-focused decision making in the
development of interlinking defense systems (as elaborated in Chapter 3).
An additional challenge to the DoD is that the split between technical and management roles will
result in leaders who, on moving into management, face the prospect of losing technical excellence
and currency over time. This means that their qualifications to lead in architectural decision making
may diminish unless they can couple project management with ongoing architectural leadership and
technical engagement. The DoD does not have strong technical career paths that build on and advance
software expertise with the exception of the service labs. Upward career progression trends leading
closer to senior management-focused roles and further away from technical involvement tend to stress
general management rather than technical management experience. This is not necessarily the case in
technology-intensive roles in industry. Many of the most senior leaders in the technology industry have
technical backgrounds and continue to exercise technical roles and be engaged in technology strategy.
Nonetheless, certain DoD software needs remain sufficiently complex and unique and are not covered
by the commercial world, and therefore call for internal DoD software expertise. In the DoD, however,
as software personnel take on more management responsibility, they have less opportunity and incentive
to stay technically current. At the same time, there is an increasing need for an acquisition workforce
that has a strong understanding of the challenges in systems engineering and software-intensive systems
development. It is particularly critical to have program managers who understand modern software
development and systems.
Commercial industry also continues to have a strong need for the same types of basic software
expertise that the DoD needs and in many areas is competing with the DoD for the same pool of talent.
Notwithstanding the economic downturn, salaries for personnel in these areas remain highly competitive
in order to attract key talent. Although there have been improvements in recent years to accommodate
highly paid technical experts, DoD and other government pay scales remain generally less
competitive than those of commercial industry, making it more difficult for the DoD to attract and retain
the expertise it needs. Additionally, the DoD could strengthen its ability to tap into the talent base in
DoD-aligned research organizations and universities—for example, by sponsoring security clearances
for technology leaders.
An additional challenge that the DoD faces in attracting and obtaining key talent is the requirement for
cleared U.S. citizens. Security considerations that often preclude the hiring of non-citizens markedly
shrink the pool of available software talent. The pool of currently cleared U.S. citizens with the right
skills is not sufficient to meet the demand, and this pool could be shrinking because of the reduction in
support by the various agencies (principally the DoD, NASA, and the Department of Energy) of U.S.
universities in areas related to software producibility. (The Networking and Information Technology
Research and Development (NITRD) coordination categories are Software Design and Productivity
(SDP) and High Confidence Software and Systems (HCSS); see Box 1.5.) University programs create
the most highly qualified technical personnel, from the standpoint of pure technical expertise, which
can complement DoD expertise in program management. It is the nature of university economics that
production of PhDs, in particular, closely tracks external research support for university projects. The
recent reductions therefore mean not only that there is less U.S. research in software producibility,
but also that the pipeline of software-savvy talent is diminishing. Because the DoD will not be able to
directly hire the necessary talent in the short term to meet its growing needs, it needs to improve access
to “DoD-aligned” talent through federally funded research and development centers (FFRDCs), Service
labs, and university and industry research contractors. Flexibility regarding government personnel policies
could allow more movement for leading technical experts in and out of government service, which
could facilitate DoD maintenance of technical excellence and currency in rapidly changing fields.
27 See, for example, pp. 8-9 in DSB, November 2000, Report of the Defense Science Board Task Force on Defense Software, Office of
the Under Secretary of Defense for Acquisition, Technology, and Logistics. Available online at
http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA385923&Location=U2&doc=GetTRDoc.pdf. Last accessed August 18, 2010.
For example, because of the rapid growth in the significance of architecture-related capability on the
DoD side of major systems engineering projects, the committee has considered processes by which the
DoD can gain access to the very best architectural talent to address cross-cutting architectural requirements
and validation challenges. These processes include the assembly of architectural study groups and
review panels of top experts, including experts drawn not just from the intrinsic DoD talent pool, but
also from industry and research. Options may include recruiting engineers in mid-career
in addition to young software engineers and improving the career environment so that, irrespective of
age, a DoD software engineer can develop and maintain her skills by actually producing software. By
bringing some software engineering work in-house, the DoD may be able to stimulate interest in DoD
careers and opportunities.
The question then becomes, How does the DoD effectively become a savvy customer for these
important IT and software-related services? Traditional methods have involved some combination of
developing know-how internally and acquiring it from contractors. In each case, the necessary competence
must be available to execute the programs, with particular emphasis on technology-intensive
decision making. In much of this decision making, the DoD must define the “operating environment”
for major software and systems development efforts performed by its contractors. This operating environment
includes certain DoD-specific standards for interoperation, assurance, security, and so on. The
expertise required in the DoD is not identical to the corresponding commercial software engineering
expertise. For example, DoD large-scale software development is almost always undertaken at arm’s
length by contractors. This can complicate the implementation of practices that deviate significantly from
the “requirements-first” RFP model. For innovative systems, as the committee has noted throughout
the report, there must be ongoing interaction on topics related to architecture, incremental development,
and preventive practices in support of assurance. Without appropriate expertise and experience,
these interactions, and the associated management of incentives, can be difficult to manage successfully.
In addition, there are a growing number of areas of technology-intensive decision making in which the DoD
has particular interests and incentives that vary from those of its contractors.
Access to Talent
Although access needs to be improved, the DoD does have access to a considerable base
of talent through three DoD-aligned sources: (1) DoD FFRDCs, (2) Service labs, and (3) research contractors
in universities and industry. Despite the reduction in funding related to software (see Box 1.5),
the DoD has nonetheless taken modest but valuable actions to cultivate talent and introduce leading
young scientists to defense systems and the defense mission. A prominent example is the Computer
Science Study Group (CSSG) sponsored by DARPA28 that affords opportunities for younger researchers
to engage more directly with defense technical challenges.
28 For more information, see http://www.darpa.mil/dso/solicitations/ra07-43.html. Last accessed August 18, 2010.
Opportunities to Strengthen the DoD’s Software Expertise
There are two significant building blocks for strengthening the DoD’s intrinsic software expertise
that leverage the DoD’s particular expertise and responsibilities in two areas: operational test and
evaluation (OT&E) and information assurance (IA). As noted elsewhere in this report, there are opportunities
to expand the role of OT&E organizations to support preventive approaches to assurance and
early validation, generally for innovative large-scale systems engineering projects, particularly regarding
architecture, process, and key quality attributes, even when detailed decisions regarding specific
aspects of functional capability are deferred. A successful IA regime will require similar engagement,
as well as sophisticated interaction between defensive and offensive programs and activities, engagement
with those who operate and defend the DoD’s communications networks, and intelligence about
threats and vulnerabilities.
In the cases of both OT&E and IA, leaders in practice and technology are understood to reside in
the DoD (and in similar institutions in other countries). By creating a visible culture of elite technology-intensive
leadership in these areas, the DoD has the potential to attract top talent, in a manner
analogous to the ability of the National Security Agency (NSA) to attract top mathematicians. Although
it is important to “grow the ranks” in these areas, the DoD cannot sustain leadership unless it recruits
and engages top technical talent.
Summary
Because the DoD does not currently have the requisite expertise and talent it needs for effective
software producibility and because the rapid pace of software development demands ongoing interaction
with the field, the DoD must engage experts outside the DoD and its primes. This engagement, to
be effective, should be accompanied by internal processes to apply and incorporate contributions and
feedback to software projects throughout the systems engineering lifecycle. In other words, the DoD
should adapt processes to facilitate input from outside experts throughout the systems engineering
lifecycle for software-intensive systems, with particular emphasis on innovative/unprecedented and
large-scale systems and on systems engineering efforts involving iterative processes.
It is essential to sponsor high-quality software-related research projects. Investing in cutting-edge
software defense projects creates value not only in advancing innovation, but also in developing a
pipeline of technical experts with experience tackling DoD software producibility issues. University
research funding supports research opportunities for undergraduates, graduate students, and postdoctoral
researchers. DoD engagement with the next generation of software experts at formative stages
in their careers can encourage them to explore careers within the DoD, thus increasing the available pool of
cleared software professionals.
Also crucial is support for defense-relevant top-tier educational programs in U.S. universities to
strengthen the pipeline of top technical experts. Targeted postdoctoral grants may be another avenue
through which the DoD can encourage emerging software professionals to choose careers in the DoD.
Finding 2-6: The DoD has a growing need for software expertise, and it is not able to meet this need
through intrinsic resources. Nor is it able to fully outsource this requirement to DoD primes. The DoD
needs to be a smart software customer. This need is particularly significant for large-scale innovative
software-intensive projects for which there are cross-cutting software architectural requirements and
validation challenges.
The case for the DoD to have software expertise on its side of the table is compelling. Increasing
complexity, scale, and interoperability in a context of rapid innovation and sophisticated incremental
and iterative processes require the DoD to become a knowledgeable customer of software tools and
systems. Direct access to this necessary expertise, in light of industry’s competing interest in hiring
similar professionals, is limited. For these reasons, a combination of (1) outreach to FFRDCs and similar
DoD-focused organizations, academia, and industry and (2) internal DoD education and development
of software expertise is needed to bridge the gap.