4
Acceptance and Testing

INTRODUCTION

The test and evaluation (T&E) methodologies associated with weapons systems are mature and largely stable. In contrast, T&E methodologies for information technology (IT) systems are still maturing. Areas where challenges exist for IT systems include the assessment of Department of Defense (DOD) enterprise-level scalability, the proper role of modeling, cross-system integration validation, and interoperability validation. These and other areas lack widely agreed-on test methods and standards, and they lack operationally realistic T&E methodologies.1 Not surprisingly, commercial developers of large-scale applications experience similar challenges.

The tenets of DOD Instruction (DODI) 5000 have evolved over many decades and have served as the basis for well-established criteria and processes for decisions on weapons systems programs—for example, decisions on whether to enter into low-rate initial production or full-rate production—as well as providing commonly understood methods that comply with requisite policy and statutory guidelines. The equivalent

1 See David Castellano, “Sharing Lessons Learned Based on Systemic Program Findings,” presented at 2007 ITEA Annual International Symposium, November 12-15, 2007; National Research Council, Testing of Defense Systems in an Evolutionary Acquisition Environment, The National Academies Press, Washington, D.C., 2006; and Software Program Managers Network (SPMN), “Lessons Learned—Current Problems,” SPMN Software Development Bulletin #3, December 31, 1998.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.





decision points in DOD IT systems are quite different in the iterative, incremental development (IID) processes discussed in Chapter 3. As a result, an equivalent understanding of what is required and when it is required has not been reached for IT systems acquisition. The results are frustration for developers and other participants in the acquisition process and uncertainty and delay in the process itself. Much can be gleaned from the experience of commercial IT systems developers and suppliers—such insights are just beginning to be incorporated into DOD practice. This chapter briefly reviews key elements and shortcomings of current practice and outlines opportunities for improvement, with a focus on making the perspective of the end user more salient.

SHORTCOMINGS OF PRESENT DEFENSE TEST AND EVALUATION

The current DOD process for the acquisition of IT systems has its roots in the procurement of hardware-oriented systems that will be manufactured in quantity. As such, the DOD’s typical practice is to determine whether or not a design is adequate for its purpose before committing to advancing the production decision. For programs that are dominated by manufacturing cost, this approach reduces the possibility that a costly reworking of a system might become necessary should a defect be identified only after fielding of units in the operational environment has begun.

DODI 5000 directs programs to conduct testing and evaluation against a predetermined set of goals or requirements. As shown in Table 4.1, the acquisition process, including the T&E process, is governed by a large set of rules, test agents, and conditions, each trying to satisfy a different customer. Traditional test and acceptance encompass three basic phases: developmental test and evaluation (DT&E; see Box 4.1), the obtaining of the necessary certification and accreditation (C&A), and operational test and evaluation (OT&E; see Box 4.2).
In essence, the current approach encourages delayed testing for the assessment of the acceptability of an IT system and of whether it satisfies user expectations. A final operational test is convened in order to validate the suitability and effectiveness of the system envisioned as the ultimate deliverable, according to a specified and approved requirements document. This approach would work if the stakeholders (program manager [PM], user, and tester) were to share a common understanding of the system’s requirements. However, because of the length of time that it takes an IT system to reach a mature and stable test, what the user originally sought is often not what is currently needed. Thus, unless a responsive process had been put in place to update the requirement and associated

TABLE 4.1 Test and Evaluation Activity Matrix

Developmental Test and Evaluation (DT&E)
  Test Agent: Program management office (PMO) and/or contractor and/or government developmental test organization
  Conditions: Determined by PMO; generally benign, laboratory; developer personnel
  Customer: PMO
  Reference: Title 10; DOD Instruction (DODI) 5000 series

Operational Test and Evaluation (OT&E)
  Test Agent: Independent operational test agent
  Conditions: Operationally realistic, typical users
  Customer: PMO, end user, and Milestone Decision Authority
  Reference: Title 10; DOD Instruction (DODI) 5000 series^a

Joint Interoperability Test Certification
  Test Agent: Joint Interoperability Test Center
  Conditions: Applicable capability environments
  Customer: Command, Control and Communications Systems Directorate (J6); the Joint Staff
  Reference: DOD Directive 4630.5; DODI 4630.08; Chairman of the Joint Chiefs of Staff Instruction 6212.01D

Defense Information Assurance Certification and Accreditation Process (DIACAP)
  Test Agent: Operational test agent; Defense Information Systems Agency (DISA); DISA Field Security Operations Division; National Security Agency
  Conditions: Operational, laboratory
  Customer: Designated accrediting authority
  Reference: DODI 8510.01^b

a Also the Director of Operational Test & Evaluation (DOT&E) policy on testing software-intensive systems.
b Note also the DOT&E policy on information assurance testing during OT&E.

BOX 4.1
Developmental Testing

The process for testing U.S. military systems, including information technology systems, begins at the component level and progresses to the subsystem and system levels. Initial testing is done with components and subsystems in the laboratory, after which the testing graduates to larger subsystems and full systems.

Early developmental testing, which is conducted by the developer, is designed to create knowledge about the proposed system and about the best solutions to technical difficulties that must be confronted. Later, the Department of Defense (DOD) may participate in developmental testing that is designed to challenge the system under a wider variety of test conditions or variables. Next, more operationally realistic testing is conducted and overseen by the DOD.

BOX 4.2
Operational Assessments

Typically, operational testing, which incorporates user feedback, is more operationally realistic and stressing than is earlier developmental testing. Also, the Department of Defense may conduct operational assessments designed to measure the readiness of the system to proceed to operational testing. Overall, program success is ensured by ironing out performance issues in early operational assessments or limited user tests first.

These limited user tests are still developmental in nature, but they are designed to evaluate the military effectiveness and suitability of systems in a somewhat more operationally representative setting. For example, while still in development, information technology (IT) systems might be tested in configurations that provide interfaces with other related or ancillary IT systems on which development has already been completed. In addition, IT systems might be tested in environments that simulate the expected battlefield environments, which might involve shock, vibration, rough handling, rain or maritime conditions, and/or temperature extremes.
As with other testing, operational assessments are designed to provide data on the performance of a system relative to earlier prototypes, some of which might already have been deployed. These assessments are also designed to compare the rate of failures, system crashes, or alarms with those of earlier prototypes.

test plan, the tester could well be compelled to perform T&E against a requirement acknowledged by the time of the test to be obsolete but that once was needed, fully vetted, and approved.

Also, the DOD has not adopted a norm of requiring continuous user involvement in developmental testing. Obtaining the perspective of typical users early on and sustaining that user input throughout a development project are essential if the DOD is to exploit iterative, incremental development methods fully. This situation is exacerbated, in part, because the DOD acquisition community has not been properly and consistently trained to understand IT systems and the development and integration processes of associated emerging systems.

Mismatches between the approaches of developmental systems and the expectations of test systems will inevitably lead to failed or unhelpful operational testing. This mismatching manifests itself in at least two ways: in terms of technological expectations and of user expectations. If developers and testers fail to agree on which systems capabilities are to be supplied by the program versus those that are to be provided by existing systems, both developers and testers collectively fail because of mismatched test scope. If users fail to be engaged early in development or if testing fails to acknowledge the continuous revision of user-interface requirements, the development and testing processes collectively fail because of mismatched user engagements and expectations.

A lack of user engagement limits understanding by developers and testers as to what is required: for developers this means what capabilities to build, and for testers it means what capabilities to evaluate. In either case, neither party has an adequate opportunity to implement any corrective action under the current acquisition process.
Operational testers, independent evaluators, and the line organizations that represent end users are the key participants in operational tests. These tests include operationally realistic conditions that are designed to permit the collection of measurable data for evaluating whether a system is operationally suitable and operationally effective as measured against key performance parameters (KPPs). KPPs are descriptions of the missions that the system must be able to perform. Historically, meeting (or not meeting) the KPPs has been a “go/no-go” criterion for system fielding. By definition, a KPP is a performance parameter that, if not met, is considered by the DOD to be disqualifying. There have been many cases over the past decade or so in which the Service test community has documented “capabilities and limitations” in operational tests of systems developed in response to urgent operational needs statements from combatant commanders. This type of case provides more flexibility to decision makers, allowing them to decide whether a system is “good enough” to meet immediate wartime needs.
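The “go/no-go” character of KPP evaluation can be sketched as a simple threshold gate. The parameter names and values below are invented for illustration only; they are not drawn from any real program.

```python
# Hypothetical sketch of a KPP "go/no-go" gate: every key performance
# parameter must meet its threshold, or the system is disqualified.
from dataclasses import dataclass

@dataclass
class KPP:
    name: str
    threshold: float   # minimum acceptable value
    measured: float    # value observed in operational test

    def met(self) -> bool:
        return self.measured >= self.threshold

def fielding_decision(kpps: list[KPP]) -> bool:
    """Historically, a single unmet KPP is disqualifying."""
    return all(k.met() for k in kpps)

# Invented parameters for an IT system under test.
results = [
    KPP("message delivery rate", threshold=0.95, measured=0.97),
    KPP("query response time score", threshold=0.90, measured=0.85),
]
print(fielding_decision(results))  # one unmet KPP -> False (no-go)
```

The “capabilities and limitations” approach described above would replace this all-or-nothing gate with a narrative assessment, leaving the “good enough” judgment to the decision maker.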

In most cases the units participating in the tests necessarily are surrogates for the full range of actual end users who will ultimately receive the fielded system. Their role is to be the representatives of the users and to bring to bear the users’ perspective from an operational standpoint. Naturally, the independent operational test agency and independent evaluators will assess the ability of the system to meet the KPPs in the requirements document along with the strengths and weaknesses of the system. The operational test community will also assess whether the KPPs may or may not be the best contemporary measure of system acceptability at the time that the test is completed. Typically, operational testers, evaluators, and participating units have recent operational experience in the same or associated mission areas for supporting their assessments. Developers need similar operational insights, as the systems being developed must perform under realistic battlefield conditions and not just in the laboratory. The bottom line is that there is no substitute for a user perspective throughout the acquisition of IT systems.

Cost and schedule, rather than performance, frequently become the main drivers of a program, and developmental testing is too often given short shrift, or the amount of time allowed for operational testing is reduced. In general, program offices tend to sacrifice rigor in favor of a more condensed, “success”-oriented testing approach. As a result, user issues that should have been discovered and addressed during DT&E may escape notice until OT&E. Not only are problems more difficult and more expensive to fix at that point, but they also create negative user perceptions of the system and of the acquisition process. Resource constraints also hamper the DOD test organizations directly.
Cuts to budgets and personnel have significantly reduced the number of soldiers, sailors, airmen, and Marines available to serve as users during the test process, especially in the military T&E departments, even as systems have become more complex.2 This reduced pool of DOD testers impedes the early and close collaboration with systems acquirers and developers that is necessary to support an IID process adequately.

In summary, IT testing in the DOD remains a highly rigid, serial process without the inherent flexibility and collaboration required to support an agile-oriented iterative, incremental development process, particularly as it might be applied to IT systems. If a new IT systems acquisition process is defined and adopted, life-cycle acceptance testing that reflects the IID approach will also be needed in order to achieve success. The rest of this chapter describes how to address the challenges of testing and evaluation in the DOD acquisition environment in a way that incorporates an IID approach.

2 Defense Science Board, Report of the Defense Science Board Task Force on Developmental Test & Evaluation, Department of Defense, Washington, D.C., May 2008, available at http://www.acq.osd.mil/dsb/reports/2008-05-DTE.pdf; accessed November 4, 2009.

“BIG-R” REQUIREMENTS AND “SMALL-R” REQUIREMENTS

For IT systems, a decision point assessment rests on the system’s ability to satisfy the stated and approved “big-R” requirements. The term “big-R” requirements in this report refers to a widely recognized purpose, mission, and expected outcome. One example would be a missile system, which would be assessed on the basis of its ability to hit a target at a given range under specified conditions. Another example would be a management information system that would support specific business functional areas and be accompanied by security access levels and a specified data standard and architecture. Such high-level descriptions are expected to be fairly stable over the course of a program, although they may evolve on the basis of feedback from users and other stakeholders.

In contrast, IT systems, particularly software development and commercial off-the-shelf integration (SDCI) IT systems, cannot be expected to have stable requirements initially. The “small-r” requirements referred to in this report are the more detailed requirements, such as those associated with specific user interfaces and utilities, that are expected to evolve within the broader specified architecture as articulated in the initial big-R requirements document. In a sense, small-r requirements could also be thought of as lower-level specifications.

Stated another way, it is challenging if not impossible to accurately capture users’ detailed, small-r requirements up front, as their reactions to a prototype or newly fielded system are often negative, even though it may fully meet the specifications set forth in a requirements document.
Part of the problem is that there are so many minute details that together contribute to the usability of a system that it is nearly impossible to detail these in advance of giving users an opportunity to try out an actual running system. Usability might be a big-R requirement, but the specific details that would make that happen are small-r requirements that are essentially impossible to specify before users have been able to experiment with the system. Big-R requirements, such as the expected user interface and user paradigms and integration with other concurrently evolving systems and security practices at a high level, will all result in an unpredictable set of specifications and small-r requirements in practice. The need to manage big-R requirements coupled with changing and/or ill-specified small-r requirements is another reason that an iterative, incremental development process is well suited to these types of systems.
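The relationship can be pictured as a two-level structure: a stable big-R capability statement anchoring a set of small-r details that are revised as users exercise prototypes. The requirement names and details below are invented for illustration.

```python
# Hypothetical sketch: a big-R requirement is stable, while its small-r
# details are revised iteration by iteration from user feedback.
from dataclasses import dataclass, field

@dataclass
class BigR:
    capability: str                       # stable, high-level mission statement
    small_r: dict[str, str] = field(default_factory=dict)  # evolving details

    def revise(self, key: str, detail: str) -> None:
        """Small-r details may be added or replaced after each prototype."""
        self.small_r[key] = detail

usability = BigR("Operators can complete routine tasks without training")
usability.revise("task_menu", "single flat menu")           # initial guess
usability.revise("task_menu", "grouped by mission thread")  # after user trials
print(usability.capability, "->", usability.small_r)
```

The big-R statement never changes here; only the small-r entry under it is rewritten, which is the pattern an IID process is built to accommodate.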

INCORPORATING THE VOICE OF THE USER

A significant problem across the DOD in the development and acquisition of IT systems is the lack of ongoing user input, both as a means to determine needed capabilities and as a measure of the success of program development. In the current IT environment, user needs are constantly changing; such constant change is an ever-present factor that breaks any development or testing model which assumes a consistent, comprehensive set of user requirements (both big-R and small-r). Typical DOD IT systems have become so complex that describing either big-R or small-r requirements up front has become severely problematic. In comparison, IID approaches to IT systems rely on user feedback early and at intermediary points to guide development.

Without significant user involvement in developmental testing, the earliest point at which users will be involved may not be until operational testing, far too late in the development process for an IT system. It is crucial to learn from user experiences early in the development process, when a system can be improved more easily and at less expense. If requirements (big-R and especially small-r) are not well understood or are likely to change during the course of the project—conditions that are commonly found in DOD IT developments—an iterative approach with regular and frequent, if not continuous, user feedback will produce results faster and more efficiently than the traditional DOD approach can.

The adoption of IID approaches coupled with a focus on the end-user experience does not mean, however, that other stakeholders and nonfunctional requirements (such as information assurance, reliability, and so on) are unimportant. Historically, other stakeholder voices have dominated the process to the exclusion of the end user.
The committee urges a rebalancing and a focus on end-user mission success that incorporates higher-level architectural and nonfunctional requirements at appropriate phases of system development and deployment.

TOWARD CONTINUOUS OPERATIONAL ASSESSMENT

Agile approaches and iterative, incremental approaches to software development have been receiving increased attention in industry in recent years and are having a significant impact on how software is developed and how systems are tested. While architectural and systems-level engineering considerations will continue to be significant drivers of system success, the shift toward IID and agile processes also appropriately requires the incorporation of user perspectives throughout the development life cycle. In the case of assembling commercial off-the-shelf (COTS) solutions, the criteria for solutions that please users are often related less to the technical architecture that accounts for how the components are

put together than to the workflow of the resulting system—How easy is it to perform specific tasks with the system? How easy is it to maintain, manage, or perform upgrades on the system? These questions typically cannot be answered up front but rather are best answered through end-user experiences. In the case of any new systems development, user input is critical to the requirements and deployment-readiness of the resulting system, but generally users are unable to specify the details of these needs in the requirements phase. Thus, requirements in IT systems are best guided through user input based on direct experience with current prototypes. The acquisition approach described in this report incorporates the restructuring of the DOD testing and acceptance process to be the enabler of this user input, and the incorporation of operationally realistic testing with user feedback into a routine of continuous operational assessment.

A continuous operational assessment process would replace the traditional DOD development, testing, and acquisition process. A continuous process would need to encompass two critical features: an acceptance team and a metrics collection and reporting capability:

• An acceptance team (AT) would be charged with providing feedback on the acceptability of the solution toward meeting the users’ goals. The AT should be drawn from the expected initial users of the IT system and ideally should be the cadre that first tests the system in the field and then trains subsequent users in use of the system. The AT would work with the development team and acquisition (program management) team through the entire process. The AT would initially engage with these teams as high-level requirements are gathered and initial big-R requirements documents are published, and then it would continue in this relationship through to product deployment.
An important function of the AT would be to keep the requirements, functional deliverables, and test plans synchronized.

• A robust metrics collection and reporting capability (MCRC) would collect, aggregate, and analyze end-user service consumption and experiences. The MCRC—leveraging existing operational tools, including enterprise monitoring and management capabilities, DOD information assurance capabilities, and commercial capabilities in combination with measured end-user performance—would provide visibility into what functions end users are actually using in accomplishing their day-to-day missions and how they are using them. The MCRC would provide all stakeholders with a clear indication of operational effectiveness based on actual operational use.
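The collection-and-aggregation half of such a capability can be sketched minimally as follows. The event names and fields are invented; a real MCRC would ride on the existing enterprise monitoring and management tools mentioned above rather than ad hoc code.

```python
# Hypothetical sketch of an MCRC-style aggregator: tally which functions
# end users actually exercise, as one input to operational effectiveness.
from collections import Counter

class UsageAggregator:
    def __init__(self) -> None:
        self.by_function: Counter = Counter()

    def record(self, user: str, function: str) -> None:
        """One end-user action; a real system would also log timing and errors."""
        self.by_function[function] += 1

    def most_used(self, n: int = 3) -> list:
        return self.by_function.most_common(n)

agg = UsageAggregator()
for fn in ["search", "search", "report", "search", "admin"]:
    agg.record("user1", fn)
print(agg.most_used(2))  # [('search', 3), ('report', 1)]
```

The reporting half would then expose these tallies to all stakeholders so that effectiveness claims rest on observed use rather than assertion.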

Acceptance Teams

The development process continues through a set of iterations as described earlier in the report. In each iteration, the AT would fill the role of “the customer” on matters pertaining to the achievement of iteration objectives. The AT would evaluate each iteration prototype and identify issues and plausible changes to satisfy user acceptance of stated requirements before the development process went forward. Thus, the development team would secure early feedback from the AT on unclear, misunderstood, or incorrect requirements. For each iteration, the AT, in its role as proxy for the ultimate customer, would validate the current requirements list while the development team would provide work estimates and cost projections and develop a plan for the next iteration.

A responsibility of the AT in assisting the development process would include building (or working with the development team to build) acceptance tests for each iteration. This approach is a cornerstone of Test-Driven Development: By Example,3 in which tests for a deliverable are written first, and then a system to satisfy those tests is developed. Using tests in this manner is particularly helpful with larger, more complex organizations as it helps remove ambiguity or gaps in communication.

Moving to an IID methodology does not imply any reduction in oversight or the shepherding of acquisition funds. To ensure satisfactory progress of programs, periodic and regular progress checkpoints need to occur with the acquisition executive. In keeping with agile-inspired methodologies, these checkpoints should be based on calendar or funding milestones rather than being requirements-driven progress points.
In agile processes, iterations are based on time-boxing work schedules, whereby some content may slip from one iteration to the next, but iterations are closed according to the schedule, thus allowing for prompt identification of erroneous estimates of the time required to complete work items and ensuring continuous user input regarding priorities. In a similar manner, checkpoints with acquisition executives could be based either on time duration or on a funding milestone (as the two are frequently closely correlated in an IT project). These checkpoints should be less frequent than one per iteration, perhaps occurring every 6 to 18 months.

In these checkpoints, the AT would be the key voice in articulating the “value delivered.” It might be that at a checkpoint, the requirements delivered are not as anticipated at the start of the program, but the value to the end users is still quite significant. In such cases, the acquisition executive would examine and understand the reasons for the deviation in

3 Kent Beck, Test-Driven Development: By Example, Addison-Wesley, Old Tappan, N.J., 2003.

the plan and take into account user reactions to progress in the program as far as it had gone. Conversely, a program might be tracking exactly to the initial schedule but could be producing an asset with which the AT is not satisfied. In such cases the acquisition executive should intervene and determine if corrective action is possible or if the fault lies in the concept of the program itself. In either case, the acquisition executive is getting regular feedback on project progress with a clear opinion from those who will benefit from the system.

OT&E is, at its core, about determining the effectiveness and suitability of a system for deployment. During development and before a system reached OT&E, the AT would carry this responsibility. The AT would recommend to the acquisition executive when an iteration of an IT system was ready for deployment. Such deployments can take several forms. Initial deployments may be limited in scope, with the intention of testing system effectiveness or of allowing some features to be exploited while others continue to be developed (e.g., joint programs that are initially deployed with a single Service). Other deployments may be wider in scope to allow value to be captured from these systems while more advanced features are developed in future iterations. An important difference in this process is that deployment is not a one-time event at the end of the program. The AT would recommend deployment and the scope of deployment on the basis of the operational benefit, functional value to the user, and risk of deployment, but would not need to (and should not) wait until “completion” of the entire system before making such recommendations.
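The time-boxing rule described earlier—iterations close on schedule while unfinished content slips to the next iteration—can be sketched as follows. The work items and day estimates are invented for illustration.

```python
# Hypothetical sketch of a time-boxed iteration: the iteration closes when
# its budget of days is spent; unfinished items slip, the date does not.
def run_iteration(backlog, budget_days):
    done, spent = [], 0
    remaining = list(backlog)  # items are (name, estimated days), in priority order
    while remaining and spent + remaining[0][1] <= budget_days:
        item, cost = remaining.pop(0)
        done.append(item)
        spent += cost
    return done, remaining

backlog = [("login screen", 4), ("report export", 5), ("audit log", 6)]
done, slipped = run_iteration(backlog, budget_days=10)
print(done)     # ['login screen', 'report export']
print(slipped)  # [('audit log', 6)]
```

Because the iteration boundary is fixed, the slipped item is visible immediately, which is exactly the prompt exposure of erroneous estimates that the text attributes to time-boxing.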
Evaluation Through Operational Use Metrics

One of the major lessons learned from interactions with commercial suppliers of IT systems is that significant benefits come from these suppliers’ understanding of and reliance on actual end-user behavior as measured by actual end-user actions. In fact, the instrumentation of products and services for the collection of actual end-user metrics is so engrained in many commercial IT companies as to be unremarkable to commercial suppliers. Indeed, because this practice is second nature to those in the commercial sector, it was necessary for the committee to expressly solicit comments on this point from briefers. Commercial IT companies drive their entire investment portfolios on the basis of anticipated and actual end-user consumption patterns and/or end users’ engagement with their offered products and services. Those products and services with large and committed user bases drive a preponderance of the businesses’ valuation and receive commensurate corporate leadership attention and investment. Small changes in interaction patterns are induced and then

measured and analyzed minutely for an understanding of how best to improve the user experience and increase user satisfaction as measured by user engagement. As a result, virtually every aspect of the business is focused on satisfying end-user needs as quickly as possible, with the associated increase in network-based productivity that has been witnessed on the World Wide Web.

Many tools can be used to gather this information, including run-time configuration management, run-time collection and reporting, near-real-time aggregation, and business analytics. Many of the supporting tools are also used by service operations to identify early signs of technology outages or slowdowns and to begin assessing and diagnosing a problem before it becomes a catastrophic failure.

This report recommends incorporating the voice of the end user at all stages of the system life cycle. From a test perspective, such incorporation focuses resources in a number of important ways:

• Those services that are integral to every higher-level capability receive special attention during testing and are added incrementally much more deliberately, which may impact fielding cycle time;

• Those services that are exercised most strenuously and frequently by end users directly experience extensive, highly iterative beta testing, with actual successful end-user sessions forming the basis for a determination of operational effectiveness; and

• Field failures are automatically reported and incorporated into development and testing procedures as necessary, resulting in “living” test documents.
Establishing an MCRC along with leveraging current DOD tools and available commercial tools and practices would overtly move the operational evaluation assessment from a speculative proposition based on surrogate run times, users, test data, and marginally current requirements specifications, to a managed and measured investment assessment based on current, actual end-user missions and needs.

INCORPORATING COMMON SERVICES DEFINITIONS

For years, IT systems developers have employed functionalities that are externally supplied and operationally validated as a basis for their success, without expecting to revalidate those functionalities as part of their formal test regimen. Examples include the use of the Internet Protocol (IP) and the Domain Name System (DNS). These capabilities are provided externally to the capability being tested, have already been validated in separate acquisitions, and thus need not be included in the scope of the IT systems test regimen. As more services have been commoditized and/or supplied through commercial means, IT systems acquisition practice has not changed to address this reality. Unfortunately, no mechanisms exist to identify and track these supplied services or to apply a consistent approach for their use throughout the current DOD acquisition process. This is another negative repercussion of the weapons systems-based acquisition approach, where far fewer opportunities for shared services exist.

For example, the DOD has broadly adopted a set of networking capabilities that are integral to every IT system and that do not require revalidation (e.g., the IP and DNS capabilities mentioned above). As more of the technology stack becomes commoditized or provided as a service, the set of associated capabilities not requiring revalidation should likewise grow. In addition, testing approaches should be broadened to account for this commoditization. Developers of common service capabilities should account for the full range of possible application, from strategic, fixed-base IT systems to tactical, end-user-focused IT systems. Similarly, the testing of common services should reflect the full intended scope of application of the services. Dependent developers should be permitted and encouraged to view these common services as operationally validated externally, as long as they adhere to the terms of the supplied service. These developers' test teams should accept and treat these as externally supplied and acceptable operational services, regardless of the validating organization.

Commercial off-the-shelf hardware, software, and services (CHSS) IT systems acquisition can similarly better leverage commercial experience in place of formal DOD testing and oversight review.
Service-level agreements established for commercial services—which often have been validated by thousands or millions of users or hundreds of commercial entities—constitute, in effect, a test environment whose results should be accepted prima facie. Past validation of platform components—either validation from widespread use in the commercial marketplace or prior validation in an SDCI IT system—can also substitute for new testing. As an example, consider a public key infrastructure (PKI) that is sourced and validated as a well-defined service on which secure identity management will depend. As a result, the PKI program manager must architect the supplied service to support the range of users anticipated. The PKI test regimen should address the specified terms of service that result for the PKI. So long as a dependent PM uses the standard service interface and has no requirements that exceed the already-validated PKI service scope, the PKI should be treated as an existing and validated external service that is outside the scope of the dependent PM’s formal testing regimen. Other examples include “on-demand” computing and storage, network-centric enterprise services, and visualization services.
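The PKI example reduces to a simple scope test: a dependent program may treat the service as externally validated only while its needs stay inside the already-tested terms of service. The `ServiceTerms` fields, interface names, and numbers below are hypothetical, invented purely to illustrate the decision rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceTerms:
    """Hypothetical terms of a validated common service (e.g., a PKI)."""
    name: str
    max_users: int
    interfaces: frozenset

def needs_revalidation(terms: ServiceTerms, required_users: int,
                       required_interface: str) -> bool:
    """A dependent program can treat the service as externally validated
    only if it stays inside the already-tested scope; anything beyond
    that scope pulls the service back into the formal test regimen."""
    exceeds_scale = required_users > terms.max_users
    nonstandard = required_interface not in terms.interfaces
    return exceeds_scale or nonstandard

pki = ServiceTerms("PKI", max_users=2_000_000,
                   interfaces=frozenset({"x509-issue", "x509-revoke"}))

print(needs_revalidation(pki, 500_000, "x509-issue"))    # within scope -> False
print(needs_revalidation(pki, 5_000_000, "x509-issue"))  # exceeds scale -> True
```

A registry of such terms-of-service records is one way the missing "identify and track supplied services" mechanism noted earlier could be made concrete.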

This approach to the evaluation of common services avoids the cost and time required to recertify proven products individually. Risks of undetected failures in these products are mitigated by development tests of integrated modules and operational testing when the composite system undergoes rigorous evaluations to determine effectiveness and suitability.

VIRTUAL INFORMATION TECHNOLOGY TEST ENVIRONMENTS

The use of integrated virtual information technology test environments may be one way to facilitate testing that would allow early prototypes of systems to be subjected to much more realistic test conditions, thereby helping to identify potential problems in development as soon as possible. Such test environments would rely on a distributed test network that could be accessed by both government and industry, when appropriate, for use in performing early acceptance testing. A broad range of simulation systems and operational command-and-control systems that can represent realistic operational elements would provide the necessary data to drive such systems during testing. Linking the proponents of these simulations and systems through a distributed network would allow them to maintain the systems within their existing facilities while also providing opportunities for use during larger test events. Additional applications necessary to control, monitor, and log data during such tests would augment the sets of simulations and systems. It is important that virtual IT test environments have the ability not only to test the basic functionality of systems but also to emulate as much of the expected operational environment as possible.
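A minimal sketch of testing with an operational perspective: drive an early prototype with traffic shaped like real-world usage rather than lab-clean inputs, so that latent failure modes surface during development. The buffer limit and message-size mix below are invented for illustration.

```python
def system_under_test(message: bytes) -> bytes:
    """Toy system under test: echoes a message, but only up to a fixed
    buffer size -- a latent failure mode that lab-clean inputs never hit."""
    if len(message) > 1024:
        raise ValueError("buffer overflow")
    return message

def operational_traffic(n: int):
    """Simulated real-world usage: mostly typical messages, with the
    occasional oversized one that an operational network would produce."""
    sizes = [64, 256, 512, 2048]  # 2048 exceeds the toy system's buffer
    for i in range(n):
        yield bytes(sizes[i % len(sizes)])

# Developmental test run against operationally shaped inputs.
failures = 0
for msg in operational_traffic(100):
    try:
        system_under_test(msg)
    except ValueError:
        failures += 1

print(failures)  # operationally realistic inputs surface the failure early
```

A purely functional test that fed only small, well-formed messages would pass every time; the operational mix exposes the failure mode before fielding, which is exactly the benefit the NRC recommendation quoted below argues for.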
One of the recommendations of the National Research Council report Testing of Defense Systems in an Evolutionary Acquisition Environment was to "revise DOD testing procedures to explicitly require that developmental tests have an operational perspective (i.e., are representative of real-world usage conditions) in order to increase the likelihood of early identification of operational failure modes."4 A simulation-based test environment has the potential to provide such functionality, as has been shown in multiple DOD training and experimentation environments.

The technology necessary to achieve virtual test environments is already well established. In fact, multiple (somewhat duplicative and overlapping) programs that have similar capabilities for doing exactly this kind of testing already exist within the DOD. These programs may

4 National Research Council, Testing of Defense Systems in an Evolutionary Acquisition Environment, The National Academies Press, Washington, D.C., 2006.

provide an important starting point from which an expanded capability for early and continuous acceptance testing could be implemented. A sampling of such programs from across the military Services and defense agencies includes the following:

• The Systems of Systems Integration Laboratory (SoSIL) is a large-scale communications network for modeling and simulation, hardware and software integration, and virtual operational testing; SoSIL also offers a "soldier-in-the-loop" capability.5
• The Army's Cross Command Collaboration Effort is an effort to establish and evolve a consistent and core set of modeling and simulation tools, data, and business processes that meet the common environment requirements of the U.S. Army's Training and Doctrine Command, Army Test and Evaluation Command, and Research, Development and Engineering Command. This common environment will facilitate those organizations' interoperability with the materiel development community to help conduct the distributed development of doctrine, organizations, training, materiel, leadership and education, personnel, and facilities.6
• The Air Force Integrated Collaborative Environment (AF ICE) is intended to provide a persistent, composable, flexible infrastructure along with a series of tools, standards, processes, and policies for using the environment to conduct the continuous analysis required to support a capabilities-based planning process.7
• The Joint Mission Environment Test Capability (JMETC) was established in October 2006 to "link distributed facilities on a persistent network, thus enabling customers to develop and test warfighting capabilities in a realistic joint context."8 JMETC has already established a persistent test network, through the Secret Defense Research and Engineering Network, which provides connectivity to both Service and industry assets.
It relies on the Test and Training Enabling Architecture (TENA) as its infrastructure for data exchange; TENA provides a standard object model and interfaces to the Distributed Interactive Simulation Protocol and the

5 Boeing, FCS Systems of Systems Integration Laboratory Backgrounder, May 2007, available at www.boeing.com/defense-space/ic/fcs/bia/080523_sosil_bkgndr.pdf; accessed November 12, 2009.
6 Brian Hobson and Donald Kroening, "Cross Command Collaboration Effort (3CE)," Spring Simulation Multiconference: Proceedings of the 2007 Spring Simulation Multiconference, Vol. 3, 2007.
7 B. Eileen Bjorkman and Timothy Menke, "Air Force-Integrated Collaborative Environment (AF-ICE) Development Philosophy," ITEA Journal of Test and Evaluation 27(1), March/April 2006.
8 Richard Lockhard and Chip Ferguson, "Joint Mission Environment Test Capability (JMETC)," ITEA Journal 29:160-166, 2008.

High Level Architecture, which are widely used standards for modeling and simulation. JMETC has already established linkages to the Future Combat System program and AF ICE.9

• The Distributed Engineering Plant (DEP) was established in 1998 by the Naval Sea Systems Command (NAVSEA) to identify and resolve combat battle management command, control, communications, and computers (C4) systems interoperability problems prior to deploying new and upgraded systems to sea. Enabled by today's newest networking technology, DEP links the Navy's shore-based combat systems/C4/hardware test sites, which are located in geographically disparate facilities across the nation, into a virtual shore-based battle group that exactly replicates a battle group fighting at sea. By inserting "ground truth" system simulation and stimulation data and then observing how the combat systems exchange and display tactical data, engineers can identify precisely and solve interoperability problems ashore well before those systems enter the operating forces. This approach emphasizes shore-based testing and warfare systems integration and interoperability testing and acceptance certification of operational IT systems in a test environment similar to their ultimate shipboard operational environment; it also emphasizes interoperability assessments, which are a prerequisite for the operational certification of the ships in strike force configurations prior to deployment.

Obviously, numerous organizations across the DOD with roles and missions oriented toward testing and evaluation may also have capabilities that could be leveraged for such an effort. Among them are the Joint Interoperability Test Command, the Army Test and Evaluation Command, the Air Force Operational Test and Evaluation Center, and the Navy Operational Testing and Evaluation Force.
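A DEP-style interoperability check reduces to comparing injected ground truth with what each networked system actually displays. The track data, classifications, and system names below are invented solely to illustrate the comparison.

```python
# Hypothetical DEP-style check: inject known "ground truth" tracks, then
# compare what each shore-based combat system actually displays.
GROUND_TRUTH = {101: ("hostile", 12.0), 102: ("friendly", 30.5)}

def find_interoperability_faults(system_name, displayed):
    """Return human-readable discrepancies between the injected ground
    truth and one system's tactical picture."""
    faults = []
    for track_id, truth in GROUND_TRUTH.items():
        shown = displayed.get(track_id)
        if shown is None:
            faults.append(f"{system_name}: track {track_id} dropped")
        elif shown != truth:
            faults.append(f"{system_name}: track {track_id} shows {shown}, "
                          f"truth is {truth}")
    return faults

# Two systems fed the same stimulation data render it differently.
ship_a = {101: ("hostile", 12.0), 102: ("friendly", 30.5)}
ship_b = {101: ("unknown", 12.0)}  # misclassifies track 101, drops track 102

print(find_interoperability_faults("ship_a", ship_a))       # []
print(len(find_interoperability_faults("ship_b", ship_b)))  # 2
```

Because the injected truth is known exactly, every discrepancy is attributable to the systems' exchange or display logic, which is what lets engineers localize interoperability problems ashore before deployment.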
Numerous software testing and distributed testing capabilities also exist in organizations such as the Defense Advanced Research Projects Agency and the various research laboratories within the Services.

Establishing the technical underpinnings of virtual IT test environments is only part of the solution to improving early testing in acquisition. In addition, appropriate policy and process changes would need to be implemented to mandate activities that would utilize such an environment. Some of the issues that must be addressed include data sharing across a testing enterprise, the establishment of standards for data exchange, and the formalizing of the role of early testing in acquisition—likely requiring revisions to DODI 5000.2. Other issues include the governance and management of such a capability and the roles and responsibilities in terms of executing testing in such an environment. Some of these issues are raised in the DOD's Acquisition Modeling and Simulation Master Plan, issued in April 2006, which focuses on improving the role of modeling and simulation for testing.10

Another challenge in making such environments usable is to ensure that the complexity required to perform the integration of systems and configuration for tests is minimized; otherwise, the costs of using such test environments would far outweigh the benefits. Paramount in managing the complexity involved is the establishment of a formal systems engineering process involving test design, integration, documentation, configuration management, execution, data collection, and analysis. Also important is the establishment of standards for simulation and systems interoperability that allow for common interfaces and the reuse of systems while also providing enough flexibility to adapt to new requirements. And finally, a management process must be married to the systems engineering process so that users are invited to participate and incentives are created for industry and government to share data and work toward common goals.

9 Test and Training Enabling Architecture Software Development Activity, JMETC VPN Fact Sheet, Central Test & Evaluation Investment Program, U.S. Joint Forces Command, Department of Defense, Washington, D.C., November 23, 2009, available at https://www.tena-sda.org/download/attachments/6750/JMETC-VPN-2009-11-23.pdf?version=1.pdf.
10 Department of Defense, Acquisition Modeling and Simulation Master Plan, Software Engineering Forum, Office of the Under Secretary of Defense (Acquisition, Technology and Logistics) Defense Systems, Washington, D.C., April 17, 2006.
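The "common interfaces" standards discussed above are the role TENA's standard object model plays in JMETC: every participant on the distributed network exchanges the same typed record over one agreed wire format. The sketch below is a toy publish/subscribe bus built for illustration; it is not TENA's actual API or object model.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class TrackObject:
    """Hypothetical standard object model (in the spirit of TENA): every
    participant exchanges the same typed record instead of ad hoc formats."""
    track_id: int
    lat: float
    lon: float
    source: str

class TestBus:
    """Minimal publish/subscribe bus linking distributed test participants."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, obj: TrackObject):
        wire = json.dumps(asdict(obj))  # one wire format for every participant
        for cb in self.subscribers:
            # Each subscriber reconstructs the same typed object, so a new
            # simulation or system plugs in without bespoke translation code.
            cb(TrackObject(**json.loads(wire)))

received = []
bus = TestBus()
bus.subscribe(received.append)
bus.publish(TrackObject(42, 38.9, -77.0, "simulator"))
print(received[0].track_id)  # 42
```

The design point is that the interface contract lives in the shared object definition, not in pairwise agreements between participants, which is what allows reuse of systems across test events while still leaving room for the model to evolve.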
