4
Acceptance and Testing

INTRODUCTION

The test and evaluation (T&E) methodologies associated with weapons systems are mature and largely stable. In contrast, T&E methodologies for information technology (IT) systems are still maturing. Areas where challenges exist for IT systems include the assessment of Department of Defense (DOD) enterprise-level scalability, the proper role of modeling, cross-system integration validation, and interoperability validation. These and other areas lack widely agreed-on test methods and standards, and they lack operationally realistic T&E methodologies.1 Not surprisingly, commercial developers of large-scale applications experience similar challenges.

The tenets of DOD Instruction (DODI) 5000 have evolved over many decades and have served as the basis for well-established criteria and processes for decisions on weapons systems programs—for example, decisions on whether to enter into low-rate initial production or full-rate production—as well as providing commonly understood methods that comply with requisite policy and statutory guidelines. The equivalent

1 See David Castellano, “Sharing Lessons Learned Based on Systemic Program Findings,” presented at 2007 ITEA Annual International Symposium, November 12-15, 2007; National Research Council, Testing of Defense Systems in an Evolutionary Acquisition Environment, The National Academies Press, Washington, D.C., 2006; and Software Program Managers Network (SPMN), “Lessons Learned—Current Problems,” SPMN Software Development Bulletin #3, December 31, 1998.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.





decision points in DOD IT systems are quite different in the iterative, incremental development (IID) processes discussed in Chapter 3. As a result, an equivalent understanding of what is required and when it is required has not been reached for IT systems acquisition. The results are frustration for developers and other participants in the acquisition process and uncertainty and delay in the process itself. Much can be gleaned from the experience of commercial IT systems developers and suppliers—such insights are just beginning to be incorporated into DOD practice. This chapter briefly reviews key elements and shortcomings of current practice and outlines opportunities for improvement, with a focus on making the perspective of the end user more salient.

SHORTCOMINGS OF PRESENT DEFENSE TEST AND EVALUATION

The current DOD process for the acquisition of IT systems has its roots in the procurement of hardware-oriented systems that will be manufactured in quantity. As such, the DOD’s typical practice is to determine whether or not a design is adequate for its purpose before committing to advancing the production decision. For programs that are dominated by manufacturing cost, this approach reduces the possibility that a costly reworking of a system might become necessary should a defect be identified only after fielding of units in the operational environment has begun.

DODI 5000 directs programs to conduct testing and evaluation against a predetermined set of goals or requirements. As shown in Table 4.1, the acquisition process, including the T&E process, is governed by a large set of rules, test agents, and conditions, each trying to satisfy a different customer. Traditional test and acceptance encompass three basic phases: developmental test and evaluation (DT&E; see Box 4.1), the obtaining of the necessary certification and accreditation (C&A), and operational test and evaluation (OT&E; see Box 4.2).
In essence, the current approach encourages delayed testing for the assessment of the acceptability of an IT system and of whether it satisfies user expectations. A final operational test is convened in order to validate the suitability and effectiveness of the system envisioned as the ultimate deliverable, according to a specified and approved requirements document. This approach would work if the stakeholders (program manager [PM], user, and tester) were to share a common understanding of the system’s requirements. However, because of the length of time that it takes an IT system to reach a mature and stable test, what the user originally sought is often not what is currently needed. Thus, unless a responsive process had been put in place to update the requirement and associated

TABLE 4.1 Test and Evaluation Activity Matrix

Developmental Test and Evaluation (DT&E)
  Test Agent: Program management office (PMO) and/or contractor and/or government developmental test organization
  Conditions: Determined by PMO; generally benign, laboratory; developer personnel
  Customer: PMO
  Reference: Title 10; DOD Instruction (DODI) 5000 series

Operational Test and Evaluation (OT&E)
  Test Agent: Independent operational test agent
  Conditions: Operationally realistic, typical users
  Customer: PMO, end user, and Milestone Decision Authority
  Reference: Title 10; DOD Instruction (DODI) 5000 series^a

Joint Interoperability Test Certification
  Test Agent: Joint Interoperability Test Center
  Conditions: Applicable capability environments
  Customer: Command, Control and Communications Systems Directorate (J6); the Joint Staff
  Reference: DOD Directive 4630.5; DODI 4630.08; Chairman of the Joint Chiefs of Staff Instruction 6212.01D

Defense Information Assurance Certification and Accreditation Process (DIACAP)
  Test Agent: Operational test agent; Defense Information Systems Agency (DISA); DISA Field Security Operations Division; National Security Agency
  Conditions: Operational, laboratory
  Customer: Designated accrediting authority
  Reference: DODI 8510.01^b

a Also the Director of Operational Test & Evaluation (DOT&E) policy on testing software-intensive systems.
b Note also the DOT&E policy on information assurance testing during OT&E.

BOX 4.1
Developmental Testing

The process for testing U.S. military systems, including information technology systems, begins at the component level and progresses to the subsystem and system levels. Initial testing is done with components and subsystems in the laboratory, after which the testing graduates to larger subsystems and full systems.

Early developmental testing, which is conducted by the developer, is designed to create knowledge about the proposed system and about the best solutions to technical difficulties that must be confronted. Later, the Department of Defense (DOD) may participate in developmental testing that is designed to challenge the system under a wider variety of test conditions or variables. Next, more operationally realistic testing is conducted and overseen by the DOD.

BOX 4.2
Operational Assessments

Typically, operational testing, which incorporates user feedback, is more operationally realistic and stressing than is earlier developmental testing. Also, the Department of Defense may conduct operational assessments designed to measure the readiness of the system to proceed to operational testing. Overall, program success is ensured by ironing out performance issues in early operational assessments or limited user tests first.

These limited user tests are still developmental in nature, but they are designed to evaluate the military effectiveness and suitability of systems in a somewhat more operationally representative setting. For example, while still in development, information technology (IT) systems might be tested in configurations that provide interfaces with other related or ancillary IT systems on which development has already been completed. In addition, IT systems might be tested in environments that simulate the expected battlefield environments, which might involve shock, vibration, rough handling, rain or maritime conditions, and/or temperature extremes.
As with other testing, operational assessments are designed to provide data on the performance of a system relative to earlier prototypes, some of which might already have been deployed. These assessments are also designed to compare the rate of failures, system crashes, or alarms with those of earlier prototypes.

test plan, the tester could well be compelled to perform T&E against a requirement acknowledged by the time of the test to be obsolete but that once was needed, fully vetted, and approved.

Also, the DOD has not adopted a norm of requiring continuous user involvement in developmental testing. Obtaining the perspective of typical users early on and sustaining that user input throughout a development project are essential if the DOD is to exploit iterative, incremental development methods fully. This situation is exacerbated, in part, because the DOD acquisition community has not been properly and consistently trained to understand IT systems and the development and integration processes of associated emerging systems.

Mismatches between the approaches of developmental systems and the expectations of test systems will inevitably lead to failed or unhelpful operational testing. This mismatching manifests itself in at least two ways: in terms of technological expectations and of user expectations. If developers and testers fail to agree on which systems capabilities are to be supplied by the program versus those that are to be provided by existing systems, both developers and testers collectively fail because of mismatched test scope. If users fail to be engaged early in development or if testing fails to acknowledge the continuous revision of user-interface requirements, the development and testing processes collectively fail because of mismatched user engagements and expectations.

A lack of user engagement limits understanding by developers and testers as to what is required: for developers this means what capabilities to build, and for testers it means what capabilities to evaluate. In either case, neither party has an adequate opportunity to implement any corrective action under the current acquisition process.
Operational testers, independent evaluators, and the line organizations that represent end users are the key participants in operational tests. These tests include operationally realistic conditions that are designed to permit the collection of measurable data for evaluating whether a system is operationally suitable and operationally effective as measured against key performance parameters (KPPs). KPPs are descriptions of the missions that the system must be able to perform. Historically, meeting (or not meeting) the KPPs has been a “go/no-go” criterion for system fielding. By definition, a KPP is a performance parameter that, if not met, is considered by the DOD to be disqualifying. There have been many cases over the past decade or so in which the Service test community has documented “capabilities and limitations” in operational tests of systems developed in response to urgent operational needs statements from combatant commanders. This type of case provides more flexibility to decision makers, allowing them to decide whether a system is “good enough” to meet immediate wartime needs.
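The “go/no-go” character of KPP evaluation can be sketched as a simple threshold gate. The parameter names and values below are invented for illustration only; they are not drawn from any real program.

```python
# Hypothetical sketch of a KPP "go/no-go" gate: every key performance
# parameter must meet its threshold, or the system is disqualified.
from dataclasses import dataclass

@dataclass
class KPP:
    name: str
    threshold: float   # minimum acceptable value
    measured: float    # value observed in operational test

    def met(self) -> bool:
        return self.measured >= self.threshold

def fielding_decision(kpps: list[KPP]) -> bool:
    """Historically, a single unmet KPP is disqualifying."""
    return all(k.met() for k in kpps)

# Invented parameters for an IT system under test.
results = [
    KPP("message delivery rate", threshold=0.95, measured=0.97),
    KPP("query response time score", threshold=0.90, measured=0.85),
]
print(fielding_decision(results))  # one unmet KPP -> False (no-go)
```

The “capabilities and limitations” approach described above would replace this all-or-nothing gate with a narrative assessment, leaving the “good enough” judgment to the decision maker.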

In most cases the units participating in the tests necessarily are surrogates for the full range of actual end users who will ultimately receive the fielded system. Their role is to be the representatives of the users and to bring to bear the users’ perspective from an operational standpoint. Naturally, the independent operational test agency and independent evaluators will assess the ability of the system to meet the KPPs in the requirements document along with the strengths and weaknesses of the system. The operational test community will also assess whether the KPPs may or may not be the best contemporary measure of system acceptability at the time that the test is completed. Typically, operational testers, evaluators, and participating units have recent operational experience in the same or associated mission areas for supporting their assessments. Developers need similar operational insights, as the systems being developed must perform under realistic battlefield conditions and not just in the laboratory. The bottom line is that there is no substitute for a user perspective throughout the acquisition of IT systems.

Cost and schedule, rather than performance, frequently become the main drivers of a program, and developmental testing is too often given short shrift, or the amount of time allowed for operational testing is reduced. In general, program offices tend to sacrifice rigor in favor of a more condensed, “success”-oriented testing approach. As a result, user issues that should have been discovered and addressed during DT&E may escape notice until OT&E. Not only are problems more difficult and more expensive to fix at that point, but they also create negative user perceptions of the system and of the acquisition process. Resource constraints also hamper the DOD test organizations directly.
Cuts to budgets and personnel have significantly reduced the number of soldiers, sailors, airmen, and Marines available to serve as users during the test process, especially in the military T&E departments, even as systems have become more complex.2 This reduced pool of DOD testers impedes the early and close collaboration with systems acquirers and developers that is necessary to support an IID process adequately.

In summary, IT testing in the DOD remains a highly rigid, serial process without the inherent flexibility and collaboration required to support an agile-oriented iterative, incremental development process, particularly as it might be applied to IT systems. If a new IT systems acquisition process is defined and adopted, life-cycle acceptance testing that reflects the IID approach will also be needed in order to achieve success. The rest of this chapter describes how to address the challenges of testing and evaluation in the DOD acquisition environment in a way that incorporates an IID approach.

2 Defense Science Board, Report of the Defense Science Board Task Force on Developmental Test & Evaluation, Department of Defense, Washington, D.C., May 2008, available at http://www.acq.osd.mil/dsb/reports/2008-05-DTE.pdf; accessed November 4, 2009.

“BIG-R” REQUIREMENTS AND “SMALL-R” REQUIREMENTS

For IT systems, a decision point assessment rests on the system’s ability to satisfy the stated and approved “big-R” requirements. The term “big-R” requirements in this report refers to a widely recognized purpose, mission, and expected outcome. One example would be a missile system, which would be assessed on the basis of its ability to hit a target at a given range under specified conditions. Another example would be a management information system that would support specific business functional areas and be accompanied by security access levels and a specified data standard and architecture. Such high-level descriptions are expected to be fairly stable over the course of a program, although they may evolve on the basis of feedback from users and other stakeholders.

In contrast, IT systems, particularly software development and commercial off-the-shelf integration (SDCI) IT systems, cannot be expected to have stable requirements initially. The “small-r” requirements referred to in this report are the more detailed requirements, such as those associated with specific user interfaces and utilities, that are expected to evolve within the broader specified architecture as articulated in the initial big-R requirements document. In a sense, small-r requirements could also be thought of as lower-level specifications.

Stated another way, it is challenging if not impossible to accurately capture users’ detailed, small-r requirements up front, as their reactions to a prototype or newly fielded system are often negative, even though it may fully meet the specifications set forth in a requirements document.
Part of the problem is that there are so many minute details that together contribute to the usability of a system that it is nearly impossible to detail these in advance of giving users an opportunity to try out an actual running system. Usability might be a big-R requirement, but the specific details that would make that happen are small-r requirements that are essentially impossible to specify before users have been able to experiment with the system. Big-R requirements, such as the expected user interface and user paradigms and integration with other concurrently evolving systems and security practices at a high level, will all result in an unpredictable set of specifications and small-r requirements in practice. The need to manage big-R requirements coupled with changing and/or ill-specified small-r requirements is another reason that an iterative, incremental development process is well suited to these types of systems.
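The relationship can be pictured as a two-level structure: a stable big-R capability statement anchoring a set of small-r details that are revised as users exercise prototypes. The requirement names and details below are invented for illustration.

```python
# Hypothetical sketch: a big-R requirement is stable, while its small-r
# details are revised iteration by iteration from user feedback.
from dataclasses import dataclass, field

@dataclass
class BigR:
    capability: str                       # stable, high-level mission statement
    small_r: dict[str, str] = field(default_factory=dict)  # evolving details

    def revise(self, key: str, detail: str) -> None:
        """Small-r details may be added or replaced after each prototype."""
        self.small_r[key] = detail

usability = BigR("Operators can complete routine tasks without training")
usability.revise("task_menu", "single flat menu")           # initial guess
usability.revise("task_menu", "grouped by mission thread")  # after user trials
print(usability.capability, "->", usability.small_r)
```

The big-R statement never changes here; only the small-r entry under it is rewritten, which is the pattern an IID process is built to accommodate.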

INCORPORATING THE VOICE OF THE USER

A significant problem across the DOD in the development and acquisition of IT systems is the lack of ongoing user input, both as a means to determine needed capabilities and as a measure of the success of program development. In the current IT environment, user needs are constantly changing; such constant change is an ever-present factor that breaks any development or testing model which assumes a consistent, comprehensive set of user requirements (both big-R and small-r). Typical DOD IT systems have become so complex that describing either big-R or small-r requirements up front has become severely problematic. In comparison, IID approaches to IT systems rely on user feedback early and at intermediary points to guide development.

Without significant user involvement in developmental testing, the earliest point at which users will be involved may not be until operational testing, far too late in the development process for an IT system. It is crucial to learn from user experiences early in the development process, when a system can be improved more easily and at less expense. If requirements (big-R and especially small-r) are not well understood or are likely to change during the course of the project—conditions that are commonly found in DOD IT developments—an iterative approach with regular and frequent, if not continuous, user feedback will produce results faster and more efficiently than the traditional DOD approach can.

The adoption of IID approaches coupled with a focus on the end-user experience does not mean, however, that other stakeholders and nonfunctional requirements (such as information assurance, reliability, and so on) are unimportant. Historically, other stakeholder voices have dominated the process to the exclusion of the end user.
The committee urges a rebalancing and a focus on end-user mission success that incorporates higher-level architectural and nonfunctional requirements at appropriate phases of system development and deployment.

TOWARD CONTINUOUS OPERATIONAL ASSESSMENT

Agile approaches and iterative, incremental approaches to software development have been receiving increased attention in industry in recent years and are having a significant impact on how software is developed and how systems are tested. While architectural and systems-level engineering considerations will continue to be significant drivers of system success, the shift toward IID and agile processes also appropriately requires the incorporation of user perspectives throughout the development life cycle. In the case of assembling commercial off-the-shelf (COTS) solutions, the criteria for solutions that please users are often related less to the technical architecture that accounts for how the components are

put together than to the workflow of the resulting system—How easy is it to perform specific tasks with the system? How easy is it to maintain, manage, or perform upgrades on the system? These questions typically cannot be answered up front but rather are best answered through end-user experiences. In the case of any new systems development, user input is critical to the requirements and deployment-readiness of the resulting system, but generally users are unable to specify the details of these needs in the requirements phase. Thus, requirements in IT systems are best guided through user input based on direct experience with current prototypes. The acquisition approach described in this report incorporates the restructuring of the DOD testing and acceptance process to be the enabler of this user input, and the incorporation of operationally realistic testing with user feedback into a routine of continuous operational assessment.

A continuous operational assessment process would replace the traditional DOD development, testing, and acquisition process. A continuous process would need to encompass two critical features: an acceptance team and a metrics collection and reporting capability:

• An acceptance team (AT) would be charged with providing feedback on the acceptability of the solution toward meeting the users’ goals. The AT should be drawn from the expected initial users of the IT system and ideally should be the cadre that first tests the system in the field and then trains subsequent users in use of the system. The AT would work with the development team and acquisition (program management) team through the entire process. The AT would initially engage with these teams as high-level requirements are gathered and initial big-R requirements documents are published, and then it would continue in this relationship through to product deployment.
An important function of the AT would be to keep the requirements, functional deliverables, and test plans synchronized.

• A robust metrics collection and reporting capability (MCRC) would collect, aggregate, and analyze end-user service consumption and experiences. The MCRC—leveraging existing operational tools, including enterprise monitoring and management capabilities, DOD information assurance capabilities, and commercial capabilities in combination with measured end-user performance—would provide visibility into what functions end users are actually using in accomplishing their day-to-day missions and how they are using them. The MCRC would provide all stakeholders with a clear indication of operational effectiveness based on actual operational use.
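The collection-and-aggregation half of such a capability can be sketched minimally as follows. The event names and fields are invented; a real MCRC would ride on the existing enterprise monitoring and management tools mentioned above rather than ad hoc code.

```python
# Hypothetical sketch of an MCRC-style aggregator: tally which functions
# end users actually exercise, as one input to operational effectiveness.
from collections import Counter

class UsageAggregator:
    def __init__(self) -> None:
        self.by_function: Counter = Counter()

    def record(self, user: str, function: str) -> None:
        """One end-user action; a real system would also log timing and errors."""
        self.by_function[function] += 1

    def most_used(self, n: int = 3) -> list:
        return self.by_function.most_common(n)

agg = UsageAggregator()
for fn in ["search", "search", "report", "search", "admin"]:
    agg.record("user1", fn)
print(agg.most_used(2))  # [('search', 3), ('report', 1)]
```

The reporting half would then expose these tallies to all stakeholders so that effectiveness claims rest on observed use rather than assertion.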

Acceptance Teams

The development process continues through a set of iterations as described earlier in the report. In each iteration, the AT would fill the role of “the customer” on matters pertaining to the achievement of iteration objectives. The AT would evaluate each iteration prototype and identify issues and plausible changes to satisfy user acceptance of stated requirements before the development process went forward. Thus, the development team would secure early feedback from the AT on unclear, misunderstood, or incorrect requirements. For each iteration, the AT, in its role as proxy for the ultimate customer, would validate the current requirements list while the development team would provide work estimates and cost projections and develop a plan for the next iteration.

A responsibility of the AT in assisting the development process would include building (or working with the development team to build) acceptance tests for each iteration. This approach is a cornerstone of Test-Driven Development: By Example,3 in which tests for a deliverable are written first, and then a system to satisfy those tests is developed. Using tests in this manner is particularly helpful with larger, more complex organizations as it helps remove ambiguity or gaps in communication.

Moving to an IID methodology does not imply any reduction in oversight or the shepherding of acquisition funds. To ensure satisfactory progress of programs, periodic and regular progress checkpoints need to occur with the acquisition executive. In keeping with agile-inspired methodologies, these checkpoints should be based on calendar or funding milestones rather than being requirements-driven progress points.
In agile processes, iterations are based on time-boxing work schedules, whereby some content may slip from one iteration to the next, but iterations are closed according to the schedule, thus allowing for prompt identification of erroneous estimates of the time required to complete work items and ensuring continuous user input regarding priorities. In a similar manner, checkpoints with acquisition executives could be based either on time duration or on a funding milestone (as the two are frequently closely correlated in an IT project). These checkpoints should be less frequent than one per iteration, perhaps occurring every 6 to 18 months.

In these checkpoints, the AT would be the key voice in articulating the “value delivered.” It might be that at a checkpoint, the requirements delivered are not as anticipated at the start of the program, but the value to the end users is still quite significant. In such cases, the acquisition executive would examine and understand the reasons for the deviation in

3 Kent Beck, Test-Driven Development: By Example, Addison-Wesley, Old Tappan, N.J., 2003.

the plan and take into account user reactions to progress in the program as far as it had gone. Conversely, a program might be tracking exactly to the initial schedule but could be producing an asset with which the AT is not satisfied. In such cases the acquisition executive should intervene and determine if corrective action is possible or if the fault lies in the concept of the program itself. In either case, the acquisition executive is getting regular feedback on project progress with a clear opinion from those who will benefit from the system.

OT&E is, at its core, about determining the effectiveness and suitability of a system for deployment. During development and before a system reached OT&E, the AT would carry this responsibility. The AT would recommend to the acquisition executive when an iteration of an IT system was ready for deployment. Such deployments can take several forms. Initial deployments may be limited in scope, with the intention of testing system effectiveness or of allowing some features to be exploited while others continue to be developed (e.g., joint programs that are initially deployed with a single Service). Other deployments may be wider in scope to allow value to be captured from these systems while more advanced features are developed in future iterations. An important difference in this process is that deployment is not a one-time event at the end of the program. The AT would recommend deployment and the scope of deployment on the basis of the operational benefit, functional value to the user, and risk of deployment, but would not need to (and should not) wait until “completion” of the entire system before making such recommendations.
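The time-boxing rule described earlier—iterations close on schedule while unfinished content slips to the next iteration—can be sketched as follows. The work items and day estimates are invented for illustration.

```python
# Hypothetical sketch of a time-boxed iteration: the iteration closes when
# its budget of days is spent; unfinished items slip, the date does not.
def run_iteration(backlog, budget_days):
    done, spent = [], 0
    remaining = list(backlog)  # items are (name, estimated days), in priority order
    while remaining and spent + remaining[0][1] <= budget_days:
        item, cost = remaining.pop(0)
        done.append(item)
        spent += cost
    return done, remaining

backlog = [("login screen", 4), ("report export", 5), ("audit log", 6)]
done, slipped = run_iteration(backlog, budget_days=10)
print(done)     # ['login screen', 'report export']
print(slipped)  # [('audit log', 6)]
```

Because the iteration boundary is fixed, the slipped item is visible immediately, which is exactly the prompt exposure of erroneous estimates that the text attributes to time-boxing.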
Evaluation Through Operational Use Metrics

One of the major lessons learned from interactions with commercial suppliers of IT systems is that significant benefits come from these suppliers’ understanding of and reliance on actual end-user behavior as measured by actual end-user actions. In fact, the instrumentation of products and services for the collection of actual end-user metrics is so engrained in many commercial IT companies as to be unremarkable to commercial suppliers. Indeed, because this practice is second nature to those in the commercial sector, it was necessary for the committee to expressly solicit comments on this point from briefers. Commercial IT companies drive their entire investment portfolios on the basis of anticipated and actual end-user consumption patterns and/or end users’ engagement with their offered products and services. Those products and services with large and committed user bases drive a preponderance of the businesses’ valuation and receive commensurate corporate leadership attention and investment. Small changes in interaction patterns are induced and then

measured and analyzed minutely for an understanding of how best to improve the user experience and increase user satisfaction as measured by user engagement. As a result, virtually every aspect of the business is focused on satisfying end-user needs as quickly as possible, with the associated increase in network-based productivity that has been witnessed on the World Wide Web.

Many tools can be used to gather this information, including run-time configuration management, run-time collection and reporting, near-real-time aggregation, and business analytics. Many of the supporting tools are also used by service operations to identify early signs of technology outages or slowdowns and to begin assessing and diagnosing a problem before it becomes a catastrophic failure.

This report recommends incorporating the voice of the end user at all stages of the system life cycle. From a test perspective, such incorporation focuses resources in a number of important ways:

• Those services that are integral to every higher-level capability receive special attention during testing and are added incrementally much more deliberately, which may impact fielding cycle time;

• Those services that are exercised most strenuously and frequently by end users directly experience extensive, highly iterative beta testing, with actual successful end-user sessions forming the basis for a determination of operational effectiveness; and

• Field failures are automatically reported and incorporated into development and testing procedures as necessary, resulting in “living” test documents.
Establishing an MCRC along with leveraging current DOD tools and available commercial tools and practices would overtly move the operational evaluation assessment from a speculative proposition based on surrogate run times, users, test data, and marginally current requirements specifications, to a managed and measured investment assessment based on current, actual end-user missions and needs.

INCORPORATING COMMON SERVICES DEFINITIONS

For years, IT systems developers have employed functionalities that are externally supplied and operationally validated as a basis for their success, without expecting to revalidate those functionalities as part of their formal test regimen. Examples include the use of the Internet Protocol (IP) and the Domain Name System (DNS). These capabilities are provided externally to the capability being tested, have already been validated in separate acquisitions, and thus need not be included in the scope of the IT systems test regimen. As more services have been commoditized and/or supplied through commercial means, IT systems acquisition practice has not changed to address this reality. Unfortunately, no mechanisms exist to identify and track these supplied services or to apply a consistent approach for their use throughout the current DOD acquisition process. This is another negative repercussion of the weapons systems-based acquisition approach, where far fewer opportunities for shared services exist.

For example, the DOD has broadly adopted a set of networking capabilities that are integral to every IT system and that do not require revalidation (e.g., the IP and DNS capabilities mentioned above). As more of the technology stack becomes commoditized or provided as a service, the set of associated capabilities not requiring revalidation should likewise grow. In addition, testing approaches should be broadened to account for this commoditization. Developers of common service capabilities should account for the full range of possible application, from strategic, fixed-base IT systems to tactical, end-user-focused IT systems. Similarly, the testing of common services should reflect the full intended scope of application of the services. Dependent developers should be permitted and encouraged to view these common services as operationally validated externally, as long as they adhere to the terms of the supplied service. These developers' test teams should accept and treat these as externally supplied and acceptable operational services, regardless of the validating organization.

Commercial off-the-shelf hardware, software, and services (CHSS) IT systems acquisition can similarly better leverage commercial experience in place of formal DOD testing and oversight review.
Service-level agreements established for commercial services—which often have been validated by thousands or millions of users or hundreds of commercial entities—constitute, in effect, a test environment whose results should be accepted prima facie. Past validation of platform components—either validation from widespread use in the commercial marketplace or prior validation in an SDCI IT system—can also substitute for new testing. As an example, consider a public key infrastructure (PKI) that is sourced and validated as a well-defined service on which secure identity management will depend. As a result, the PKI program manager must architect the supplied service to support the range of users anticipated. The PKI test regimen should address the specified terms of service that result for the PKI. So long as a dependent PM uses the standard service interface and has no requirements that exceed the already-validated PKI service scope, the PKI should be treated as an existing and validated external service that is outside the scope of the dependent PM’s formal testing regimen. Other examples include “on-demand” computing and storage, network-centric enterprise services, and visualization services.
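The PKI example reduces to a simple scope test: a dependent program may treat the service as externally validated only while its needs stay inside the already-tested terms of service. The `ServiceTerms` fields, interface names, and numbers below are hypothetical, invented purely to illustrate the decision rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceTerms:
    """Hypothetical terms of a validated common service (e.g., a PKI)."""
    name: str
    max_users: int
    interfaces: frozenset

def needs_revalidation(terms: ServiceTerms, required_users: int,
                       required_interface: str) -> bool:
    """A dependent program can treat the service as externally validated
    only if it stays inside the already-tested scope; anything beyond
    that scope pulls the service back into the formal test regimen."""
    exceeds_scale = required_users > terms.max_users
    nonstandard = required_interface not in terms.interfaces
    return exceeds_scale or nonstandard

pki = ServiceTerms("PKI", max_users=2_000_000,
                   interfaces=frozenset({"x509-issue", "x509-revoke"}))

print(needs_revalidation(pki, 500_000, "x509-issue"))    # within scope -> False
print(needs_revalidation(pki, 5_000_000, "x509-issue"))  # exceeds scale -> True
```

A registry of such terms-of-service records is one way the missing "identify and track supplied services" mechanism noted earlier could be made concrete.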

This approach to the evaluation of common services avoids the cost and time required to recertify proven products individually. Risks of undetected failures in these products are mitigated by development tests of integrated modules and operational testing when the composite system undergoes rigorous evaluations to determine effectiveness and suitability.

VIRTUAL INFORMATION TECHNOLOGY TEST ENVIRONMENTS

The use of integrated virtual information technology test environments may be one way to facilitate testing that would allow early prototypes of systems to be subjected to much more realistic test conditions, thereby helping to identify potential problems in development as soon as possible. Such test environments would rely on a distributed test network that could be accessed by both government and industry, when appropriate, for use in performing early acceptance testing. A broad range of simulation systems and operational command-and-control systems that can represent realistic operational elements would provide the necessary data to drive such systems during testing. Linking the proponents of these simulations and systems through a distributed network would allow them to maintain the systems within their existing facilities while also providing opportunities for use during larger test events. Additional applications necessary to control, monitor, and log data during such tests would augment the sets of simulations and systems. It is important that virtual IT test environments have the ability not only to test the basic functionality of systems but also to emulate as much of the expected operational environment as possible.
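A minimal sketch of testing with an operational perspective: drive an early prototype with traffic shaped like real-world usage rather than lab-clean inputs, so that latent failure modes surface during development. The buffer limit and message-size mix below are invented for illustration.

```python
def system_under_test(message: bytes) -> bytes:
    """Toy system under test: echoes a message, but only up to a fixed
    buffer size -- a latent failure mode that lab-clean inputs never hit."""
    if len(message) > 1024:
        raise ValueError("buffer overflow")
    return message

def operational_traffic(n: int):
    """Simulated real-world usage: mostly typical messages, with the
    occasional oversized one that an operational network would produce."""
    sizes = [64, 256, 512, 2048]  # 2048 exceeds the toy system's buffer
    for i in range(n):
        yield bytes(sizes[i % len(sizes)])

# Developmental test run against operationally shaped inputs.
failures = 0
for msg in operational_traffic(100):
    try:
        system_under_test(msg)
    except ValueError:
        failures += 1

print(failures)  # operationally realistic inputs surface the failure early
```

A purely functional test that fed only small, well-formed messages would pass every time; the operational mix exposes the failure mode before fielding, which is exactly the benefit the NRC recommendation quoted below argues for.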
One of the recommendations of the National Research Council report Testing of Defense Systems in an Evolutionary Acquisition Environment was to "revise DOD testing procedures to explicitly require that developmental tests have an operational perspective (i.e., are representative of real-world usage conditions) in order to increase the likelihood of early identification of operational failure modes."4 A simulation-based test environment has the potential to provide such functionality, as has been shown in multiple DOD training and experimentation environments.

The technology necessary to achieve virtual test environments is already well established. In fact, multiple (somewhat duplicative and overlapping) programs that have similar capabilities for doing exactly this kind of testing already exist within the DOD. These programs may

4 National Research Council, Testing of Defense Systems in an Evolutionary Acquisition Environment, The National Academies Press, Washington, D.C., 2006.

provide an important starting point from which an expanded capability for early and continuous acceptance testing could be implemented. A sampling of such programs from across the military Services and defense agencies includes the following:

• The Systems of Systems Integration Laboratory (SoSIL) is a large-scale communications network for modeling and simulation, hardware and software integration, and virtual operational testing; SoSIL also offers a "soldier-in-the-loop" capability.5
• The Army's Cross Command Collaboration Effort is an effort to establish and evolve a consistent and core set of modeling and simulation tools, data, and business processes that meet the common environment requirements of the U.S. Army's Training and Doctrine Command, Army Test and Evaluation Command, and Research, Development and Engineering Command. This common environment will facilitate those organizations' interoperability with the materiel development community to help conduct the distributed development of doctrine, organizations, training, materiel, leadership and education, personnel, and facilities.6
• The Air Force Integrated Collaborative Environment (AF ICE) is intended to provide a persistent, composable, flexible infrastructure along with a series of tools, standards, processes, and policies for using the environment to conduct the continuous analysis required to support a capabilities-based planning process.7
• The Joint Mission Environment Test Capability (JMETC) was established in October 2006 to "link distributed facilities on a persistent network, thus enabling customers to develop and test warfighting capabilities in a realistic joint context."8 JMETC has already established a persistent test network, through the Secret Defense Research and Engineering Network, which provides connectivity to both Service and industry assets.
It relies on the Test and Training Enabling Architecture (TENA) as its infrastructure for data exchange; TENA provides a standard object model and interfaces to the Distributed Interactive Simulation Protocol and the

5 Boeing, FCS Systems of Systems Integration Laboratory Backgrounder, May 2007, available at www.boeing.com/defense-space/ic/fcs/bia/080523_sosil_bkgndr.pdf; accessed November 12, 2009.
6 Brian Hobson and Donald Kroening, "Cross Command Collaboration Effort (3CE)," Spring Simulation Multiconference: Proceedings of the 2007 Spring Simulation Multiconference, Vol. 3, 2007.
7 B. Eileen Bjorkman and Timothy Menke, "Air Force-Integrated Collaborative Environment (AF-ICE) Development Philosophy," ITEA Journal of Test and Evaluation 27(1), March/April 2006.
8 Richard Lockhard and Chip Ferguson, "Joint Mission Environment Test Capability (JMETC)," ITEA Journal 29:160-166, 2008.

High Level Architecture, which are widely used standards for modeling and simulation. JMETC has already established linkages to the Future Combat System program and AF ICE.9

• The Distributed Engineering Plant (DEP) was established in 1998 by the Naval Sea Systems Command (NAVSEA) to identify and resolve combat battle management command, control, communications, and computers (C4) systems interoperability problems prior to deploying new and upgraded systems to sea. Enabled by today's newest networking technology, DEP links the Navy's shore-based combat systems/C4/hardware test sites, which are located in geographically disparate facilities across the nation, into a virtual shore-based battle group that exactly replicates a battle group fighting at sea. By inserting "ground truth" system simulation and stimulation data and then observing how the combat systems exchange and display tactical data, engineers can identify precisely and solve interoperability problems ashore well before those systems enter the operating forces. This approach emphasizes shore-based testing and warfare systems integration and interoperability testing and acceptance certification of operational IT systems in a test environment similar to their ultimate shipboard operational environment; it also emphasizes interoperability assessments, which are a prerequisite for the operational certification of the ships in strike force configurations prior to deployment.

Obviously, numerous organizations across the DOD with roles and missions oriented toward testing and evaluation may also have capabilities that could be leveraged for such an effort. Among them are the Joint Interoperability Test Command, the Army Test and Evaluation Command, the Air Force Operational Test and Evaluation Center, and the Navy Operational Testing and Evaluation Force.
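A DEP-style interoperability check reduces to comparing injected ground truth with what each networked system actually displays. The track data, classifications, and system names below are invented solely to illustrate the comparison.

```python
# Hypothetical DEP-style check: inject known "ground truth" tracks, then
# compare what each shore-based combat system actually displays.
GROUND_TRUTH = {101: ("hostile", 12.0), 102: ("friendly", 30.5)}

def find_interoperability_faults(system_name, displayed):
    """Return human-readable discrepancies between the injected ground
    truth and one system's tactical picture."""
    faults = []
    for track_id, truth in GROUND_TRUTH.items():
        shown = displayed.get(track_id)
        if shown is None:
            faults.append(f"{system_name}: track {track_id} dropped")
        elif shown != truth:
            faults.append(f"{system_name}: track {track_id} shows {shown}, "
                          f"truth is {truth}")
    return faults

# Two systems fed the same stimulation data render it differently.
ship_a = {101: ("hostile", 12.0), 102: ("friendly", 30.5)}
ship_b = {101: ("unknown", 12.0)}  # misclassifies track 101, drops track 102

print(find_interoperability_faults("ship_a", ship_a))       # []
print(len(find_interoperability_faults("ship_b", ship_b)))  # 2
```

Because the injected truth is known exactly, every discrepancy is attributable to the systems' exchange or display logic, which is what lets engineers localize interoperability problems ashore before deployment.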
Numerous software testing and distributed testing capabilities also exist in organizations such as the Defense Advanced Research Projects Agency and the various research laboratories within the Services.

Establishing the technical underpinnings of virtual IT test environments is only part of the solution to improving early testing in acquisition. In addition, appropriate policy and process changes would need to be implemented to mandate activities that would utilize such an environment. Some of the issues that must be addressed include data sharing across a testing enterprise, the establishment of standards for data exchange, and the formalizing of the role of early testing in acquisition—likely requiring revisions to DODI 5000.2. Other issues include the governance and management of such a capability and the roles and responsibilities in terms of executing testing in such an environment. Some of these issues are raised in the DOD's Acquisition Modeling and Simulation Master Plan, issued in April 2006, which focuses on improving the role of modeling and simulation for testing.10

Another challenge in making such environments usable is to ensure that the complexity required to perform the integration of systems and configuration for tests is minimized; otherwise, the costs of using such test environments would far outweigh the benefits. Paramount in managing the complexity involved is the establishment of a formal systems engineering process involving test design, integration, documentation, configuration management, execution, data collection, and analysis. Also important is the establishment of standards for simulation and systems interoperability that allow for common interfaces and the reuse of systems while also providing enough flexibility to adapt to new requirements. And finally, a management process must be married to the systems engineering process so that users are invited to participate and incentives are created for industry and government to share data and work toward common goals.

9 Test and Training Enabling Architecture Software Development Activity, JMETC VPN Fact Sheet, Central Test & Evaluation Investment Program, U.S. Joint Forces Command, Department of Defense, Washington, D.C., November 23, 2009, available at https://www.tena-sda.org/download/attachments/6750/JMETC-VPN-2009-11-23.pdf?version=1.pdf.
10 Department of Defense, Acquisition Modeling and Simulation Master Plan, Software Engineering Forum, Office of the Under Secretary of Defense (Acquisition, Technology and Logistics) Defense Systems, Washington, D.C., April 17, 2006.
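The "common interfaces" standards discussed above are the role TENA's standard object model plays in JMETC: every participant on the distributed network exchanges the same typed record over one agreed wire format. The sketch below is a toy publish/subscribe bus built for illustration; it is not TENA's actual API or object model.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class TrackObject:
    """Hypothetical standard object model (in the spirit of TENA): every
    participant exchanges the same typed record instead of ad hoc formats."""
    track_id: int
    lat: float
    lon: float
    source: str

class TestBus:
    """Minimal publish/subscribe bus linking distributed test participants."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, obj: TrackObject):
        wire = json.dumps(asdict(obj))  # one wire format for every participant
        for cb in self.subscribers:
            # Each subscriber reconstructs the same typed object, so a new
            # simulation or system plugs in without bespoke translation code.
            cb(TrackObject(**json.loads(wire)))

received = []
bus = TestBus()
bus.subscribe(received.append)
bus.publish(TrackObject(42, 38.9, -77.0, "simulator"))
print(received[0].track_id)  # 42
```

The design point is that the interface contract lives in the shared object definition, not in pairwise agreements between participants, which is what allows reuse of systems across test events while still leaving room for the model to evolve.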
