Phase I Report: Operational Test Design and Evaluation of the Interim Armored Vehicle

Executive Summary

This report provides an assessment of the U.S. Army's planned initial operational test (IOT) of the Stryker family of vehicles. Stryker is intended to provide mobility and "situation awareness" for the Interim Brigade Combat Team (IBCT). For this reason, the Army Test and Evaluation Command (ATEC) has been asked to take on the unusual responsibility of testing both the vehicle and the IBCT concept.

Building on the recommendations of an earlier National Research Council study and report (National Research Council, 1998a), the Panel on Operational Test Design and Evaluation of the Interim Armored Vehicle considers the Stryker IOT an excellent opportunity to examine how the defense community might effectively use test resources and analyze test data. The panel's judgments are based on information gathered during a series of open forums and meetings involving ATEC personnel and experts in the test and evaluation of systems. Perhaps equally important, in our view the assessment process itself has had a salutary influence on the IOT design for the IBCT/Stryker system.

We focus in this report on two aspects of the operational test design and evaluation of the Stryker: (1) the measures of performance and effectiveness used to compare the IBCT equipped with the Stryker against the baseline force, the Light Infantry Brigade (LIB), and (2) whether the current operational test design is consistent with state-of-the-art methods.

Our next report will discuss combining information obtained from the IOT with other tests, engineering judgment, experience, and the like. The panel's final report will encompass both earlier reports and any additional developments.

OVERALL TEST PLANNING

Two specific purposes of the IOT are to determine whether the IBCT/Stryker performs more effectively than the baseline force, and whether the Stryker family of vehicles meets its capability and performance requirements. Our primary recommendation is to supplement these purposes: when evaluating a large, complex, and critical weapon system such as the Stryker, operational tests should be designed, carried out, and evaluated with a view toward improving the capabilities and performance of the system.

MEASURES OF EFFECTIVENESS

We begin by considering the definition and analysis of measures of effectiveness (MOEs). In particular, we address problems associated with rolling up disparate MOEs into a single overall number, the use of untested or ad hoc force ratio measures, and the requirements for calibration and scaling of subjective evaluations made by subject-matter experts (SMEs). We also identify a need to develop scenario-specific MOEs for noncombat missions, and we suggest some possible candidates for these. Studying the question of whether a single measure for the "value" of situation awareness can be devised, we reached the tentative conclusion that there is no single appropriate MOE for this multidimensional capability. Modeling and simulation tools can be used to this end by augmenting test data during the evaluation. These tools should also be used, however, to develop a better understanding of the capabilities and limitations of the system in general and the value of situation awareness in particular.
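
To make the roll-up concern concrete, the short Python sketch below uses invented MOE scores and weights (none of these numbers come from the IOT or from ATEC) to show how a single weighted composite can rank the two forces either way, depending solely on the choice of weights:

    # Illustrative only: invented scores on two MOEs, each rescaled so that
    # larger is better, for the IBCT/Stryker force and the baseline force.
    def rollup(scores, weights):
        """Weighted-sum composite of per-MOE scores."""
        return sum(w * s for w, s in zip(weights, scores))

    stryker = [0.80, 0.55]     # e.g., an engagement-outcome MOE and a timeliness MOE
    baseline = [0.60, 0.75]

    for weights in ([0.7, 0.3], [0.3, 0.7]):
        s, b = rollup(stryker, weights), rollup(baseline, weights)
        leader = "IBCT/Stryker" if s > b else "baseline"
        print(f"weights {weights}: Stryker {s:.2f}, baseline {b:.2f} -> {leader} ahead")

With weights of 0.7 and 0.3 the composite favors the IBCT/Stryker force; with 0.3 and 0.7 it favors the baseline, even though the underlying MOE values never change. Reporting the MOEs separately keeps this arbitrariness out of the conclusions.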

With respect to determining critical measures of reliability and maintainability (RAM), we observe that the IOT will provide a relatively small amount of vehicle operating data (compared with that obtained in training exercises and developmental testing) and thus may not be sufficient to address all of the reliability and maintainability concerns of ATEC. This lack of useful RAM information will be exacerbated by the fact that the IOT is to be performed without using add-on armor. For this reason, we stress that RAM data collection should be an ongoing enterprise, with failure times, failure modes, and maintenance information tracked for the entire life of the vehicle (and its parts), including data from developmental testing and training, and recorded in appropriate databases. Failure modes should be considered separately, rather than assigning a single failure rate for a vehicle using simple exponential models.
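
As a minimal sketch, in our own notation rather than anything taken from ATEC's test documents, of what separate treatment of failure modes implies: if the failure modes of a vehicle act approximately independently, the vehicle-level reliability and failure intensity decompose as

    R(t) = \prod_{j} R_j(t), \qquad \lambda(t) = \sum_{j} \lambda_j(t),

where R_j and \lambda_j are the reliability function and hazard (failure) rate of mode j. A single exponential model forces \lambda(t) to be one constant for the entire vehicle, whereas mode-level models, for example Weibull hazards

    \lambda_j(t) = \frac{\beta_j}{\eta_j} \left( \frac{t}{\eta_j} \right)^{\beta_j - 1},

let early-failure and wear-out modes be tracked and projected separately, which is also why the data reporting requirements differ by failure mode.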

EXPERIMENTAL DESIGN

With respect to the experimental design itself, we are very concerned that observed differences will be confounded by important sources of uncontrolled variation. In particular, as pointed out in the panel's letter report (Appendix A), the current test design calls for the IBCT/Stryker trials to be run at a different time from the baseline trials. This design may confound time of year with the primary measure of interest: the difference in effectiveness between the baseline force and the IBCT/Stryker force. We therefore recommend that these events be scheduled as closely together in time as possible, and interspersed if feasible. Also, additional potential sources of confounding, including player learning and nighttime versus daytime operations, should be addressed with alternative designs. One alternative to address confounding due to player learning is to use four separate groups of players, one for each of the two opposing forces (OPFORs), one for the IBCT/Stryker, and one for the baseline system. Intergroup variability appears likely to be a lesser problem than player learning. Also, alternating teams from test replication to test replication between the two systems under test would be a reasonable way to address differences in learning, training, fatigue, and competence.

We also point out the difficulty in identifying a test design that is simultaneously "optimized" with respect to determining how various factors affect system performance for dozens of measures, and also confirming performance either against a baseline system or against a set of requirements. For example, the current test design, constructed to compare IBCT/Stryker with the baseline, is balanced for a limited number of factors. However, it does not provide as much information about the system's advantages as other approaches could. In particular, the current design allocates test samples to missions and environments in approximately the same proportion as would be expected in field use. This precludes focusing test samples on environments in which Stryker is designed to have advantages over the baseline system, and it allocates numerous test samples to environments for which Stryker is anticipated to provide no benefits over the baseline system. This reduces the opportunity to learn the size of the benefit that Stryker provides in various environments, as well as the reasons underlying its advantages.

In support of such an approach, we present a number of specific technical suggestions for test design, including making use of test design in learning and confirming stages as well as small-scale pilot tests. Staged testing, presented as an alternative to the current design, would be particularly useful in coming to grips with the difficult problem of understanding the contribution of situation awareness to system performance. For example, it would be informative to run pilot tests with the Stryker situation awareness capabilities intentionally degraded or turned off, to determine the value they provide in particular missions or scenarios.

We make technical suggestions in several areas, including statistical power calculations, identifying the appropriate test unit of analysis, combining SME ratings, aggregation, and graphical methods.
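
As one illustration of the power-calculation issue, the Python sketch below uses a normal approximation to the power of a two-sided comparison of a mean mission-level MOE between the two forces; the effect size, standard deviation, and replication counts are placeholders chosen for illustration, not quantities from the IOT design or the OMS/MP:

    # Rough two-sample power sketch (normal approximation); all inputs are
    # illustrative placeholders, not Stryker IOT values.
    from scipy.stats import norm

    def power_two_sample(delta, sigma, n_per_arm, alpha=0.05):
        """Approximate power of a two-sided test for a difference `delta` in
        mean MOE, with per-trial standard deviation `sigma` and `n_per_arm`
        replications for each force."""
        se = sigma * (2.0 / n_per_arm) ** 0.5
        z_crit = norm.ppf(1 - alpha / 2)
        z = delta / se
        return norm.cdf(z - z_crit) + norm.cdf(-z - z_crit)

    for n in (4, 8, 16, 32):
        print(n, round(power_two_sample(delta=0.5, sigma=0.6, n_per_arm=n), 2))

Under these made-up inputs, a handful of replications per force yields low power, and power approaches conventional targets only as the number of replications grows well beyond that; the same arithmetic, run with realistic effect sizes, variance estimates, and the correct unit of analysis, is what the power-calculation concern refers to.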

SYSTEM EVALUATION AND IMPROVEMENT

More generally, we examined the implications of this particular IOT for future tests of similar systems, particularly those that operationally interact so strongly with a novel force concept. Since the size of the operational test (i.e., number of test replications) for this complex system (or system of systems) will be inadequate to support hypothesis tests leading to a decision on whether Stryker should be passed to full-rate production, ATEC should augment this decision with other techniques. At the very least, estimates and associated measures of precision (e.g., confidence intervals) should be reported for various MOEs. In addition, the reporting and use of numerical and graphical assessments, based on data from other tests and trials, should be explored. In general, complex systems should not be forwarded to operational testing, absent strategic considerations, until the system design is relatively mature. Forwarding an immature system to operational test is an expensive way to discover errors that could have been detected in developmental testing, and it reduces the ability of an operational test to carry out its proper function.

As pointed out in the panel's letter report (Appendix A), it is extremely important, when testing complex systems, to prepare a straw man test evaluation report (TER), as if the IOT had been completed. It should include examples of how the representative data will be analyzed, specific presentation formats (including graphs) with expected results, insights to develop from the data, draft recommendations, and so on. The content of this straw man report should be based on the experience and intuition of the analysts and what they think the results of the IOT might look like. To do this and to ensure the validity and persuasiveness of evaluations drawn from the testing, ATEC needs a cadre of statistically trained personnel with "ownership" of the design and the subsequent test and evaluation. Thus, the Department of Defense in general and ATEC in particular should give a high priority to developing a contractual relationship with leading practitioners in the fields of reliability estimation, experimental design, and data analysis to help them with future IOTs.

In summary, the panel has a substantial concern about confounding in the current test design for the IBCT/Stryker IOT that needs to be addressed. If the confounding issues were reduced or eliminated, the remainder of the test design, aside from the power calculations, has been competently developed from a statistical point of view. Furthermore, this report provides a number of evaluations and resulting conclusions and recommendations for improvement of the design, the selection and validation of MOEs, the evaluation process, and the conduct of future tests of highly complex systems. We attach greater priority to several of these recommendations and therefore highlight them here, organized by chapters to assist those interested in locating the supporting arguments.

RECOMMENDATIONS

Chapter 3

• Different MOEs should not be rolled up into a single overall number that tries to capture effectiveness or suitability.
• To help in the calibration of SMEs, each should be asked to review his or her own assessment of the Stryker IOT missions, for each scenario, immediately before he or she assesses the baseline missions (or vice versa).
• ATEC should review the opportunities and possibilities for SMEs to contribute to the collection of objective data, such as times to complete certain subtasks, distances at critical times, etc.
• ATEC should consider using two separate SME rating scales: one for "failures" and another for "successes."
• FER (and the LER when appropriate), but not the RLR, should be used as the primary mission-level MOE for analyses of engagement results.
• ATEC should use fratricide frequency and civilian casualty frequency to measure the amount of fratricide and collateral damage in a mission.

• Scenario-specific MOPs should be added for SOSE missions.
• Situation awareness should be introduced as an explicit test condition.
• RAM data collection should be an ongoing enterprise. Failure and maintenance information should be tracked on a vehicle or part/system basis for the entire life of the vehicle or part/system. Appropriate databases should be set up. This was probably not done with those Stryker vehicles already in existence, but it could be implemented for future maintenance actions on all Stryker vehicles.
• With respect to the difficulty of reaching a decision regarding reliability, given limited miles and absence of add-on armor, weight packs should be used to provide information about the impact of additional weight on reliability.
• Failure modes should be considered separately rather than trying to develop failure rates for the entire vehicle using simple exponential models. The data reporting requirements vary depending on the failure rate function.

Chapter 4

• Given either a learning or a confirmatory objective, ignoring various tactical considerations, a requisite for operational testing is that it should not commence until the system design is mature.
• ATEC should consider, for future test designs, relaxing various rules of test design that it adheres to, by (a) not allocating sample size to scenarios according to the OMS/MP, but instead using principles from optimal experimental design theory to allocate sample size to scenarios, (b) testing under somewhat more extreme conditions than typically will be faced in the field, (c) using information from developmental testing to improve test design, and (d) separating the operational test into at least two stages, learning and confirmatory.
• ATEC should consider applying to future operational testing in general a two-phase test design that involves, first, learning-phase studies that examine the test object under different conditions, thereby helping testers design further tests to elucidate areas of greatest uncertainty and importance, and, second, a phase involving confirmatory tests to address hypotheses concerning performance vis-à-vis a baseline system or in comparison with requirements. ATEC should consider taking advantage of this approach for the IBCT/Stryker IOT; that is, examining in the first phase IBCT/Stryker under different conditions, to assess when this system works best, and why, and conducting a second phase to compare IBCT/Stryker to a baseline, using this confirmation experiment to support the decision to proceed to full-rate production. An important feature of the learning phase is to test with factors at high stress levels in order to develop a complete understanding of the system's capabilities and limitations.
• When specific performance or capability problems come up in the early part of operational testing, small-scale pilot tests, focused on the analysis of these problems, should be seriously considered. For example, ATEC should consider test conditions that involve using Stryker with situation awareness degraded or turned off to determine the value that it provides in particular missions.
• ATEC should eliminate from the IBCT/Stryker IOT one significant potential source of confounding, seasonal variation, in accordance with the recommendation provided earlier in the October 2002 letter report from the panel to ATEC (see Appendix A). In addition, ATEC should also seriously consider ways to reduce or eliminate possible confounding from player learning and day/night imbalance.

Chapter 5

• The IOT provides little vehicle operating data and thus may not be sufficient to address all of the reliability and maintainability concerns of ATEC. This highlights the need for improved data collection regarding vehicle usage. In particular, data should be maintained for each vehicle over that vehicle's entire life, including training, testing, and ultimately field use; data should also be gathered separately for different failure modes.
• The panel reaffirms the recommendation of the 1998 NRC panel that more use should be made of estimates and associated measures of precision (or confidence intervals) in addition to significance tests, because the former enable the judging of the practical significance of observed effects.
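
As a brief sketch of the kind of reporting this last recommendation points toward, the fragment below computes an estimate and an approximate confidence interval for the difference in a mission-level MOE; the data are invented for illustration, and the plain two-sample interval stands in for whatever analysis ATEC ultimately adopts:

    # Invented mission-level MOE values (e.g., a per-mission FER), for
    # illustration only; these are not Stryker IOT data.
    from statistics import mean, stdev
    from scipy.stats import t

    stryker = [2.1, 1.7, 2.6, 2.0, 1.5, 2.3]
    baseline = [1.6, 1.4, 1.9, 1.2, 1.8, 1.5]

    diff = mean(stryker) - mean(baseline)
    se = (stdev(stryker) ** 2 / len(stryker)
          + stdev(baseline) ** 2 / len(baseline)) ** 0.5
    dof = len(stryker) + len(baseline) - 2    # crude; a Welch adjustment would also do
    half_width = t.ppf(0.975, dof) * se

    print(f"estimated difference in MOE: {diff:.2f}")
    print(f"approximate 95% confidence interval: "
          f"({diff - half_width:.2f}, {diff + half_width:.2f})")

An interval reported this way shows a decision maker both the direction and the plausible size of the difference, so its practical significance can be judged; a bare significance test reports only whether the difference is distinguishable from zero.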

Chapter 6

• Operational tests should not be strongly geared toward estimation of system suitability, since they cannot be expected to run long enough to estimate fatigue life, estimate repair and replacement times, identify failure modes, etc. Therefore, developmental testing should give greater priority to measurement of system (operational) suitability and should be structured to provide its test events with greater operational realism.
• In general, complex systems should not be forwarded to operational testing, in the absence of strategic considerations, until the system design is relatively mature. Forwarding an immature system to operational test is an expensive way to discover errors that could have been detected in developmental testing, and it reduces the ability of an operational test to carry out its proper function. System maturation should be expedited through previous testing that incorporates various aspects of operational realism in addition to the usual developmental testing.
• Because it is not yet clear that the test design and the subsequent test analysis have been linked, ATEC should prepare a straw man test evaluation report in advance of test design, as recommended in the panel's October 2002 letter to ATEC (see Appendix A).
• The goals of the initial operational test need to be more clearly specified. Two important types of goals for operational test are learning about system performance and confirming system performance in comparison to requirements and in comparison to the performance of baseline systems. These two different types of goals argue for different stages of operational test. Furthermore, to improve test designs that address these different types of goals, information from previous stages of system development needs to be utilized.

Finally, we wish to make clear that the panel was constituted to address the statistical questions raised by the selection of measures of performance and measures of effectiveness, and the selection of an experimental design, given the need to evaluate Stryker and the IBCT in scenarios identified in the OMS/MP. A number of other important issues (about which the panel provides some commentary) lie outside the panel's charge and expertise. These include an assessment of (a) the selection of the baseline system to compare with Stryker, (b) the problems raised by the simultaneous evaluation of the Stryker vehicle and the IBCT system that incorporates it, (c) whether the operational test can definitively answer specific tactical questions, such as the degree to which the increased vulnerability of Stryker is offset by the availability of greater situational awareness, and (d) whether or not scenarios to be acted out by OPFOR represent a legitimate test suite.

Let us elaborate each of these ancillary but important issues. The first is whether the current choice of a baseline system (or multiple baselines) is best from a military point of view, including whether a baseline system could have been tested taking advantage of the IBCT infrastructure, to help understand the value of Stryker without the IBCT system. It does not seem to be necessary to require that only a system that could be transported as quickly as Stryker could serve as a baseline for comparison.

The second issue (related to the first) is the extent to which the current test provides information not only about comparison of the IBCT/Stryker system with a baseline system, but also about comparison of the Stryker suite of vehicles with those used in the baseline. For example, how much more or less maneuverable is Stryker in rural versus urban terrain, and what impact does that have on its utility in those environments? These questions require considerable military expertise to address.

The third issue is whether the current operational test design can provide adequate information on how to tactically employ the IBCT/Stryker system. For example, how should the greater situational awareness be taken advantage of, and how should the greater situational awareness be balanced against greater vulnerability for various types of environments and against various threats? Clearly, this issue is not fundamentally a technical statistical one, but is rather an essential feature of scenario design that the panel was not constituted to evaluate.

The final issue (related to the third) is whether the various missions, types of terrain, and intensity of conflict are the correct choices for operational testing to support the decision on whether to pass Stryker to full-rate production. One can imagine other missions, types of terrain, intensities, and other factors that are not varied in the current test design that might have an impact on the performance of Stryker, the baseline system, or both. These factors include temperature, precipitation, the density of buildings, the height of buildings, types of roads, etc. Moreover, there are the serious problems raised by the unavailability of add-on armor for the early stages of the operational test. The panel has been obligated to take the OMS/MP as given, but it is not clear whether additional factors that might have an important impact on performance should have been included as test factors. All of these issues are raised here in order to emphasize their importance and worthiness for consideration by other groups better constituted to address them.

Thus, the panel wishes to make very clear that this assessment of the operational test as currently designed reflects only its statistical merits. It is certainly possible that the IBCT/Stryker operational test may be deficient in other respects, some of them listed above, that may subordinate the statistical aspects of the test. Even if the statistical issues addressed in this report were to be mitigated, we cannot determine whether the resulting operational test design would be fully informative as to whether Stryker should be promoted to full-rate production.

Tukey, J.W.
1977 Exploratory Data Analysis. New York: Addison-Wesley.
U.S. Department of Defense
2000 Operational Requirements Document (ORD) for a Family of Interim Armored Vehicles (IAV), ACAT I, Prepared for the Milestone I Decision, April 6. U.S. Army Training and Doctrine Command, Fort Monroe, Virginia.
2001 Test and Evaluation Master Plan (TEMP): Stryker Family of Interim Armored Vehicles (IAV), Revision 1, November 12. U.S. Army Test and Evaluation Command, Alexandria, Virginia.
2002a Interim Armored Vehicle IOTE: Test Design Review with NAS Panel. Unpublished presentation, Nancy Dunn and Bruce Grigg, April 15.
2002b Interim Armored Vehicle IOTE: Test Design Review with NAS Panel; Power and Sample Size Considerations. Unpublished presentation, Nancy Dunn and Bruce Grigg, May 6.
2002c System Evaluation Plan (SEP) for the Stryker Family of Interim Armored Vehicles (IAV), May 22. U.S. Army Test and Evaluation Command, Alexandria, Virginia.
Veit, Clairice T.
1996 Judgments in military research: The elusive validity issue. Phalanx.

Appendix A

Letter Report of the Panel to the Army Test and Evaluation Command

THE NATIONAL ACADEMIES
Advisers to the Nation on Science, Engineering, and Medicine

Division of Behavioral and Social Sciences and Education
Committee on National Statistics
Panel on Operational Test Design and Evaluation of the Interim Armored Vehicle (IAV)
500 Fifth Street, NW, Washington, DC 20001
Phone: 202 334 3408   Fax: 202 334 3584   Email: jmcgee@nas.edu

October 10, 2002

Frank John Apicella
Technical Director
Army Evaluation Center
U.S. Army Test and Evaluation Command
4501 Ford Avenue
Alexandria, VA 22302-1458

Dear Mr. Apicella:

As you know, at the request of the Army Test and Evaluation Command (ATEC), the Committee on National Statistics has convened a panel to examine ATEC's plans for the operational test design and evaluation of the Interim Armored Vehicle, now referred to as the Stryker. The panel is currently engaged in its tasks of focusing on three aspects of the operational test design and evaluation of the Stryker: (1) the measures of performance and effectiveness used to compare the Interim Brigade Combat Team (IBCT), equipped with the Stryker, against a baseline force; (2) whether the current operational test design is consistent with state-of-the-art methods in statistical experimental design; and (3) the applicability of models for combining information from testing and field use of related systems and from developmental test results for the Stryker with operational test results for the Stryker. ATEC has asked the panel to comment on ATEC's current plans and to suggest alternatives.

The work performance plan includes the preparation of three reports:

• The first interim report (due in November 2002) will address two topics: (1) the measures of performance and effectiveness used to compare the Stryker-equipped IBCT against the baseline force, and (2) whether the current operational test design is consistent with state-of-the-art methods in statistical experimental design.
• The second interim report (due in March 2003) will address the topic of the applicability of models for combining information from testing and field use of related systems and from developmental test results for the Stryker with operational test results for the Stryker.
• The final report (due in July 2003) will integrate the two interim reports and add any additional findings of the panel.

The reports have been sequenced and timed for delivery to support ATEC's time-critical schedule for developing plans for designing and implementing operational tests and for performing analyses and evaluations of the test results.

Two specific purposes of the initial operational test of the Stryker are to determine whether the Interim Brigade Combat Team (IBCT) equipped with the Stryker performs more effectively than a baseline force (Light Infantry Brigade), and whether the Stryker meets its performance requirements. The results of the initial operational test contribute to the Army's decisions of whether and how to employ the Stryker and the IBCT. The panel's first interim report will address in detail factors relating to the effectiveness and performance of the Stryker-equipped IBCT and of the Stryker; effective experimental designs and procedures for testing these forces and their systems under relevant operational conditions, missions, and scenarios; subjective and objective measures of performance and effectiveness for criteria of suitability, force effectiveness, and survivability; and analytical procedures and methods appropriate to assessing whether and why the Stryker-equipped IBCT compares well (or not well) against the baseline force, and whether and why the Stryker meets (or does not meet) its performance requirements.

In the process of deliberations toward producing the first interim report that will address this broad sweep of issues relevant to operational test design and to measures of performance and effectiveness, the panel has discerned two issues with long lead times to which, in the opinion of the panel, ATEC should begin attending immediately, so that resources can be identified, mustered, and applied in time to address them: early development of a "straw man" (hypothetical draft) Test and Evaluation Report (which will support the development of measures and the test design as well as the subsequent analytical efforts) and the scheduling of test participation by the Stryker-equipped force and the baseline force so as to remove an obvious test confounder of different seasonal conditions.

The general purpose of the initial operational test (IOT) is to provide information to decision makers about the utility of and the remaining challenges to the IBCT and the Stryker system. This information is to be generated through the analysis of IOT results. In order to highlight areas for which data are lacking, the panel strongly recommends that immediate effort be focused on specifying how the test data will be analyzed to address relevant decision issues and questions. Specifically, a straw man Test Evaluation Report (TER) should be prepared as if the IOT had been completed. It should include examples of how the representative data will be analyzed, specific presentation formats (including graphs) with expected results, insights one might develop from the data, draft recommendations, etc. The content of this straw man report should be based on the experience and intuition of the analysts and what they think the results of the IOT might look like. Overall, this could serve to provide a set of hypotheses that would be tested with the actual results. Preparation of this straw man TER will help ATEC assess those issues that cannot be informed by the operational tests as currently planned, will expose areas for which needed data are lacking, and will allow appropriate revision of the current operational test plan.

The current test design calls for the execution of the IBCT/Stryker vs. the opposing force (OPFOR) trials and the baseline vs. the OPFOR trials to be scheduled for different seasons. This design totally confounds time of year with the primary measure of interest: the difference in effectiveness between the baseline force and the IBCT/Stryker force. The panel believes that the factors that are present in seasonal variations (weather, foliage density, light level, temperature, etc.) may have a greater effect on the differences between the measures of the two forces than the abilities of the two forces themselves. We therefore recommend that serious consideration be given to scheduling these events as closely in time as possible. One way to address the potential confounding of seasonal effects, as well as possible effects of learning by blue forces and by the OPFOR, would be to intersperse activities of the baseline force and the IBCT/Stryker force over time.

The panel remains eager to assist ATEC in improving its plans and processes for operational test and evaluation of the IBCT/Stryker. We are grateful for the support and information you and your staff have consistently provided during our efforts to date. It is the panel's hope that delivering to you the recommendations in this brief letter in a timely fashion will encourage ATEC to begin drafting a straw man Test Evaluation Report in time to influence operational test activities and to implement the change in test plan to allow the compared forces to undergo testing in the same season.

Sincerely yours,

Stephen Pollock, Chair
Panel on Operational Test Design and Evaluation of the Interim Armored Vehicle

Appendix B

Force Exchange Ratio, Historical Win Probability, and Winning with Decisive Force

FORCE EXCHANGE RATIO AND HISTORICAL WIN PROBABILITY

For a number of years the Center for Army Analysis (CAA) analyzed historical combat data to determine the relationship between victory and casualties in land combat. The historical data, contained in the CAA Data Base of Battles (1991 version, CDB91), are from a wide range of battle types, with durations ranging from hours to weeks, dates ranging from the 1600s to the late 20th century, and forces involving a variety of nationalities. Based on the analysis of these data (and some motivation from Lanchester's square law formulation), it has been demonstrated (see Center for Army Analysis, 1987, and its references) that:

• the probability of an attacker victory (the probability of a defender victory is the complement) is related to a variable called the "defender's advantage," or ADV, where ADV is a function of force strengths and final survivors; and
• ADV ≈ ln(FER).

Since N = threat forces and M = friendly coalition forces in our definition of the force exchange ratio (FER), Figure B-1 depicts the historical relationship between the FER and the probability of winning, regardless of whether the coalition is in defense or attack mode.

FIGURE B-1 Historical relationship between force exchange ratio and Pr(win). SOURCE: Adapted from Thompson (1992) and Helmbold (1992).

Additionally, the relation between FER and friendly fractional casualties is depicted in Figure B-2 (see CAA, 1992, and VRI, 1992).

FIGURE B-2 Force exchange ratio/casualty relationship. SOURCE: Adapted from Thompson (1992) and Helmbold (1992).

FER is not only a useful measure of effectiveness (MOE) to indicate the degree to which force imbalance is reduced, but it is also a useful historical measure of a force's warfighting capability for mission success.
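
As a concrete sketch of these quantities, the Python fragment below uses invented strengths and survivor counts; the FER is computed as the fraction of the threat force lost divided by the fraction of the friendly force lost, matching the description of the FER given later in this appendix, and ADV is taken as ln(FER) per the relation above.

    import math

    # Invented illustration; N denotes the threat force and M the friendly
    # coalition force, as in the definition of the FER cited above.
    N_start, N_end = 1000, 550     # threat strength at start and end of the battle
    M_start, M_end = 800, 710      # friendly strength at start and end of the battle

    threat_fraction_lost = (N_start - N_end) / N_start      # 0.45
    friendly_fraction_lost = (M_start - M_end) / M_start    # 0.1125

    fer = threat_fraction_lost / friendly_fraction_lost     # 4.0
    adv = math.log(fer)                                      # about 1.39

    print(f"FER = {fer:.2f}, ADV = ln(FER) = {adv:.2f}")

By the historical relationship depicted in Figure B-1, a FER of this size corresponds to a high probability of winning, whichever side is defending.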

FER AND "DECISIVE FORCE"

Following the demise of the Soviet Union and Operation Desert Storm, the U.S. National Military Strategy (NMS) codified a new military success objective: "Apply decisive force to win swiftly and minimize casualties." The NMS also implied that decisive force will be used to minimize risks associated with regional conflicts. The FER is an MOE that is useful in defining and quantifying the level of warfighting capability needed to meet this objective.

Figure B-3 has been derived from a scatterplot of results from a large number of simulated regional conflicts involving joint U.S. forces and coalition partners against a Southwest Asia regional aggressor. The coalition's objectives are to conduct a defense to prevent the aggressor from capturing critical ports and airfields in Saudi Arabia and then to conduct a counteroffensive to regain lost territory and restore national boundaries.

The FER-coalition casualty relationship shown in the figure is based on simulation results, in which the FER is the ratio of the percentage of enemy losses to the percentage of coalition losses. Except for the region in which the coalition military objectives were not achieved (FER < 1.3) because insufficient forces arrived in the theater, the relationship between FER and coalition casualties is similar to that shown in Figure B-2, which is based on historic data. The relationship between FER and the probability of win in Figure B-3 is based on the analysis of historic data.

As shown in Figure B-3, a FER = 5.0 is defined to be a "decisive" warfighting capability. This level comes close to achieving the criterion of minimizing casualties, since improvements above that level only slightly reduce casualties further. This level of FER also minimizes risk in that a force with a FER of 2.5 will win approximately 90 out of 100 conflicts (lose 10 percent of the time) but will lose less than 2 percent of the time with a FER = 5.0.

FIGURE B-3 Force exchange ratio and decisive warfighting capability. The figure marks three regions (aggressor achieves military objectives; coalition objectives achieved, casualties high; coalition objectives achieved, casualties reduced) and Pr(win) values of .505, .846, .937, .968, and .981 at force exchange ratios of 1.0 through 5.0, with "Decisive" and "ODS" labels near the upper end of the axis, which extends to 100.0. SOURCE: Adapted from Thompson (1992) and Helmbold (1992).