5 Improving Operational Test Planning and Design
Pages 61-86



From page 63...
... Although defense systems have unique aspects, there are also substantial similarities between the operational testing of a defense system and the testing of a new industrial product. Given the very high stakes involved in testing defense systems that are candidates for full-rate production, it is extremely important that the officials in charge of designing and carrying out operational tests make use of the full range of available techniques so that operational tests are as informative as possible at the least possible cost.
From page 64...
... Our assessment is that the current level of test planning and experimental design for operational testing in the Department of Defense (DoD) neither reflects best industrial practice nor takes full advantage of the relevant experimental design literature.
From page 65...
... that objective (2) receive greater emphasis in operational testing.
From page 66...
... so, if a missile is to have a hit rate of 0.80, that hit rate could be measured as a weighted average of hit rates in individual scenarios, weighted by their frequency and military importance. Even though the objective of operational testing is often to determine whether this average is consistent with the system requirement, testers are also understandably very interested in examining performance in individual scenarios, because system deficiencies might surface in only some test scenarios.
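To make the weighted average concrete, here is a minimal Python sketch; the scenario names, per-scenario hit rates, and weights are all hypothetical numbers chosen for illustration:

```python
# Weighted-average hit rate across scenarios, weighted by frequency and
# military importance. All scenarios, rates, and weights are invented.
scenarios = {
    # scenario: (observed hit rate, weight); weights sum to 1
    "day/clear":   (0.88, 0.50),
    "night/clear": (0.80, 0.30),
    "night/rain":  (0.65, 0.20),
}

assert abs(sum(w for _, w in scenarios.values()) - 1.0) < 1e-9

weighted_avg = sum(rate * w for rate, w in scenarios.values())
print(f"weighted average hit rate: {weighted_avg:.2f}")   # 0.81
# The 0.80 requirement is compared against this average, while the
# per-scenario rates (e.g., 0.65 in night/rain) flag possible deficiencies.
```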
From page 67...
... The panel's understanding is that such uncontrolled effects are not always taken into consideration in operational test planning and test design. In addition to understanding which factors influence system performance, it is also important to understand how they do so, i.e., to understand the variability in system performance that results from changes in the levels of the various test factors.
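As one way to picture this, the following minimal Python sketch (with invented outcomes and hypothetical factors) compares mean performance across the levels of each test factor; a large range among the level means signals a factor that drives performance variability:

```python
import numpy as np

# hit/miss outcomes (1 = hit) from runs tagged with two hypothetical factors
runs = [
    # (terrain, time_of_day, outcome)
    ("desert", "day", 1), ("desert", "day", 1), ("desert", "night", 1),
    ("desert", "night", 0), ("forest", "day", 1), ("forest", "day", 0),
    ("forest", "night", 0), ("forest", "night", 0),
]

for idx, name in [(0, "terrain"), (1, "time_of_day")]:
    levels = sorted({r[idx] for r in runs})
    means = {lv: np.mean([r[2] for r in runs if r[idx] == lv]) for lv in levels}
    spread = max(means.values()) - min(means.values())
    print(f"{name}: level means {means}, range {spread:.2f}")
```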
From page 68...
... Other information that could be collected includes: information about any potential flaws in the system under development; the expected difference in performance between a system under development and any control or baseline system; information on the plausibility of various assumptions; and information that might help determine the usefulness of modeling and simulation to supplement results from operational testing. The information from these screening tests can be used to estimate costs on the basis of the number of test runs, the standard error of prediction for estimates of system performance in individual scenarios and of average system performance across scenarios, and the statistical power if significance testing is used.
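As a minimal illustration of the power computation mentioned above, assuming a binomial hit/miss measure and a normal approximation (the 0.80 requirement, the 0.70 shortfall alternative, and the run counts are all illustrative):

```python
import math

Z_ALPHA = 1.2816   # one-sided 10% critical value of the standard normal

def power_shortfall(p_req: float, p_true: float, n: int) -> float:
    """Probability that a one-sided test of H0: p = p_req detects a
    shortfall when the true hit rate is p_true < p_req (normal approx.)."""
    crit = p_req - Z_ALPHA * math.sqrt(p_req * (1 - p_req) / n)  # reject if p_hat < crit
    z = (crit - p_true) / math.sqrt(p_true * (1 - p_true) / n)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))                # Phi(z)

for n in (20, 50, 100):
    print(f"n={n:3d}  power={power_shortfall(0.80, 0.70, n):.2f}")
```

More test runs buy more power; a table like this, built from screening-test estimates, is one way to put a number on that trade.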
From page 69...
... Additional benefits of this idea are briefly discussed in Chapter 3, where the notion of a more continuous assessment of operational system performance is advanced.[5] Certainly, there are aspects of operational testing that do not scale down to small tests easily, for example, the number of users and systems needed for a test of a theater radar system. As was mentioned in Chapter 3, if these screening or guiding tests are not feasible, there will often be real benefit in taking existing developmental testing and modifying it to include operational aspects where practicable.
From page 70...
... It is important to recognize that test design and the subsequent evaluation are closely linked: the Test and Evaluation Master Plan and subsequent test planning documents should include not only the test plan, but also an outline of the anticipated analysis and of how the test design accommodates the evaluation. (If averages and percentages remain the primary analysis associated with an operational test, this linkage is much less important.)
From page 71...
... The identification of alternative measures for various performance characteristics, and a full understanding of their advantages and disadvantages, is an important step in test design.

STATISTICAL EXPERIMENTAL DESIGN

This section considers both some broad issues related to the methods used for the experimental design of operational tests of defense systems and some specific issues raised in association with four "special problems" that the panel was introduced to while examining the application of experimental design to operational testing.
From page 73...
... Therefore, it is vital that operational tests be designed so that they are as informative and efficient as possible, and that they produce results that permit the best decisions to be made with respect to proceeding to full-rate production, given the level of test funding.

Experimental Design for Operational Testing

Experimental design is concerned with how to make tests more effective.
From page 74...
... When it is appropriate, expert help will then be sought so that these techniques can be used to greater advantage in operational test design. We repeat Recommendation 4.4 from Chapter 4: Recommendation 4.4: The service test agencies should examine the applicability of state-of-the-art experimental design techniques and principles and, as appropriate, make greater use of them in the design of operational tests.
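One standard technique of this kind is the fractional factorial design. The sketch below, a minimal Python example with hypothetical factor names, builds a 2^(4-1) half-fraction that covers four two-level test factors in 8 runs instead of 16 by aliasing the fourth factor with the three-way interaction (D = ABC):

```python
from itertools import product

runs = []
for a, b, c in product((-1, 1), repeat=3):
    d = a * b * c                       # defining relation: D = ABC
    runs.append((a, b, c, d))

header = ("terrain", "time", "countermeasures", "crew_experience")
print("  ".join(f"{h:>15}" for h in header))
for run in runs:
    print("  ".join(f"{v:>15d}" for v in run))
```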
From page 75...
... If t is smaller than m, the only way to proceed is to assume that the test scenarios have features in common and use statistical modeling.[11] Modeling has

[9] Dubin's challenge is so named because the problem was suggested to the panel for study by Henry Dubin, technical director of the Army Operational Test and Evaluation Command.
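A minimal sketch of this modeling idea, with invented numbers: if scenarios share descriptive features, an additive model fit to t test events can predict performance even in scenarios that were never tested (here m = 4 scenarios but only t = 3 events):

```python
import numpy as np

# design matrix for the t = 3 tested scenarios: (intercept, night?, rain?)
X_tested = np.array([[1, 0, 0],    # day, clear
                     [1, 1, 0],    # night, clear
                     [1, 1, 1]])   # night, rain
y_tested = np.array([0.88, 0.80, 0.66])      # observed hit rates (invented)

beta, *_ = np.linalg.lstsq(X_tested, y_tested, rcond=None)

x_untested = np.array([1, 0, 1])             # day, rain: never tested
print("predicted day/rain hit rate:", round(float(x_untested @ beta), 2))  # 0.74
```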
From page 76...
... The experimental design, assuming the models are correct, can be shown to reduce the variance of estimated performance by a substantial amount. However, this advantage comes at a potential cost, since there is a possible specification bias if the models are wrong.
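A small Monte Carlo sketch of this trade-off, with invented numbers: pooling a model prediction with the scenario's own data cuts the variance roughly in half, but imports bias when the model's prediction is off:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean  = 0.70      # true hit rate in the scenario of interest
model_pred = 0.74      # the additive model is slightly wrong here
n_runs, n_sims = 5, 10_000

raw = rng.binomial(n_runs, true_mean, n_sims) / n_runs   # scenario-only estimate
pooled = 0.5 * raw + 0.5 * model_pred                    # model-based estimate

for name, est in (("raw", raw), ("model-pooled", pooled)):
    print(f"{name:>12}: bias {est.mean() - true_mean:+.3f}, sd {est.std():.3f}")
```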
From page 77...
... Although it is not always easy to determine whether the additional assumptions are supported, especially given the small sample sizes of operational tests and the limited information about a system's operational performance prior to operational testing, having a test data archive (as recommended in Chapter 3) would often provide much of the needed information to support these types of models.
From page 78...
... The theory of optimal experimental design for multiple measures is well established in some specific situations, but it is difficult to apply in many applied settings, for the same reasons that optimal design is difficult to apply for single measures. In addition, the breadth of problems that can be solved with optimal design for multiple measures is much more limited.
From page 79...
... Problem 3: Sequential Experimentation and Other Sequence and Time Issues The accommodation of time and sequence considerations is another set of technical issues raised by the operational testing of defense systems. Time and sequence issues arise in several different ways.
From page 80...
... Sequential significance tests have the property that, at a given confidence level and power, the expected sample size is smaller than that of the comparable fixed-sample-size test. Applied to operational testing, wide use of sequential testing could therefore produce substantial savings in test dollars and a decrease in test time.
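A minimal sketch of one such method, Wald's sequential probability ratio test (SPRT), for hit/miss data; the 0.80 requirement, the 0.65 shortfall alternative, and the error rates are illustrative:

```python
import math

p0, p1 = 0.80, 0.65            # requirement met vs. shortfall
alpha = beta = 0.10            # error rates for the two wrong decisions
upper = math.log((1 - beta) / alpha)    # cross above: conclude shortfall
lower = math.log(beta / (1 - alpha))    # cross below: consistent with requirement

def sprt(outcomes):
    """Run the SPRT over a sequence of 1 (hit) / 0 (miss) outcomes."""
    llr = 0.0
    for i, hit in enumerate(outcomes, 1):
        llr += math.log((p1 if hit else 1 - p1) / (p0 if hit else 1 - p0))
        if llr >= upper:
            return i, "shortfall indicated"
        if llr <= lower:
            return i, "consistent with requirement"
    return len(outcomes), "no decision yet; continue testing"

print(sprt([1, 1, 1, 1, 0] + [1] * 12))   # stops early, after 15 runs
```

Because strong evidence stops the test early, the expected number of runs falls below what a fixed-sample test with the same error rates would need.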
From page 81...
... The taxonomy provided in Appendix C attempts to take this into account.

DETERMINATION OF SAMPLE SIZE USING DECISION THEORY

This section concerns the determination of the sample size of operational tests.
From page 82...
... As a way of determining the sample size of operational tests, analytical procedures of this type have some advantages. The use of significance testing
From page 83...
... If the practice results in a test that would cost less than the funds earmarked for operational testing by the program manager, the test will go forward. However, as is more likely for ACAT I systems, if the program manager's test budget cannot support this sample size, negotiations take place among the program manager, the service test agency, and DOT&E; the program manager's test budget limit may be modestly augmented, but the very likely result is an operational test size substantially smaller than that based on this approach.
From page 84...
... Other approaches to determining the size of operational tests, such as allocating a fixed percentage of program dollars to operational testing, do not use test dollars effectively. Unfortunately, the decision-theoretic approach is difficult to apply since the costs associated with the wrong decisions are difficult to quantify and must be based, at least in part, on subjective judgments, which ought to be made explicit.
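Despite that caveat, the structure of the calculation is simple. A minimal sketch with invented costs: pick the test size n that minimizes testing cost plus the expected cost of a wrong pass decision, whose probability shrinks as n grows:

```python
import math

COST_PER_RUN = 1.0e6     # hypothetical cost of one operational test event
COST_WRONG   = 200.0e6   # hypothetical cost of fielding a deficient system
p_req, p_true = 0.80, 0.70   # requirement vs. an assumed true shortfall

def p_wrong_pass(n: int) -> float:
    """P(observed rate >= requirement | true shortfall), normal approx."""
    z = (p_req - p_true) / math.sqrt(p_true * (1 - p_true) / n)
    return 0.5 * (1 - math.erf(z / math.sqrt(2)))    # 1 - Phi(z)

def total_cost(n: int) -> float:
    return n * COST_PER_RUN + p_wrong_pass(n) * COST_WRONG

best = min(range(5, 201), key=total_cost)
print(f"cost-minimizing test size: {best} runs")
```

The subjective pieces the text warns about are exactly the inputs here: COST_WRONG and the assumed shortfall p_true.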
From page 85...
... Finally, if the objective of operational testing is viewed as not only a statistical certification that requirements have been met, but also to gain the most information about operational system performance and identification of deficiencies in system design, the assessment of the benefit of additional testing will involve broader considerations, such as:
From page 86...
... To better understand how to implement the comparison of marginal cost with the value of the information gained through additional testing, a feedback system is needed, in which the service test agencies maintain records of the costs of testing and the costs of any retrofitting, along with the precise test events that were conducted and the reason why retrofitting was required. Discovered defects or system limitations might then be traced back to inadequate testing.
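A minimal sketch, with invented fields, of the kind of record such a feedback system might keep so that retrofit costs can be traced back to test coverage:

```python
from dataclasses import dataclass, field

@dataclass
class TestEventRecord:
    event_id: str
    scenario: str                   # e.g., "night/rain"
    test_cost: float
    defects_found: list[str] = field(default_factory=list)

@dataclass
class RetrofitRecord:
    defect: str
    retrofit_cost: float
    reason: str                     # why the retrofit was required
    traced_to_events: list[str] = field(default_factory=list)
    # an empty traced_to_events list marks a defect no test event exposed,
    # i.e., a candidate case of inadequate testing
```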

