the relationship between the reliability of the system in developmental and operational test events.
The second discussant, Ernest Seglie, expressed his belief that experts in operational test agencies would be able to guess the ratio of the mean waiting time to failure for operational test to that for developmental test (λ in Steffey's notation). Presumably, program managers would not allow a system for which they were responsible to enter operational test unless they were relatively sure it could meet the reliability requirement. Yet in data recently presented to the Defense Science Board (2000), 80 percent of operationally tested Army systems failed to achieve even half of their requirement for mean time between failures. This finding shows that operational test is demonstrably different from developmental test, and it suggests that priors for λ should not be too narrow, since information on system performance gathered in developmental test is evidently not easy to interpret in operational terms.
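The caution against narrow priors for λ can be made concrete with a small simulation. The sketch below assumes exponential times between failures, so each failure rate has a conjugate gamma posterior; all test hours, failure counts, and prior parameters are hypothetical illustrations, not figures from the report.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical test data (illustrative only):
# developmental test: 20 failures in 2000 hours -> MTBF about 100 h
# operational test:   20 failures in 1000 hours -> MTBF about  50 h
n_dt, t_dt = 20, 2000.0
n_ot, t_ot = 20, 1000.0

def posterior_rate_draws(a0, b0, n, t, size=100_000):
    """Draws from the gamma posterior of a failure rate.

    (a0, b0) encode the gamma prior; small values give a diffuse
    prior, large values a narrow prior centered on a0 / b0.
    """
    return rng.gamma(shape=a0 + n, scale=1.0 / (b0 + t), size=size)

# Diffuse priors on both the DT and OT failure rates
rate_dt = posterior_rate_draws(0.01, 0.01, n_dt, t_dt)
rate_ot = posterior_rate_draws(0.01, 0.01, n_ot, t_ot)

# Steffey's lambda: ratio of OT mean time to failure to DT mean time
# to failure, i.e. (1 / rate_ot) / (1 / rate_dt) = rate_dt / rate_ot
lam = rate_dt / rate_ot
print(round(float(np.mean(lam)), 2))   # near 0.5: OT roughly halves the MTBF
```

With these diffuse priors the posterior for λ concentrates near the empirical ratio of about 0.5; a prior sharply concentrated at λ = 1 would instead pull the posterior toward 1 and mask the DT/OT gap the Defense Science Board data exhibit.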
Seglie stated that there are two possible approaches for improving the process of system development. First, one can combine information from the two types of testing. Seglie admitted that he had trouble with this approach: the environments of developmental and operational testing are very different, with correspondingly different failure modes. In addition, combining information focuses attention on the resulting estimate rather than on the fuller picture of the system, drawn from the separate test situations, that one would like to give to the user. It is extremely important to know about the system's failure modes and how to fix them. Therefore, one might instead focus on the size of λ and use this information to help diagnose the potential for failure modes that occur in operational test but not in developmental test. Doing so might demonstrate the benefits of broadening the exposure of the system during developmental test to include operational failure modes. Until it is better understood why developmental and operational tests differ so markedly, combining their data appears dangerous. A better understanding of why λ differs from 1 should allow one to incorporate operational test-specific stresses into developmental testing, directing design improvements in advance of operational test.
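Seglie's suggestion to examine λ before combining data can be sketched as a simple consistency check: if λ were 1, the count of operational test failures should look like a Poisson count at the developmental test rate, so a small tail probability flags operational-test-specific failure modes. The failure counts and test hours below are hypothetical illustrations.

```python
import math

# Hypothetical counts (illustrative only): DT rate estimated from
# 20 failures in 2000 hours; OT observed 20 failures in 1000 hours.
rate_dt = 20 / 2000.0
t_ot, n_ot = 1000.0, 20
mu = rate_dt * t_ot          # expected OT failures if lambda == 1

# P(N >= n_ot) under Poisson(mu): a small value is evidence that
# lambda != 1, i.e. that operational test exercises failure modes
# developmental test did not.
p_tail = 1.0 - sum(math.exp(-mu) * mu**k / math.factorial(k)
                   for k in range(n_ot))
print(p_tail)
```

A tail probability this small would argue against naively pooling the two data sources and instead point toward investigating which operational stresses produced the extra failures.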
Samaniego commented on Seglie's concerns as follows. First, it should be recognized that Bayesian modeling has been shown to be remarkably robust to variability in prior specification. The Achilles heel of the Bayesian approach tends to be overstatement of the confidence one has in one's prior mean (that is, in one's best guess, a priori, of the value of the parameters of interest). With prior modeling that is sufficiently diffuse,