7 Test Protocols for Backface Deformation: Statistical Considerations and Assessment

7.0 SUMMARY

The original Army protocols for backface deformation (BFD) were based on binary (0-1) data. The BFD measurement at each location was compared against its specified threshold, and the outcome was scored as a “1” (failure) if it exceeded its threshold. This original plan was based on 20 shots; if no BFD measurements exceeded their limit, the demonstration was successful. In this sense, it was similar to the Army’s legacy protocol for resistance to penetration (RTP). The Director, Operational Test and Evaluation (DOT&E) protocol expanded the number of shots to 240 and used the continuous measurements together with an assumption that the data are normally distributed. Specifically, the plan compared the 90 percent “upper tolerance limits” computed at the 90 percent confidence level (the 90/90 rule) with the thresholds for the corresponding locations on the helmet. As noted in Chapter 5, available BFD test data show that the probability of BFD exceeding its limits is quite small, on the order of 0.005. As this chapter observes, DOT&E’s BFD protocol has about a 90 percent chance of accepting the helmet design even if there is an order of magnitude increase in the exceedance probability (from 0.005 to 0.05). This weakens the incentive for manufacturers to produce helmets that are at least as good as current helmets with respect to BFD. In addition, the DOT&E protocols are based on an assumption of normality (a priori untestable) and the complex notion of an upper tolerance limit. Therefore, Recommendation 7-1 proposes that DOT&E’s protocol be changed to an analysis based on pass/fail scoring of each BFD measurement. This change has the advantage that the new BFD protocol would exactly parallel the RTP protocol and would be easy for designers and manufacturers to understand and interpret. It is important that, after testing, the continuous BFD measurements be analyzed to assess the actual BFD levels and monitor them for changes over time.
7.1 INTRODUCTION

This chapter evaluates DOT&E’s first article testing (FAT) protocol for BFD. For the sake of comparison, the committee also considers the Army’s legacy test plan. As was the case for RTP (Chapter 6), the Army has modified the DOT&E protocol for application to the lightweight Advanced Combat Helmet, so the effect of that modification is also evaluated.
Recall from Chapter 4 that BFD is the maximum depth of the indentation in the clay headform resulting from a 9-mm bullet impact on a mounted helmet. It is measured for each shot that does not penetrate the helmet. These BFD measurements are compared against corresponding thresholds (or limits) that depend on shot location: 25.4 mm for front and back, and 16.0 mm for left, right, and crown. As discussed in Chapter 5, there appears to be no scientific basis for the choice of these thresholds. Without a scientific basis, the committee is limited to an assessment of whether the BFD distribution for a new helmet is at least as good as that of current helmets, in terms of the probability of exceeding the specified limits.
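The location-dependent comparison described above can be illustrated with a minimal Python sketch. The limits are the ones quoted in the text; the function name, the dictionary, and the sample measurements are hypothetical and are not taken from any test data.

```python
# Location-specific BFD limits in millimetres, as quoted in the text.
BFD_LIMITS_MM = {
    "front": 25.4,
    "back": 25.4,
    "left": 16.0,
    "right": 16.0,
    "crown": 16.0,
}


def exceeds_limit(location: str, bfd_mm: float) -> bool:
    """Return True when a BFD measurement is strictly above the limit for its location."""
    return bfd_mm > BFD_LIMITS_MM[location]


# Hypothetical measurements, for illustration only (not real test data).
shots = [("front", 18.2), ("left", 16.4), ("crown", 12.9), ("back", 25.4)]

# Binary (0-1) scoring: count the shots whose BFD exceeds the applicable limit.
failures = sum(exceeds_limit(location, bfd) for location, bfd in shots)
print(f"{failures} of {len(shots)} shots exceed their limit")
```

The sketch treats a measurement exactly equal to its limit as a pass, on the assumption that "exceeding" means strictly greater than; the source does not spell out how ties are handled.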
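7.2 BACKFACE DEFORMATION FIRST ARTICLE ACCEPTANCE TESTING PROTOCOLS AND THEIR PROPERTIES

DOT&E Protocol

The DOT&E protocol is based on the suite of 240 shots discussed in Chapter 5. Data from the 240 shots are divided into two groups corresponding to shot location as follows: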
1. 96 measurements from all the shots at front and back locations, combined across helmet sizes and environments; and
2. 144 measurements from all the shots at left, right, and crown locations, combined across helmet sizes and environments.

To accept the lot, the 90/90 UTLs calculated from the data for both groups must be less than their respective thresholds.

A 90/90 upper tolerance limit (UTL) is the upper 90 percent confidence bound on the 90th percentile of the underlying distribution. The statistical inference is that, with 90 percent confidence, 90 percent of the underlying BFD distribution is less than the UTL calculated from the data. The DOT&E protocol calculates the UTLs assuming the BFD measurements have a normal distribution (but different normal distributions for the two location groups).

For a normal distribution with mean μ and standard deviation σ, the upper 90th percentile is μ + 1.28σ. Because the parameters are unknown, one has to estimate them and also incorporate the variability in the estimates. It turns out that the UTL, based on the data, has the form

$$\mathrm{UTL} = \bar{Y} + kS$$

Here, $\bar{Y}$ is the sample mean, S is the sample standard deviation, and k is a constant that depends on the sample size n (number of shots), the confidence level, and the distribution percentile of interest. The last two are both set at 90 percent by DOT&E, hence the 90/90 rule. The k-factors are derived from a non-central t distribution. They have been tabulated and can also be obtained using commercial software. For the 90/90 criterion, it is clear that the k-factor has to be larger than 1.28 to account for the uncertainty in estimating the parameters μ and σ from the data using $\bar{Y}$ and S.

The 90/90 UTL is applied as follows in DOT&E’s BFD protocol. The UTL is a 90 percent upper confidence bound for the 90th percentile, so one can say with 90 percent confidence that at least 90 percent of the distribution is smaller than the UTL (or at most 10 percent of the distribution exceeds the UTL). Therefore, the FAT is successful if the UTL is less than the specified BFD limit B* for each data group. The rationale is that if UTL < B*, then, with 90 percent confidence, B* exceeds more than 90 percent of the distribution, and less than 10 percent of the distribution exceeds B*.

The same theory underlying the determination of normal distribution tolerance limits can be used to calculate a 90 percent upper confidence limit on the probability of exceeding a specified threshold. This exceedance probability is analogous to the penetration probability for RTP testing. The acceptance criterion would then be that this confidence limit on the exceedance probability be less than 0.10. This criterion is equivalent to the UTL criterion, but more in line with the 90/90 criterion underlying the DOT&E protocols.

The acceptance criterion, that $\bar{Y} + kS < B^*$, can be rewritten as

$$\frac{B^* - \bar{Y}}{S} > k \qquad \text{(Equation 7-1)}$$

The left-hand side of this inequality is the number of (sample) standard deviations, S, between B* and the average BFD, $\bar{Y}$. The conventional term for this quantity is the estimated “margin” relative to a one-sided specification limit. If the estimated margin is greater than a specified k, the acceptance criterion is met.

In the statistical and quality control literature, the test plans are developed by controlling the probability of exceeding a one-sided specification limit directly from a margin calculation, rather than backing into this criterion from a UTL. If the calculated margin exceeds a threshold, k, the demonstration is successful.

Finding 7-1. Statistical tolerance limits, which are the basis of the DOT&E analyses, are complex, and one has to keep track of multiple probabilities and inequalities. An equivalent, and more conventional and transparent, analysis is to base the acceptance test on the margin (the standardized difference between the threshold and the sample mean, as in Equation 7-1).

The margin plan parameters (k, n) are analogous to the (c, n) parameters for binomial data. For a given plan, operating characteristic (OC) curves can be calculated that plot the probability of acceptance versus the underlying probability of exceeding the limit, B*. By specifying two points on the OC curve, values of n and k can be derived that define a plan that satisfies those two requirements.

Operating Characteristic Curves of DOT&E Protocol

Figure 7-1 shows the OC curves for the two groups of shot locations: (1) the red dashed line corresponds to back and front, and (2) the black solid line corresponds to right, left, and crown. At the right side of Figure 7-1, the green line shows that, if the underlying probability of a BFD “failure” is 0.10 for either location group, there is only a 10 percent chance of passing the test. This is the 90/90 criterion that was specified up front, and the plans have the intended property at this value. The manufacturer’s risk, and incentive, is read from the left end of the curves. For example, for the extreme left (red) line where P(BFD > B*) = 0.005, comparable to the proportion of available BFD data that exceed their thresholds, the probability of acceptance is close to one; that is, the manufacturer’s risk is close to zero. The blue lines show that, to have at least a 90 percent chance of passing the acceptance test, the manufacturer must have a BFD exceedance probability of about 0.05 for the back and front locations and about 0.055 for the other group. Putting it another way, even if the exceedance probability is as high as 5 percent or 5.5 percent, manufacturers still have a 90 percent chance of passing the FAT requirement for BFD.

FIGURE 7-1 Operating characteristic curves for Director, Operational Test and Evaluation, backface deformation (BFD) protocol for the two groups of shot locations: red dashed line corresponds to back and front and black solid line corresponds to right, left, and crown. Green and red lines show the acceptance probabilities for the two groups when P(BFD > B*), the exceedance probabilities, are 0.10 and 0.005, respectively. Blue line shows the exceedance probabilities when the acceptance probability is fixed at 0.9. Axes: Pr(BFD > B*) (horizontal) and probability of acceptance (vertical). Plan parameters shown in the figure: n = 144, k = 1.44; n = 96, k = 1.47, where n is the sample size and k is the critical distance.

The DOT&E protocol specifies that the plans for both groups of locations must pass their acceptance tests in order for the overall BFD protocol to be successful. Thus, if the underlying BFD failure probability was 0.10 for both subgroups of locations, the probability of passing both tests would be only 0.1 × 0.1 = 0.01, or 1 percent, as shown by the green curve in Figure 7-2. On the other hand, even when the underlying BFD failure probability is as high as 0.045, manufacturers have a 90 percent chance of passing both tests.

FIGURE 7-2 The two operating characteristic (OC) curves in Figure 7-1 overlaid with the overall OC curve of the backface deformation (BFD) protocol (assuming both BFD exceedance probabilities are the same). Curves: P(acc:96), P(acc:144), and P(acc:both). Axes: Pr(BFD > B*) (horizontal) and Pr(acc) (vertical).

Finding 7-2. The use of two BFD tests, rather than a single test, has made the evaluation of the government’s risk and the manufacturer’s risk and incentive more complicated.

Comparison of DOT&E’s Current Protocols to the Legacy Protocol

The legacy protocol was a (c = 0, n = 20) plan based on converting BFD failures to binary data. The OC curves of such plans were discussed in Chapter 5; in this case, P(BFD > B*) is the probability of a BFD failure. Figure 7-3 overlays the OC curve for that plan on the OC curves in Figure 7-2. To have at least a 90 percent chance of passing the legacy plan, the underlying BFD failure probability had to be 0.005 or less. The DOT&E protocol relaxes that incentive by about an order of magnitude (even considering that the tolerance limit acceptance test has to be passed by both data subgroups). Thus, as was the case for RTP, the DOT&E protocol is “easier” to pass than the legacy protocol for values of true BFD failure probabilities less than 0.075 (where the legacy and the green curves cross).

For the BFD data provided to the committee (see Chapter 5), there were 8 BFD failures in a total of 816 tests. All of those failures were in one test series, which could indicate a systematic problem with that helmet or that test series. The combined data for the other three helmet tests yield an upper 90 percent confidence limit on the BFD failure probability of 0.004. This should be the region of interest for the BFD protocol.

FIGURE 7-3 Comparison of the three operating characteristic curves in Figure 7-2 with that of the legacy (0, 20) plan. Curves: P(acc:96), P(acc:144), P(acc:both), and P(acc:0/20). Axes: Pr(BFD > B*) (horizontal) and Pr(acc) (vertical).

Finding 7-3. Figure 7-3 shows that the DOT&E protocol has a 90 percent chance of accepting helmets even when the BFD failure probabilities are an order of magnitude larger than what has been achieved by current helmets. This reduces the incentive for manufacturers of future helmets to sustain BFD failure probabilities at current levels.
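The OC comparisons above can be approximated numerically. The following Python sketch is illustrative only and uses just the standard library. It does not reproduce the committee's calculations: the exact k-factors and OC curves come from a non-central t distribution, whereas this sketch substitutes a well-known closed-form approximation for the one-sided tolerance factor and a large-sample normal approximation for the distribution of the margin. All function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def approx_k_factor(n: int, coverage: float = 0.90, confidence: float = 0.90) -> float:
    """Closed-form approximation to the one-sided normal tolerance factor k.

    Assumption: this is an approximation, not the exact non-central t value.
    """
    z_p = _STD_NORMAL.inv_cdf(coverage)
    z_g = _STD_NORMAL.inv_cdf(confidence)
    a = 1.0 - z_g**2 / (2.0 * (n - 1))
    b = z_p**2 - z_g**2 / n
    return (z_p + sqrt(z_p**2 - a * b)) / a


def approx_accept_prob_margin(p_exceed: float, n: int, k: float) -> float:
    """Approximate P((B* - Ybar)/S > k) for normal data with P(Y > B*) = p_exceed.

    Assumption: Ybar + k*S is treated as normal with variance
    sigma^2 * (1/n + k^2 / (2*(n - 1))), a large-sample approximation.
    """
    z = _STD_NORMAL.inv_cdf(1.0 - p_exceed)  # (B* - mu)/sigma
    se = sqrt(1.0 / n + k**2 / (2.0 * (n - 1)))
    return _STD_NORMAL.cdf((z - k) / se)


def accept_prob_binary(p_exceed: float, n: int, c: int) -> float:
    """Exact P(at most c failures in n shots) for a binary (c, n) plan."""
    return sum(
        comb(n, i) * p_exceed**i * (1.0 - p_exceed) ** (n - i) for i in range(c + 1)
    )


if __name__ == "__main__":
    for n in (96, 144):
        print(f"n={n}: approximate k = {approx_k_factor(n):.3f}")
    for p in (0.005, 0.045, 0.10):
        front_back = approx_accept_prob_margin(p, 96, 1.47)
        sides_crown = approx_accept_prob_margin(p, 144, 1.44)
        legacy = accept_prob_binary(p, 20, 0)
        print(
            f"p={p}: both normal-theory tests ~ {front_back * sides_crown:.3f}, "
            f"legacy (0, 20) = {legacy:.3f}"
        )
```

Under these approximations the k-factors come out close to, though not identical with, the tabulated 1.47 and 1.44, and the acceptance probabilities follow the same pattern as Figures 7-1 to 7-3: the legacy plan passes about 90 percent of the time at an exceedance probability of 0.005, while the two-group normal-theory protocol still passes roughly 90 percent of the time near 0.045.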
Modified DOT&E Protocol for the Enhanced Combat Helmet

The Enhanced Combat Helmet (ECH) protocol is based on 48 helmets spanning four helmet sizes and four environments, with three helmets tested for each combination of helmet size and environment. There are 2 shots per helmet, totaling 96 shots. One shot is at one of the front/back locations; the other is at one of the left/crown/right locations. The same type of 90/90 UTL is computed based on the assumption of normality; the k-factor for n = 48 and the 90/90 criterion is 1.57. The black curve in Figure 7-4 is the OC curve for the plan based on 48 shots. The red dashed curve is the OC curve for both tests passing. This curve shows that for a manufacturer to have a 90 percent chance of acceptance for both location groups, the helmets should have an underlying probability of exceeding the limit, B*, of just less than 0.03. As was the case with the previous protocol, this is a substantially higher BFD failure probability than what current helmets have achieved.

FIGURE 7-4 Operating characteristic curves for the two location groups for the Enhanced Combat Helmet. Curves: P(acc:48) and P(acc:48, both). Axes: Pr(BFD > B*) (horizontal) and Pr(acc) (vertical). NOTE: BFD, backface deformation.

Finding 7-4. The DOT&E protocol for the ECH has a 90 percent chance of accepting helmets that have an order of magnitude larger BFD failure probability than those achieved by current helmets.

Army’s Modified DOT&E Protocol for the Lightweight Advanced Combat Helmet

This protocol changed the grouping of the shots in the subsection above as follows: (1) front only, (2) rear only, (3) crown only, and (4) right and left sides combined. Before combining right and left sides, a pre-test is done to test whether the distributions (mean and variance) for the two sides are different; the data are combined only if there is no indication of a significant difference. This separation of the protocol into four or five subgroups is in line with the patterns of heterogeneity that were discussed in Chapter 5.

Under this protocol, the tolerance limit analysis is done on appropriate subsets of either 48 or 96 shots, depending on the location and whether the left and right distributions of BFD are consistent. Figure 7-5 shows the OC curves for the situation in which the protocol is applied to a single group of 48 shots, and the combined curve is for the situation of all five groups passing their individual margin tests.

Figure 7-5 shows that for a manufacturer to have a 90 percent chance of passing all five acceptance tests by location, the underlying BFD failure probability would have to be about 0.02. As was the case with RTP, the Army’s modification of the DOT&E protocol is considerably more stringent than the DOT&E protocol (Figure 7-2).

FIGURE 7-5 Operating characteristic curves for a single 48-shot plan and for five 48-shot plans. Curves: P(acc:48) and P(acc: 48, 5 locs). Axes: Pr(BFD > B*) (horizontal) and Pr(acc) (vertical). NOTE: BFD refers to backface deformation.

7.3 DISCUSSION

Backface Deformation Protocol Based on Binary Data

Although the BFD tests are part of DOT&E’s FAT protocols, the committee’s impression is that they do not receive the same level of public scrutiny as the RTP protocols. For example, they were not mentioned in the communications between Rep. Slaughter and the Department of Defense. There are many possible reasons, some of which are stated in the following finding.

Finding 7-5. The rationale behind BFD protocols for FAT is difficult to understand for the following reasons:
• The lack of a scientific connection between BFD and brain injury dilutes the usefulness of BFD measurements;
• The choice of BFD thresholds is not based on data or scientific studies, so the notion of exceeding the threshold has no practical or scientific meaning, and their use is limited to comparing a new design of helmets with existing ones; and
• BFD measures the deformation on clay, which is only an indirect measure of the actual deformation on helmets.

There are also several statistical issues related to the DOT&E protocols. The data in Chapter 5 indicate an appreciable difference between the BFD distributions for front and rear shots. To address this, DOT&E has recommended preliminary analyses to decide whether the BFD data can be pooled across groups before conducting the test. These added analyses will add substantial complexity to both the decision process and the properties of the test protocol. They also make the protocols less transparent. These points are summarized in the following finding.

Finding 7-6.
• The current DOT&E protocols for BFD data are based on upper tolerance limits, which are more difficult to understand than the protocols for RTP based on binary data.
• These protocols are based on the assumption that the BFD data follow a normal distribution. The computed values of the upper tolerance limits are sensitive to this assumption.
• The graphical diagnostics that were shown to the committee indicate that the normality assumption is not unreasonable for the limited data sets that have been analyzed. However, one should be cautious in assuming that future BFD test measurements will always be normally distributed.
• The methodology for computing UTLs requires that the BFD data across environments, helmet sizes, and locations (within the two groups) are homogeneous; that is, they have a normal distribution with the same mean and variance. DOT&E has proposed (1) conducting preliminary hypothesis tests to determine if this assumption of homogeneity holds, and (2) pooling the data only for cases where the pre-test suggests the homogeneity assumption is valid. Such an approach will add substantial complexity to the decision process and, more importantly, to the properties of the test protocol.

The replacement of the legacy protocol, based on binary data, with variable BFD data was presumably driven by efficiency considerations. If the normal distribution assumption is correct, the resulting protocol is much more efficient from a statistical perspective. When the test sample is small, as was the case with the legacy protocol of 20 shots, statistical efficiency is indeed an important consideration.

However, if the test sample size is large (as is the case with 240 shots), the concern about efficiency is less critical. In this case, it is preferable to use protocols that do not require strong parametric assumptions. An additional consideration is the need for simplicity and transparency. The use of two very different protocols for RTP and BFD data makes it difficult for DoD test designers to develop plans with the same goals and for users to understand their properties.

The legacy protocol was a simple and transparent plan that was based on binary data. Specifically, each BFD measurement is compared to its location-specific threshold, and the data are converted to 0-1 outcomes depending on whether the observation is below or above the threshold. A BFD measurement above the threshold leads to a “failure.” The probability of interest is then the exceedance probability.

Recommendation 7-1. The Director, Operational Test and Evaluation, should revert to the more transparent and robust analysis of backface deformation data based on pass/fail scoring of each measurement.

With such conversion, one can use the same types of protocols as those for RTP. For the BFD data the committee has seen, the probability of exceedance is around 0.005, about the same level as the penetration probabilities estimated from the data. So, if the same considerations in Chapter 5 are used to develop the BFD plan, the two protocols are likely to be the same.

A natural concern in converting continuous measurements to binary data is the loss of statistical efficiency. However, recall that the goal of the test protocols is to determine if the BFD measurements exceed their corresponding thresholds. The FAT BFD data provided to the committee indicate that these thresholds are well in the upper tails of the BFD measurements (see Figures 5-2 and 5-4). The data show that P(BFD > B*) is less than 0.005. The probability of rejecting helmets (manufacturer’s risk) produced at this level of quality is essentially zero for the test based on binary data (the same as that for protocols based on normal theory). In other words, the probability of acceptance is essentially 1 for both protocols. If P(BFD > B*) were to increase to 0.05 (an order of magnitude increase), the probability of rejection under a binary (17, 240) plan is about 0.10 (see Figure 6-5). This is very close to the combined normal-theory plan that is currently in use (see Figure 7-2).

The current DOT&E protocol is based on two different plans for the two different location subsets, because they have different thresholds and also differences in distributions within location subsets.

Recommendation 7-2. The binary data for the different location subgroups should be combined into a single backface deformation protocol.

Converting to a binary protocol and combining the data across the locations would mean that the exceedance probabilities may vary across locations. However, the numerical study described in Chapter 5 indicates that the OC curves are robust to the level of deviations in exceedance probabilities that are present with current BFD data.

Post-Test Analyses

As noted, the loss in efficiency is not a major concern in converting the continuous BFD measurements to 0-1 outcomes. It is, however, important for DOT&E and the Services to do post-test analyses of the continuous BFD data, compute the margins, and monitor them to see if there is any trend, increase, or decrease in BFD values over time. Such monitoring is an important part of any test process.

Recommendation 7-3. The Office of the Director, Operational Test and Evaluation, and the Services should analyze the continuous backface deformation measurements, compute the margins, and track them over time to assess any changes.

Recommendation 7-4. Available backface deformation (BFD) data should be used to develop data-based limits against which to compare future BFD data, as a replacement for the current legacy ad hoc limits.
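The margin tracking called for in the post-test analyses can be sketched in a few lines of Python. This is a minimal illustration, assuming the margin is the quantity on the left-hand side of Equation 7-1, (B* minus the sample mean) divided by the sample standard deviation. The measurements, test labels, and the simple "shrinking" check are hypothetical and are not a method prescribed by the committee.

```python
from statistics import mean, stdev


def estimated_margin(bfd_mm, limit_mm):
    """Estimated margin: (limit - sample mean) / sample standard deviation."""
    return (limit_mm - mean(bfd_mm)) / stdev(bfd_mm)


# Hypothetical BFD readings (mm) from two successive tests at a side location.
# These numbers are invented for illustration; they are not real test data.
LIMIT_MM = 16.0
series = [
    ("test 1", [10.0, 12.0, 14.0]),
    ("test 2", [11.0, 13.0, 15.0]),
]

# Compute the margin for each test in chronological order.
margins = [(label, estimated_margin(values, LIMIT_MM)) for label, values in series]
for label, margin in margins:
    print(f"{label}: margin = {margin:.2f} standard deviations")

# A crude monitoring flag: True when every margin is smaller than the one before it.
margin_is_shrinking = all(
    later < earlier for (_, earlier), (_, later) in zip(margins, margins[1:])
)
print("margin shrinking over time:", margin_is_shrinking)
```

A real monitoring scheme would use far more than two tests and a proper trend or control-chart analysis; the flag here only shows where such a check would sit.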