CONTEXT AND TASKING
In 2007, the Secretary of Defense asked the Director of Operational Test and Evaluation (DOT&E) to take over the responsibility to prescribe policy and procedures for the conduct of live-fire test and evaluation of body armor and helmets. A 2009 report by the Department of Defense’s (DoD) Inspector General recommended that the DOT&E “develop for Department-wide implementation a standard test operations procedure for body armor inserts” that includes “statistical specification of probability of performance and associated confidence in that performance” (DoD IG, 2009). As a result of this recommendation, DOT&E developed and published statistically based test protocols for body armor and for combat helmets, in April and December, 2010, respectively.
In June 2012, Rep. Louise Slaughter (D-NY) sent a letter (Slaughter, 2012)1 to Secretary of Defense Leon Panetta expressing concerns that the new protocol2 for ballistic testing for the Advanced Combat Helmet (ACH) posed “an unacceptably high risk” for such protective equipment. Dr. Michael Gilmore, DOT&E, responded to Rep. Slaughter’s letter (Gilmore, 2012)3 on July 13, 2012. As part of this response, he noted that DOT&E would request the assistance of the National Academies’ National Research Council (NRC) to determine the adequacy of the ballistic helmet testing methodology.
The NRC set up the Committee on Review of Test Protocols Used by the DoD to Test Combat Helmets to consider the technical issues relating to test protocols for military combat helmets and prepare a report. The statement of task for the committee is as follows:
• Evaluate the adequacy of the Advanced Combat Helmet test protocol for both first article testing and lot acceptance testing, including its use of the metrics of probability of no penetration and the upper tolerance limit (used to evaluate backface deformation).
• Evaluate the appropriate use of statistical techniques (e.g., rounding numbers, choosing sample sizes, or test designs) in gathering the data.
• Evaluate the adequacy of the current helmet testing procedure to determine the level of protection provided by current helmet performance specifications.
• Evaluate procedures for the conduct of additional analysis of penetration and backface deformation data to determine whether differences in performance exist.
• Evaluate the scope of characterization testing relative to the benefit of the information obtained.
This report is the result of the committee’s deliberations.
The ACH was introduced by the Army in 2002 and continues to be produced. The advance production order was for 1.08 million helmets, and these are in sustainment. When a manufacturer proposes to produce ACHs for the Army, it submits a sample for first article testing (FAT). If the helmet design passes the FAT, the manufacturer will start production. The produced helmets must be subjected to a lot acceptance test (LAT) for a quality check before the lot is accepted.
The FAT process involves a suite of ballistic shots, with the primary one being 9-mm shots at a specified velocity and at specified helmet locations. Two measures are used to assess the performance of helmets during the test process: resistance to penetration (RTP) and backface deformation (BFD).4
1The full text of Rep. Slaughter’s letter to Secretary Panetta is in Appendix A.
2The December 7, 2010, protocol for first article testing is superseded by the September 20, 2011, protocol for first article testing. This protocol, including the May 4, 2012, protocol for lot acceptance testing, is found in Appendix B.
3The full text of Director Gilmore’s response to Rep. Slaughter is in Appendix A.
FIGURE S-1 Operating characteristic curves for the Army’s and the Director of Operational Test and Evaluation’s first article testing protocols for penetration. The blue lines show the probabilities of acceptance for the two plans when the true probability of penetration is 0.1.
The original Army FAT protocol consisted of 20 9-mm shots (four helmets and shots at five specified locations on a helmet). The helmets were all the same size, and one helmet each was exposed to one of four environmental conditions. A manufacturer’s helmet design was deemed to pass FAT for penetration if there were zero penetrations out of the 20 shots. This is an example of a c-out-of-n test plan in the statistical quality control literature; in this case, c = 0 and n = 20.
The properties of a test plan can be obtained from its operating characteristic (OC) curve, which is a plot of the probability of passing the test (y-axis) as a function of the penetration probability of a single shot (x-axis). The solid black curve in Figure S-1 gives the OC curve for the Army’s 0-out-of-20 plan. The blue line shows that, if the true probability of penetration is 0.10, the probability of passing the test is about5 0.10. This property has been referred to as the 90/90 standard in Director Gilmore’s letter and elsewhere by DOT&E: If the probability of non-penetration is 0.9 or less, then the helmet design has at least a 90 percent chance of failing the FAT.
In developing its protocol, DOT&E decided to increase the number of helmets tested from 4 to 48. Five shots were taken at five different locations on a helmet (as was the case with the Army’s protocol), leading to a total of n = 240 shots. DOT&E applied the same 90/90 standard to get the number of acceptable penetrations as c = 17. In other words, the helmet design passes FAT if there are 17 or fewer penetrations in 240 shots and fails otherwise. The dashed red curve in Figure S-1 shows the OC curve for this plan developed by DOT&E. It can be seen that, if the true probability of penetration is 0.10, the probability of acceptance equals 0.10 (satisfying the 90/90 standard).
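The operating characteristic values quoted for the two plans can be checked directly from the binomial distribution. The following is a minimal Python sketch, assuming that shots are independent and share a single penetration probability p (the usual assumption behind a c-out-of-n plan). The function names are illustrative and do not come from the protocols.

```python
from math import comb


def acceptance_probability(c: int, n: int, p: float) -> float:
    """One OC-curve point: P(at most c penetrations in n independent shots)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(c + 1))


def largest_acceptance_number(n: int, p: float, max_accept: float) -> int:
    """Largest c whose acceptance probability at p does not exceed max_accept.

    Returns -1 when even a zero-penetration rule (c = 0) passes too often.
    """
    c = -1
    while c + 1 <= n and acceptance_probability(c + 1, n, p) <= max_accept:
        c += 1
    return c


if __name__ == "__main__":
    p = 0.10  # penetration probability targeted by the 90/90 standard
    for label, c, n in [("0-out-of-20", 0, 20),
                        ("0-out-of-22", 0, 22),
                        ("17-out-of-240", 17, 240)]:
        print(f"{label}: P(accept | p={p}) = {acceptance_probability(c, n, p):.3f}")
    print("largest c for n=240:", largest_acceptance_number(240, p, 0.10))
```

Under these assumptions, the search over c for n = 240 returns 17, and the 0-out-of-20 plan sits slightly above 0.10 at p = 0.10 while the 0-out-of-22 plan sits just below it, in line with footnote 5.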
It is this change in the protocol, from zero penetrations (out of 20 shots) to allowing as many as 17 penetrations (out of 240 shots), that resulted in Rep. Slaughter’s concern with the safety of Army combat helmets. In his response, Director Gilmore noted that DOT&E’s plan had (essentially6) the same 90/90 property as the Army’s legacy plan. Further, it had better statistical properties because a larger number of helmets and multiple helmet sizes were tested under different environmental conditions, and, therefore, the new protocol was an improvement.
4RTP is a binary outcome indicating whether or not there is a complete penetration of the helmet shell. BFD is measured by the maximum depth of the deformation that is imprinted by the helmet on the clay surface of the headform. (Formal definitions are given in Chapter 5.)
5The actual probability of acceptance for the 0-out-of-20 plan is slightly higher than 0.10. The 0-out-of-22 plan is closer to the 90/90 standard. This was noted in Dr. Gilmore’s response to Rep. Slaughter.
6See footnote 5.
FIGURE S-2 Further comparisons of the operating characteristic curves for the Army’s and the Director of Operational Test and Evaluation’s first article testing protocols for penetration. The blue lines show the probabilities of acceptance for the two plans when the true probability of penetration is 0.1; the purple and green lines show the corresponding acceptance probabilities when the true penetration probabilities are, respectively, 0.005 and 0.05.
Comparison of FAT Protocols for Penetration
The committee first considers FAT protocols for RTP because these were the focus of the correspondence between Rep. Slaughter and Director Gilmore. FAT protocols involving BFD are discussed in Chapter 7. LAT protocols for both RTP and BFD are considered in Chapter 8.
The committee emphasizes an obvious point: The Army’s legacy protocol allowed zero penetrations in 20 shots, but that did not imply that a helmet design that passes FAT has zero probability of penetration.
Further, there are good statistical reasons to justify DOT&E’s increase in the number of helmets tested to 48 helmets from the Army’s 4. One gets more precise estimates of the penetration probability from 240 shots than from 20 shots. In addition, DOT&E’s plan allows better statistical comparison of possible differences between helmet sizes and environmental conditions. So, as pointed out in Dr. Gilmore’s letter, there are indeed advantages associated with increasing the number of helmets tested.
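The gain in precision from the larger test can be illustrated with the standard error of an estimated proportion. This is a minimal Python sketch, assuming independent shots with a common penetration probability; the value p = 0.05 is purely illustrative, and the ratio of the two standard errors does not depend on it.

```python
from math import sqrt


def standard_error(p: float, n: int) -> float:
    """Standard error of the observed penetration fraction from n independent shots."""
    return sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    p = 0.05  # illustrative value only
    se_20 = standard_error(p, 20)
    se_240 = standard_error(p, 240)
    print(f"n=20:  SE = {se_20:.4f}")
    print(f"n=240: SE = {se_240:.4f}")
    print(f"ratio = {se_20 / se_240:.2f}")  # equals sqrt(240 / 20) = sqrt(12)
```

Because the standard error scales as one over the square root of n, moving from 20 to 240 shots shrinks it by a factor of the square root of 12, roughly 3.5.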
However, a key issue is whether the 90/90 standard, which was used to develop the protocol, is appropriate. In addition, that standard specifies only one point on the OC curve in developing the test plan, but, in fact, the whole curve and the plan’s incentives and risks need to be considered. Figure S-2 provides a re-examination of the OC curves for the Army’s and DOT&E’s protocols. As in Figure S-1, the black curve is for the Army’s 0-out-of-20 plan, and the red curve is for DOT&E’s 17-out-of-240 plan. Each curve shows how the probability of accepting a helmet design (y-axis) varies as the underlying probability of penetration (x-axis) varies. As shown in Figure S-1, the two curves cross at a point close to a penetration probability of 0.10 (blue line). To the left of this crossing point, DOT&E’s plan (in red) has higher probabilities of acceptance (passing FAT); to the right it has lower probabilities. In other words, DOT&E’s plan is less stringent (easier to pass) than the original 0-out-of-20 plan if the actual penetration probability is less than 0.10 and more difficult to pass if the penetration probability is higher than 0.10. However, as we will see below, there are more pertinent penetration probabilities at which the plans should be compared.
Data made available to the committee show that manufacturers are currently producing ACHs with penetration probabilities around 0.005 or less (overall, there were 7 penetrations in 12,147 shots; see Chapter 5). This corresponds to the purple line in Figure S-2. At this penetration probability of 0.005, the probability of passing the FAT is close to 1.0 for DOT&E’s protocol (red curve), while it is about 0.9 for the Army’s legacy protocol (black curve). So the manufacturer’s risks (probabilities of not passing the FAT) at a penetration probability of 0.005 are essentially zero and about 0.1, respectively. These are relatively small values, as they should be.
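As a rough check on the pooled counts quoted above, a one-sided exact (Clopper-Pearson) upper confidence bound on the penetration probability can be computed by bisection. This is a sketch only: it assumes all 12,147 shots are independent with a common penetration probability, which the heterogeneity among helmet sizes and shot locations discussed in the report may not support, and the 90 percent confidence level is an illustrative choice rather than one taken from the protocols.

```python
from math import comb


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(at most x penetrations in n independent shots)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(x + 1))


def upper_confidence_bound(x: int, n: int, confidence: float = 0.90,
                           iterations: int = 100) -> float:
    """Exact one-sided upper bound: the p at which P(X <= x) drops to 1 - confidence."""
    if x >= n:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = x / n, 1.0  # the CDF decreases in p, so the root lies in this bracket
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if binomial_cdf(x, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    x, n = 7, 12147  # pooled counts quoted in the text
    print(f"point estimate:  {x / n:.5f}")
    print(f"90% upper bound: {upper_confidence_bound(x, n):.5f}")
```

Under these assumptions both the point estimate and the upper bound fall well below 0.005, which is consistent with the phrase "around 0.005 or less."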
Consider the green line in Figure S-2 that corresponds to a penetration probability of 0.05, an order of magnitude higher than the current penetration level of 0.005. For this value, DOT&E’s plan (red curve) has an acceptance probability of about 0.95, while the Army’s legacy plan (black curve) has a probability of about 0.38. In other words, if manufacturers produce helmets with a penetration probability of 0.05 (as noted, an order of magnitude higher than the current level), they have a 95 percent chance of passing the FAT under the current DOT&E protocol; that is, the government’s risk is 0.95. In comparison, the government’s risk under the Army’s legacy plan is 0.38.
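The manufacturer's and government's risks discussed above are simply points on the two OC curves, so they can be tabulated side by side. The following is a minimal Python sketch under the same simplifying assumption of independent shots with a common penetration probability; the plan labels and function name are illustrative.

```python
from math import comb


def acceptance_probability(c: int, n: int, p: float) -> float:
    """P(at most c penetrations in n independent shots)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(c + 1))


# (acceptance number c, number of shots n) for the two FAT plans in the text
PLANS = {
    "Army legacy 0-out-of-20": (0, 20),
    "DOT&E 17-out-of-240": (17, 240),
}

if __name__ == "__main__":
    for p in (0.005, 0.05, 0.10):
        for label, (c, n) in PLANS.items():
            accept = acceptance_probability(c, n, p)
            # 1 - accept is the manufacturer's risk when p is an acceptable level;
            # accept itself is the government's risk when p is an unacceptable level.
            print(f"p={p:<5} {label:<24} P(accept)={accept:.3f}  P(reject)={1 - accept:.3f}")
```

Computed exactly under these assumptions, the legacy plan's acceptance probability at 0.05 is about 0.36, close to the value of about 0.38 read from the figure, and the qualitative comparison between the two plans is unchanged.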
So the question comes down to the following: What is the appropriate level of penetration probability at which the government’s risk should be controlled? By selecting the 90/90 standard, DOT&E has set this penetration probability at 0.10, a value that is 20 times the 0.005 level at or below which the manufacturers are currently operating.
Now, for business reasons, the manufacturers would want to design a helmet that has a high chance of passing the test while meeting the other helmet criteria such as weight. If there is a high probability of passing the test, even if the penetration probability is an order of magnitude higher than the current levels, manufacturers may not have an incentive to sustain the current levels of penetration-resistance, and, hence, helmet safety could possibly be degraded.
As noted in Chapters 3, 6, and 10, there is currently no scientific basis for linking performance metrics to brain injuries. The report notes, in Chapter 3 and elsewhere, that there is a need to initiate research that connects performance metrics to brain injuries.
Recommendation 3-4. The Department of Defense should vigorously pursue efforts to provide a biomedical basis for assessing the risk of helmet backface injuries.
While these links are being developed, it is important that the performance of new helmet systems is at least as good as previous helmet systems, as measured by current performance metrics.
Recommendation 6-2. If there is a scientific basis to link brain injury with performance metrics (such as penetration frequency and backface deformation), the Director of Operational Test and Evaluation (DOT&E) should use this information to set the appropriate standard for performance metrics in the test protocols. In the absence of such a scientific basis, DOT&E should develop a plan that provides assurance that it leads to the production of helmets that are at least as penetration resistant as currently fielded helmets.
Director Gilmore’s response to Rep. Slaughter notes that the “Services and the U.S. Special Operations Command have endorsed the 90/90 standard for no perforation.”7 Despite this assurance, the committee is concerned that DOT&E’s protocol may have unintended consequences. As noted earlier, under the new DOT&E protocol, there is a high probability of passing the test even if the penetration probability is an order of magnitude higher than the current levels. Therefore manufacturers may not have an incentive to sustain the current levels of penetration resistance.
Of course, future designs of helmets may involve other considerations such as lower weight and added mobility. It is possible that manufacturers and the government have to compromise on the penetration probability levels in order to produce lighter helmets. However, the added benefits of such design changes would have to be studied and demonstrated before one accepts higher levels of penetration. In the case of the ACH, there have been no such design changes.
The Army’s Modified Protocol
In 2012, with DOT&E’s approval, the Army modified the 17-out-of-240 plan to a two-stage protocol. The two stages involve conducting a 0-out-of-22 plan in the first stage, and, if the helmet design passes this test, then a second 17-out-of-218 plan is used, for a total of 240 shots and a combined acceptable number of penetrations of 17. The first stage, the 0-out-of-22 plan, is slightly more stringent than the Army’s 0-out-of-20 legacy plan, so this modified plan provides an incentive for manufacturers to achieve a penetration probability of 0.005 or less.
Finding 6-4. The Army’s modified plan satisfies the criterion that it provides an incentive for manufacturers to produce helmets that are at least as penetration-resistant as current helmets.
The second stage of this plan allows 17 penetrations out of 218 shots, or equivalently, a penetration probability level of 17/218 = 0.08. However, a helmet design with 0.08 penetration probability has a very small chance of being accepted in the first stage, so the two-stage plan will reject such a helmet design.
7Director Gilmore’s letter, reprinted in Appendix A, also noted, “The National Research Council (NRC), in its recent independent technical review of the Department’s testing of body armor, indicated that this approach to testing is scientifically defensible.” It should be emphasized, however, that the Committee on Testing of Body Armor Materials for Use by the U.S. Army—Phase III did not explicitly endorse the 90/90 standard. Further, the standards for helmets should be determined independently of those for body armor.
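The behavior of the Army's two-stage plan can be sketched numerically. The Python example below assumes independent shots with a common penetration probability and reads the plan as described in this section: the design must show zero penetrations in the first 22 shots and then no more than 17 penetrations in the remaining 218 shots. The function names are illustrative.

```python
from math import comb


def acceptance_probability(c: int, n: int, p: float) -> float:
    """P(at most c penetrations in n independent shots)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(c + 1))


def two_stage_acceptance(p: float, n1: int = 22, n2: int = 218, c_total: int = 17) -> float:
    """P(pass stage 1 with zero penetrations) * P(at most c_total penetrations in stage 2)."""
    stage_one = acceptance_probability(0, n1, p)
    stage_two = acceptance_probability(c_total, n2, p)
    return stage_one * stage_two


if __name__ == "__main__":
    for p in (0.005, 0.05, 0.08, 0.10):
        single = acceptance_probability(17, 240, p)
        print(f"p={p:<5} two-stage P(accept)={two_stage_acceptance(p):.3f}  "
              f"17-out-of-240 P(accept)={single:.3f}")
```

Because the first stage multiplies in the probability of a clean run of 22 shots, the two-stage acceptance probability can never exceed that of the 0-out-of-22 plan alone, which is what restores the incentive to keep penetration probabilities low.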
CONSIDERATIONS IN DEVELOPING NEW PROTOCOLS
Although the Army’s modified protocol can be a short-term solution, the committee encourages DOT&E to consider the various findings and recommendations in the report and develop a better alternative to its current protocols. These findings and recommendations are described in Chapters 5 through 9 of the report. Some of the important considerations identified in the report include the following:
• What is the appropriate level at which government’s risk should be controlled? The 90/90 standard implies that it should be controlled at a penetration probability of 0.10. However, manufacturers are currently producing ACHs with a penetration probability of around 0.005 or less, which is substantially lower than 0.10.
Recommendation 6-3. The government’s risk should be controlled at much lower penetration levels than the 0.10 value specified by the 90/90 standard.
• When DoD adopts new helmets with changes to the design (such as lighter weight and added mobility), it will be necessary to reevaluate the protocols. For example, it may not be possible for manufacturers to produce lighter helmets at current levels of penetration.
Recommendation 9-1. When combat helmets with new designs are introduced, the Department of Defense should conduct appropriate characterization studies and cost-benefit analyses to evaluate the design changes before making decisions. It is not advisable to automatically apply the same standard (such as the 90/90 rule or others) across different protective equipment (body armor, helmets, etc.), different numbers of tests (e.g., 96 tests for the Enhanced Combat Helmet, 240 tests for the Advanced Combat Helmet), or over time.
• The current BFD protocols use upper tolerance limits based on the assumption that the data are normally distributed. One has to be cautious in using protocols that are sensitive to such parametric assumptions. Further, the use of pretests to check on assumptions of homogeneity, as has been proposed by DOT&E, would complicate the analysis and, more importantly, the properties of the BFD protocols. When the test sample size is large (as is the case with DOT&E’s proposed plan of 240 shots), it is preferable to use protocols that do not rely on parametric assumptions, are more transparent, and are easier to interpret.
Recommendation 7-1. The Director of Operational Test and Evaluation should revert to the more transparent and robust analysis of backface deformation data based on pass/fail scoring of each measurement.
However, it is important to conduct post-test analysis of the continuous BFD measurements and monitor them over time.
Recommendation 7-3. The Office of the Director, Operational Test and Evaluation, and the Services should analyze the continuous backface deformation measurements, compute the margins, and track them over time to assess any changes over time.
• The different-sized helmets are intrinsically different products with different shells, molds, and manufacturing settings, and consideration should be given to testing them separately. Further, separating by helmet sizes will simplify some of the complexities associated with current test processes.
Recommendation 5-5. Current Office of the Director, Operational Test and Evaluation, protocols should be revised and implemented separately by helmet size.
• Data made available to the committee indicate that there may be considerable differences in the distributions of the BFD data across helmet sizes and shot locations. DOT&E is considering the use of preliminary hypothesis tests on BFD data and pooling the data across the different settings if the hypotheses are not rejected. The committee has reservations about the use of such procedures. The changes to binary data for BFD test plans and the implementation of protocols by helmet size will mitigate the effect of heterogeneity among shot locations.
It was not part of the committee’s charge to offer specific alternative test protocols. However, several alternative plans and their properties are discussed in this report to assist in DOT&E’s efforts to develop an appropriate plan.
DOT&E has indicated that as data are obtained its protocol will be updated and modified. The committee’s findings are in that spirit: Available data indicate that penetrations are rare events (penetration probability of 0.005 or less). Therefore, an alternative protocol has to be developed such that ACH manufacturers have an incentive to maintain that level of penetration-resistance. The 17-out-of-240 FAT protocol does not provide such incentive.
The report compares the performance of DOT&E’s 17-out-of-240 plan with the Army’s legacy plan of 0-out-of-20 at various places. The main reason for such comparisons, as discussed earlier, is that any new plan should lead to the production of helmets that are at least as penetration-resistant as currently fielded helmets. However, the committee reiterates that there are important advantages to the increased test size in DOT&E’s plan compared to the Army’s legacy plan. Any modification to DOT&E’s plan should retain the benefits obtained from the increased test size, although the report does not make any specific recommendation on test size.
ORGANIZATION OF THE REPORT
This report includes 10 chapters and several appendixes. Chapter 1 provides an introduction and overview. Chapter 2 describes the history and evolution of the combat helmet as well as recent advances in design, materials, and manufacturing processes.
Chapter 3 describes historical wounding patterns and recent and emerging threats as well as the biomechanical basis for penetration and blunt trauma. The latter topic is taken up in more detail in Chapter 10, which presents the gaps in medical knowledge of brain injury tolerances relative to current standards of helmet protection. The key findings and recommendations from these two chapters include the following:
• Wounding from an explosive source (including fragmentation from bombs, mines, and artillery) has dominated injuries in all major modern conflicts since World War II. Blast and blunt trauma are increasingly becoming a major source of injuries.
Recommendation 3-1. The Department of Defense should ensure that appropriate threats, in particular fragmentation threats, from current and emerging threat profiles are used in testing.
Recommendation 3-3. The Department of Defense should reassess helmet requirements for current and potential future fragmentation threats, especially for fragments energized by blast and for ballistic threats. The reassessment should examine redundancy among design threats, such as the 2-grain versus the 4-grain and the 16-grain versus the 17-grain. Elimination of tests found to be redundant may allow resources to be directed at a wider diversity of realistic ballistic threats, including larger mass artillery fragments, bullets other than the 9-mm, and improvised explosive device fragments. This effort should also examine the effects of shape, mass, and other parameters of current fragmentation threats and differentiate these from important characteristics of design ballistic threats.
• Unlike body armor, there is not any indirect biomechanical connection between the backface deformation assessment in the current test methodology and brain injuries from behind-helmet deformation.
• Brain injury tolerances determined in the past, and continuing to be developed for vehicle and sports collisions, are based on stresses and stress rates that are significantly different from those for ballistic and blast stresses.
Most of the findings and recommendations in Chapters 3 and 10 are in response to the third point in the committee’s statement of task: Evaluate the adequacy of current testing to determine the level of protection provided by the ACH.
Chapters 4-9 deal primarily with statistical issues. Chapter 4 describes the testing and measurement processes for combat helmets, including the test threats and the different sources of variation. The Phase III report on body armor testing noted the need to conduct a formal gauge repeatability and reproducibility (R&R) study to determine the sources of variation in the test process (NRC, 2012). It appears that such a study has not been done. In view of the costs involved in testing and the benefits to be gained from an R&R study, the committee reiterates the importance of carrying out such a study.
Recommendation 4-1. The Department of Defense should conduct a formal gauge repeatability and reproducibility study to determine the magnitudes of the sources of test variation, particularly the relative contributions of the various sources from the testing methodology versus the variation inherent in the helmets. The Army and the Office of the Director, Operational Test and Evaluation, should use the results of the gauge repeatability and reproducibility study to make informed decisions about whether and how to improve the testing process.
Chapter 5 provides a formal definition of the performance measures—resistance to penetration (RTP) and backface deformation (BFD)—and discusses their limitations. The results from analyses of FAT and LAT data made available to the committee are also described here. These data showed considerable heterogeneity among helmet sizes and shot locations.
Chapters 6 and 7 are concerned with the evaluation and comparison of FAT protocols for RTP and BFD, respectively. Most of the key findings and recommendations from these chapters are summarized above.
Chapter 8 deals with LAT, with major findings and recommendations that mirror those for FAT. In addition, Chapter 8 describes how the current LAT protocols can be modified to conform to the American National Standards Institute standard.
Chapter 9 responds to the committee’s charge to evaluate the scope of current characterization testing and recommend additional studies. A number of additional characterization studies for new helmet designs as part of a broader program on characterization are suggested.
The committee commends the Director of Operational Test and Evaluation and his office for their efforts to bring scientific rigor to the testing of combat helmets. These efforts are of critical importance to the safety and morale of the men and women of the U.S. armed services. The committee also applauds Rep. Slaughter for her active oversight in this area.
The overarching messages in this report are:
• There is an urgent need for the Department of Defense to establish a research program to develop helmet test metrics that have a clear scientific link to the modes of human injury from ballistic impact, blast, and blunt trauma.
• It is critical that test profiles for combat helmets be modified to include appropriate threats from current and emerging threat profiles.
• The development of test protocols must be based on appropriately derived OC curves, where such curves will likely be unique to each helmet type and design, which is intentionally chosen to match current technology capability and the needs of the soldier on the battlefield. Further, it is important that the design of test plans focus on that region of the OC curve at which the helmet is expected to perform.
Throughout the course of the committee’s research and deliberations, it became quite clear that DOT&E’s and the Army’s goal is to ensure that combat helmets (and all personal protective equipment) are manufactured and tested to the highest possible standards. It is the committee’s hope that this report helps DOT&E and DoD in their continued pursuit of this goal.
DoD IG (Department of Defense Inspector General). 2009. D-2009-047. DoD Testing Requirements for Body Armor. Washington, D.C.: Department of Defense.
Gilmore, J.M. 2012. Letter from J. Michael Gilmore, Director, Operational Test and Evaluation, to Representative Louise M. Slaughter, July 13.
NRC (National Research Council). 2012. Phase III Report on Review of the Testing of Body Armor Materials for Use by the U.S. Army. Washington, D.C.: The National Academies Press.
Slaughter, L.M. 2012. Letter from Representative Louise M. Slaughter to Secretary of Defense Leon Panetta, June 26.