CHAPTER 5 COMPARING TIME HISTORIES

INTRODUCTION

Comparing the correspondence between curves from physical experiments and mathematical models is a very important and common technique used by scientists and engineers to determine if the mathematical models adequately represent physical phenomena. Two common reasons for which shapes are compared are the verification or validation of computational results and the assessment of the repeatability of experimental tests. In the former case, an experimental and a numerical curve are compared in order to assess how well the numerical model predicts a physical phenomenon, while in the latter case two or more experimental curves are compared in order to assess whether they represent the same or similar physical events.

A traditional technique has been to visually compare curves by matching peaks, oscillations, common shapes, etc. Although this kind of comparison gives an impression of how similar two curves are, it is based on a purely subjective judgment which could vary from one analyst to another. With subjective methods, validation is in the eye of the beholder. Validation and verification decisions need to be based as much as possible on quantitative criteria that are unambiguous and mathematically precise. In order to minimize the subjectivity, it is necessary to define objective comparison criteria based on computable measures. Comparison metrics, which are mathematical measures that quantify the level of agreement between simulation outcomes and experimental outcomes, can accomplish this goal.

Several comparison metrics have been developed in different engineering domains. Metrics can be grouped into two main categories: (i) deterministic metrics and (ii) stochastic metrics. Deterministic metrics do not specifically address the probabilistic variation of either the experiment or the calculation (i.e., for deterministic metrics the calculated results are the same every time given the same input), while stochastic metrics involve computing the likely variation in both the simulation and the experiment response due to parameter variations. Deterministic metrics found in the literature can be further classified into two main types: (a) domain-specific metrics and (b) shape-comparison metrics. Domain-specific metrics are quantities specific to a particular application. For example, the axial crush of a railroad car in a standard crash test might be a metric that is useful in designing rolling stock but has no relevance to other applications. Similarly, the occupant impact velocity in Report 350 is an important evaluation criterion in roadside safety but has no relevance in other domains of structural mechanics. Shape-comparison metrics, on the other hand, involve a comparison of two curves: a curve from a numerical simulation and one from a physical experiment in the case of validation. The curves may be time histories, force-deflection plots, stress-strain plots, etc. Shape-comparison

metrics assess the degree of similarity between any two curves in general and, therefore, do not depend on the particular application domain.

In roadside safety, comparisons between several tests or between test and simulation results have mainly used domain-specific metrics (e.g., occupant severity indexes, changes in velocity, 10-msec average accelerations, maximum barrier deflection, etc.).(132) The advantage of this method is that the user can use the same domain-specific metrics that are already used to evaluate experiments to compare test and simulation results. Although the comparison of domain-specific metrics can give an idea of how close two tests or a test and a simulation are, shape-comparison metrics are another valuable tool since they can be used to directly evaluate the basic measured response of the structures, such as acceleration and velocity time histories. In roadside safety, many domain-specific metrics with numerical values (e.g., occupant risk, ride-down acceleration, change in velocity, etc.) are derived from the acceleration time histories, so if the acceleration time history information is valid, any metric derived from the time history data should also be valid.

A computer program is described in this chapter which automatically evaluates the most common shape-comparison metrics found in the literature.(3,6,7,27,35,133-139) The program, called the Roadside Safety Verification and Validation Program (RSVVP), was developed to evaluate metrics used in the verification and/or validation of numerical models in the roadside safety field or for comparing repeated crash tests. In order to correctly evaluate the shape-comparison metrics, a series of preprocessing tasks are necessary to ensure a correct comparison of the two curves. These preprocessing tasks are implemented in the code before the actual metrics are calculated. The following sections describe the preprocessing steps implemented in the RSVVP code, the numerical implementation of the metrics and the post-processing operations implemented to present the results. In the second section, the results obtained by comparing some simple analytical curves are presented and discussed. In the last section, the choice of metrics used in roadside safety and the acceptance criteria used to judge comparisons are presented. The RSVVP code was written in Matlab; the user can input the data and select the various options using a series of intuitive graphical interfaces.(131) The user's manual for RSVVP, included in Appendix A, provides step-by-step instructions for using the program to compare time histories. The programmer's manual, included in Appendix B, documents the algorithms and structure of the program.

RSVVP PREPROCESSING

Since the two curves being compared will come from different sources (e.g., a crash test and a finite element simulation), it is necessary to preprocess the curves in the same way to ensure that any differences are not the result of filtering, sampling rate, sensor bias or sensor

drift. Some preprocessing operations, like re-sampling and trimming of the two curves, are essential since the curves must have the same length and be comparable point to point. Other preprocessing steps, like filtering and sensor bias adjustments, though not strictly necessary, can play an important role in the final comparison result. For example, two identical curves that are simply shifted in time with respect to each other because the data was recorded with a different start time could produce a poor result just because of the initial offset between the curves. A synchronizing operation that ensures the two curves start at the same point in time is another important preprocessing step. While all these preprocessing steps could be performed prior to the comparison, they have been included in the RSVVP code to make an easy-to-use program that can perform all the steps needed to generate an accurate comparison starting from the raw electronic data. The RSVVP program performs the following preprocessing operations:
• Filtering,
• Re-sampling (i.e., ensuring that the interval between data samples is the same in both curves),
• Synchronizing (i.e., ensuring that the two curves start at the same point) and
• Trimming (i.e., the curves are trimmed so they have the same length).
The user has the option of skipping any or all of these steps. For example, the input data for one curve may already be filtered so it would be unnecessary to re-filter the data. Similarly, if the user knows the two curves are sampled at the same frequency there is no need to re-sample the curves. A brief description of all the preprocessing tasks performed by the RSVVP code before evaluating the comparison metrics is presented in the following sections.

Filtering

Filtering the curves is the first preprocessing step. In the case of accelerations collected in crash tests, the data are characterized by some level of high-frequency noise (a.k.a. ringing) which does not reflect the true overall dynamics of the crash. The curves must be filtered before calculating the comparison metrics, although the filtering may be done inside RSVVP or with some other program (e.g., TRAP). RSVVP digitally filters the time histories according to SAE J211, the same reference standard for filtering used in NCHRP Report 350.(140) The user can choose between the most common Channel Frequency Classes (CFC) and even define custom filter specifications if necessary. By default, the filter option is disabled to give the user the choice to filter data using an external program, since most common crash test evaluation programs like TRAP perform filtering. The RSVVP filter function is a digital four-pole Butterworth low-pass filter. The algorithm uses a double-pass filtering option (i.e., forward and backward): data are filtered twice, once forward and once backward, using the following difference equation in the time domain:

$$Y(t) = a_0 X(t) + a_1 X(t-1) + a_2 X(t-2) + b_1 Y(t-1) + b_2 Y(t-2) \quad (1)$$

where

$X(t)$ = input data sequence
$Y(t)$ = filtered output sequence

The filter coefficients vary with the particular CFC value chosen and are calculated using the following formulas:

$$a_0 = \frac{\omega_a^2}{1 + \sqrt{2}\,\omega_a + \omega_a^2} \quad (2)$$

$$a_1 = 2 a_0 \quad (3)$$

$$a_2 = a_0 \quad (4)$$

$$b_1 = \frac{-2\,(\omega_a^2 - 1)}{1 + \sqrt{2}\,\omega_a + \omega_a^2} \quad (5)$$

$$b_2 = \frac{-1 + \sqrt{2}\,\omega_a - \omega_a^2}{1 + \sqrt{2}\,\omega_a + \omega_a^2} \quad (6)$$

where

$$\omega_d = 2\pi \cdot \mathrm{CFC} \cdot 2.0775 \quad (7)$$

$$\omega_a = \frac{\sin(\omega_d T/2)}{\cos(\omega_d T/2)} \quad (8)$$

and $T$ is the sampling period. In order to avoid the typical scatter at both the beginning and the end of the filtered time histories due to the application of the difference equation (i.e., Eq. 1), a head and a tail consisting of a simple repetition of the first and last data values are added to the original data sets. Once the modified data sets are filtered, the head and tail are deleted from the final filtered curve. The length of the head and tail is equal to the closest integer approximation of the curve frequency divided by 10.
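The double-pass filtering scheme and the coefficient formulas above can be illustrated with a short sketch. The following Python snippet is an illustrative sketch only, not the Matlab filter function used in RSVVP; the function name, the use of NumPy, and the interpretation of the padding length as the sampling frequency divided by 10 are assumptions made for the example.

```python
import numpy as np

def cfc_filter(x, T, cfc=180):
    """Two-pass (forward/backward) SAE J211-style low-pass filter sketch.

    x   : raw data sequence (e.g., accelerations)
    T   : sampling period in seconds
    cfc : Channel Frequency Class (e.g., 180 for rigid-body accelerations)
    """
    x = np.asarray(x, dtype=float)

    # Filter coefficients per Eqs. (2)-(8)
    wd = 2.0 * np.pi * cfc * 2.0775
    wa = np.sin(wd * T / 2.0) / np.cos(wd * T / 2.0)
    denom = 1.0 + np.sqrt(2.0) * wa + wa**2
    a0 = wa**2 / denom
    a1 = 2.0 * a0
    a2 = a0
    b1 = -2.0 * (wa**2 - 1.0) / denom
    b2 = (-1.0 + np.sqrt(2.0) * wa - wa**2) / denom

    # Add a head and tail of repeated end values to limit start/end scatter
    npad = int(round((1.0 / T) / 10.0))   # assumed reading of "curve frequency divided by 10"
    xp = np.concatenate([np.full(npad, x[0]), x, np.full(npad, x[-1])])

    def one_pass(u):
        y = np.zeros_like(u)
        y[0], y[1] = u[0], u[1]            # seed the recursion with the input values
        for t in range(2, len(u)):
            y[t] = (a0 * u[t] + a1 * u[t - 1] + a2 * u[t - 2]
                    + b1 * y[t - 1] + b2 * y[t - 2])   # difference equation, Eq. (1)
        return y

    y = one_pass(xp)                       # forward pass
    y = one_pass(y[::-1])[::-1]            # backward pass
    return y[npad:len(y) - npad]           # strip the added head and tail
```

Filtering once forward and once backward cancels the phase lag that a single pass of the Butterworth filter would otherwise introduce, which is why the double-pass option is used.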

Both Report 350 and MASH suggest using SAE J211 Part 1 to determine the appropriate filter specifications for electronic data from crash tests. In general, acceleration data used to document the rigid body motion of the vehicle, and also intended for integration to obtain velocities, should be filtered according to CFC Class 180. Since Report 350 and MASH evaluation criteria like the occupant risk are based on velocities integrated from accelerations, the CFC 180 filter class should generally be used in performing V&V comparisons.

Re-Sampling

Since most shape-comparison metrics are based on point-to-point comparisons (i.e., the data at each sampling point is compared to the corresponding point in the other curve), the two curves must have the same sampling rate to ensure the points match up in time. After the time histories have been filtered, RSVVP checks the two sets of data to determine if they have been sampled at the same rate (note: while the curves can represent any type of data, this report generally refers to the curves as time histories since in roadside safety computational mechanics the curves being compared will generally be acceleration or velocity time histories). The re-sampling operation is managed by a subroutine which checks whether the sets of data corresponding to the two curves have the same sampling period within a fixed tolerance of 5E-6 sec. If the curves do not have the same sampling rate, RSVVP resamples the curve which has the lower sampling rate (i.e., the larger difference in time between two contiguous data points) at the higher rate of the other curve. The resampling is performed by means of a simple linear interpolation between points.
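A minimal sketch of the re-sampling check and the linear interpolation just described, assuming NumPy arrays of times and values for the two curves; the function name, the tolerance handling and the use of the median sample spacing are illustrative assumptions, not the RSVVP implementation.

```python
import numpy as np

def resample_to_common_rate(t1, y1, t2, y2, tol=5e-6):
    """Re-sample the more coarsely sampled curve onto the finer time base.

    (t1, y1) and (t2, y2) are time/value arrays for the two curves.
    """
    dt1 = np.median(np.diff(t1))
    dt2 = np.median(np.diff(t2))
    if abs(dt1 - dt2) <= tol:
        return t1, y1, t2, y2              # sampling periods already match
    if dt1 > dt2:                          # curve 1 has the lower sampling rate
        y1 = np.interp(t2, t1, y1)         # linear interpolation onto curve 2's times
        t1 = t2.copy()
    else:                                  # curve 2 has the lower sampling rate
        y2 = np.interp(t1, t2, y2)
        t2 = t1.copy()
    return t1, y1, t2, y2
```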

Synchronizing

Usually the time history curves to be compared do not start at the same time and, hence, the two curves are shifted by a fixed value along the abscissa (i.e., the time axis). As the comparison metrics are generally point-to-point comparisons, the time shift between the two curves must be identified and corrected to ensure that corresponding start points are matched during the metric evaluation. For example, data from a crash test may begin to be recorded 20 msec before the initial impact whereas data from a computer simulation of the same impact may begin at first contact between the vehicle and barrier. The two curves must be adjusted in time to make sure that the actual impact point occurs at the same time in both curves. Figure 43 shows an example where a test and a simulation curve are shifted by an amount "s." The user is given the option in RSVVP to choose between two different methods of synchronizing: (1) the minimum area between the curves or (2) the least square error method. A Matlab function called shift was created which shifts either one of the two curves by a value s, with a positive value of s meaning a forward shift for the test curve and a negative value meaning a forward shift for the true curve. In both cases, positive or negative shift values, the tail of the curve which is shifted is cut by a length equal to the shift value s, as is the head of the other curve which is not shifted. This way, a positive shift value is equivalent to a forward translation of the test curve while a negative shift value is equivalent to a backward translation of the test curve.

Figure 43. Shift between a test and simulation time history: (a) positive offset, (b) negative offset.

Once the shift function had been defined, two other Matlab functions were created which evaluate, respectively, the square root of the sum of the squared residuals (i.e., the differences between the values of the test and true curves at each instant in time) and the absolute area of the residuals as a function of the shift value s. RSVVP identifies the shift value which minimizes either the absolute area of the residuals (method 1) or the sum of squared residuals (method 2); the shift value corresponding to the minimum error is the most probable matching point between the curves. Once the synchronization process is complete, the user can inspect the synchronized curves and, if the result is not satisfactory, repeat the synchronization procedure using a different initial shift value for the minimization algorithm or using the other minimization method. The two methods are very closely related and generally give essentially the same result.

Trimming

After the two curves have been re-sampled, filtered and synchronized, RSVVP checks that they have the same length and, in the case of different lengths, the longer curve is trimmed to the same size as the shorter curve. At the conclusion of these preprocessing steps, the shape-comparison metrics can be calculated.
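The least-square-error synchronization option can be sketched as a search over candidate shifts for the one that minimizes the sum of squared residuals between the overlapping portions of the two curves. This Python sketch is illustrative only (it is not the Matlab shift function used in RSVVP); the integer-sample search range is an assumption.

```python
import numpy as np

def best_shift(test, true, max_shift=200):
    """Return the sample offset that minimizes the sum of squared residuals
    between the overlapping portions of the test and true curves."""
    best_s, best_err = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        # Candidate alignment with an offset of s samples between the curves
        if s >= 0:
            a, b = test[s:], true[:len(true) - s]
        else:
            a, b = test[:len(test) + s], true[-s:]
        n = min(len(a), len(b))
        if n == 0:
            continue
        err = np.sum((a[:n] - b[:n]) ** 2)   # sum of squared residuals for this shift
        if err < best_err:
            best_s, best_err = s, err
    return best_s
```

Once the curves are aligned at the best shift, the non-overlapping head and tail samples are discarded so that both curves have the same length, which corresponds to the trimming step described above.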

METRICS

A brief description of the shape-comparison metrics used in RSVVP is presented in this section. All fifteen metrics described are deterministic shape-comparison metrics. Details about the mathematical formulation of each metric can be found in the cited literature. Conceptually, the metrics evaluated can be classified into three main categories: (i) magnitude-phase-composite (MPC) metrics, (ii) single-value metrics and (iii) analysis of variance (ANOVA) metrics.

RSVVP has two main parts: a generic shape-comparison tool and a comparison tool specific to roadside safety crash tests. The generic shape-comparison tool includes all 15 metrics described herein whereas the roadside safety analysis tool only uses the Sprague-Geers MPC and the ANOVA metrics. The rationale for selecting these metrics for roadside safety evaluations is presented in the last section of this chapter along with a discussion of appropriate acceptance criteria. Analysts and researchers can use the first part of RSVVP to perform comparisons of any two shapes using any of the shape-comparison metrics. The ability to perform general shape-metric comparisons was retained in RSVVP in order to provide a tool that can be used to validate parts, subassemblies and assemblies while developing roadside hardware or vehicle PIRTs. The second part of RSVVP is specifically intended for comparing time histories that represent full-scale vehicle crash tests of roadside hardware.

MPC Metrics

MPC metrics treat the curve magnitude and phase separately using two different metrics (i.e., M and P, respectively). The M and P metrics are then combined into a single-value comprehensive metric, C. The following MPC metrics are included in RSVVP, with the formulations shown in Table 14: (a) Geers (the original formulation and two variants), (b) Russell and (c) Knowles and Gear.(31,32,33,133,134) In this and the following sections, the terms m_i and c_i refer to the measured and computed quantities respectively, with the subscript "i" indicating a specific instant in time. This notation assumes that the measured data points (i.e., m_i) are the "true" data and the computed data points (i.e., c_i) are the data points being tested in the comparison.

In all MPC metrics, the phase component (P) should be insensitive to magnitude differences but sensitive to differences in phasing or timing between the two time histories. Similarly, the magnitude component (M) should be sensitive to differences in magnitude but relatively insensitive to differences in phase. These characteristics of MPC metrics allow the analyst to identify the aspects of the curves that do not agree. For each component of the MPC metrics, zero indicates that the two curves are identical. Each of the MPC metrics differs slightly in its mathematical formulation. The different variations of the MPC metrics are primarily distinguished by the way the phase metric is computed, how it is scaled with respect to the magnitude metric and how it deals with synchronizing the phase. In particular, the Sprague and Geers metric uses the same phase component as the Russell metric.(33,134) Also, the magnitude component of the Russell metric is peculiar in that it is based on a base-10 logarithm, and it is the only MPC metric among those considered in this chapter that is symmetric (i.e., the order of the two curves is irrelevant). The Knowles and Gear metric is the most recent variation of

MPC-type metrics.(32,135) Unlike the previously discussed MPC metrics, it is based on a point-to-point comparison. This metric requires that the two compared curves first be synchronized in time based on the so-called Time of Arrival (TOA), which represents the time at which a curve reaches a certain percentage of its peak value. In this work the percentage of the peak value used to evaluate the TOA was five percent, which is the typical value found in the literature. Once the curves have been synchronized using the TOA, it is possible to evaluate the magnitude metric. Also, in order to avoid creating a gap between time histories characterized by a large magnitude and those characterized by a smaller one, the magnitude component M is normalized using the normalization factor QS.

Table 14. Definition of MPC metrics (magnitude, phase and comprehensive components for the integral comparison metrics Geers, Geers CSA, Sprague & Geers and Russell, and for the point-to-point comparison metric Knowles & Gear; formula cells not reproduced here).
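Because the Sprague & Geers metric is the MPC metric recommended later in this chapter, its standard published formulation is summarized below for reference (this is the formulation from the cited literature, not a transcription of Table 14). Here m(t) denotes the measured ("true") curve, c(t) the computed curve, and the integrals are taken over the common duration of the two curves:

$$
\psi_{mm}=\int m^2(t)\,dt,\qquad
\psi_{cc}=\int c^2(t)\,dt,\qquad
\psi_{mc}=\int m(t)\,c(t)\,dt
$$

$$
M_{SG}=\sqrt{\frac{\psi_{cc}}{\psi_{mm}}}-1,\qquad
P_{SG}=\frac{1}{\pi}\cos^{-1}\!\left(\frac{\psi_{mc}}{\sqrt{\psi_{mm}\,\psi_{cc}}}\right),\qquad
C_{SG}=\sqrt{M_{SG}^{2}+P_{SG}^{2}}
$$

A perfect match gives M = P = C = 0, and the components are commonly reported as percentages, which is consistent with the values listed later in Table 16 (e.g., M of about 20 for the 20 percent magnitude-error test).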

Single-Value Metrics

Single-value metrics give a single numerical value that represents the agreement between the two curves. Eight single-value metrics are included in RSVVP:
1. The correlation coefficient metric,
2. The NARD correlation coefficient metric,
3. The weighted integrated factor (WIFac) metric,
4. The Zilliacus error metric,
5. The RMS error metric,
6. Theil's inequality metric,
7. Whang's inequality metric and
8. The regression coefficient metric.(27,136-139)
The first three metrics are based on integral comparisons while the others are point-to-point comparisons. The definition of each metric is given in Table 15 and in the discussion that follows.

Table 15. Definition of single-value metrics (formula cells not reproduced here). Integral comparison metrics: correlation coefficient, correlation coefficient (NARD) and weighted integrated factor. Point-to-point comparison metrics: Zilliacus error, RMS error, Theil's inequality, Whang's inequality and regression coefficient.

ANOVA Metrics

ANOVA metrics are based on the assumption that if two curves represent the same event, then any differences between the curves must be attributable only to random experimental error. The analysis of variance (i.e., ANOVA) is a standard statistical test that assesses whether the

variance between two curves can be attributed to random error.(35,36) When two time histories represent the same physical event, both should be identical such that the mean residual error and the standard deviation of the residual errors are both zero. Of course, this is never the case in practical situations (e.g., experimental errors cause small variations between tested responses even in identical tests). The conventional T statistic provides an effective method for testing the assumption that the observed residual errors are close enough to zero to represent only random errors. Both Oberkampf and Ray independently proposed similar methods. In Ray's version of the ANOVA, the residual error and its standard deviation are normalized with respect to the peak value of the true curve. Using this method to compare six repeated frontal full-scale crash tests, Ray proposed the following acceptance criteria:(36)
• The average residual error normalized by the peak response (i.e., $\bar{e}^r$) should be less than five percent.
• The standard deviation of the normalized residuals (i.e., $\sigma^r$) should be less than 20 percent.
• The t-test on the distribution of the normalized residuals should not reject the null hypothesis that the mean value of the residuals is zero for a paired two-tail t-test at the five-percent level, $t_{0.005,\infty}$ (i.e., the 90th percentile):

$$T = \frac{\bar{e}^r \sqrt{n}}{\sigma^r} \quad (9)$$

where n is the number of samples.
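The ANOVA quantities just defined can be computed compactly. The following Python snippet is an illustrative sketch, not the RSVVP implementation; the function name and the use of NumPy are assumptions, and it presumes the two curves have already been preprocessed onto a common time base.

```python
import numpy as np

def anova_metrics(true_curve, test_curve):
    """Normalized residual statistics following the ANOVA formulation above.

    Residuals are normalized by the peak magnitude of the true curve.
    Returns the mean residual, the standard deviation of the residuals
    and the T statistic of Eq. (9).
    """
    true_curve = np.asarray(true_curve, dtype=float)
    test_curve = np.asarray(test_curve, dtype=float)
    residuals = (test_curve - true_curve) / np.max(np.abs(true_curve))
    n = residuals.size
    e_bar = residuals.mean()                  # average normalized residual error
    sigma = residuals.std(ddof=1)             # standard deviation of the residuals
    t_stat = e_bar * np.sqrt(n) / sigma       # Eq. (9)
    return e_bar, sigma, t_stat
```

Per the criteria above, the mean normalized residual should be within five percent of zero, its standard deviation below the stated limit, and the T statistic below the critical t value.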

APPLICATION TO SIMPLE ANALYTICAL CURVES

DEFINITION OF TEST FUNCTIONS

RSVVP was used to compare pairs of ideal analytical curves differing only in magnitude or phase, as described in a recent work by Schwer.(32) These examples illustrate the use of the RSVVP program and also provide some insight into the features of the different metrics calculated by RSVVP. The baseline analytical curve used as a reference in both the magnitude and phase comparisons is referred to in Figure 44 as the "true" curve, while the curves differing respectively in phase or magnitude are referred to as the "test" curves. The true curve was defined by the following decayed sinusoid:

$$m(t) = e^{-(t-\tau)} \sin 2\pi (t-\tau) \quad (10)$$

where the parameter τ was used to create a phase shift.

Figure 44. Analytical wave forms created for (a) the magnitude test and (b) the phase test.

Following Schwer's work, two different tests were performed: (a) a curve with the same phase but an amplitude 20 percent greater than the true curve (i.e., the magnitude-error test) and (b) a curve with the same magnitude but out of phase by +/- 20 percent with respect to the true curve (i.e., the phase-error test).(32) The analytical forms used for the magnitude-error test were:

$$m(t) = e^{-(t-0.14)} \sin 2\pi (t-0.14), \qquad c(t) = 1.2\, e^{-(t-0.14)} \sin 2\pi (t-0.14) \quad (11)$$

while the analytical forms used for the phase-error test were:

$$m(t) = e^{-(t-0.14)} \sin 2\pi (t-0.14), \qquad c(t) = e^{-(t-0.04)} \sin 2\pi (t-0.04) \quad (12)$$

and

$$m(t) = e^{-(t-0.14)} \sin 2\pi (t-0.14), \qquad c(t) = e^{-(t-0.24)} \sin 2\pi (t-0.24) \quad (13)$$

In both cases, the sampling period was 0.02 sec, the start time was zero and the ending time was 2 sec. Figure 44 shows the graphs of the true and test curves used for the magnitude-error and the phase-error tests.
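As a quick check of the metric behavior reported in the next section, the magnitude-error test of Eq. (11) can be reproduced numerically with the Sprague & Geers formulation given earlier. This is an illustrative NumPy-based sketch, not RSVVP itself; it should recover a magnitude component of roughly 20 percent and a phase component near zero.

```python
import numpy as np

t = np.arange(0.0, 2.0 + 1e-9, 0.02)                        # sampling period 0.02 s, 0 to 2 s
m = np.exp(-(t - 0.14)) * np.sin(2 * np.pi * (t - 0.14))    # "true" curve of Eq. (11)
c = 1.2 * m                                                  # magnitude-error test curve of Eq. (11)

psi_mm = np.trapz(m * m, t)
psi_cc = np.trapz(c * c, t)
psi_mc = np.trapz(m * c, t)

M = np.sqrt(psi_cc / psi_mm) - 1.0                           # Sprague & Geers magnitude component
P = np.arccos(np.clip(psi_mc / np.sqrt(psi_mm * psi_cc), -1.0, 1.0)) / np.pi   # phase component
C = np.sqrt(M**2 + P**2)                                     # comprehensive component

print(round(100 * M, 1), round(100 * P, 1), round(100 * C, 1))   # approximately 20.0, 0.0, 20.0
```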

MPC METRIC RESULTS

The curves used for the magnitude test are shown in Figure 44 and the values for the 15 shape-comparison metrics evaluated using RSVVP are listed in Table 16. The M components of the MPC metrics are supposed to be insensitive to phase changes and sensitive to magnitude changes, and this appears to be true according to the metric values shown in Table 16. An identical match would result in M, P and C scores of zero for all the MPC metrics. The Geers, Geers CSA, Sprague-Geers and Knowles-Gear M components are all 20 percent, as they should be, and the P components are near zero, as they should be. The M component of the MPC metrics can, therefore, be considered to be an estimate of the percent difference in magnitude. An M score of 20 can be interpreted as a magnitude difference of roughly 20 percent. The Russell M metric was found to be 13.6 with a P value of 0.0, so the Russell metric does not scale exactly with the percent of magnitude difference.

Similarly, the P component of the MPC metrics should be insensitive to magnitude and sensitive to phase shift. As shown in the right two columns of Table 16, this also appears to be the case for the phase test of this simple analytical shape. The Geers, Geers CSA, Sprague-Geers and Russell phase components, P, all result in scores of around 20 percent (i.e., 18.2 to 19.5), so the phase component, P, of the MPC metrics can be interpreted as the percent of phase difference. The P value for the Knowles-Gear metric was 62.5, which indicates that magnitude and phase scores represent different levels of error for the Knowles-Gear metric (i.e., a 20 percent magnitude shift results in an M of 20 and a 20 percent phase shift results in a P value of 62.5). The phase test was performed for both a leading and a lagging test curve. The results of the P component of the MPC metrics are exactly the same regardless of whether the test curve is leading or lagging the true curve. The P component, therefore, does not provide any information about the direction of the phase shift, only the amount of the phase shift. Table 16 also shows that there is very little difference between the values of each of the MPC metrics, particularly the Geers, Geers CSA and Sprague-Geers.

Lastly, the C component of the MPC metrics is simply the vector combination of the M and P components. The C component is obtained by taking the square root of the sum of the squares of M and P. The Geers, Geers CSA, Sprague-Geers and Russell metrics all produce similar results so there is no reason to use more than one of them. This is not surprising since the mathematical formulations for all of these metrics, as shown in Table 14, are very similar. Metrics that scale magnitude and phase similarly are easier to interpret, so the Knowles-Gear metric is not preferred. One of the advantages of the Knowles-Gear metric is that it is formulated to account for unsynchronized signals, but if the synchronization process provided in the preprocessing step of RSVVP is used prior to making the comparison calculations, there is no need to use the time-of-arrival technique in the Knowles-Gear metric. Likewise, metrics where the score directly represents the magnitude or phase shift are easier to interpret, so the Russell metric is not

preferred. The Sprague-Geers MPC metric has gained some popularity in other areas of computational mechanics, so it is the MPC metric recommended for roadside safety computational mechanics. The only value of the C component of the MPC metric is to provide a single value for comparing the curves. Since there is useful diagnostic information in the M and P components, RSVVP calculates and reports both. The analyst can use the information in the M and P components to look for errors in their models, but the C component has no direct diagnostic value. In roadside safety comparisons it is recommended that the Sprague-Geers M and P metrics be used to compare crash test time histories.

Table 16. Comparison metrics for the analytical curves for (1) the magnitude test and (2) the phase test.

RSVVP Metric                          Magnitude +20%   Phase -20%   Phase +20%
MPC Metrics
  Geers Magnitude                          20.0            0.1         -0.5
  Geers Phase                               0.0           18.2         18.2
  Geers Comprehensive                      20.0           18.2         18.2
  Geers CSA Magnitude                      20.0            0.1         -0.5
  Geers CSA Phase                           0.0           18.2         18.2
  Geers CSA Comprehensive                  20.0           18.2         18.2
  Sprague-Geers Magnitude                  20.0            0.1         -0.5
  Sprague-Geers Phase                       0.0           19.5         19.5
  Sprague-Geers Comprehensive              20.0           19.5         19.5
  Russell Magnitude                        13.6            0.1         -0.4
  Russell Phase                             0.0           19.5         19.5
  Russell Comprehensive                    12.0           17.3         17.3
  Knowles-Gear Magnitude                   20.0            0.0          0.0
  Knowles-Gear Phase                        0.0           62.5         62.5
  Knowles-Gear Comprehensive               18.3           25.5         25.5
Single-Value Metrics
  Whang's Inequality                        9.1           30.7         30.6
  Theil's Inequality                        9.1           30.2         30.2
  Zilliacus Error Metric                   20             61.8         60.4
  RMS Error Metric                         20             60.5         60.3
  WIFac                                    16.7           48.8         48.8
  Regression Coefficient                   97.9           78.9         79.1
  Correlation Coefficient                 100             81.0         80.9
  NARD Correlation Coefficient            100             81.7         81.8
ANOVA Metrics
  Average Residual Error                    0.02           0.0          0.0
  Standard Deviation of Residuals           0.09           0.26         0.26
  T Score                                   2.08          -0.17         0.35

SINGLE-VALUE METRICS RESULTS

The single-value metrics are listed in the middle portion of Table 16. The correlation coefficient, NARD correlation coefficient and regression coefficient all result in a score of unity when the two curves are identical. For the magnitude test, the regression coefficient is 97.9 and both forms of the correlation coefficient are 100. Correlation suggests that two curves can be linearly transformed into each other; it does not mean that they are identical curves. Two straight lines with different slopes, for example, have a 100 percent correlation. The magnitude test results show that the two correlation coefficients and the regression coefficient are not sensitive to changes in magnitude since all three result in either perfect or nearly perfect scores (i.e., 100). The results are similar, though not as good, in the phase tests. The three correlation-type single-value metrics result in values between 78.9 and 81.8, indicating fairly high correlation. If the scores in the phase test for these three metrics are subtracted from 100, a value near 20 is obtained, indicating that these metrics are fairly direct measures of phase shift. The correlation-type metrics appear to be insensitive to magnitude shifts and directly sensitive to the amount of phase shift. It should also be pointed out that the NARD version of the correlation coefficient is identical to one minus the P component of the Geers and Geers CSA metrics and is also closely related to the Sprague-Geers P component. Since the phase information detected by the correlation, NARD correlation and regression coefficients is captured equally well in the P component of the Sprague-Geers metrics, there is no reason to routinely calculate these metrics in roadside safety verification and validation activities.

The RMS metric is the root-mean-square error, another standard mathematical technique for comparing signals. The RMS for the magnitude test, as shown in Table 16, is 20, the amount of the magnitude shift. The RMS score for the phase shift, however, is about 60, much greater than the 20 percent phase shift. While the RMS yields the percent shift in the magnitude test, the fact that it yields a large value in the phase test limits its diagnostic utility since, for a general shape comparison, it would not be clear if the difference is due to an error in magnitude or phase. The Zilliacus error metric shares a similar formulation to the RMS and results in similar values. Neither the RMS nor the Zilliacus error metric is suggested for use in roadside safety verification and validation activities because the information they provide is adequately covered in the MPC metrics.

As shown in Table 15, Whang's and Theil's inequalities are very similar formulations (i.e., one using the square root of a square and the other the absolute value). In the magnitude test both yield values of 9.1 and in the phase tests values of just over 30. The two different formulations, therefore, generally will produce very similar results, so there is no need to use both. Both inequalities are essentially measures of the point-to-point error between the signals, as shown in

their formulations in Table 15. As will be shown in the next section, the average residual error component of the ANOVA metric is essentially the same as both Whang's and Theil's error metrics. Since these metrics are redundant with each other and with the average residual error, they are not preferred for roadside safety verification and validation comparisons. The weighted integrated factor (WIFac) value for the magnitude test was 16.7 and 48.8 for the phase test. The diagnostic value of the WIFac is not apparent to the authors, so this metric is also not recommended.

ANOVA METRICS RESULTS

With the exception of Theil's and Whang's inequality metrics and the Zilliacus error metric, all the metrics discussed so far are assessments of the similarity of the magnitudes or phase of the two curves being compared. The metrics proposed by Theil, Whang and Zilliacus, on the other hand, are point-to-point estimates of the residual error between the two curves. Each of these methods subtracts the test from the true signal at each point in time to find the instantaneous difference between the two curves. These differences are then summed and in some fashion normalized. Both Ray and Oberkampf independently developed a more direct assessment of the residual error. Ray's and Oberkampf's methods are essentially identical except that Ray normalizes by the peak value of the true curve whereas Oberkampf normalizes by the mean of the peaks of the test and true curves. While the other types of metrics compare the phase or magnitude of the two curves, these point-to-point error methods examine the residual error. Ray's method has an additional advantage since it uses both the average residual error and the standard deviation of the residual error. In essence, the ANOVA method proposed by Ray examines the shape of the residual error curve resulting from a point-to-point comparison of the curves. Random experimental error by definition is normally distributed about a mean of zero, and there are standard statistical tests of the assumption that the error fits a normal distribution.

The analytical shape test presented herein is not really a particularly good test of the ANOVA metric since there is no random experimental error; the differences between the curves result from the fact that the curves are in fact different though very similar analytical curves (i.e., there is a very intentional systematic error between the test and true curves in this case). Nonetheless, the results of the magnitude and phase tests are shown at the bottom of Table 16. The average residual error for both the magnitude and phase tests was near zero, indicating that the average value of the error between the curves was zero. A review of the curves in Figure 44 shows that the curves have a symmetric oscillation above and below zero, so the average distance between points on the two curves should be close to zero. The standard deviation is 0.09 in the magnitude test and 0.26 in the two phase tests.

Based on an assessment of repeated crash tests, Ray has proposed that the average residual error should be less than five percent and the standard deviation of the residual error should be less than 20 percent. By those criteria the values in Table 16 would indicate that the two curves could represent the same event. The third component of the ANOVA procedure is the T test, a standard statistical test of the hypothesis that the observed errors are normally distributed about a mean of zero. For large numbers of samples, as is the case in this test and most full-scale crash tests, and 90 percent confidence, the critical value for the T test is 2.67. The magnitude test is close to but under this critical value whereas the phase tests are well inside the acceptance range. The ANOVA test is recommended for use in roadside safety computational mechanics because it provides a direct assessment of the residual errors between the test and true curves and, thereby, provides additional useful diagnostic information about the degree of similarity or difference between the curves.

APPLICATION TO REPEATED CRASH TESTS

REPEATED CRASH TESTS

While exploring the characteristics of deterministic shape-comparison metrics using an analytical curve, as was done in the last section, is an informative verification exercise, the performance of the metrics for real-world crash test data is of more practical importance since real time histories are not nearly as well behaved as simple analytical functions. The objective of this section is to examine the performance of the Sprague-Geers MPC metrics and the ANOVA metrics in a series of identical crash tests. If a particular physical phenomenon (e.g., a crash test) is documented using some type of sensor (e.g., accelerometers) and the physical experiment is repeated many times, we would expect that the response would be similar between all the experiments although not identical. Any experiment will experience random experimental error and there is a limit to the precision with which sensor data can be collected and processed. If a numerical analysis is performed for the same conditions as the physical experiment and it is not possible to distinguish the physical experiment from the numerical analysis, then performing the numerical analysis is as good as another physical experiment. The question of validating computer analyses of roadside hardware collisions, then, is essentially one involving the repeatability of crash tests and quantifying the normal variation that is typically observed in such tests.

Figure 45. Full-scale crash test set up for the repeated ROBUST crash tests.(43)

If the time history from a numerical analysis cannot be distinguished from the time histories of physical experiments, then the numerical analysis is a valid representation of the physical phenomenon.

Figure 46. 90th percentile envelope and acceleration time histories for (a) Set #1, (b) Set #2 and (c) Sets #1 and #2 combined.

A series of five crash tests (i.e., Set #1) with new model year 2000 Peugeot 106 vehicles and a rigid concrete barrier was performed as a part of the ROBUST project.(43) The basic test conditions are shown in Figure 45. The tests were independently carried out by five different test laboratories in Europe, labeled laboratories one through five in this report, with the purpose of assessing the repeatability of full-scale crash tests. As the main intent was to see if experimental curves representing the same physical test resulted in similar time-history

responses, a rigid barrier was intentionally chosen in order to limit the scatter of the results, which is presumably greater in the case of deformable barriers. A second series of five tests (i.e., Set #2) was performed using the same barrier but with vehicles of different brands and models. All the vehicles used in the second series, however, corresponded to the standard 900-kg small test vehicle specified in the European crash test standards, EN 1317.(11) The second set of tests was performed to investigate influences arising from different vehicle models on the repeatability of crash tests. In all cases, the three components of acceleration, including the lateral acceleration used in this chapter, were measured at the center of gravity of the vehicles. Only lateral accelerations and velocities are discussed here for the purpose of conciseness; the lateral response is thought to be the more critical in this type of re-directional barrier test. In order to compare the different time histories, they had to be prepared in exactly the same way, so all the time history curves were preprocessed using the RSVVP program described earlier in this chapter.

One common method for assessing the validity of a simulation result or the repeatability of multiple impact experiments in the biomechanics field is to develop response envelopes. If multiple experiments are available, the time histories for all the experiments can be plotted together. If the response average and standard deviation are calculated at each instant in time, the ± 90th percentile envelope indicating the likely response corridor can be plotted. After the ten curves were preprocessed, the 90th percentile envelope for each of the two sets of tests (i.e., Set #1 with the same new vehicle and Set #2 with similar vehicles) was computed considering the response from Lab #1 to be the "true" curve (note: the results from Lab #1 were chosen arbitrarily as the "true" curve). The 90th percentile envelope for each set was evaluated by adding to and subtracting from the respective "true" curve the average of the standard deviations of the residuals for that specific set of tests multiplied by 1.6449. Figure 46 shows the preprocessed curves and the respective envelopes for Sets #1 and #2. Also, all ten tests from both sets were compared together considering the response of test Lab #1/Set #1 as the "true" curve, and the results are shown in the bottom portion of Figure 46.
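A sketch of the envelope construction just described, assuming the preprocessed curves share a common time base; the NumPy-based function and argument names are illustrative assumptions, not the procedure's actual implementation.

```python
import numpy as np

def response_corridor(true_curve, other_curves, k=1.6449):
    """90th percentile response corridor about a designated "true" curve.

    true_curve   : 1-D array, e.g., the Lab #1 response for the set
    other_curves : list of 1-D arrays for the remaining tests in the set
    k            : 1.6449, the 90th percentile multiplier used in the text
    """
    # Standard deviation of the residuals of each test against the true curve,
    # averaged over the set, then added to and subtracted from the true curve.
    sigmas = [np.std(curve - true_curve, ddof=1) for curve in other_curves]
    half_width = k * np.mean(sigmas)
    return true_curve - half_width, true_curve + half_width
```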

As expected, there is considerable scatter between the acceleration time histories shown in Figure 46, although there is also a clear trend. Any test response that falls within the response corridors shown in Figure 46 would be considered to represent an identical or at least equivalent impact event. As shown by the response envelopes, all ten experiments tend to remain inside the corridors, although the test from Lab #4 in Set #2 has several peaks that are outside the response corridor. While calculating response corridors is a very useful technique, at least five experiments must be available before a corridor can be constructed, and the corridor becomes wider (i.e., the level of confidence lower) as the number of samples decreases. In roadside safety the normal situation is that there is generally only one experiment. A response corridor cannot be obtained from just one or two experiments, so if an analyst desires to compare a single computational result to a single crash test experiment, the response corridor method is not an option.

ACCEPTANCE CRITERIA

When comparing a computational result to an experiment, the analyst must decide what constitutes a reasonable acceptance criterion. While the metrics themselves are deterministic, a subjective judgment still has to be made about how close to zero (i.e., zero is a perfect match for all the metrics considered in this section) is "good enough." Since all ten of the experiments discussed here represent identical tests, the range of metric values observed should be an indicator of the acceptable range of scores for more or less identical tests. One of the purposes of this work, therefore, is to provide insight on acceptance criteria when using the Sprague-Geers and ANOVA metrics.

Results Using Acceleration Time Histories

Once the time histories for the ten experiments were preprocessed, each was compared to the "true" curve by evaluating the Sprague-Geers and ANOVA metrics using the RSVVP program described earlier in this chapter. Initially, the two sets of tests, Set #1 with the same new vehicle and Set #2 with similar vehicles, were considered separately using the response from the Lab #1 test in each set as the "true" curve. The choice of Lab #1 to represent the "true" curve was arbitrary and certainly the results would be slightly different if another test were used as the "true" baseline curve. The question being evaluated in this section is, therefore, "are the results from Lab #1 the same as those reported by Labs #2 through #5?" The resulting metric values for Set #1, Set #2 and the combination of both sets are shown in Table 17 in the top, middle and bottom portions, respectively.

Sprague-Geers MPC Metrics

The upper portion of Table 17 shows the values for the Sprague-Geers MPC metrics. The magnitude component of the metric is negative for all four of the comparison experiments, indicating that the "true" experiment (i.e., the result from Lab #1) generally experienced a higher magnitude. The magnitude score is roughly equal to the percent difference in magnitudes and, in the case of Set #1, varies between 14 and almost 26 percent. The last column in Table 17 is a possible acceptance criterion based on calculating the 90th percentile value of the observed metrics (i.e., the mean plus 1.67 times the standard deviation). Even when the same make and model of vehicle is used, the acceleration time histories under identical impact conditions can vary by as much as nearly 30 percent in magnitude. The results for Set #2, where different vehicles meeting the EN 1317 small car test vehicle criteria were used, are similar, although the experiment from Lab #4 produced a much higher magnitude score, indicating that Labs #2, #3 and #5 tended to have smaller magnitudes than the Lab #1 true test while Lab #4 had a much higher magnitude. This is actually confirmed by the time history graphs

in Figure 46, where the results for Lab #4 are clearly higher and even cross outside the response corridor. The large difference between Lab #4 and the other tests is reflected in the much larger standard deviation (i.e., 4.85 versus 21.35). It is not clear whether the differences between Sets #1 and #2 are due to the differences in the vehicles or the one unusual test from Lab #4 in Set #2. When the magnitude component of the Sprague-Geers metric is combined for all ten tests, as shown in the bottom portion of Table 17, the mean score is -17.2, showing that on average the tests have smaller magnitudes than the true test. The standard deviation of the results is nearly 12. If the 90th percentile value were used to establish an acceptance criterion for the magnitude component, a value of 37.1 would be the result.

Table 17. Comparison metrics for Set #1, Set #2 and the combination of both sets.

Metric                         Lab #2   Lab #3   Lab #4   Lab #5   Mean    Std.    Possible acceptance criterion
DATA SET #1
Sprague-Geers
  Magnitude                    -23.0    -21.4    -14.4    -25.8    -21.2   4.85    ± 29.2
  Phase                         22.6     36.3     31.2     24.9     28.8   6.21      39.1
ANOVA
  Avg. Residual Error            0.00    -0.01    -0.01     0.00    -0.01  0.01    ± 0.01
  Std. Dev. of Residuals         0.19     0.24     0.23     0.20     0.22  0.02      0.25
DATA SET #2
Sprague-Geers
  Magnitude                     -2.6     -8.2     35.6     -9.3      3.9   21.35   ± 39.3
  Phase                         21.3     22.8     25.4     26.7     24.0   2.49      28.2
ANOVA
  Avg. Residual Error           -0.01     0.00     0.00    -0.02    -0.01  0.01    ± 0.02
  Std. Dev. of Residuals         0.21     0.22     0.31     0.25     0.25  0.05      0.32
DATA SETS #1 AND #2 COMBINED
Sprague-Geers
  Magnitude                    (see above for individual scores)   -17.2   11.94   ± 37.1
  Phase                        (see above for individual scores)    34.9    8.44     33.2
ANOVA
  Avg. Residual Error          (see above for individual scores)   -0.01    0.01   ± 0.02
  Std. Dev. of Residuals       (see above for individual scores)    0.24    0.04     0.31
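As a check on how the last column of Table 17 appears to be computed (the mean plus 1.67 times the standard deviation, taken on the magnitude of the score), the combined magnitude row gives

$$|{-17.2}| + 1.67 \times 11.94 \approx 37.1$$

which matches the tabulated value of ±37.1, and the Set #1 magnitude row gives 21.2 + 1.67 × 4.85 ≈ 29.3, consistent with the tabulated ±29.2 after rounding.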

The result for the phase component is similar, as shown in Table 17. Due to the formulation, the values for the phase component must always be positive, so it is not possible to determine from the metric value whether the test curve is leading or lagging the true curve in phase. For Set #1, the values varied from just above 22 to just over 36 with a mean and standard deviation of 28.8 and 6.21, respectively. The phase component of the metric can be interpreted as the percent by which the signals are out of phase. The results for Set #2 were very similar and actually resulted in a smaller standard deviation than Set #1, possibly indicating that the difference between vehicles does not play a major role, at least in this test with this type of vehicle. Combining both sets of data and calculating the 90th percentile indicates that a phase score of about 33 would be appropriate. Based on the results of the ten essentially identical full-scale crash tests summarized in Table 17, an absolute upper bound of 40 could be used as the acceptance criterion for both the magnitude and phase components of the Sprague-Geers MPC metrics when evaluating acceleration time histories from full-scale crash tests, because 90 percent of identical crash tests should have a response that falls within these limits.

ANOVA Metrics

While the Sprague-Geers metrics assess the magnitude and phase of two curves, the ANOVA examines the differences, or residual errors, between two curves. The average and standard deviation of the residuals were evaluated for each time history in both sets of data and the results are shown in Table 17. For all ten experiments, the average residual error was always close to zero. The standard deviations of the residual errors were always under 31 percent and in all cases but one less than 25 percent. Since the time histories for all the crash tests represented essentially identical physical events, the residuals for each curve should be attributable only to random experimental error or noise. Statistically speaking, this means that the residuals should be normally distributed around a mean residual error equal to zero. As shown in the cumulative density functions in Figure 47, the shape of the distribution of the residual accelerations is typical of a normal distribution for both sets of crash tests, whether taken separately or combined. Since the cumulative distribution is an "S"-shaped curve centered on zero, the distribution of the residuals is consistent with random experimental error, as would be expected in this series of repeated crash tests. This is a very strong indicator that the ten tests are, in fact, similar impact events.

Ray applied the ANOVA criteria to a set of six identical frontal rigid pole impacts and reported the results in a paper published in 1996.(35) Ray proposed an acceptance criterion based on these six tests of a mean residual error less than or equal to five percent and a standard deviation of the residuals less than 20 percent. Since the tests used in this earlier study were of a type presumed to be highly repeatable (i.e., the same type of vehicle was used, the same crash test facility was used, the barrier was a rigid instrumented pole and the impact was a center-on full-frontal impact), it was not known if these criteria would be reflective of more general roadside hardware crash tests performed under less ideal conditions. The data in Table 17 indicate that the mean residual error criterion of less than five percent appears to be adequate since none of the comparisons for these ten tests resulted in a mean residual greater than two percent. The standard deviation of the residuals, however, was higher in this test series than in the one reported by Ray in 1996. The highest standard deviation of the residuals (i.e., 31

Figure 47. Cumulative density function of the residual accelerations for (a) Set #1 [true curve: Lab #1, Set #1], (b) Set #2 [true curve: Lab #1, Set #2] and (c) the combination of Sets #1 and #2.

RESULTS USING VELOCITY TIME HISTORIES

Velocity time histories, rather than acceleration time histories, are sometimes used to compare curves visually because they are less noisy and the trends are more easily apparent. The acceleration histories for the ten experiments were integrated to obtain the lateral velocities; the velocity time histories and the 90th percentile response corridor for Set #1 are shown in Figure 48. The velocity corridor is much narrower and smoother than the corresponding acceleration time history response corridor shown earlier.
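The corridor construction is not spelled out here, so the sketch below shows one plausible way to build it, assuming cumulative trapezoidal integration of each acceleration record and a symmetric 5th/95th percentile band across the repeated tests (which bounds 90 percent of the responses at each instant). The function name and the percentile choice are assumptions of this sketch.

    import numpy as np

    def velocity_corridor(time, accel_histories, lower_pct=5.0, upper_pct=95.0):
        """Integrate each acceleration history to velocity and return a
        percentile envelope across the repeated tests at every time step.

        `accel_histories` has shape (n_tests, n_samples) and is assumed to be
        sampled on the common, uniform `time` vector after re-sampling and
        synchronization.
        """
        time = np.asarray(time, dtype=float)
        acc = np.asarray(accel_histories, dtype=float)
        dt = time[1] - time[0]

        # Cumulative trapezoidal integration of each row: v(t) is the running
        # integral of a(t), with v(0) = 0.
        vel = np.zeros_like(acc)
        vel[:, 1:] = np.cumsum(0.5 * (acc[:, 1:] + acc[:, :-1]) * dt, axis=1)

        lower = np.percentile(vel, lower_pct, axis=0)
        upper = np.percentile(vel, upper_pct, axis=0)
        return vel, lower, upper

With only five tests in a set, such percentile bounds are necessarily coarse; the point is simply that the integrated velocity traces cluster far more tightly than the underlying accelerations.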

Just as the Sprague-Geers MPC metrics can be used to compare the shapes of acceleration time histories, exactly the same procedure can be used to evaluate the velocity time histories. Table 18 shows the values of the metrics calculated for Set #1. As shown in Table 18, the Sprague-Geers magnitude and phase metrics are much smaller for the velocity time history comparison than they were for the acceleration time history comparison. The maximum magnitude score was 5.1 and the maximum phase score was 3.1. Using these values to compute the 90th percentile range results in an acceptance value of less than 10 for both magnitude and phase, much less than the 40 recommended for acceleration time histories.

Figure 48. 90th percentile envelope and velocity time histories for Set #1.

Table 18. Values of the comparison metrics using velocity time histories for Set #1.

  Metric                        Lab #2   Lab #3   Lab #4   Lab #5    Mean   Std. Dev.   90th %ile
  Sprague-Geers
    Magnitude                      0.5      5.1      4.5      4.0     1.5        4.21         8.5
    Phase                          2.0      3.5      3.1      2.8     2.9        0.64         3.9
  ANOVA
    Avg. Residual Error            0.0    -0.02    -0.02     0.04    0.00        0.03        0.05
    Std. Dev. of Residuals        0.05     0.09     0.08     0.06    0.07        0.02        0.10

While the Sprague-Geers metrics improve as the acceleration time history is integrated to a velocity time history, the ANOVA metrics become much worse. While the average residual errors are still around zero, the standard deviation increases by a factor of four or five. The reason for this poor performance with the velocity curves is that the integration process in essence masks the residual acceleration errors.
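To reproduce the kind of comparison summarized in Table 18, the helpers sketched earlier in this chapter can simply be applied to both the raw and the integrated curves. The snippet below assumes the hypothetical `sprague_geers`, `anova_metrics` and `velocity_corridor` functions from the previous sketches and uses synthetic placeholder signals; it illustrates the workflow only, not the actual test records.

    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 0.2, 2001)                        # 0.2 s sampled at 10 kHz
    a_true = -20.0 * np.exp(-((t - 0.05) / 0.02) ** 2)     # synthetic "true" pulse
    a_test = 0.9 * a_true + rng.normal(0.0, 1.0, t.size)   # synthetic "test" record

    # Metrics computed on the raw accelerations...
    print(sprague_geers(a_true, a_test), anova_metrics(a_true, a_test))

    # ...and on the velocities obtained by integrating those accelerations.
    vel, _, _ = velocity_corridor(t, np.vstack([a_true, a_test]))
    print(sprague_geers(vel[0], vel[1]), anova_metrics(vel[0], vel[1]))

Running the two metric families on both forms of the data makes the effect of integration directly visible, which is the comparison the paragraph above describes.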

When using an ANOVA technique, the evaluation of metrics should always be performed using time histories that were directly measured, not derived by either integration or differentiation. For example, if accelerations are measured experimentally, accelerations should be the basis of the ANOVA comparison. Velocities and displacements obtained by integrating the acceleration curve will accumulate error with each subsequent integration. This is shown graphically in Figure 49, where the distribution of the residuals is more spread out and the mean is not as close to zero as was the case for the acceleration residuals in Figure 47.

While the Sprague-Geers MPC metrics can be used with either raw data (i.e., accelerations) or integrated data (i.e., velocities), it is recommended that the comparison be made using the data the way it was collected (i.e., the raw data). Using processed data adds a mathematical layer of complexity that can introduce its own errors. In this case, for example, local lateral velocities were compared; if all six channels of data were instead integrated using a coupled numerical integration to obtain the global velocities, errors from the various channels would “seep” into the other data channels. Suppose, for instance, that the lateral accelerations were identical but the yaw rates for the test and simulation were quite different. A coupled integration of such data would produce errors in every channel, since the yaw rate data is coupled to every other channel. For this reason, data should be compared the way it was collected, without any subsequent processing: if local acceleration data is collected, local acceleration data should be the basis of the comparison in order to avoid errors introduced by post-processing the data into some other form.

Figure 49. Cumulative density functions of the residual velocities for Set #1.

For the ANOVA metrics, the comparison must take place based on the accelerations if the original data was collected with accelerometers. It is recommended that, when using an ANOVA procedure, the results be computed based on the time histories collected in the physical experiments; normally this would be accelerations and rate gyros in crash tests.
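The channel-coupling effect described above can be illustrated with a deliberately simplified planar example: even when the local accelerations of two records are identical, a difference in yaw rate changes the heading used to resolve local velocities into the global frame, so both global velocity components end up disagreeing. The kinematics below omit the rotating-frame terms for brevity and the signals are synthetic; this is an illustration of the coupling, not a reconstruction of any crash-test data.

    import numpy as np

    def global_velocity(t, ax_local, ay_local, yaw_rate):
        """Resolve integrated local (vehicle-fixed) velocities into global
        components for planar motion, using the heading from the yaw rate.
        Rotating-frame (omega x v) terms are omitted in this simplified sketch."""
        dt = t[1] - t[0]
        psi = np.cumsum(yaw_rate) * dt                    # heading angle (rad)
        vx_l = np.cumsum(ax_local) * dt                   # local velocity components
        vy_l = np.cumsum(ay_local) * dt
        vx_g = vx_l * np.cos(psi) - vy_l * np.sin(psi)    # rotate into the global frame
        vy_g = vx_l * np.sin(psi) + vy_l * np.cos(psi)
        return vx_g, vy_g

    t = np.linspace(0.0, 0.2, 2001)
    ax = -60.0 * np.exp(-((t - 0.05) / 0.02) ** 2)        # identical local accelerations
    ay = 15.0 * np.exp(-((t - 0.06) / 0.03) ** 2)
    yaw_a = 2.0 * np.sin(10.0 * np.pi * t)                # two different yaw-rate records
    yaw_b = 1.5 * np.sin(10.0 * np.pi * t)

    vx_a, vy_a = global_velocity(t, ax, ay, yaw_a)
    vx_b, vy_b = global_velocity(t, ax, ay, yaw_b)
    # Although ax and ay are identical, both global velocity components differ
    # because the yaw-rate difference enters the coordinate rotation.
    print(np.max(np.abs(vx_a - vx_b)), np.max(np.abs(vy_a - vy_b)))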

In evaluating the ANOVA metrics for a series of six identical frontal rigid pole impacts, Ray proposed an acceptance criterion of a mean residual error less than five percent of the peak and a standard deviation of less than 20 percent of the peak test acceleration.(35) As shown in Table 17 and discussed above, this is probably a bit too restrictive and should be relaxed to an average residual error of less than five percent and a standard deviation of the residuals of less than 35 percent.

The purpose of examining these repeated crash tests was to explore how repeatable similar full-scale crash tests are and to identify sources of possible discrepancies between test organizations.(43) As discussed earlier, the magnitude results for Lab #4 in Set #2 represent a departure from most of the other test results. The ROBUST team examined the test procedures and techniques used by the different test agencies and identified several differences that could explain some of the discrepancies. For example, the ROBUST team discovered that each test organization used a different technique to mount the accelerometer block to the test vehicle and that the mounting technique had a measurable effect on the acceleration time histories. They designed and tested a new lightweight, more rigid composite block that significantly improved the consistency of the test results. This illustrates an important point: even when two tests are performed at the same impact conditions with the same vehicle and barrier, the way the data is collected and processed will also affect the results. The shape-comparison metrics are therefore sensitive not only to differences in the impact conditions and test results, but also to the way the data was collected and processed.

ACCEPTANCE CRITERIA

A comparison of ten repeated, essentially identical crash tests was presented above. The Sprague-Geers MPC metrics and the ANOVA metrics were used to make quantitative comparisons between eight pairs of crash tests. Two sets of data were available: the first set of five tests used the same make, model and year of vehicle, whereas the second set of five tests used different vehicles that each met the requirements for the small car defined by EN 1317. The original raw time histories from the ten tests were filtered, re-sampled and synchronized in order to generate accurate comparison results. The statistics derived from the analysis of the residuals confirmed the hypothesis that the errors were normally distributed and could, therefore, be attributed to normal random experimental error. Based on the data from these ten tests, the following acceptance criteria are recommended for the evaluation metrics when comparing repeated crash tests:

• The Sprague-Geers magnitude and phase metrics should be strictly less than 40.
• The average residual error when comparing two acceleration time histories from a crash test should be less than five percent, and the standard deviation of the residual errors should be less than 35 percent (see the sketch below).
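These thresholds can be encoded in a small helper so that a single channel comparison is flagged automatically. The sketch below simply restates the recommended limits; the function and argument names are illustrative, the magnitude component is compared by absolute value because it is signed, and the residual statistics are treated as fractions of the peak, following the usage earlier in this chapter.

    def passes_acceptance(magnitude, phase, avg_residual, std_residual):
        """Check one channel comparison against the recommended criteria:
        |M| < 40, P < 40, |mean residual| < 5% of peak, residual std < 35% of peak.
        Magnitude and phase are in percent; residuals are fractions of the peak."""
        return (abs(magnitude) < 40.0 and
                phase < 40.0 and
                abs(avg_residual) < 0.05 and
                std_residual < 0.35)

    # Example: the Lab #4 (Set #2) comparison from Table 17 stays within all limits.
    print(passes_acceptance(35.6, 25.4, 0.00, 0.31))   # True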

INDEPENDENT ACCEPTANCE CRITERIA ASSESSMENT

Abu-Odeh and Ferdous performed their own unpublished, independent examination of the acceptance criteria described in the previous sections using four similar crash tests performed at the Texas Transportation Institute. Four crash tests of strong-post w-beam guardrails were used as repeated crash tests. In all four cases, the only difference between the guardrail systems was the type of material used for the blockout: test 1 used a typical timber blockout, tests 2 and 3 used recycled polymer blockouts and test 4 used a composite blockout. All four tests used Chevrolet C2500 pickup trucks between model years 1989 and 1996, and the nominal impact conditions were essentially the same, as shown in Table 19. Since differences between the blockout materials are unlikely to change the results of the test as long as the blockouts do not split and break (which they did not in these tests), the tests are essentially repeated instances of the same test of a strong-post w-beam guardrail.

Table 19. Comparison metrics for four essentially identical crash tests of a strong-post w-beam guardrail with different blockouts.

  Test Characteristics           Test #1   Test #2   Test #3   Test #4
  Vehicle Model Year                1989      1996      1990      1996
  Impact Speed (km/hr)             101.5     101.4     100.9     101.4
  Impact Angle (deg)                25.5      25.4      25.2      23.8

  Comparison with Test #1        Test #2      Test #3      Test #4
  MPC Metrics                     M    P       M    P       M    P
  X acceleration channel         19   36      10   35       5   39
  Y acceleration channel         17   31      23   40       1   35
  Z acceleration channel          4   46      13   46      37   56
  Roll rate channel               1   35      38   38      34   38
  Pitch rate channel             16   37      38   38       8   32
  Yaw rate channel               13   13      16   16       9   11
  ANOVA Metrics               Resid.  SD   Resid.  SD   Resid.  SD
  X acceleration channel       0.01  0.22   0.01  0.22   0.02  0.23
  Y acceleration channel       0.01  0.13   0.00  0.18   0.00  0.14
  Z acceleration channel       0.00  0.26   0.00  0.25   0.05  0.27
  Roll rate channel            0.03  0.29   0.02  0.29   0.06  0.29
  Pitch rate channel           0.04  0.32   0.03  0.31   0.02  0.29
  Yaw rate channel             0.06  0.15   0.07  0.18   0.09  0.08
  Note: All tests used Chevrolet C2500 pickup trucks with the same nominal impact conditions. The metric columns compare tests #2, #3 and #4 against test #1, which was used as the true curve.

Abu-Odeh and Ferdous used the RSVVP program to calculate the Sprague-Geers and ANOVA metrics for these four tests, using the results of test 1 as the “true” curve and each of the other tests as the test cases. The results are summarized in Table 19. As shown in Table 19, the acceptance criteria recommended in the previous section were always satisfied with the exception of the Z accelerations for tests 3 and 4. In these cases, the Z acceleration is a relatively unimportant channel, since most of the vehicle kinematics are described by the X and Y accelerations and the yaw rotation. As discussed earlier, RSVVP includes a procedure for weighting relatively unimportant channels, and when this procedure was used by Abu-Odeh and Ferdous the weighted comparison was acceptable.
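The RSVVP weighting procedure itself is not reproduced here, but the idea can be sketched with a generic weighted average: each channel's metric score is multiplied by a user-chosen importance weight and the weighted scores are summed. The weights and channel keys below are purely illustrative assumptions and are not the weights used by Abu-Odeh and Ferdous or built into RSVVP.

    def weighted_score(channel_scores, weights):
        """Combine per-channel metric scores into one composite score.
        Both arguments are dicts keyed by channel name; the weights are
        normalized so that they sum to one before combining."""
        total = sum(weights[ch] for ch in channel_scores)
        return sum(channel_scores[ch] * weights[ch] / total for ch in channel_scores)

    # Phase scores from the test #2 comparison in Table 19, with illustrative
    # weights that de-emphasize the relatively unimportant Z acceleration channel.
    phase = {"x_acc": 36, "y_acc": 31, "z_acc": 46, "roll": 35, "pitch": 37, "yaw": 13}
    weights = {"x_acc": 0.25, "y_acc": 0.25, "z_acc": 0.05, "roll": 0.10, "pitch": 0.10, "yaw": 0.25}
    print(weighted_score(phase, weights))   # about 29.5, below the limit of 40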

The results also show that, with the exception of the Z channel in comparisons 3 and 4, some of the results are near the acceptance criteria, indicating that the values chosen are reasonable for this type of re-directional guardrail crash test. Abu-Odeh and Ferdous's examination therefore confirms that the acceptance criteria recommended in the previous section are reflective of the values obtained from repeated crash tests and are meaningful acceptance criteria for comparing full-scale crash test results with numerical simulation results.

CONCLUSIONS

This chapter described the features of the RSVVP program, provided an example of its application to simple analytical curves and provided examples of developing acceptance criteria based on repeated, essentially identical full-scale crash tests.

RSVVP pre-processes the two input curves: the data can be filtered, adjusted for any bias, re-sampled to the same data acquisition frequency and synchronized to the same equivalent initial time. Pre-processing is an important step because poor metric scores can result simply because the curves were processed differently. For this reason, it is preferable to supply RSVVP with raw (i.e., unprocessed) data rather than crash test data that has already been processed.

RSVVP includes fifteen separate metrics that can be used to compare and analyze the differences between the test and true curves. The formulations of these metrics are summarized in Table 14 and Table 15, and full details are available in the literature. A test case using a simple analytical function was performed and the results for the 15 metrics were shown in Table 16. A review of the results and formulations of these metrics shows that there are really just three basic features of a shape comparison that are assessed: similarity in magnitude, similarity in phase and the shape of the residual error curve. Since many of the metrics share similar formulations, their results are often identical or very similar and there is no reason to include all the variations. The Sprague-Geers MPC metrics are recommended to assess the similarity of magnitude (i.e., the M metric) and phase (i.e., the P metric), and the ANOVA metric is recommended to assess the characteristics of the residual errors.

The RSVVP program provides a convenient platform for engineers to explore the similarities and differences between physical test and computational results in validation efforts, as well as to compare the repeatability of physical experiments. The program provides all the tools needed to quickly perform the assessment between two curves.

Acceptance criteria for comparing full-scale crash tests and numerical simulations were developed using comparisons with analytical curves and a set of ten essentially identical crash tests. Based on the results of comparing these ten crash tests, the Sprague-Geers magnitude and phase metrics should be strictly less than 40, the average residual error should be less than five percent, and the standard deviation of the residual errors should be less than 35 percent for acceptable results. Lastly, the acceptance criteria were used by Abu-Odeh and Ferdous to examine four nearly identical crash tests, and the results showed that, with the exception of some of the Z accelerations, the acceptance criteria resulted in the correct assessment, namely that the crash tests were identical.
