**Suggested Citation:** "Appendix C. Comments on Survey Responses." National Academies of Sciences, Engineering, and Medicine. 2021.

*Application of Crash Modification Factors for Access Management, Volume 2: Research Overview*. Washington, DC: The National Academies Press. doi: 10.17226/26162.


APPENDIX C

Comments on Survey Responses

This appendix provides a summary of goodness-of-fit measures to assess the performance of SPFs, including the mean absolute deviation, modified R², dispersion parameter, coefficient of variation (CV) of the calibration factor, and cumulative residual (CURE) plots. It is relatively straightforward to use these goodness-of-fit measures to compare the relative performance of competing SPFs considered for application. It is more challenging to use them to assess whether a single SPF is adequate, as there are no guidelines on acceptable thresholds except for the CV of the calibration factor and the CURE plot. Thus, some subjective judgment is required to supplement the assessment based on the CV with a consideration of the other goodness-of-fit measures.

Mean Absolute Deviation

Figure C-1 provides the equation for the mean absolute deviation (MAD), which provides a measure of the average magnitude of variability of prediction. Smaller values are preferred to larger values in comparing two or more competing SPFs. The MAD is the sum of the absolute value of predicted minus observed crashes, divided by the number of sites. The values of predicted and observed crashes are from the calibration data.

$$MAD = \frac{\sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|}{n}$$

Figure C-1. Mean absolute deviation.

Where:
• $y_i$ = observed counts,
• $\hat{y}_i$ = predicted values from the SPF, and
• $n$ = validation data sample size.

Modified R²

Figure C-2 shows the equation for the modified R² value (Fridstrom 1995). This goodness-of-fit measure subtracts the normal amount of random variation expected if the SPF were 100 percent accurate. Even with a perfect SPF, some variation in observed crash counts would be observed due to the random nature of crashes (Fridstrom 1995). As a result, the amount of systematic variation explained by the SPF is measured. Larger values indicate a better fit to the data in comparing two or more competing SPFs.
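Values greater than 1.0 indicate the SPF is over-fit (i.e., the SPF is incorrectly explaining some of the expected random variation as systematic variation).

$$R^2 = \frac{\sum_{i} \left( y_i - \bar{y} \right)^2 - \sum_{i} \hat{\mu}_i^2}{\sum_{i} \left( y_i - \bar{y} \right)^2 - \sum_{i} \hat{y}_i}$$

Figure C-2. Modified R² value.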

Where:
• $y_i$ = observed counts,
• $\hat{y}_i$ = predicted values from the SPF,
• $\bar{y}$ = sample average, and
• $\hat{\mu}_i = y_i - \hat{y}_i$.

Dispersion Parameter

The dispersion parameter, f(k), in the negative binomial distribution is reported from the variance equation. Figure C-3 provides the variance equation, rearranged in Figure C-4 to provide the equation for the dispersion parameter.

$$Var\{m\} = E\{m\} + f(k) \times E\{m\}^2$$

Figure C-3. Variance of negative binomial distribution.

$$f(k) = \frac{Var\{m\} - E\{m\}}{E\{m\}^2}$$

Figure C-4. Dispersion parameter.

Where:
• f(k) = estimated dispersion parameter,
• Var{m} = estimated variance of mean crash rate, and
• E{m} = estimated mean crash rate.

The estimated variance increases as dispersion increases, and consequently the standard errors of estimates are inflated. As a result, all else being equal, an SPF with less dispersion (i.e., smaller values of f(k)) is preferred to an SPF with more dispersion. Note that f(k) can be specified as a constant or as a function of site characteristics. The tool facilitates the estimation of the dispersion parameter, either as a constant or from a function, as one goodness-of-fit measure.

Coefficient of Variation of the Calibration Factor

For a constant calibration factor, the CV of the calibration factor is useful to assess the goodness-of-fit. Figure C-5 provides the equation for the CV of a constant calibration factor, which is the standard deviation of the calibration factor divided by the estimate of the calibration factor.

$$CV = \frac{\sqrt{V(C)}}{C}$$

Figure C-5. Coefficient of variation of a constant calibration factor.

Where:
• CV = coefficient of variation of the calibration factor,
• V(C) = variance of the calibration factor, and
• C = estimate of the calibration factor.

Figure C-6 shows the equation for the variance of the calibration factor [V(C)]. The standard deviation of the calibration factor is the square root of the variance.
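Before turning to the variance of the calibration factor, the measures in Figures C-1 through C-4 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by The Calibrator: the function names and the sample data are hypothetical, and the dispersion helper simply rearranges the variance equation for a given mean and variance rather than estimating a negative binomial model.

```python
from statistics import mean


def mean_absolute_deviation(observed, predicted):
    """MAD (Figure C-1): sum of |predicted - observed| divided by the number of sites."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def modified_r2(observed, predicted):
    """Modified R2 (Figure C-2): systematic variation explained by the SPF."""
    y_bar = mean(observed)
    total = sum((o - y_bar) ** 2 for o in observed)                    # sum (y - y_bar)^2
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # sum mu_hat^2
    random_part = sum(predicted)                                       # sum y_hat
    return (total - residual) / (total - random_part)


def dispersion_from_moments(mean_crashes, variance):
    """f(k) from Figure C-4: (Var{m} - E{m}) / E{m}^2."""
    return (variance - mean_crashes) / mean_crashes ** 2


# Hypothetical observed and predicted crash counts for six sites.
observed = [1, 0, 6, 2, 15, 12]
predicted = [3, 2, 3, 5, 9, 8]

mad = mean_absolute_deviation(observed, predicted)
r2 = modified_r2(observed, predicted)
fk = dispersion_from_moments(4.0, 12.0)
print(f"MAD = {mad:.3f}, modified R2 = {r2:.3f}, f(k) = {fk:.2f}")
```

When comparing competing SPFs on the same data, the SPF with the smaller MAD, the larger modified R² (provided it does not exceed 1.0), and the smaller f(k) would be preferred on these measures.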

$$V(C) = \frac{\sum_{i} \left( y_i + k \, y_i^2 \right)}{\left( \sum_{i} \hat{y}_i \right)^2}$$

Figure C-6. Variance of calibration factor.

Where:
• $y_i$ = observed counts,
• $\hat{y}_i$ = uncalibrated predicted values from the SPF, and
• $k$ = dispersion parameter (recalibrated).

Appendix B of the User's Guide to Develop Highway Safety Manual Safety Performance Function Calibration Factors provides guidance on estimating the accuracy of a calibration factor using the CV (Bahar 2014). This guidance is intended for application in assessing the sample size of the calibration dataset; however, it seems reasonable to also apply it to assess the accuracy of a calibration factor regardless of the sample size (Bahar 2014). The Guide suggests that a reasonable upper threshold for the CV is 0.10 to 0.15. This threshold can help to assess whether or not the SPF, and the estimated calibration factor based on the calibration dataset, are acceptable. If the CV exceeds this threshold, the cumulative residual plots, described in the following subsection, can help to determine if the SPF is acceptable. In any case, the CV can help with comparative evaluation of two or more SPFs where smaller values are preferred to larger values.

Cumulative Residual Plots

Another tool to assess goodness-of-fit is the CURE plot. A CURE plot is a graph of the cumulative residuals (observed minus predicted crashes) against a variable of interest sorted in ascending order (e.g., major road traffic volume). CURE plots provide a visual representation of goodness-of-fit over the range of a given variable, and help to identify potential concerns such as the following:

• Long trends: long trends in the CURE plot (increasing or decreasing) indicate regions of bias that should be rectified through improvement to the SPF either by the addition of new variables or by a change of functional form.
• Percent exceeding the confidence limits: cumulative residuals outside the confidence limits indicate a poor fit over that range in the variable of interest. Cumulative residuals frequently outside the confidence limits indicate notable bias in the SPF. The upper threshold for the percent of cumulative residuals exceeding the 95 percent confidence limits is five percent.
• Vertical changes: large vertical changes in the CURE plot are potential indicators of outliers, which require further examination.

For further discussion of outliers, refer to the section titled "Dealing with Outliers" in The Calibrator User Guide (Lyon et al. 2018). Consult Chapter 7 of Hauer's book, The Art of Regression Modeling in Road Safety, for more advanced discussion (Hauer 2015).

Figure C-7 shows an example CURE plot for the variable indicating major road traffic volume at an intersection. In this example, the SPF performs relatively well based on the general pattern and the 95 percent confidence interval. The pattern shows the cumulative residuals oscillating above and below zero. Note that a sustained increasing or decreasing trend would indicate a range of under- or over-prediction, respectively. In this example, the cumulative residuals also remain within the 95-percent confidence limits over most of the range, only exceeding the confidence limits for a short range of lower AADT. The areas outside the confidence limits indicate a poor fit as indicated in the figure. Cumulative residuals frequently outside the confidence limits would indicate notable bias in the SPF. Another notable observation is the sharp increase in the value of cumulative residuals at an AADT of approximately 175,000 vehicles per day. This may indicate the presence of an outlier in the data.

Figure C-7. Example CURE plot.

The Calibrator automatically provides a CURE plot similar to Figure C-7 for fitted values (after applying the calibration factor(s)) and allows a user to choose any other available continuous variable for the x-axis. The tool calculates the maximum deviation as well as the percent of observations outside the 95-percent confidence limits. With this information, users can follow the procedure in Hauer's book to determine whether an SPF is acceptable and to compare multiple SPFs (Hauer 2015).

While the guidance provided by Hauer for making these decisions is useful, it is largely subjective. The most objective consideration is a review of the CURE plot and 95-percent (2σ) confidence limits. As Hauer notes, "inasmuch as the CURE plot is a sum of many independent random variables, it is approximately normally distributed. For a normal distribution, about 95 percent of the probability mass is within two standard deviations from the mean. Thus, the CURE plot for an 'everywhere unbiased' SPF should only rarely go beyond the 2σ limits." Hauer's book (p. 150) also mentions, "the overall fit of the SPF is best judged by the CURE plot for fitted values" (Hauer 2015).

The following are general rules for assessing the percent of the CURE plot exceeding the 95 percent (2σ) confidence limit:

1. Five percent or less of CURE plot ordinates for fitted values (after applying the calibration factor(s)) exceeding the 2σ limits is indicative of an SPF that calibrates well to the entire range of a jurisdiction's data.
2. If more than five percent of CURE plot ordinates exceed the 95-percent confidence limits, then consider the CV of the constant calibration factor. If the CV is within acceptable limits, then the SPF may be acceptable for application, with due recognition for ranges of variables where significant bias is indicated.
Another performance measure to compare two or more competing SPFs is the percent of CURE plot ordinates for fitted values (after applying the calibration factor) exceeding the 2σ limits, where lower values of "percent exceeding" are preferred.

[Figure C-7 plot. x-axis: Major Road Traffic Volume (0 to 250,000); y-axis: Cumulative Residuals (-400 to 400); series: cumulative residuals, upper 95% confidence limits, lower 95% confidence limits; annotation: Poor Fit.]
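The CURE and CV checks described above can be sketched as follows. This is an illustrative outline only, not the code behind The Calibrator. The appendix does not state how the 2σ limits are computed; the sketch assumes the formulation commonly attributed to Hauer, in which the variance at point i is S(i)(1 - S(i)/S(n)) and S(i) is the running sum of squared residuals. The function names, the sample data, and the dispersion value k = 0.4 are hypothetical.

```python
import math


def calibration_factor(observed, predicted):
    """Constant calibration factor: sum of observed over sum of uncalibrated predictions."""
    return sum(observed) / sum(predicted)


def cv_of_calibration_factor(observed, predicted, k):
    """CV = sqrt(V(C)) / C, with V(C) = sum(y + k*y^2) / (sum(y_hat))^2 (Figures C-5, C-6)."""
    c = calibration_factor(observed, predicted)
    variance = sum(y + k * y ** 2 for y in observed) / sum(predicted) ** 2
    return math.sqrt(variance) / c


def cure_points(x, observed, fitted):
    """Return (x, cumulative residual, 2-sigma limit) tuples sorted by x.

    Assumption: limits follow the S(i) * (1 - S(i)/S(n)) variance form noted in the lead-in.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    residuals = [observed[i] - fitted[i] for i in order]
    total_sq = sum(r * r for r in residuals)
    cum, cum_sq, points = 0.0, 0.0, []
    for i, r in zip(order, residuals):
        cum += r
        cum_sq += r * r
        var = cum_sq * (1 - cum_sq / total_sq) if total_sq > 0 else 0.0
        points.append((x[i], cum, 2 * math.sqrt(max(var, 0.0))))
    return points


def percent_exceeding(points, tol=1e-9):
    """Percent of CURE ordinates outside the +/- 2-sigma limits (tol absorbs rounding)."""
    outside = sum(1 for _, cum, limit in points if abs(cum) > limit + tol)
    return 100 * outside / len(points)


def may_be_acceptable(pct_exceeding, cv, cv_threshold=0.15):
    """Five percent or less exceeding, or CV under the threshold."""
    return pct_exceeding <= 5 or cv < cv_threshold


# Hypothetical calibration data: AADT, observed crashes, uncalibrated predictions.
aadt = [12000, 3000, 8000, 20000, 5000, 15000]
observed = [4, 1, 2, 7, 2, 4]
predicted = [3.0, 1.0, 2.5, 5.0, 1.5, 4.0]

c = calibration_factor(observed, predicted)
fitted = [c * p for p in predicted]          # fitted values after calibration
points = cure_points(aadt, observed, fitted)
pct = percent_exceeding(points)
cv = cv_of_calibration_factor(observed, predicted, k=0.4)
print(f"C = {c:.3f}, percent exceeding = {pct:.1f}, CV = {cv:.3f}")
print("may be acceptable:", may_be_acceptable(pct, cv))
```

Because a constant calibration factor forces the residuals to sum to zero, the last cumulative residual is zero up to rounding, which is why the comparison against the limits uses a small tolerance.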

Akaike's Information Criterion (AIC)

Figure C-8 shows the equation for the AIC. The AIC penalizes for the addition of parameters, and thus helps to select an SPF that fits well but has a minimum number of parameters. AIC is not typically used as a goodness-of-fit measure but can be used to compare the relative fit of alternate SPFs. Smaller values are preferred to larger values in comparing two or more competing SPFs.

$$AIC = -2 \times \text{loglikelihood} + 2K$$

Figure C-8. Akaike's Information Criterion (AIC).

Where:
• K = number of estimated parameters included in the SPF (i.e., number of variables plus the intercept), and
• Loglikelihood = statistical output reflecting the overall SPF fit (larger values indicate a better fit).

Schwarz Bayesian Information Criterion (BIC)

Figure C-9 shows the equation for the BIC. The BIC is complementary to AIC in that it also penalizes for the addition of parameters, and thus selects an SPF that fits well, but has a minimum number of parameters. BIC is not typically used as a goodness-of-fit measure but can be used to compare the relative fit of alternate SPFs. Smaller values are preferred to larger values in comparing two or more competing SPFs.

$$BIC = -2 \times \text{loglikelihood} + K \times \log(\text{number of observations})$$

Figure C-9. Schwarz Bayesian Information Criterion (BIC).

Where:
• K = number of estimated parameters included in the SPF (i.e., number of variables plus the intercept), and
• Loglikelihood = statistical output reflecting the overall SPF fit (larger values indicate a better fit).

Assessment Tables

While the CURE plot method works well for continuous variables, it is not applicable to variables with few categories (e.g., a database with speed limits of 45, 55, and 65 miles per hour (mph)). For such variables, it is useful to develop a table of "calibration bias factors" that include factors for each category of the variable as in the example below.
Calibration bias factors are the sum of the observed crashes for the category divided by the sum of the predictions obtained when the calibration factor is applied. If this bias factor is less than 1.0, then the calibrated SPF is over-predicting for the category. If this bias factor is greater than 1.0, then the calibrated SPF is under-predicting. These bias factors can support a comparative assessment of two or more SPFs in conjunction with CURE plots for other measures. In the example in Table C-1, there are three categories of speed limit with corresponding observed crashes and calibration bias factors. As shown by the calibration bias factors, the SPF is over-predicting crashes at lower speed limits and under-predicting crashes at higher speed limits.
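The calibration bias factor calculation described above can be sketched as follows. This is a minimal illustration with hypothetical site data and function names; the predictions passed in are assumed to already include the calibration factor.

```python
from collections import defaultdict


def calibration_bias_factors(categories, observed, calibrated_predictions):
    """Sum of observed crashes over sum of calibrated predictions, per category."""
    observed_sum = defaultdict(float)
    predicted_sum = defaultdict(float)
    for category, obs, pred in zip(categories, observed, calibrated_predictions):
        observed_sum[category] += obs
        predicted_sum[category] += pred
    return {cat: observed_sum[cat] / predicted_sum[cat] for cat in observed_sum}


def bias_direction(factor):
    """Below 1.0 the calibrated SPF over-predicts; above 1.0 it under-predicts."""
    if factor < 1.0:
        return "over-predicting"
    if factor > 1.0:
        return "under-predicting"
    return "no bias indicated"


# Hypothetical sites: speed limit (mph), observed crashes, calibrated predictions.
speed_limit = [45, 45, 55, 55, 65]
observed = [3, 5, 4, 6, 8]
calibrated = [4.0, 6.0, 5.0, 5.0, 6.0]

factors = calibration_bias_factors(speed_limit, observed, calibrated)
for limit in sorted(factors):
    print(limit, round(factors[limit], 2), bias_direction(factors[limit]))
```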

Table C-1. Example of categorical variable assessment.

| Variable | 45 mph | 55 mph | 65 mph |
|---|---|---|---|
| Observed crashes | 200 | 320 | 275 |
| Number of sites | 30 | 35 | 40 |
| Calibration bias factor | 0.85 | 1.05 | 1.15 |

An assessment table can help to identify categories or levels of a given variable for which there is concern about the quality of the calibration process. Calibration bias factors less than 0.8 or greater than 1.2 indicate potential areas of concern, provided these factors are based on at least 100 crashes.

Summary of SPF Assessment

This section provides a quick reference summary of the key considerations in SPF assessment. Given a single SPF, the CURE plot and the CV of a constant calibration factor help to determine whether the calibrated SPF is acceptable. Given the choice from multiple SPFs, several goodness-of-fit measures can be used to determine the most suitable SPF for the local dataset, and subsequently the CURE plot and CV of the constant calibration factor can be used to determine if the preferred SPF is acceptable. The Calibrator generates these goodness-of-fit measures, but it does not indicate the preferred SPF or acceptability given the need for further research in this area.

Assessing the Acceptability of an SPF

An SPF with a constant calibration factor may be acceptable if either of the following conditions is met:

1. Five percent or less of CURE plot ordinates for fitted values (after applying the calibration factor) exceed the 2σ limits, or
2. The CV of a constant calibration factor is less than 0.15.

A calibration function should then be estimated that provides a unique calibration factor for each site. The function may be preferable if either of the above conditions is met (i.e., the calibrated SPF is acceptable based on a constant calibration factor) and the percent of CURE plot ordinates for fitted values (after applying the unique calibration factors) exceeding the 2σ limits is lower than that for the constant calibration factor.
It is likely that the function will then be preferable by other assessment measures (MAD, AIC, BIC). As a caution, if the constant calibration factor is not acceptable, but a calibration function shows less than five percent of CURE plot ordinates for fitted values exceeding the 2σ limits, this may be due to a small number of sites or crashes in the calibration dataset. Consider the sample size before adopting the calibration function. If there is a large sample, then the calibration function may be acceptable. If neither condition above is met, consider increasing the calibration sample. If, with the largest feasible calibration sample, neither condition above is met for the constant calibration factor, and the first condition is not met with the calibration function, then consider calibrating another existing SPF or developing a jurisdiction-specific SPF.

Comparing Multiple SPFs

Table C-2 presents seven measures for comparing the performance of multiple SPFs. For this comparison, first estimate a constant calibration factor for each SPF. For each measure, the SPFs are ranked numerically from 1 to n, where 1 represents the best SPF with respect to the given measure and n represents the number of alternative SPFs. To determine the aggregate ranking, sum the numeric rankings over the seven measures. The SPF with the lowest sum of ranks is the preferred SPF for calibration to a jurisdiction's data. There is an opportunity to then refine the preferred SPF based on a constant calibration factor by estimating a calibration function and assessing the SPF performance with this refinement. Note that there is still a need to determine if the preferred SPF is acceptable as outlined above based on the CURE plot and CV for the calibration factor or the CURE plot for the calibration function.

Table C-2. Summary of goodness-of-fit measures for ranking SPFs.

| Goodness-of-Fit Measure | Preferred Values | Ranking Method |
|---|---|---|
| Mean Absolute Deviation (MAD) | Smaller values | Smallest value is ranked number 1 |
| Modified R² | Larger values | Largest value is ranked number 1 |
| Constant Dispersion Parameter* | Smaller values | Smallest value is ranked number 1 |
| Coefficient of variation of the constant calibration factor (CV) | Smaller values | Smallest value is ranked number 1 |
| Percent of CURE plot ordinates for fitted values (after calibration) exceeding 2σ limits | Smaller values | Smallest value is ranked number 1 |
| Akaike's Information Criterion (AIC) | Smaller values | Smallest value is ranked number 1 |
| Bayesian Information Criterion (BIC) | Smaller values | Smallest value is ranked number 1 |

*Criterion is only considered where all original candidate SPFs have a constant dispersion parameter.

Appendix C References

Bahar, G. 2014. User's Guide to Develop Highway Safety Manual Safety Performance Function Calibration Factors. NCHRP Report HR 20-7(332). http://safety.fhwa.dot.gov/rsdp/toolbox-content.aspx?toolid=116.

Fridstrom, L., J. Ifver, S. Ingebrigtsen, R. Kulmala, and L. K. Thomsen. 1995. Measuring the contribution of randomness, exposure, weather, and daylight to the variation in road accident counts. Accident Analysis and Prevention, Vol. 27, pp. 1-20.

Hauer, E. 2015. The Art of Regression Modeling in Road Safety. Springer International Publishing Switzerland.

Lyon, C., B. Persaud, and F. Gross. 2018. The Calibrator: An SPF Calibration and Assessment Tool, Updated User Guide. Report FHWA-SA-17-016, Federal Highway Administration, Washington, DC.
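As a closing illustration of the rank-sum comparison described under Comparing Multiple SPFs, the sketch below ranks candidate SPFs on each measure and sums the ranks. It is a hedged outline rather than the procedure implemented in The Calibrator: the measure keys, the two candidate SPFs, and all values are hypothetical, ties are broken by input order for simplicity, and the logarithm in the BIC is assumed to be the natural logarithm.

```python
import math

# Only the modified R2 is ranked with the largest value first (Table C-2).
LARGER_IS_BETTER = {"modified_r2"}


def aic(loglikelihood, k):
    """AIC = -2 * loglikelihood + 2K (Figure C-8)."""
    return -2 * loglikelihood + 2 * k


def bic(loglikelihood, k, n_observations):
    """BIC = -2 * loglikelihood + K * log(number of observations) (Figure C-9)."""
    return -2 * loglikelihood + k * math.log(n_observations)


def rank_sums(measures_by_spf):
    """Sum each SPF's rank (1 = best) over every measure; lowest total is preferred."""
    names = list(measures_by_spf)
    totals = {name: 0 for name in names}
    for measure in measures_by_spf[names[0]]:
        best_first = sorted(
            names,
            key=lambda name: measures_by_spf[name][measure],
            reverse=measure in LARGER_IS_BETTER,
        )
        for rank, name in enumerate(best_first, start=1):
            totals[name] += rank
    return totals


# Hypothetical results for two candidate SPFs after constant calibration.
candidates = {
    "SPF A": {"mad": 1.2, "modified_r2": 0.60, "dispersion": 0.5, "cv": 0.10,
              "pct_cure_exceeding": 4.0, "aic": aic(-250.0, 5), "bic": 520.0},
    "SPF B": {"mad": 1.5, "modified_r2": 0.70, "dispersion": 0.6, "cv": 0.12,
              "pct_cure_exceeding": 6.0, "aic": aic(-248.5, 4), "bic": 518.0},
}

totals = rank_sums(candidates)
preferred = min(totals, key=totals.get)
print(totals, "preferred:", preferred)
```

The preferred SPF from this ranking would still need to pass the acceptability checks based on the CURE plot and the CV.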