The National Academies of Sciences, Engineering, and Medicine
500 Fifth St. N.W. | Washington, D.C. 20001

Copyright © National Academy of Sciences. All rights reserved.

validating the models and determining their statistical accuracy. The statistical performance of the models was assessed by the following methods:

1. Mean Prediction Bias (MPB): this measure is an estimate of the direction and magnitude of the average bias of the predictions (47). MPB considers the differences between predicted and actual values; a positive value indicates that the model overpredicts crashes. Smaller absolute values of MPB indicate a better predictive model.
2. Mean Absolute Deviation (MAD): this measure is the average dispersion of the model (47); it estimates the average absolute difference between predicted and observed values. An estimate close to zero suggests that the model predicts the actual values well.
3. Mean Square Prediction Error (MSPE): this measure is an assessment of the error associated with the validation dataset and is the sum of the squared differences between the predicted and actual values (47). MSPE is used in conjunction with the mean squared error (MSE), which is similar in concept: the difference between the two measures is related to the denominator. For the MSPE, the denominator is the sample size used for the validation dataset; for the MSE, it is the sample size used for estimating the model less the number of variables included in the model (i.e., the degrees of freedom). This difference is an indicator of how well the validation dataset fits the data. An MSPE value larger than the MSE value indicates that the model shows signs of overfitting (i.e., that the model incorporates too many parameters) and that some of the relationships observed may be spurious.
4. Sum of Model Deviance (SMD): SMD is a measure of the model's goodness-of-fit; a value of 0 indicates a perfect fit (48). In practical terms, SMD represents a lower bound for the observed values. The model with the lowest SMD is considered the model with the best fit for predicting crashes when several models are compared.
5. R2-like Measure of Fit (RMF): this measure provides an estimate similar to the R2 commonly used in linear regression, which is not itself appropriate for GLMs; it is calculated from the residual sum of squares and total sum of squares after the model is applied to the data (48).

The measures described above were used to estimate the goodness-of-fit of the models by applying them to the validation datasets. The values obtained for these measures are shown in Tables 13 and 14. Overall, the models for single-vehicle crashes fitted the data better than did the models for multi-vehicle and all crashes. Despite these differences, the measures show that the models perform adequately.

Table 13. Measures of goodness-of-fit for all crashes.

Divided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.90   5.20   157.75   4261.49   0.8425
Multi        1.42   5.31   188.03   5196.85   0.6816
All          2.47   9.44   532.68   6548.12   0.8028

Undivided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.69   4.49    93.21   1001.04   0.7837
Multi        1.07   5.66   202.48   1460.82   0.6598
All          1.78   8.51   469.45   1585.85   0.8055

Table 14. Measures of goodness-of-fit for injury crashes.

Divided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.18   1.70    13.22   1983.99   0.7982
Multi        0.39   1.56    18.19   1774.24   0.7435
All          0.57   2.78    47.76   2684.18   0.8109

Undivided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.28   1.91    16.85    542.65   0.6974
Multi        0.36   1.54    19.47    498.11   0.4094
All          0.70   2.90    64.80    684.69   0.6558

Trade-Offs from Models' AMFs

Two regression model methods can be used to estimate AMFs (or the effects of changes in geometric design features). Both were described above, and the method selected for this research is presented here. The chosen method consists of estimating AMFs directly from the coefficients of statistical models using Equation 2. This method provides a simple way to estimate the effects of changes in geometric design features.

Divided Highways

Single-Vehicle Crashes

Four variables had single-value AMFs: functional classification, paved shoulder, median barrier, and left turn lane presence. The AMF for the functional classification was 1.50, indicating a 50% increase for arterial roads in comparison to other roadways. For paved shoulder, it was 1.18, indicating an 18% increase for roads with paved shoulder. For median barrier, it was 2.72, indicating a 172% increase in crashes for roads with a barrier compared to roads without one. For left turn lane presence, it was 0.72, indicating a 28% reduction when a left turn lane was present. AMFs were also computed for shoulder width, yielding a 6% reduction per foot increase of average (left and right) shoulder width.
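
Two pieces of arithmetic from this page can be sketched in code: the first three validation measures (MPB, MAD, and MSPE, with MSE for comparison) and the conversion of a single-value AMF into a percent change in crashes. This is a minimal illustration, not the procedure used in the research. The function names are hypothetical, the toy crash counts are invented for demonstration only, and averaging MPB and MAD over the number of observations is an assumption based on the words "average bias" and "average dispersion" in the definitions.

```python
from typing import Sequence


def mean_prediction_bias(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average of (predicted - observed); positive means the model overpredicts."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


def mean_absolute_deviation(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def mean_square_prediction_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Sum of squared differences divided by the validation sample size."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def mean_squared_error(predicted: Sequence[float], observed: Sequence[float], n_params: int) -> float:
    """Sum of squared differences on the estimation data divided by
    (sample size - number of model variables), i.e. the degrees of freedom."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / (len(observed) - n_params)


def amf_percent_change(amf: float) -> float:
    """Percent change in crashes implied by an AMF (1.50 -> +50, 0.72 -> -28)."""
    return (amf - 1.0) * 100.0


if __name__ == "__main__":
    # Toy values invented for illustration; they are not data from the report.
    predicted = [3.0, 5.0, 2.0, 8.0]
    observed = [2.0, 6.0, 2.0, 7.0]
    print(mean_prediction_bias(predicted, observed))          # 0.25
    print(mean_absolute_deviation(predicted, observed))       # 0.75
    print(mean_square_prediction_error(predicted, observed))  # 0.75
    print(mean_squared_error(predicted, observed, 1))         # 1.0

    # AMF values quoted in the text for single-vehicle crashes on divided highways.
    for name, amf in [("functional classification", 1.50), ("paved shoulder", 1.18),
                      ("median barrier", 2.72), ("left turn lane", 0.72)]:
        print(name, round(amf_percent_change(amf), 1))
```

In practice the MSPE would be computed on the validation dataset and the MSE on the estimation dataset, and the two would then be compared as described in item 3.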