validating the models and determining their statistical accuracy. The statistical performance of the models was assessed by the following methods:

1. Mean Prediction Bias (MPB): this measure is an estimate of the direction and magnitude of the average bias of the predictions (47). MPB considers the differences between predicted and actual values; a positive value indicates that the model overpredicts crashes. Smaller absolute values of MPB indicate a better predictive model.
2. Mean Absolute Deviation (MAD): this measure is the average dispersion of the model (47); it can be used to estimate the absolute value of the difference between predicted and observed values. An estimate close to zero suggests that the model predicts the actual values well.
3. Mean Square Prediction Error (MSPE): this measure is an assessment of the error associated with the validation dataset and is the sum of the squared differences between the predicted and actual values, divided by the validation sample size (47). MSPE is used in conjunction with the mean squared error (MSE), which is similar in concept; the two measures differ in the denominator. For the MSPE, the denominator is the sample size used for the validation dataset; for the MSE, it is the sample size used for estimating the model less the number of variables included in the model (i.e., the degrees of freedom). This difference is an indicator of how well the validation dataset fits the data. An MSPE value larger than the MSE value indicates that the model shows signs of overfitting (i.e., that the model incorporates too many parameters) and that some of the relationships observed may be spurious.
4. Sum of Model Deviance (SMD): SMD is a measure of the model's goodness-of-fit; a value of 0 indicates a perfect fit (48). In practical terms, SMD represents a lower bound limit for the observed values. The model with the lowest SMD is considered the model with the best fit for predicting crashes when several models are compared.
5. R2-like Measure of Fit (RMF): this measure provides an estimate similar to the R2 commonly used in linear regression (which is not appropriate for GLMs) and is calculated from the residual sum of squares and total sum of squares after the model is applied to the data (48).

The measures described above were used to estimate the goodness-of-fit of the models by applying them to the validation datasets. The values obtained for these measures are shown in Tables 13 and 14. Overall, the models for single-vehicle crashes fitted the data better than did the models for multi-vehicle and all crashes. Despite these differences, the measures show that the models perform adequately.

Table 13. Measures of goodness-of-fit for all crashes.

Divided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.90   5.20   157.75   4261.49   0.8425
Multi        1.42   5.31   188.03   5196.85   0.6816
All          2.47   9.44   532.68   6548.12   0.8028

Undivided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.69   4.49    93.21   1001.04   0.7837
Multi        1.07   5.66   202.48   1460.82   0.6598
All          1.78   8.51   469.45   1585.85   0.8055

Table 14. Measures of goodness-of-fit for injury crashes.

Divided Highways
Crash Type   MPB    MAD    MSPE    SMD       R2
Single       0.18   1.70   13.22   1983.99   0.7982
Multi        0.39   1.56   18.19   1774.24   0.7435
All          0.57   2.78   47.76   2684.18   0.8109

Undivided Highways
Crash Type   MPB    MAD    MSPE    SMD      R2
Single       0.28   1.91   16.85   542.65   0.6974
Multi        0.36   1.54   19.47   498.11   0.4094
All          0.70   2.90   64.80   684.69   0.6558

Trade-Offs from Models' AMFs

Two regression model methods can be used to estimate AMFs (or the effects of changes in geometric design features). Both were described above, and the method selected for this research is presented here. The chosen method consists of estimating AMFs directly from the coefficients of statistical models using Equation 2. This method provides a simple way to estimate the effects of changes in geometric design features.

Divided Highways

Single-Vehicle Crashes

Four variables had single-value AMFs: functional classification, paved shoulder, median barrier, and left turn lane presence. The AMF for functional classification was 1.50, indicating a 50% increase in crashes for arterial roads in comparison to other roadways. For paved shoulder, it was 1.18, indicating an 18% increase for roads with a paved shoulder. For median barrier, it was 2.72, indicating a 172% increase in crashes for roads with a barrier compared to roads without one. For left turn lane presence, it was 0.72, indicating a 28% reduction when a left turn lane was present. AMFs were also computed for shoulder width, yielding a 6% reduction per foot increase of average (left and right) shoulder width.
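
The percentage changes quoted for the single-value AMFs follow from simple arithmetic: an AMF of 1.0 means no change in expected crashes, so the implied change is (AMF - 1) x 100%. The short Python sketch below illustrates that conversion with the four AMFs reported for single-vehicle crashes on divided highways. The function name `amf_to_percent_change` is illustrative only and does not come from the report, and the sketch does not reproduce Equation 2.

```python
def amf_to_percent_change(amf: float) -> float:
    """Percent change in expected crashes implied by an AMF (1.0 = no change)."""
    return (amf - 1.0) * 100.0


# Single-value AMFs reported for single-vehicle crashes on divided highways.
reported_amfs = {
    "functional classification (arterial)": 1.50,
    "paved shoulder": 1.18,
    "median barrier": 2.72,
    "left turn lane presence": 0.72,
}

for feature, amf in reported_amfs.items():
    change = amf_to_percent_change(amf)
    # A positive change is an increase in crashes, a negative one a reduction.
    direction = "increase" if change > 0 else "reduction"
    print(f"{feature}: AMF {amf:.2f} -> {abs(change):.0f}% {direction}")
```

An AMF above 1.0 therefore reads as an increase in crashes and an AMF below 1.0 as a reduction, matching the interpretation given in the text.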
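
For the first three validation measures defined in the list of methods (MPB, MAD, and MSPE), the verbal definitions can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and the toy predicted and observed counts are hypothetical and not taken from the report, and each measure is averaged over the validation sample size as the text describes. SMD and the R2-like measure are omitted because their exact formulas are not given on this page.

```python
from typing import Sequence


def _check(predicted: Sequence[float], observed: Sequence[float]) -> int:
    """Return the sample size after checking the inputs are usable."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    return len(observed)


def mean_prediction_bias(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """MPB: average of (predicted - observed); positive means overprediction."""
    n = _check(predicted, observed)
    return sum(p - o for p, o in zip(predicted, observed)) / n


def mean_absolute_deviation(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """MAD: average absolute difference between predicted and observed values."""
    n = _check(predicted, observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n


def mean_square_prediction_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """MSPE: sum of squared differences divided by the validation sample size."""
    n = _check(predicted, observed)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n


# Hypothetical validation data: predicted vs. observed crash counts per site.
predicted = [3.0, 5.0, 2.0, 8.0]
observed = [2, 6, 2, 5]

print("MPB :", mean_prediction_bias(predicted, observed))
print("MAD :", mean_absolute_deviation(predicted, observed))
print("MSPE:", mean_square_prediction_error(predicted, observed))
```

In this toy case the MPB is positive, so the hypothetical model overpredicts on average. The MAD is larger than the MPB because over- and underpredictions cancel in the MPB but not in the MAD.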