APPENDIX E
SUMMARY OF GOODNESS-OF-FIT MEASURES AND STATISTICAL TERMS

Mean Prediction Bias (MPB)

The mean prediction bias (MPB) is the sum of predicted accident frequencies minus observed accident frequencies in the validation data set, divided by the number of validation data points. This statistic measures the magnitude and direction of the average model bias: the smaller the average prediction bias, the better the model is at predicting the observed data. The MPB can be positive or negative and is given by:

$$MPB = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{Y}_i - Y_i\right)$$

where n = validation data sample size, $\hat{Y}_i$ = the predicted value, and $Y_i$ = observation i.

A positive MPB suggests that on average the model overpredicts the observed validation data; conversely, a negative value suggests systematic underprediction. The magnitude of MPB gives the magnitude of the average bias.

Mean Absolute Deviation (MAD)

The mean absolute deviation (MAD) is the sum of the absolute values of the predicted values minus the observed values, divided by the number of observations. It differs from the mean prediction bias in that positive and negative prediction errors do not cancel each other out; unlike MPB, MAD can only be positive (or zero).

$$MAD = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{Y}_i - Y_i\right|$$

where n = validation data sample size.

The MAD measures the average magnitude of prediction variability. Smaller values are preferred to larger values.

Mean Squared Prediction Error (MSPE) and Mean Squared Error (MSE)

The mean squared prediction error (MSPE) is the sum of squared differences between observed and predicted crash frequencies, divided by the sample size. MSPE is typically used to assess the error associated with a validation or external data set. Smaller values are preferred to larger values.

$$MSPE = \frac{1}{n_2}\sum_{i=1}^{n_2}\left(Y_i - \hat{Y}_i\right)^2$$

NCHRP Web-Only Document 94: Appendixes to NCHRP Report 572: Roundabouts in the United States E-1
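The three measures defined above can be sketched in a few lines of Python. This is an illustrative implementation following the definitions in this appendix; the function and variable names are our own, not from the report.

```python
def gof_measures(observed, predicted):
    """Compute MPB, MAD, and MSPE for a validation data set.

    observed, predicted: equal-length sequences of crash frequencies.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mpb = sum(errors) / n                  # signed average bias (can cancel)
    mad = sum(abs(e) for e in errors) / n  # average absolute error (cannot cancel)
    mspe = sum(e * e for e in errors) / n  # mean squared prediction error
    return mpb, mad, mspe

# Example: a model that overpredicts slightly on average
mpb, mad, mspe = gof_measures([2, 0, 3, 1], [2.5, 0.5, 2.5, 1.5])
# mpb > 0 indicates average overprediction of the validation data
```

Note how MPB and MAD differ on the same data: the one underprediction (3 observed vs. 2.5 predicted) partially cancels the overpredictions in MPB, but contributes its full magnitude to MAD.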
where $n_2$ = data sample size.

To normalize the GOF measures to compensate for the different numbers of years associated with different data sets, the measures can be computed on a per-year basis. For MPB and MAD per year, MPB and MAD are divided by the number of years. However, since MSPE is the mean of the squared errors, MSPE is divided by the square of the number of years to calculate MSPE per year, yielding a fair comparison of predictions based on different numbers of years.

Other Parameters

Wald's 95% Confidence Limits - Parameter estimates are not known exactly; each has a point estimate and a standard error. The 95% confidence limits give a range of values within which it can be said that the true value lies with 95% certainty. In other words, there is a 5% chance that the true value of the parameter lies outside this range.

Chi-Squared Statistic - Calculated by squaring the ratio of the parameter estimate to its standard error.

Pr > ChiSq - The probability that the chi-squared statistic would exceed the calculated value if the true value of the parameter were zero. This statistic is often referred to as the p-value; thus, a value of less than 0.05 indicates significance at the 5% level.

Dispersion Parameter - The calibrated dispersion parameter of the negative binomial distribution relates the mean to the variance of the model prediction; the smaller its value, the better the fit. A perfect model would have a dispersion parameter close to zero.

Correlation Coefficients - Correlation coefficients convey both the strength and the direction of a linear relationship between two numeric random variables. If one variable y is an exact increasing linear function of another variable x, the correlation is 1; if the exact linear relationship is inverse, the correlation is -1. If there is no linear predictability between the two variables, the correlation is 0.
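The per-year normalization described above can be sketched as follows. This is a simple illustration of the scaling rule, with hypothetical names and numbers, not code from the report:

```python
def per_year(mpb, mad, mspe, years):
    """Normalize GOF measures to a per-year basis.

    MPB and MAD scale linearly with the number of years, so they are
    divided by `years`; MSPE is a mean of squared errors, so it is
    divided by the square of `years`.
    """
    return mpb / years, mad / years, mspe / years ** 2

# Example: measures computed from 3 years of validation data
mpb_y, mad_y, mspe_y = per_year(6.0, 12.0, 45.0, 3)
# mpb_y = 2.0, mad_y = 4.0, mspe_y = 5.0
```

Dividing MSPE by years squared (rather than by years) keeps the comparison fair: if every annual error were identical, total error over k years would be k times larger, and its square k-squared times larger.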
If the variables are normally distributed and the correlation is 0, the two variables are independent. However, correlation does not imply causality, although in some cases an underlying causal relationship may exist.
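The behavior of the correlation coefficient described above can be demonstrated with a direct implementation of the Pearson formula (an illustrative sketch; the function name is our own):

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Covariance and standard-deviation terms (unnormalized; the
    # shared factor of n cancels in the ratio)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# An exact increasing linear relationship gives r = 1;
# an exact inverse linear relationship gives r = -1.
x = [1, 2, 3, 4]
r_pos = pearson_r(x, [2 * v + 1 for v in x])   # close to 1
r_neg = pearson_r(x, [-3 * v for v in x])      # close to -1
```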