The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.


APPENDIX B

Application of Safety Performance Functions in Other States or Time Periods

As previously stated, applying a safety performance function (SPF) in another state, or application in the same state for different years, requires the model to be recalibrated to reflect differences across time and space in factors such as collision reporting practices, weather, driver demographics, and wildlife movements.

Since the SPFs developed for this report used state-wide data, they should also be recalibrated where they are being applied to a specific subset of the roadway system. As an example, wildlife crossings will be installed in locations with significant wildlife populations, a history of animal-vehicle collisions, and other site characteristics which make crossings favorable, and which are not common on the entire road system. When an evaluation study of crossing effectiveness is undertaken, the suggestion of the research team is that areas with crossings be compared to areas without crossings, but which are as similar as possible to the treated segments. The SPFs are then recalibrated using these untreated segments as a reference group.

Recalibration Procedure

In the recalibration procedure, a multiplier is estimated to reflect these differences by first using the models to predict the number of collisions for a sample of sites for the new state or time period. The sum of the collisions for those sites is divided by the sum of the model predictions to derive the multiplier.

Step 1. Assemble Data: Assemble data and crash prediction models for the road segments of interest. For the time period of interest, obtain the count of animal-vehicle collisions and obtain or estimate the average AADT.

Step 2. Estimate Recalibration Multiplier: Apply the SPF to all sites. The sum of the observed collisions for those sites is divided by the sum of the SPF predictions to derive the multiplier.

Step 3. Recalibrate Dispersion Parameter, k:
a) For each segment, apply the recalibrated SPF from Step 2 to estimate the expected crash frequency, m.
b) A linear regression model is fit to the data as follows, where x is the collision frequency at a site:

Model: $y = a + k z$

where
$y = (m - x)^2 - m$ is the dependent variable,
$z = m^2$ is the independent variable,
a is an intercept term, and
k is the slope of the line and is equal to the dispersion parameter.

This model can be fit with many statistical or spreadsheet software packages. Alternatively, one can fit the model using the sample data and relatively simple equations as follows. Each segment, i, is an observation of $(z_i, y_i)$, $i = 1, \ldots, n$.

i) Calculate the sample means of the variables y and z, $\bar{y}$ and $\bar{z}$.
ii) Estimate the parameters a and k; the slope k is given by the following formula:

$$k = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})^2}$$

Worked Example

As an example, consider that it is desired to recalibrate the California model CA1 for use in Utah for the time period 1996 to 2000.

Step 1

The SPF to be applied is:

$$\text{total animal-vehicle collisions/mile-yr} = \exp(-7.8290)\,(\mathrm{AADT})^{0.6123}$$

The length, crash, and AADT data for all 3,699 rural two-lane roadway segments in Utah are assembled. For each site the total number of animal-vehicle collisions from 1996 to 2000 is summed and the average AADT for the same time period calculated.

Step 2

The SPF is applied to all sites and the observed collisions and predictions are summed.

sum of the observed collisions = 5,086
sum of SPF predictions = 933

The recalibration multiplier is calculated:

Multiplier = 5,086/933 = 5.45

The multiplier is very large, implying that the animal-vehicle collision frequency is much higher in Utah during 1996 to 2000 than in California during the time period the SPF was calibrated for (1991 to 2002).

The recalibrated SPF is:

$$\text{total animal-vehicle collisions/mile-yr} = 5.45\,\exp(-7.8290)\,(\mathrm{AADT})^{0.6123}$$

Step 3

The following calculations are performed for each segment:

$y = (m - x)^2 - m$
$z = m^2$

The average values are found to be:

$\bar{y} = 16.36$
$\bar{z} = 0.20$

$$k = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})^2} = \frac{174244}{98193} = 1.775$$

A2: Goodness-of-Fit (GOF) Tests for Assessing which SPF to Adopt

Adapted from Washington et al. (241)

Several GOF measures can be used to assess model performance. It is important to note at the outset that only after an assessment of many GOF criteria is made can the performance of a particular model or set of models be assessed. In addition, a model must be internally plausible and agree with known theory about collision causation and processes. The GOF measures used were:

Pearson's Product Moment Correlation Coefficients Between Observed and Predicted Crash Frequencies

Pearson's product moment correlation coefficient, usually denoted by r, is a measure of the linear association between the two variables $Y_1$ and $Y_2$ that have been measured on interval or ratio scales. A different correlation coefficient is needed when one or more variables are ordinal. Pearson's product moment correlation coefficient is given as:

$$r = \frac{\sum_{i} (Y_{i1} - \bar{Y}_1)(Y_{i2} - \bar{Y}_2)}{\left[\sum_{i} (Y_{i1} - \bar{Y}_1)^2 \sum_{i} (Y_{i2} - \bar{Y}_2)^2\right]^{1/2}}$$

where
$\bar{Y}$ = the mean of the $Y_i$ observations.

A model that predicts observed data perfectly will produce a straight line plot between observed ($Y_1$) and predicted values ($Y_2$) and will result in a correlation coefficient of exactly 1. Conversely, a linear correlation coefficient of 0 suggests a complete lack of a linear association between observed and predicted variables. The expectation during model validation is a high correlation coefficient. A low coefficient suggests that the model is not performing well and that variables influential in the calibration data are not as influential in the validation data. Random sampling error, which is expected, will not reduce the correlation coefficient significantly.

Mean Prediction Bias (MPB)

The MPB is the sum of predicted collision frequencies minus observed collision frequencies in the validation data set, divided by the number of validation data points. This statistic provides a measure of the magnitude and direction of the average model bias as compared to validation data. The smaller the absolute value of the average prediction bias, the better the model is at predicting observed data. The MPB can be positive or negative, and is given by:

$$\mathrm{MPB} = \frac{\sum_{i=1}^{n} (\hat{Y}_i - Y_i)}{n}$$

where
n = validation data sample size; and
$\hat{Y}_i$ = the fitted (predicted) value for observation $Y_i$.

A positive MPB suggests that on average the model overpredicts the observed validation data. Conversely, a negative value suggests systematic underprediction. The magnitude of MPB provides the magnitude of the average bias.
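As an illustration of the two measures just described, the following minimal Python sketch computes Pearson's r and the MPB from paired observed and predicted collision frequencies. The function names and the small data set are hypothetical, chosen only for illustration; they are not taken from the report.

```python
import math


def pearson_r(observed, predicted):
    """Pearson's product moment correlation between two equal-length lists."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Numerator: sum of cross-products of deviations from the means.
    cross = sum((o - mean_obs) * (p - mean_pred)
                for o, p in zip(observed, predicted))
    # Denominator: square root of the product of the sums of squared deviations.
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cross / math.sqrt(ss_obs * ss_pred)


def mean_prediction_bias(observed, predicted):
    """MPB: sum of (predicted - observed) divided by the number of points."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical validation data (made up for illustration only).
observed = [0, 2, 1, 4, 3]
predicted = [0.5, 1.5, 1.0, 3.0, 3.5]

r = pearson_r(observed, predicted)
mpb = mean_prediction_bias(observed, predicted)
print(f"r = {r:.3f}, MPB = {mpb:.3f}")
```

In this made-up data set the MPB is negative, which by the definition above would indicate that the predictions fall below the observations on average.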

Mean Absolute Deviation (MAD)

MAD is the sum of the absolute value of predicted validation observations minus observed validation observations, divided by the number of validation observations. It differs from MPB in that positive and negative prediction errors will not cancel each other out. Unlike MPB, MAD cannot be negative.

$$\mathrm{MAD} = \frac{\sum_{i=1}^{n} \left|\hat{Y}_i - Y_i\right|}{n}$$

where
n = validation data sample size.

The MAD gives a measure of the average magnitude of variability of prediction. Smaller values are preferred to larger values.

Mean Squared Prediction Error (MSPE) and Mean Squared Error (MSE)

MSPE is the sum of squared differences between observed and predicted collision frequencies, divided by sample size. MSPE is typically used to assess error associated with a validation or external data set. MSE is the sum of squared differences between observed and predicted collision frequencies, divided by the sample size minus the number of model parameters. MSE is typically a measure of model error associated with the calibration or estimation data, and so degrees of freedom are lost (p) as a result of producing $\hat{Y}$, the predicted response.

$$\mathrm{MSE} = \frac{\sum_{i=1}^{n_1} (Y_i - \hat{Y}_i)^2}{n_1 - p}$$

$$\mathrm{MSPE} = \frac{\sum_{i=1}^{n_2} (Y_i - \hat{Y}_i)^2}{n_2}$$

where
$n_1$ = estimation data sample size;
$n_2$ = validation data sample size; and
p = number of model parameters.

A comparison of MSPE and MSE reveals potential overfitting or underfitting of the models to the estimation data. An MSPE that is higher than MSE may indicate that the models have been overfit to the estimation data, and that some of the observed relationships may have been spurious instead of real. This finding could also indicate that important variables were omitted from the model or the model was misspecified. Finally, data inconsistencies could cause a relatively high value of MSPE. Values of MSPE and MSE that are similar in magnitude indicate that validation data fit the model similarly to the estimation data and that deterministic and stochastic components are stable across the comparison being made. Typically this is the desired result.
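The three error measures above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the choice of p = 2 parameters, and both small data sets are hypothetical and are not drawn from the report.

```python
def mean_absolute_deviation(observed, predicted):
    """MAD: mean of |predicted - observed| over the validation data."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def mean_squared_error(observed, fitted, n_params):
    """MSE on the estimation data: squared-error sum divided by (n1 - p)."""
    n1 = len(observed)
    if n1 <= n_params:
        raise ValueError("sample size must exceed the number of parameters")
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return sse / (n1 - n_params)


def mean_squared_prediction_error(observed, predicted):
    """MSPE on the validation data: squared-error sum divided by n2."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return sse / len(observed)


# Hypothetical estimation and validation data (made up for illustration).
est_observed = [1, 3, 2, 5, 4]
est_fitted = [1.5, 2.5, 2.0, 4.0, 4.5]
val_observed = [0, 2, 1, 4, 3]
val_predicted = [0.5, 1.5, 1.0, 3.0, 3.5]

mad = mean_absolute_deviation(val_observed, val_predicted)
mse = mean_squared_error(est_observed, est_fitted, n_params=2)  # assumed p = 2
mspe = mean_squared_prediction_error(val_observed, val_predicted)
print(f"MAD = {mad:.3f}, MSE = {mse:.3f}, MSPE = {mspe:.3f}")
```

Comparing the printed MSE and MSPE follows the reasoning in the text: values of similar magnitude would suggest the model behaves on the validation data much as it did on the estimation data.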