APPENDIX B

Application of Safety Performance Functions in Other States or Time Periods

As previously stated, applying a safety performance function (SPF) in another state, or in the same state for different years, requires the model to be recalibrated to reflect differences across time and space in factors such as collision reporting practices, weather, driver demographics, and wildlife movements.

Since the SPFs developed for this report used state-wide data, they should also be recalibrated where they are applied to a specific subset of the roadway system. As an example, wildlife crossings will be installed in locations with significant wildlife populations, a history of animal–vehicle collisions, and other site characteristics that make crossings favorable and that are not common on the road system as a whole. When an evaluation study of crossing effectiveness is undertaken, the research team suggests that areas with crossings be compared to areas without crossings that are otherwise as similar as possible to the treated segments. The SPFs are then recalibrated using these untreated segments as a reference group.

Recalibration Procedure

In the recalibration procedure, a multiplier is estimated to reflect these differences: the models are first used to predict the number of collisions for a sample of sites in the new state or time period, and the sum of the observed collisions for those sites is divided by the sum of the model predictions to derive the multiplier.

Step 1. Assemble Data: Assemble data and crash prediction models for the road segments of interest. For the time period of interest, obtain the count of animal–vehicle collisions and obtain or estimate the average AADT.

Step 2. Estimate Recalibration Multiplier: Apply the SPF to all sites. The sum of the observed collisions for those sites is divided by the sum of the SPF predictions to derive the multiplier.

Step 3. Recalibrate Dispersion Parameter, k:

a) For each segment, apply the recalibrated SPF from Step 2 to estimate the expected crash frequency, m.

b) Fit a linear regression model to the data as follows, where x is the collision frequency at a site:

Model: y = a + k·z

where,
y = (m − x)² − m is the dependent variable;
z = m² is the independent variable;
a is an intercept term; and
k is the slope of the line and is equal to the dispersion parameter.

This model can be fit with many statistical or spreadsheet software packages. Alternatively, one can fit the model using the sample data and relatively simple equations as follows, where each segment, i, is an observation of (zᵢ, yᵢ), i = 1, . . . , n:

i) Calculate the sample means of the variables y and z, denoted ȳ and z̄; and

ii) Estimate the parameters k and a using:

k = Σᵢ(zᵢ − z̄)(yᵢ − ȳ) / Σᵢ(zᵢ − z̄)²

a = ȳ − k·z̄

Worked Example

As an example, consider that it is desired to recalibrate the California model CA1 for use in Utah for the time period 1996 to 2000.

Step 1

The SPF to be applied is:
total animal–vehicle collisions/mile-yr = exp(−7.8290) × (AADT)^0.6123

The length, crash, and AADT data for all 3,699 rural two-lane roadway segments in Utah are assembled. For each site, the total number of animal–vehicle collisions from 1996 to 2000 is summed and the average AADT for the same time period is calculated.

Step 2

The SPF is applied to all sites, and the observed collisions and predictions are summed:

sum of the observed collisions = 5,086
sum of SPF predictions = 933

The recalibration multiplier is calculated:

Multiplier = 5,086/933 = 5.45

The multiplier is very large, implying that the animal–vehicle collision frequency was much higher in Utah during 1996 to 2000 than in California during the time period for which the SPF was calibrated (1991 to 2002). The recalibrated SPF is:

total animal–vehicle collisions/mile-yr = 5.45 × exp(−7.8290) × (AADT)^0.6123

Step 3

The following calculations are performed for each segment:

y = (m − x)² − m
z = m²

The average values are found to be:

ȳ = 16.36
z̄ = 0.20

The dispersion parameter is then estimated as:

k = Σᵢ(zᵢ − z̄)(yᵢ − ȳ) / Σᵢ(zᵢ − z̄)² = 174,244/98,193 = 1.775

A2: Goodness-of-Fit (GOF) Tests for Assessing Which SPF to Adopt

Adapted from Washington et al. (241)

Several GOF measures can be used to assess model performance. It is important to note at the outset that the performance of a particular model or set of models can be assessed only after many GOF criteria have been examined. In addition, a model must be internally plausible and agree with known theory about collision causation and processes. The GOF measures used were:

• Pearson's Product Moment Correlation Coefficient Between Observed and Predicted Crash Frequencies

Pearson's product moment correlation coefficient, usually denoted by r, is a measure of the linear association between two variables Y₁ and Y₂ that have been measured on interval or ratio scales. A different correlation coefficient is needed when one or more variables is ordinal. Pearson's product moment correlation coefficient is given as:

r = Σᵢ(Y₁ᵢ − Ȳ₁)(Y₂ᵢ − Ȳ₂) / [Σᵢ(Y₁ᵢ − Ȳ₁)² · Σᵢ(Y₂ᵢ − Ȳ₂)²]^(1/2)

where Ȳ₁ and Ȳ₂ are the means of the observed and predicted values, respectively.

A model that predicts the observed data perfectly will produce a straight-line plot between observed (Y₁) and predicted (Y₂) values and will result in a correlation coefficient of exactly 1. Conversely, a linear correlation coefficient of 0 suggests a complete lack of linear association between observed and predicted variables. The expectation during model validation is a high correlation coefficient. A low coefficient suggests that the model is not performing well and that variables influential in the calibration data are not as influential in the validation data. Random sampling error, which is expected, will not reduce the correlation coefficient significantly.

• Mean Prediction Bias (MPB)

The MPB is the sum of predicted collision frequencies minus observed collision frequencies in the validation data set, divided by the number of validation data points. This statistic provides a measure of the magnitude and direction of the average model bias as compared to the validation data. The smaller the average prediction bias, the better the model is at predicting observed data. The MPB can be positive or negative, and is given by:

MPB = Σᵢ(Ŷᵢ − Yᵢ) / n,  i = 1, . . . , n

where n = validation data sample size; Ŷᵢ = the fitted value; and Yᵢ = the ith observation.

A positive MPB suggests that, on average, the model overpredicts the observed validation data. Conversely, a negative value suggests systematic underprediction. The magnitude of MPB gives the magnitude of the average bias.
⢠Mean Absolute Deviation (MAD) MAD is the sum of the absolute value of predicted valida- tion observations minus observed validation observations, divided by the number of validation observations. It differs from MPB in that positive and negative prediction errors will not cancel each other out. Unlike MPB, MAD can only be positive. where n = validation data sample size. The MAD gives a measure of the average magnitude of vari- ability of prediction. Smaller values are preferred to larger values. ⢠Mean Squared Prediction Error (MSPE) and Mean Squared Error (MSE) MSPE is the sum of squared differences between observed and predicted collision frequencies, divided by sample size. MSPE is typically used to assess error associated with a vali- dation or external data set. MSE is the sum of squared differences between observed and predicted collision fre- quencies, divided by the sample size minus the number of model parameters. MSE is typically a measure of model MAD Y Y n i i i n = â = â Ë 1 error associated with the calibration or estimation data, and so degrees of freedom are lost (p) as a result of producing Yhat, the predicted response. where n1 = estimation data sample size; and n2 = validation data sample size. A comparison of MSPE and MSE reveals potential overï¬tting or underï¬tting of the models to the estimation data. An MSPE that is higher than MSE may indicate that the models may have been overï¬t to the estimation data, and that some of the observed relationships may have been spurious instead of real. This ï¬nding could also indicate that important vari- ables were omitted from the model or the model was mis- speciï¬ed. Finally, data inconsistencies could cause a relatively high value of MSPE. Values of MSPE and MSE that are sim- ilar in magnitude indicate that validation data ï¬t the model similar to the estimation data and that deterministic and sto- chastic components are stable across the comparison being made. Typically this is the desired result. 
MPSE = â( ) = â Y Y n i i i n Ë 2 1 2 MSE = â( ) â = â Y Y n p i i i n Ë 2 1 1 134
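To tie the appendix together, the recalibration procedure of Steps 1–3 and the remaining GOF measures can be sketched in a few lines of Python. The AADT and collision counts below are hypothetical; only the SPF coefficients (−7.8290 and 0.6123) come from the CA1 model quoted in the worked example:

```python
import math

# --- Recalibration (Steps 1-3) on hypothetical site data. ---
aadt = [1200.0, 3500.0, 800.0, 5600.0, 2100.0]   # average AADT per segment
observed = [4.0, 9.0, 2.0, 14.0, 6.0]            # animal-vehicle collisions/mile-yr

def spf(a):
    # Base (uncalibrated) CA1-form SPF prediction.
    return math.exp(-7.8290) * a ** 0.6123

# Step 2: multiplier = sum of observed / sum of SPF predictions.
multiplier = sum(observed) / sum(spf(a) for a in aadt)

# Step 3: regress y = (m - x)^2 - m on z = m^2; the slope is the
# recalibrated dispersion parameter k.
m = [multiplier * spf(a) for a in aadt]
y = [(mi - xi) ** 2 - mi for mi, xi in zip(m, observed)]
z = [mi ** 2 for mi in m]
ybar, zbar = sum(y) / len(y), sum(z) / len(z)
k = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y)) \
    / sum((zi - zbar) ** 2 for zi in z)

# --- Remaining GOF measures (Section A2). ---
def mad(obs, pred):
    # Mean Absolute Deviation: average absolute prediction error.
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def mspe(obs, pred):
    # Mean Squared Prediction Error: validation data, divide by n2.
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)

def mse(obs, pred, p_params):
    # Mean Squared Error: estimation data, divide by n1 - p.
    return sum((o - pr) ** 2 for o, pr in zip(obs, pred)) / (len(obs) - p_params)

pred = [multiplier * spf(a) for a in aadt]
print(multiplier, k, mad(observed, pred), mspe(observed, pred), mse(observed, pred, 2))
```

Note that after recalibration the predictions sum to the observed total by construction, so MPB on the calibration sites is zero; MAD and MSPE remain informative about site-level error.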