National Academies Press: OpenBook

Impact of Shoulder Width and Median Width on Safety (2009)

Chapter: Chapter 3 - Data Analysis

Suggested Citation:"Chapter 3 - Data Analysis." National Academies of Sciences, Engineering, and Medicine. 2009. Impact of Shoulder Width and Median Width on Safety. Washington, DC: The National Academies Press. doi: 10.17226/14252.

The first section of this chapter presents the methodological approach and related issues. The second section presents the data used in the development of the prediction models and AMFs.

Methodology

Over the past decades, interest has increased in estimating the safety implications of changes in various design elements. To determine these effects, models were developed that could predict the crash rate or the number of crashes as a function of various traffic conditions and values of geometric elements. A significant part of past research was devoted to developing such models; in the past decade, most researchers have used negative binomial models for modeling crashes. These models assume that unobserved crash variation across roadway segments is gamma-distributed, while crashes within sites are Poisson-distributed (41). The Poisson, Poisson-Gamma (negative binomial), and other related models are collectively called “generalized linear models” (GLM). These models have the general form of Equation 1:

E[N] = EXPO \cdot e^{b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n}    (1)

where
E[N] = predicted number of crashes per year for a roadway section,
EXPO = exposure to crashes,
b_0, . . . , b_n = regression coefficients, and
X_1, . . . , X_n = predictor variables.

Models similar to Equation 1 can identify the relationship between the number of crashes and the various elements to be considered. The measure of exposure used in these prediction models could be either the traditional vehicle-miles (i.e., length × Average Daily Traffic (ADT) volume) or the length itself, in which case the ADT becomes a predictor variable.

Negative binomial models are typically used in developing Crash Reduction Factors (CRFs) or Accident Modification Factors (AMFs). Even though these two terms are similar in concept, there are slight differences.
A CRF is a value that represents the reduction of crashes due to a safety improvement at a roadway spot or section. Such values represent the percent improvement on the roadway and most often have a positive connotation; that is, the safety intervention will have a positive result. On the other hand, an AMF is a constant that represents the safety change due to a change in the value of a segment characteristic. These factors are typically the ratio of the expected values of crashes with and without the change. AMFs are also used as multipliers for estimating the expected number of crashes, and values less than 1.0 indicate fewer crashes as a result of the change.

The basic concept of the AMF is to capture the change in crash frequency due to the change of a single element. However, this is often not the case, and these factors have been developed using cross-sectional studies where multivariate models were developed and used in the determination of AMFs. The models typically include all contributing factors that could influence safety and then use them to estimate the change in crashes due to a change in one unit of the variable of concern. This approach is typically completed with the assistance of an expert panel that evaluates the use of the prediction models and estimates the potential effect for each variable of concern. These evaluations could be further supported by the existing literature and current knowledge for the specific variable. This approach was used in the two-lane rural roadway models as part of the IHSDM, where the models developed were used as the basis for the creation of the AMFs.

AMFs may appear subjective in nature, but they represent a collective “wisdom” based on expert panel knowledge, field observation, and findings in the research literature. The key limitation of this approach for AMF development is that there may not be adequate literature dealing with the identification of the safety impacts from the elements of interest.
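The distinction between the two factors can be illustrated with a short numerical sketch. It assumes the common convention that a CRF expressed in percent equals 100 × (1 − AMF); that conversion, the function names, and the crash counts are illustrative assumptions and are not taken from this study.

```python
def amf(expected_with_change: float, expected_without_change: float) -> float:
    """AMF as the ratio of expected crashes with the change to those without it."""
    if expected_without_change <= 0:
        raise ValueError("expected crashes without the change must be positive")
    return expected_with_change / expected_without_change


def crf_percent(amf_value: float) -> float:
    """Percent crash reduction implied by an AMF.

    Assumed convention (not stated in the text): CRF = 100 * (1 - AMF).
    """
    return 100.0 * (1.0 - amf_value)


# Hypothetical expected crash counts, for illustration only.
factor = amf(expected_with_change=8.0, expected_without_change=10.0)
reduction = crf_percent(factor)
print(f"AMF = {factor:.2f}, CRF = {reduction:.0f}%")
```

An AMF below 1.0 corresponds to a positive CRF (fewer crashes), while an AMF above 1.0 yields a negative value, which is one reason the multiplier form is more convenient for changes that may increase crashes.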

Currently, there are two methods that can be used for estimating AMFs using regression models. The first method consists of estimating AMFs directly from the coefficients of statistical models. This method has been used by Lord and Bonneson (42) for estimating AMFs for rural frontage roads in Texas. Washington et al. (41) used a similar approach in their study. The AMFs are estimated in the following way:

AMF_j = e^{\beta_j \times [x_j - y_j]}    (2)

where
x_j = range of values or a specific value investigated (e.g., lane width, shoulder width, etc.) for AMF j;
y_j = baseline conditions or average conditions for the variable x_j (when needed or available); and
β_j = regression coefficient associated with the variable j.

This method provides a simple way to estimate the effects of changes in geometric design features. However, although the variables are supposed to be independent, they may be correlated, which could affect the coefficients of the model. The Variance Inflation Factor (VIF) can be used for detecting correlated variables, but this procedure usually flags only extreme cases of correlation among variables (43).

The second method consists of estimating the AMF using baseline models and applying them to data that do not meet the nominal conditions (41). These models are developed using data that reflect nominal conditions commonly used by design engineers or could also reflect the average values for some input variables. Such models usually include only traffic flow as the input variable. Examples of nominal conditions for rural four-lane undivided highways may include 12-ft lane and 8-ft shoulder widths, straight sections, and so forth. It is anticipated that by controlling the input variables, the models will more accurately estimate the safety performance of the facility for the given input conditions. However, an important drawback to developing baseline models is associated with the smaller sample size. Because the input data only include data meeting the nominal conditions, the sample size can be significantly reduced. This reduction can (1) affect the model stability, especially if the sample mean value is low (44); (2) increase the model error (variance); and (3) decrease the statistical power of the model. Baseline models are currently used for the HSM (45).

The second method was proposed by Washington et al. (41), who have re-calibrated models for estimating the safety performance of rural signalized and unsignalized intersections. For this method, the baseline model is first applied to sites not meeting all of the baseline conditions; then, the predicted and observed values per year are compared, and a linear relationship between these two values is estimated via a regression model to determine whether AMFs can be produced from its coefficients. The linear relationship is given by the following equation:

Y_i - \mu_i = \gamma_1 X_1 + \cdots + \gamma_m X_m    (3)

where
μ_i = the predicted number of crashes for Site i per year estimated by the baseline model;
Y_i = observed number of crashes for Site i per year;
X_m = a vector of the baseline variables (each site not meeting one or more of these variables); and
γ_m = a vector of coefficients to be estimated.

The AMFs are estimated using the following relationship when the coefficients are found to be statistically significant (e.g., at the 5% or 10% level):

AMF_m = \frac{\sum_{i=1}^{n} Y_i / n}{\sum_{i=1}^{n} Y_i / n - \gamma_m}    (4)

where
AMF_m = AMF for Coefficient m, and
n = the number of observations in the sample.

Data Base

As noted above, the initial approach was to evaluate the safety implications from specific changes to values of design elements through a review and analysis of cases where such flexibility changes were implemented. The meeting with the NCHRP project panel at the end of Phase I resulted in a significant change of the scope of the work and the type of data to be acquired. The discussion during the meeting focused on the potential problems and issues identified from the original approach. That approach was centered on the identification of cases where design flexibility was used and was documented by a comparison of the safety performance of each case to control sites where no flexibility was required. A variety of issues were identified that led to the need for another approach to produce the most beneficial research. This research must be useful in the ongoing HSM efforts, and that required this revised approach.

The project panel recommended that the research be concentrated on multilane rural roads and that it should be limited to specific design elements: lane width, shoulder width, and median type and width. The possibility of examining the contribution of clear zones was also discussed, but this decision was made contingent on a determination of data availability and potential feasibility.

The first task in Phase II of the research was to identify candidate states with crash data suitable for analysis. The plan was to retrieve crash data from the states participating in the HSIS database in a manner that would achieve a broad geographic distribution to ensure consideration of terrain, climate, and other key factors. The states in the FHWA HSIS database include California, Illinois, Maine, Michigan, Minnesota, North Carolina, Utah, and Washington. Data availability varies among these states with respect to time periods (some have fewer years than others) as well as the type of information available (not all include roadway geometry data in the crash records). Therefore, data from these states were evaluated with respect to the availability of the following data types and classes: (1) multilane rural roads; (2) geometric elements including lane and shoulder width and median type and width; (3) crash severity level; and (4) possibly, crash type. In addition, in order to stratify the data and the potential crash models, the number of lanes and functional classification were needed.

A review of the data available through HSIS for each state found that only Ohio and Washington have data available for horizontal and vertical curves. At the end of Phase I in 2005, 2004 data were available from Minnesota and were being processed for the other states. In addition, data were available from Kentucky that had been used satisfactorily by the research team in the past. To achieve the objective of identifying possible geographic differences among the states and, thus, to achieve a national perspective, the databases from California, Kentucky, and Minnesota were selected for the Phase II analysis. This allows for a reasonable geographic distribution that should adequately cover roadways found throughout the nation. A final element discussed at the project panel meeting was the exclusion of intersections to create a database with midblock sections only.

An understanding of the safety consequences for both the total number and specific types of crashes is of interest in evaluating design element trade-offs.
The change in the crash rate will provide an understanding of the overall safety risks of the applied trade-off. There are also specific crash types that would be expected to occur due to a trade-off on a specific geometric element; for example, if the decision involved the use of medians, the number of head-on crashes would be of particular interest. The analysis of such specific types of crashes would provide an understanding of the effect of certain types of decisions. Therefore, the number of all crashes and the number of specific crash types for each case would be collected for evaluating the safety trade-offs from varying values of design elements. An additional evaluation would focus on the severity of the crashes. It is possible that trade-offs for a design element may not show significant impacts on roadway safety expressed in total crashes, but might affect the severity of the crashes.

The California and Minnesota data used for this research were provided by NCHRP Project 17-29, which was also working on a similar issue and had already developed and evaluated the databases for these two states. The Kentucky data were also evaluated by the research team to provide compatibility among the three data sets and to see that all variables to be examined provided the same information and values.

An effort was made to augment the Kentucky data with the available clear zone width for all segments included in the database. Site visits were conducted at all 437 rural multilane segments in the database. The intent of these visits was to review the available information included in the state’s Highway Information System and to determine its accuracy. Past work with these data indicated occasional inaccuracies regarding the geometric elements used. Kentucky is conducting a similar review, but their results were not available at the time of this research work.
For each site, the lane, shoulder, and median widths were measured, the shoulder and median types were recorded, and an estimate was made of the available clear zone. The data were then used to update the geometry file, which was in turn used to develop the crash database for analysis.

The final database was developed by aggregating the individual state databases into one. For each state, a 12-year period was used with examination of data covering 2,387 miles. A further evaluation of the data to determine presence of all common available variables and values indicated that the majority of the segments (more than 95%) were four-lane facilities and most (more than 90%) had lane widths of 12 ft. These data indicate that there may be some concerns regarding the distribution of certain variables since a significant mileage was at specific values, which may not allow for the development of complete models. For example, it was envisioned to create separate models for four- and six-lane facilities. However, the available data indicate that there are only 35 segments for six-lane facilities, accounting for 205.45 miles (8.6%) of the total mileage. Therefore, the decision was made to develop models only for four-lane, 12-ft lane width segments.

This approach resulted in a new data set that had a total extent of 1,433.7 miles with 35,694 crashes, of which 9,024 were injury crashes. The ADT ranged from 241 to 77,250 vehicles/day, and the total miles for divided highways was 1,241.4. All segments were classified as non-freeway, even though these facilities could qualify as rural multilane roadways, and all have a length greater than 0.10 miles. An average of the left and right shoulder widths was used as the shoulder width since this approach resulted in models with more reasonable and intuitive coefficients.
The average shoulder width is computed as the mean of the left and right shoulder widths in the same direction for divided highways and as the mean of the right shoulders in undivided segments. Moreover, the shoulder type was checked to ensure that both shoulders used in the calculation are of the same type. All segments included in the final data set had the same type of left and right shoulders. Finally, all injury levels (ABC injuries) and fatalities (K) are included in the injury crashes.

These state databases exhibited various values for commonly named variables. For example, the California database codes median barrier types differently than do Minnesota and Kentucky. It was important to decipher these differences and to determine the common categories and groups among all three state databases. It became apparent that the commonality of data coding among these databases should be evaluated in order to avoid misinterpretation of the results.

The unit of analysis in the model development process is a highway segment that has homogeneous geometry and traffic conditions. The database developed herein used this approach and, thus, allows for the development of models that will have the segment as a unit.

Table 10 presents a summary of the variables considered and the number of segments in the final database by each state (as described above). In all cases, the term “injury crash” denotes both injury and fatality crashes. The data in Table 10 indicate that most segments are divided highways without median barriers, with shoulder widths between 6 and 8 ft, and with traffic volumes between 5,000 and 15,000 vehicles/day. All are four-lane rural highways with 12-ft lanes. There are differences among the states for certain variables; for example, most of the roads with higher ADT are in California, and they account for approximately one-third of the segments within the state.
California and Minnesota also had large numbers of segments with wide medians (greater than 60 ft), while most median widths for Kentucky were narrower (more than one-half of the segments were less than 20 ft). These differences among states can affect the model development because they may influence the presence or absence of a variable as well as the magnitude of its coefficients.

Table 10. Extent of variables in database.

                                                Divided                 Undivided
Variable                      Category      CA      KY      MN      CA      KY      MN
Crashes                       All       16,951   8,035   5,106   3,495   1,037   1,068
                              Injury     4,045   2,765     681     995     405     133
Principal arterial            Yes          571     539     615     164      73      84
                              No           183      71      46     125       8      31
Median barrier                Yes           95       3       6      NA      NA      NA
                              No           659     607     655      NA      NA      NA
Paved right shoulder          Yes          624     530     595     243      68      47
                              No           130      80      66      46      13      68
Average shoulder width (ft)   0              0       1      10       6       2      49
                              0–2           49      27       1      20       7       2
                              2–4          102      32      14     124       1       9
                              4–6           87     218      99      36      19      13
                              6–8          412     329     536      75      31      18
                              8+           104       3       1      28      21      24
ADT (vehicles/day; 000s)      <5            65      61      91     103       4      34
                              5–10         116     172     268      80      31      38
                              10–15        181     239     178      53      23      32
                              15–20        131      89      92      34      12      10
                              20–25         89      30      26      12       8
                              >25          172      19       6       7       3       1
Median width (ft)             <10           55     101      27      NA      NA      NA
                              10–20        177     188      12      NA      NA      NA
                              20–30        116     159      20      NA      NA      NA
                              30–40         59     108      37      NA      NA      NA
                              40–50        149      37     142      NA      NA      NA
                              50–60         32      14     185      NA      NA      NA
                              >60          166       3     238      NA      NA      NA

In addition to this evaluation, a preliminary analysis was also made to estimate the crash rates for the variables of concern (see Table 11). The data show that, in general, divided highways have lower crash rates, the segments with a median barrier have higher crash rates than do segments without, and there is a difference between single- and multi-vehicle crashes depending on whether the roadway is divided. The median width has a positive effect (i.e., lower crash rates) up to 40 ft; the crash rates increase above that width. The same could be observed for shoulder width, where the crash rate decreases up to 6 ft and then varies as the shoulder becomes wider. These trends are simple observations, and statistical tests were not conducted to determine their statistical significance.

Data Analysis

As noted above, predictive models were developed to evaluate trade-offs among selected design elements. The unit of analysis is a roadway segment with its associated crash history. The database records are based on roadway segments that have consistent geometric features for their corresponding length. Each record included the total number of crashes and total number of injury crashes. A distinction was made with respect to the number of vehicles involved in the crash, with crashes classified as single-vehicle or multi-vehicle for both total and injury crashes. The goal of the analysis was to isolate the effect of a single parameter. For example, all road segments in four-lane undivided arterials would be used in developing a model to determine the potential effect of the various features on total number of crashes or other crash types (i.e., single-vehicle, multi-vehicle, or injury crashes).
Table 11. Crash rates for selected variables.

Variable                      Category    Divided   Undivided
Principal arterial            Yes           48.97       77.15
                              No            51.63       77.83
Median barrier                Yes           98.95          NA
                              No            46.67          NA
Vehicles                      Single        29.21       36.68
                              Multi         20.15       39.44
Paved right shoulder          Yes           74.21      128.84
                              No            60.40       79.25
Average shoulder width (ft)   0             89.45      155.51
                              0–2           82.26       87.04
                              2–4           60.15       75.89
                              4–6           53.15       64.51
                              6–8           45.47       65.08
                              8+            38.92       52.56
ADT (vehicles/day; 000s)      <5            72.78       92.90
                              5–10          49.88       75.94
                              10–15         40.32       68.28
                              15–20         45.55       58.10
                              20–25         38.86       89.10
                              >25           63.32       93.53
Median width (ft)             <10           74.75          NA
                              10–20         55.65          NA
                              20–30         47.99          NA
                              30–40         38.85          NA
                              40–50         42.56          NA
                              50–60         43.90          NA
                              >60           46.98          NA

Data analysis focused on developing models by design element for assessing safety impacts from trade-offs among values of each design element and predicting the potential safety consequences expressed as number of crashes per unit time. Models for predicting crashes by severity level were also developed. However, models for specific crash types were not developed due to lack of available crash data. For the statistical modeling, GLMs were used because they are considered more appropriate for variables that are not normally distributed. Such models use a maximum likelihood function to determine which variables are significant and how well the model fits the data. Crashes are considered random events that follow a Poisson distribution; therefore, the use of GLMs is appropriate. Such models are derived using a relatively recent statistical approach; the literature suggests they have been gaining popularity among researchers (39–41).

The SAS statistical software was used to develop the prediction models and to determine their coefficients (46). The Generalized Modeling procedure (GENMOD) was implemented, and the model coefficients were estimated through the maximum-likelihood method. This approach is well suited to the development of models that have predictors that are either continuous or categorical². The residual deviance statistics were used to assess the model’s goodness-of-fit. Initially, all variables of concern were included in the model, and variables with coefficients that were not statistically significant (at the 5% level) were removed from the model. This process was followed until a model was obtained in which all variables entered were statistically significant. The signs of the coefficients were also evaluated to determine whether they reflected previously observed crash trends.

A desirable outcome from such a model is the determination of the relative safety impact of specific geometric elements.
This requires the availability of adequate data to establish such comparisons as well as the isolation of the impact of each element. There are potential problems that should be considered when a model is developed. First, specific elements may not be easily isolated and examined alone since the literature has indicated that there are elements that interact. Second, there is the potential for significant variability among the various roadway segments included in the database such that, even if an element can be isolated, there may be other variables (such as traffic volume, number of lanes, and functional class) that could also require attention and, thus, require an additional data classification, further reducing a model’s strength in reaching statistically sound conclusions.

The models developed in this research predict the number of crashes for a given condition. This decision was reached during the project panel meeting, during which the appropriateness of crash rates and number of crashes was discussed. The decision was based on the need to develop results that could be eventually used in the HSM. The rationale for this decision is that the current trend is to avoid the use of crash rates because of potential problems arising from the implicit assumption of linearity between volume and crashes as well as the possible misuse by unaware users who may assume that a change in traffic volumes could proportionally affect the number of crashes. It was therefore decided to separate the data into divided and undivided segments and to develop separate models for each group.

Models developed in this research were validated to determine their goodness-of-fit. The available data were randomly divided into two sets: one was used in the model development, while the second was used for the evaluation of the strength of the model to predict the number of crashes.
This is an accepted approach to determine the goodness-of-fit of a model, even though it reduces the data available for developing the model by one-half.

² A categorical predictor variable is a variable whose categories identify class or group membership, which is used to predict responses on one or more dependent variables (from http://www.statsoft.com/textbook/glosc.html).

Prediction Models

Models were developed and evaluated for their applicability and ability to produce predictors with reasonable coefficient signs. Initially, models were developed where the exposure was considered as the product of length and traffic volume. However, these models produced consistently counterintuitive results: the coefficient signs were opposite to a priori expectations based on past research. Therefore, a second round of models was produced that used volume as a predictor with the goal of obtaining more robust models with coefficients more in accordance with past work. These new models had a better fit, and most coefficients were in agreement with past research findings. The general form of these models was as follows:

E[N]_i = L \cdot e^{b_0 - \ln 12 + b_1 \ln ADT + b_2 X_2 + \cdots + b_n X_n}    (5)

where
E[N]_i = expected crash frequency per year for Condition i;
L = segment length (mile);
b_i = model coefficients;
ADT = average daily traffic (vehicles/day); and
X_i = predictors (various variables).

The predictor variables varied for each condition (divided and undivided segments; single-vehicle, multi-vehicle, and all crashes) and are discussed in the following paragraphs. The term ln 12 is included in each model to provide the results in units of crashes per year (as 12 years of data were used for estimating the model).
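The general form in Equation 5 can be evaluated with a few lines of code. The following is a minimal sketch rather than the procedure used in the study: the function and argument names are hypothetical, and the example segment (1 mile, 10,000 vehicles/day, 6-ft average shoulder) is invented for illustration. The coefficients in the example are those reported in this chapter for undivided roads, all crashes (Equation 11), with the California intercept from Table 12.

```python
import math

YEARS_OF_DATA = 12  # the models were estimated from 12 years of crash data


def expected_crashes_per_year(length_mi, adt, b0, b_adt, coefficients, predictors):
    """Evaluate E[N] = L * exp(b0 - ln 12 + b_adt * ln(ADT) + sum(b_k * X_k)).

    coefficients and predictors are dicts keyed by predictor name
    (hypothetical structure chosen for this sketch).
    """
    if length_mi <= 0 or adt <= 0:
        raise ValueError("segment length and ADT must be positive")
    linear = b0 - math.log(YEARS_OF_DATA) + b_adt * math.log(adt)
    for name, coef in coefficients.items():
        linear += coef * predictors[name]
    return length_mi * math.exp(linear)


# Undivided roads, all crashes: 0.960 ln ADT - 0.067 SW, California intercept -5.105.
# The segment values below are hypothetical.
example = expected_crashes_per_year(
    length_mi=1.0,
    adt=10_000,
    b0=-5.105,
    b_adt=0.960,
    coefficients={"SW": -0.067},
    predictors={"SW": 6.0},
)
print(f"Expected crashes per year: {example:.2f}")
```

Because segment length enters as a multiplier outside the exponent, the predicted frequency scales linearly with length, while the effect of ADT is governed by its estimated exponent.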

Divided Roads, All Crashes Single-Vehicle Crashes Multi-Vehicle Crashes All Crashes Undivided Roads, All Crashes Single-Vehicle Crashes Multi-Vehicle Crashes All Crashes Divided Roads, Injury Crashes Single-Vehicle Crashes Multi-Vehicle Crashes E N L e MDJ b ADT MW S[ ] = − + − −0 12 0 981 0 009 0 137ln . ln . . W ( )13 E N L e SDI b ADT FC M[ ] = − + + +0 12 0 571 0 251 0 813ln . ln . . BAR SW LTLN− −0 053 0 728 12 . . ( ) E N L e AD b ADT SW[ ] = − + −0 12 0 960 0 067 11ln . ln . ( ) E N L e MU b ADT RSP S[ ] = − + − −0 12 1 223 0 474 0 111ln . ln . . W ( )10 E N L e SU b ADT RSA[ ] = − + +0 12 0 795 0 379 9ln . ln . ( ) E N L e AD b ADT MBAR[ ] = − + + +0 12 0 835 0 781 0 172ln . ln . . FC RSP SW+ −0 228 0 118 8. . ( ) E N L e MD b ADT MW MB[ ] = − + − +0 12 1 203 0 010 0 523ln . ln . . AR SW LTLN− +0 137 0 452 7 . . ( ) E N L e SD b ADT FC MB [ ] = − + + +0 12 0 597 0 407 0 999ln . ln . . AR RSP SW LTLN+ − −0 166 0 053 0 327 6. . . ( ) All Crashes where E[N]i = expected crash frequency per year for Condition i; L = segment length (mi); b0 = model intercept; ADT = average daily traffic (vehicles/day); RSP = right shoulder paved (no/yes); SW = average right and left shoulder width (ft); MW = median width (ft); FC = functional class principal arterial (no/yes); MBAR = median barrier (no/yes); and LTLN = left turn lane present (no/yes). The following subscripts are used: S = single-vehicle crashes, M = multi-vehicle crashes, A = all crashes, D = divided, U = undivided, and I = injury crashes. No predictor variables were statistically significant for the injury models for the undivided roads; hence, these mod- els are not reported here. There are three intercepts (b0) for the models developed because each state was used as an indicator to allow for a more accurate estimation of the variables and their coefficients. The three intercepts are sim- ilar for all models and are presented in Table 12. 
The user can use any of these in the development of estimates since all will produce results of similar magnitude. An approach for predicting crashes with the models is described later in this section. As described above, the data were divided into two halves for the analysis: the training and validating datasets, respec- tively. The training datasets contained 1,028 divided and 242 undivided segments. The validation datasets included 997 divided and 243 undivided segments and were used for E N L e ADI b ADT MBAR[ ] = − + + −0 12 0 835 0 657 0 06ln . ln . . 8 14SW ( ) 23 Highway Crash Type CA KY MN Single –3.087 –3.567 –3.002 Multi –7.974 –7.884 –8.100 Divided All –4.235 –4.457 –4.317 Single –4.759 –4.976 –5.043 Multi –7.970 –7.052 –7.671 Undivided All –5.105 –4.758 –5.054 Single, injury –3.644 –4.141 –4.711 Multi, injury –7.217 –6.764 –7.900 Divided All, injury –4.614 –4.569 –5.547 Table 12. Model intercepts.

validating the models and determining their statistical accuracy. The statistical performance of the models was assessed by the following methods:

1. Mean Prediction Bias (MPB): This measure is an estimate of the direction and magnitude of the average bias of the predictions (47). MPB considers the differences between predicted and actual values; a positive value indicates that the model overpredicts crashes. Smaller absolute values of MPB indicate a better predictive model.
2. Mean Absolute Deviation (MAD): This measure is the average dispersion of the model (47); it can be used to estimate the absolute value of the difference between predicted and observed values. An estimate close to zero suggests that the model predicts the actual values well.
3. Mean Square Prediction Error (MSPE): This measure is an assessment of the error associated with the validation dataset and is based on the sum of the squared differences between the predicted and actual values (47). MSPE is used in conjunction with the mean squared error (MSE), which is similar in concept: the difference between the two measures is related to the denominator. For the MSPE, the denominator is the sample size used for the validation dataset; for the MSE, it is the sample size used for estimating the model less the number of variables included in the model (i.e., the degrees of freedom). Comparing the two indicates how well the model fits the validation dataset. An MSPE value larger than the MSE value indicates that the model shows signs of overfitting (i.e., that the model incorporates too many parameters) and that some of the relationships observed may be spurious.
4. Sum of Model Deviance (SMD): SMD is a measure of the model’s goodness-of-fit; a value of 0 indicates a perfect fit (48). In practical terms, SMD represents a lower bound limit for the observed values. When several models are compared, the model with the lowest SMD is considered the best fit for predicting crashes.
5. R2-like Measure of Fit (RMF): This measure provides an estimate similar to the R2 commonly used in linear regression, but appropriate for GLMs, and is calculated from the residual sum of squares and total sum of squares after the model is applied to the data (48).

The measures described above were used to estimate the goodness-of-fit of the models by applying them to the validation datasets. The values obtained for these measures are shown in Tables 13 and 14. Overall, the models for single-vehicle crashes fit the data better than the models for multi-vehicle and all crashes. Despite these differences, the measures show that the models perform adequately.

Trade-Offs from Models’ AMFs

Two regression model methods can be used to estimate AMFs (or the effects of changes in geometric design features). Both were described above, and the method selected for this research is presented here. The chosen method consists of estimating AMFs directly from the coefficients of statistical models using Equation 2. This method provides a simple way to estimate the effects of changes in geometric design features.

Divided Highways

Single-Vehicle Crashes

Four variables had single value AMFs: functional classification, paved shoulder, median barrier, and left turn lane presence. The AMF for the functional classification was 1.50, indicating a 50% increase for arterial roads in comparison to other roadways. For paved shoulder, it was 1.18, indicating an 18% increase for roads with paved shoulder. For median barrier, it was 2.72, indicating a 172% increase in crashes for roads with a barrier compared to roads without one. For left turn lane presence, it was 0.72, indicating a 28% reduction when a left turn lane was present. AMFs were also computed for shoulder width, yielding a 6% reduction per foot increase of average (left and right) shoulder width.
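As a rough illustration of how a single value AMF maps to the percentage changes quoted above, the following Python sketch applies the arithmetic (AMF - 1) x 100. The function names are hypothetical and are not part of the report. The second helper assumes that a per-foot AMF compounds multiplicatively over several feet; that is an assumption made only for illustration and should be checked against the plotted AMF curves rather than taken as the report's method.

```python
def percent_change(amf: float) -> float:
    """Percent change in expected crashes implied by an AMF (1.0 means no change)."""
    return (amf - 1.0) * 100.0


def compound_amf(amf_per_unit: float, units: float) -> float:
    """AMF for a change of `units`, ASSUMING the per-unit AMF compounds multiplicatively."""
    return amf_per_unit ** units


if __name__ == "__main__":
    # Single value AMFs quoted in the text (divided highways, single-vehicle crashes).
    quoted = [
        ("functional classification", 1.50),
        ("paved shoulder", 1.18),
        ("median barrier", 2.72),
        ("left turn lane presence", 0.72),
    ]
    for name, amf in quoted:
        print(f"{name}: AMF {amf:.2f} -> {percent_change(amf):+.0f}%")

    # Hypothetical case: 2 ft of additional average shoulder width,
    # with a 6% reduction per foot expressed as a per-foot AMF of 0.94.
    two_feet = compound_amf(0.94, 2)
    print(f"2 ft wider shoulder: AMF {two_feet:.4f} -> {percent_change(two_feet):+.1f}%")
```

An AMF above 1.0 yields a positive percentage (more crashes expected) and an AMF below 1.0 yields a negative one (fewer crashes expected).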
Table 13. Measures of goodness-of-fit for all crashes.

Divided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.90   5.20   157.75   4261.49   0.8425
Multi        1.42   5.31   188.03   5196.85   0.6816
All          2.47   9.44   532.68   6548.12   0.8028

Undivided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.69   4.49    93.21   1001.04   0.7837
Multi        1.07   5.66   202.48   1460.82   0.6598
All          1.78   8.51   469.45   1585.85   0.8055

Table 14. Measures of goodness-of-fit for injury crashes.

Divided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.18   1.70    13.22   1983.99   0.7982
Multi        0.39   1.56    18.19   1774.24   0.7435
All          0.57   2.78    47.76   2684.18   0.8109

Undivided Highways
Crash Type   MPB    MAD    MSPE     SMD       R2
Single       0.28   1.91    16.85    542.65   0.6974
Multi        0.36   1.54    19.47    498.11   0.4094
All          0.70   2.90    64.80    684.69   0.6558
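The MPB, MAD, and MSPE columns of Tables 13 and 14 can be sketched in Python as follows. This is a minimal illustration based on the definitions given in the text, not the research team's code: the function names are hypothetical, and the sign convention (predicted minus observed) is inferred from the statement that a positive MPB means the model overpredicts crashes. The MSE helper uses the degrees-of-freedom denominator described for the estimation dataset.

```python
from typing import Sequence


def _check(predicted: Sequence[float], observed: Sequence[float]) -> int:
    """Return the sample size after confirming both series are non-empty and aligned."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("predicted and observed must be non-empty and of equal length")
    return len(observed)


def mean_prediction_bias(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """MPB: average of (predicted - observed); positive means overprediction."""
    n = _check(predicted, observed)
    return sum(p - o for p, o in zip(predicted, observed)) / n


def mean_absolute_deviation(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """MAD: average absolute difference between predicted and observed values."""
    n = _check(predicted, observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n


def mean_square_prediction_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """MSPE: squared differences summed and divided by the validation sample size."""
    n = _check(predicted, observed)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n


def mean_squared_error(predicted: Sequence[float], observed: Sequence[float], n_params: int) -> float:
    """MSE: squared differences divided by (estimation sample size - number of variables)."""
    n = _check(predicted, observed)
    if n <= n_params:
        raise ValueError("sample size must exceed the number of model variables")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / (n - n_params)
```

Under the interpretation given in the text, an MSPE on the validation data that is larger than the MSE on the estimation data would be read as a sign of overfitting.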

Multi-Vehicle Crashes

Single value AMFs were obtained for two variables: median barrier and left turn lane. The AMF for median barrier was 1.69, indicating that median barriers increase crash potential by 69% in comparison to roads without barriers. The AMF for left turn lanes was 1.57, indicating that the presence of left turn lanes increases crash potential by 57% in comparison to roads without them. Two additional continuous variables entered the model; the AMFs for these are shown in Figures 1 and 2.

All Crashes

Single value AMFs were obtained for three variables: paved right shoulder, median barrier, and functional classification. The AMF for the paved right shoulder was 1.26, indicating that paved shoulders increase crashes by 26% in comparison to unpaved shoulders. For median barrier, the AMF was 2.18, indicating that median barriers increase crash potential by 118% in comparison to roads without barriers. The AMF for functional classification was 1.19, indicating that arterials increase crash potential by 19% in comparison to other roads. An additional continuous variable also entered the model; this AMF is shown in Figure 1.

Undivided Highways

Single-Vehicle Crashes

A single value AMF was obtained for only one variable, paved shoulders. The AMF was 1.46, indicating a 46% increase for roads with paved shoulders in comparison to roads without paved shoulders.

Multi-Vehicle Crashes

A paved right shoulder was a predictor variable in this case; the AMF was 0.62 for paved shoulders, indicating a 38% reduction in comparison to unpaved shoulders. An AMF was also obtained for the continuous variable shoulder width; the effect of this variable can be estimated from Figure 3. As used here, the shoulder width is the average width for both the left and right shoulders.

All Crashes

The only significant variable was the shoulder width, for which the effect can be estimated from Figure 3.

Injury Models

In addition to all crashes, models were developed for injury-only crashes.
These models followed the same data grouping as the all crashes models (i.e., data were split into divided and undivided highways and single-, multi-, and all vehicle crashes). For undivided roadways, no variable was significant enough to be entered in the model other than the ADT. This indicates that none of the variables of concern had any significant influence on injury-only crashes on undivided roads.

For divided roads, the models were very similar to those observed for all crashes. For the single-vehicle crashes, presence of median barrier, functional class, shoulder width, and presence of left turn lane had impacts. Most of these values had similar magnitudes, and all were slightly smaller than those obtained with the all crashes model. For the multi-vehicle crashes, only presence of median barrier and shoulder width entered the model, and they had values similar to those observed before. This was also the case for the all crashes model.

Figure 1. AMFs for average shoulder width.
Figure 2. AMFs for median width.
Figure 3. AMF for undivided roadways, multi-vehicle crashes.

Summary

The AMFs presented here follow the general trends of prior knowledge and research. In general, the trends of the variables used in all models agreed with rational expectations. A notable result was for divided highways, where crashes increased (in all three models) with the presence of a median barrier. This could be associated with the possibly higher speeds on divided highways, so that the presence of the barrier could contribute to the occurrence of a crash. Moreover, placing an obstacle within the roadway environment can itself increase the number of recorded crashes, since without a barrier the same event could have gone unreported (i.e., the vehicle could have been able to return to the roadway and drive off). Therefore, this trend was considered acceptable.

For divided highways, the presence of left-turn lanes provided different results depending on the crash type. For single-vehicle crashes, it showed an intuitive result indicating that its presence has a benefit (AMF of 0.72). For multi-vehicle crashes, it showed an increase in crashes (AMF of 1.57). This could be attributed to two possible issues: (1) the presence of higher operating speeds on divided rural highways and (2) the possibility that the sites are not truly rural but lie within a more built-up environment. These explanations, while plausible, cannot be verified with the available data, and so they cannot conclusively explain the counterintuitive nature of the results.
A more critical issue with the AMFs developed in this research is their magnitude and whether such significant differences should be observed from the introduction or change of each of these elements. This issue was addressed by a meeting of the research team at which a reasonable magnitude was estimated by consensus for each selected design element. This approach allowed the research team to consider past research, weigh the findings of this research, and adjust the magnitude of the AMFs as needed to reflect practical and research experience. This approach also facilitated the development of the guidelines based on the research results.


TRB’s National Cooperative Highway Research Program (NCHRP) Report 633: Impact of Shoulder Width and Median Width on Safety explores crash prediction models and accident modification factors for shoulder width and median width on rural four-lane roads.
