National Academies Press: OpenBook

Guidelines for the Development and Application of Crash Modification Factors (2022)

Chapter: Appendix G - Developing Consensus in Research About the Safety Effect of Manipulations

Suggested Citation:"Appendix G - Developing Consensus in Research About the Safety Effect of Manipulations." National Academies of Sciences, Engineering, and Medicine. 2022. Guidelines for the Development and Application of Crash Modification Factors. Washington, DC: The National Academies Press. doi: 10.17226/26408.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

APPENDIX G

Developing Consensus in Research About the Safety Effect of Manipulations

This appendix was contributed by Dr. Ezra Hauer upon request by the Project 17-63 team.

Introduction
Chapter 1 Review of Past Research About the Safety Effect of Pavement Marking Retroreflectivity
    Carlson et al. (2013)
    Karwa et al. (2011)
    Donnell et al. (2009)
    Smadi et al. (2010)
    Smadi et al. (2008)
    Dravitzki et al. (2006)
    Bahar et al. (2006) and Masliah et al. (2007)
    Lessons of Past Research: Summary
Chapter 2 Ways to Determine the Effect of a Cause
    Introduction
    Summary and Lessons
References

Introduction

Crash modification factors (CMFs) allow one to anticipate the safety effect of manipulations (interventions, design changes, etc.). The main aim of Project 17-63 was to suggest ways to make CMF estimates more trustworthy and more widely applicable. CMF estimates are trustworthy if several research studies using reliable methods and different data sources come to similar conclusions; this is how professional consensus is established. Conversely, when research studies tend to produce diverse CMF estimates, one cannot know which should be used, and the road to consensus about the safety effect of manipulations is obstructed.

Many believe that to determine the safety effect of a manipulation it is best to do the manipulation and then observe the result. However, many also believe that one can come to trustworthy conclusions about the safety effect of manipulations by comparing safety units that are found to have different traits. This is a typical setting in regression modeling using cross-section data. In many cases actual manipulations are so rare that this is the only viable option.

In some cases, the desired consensus in the use of conventional before-and-after and regression modeling approaches can be elusive. In this paper, the safety effects of pavement marking retroreflectivity (PMR) are examined. A review of eight prominent research studies considers both data and method in examining the safety effect of PMR.

The essence of CMFs is that they capture the safety effect of causes. If the commonly used before-and-after and regression modeling approaches run into problems, there may be other approaches to the determination of the safety effect of causes that hold promise. In Chapter 2 of this appendix, the following question is addressed: How does one usually go about determining the effect of causes?

Chapter 1

Review of Past Research About the Safety Effect of Pavement Marking Retroreflectivity

In 2007, the nationwide annual pavement marking expenditure was estimated to be approximately $2 billion (Carlson et al. 2009). This cost is directly proportional to how often the markings are repainted. Were one to repaint at half the frequency,¹ the cost (to the taxpayer) would be cut in half and so would the revenue (of contractors and manufacturers). The decision of how frequently to repaint is based on various considerations, one of which is the effect of PMR on safety.

The effect of PMR on safety is not well known. Carlson et al. (2013) stated that their research “. . . was initiated to determine whether a correlation between pavement marking retroreflectivity and safety could be established . . .” and that “Previous research on this topic provided mixed results and sometimes counterintuitive findings.”

The adoption of research paradigms that are insufficiently used in road safety research should proceed by carrying out the following two tasks:
• Demonstrate how and why past research led to inconsistent conclusions and, on this basis, claim that more research along similar lines is not likely to bring about consensus.
• Suggest alternative paradigms that produced results in other fields and are both promising and workable in the road safety setting.

This report is devoted to the first task: demonstrating that past research about the safety effect of PMR did not find consensus.

The role of this review is to discuss findings of past research and to highlight some generic problems that characterize the commonly used research paradigms (Avelar and Carlson 2014). The Avelar and Carlson (2014) study was an examination of the association between PMR and nighttime crashes on two-lane rural roads in Michigan. The bulk of the data are from Carlson et al. (2013), to which information about horizontal alignment was added. A statistical model was formulated, and its parameters estimated. The modeling was expertly executed and is clearly and fully described. The work affords an opportunity to discuss issues that arise in this kind of modeling.²

A log-linear model equation was used. Parameters were estimated by maximizing the negative binomial likelihood function. The model evolved in four stages. In the first stage, three predictor variables (Segment Length, AADT, and Speed Limit) were selected for the base model. All three parameter estimates had the expected sign and reached statistical significance. Issues of interest arise in Stage 2.

¹ About the pavement marking maintenance policy of California, Carlson et al. (2009) say that “. . . they restripe their higher-volume highways up to three times a year with paint, or every two years with thermoplastic markings.” According to Carlson et al. (2013, p. 60), “The Michigan DOT restripes about 85% of its system with paint each year.”
² The fitting of multicovariate single-equation regression models based on observational cross-section data.

In Stage 2, two PMR predictor variables were added to the ‘base model’ as multipliers: exp(White PMR × β_White) and exp(Yellow PMR × β_Yellow). The estimate of β_White was 1.11 × 10⁻³ (with p < 0.05), and the estimate of β_Yellow was −1.62 × 10⁻³ (with p > 0.1). At this point two general issues came to the fore.

First, the estimate of β_White is positive, which means that sites where the white pavement markings are more retroreflective tend to have more accidents. If this finding is interpreted as representing cause, not a mere association, it would be contrary to prior expectation. Similar counterintuitive findings will be encountered in the results of others whose work will be reviewed later.

The second general issue stems from the first: If a regression parameter estimate is suspect, it might be biased. One of the main causes of parameter bias is the absence from the model equation of important predictor variables: the omitted variable bias (OVB). The question is whether there is an obvious omitted predictor variable in Avelar and Carlson (2014), and, therefore, whether there might be a corresponding OVB in its results.

Issue 1: Prior Expectations

Avelar and Carlson (2014) say that “(r)etroreflectivity of longitudinal pavement markings is expected to improve safety on rural highways.” The positive sign of the estimate of β_White is unexpected. Modelers respond to unexpected results in a variety of ways: some remove the problematic predictor variables from the model; some search for reasons why the result could be correct (or incorrect) and leave it standing; still others continue modeling until a satisfactory result emerges.

In the case of PMR modeling, both common sense and traditional ergonomic beliefs bring about the expectation that better visibility → fewer crashes.
A countervailing (adaptation-based) expectation is the following: better visibility → faster and/or less attentive driving → all outcomes are possible (fewer crashes, more crashes, or no change).

While making modeling decisions, the modeler can add and delete predictor variables, alter the function by which the predictor variables combine, etc. In this sense, the result could produce conclusions that are in line with what the modeler expected before analyzing the data. When, as in the case of PMR modeling, there are two plausible but contradictory prior expectations, the question is what the modeler should do.

Avelar and Carlson (2014) suspect “the straight interpretation of these coefficients.”³ Had they stopped modeling at Stage 2, the conclusion that more retroreflective white pavement markings are associated with more crashes could be made. The review of their third and fourth stages of modeling will resume following the discussion of the OVB. The question is whether the regression coefficient is questionable because some important predictor variables are missing from the Stage 2 model.

Issue 2: The OVB

Fifty predictor variables (lane width, roadway width, shoulder width, percentage of commuting traffic, median width, type of shoulder, etc.) were available to Avelar and Carlson (2014) for modeling. Of these, in the first stage of modeling, three (Segment Length, AADT, and Speed Limit) were chosen for the base model. Then, in Stage 2, the retroreflectivity of white and yellow markings was added.

³ The Stage 2 estimates of β_White (1.11 × 10⁻³ with p < 0.05) and of β_Yellow (−1.62 × 10⁻³ with p > 0.1).
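The mechanism behind the OVB can be illustrated with a small simulation. The sketch below is not taken from Avelar and Carlson (2014); every input is a hypothetical assumption made for illustration: a binary `season` indicator, PMR values that are about 150 mcd/m²/lx higher in the later season, and assumed seasonal factors of 0.8 and 1.0 acting on the crash rate. The true effect of PMR is set to zero, so any nonzero PMR coefficient in the fit that omits `season` is bias. For simplicity it uses ordinary least squares on the log rate rather than a negative binomial fit.

```python
import math
import random


def slope(x, y):
    """Ordinary least squares slope of y on x (intercept included)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean; this partials out the group indicator."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def simulate(n=2000, seed=1):
    """Return the PMR coefficient without and with the season variable."""
    rng = random.Random(seed)
    season = [i % 2 for i in range(n)]  # hypothetical: 0 = earlier, 1 = later
    # Assumed: PMR is higher in the later season (e.g., after repainting).
    pmr = [150.0 + 150.0 * s + rng.gauss(0.0, 30.0) for s in season]
    # Assumed: the log crash rate depends on season only; PMR has NO effect.
    log_rate = [
        math.log(0.8 if s == 0 else 1.0) + rng.gauss(0.0, 0.1) for s in season
    ]
    # Model that omits season: PMR picks up the seasonal difference.
    b_omitted = slope(pmr, log_rate)
    # Model that includes season: by the Frisch-Waugh result, the PMR
    # coefficient equals the slope between the within-season deviations.
    b_included = slope(
        demean_by_group(pmr, season), demean_by_group(log_rate, season)
    )
    return b_omitted, b_included


if __name__ == "__main__":
    b_omitted, b_included = simulate()
    print(f"PMR coefficient, season omitted:  {b_omitted:.5f}")
    print(f"PMR coefficient, season included: {b_included:.5f}")
```

Under these assumed inputs, the coefficient from the fit that omits `season` comes out clearly positive even though PMR has no effect by construction, while the coefficient from the fit that includes `season` stays close to zero.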

Data about retroreflectivity and crashes were used for two periods: April and May, and September and October.⁴ Restriping in Michigan is done mostly during June and August. It follows that low PMR values occur mostly in the April–May period (before restriping) and most of the high PMR values are in the (after-restriping) September–October period. In other words, PMR and Period are correlated. However, the Period predictor variable is not in the model equation; it was “initially considered [for inclusion in the model equation] but later dropped based on the Akaike Information Criterion.” This omission creates the conditions for the OVB.

Traffic, weather, and many other safety-related factors change between the April–May and the September–October periods. The corresponding changes in safety are usually captured by seasonal variation factors. The seasonal variation factors for Michigan in April–May are about 0.8 and in September–October about 1.0.⁵ If there is no Period predictor variable in the model, and because Period and PMR are correlated, the regression coefficients of the PMRs represent an uncertain mixture of the influence of retroreflectivity and of seasonal variation. One cannot say what change in target accidents is due to a change in PMR (i.e., due to restriping) and what is due to seasonal variation.

The bias due to the omission of a correlated predictor variable can be problematic in regression modeling. To guard against it, one must account for the influence on the dependent variable of all predictor variables that are associated with the predictor variable of interest (here, PMR).

Review of Avelar and Carlson (2014), Part 1

In Stage 3, Avelar and Carlson (2014) added the interaction term exp((PMR White) × (PMR Yellow) × β_(White&Yellow)) to exp((PMR White) × β_White) × exp((PMR Yellow) × β_Yellow).
The estimate of β_White was 5.636 × 10⁻³ (with p < 0.1), of β_Yellow it was 6.427 × 10⁻³ (with p < 0.001), and of β_(White&Yellow) it was −2.635 × 10⁻⁵ (with p < 0.001).⁶ All parameter estimates are now statistically significant, and the Akaike Information Criterion (AIC) is improved. Avelar and Carlson (2014) consider the Stage 3 model to be a “mild improvement.” However, as shown in Figure G1, the results still run counter to what Avelar and Carlson (2014) said was expected. Going down the columns of Figure G1 (for all PMR_Yellow < 225 mcd/m²/lx), the larger PMR_White is, the more target crashes are predicted by the model. Also, for all PMR_White < 200 mcd/m²/lx, the larger PMR_Yellow is, the more target crashes are predicted by the model.

⁴ In each of the six years from 2003 to 2008.
⁵ The values are for all accidents in 2014. The corresponding values for the target accidents of the Avelar and Carlson (2014) study may be somewhat different.
⁶ The addition of the interaction term diminished the Akaike Information Criterion (AIC) slightly and may be a marginal improvement. However, the Bayesian Information Criterion (BIC) was substantially increased, and on this more stringent score the addition of the interaction term would be considered unjustified.

Figure G1. Values of exp((PMR White) × β_White) × exp((PMR Yellow) × β_Yellow) × exp((PMR White) × (PMR Yellow) × β_(White&Yellow)).

Issue 3: When to Stop Modeling

The base model from Avelar and Carlson (2014) is made of three predictor variables. The two PMR predictor variables were added in the second model. The interaction term was added to create the third model. This third model is thought to be better than the preceding two. While it is not consistent with the expectation that more visibility leads to improved safety, it is not inconsistent with the expectation that more visibility causes adaptation and therefore all outcomes are possible.

There are two main motivations for continuing to model: one is to determine whether the fit can be further improved; the other is to determine which of the alternative prior expectations or theories is better supported by the data. How prior expectation may shape the outcome of modeling is discussed under Issue 1. Here, under Issue 3, the question of when to stop modeling does not have a clear answer.⁷

Review of Avelar and Carlson (2014), Part 2

The refinement leading to the Stage 4 model consisted of transforming the PMR_White and PMR_Yellow predictor variables into two new predictor variables: Sum ≡ PMR_White + PMR_Yellow − 325 and Diff ≡ PMR_White − PMR_Yellow − 80. With this transformation, the parameter estimates were β_Sum = 5.205 × 10⁻⁴, β_Diff = 4.595 × 10⁻³ (with p < 0.01), and β_(Sum&Diff) = −1.780 × 10⁻⁵ (with p < 0.01). The transformation increased the likelihood while the number of parameters remained unchanged,⁸ even though the statistical significance of the parameter estimates was degraded compared with the estimates in the earlier model. However, as the results in Figure G2 show, for all PMR_Yellow, the predicted number of crashes is an increasing function of PMR_White.
There are many other functions of PMR_White and PMR_Yellow that could have been tried and transformations other than “Sum” and “Diff” examined, but Avelar and Carlson (2014) stopped modeling here.

⁷ The Akaike or Bayesian Information Criteria can be used for this purpose as long as the objective of estimation and distributional assumptions do not change.
⁸ For this reason, both the Akaike and Bayesian Information Criteria improved. Using the AIC, one would think the Stage 4 model best, but when the more stringent BIC is used, the bare-bones base model would be chosen and the PMRs would not enter the model.

Figure G2. Values of exp((Sum − 325) × β_(Sum − 325)) × exp((Diff − 80) × β_(Diff − 80)) × exp((Sum − 325) × (Diff − 80) × β_(Sum − 325&Diff − 80)).
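The PMR multipliers of the Stage 3 and Stage 4 models can be evaluated directly from the parameter estimates quoted above. The sketch below is a minimal illustration, not code from Avelar and Carlson (2014); the function and constant names are hypothetical, and the centering constants (325 and 80) are taken from the definitions of Sum and Diff given in the text. It computes an association multiplier only; it does not establish what a change in PMR would cause.

```python
import math

# Stage 3 estimates as quoted in the text.
B3_WHITE = 5.636e-3
B3_YELLOW = 6.427e-3
B3_INTERACTION = -2.635e-5

# Stage 4 estimates as quoted in the text.
B4_SUM = 5.205e-4
B4_DIFF = 4.595e-3
B4_INTERACTION = -1.780e-5


def stage3_multiplier(pmr_white, pmr_yellow):
    """PMR part of the Stage 3 model equation (PMR in mcd/m^2/lx)."""
    return math.exp(
        B3_WHITE * pmr_white
        + B3_YELLOW * pmr_yellow
        + B3_INTERACTION * pmr_white * pmr_yellow
    )


def stage4_multiplier(pmr_white, pmr_yellow):
    """PMR part of the Stage 4 model equation, using Sum and Diff."""
    total = pmr_white + pmr_yellow - 325.0  # "Sum" as defined in the text
    diff = pmr_white - pmr_yellow - 80.0    # "Diff" as defined in the text
    return math.exp(
        B4_SUM * total + B4_DIFF * diff + B4_INTERACTION * total * diff
    )


if __name__ == "__main__":
    # In Stage 3 the multiplier rises with white PMR only while
    # B3_WHITE + B3_INTERACTION * pmr_yellow is positive.
    print("Stage 3 sign change at yellow PMR of about",
          round(B3_WHITE / abs(B3_INTERACTION), 1))
    for white in (150, 250, 350):
        print(white, round(stage3_multiplier(white, 100), 2),
              round(stage4_multiplier(white, 100), 2))
```

With these estimates, the Stage 3 multiplier increases with PMR_White as long as PMR_Yellow is below roughly 214 mcd/m²/lx, which agrees with the pattern described for Figure G1. The Stage 4 function evaluates to about 1.60 at PMR_White = 350 and PMR_Yellow = 100 mcd/m²/lx, and to about 0.80 at PMR_White = 150 and PMR_Yellow = 100 mcd/m²/lx.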

With model development concluded, Avelar and Carlson (2014) asserted that the part of the model equation that contains the two PMR predictor variables, i.e., the product exp((Sum − 325) × β_(Sum − 325)) × exp((Diff − 80) × β_(Diff − 80)) × exp((Sum − 325) × (Diff − 80) × β_(Sum − 325&Diff − 80)), “can be interpreted directly as Crash Modification Function (CMF).”

When a CMF is extracted from a model equation, one asks what change on the left-hand side of the model equation is caused by a treatment that changes the value of a predictor variable on its right-hand side; in other words, a CMF is about a cause and its effect. Opinions differ about whether regression equations built from cross-section data can be used to predict the effect of treatments.

Issue 4: Can One Get the CMF of PMR from Models Based on Cross-Section Data?

Based on their Stage 4 model, Avelar and Carlson (2014) concluded that “The safety association of retroreflectivity can be described as a function of two factors: the sum of white and yellow retroreflectivity levels, and the difference between white and yellow retroreflectivities. The first factor can be understood as a measure of total brightness; the second can be viewed as a measure of how much less bright the yellow center line is compared to the white edge line.”

Avelar and Carlson (2014) discuss an association between safety and retroreflectivity. As is said repeatedly, “If you know one thing about the word [‘]correlation[’] it’s that correlation does not imply causation” (Ellenberg 2014, p. 335).
In other words, the fact that the association found in the cross-section data set assembled by Avelar and Carlson (2014) is a specific function of Sum and Diff does not mean that the safety effect of changing the PMRs is caused by the Sum and Diff predictor variables, nor does it mean that the safety effect of a change in Sum or Diff is predicted by the function that Avelar and Carlson (2014) found to fit the data well.

To illustrate, by the Stage 4 model, roads found to have PMR_White = 350 mcd/m²/lx and PMR_Yellow = 100 mcd/m²/lx are expected to have twice as many accidents (1.60/0.80 = 2.00, Figure G2) as roads found to have PMR_White = 150 mcd/m²/lx and PMR_Yellow = 100 mcd/m²/lx. This does not mean that repainting the white edgeline of a road, and thereby increasing its PMR from 150 to 350 mcd/m²/lx while leaving the yellow centerline at PMR = 100 mcd/m²/lx, will cause a doubling of the expected accident frequency.

When Avelar and Carlson (2014) say that their model “can be interpreted directly as Crash Modification Function (CMF)” they cross the boundary from association to causation. There is no pre-existing theory or commonly held scientific opinion saying that it is the sum and the difference of retroreflectivities that are the causal agents of accident occurrence. Even if one held that an improvement in goodness of fit enhances the prospects of causality, the Stage 4 model is not a better fit than the base model⁹ from which the PMRs are absent.

Because model equations look like familiar mathematical expressions, some may use them to compute the effect of a treatment by changing the magnitude of a predictor variable.
Regression equations capture associations and may not be used in this manner unless some difficult-to-satisfy conditions are met.¹⁰ Since the Period predictor variable is not in the Avelar and Carlson (2014) model, the estimates β_White and β_Yellow represent not only the influence of PMR on crash frequency but also all causal factors (traffic, climate, etc.) that are different in the April–May and the September–October periods. Because this and other alternative explanations may exist,¹¹ the aforementioned association may not be accepted as causal.

⁹ While AIC improves, the more stringent BIC deteriorates.
¹⁰ Confidence in the causal interpretation of a single-equation regression model depends, among other things, on how good and complete the information about all major causal predictor variables is, on the extent to which the causal web of accident generation can be approximated by a single algebraic equation, and on whether the form of the chosen equation can adequately represent the process of accident generation. None of these conditions is fully met in the Avelar and Carlson (2014) model.

Issue 4 goes to the heart of the matter. If single-equation regression models built from cross-section data cannot reliably capture cause and effect, they will continue to produce diverse CMFs and thereby impede progress toward evidence-based consensus.

Review of Avelar and Carlson (2014), Conclusion

In their summary, Avelar and Carlson (2014) say that the “Results indicate that, in general, retroreflectivity of pavement markings at two way rural roads in Michigan tend to relate to safety.” This statement should be weighed against two facts: when the PMR predictor variables were first introduced (into the Stage 2 model), information was lost (as measured by the AIC); and even with the Sum and Diff transformed predictor variables in the final (Stage 4) model, the posterior probability of the data (as measured by the Bayesian information criterion [BIC]) is less than for the base model from which the PMRs are altogether absent.

Avelar and Carlson (2014) say that “. . . there is statistical evidence of an increased risk of nighttime crashes for sites with yellow markings having retroreflectivity levels less than 165 mcd/m²/lx;” the same is evident in Figure G2. However, Figure G2 also contains statistical evidence that the larger the retroreflectivity of the white markings, the larger the risk of nighttime crashes.
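The contrast between the two information criteria mentioned above follows from how each penalizes added parameters. The sketch below uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L. The log-likelihoods, parameter counts, and sample size are hypothetical numbers chosen only for illustration; they are not taken from Avelar and Carlson (2014).

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical numbers, NOT taken from Avelar and Carlson (2014).
n_obs = 500
base_ll, base_k = -820.0, 6          # base model without PMR terms
extended_ll, extended_k = -816.5, 8  # model with two added PMR terms

delta_aic = aic(extended_ll, extended_k) - aic(base_ll, base_k)
delta_bic = bic(extended_ll, extended_k, n_obs) - bic(base_ll, base_k, n_obs)

print(f"change in AIC: {delta_aic:+.2f}")  # negative: AIC improves
print(f"change in BIC: {delta_bic:+.2f}")  # positive: BIC deteriorates
```

Whenever n exceeds e² (about 7.4), ln(n) is larger than 2, so BIC charges more per added parameter than AIC. A modest gain in log-likelihood can therefore lower AIC while raising BIC.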
Avelar and Carlson (2014) chose the model in which the predictor variables are the “sum of retroreflectivity values, which represents how bright both pavement markings are in general; and the difference of retroreflectivity values, which represents how much less bright the yellow center line is in comparison to the white edge line.” The support for this model comes only from the increase of the log-likelihood of the data. Only if future research shows a causal link between the Sum and Diff predictor variables, driver responses, and crash occurrence mechanisms will the corresponding CMF become trustworthy.

Most of the data in the Avelar and Carlson (2014) analysis came from an earlier study (Carlson et al. 2013) where detail is given about the limitations of data quality.

Carlson et al. (2013)

The blurring between association, correlation, and causation in Avelar and Carlson (2014) seems to stem from its predecessor publication, which says that “[t]his research was initiated to determine whether a correlation between pavement marking retroreflectivity and safety could be established” (p. 59), but later that “[t]he objective of this research was to evaluate relationships between crashes and longitudinal pavement marking retroreflectivity” (p. 60). By relationships the authors mean statements such as that the “effect of WEdge retroreflectivity on single-vehicle nighttime crashes was statistically significant at α = 0.1. The estimated coefficient for WEdge retroreflectivity for single-vehicle nighttime crashes is approximately −0.0009, which corresponds to the percent crash reduction of 0.09%, with an equation of (1 − e^(−0.0009)) × 100, for each increase in retroreflectivity value of 1 mcd/m²/lx. If WEdge retroreflectivity increases by 10 mcd/m²/lx and by 100 mcd/m²/lx, the percent reduction in single-vehicle nighttime crashes is 0.9% and 8.6%, respectively, over the range from 150 to 450 cd/m²/lx” (p. 64). These effects and relationships are given action-oriented causal meaning in the Conclusions section (p. 65) by saying that they “. . . lend support to the positive safety effects of maintaining retroreflectivity of pavement markings.”

¹¹ Michigan extends in latitude from 41° 41′ N to 48° 18′ N. If pavement remarking begins earlier in the southern part (in the Metro or University regions) and later in the northern part of the state, region-based alternative explanations may exist.

The Carlson et al. (2013) paper has a comprehensive description of the data (Michigan 2002–2008) and of its limitations, but it is still not possible to reliably determine what average retroreflectivity prevailed on a road section in a certain month or period.

Issue 5—The Quality of Retroreflectivity Data

Problems with the data may be caused by the fact that Michigan DOT measures PMR mostly in September and October. Therefore, most average retroreflectivity values for the statistical analysis in Carlson et al. (2013) were determined by using temporal and spatial imputation. The retroreflectivity for April was usually taken to be the value measured in the previous October or September from which some fixed value¹² was subtracted. If the imputed value was negative, it was reset to 50 mcd/m²/lx. If a road segment had no measured retroreflectivity values, the value from a neighboring segment within a 2-mile range was used. In addition, while most Michigan roads are restriped between June and August, the date of restriping is not recorded. This required another layer of assumptions to be made. Carlson et al. (2013) concluded that “Limitations on the availability of measured retroreflectivity levels hinder the ability to establish relationships between crashes and retroreflectivity. . .” (p. 65). The association between retroreflectivity and crash frequency is therefore viewed through the lens of imputations and assumptions.¹³

Review of Carlson et al. (2013), Part 1

Compared with the follow-up paper (Avelar and Carlson 2014), the statistical analysis by Carlson et al. (2013) is less thoroughly documented.
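The imputation rules described under Issue 5 can be sketched as follows. This is an illustration only, not the procedure actually coded by Carlson et al. (2013). It assumes that the "fixed value" subtracted is the monthly decrement listed in footnote 12 multiplied by the number of elapsed months; the function name and the way the neighboring-segment fallback is passed in are hypothetical.

```python
# Monthly decrements from footnote 12, written here as positive losses per month.
MONTHLY_LOSS = {"WEdge": 22.0, "WLane": 22.3, "YCntr": 8.9, "YEdge": 10.7}
FLOOR_VALUE = 50.0  # value assigned when the imputed PMR would be negative

def impute_pmr(marking, fall_measurement, months_elapsed, neighbor_measurement=None):
    """Impute a later PMR from a fall measurement (hypothetical helper).

    If the segment has no measurement of its own, the value from a
    neighboring segment (assumed to lie within 2 miles) is used instead.
    Returns None when no measurement is available at all.
    """
    measured = fall_measurement if fall_measurement is not None else neighbor_measurement
    if measured is None:
        return None
    # Assumption: the subtracted "fixed value" is monthly loss x elapsed months.
    imputed = measured - MONTHLY_LOSS[marking] * months_elapsed
    return FLOOR_VALUE if imputed < 0 else imputed

# October measurement carried forward six months to April.
april_wedge = impute_pmr("WEdge", 300.0, 6)     # 300 - 22 * 6
april_floor = impute_pmr("WEdge", 100.0, 6)     # would be negative, reset to 50
april_neighbor = impute_pmr("YCntr", None, 6, neighbor_measurement=150.0)
```

Because the decrement in this sketch is the same for every segment, it reproduces the feature criticized in footnote 13: the imputed values do not depend on traffic volume.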
Negative binomial regression was used with segment length, right shoulder width, lane width, terrain, AADT, and PMRs¹⁴ as predictor variables. Judging by the results, that part of the model equation which represents retroreflectivities consisted of multipliers exp(β × PMR), where β is the regression parameter for the PMRs WEdge, WLane, YEdge, or YCntr. No details about the form of the model equation for the other predictor variables, or about the corresponding regression parameters, are in the paper. The only information given is whether a β is statistically significant and, if so, its estimated value.

The data were organized and analyzed in several categories. In database A the road segments were those from the Michigan crash database; in database B the road segments were from the Michigan PMR database. Within each database (A and B), separate models were estimated for two-lane roads and for freeways. In some cases, separate models for nighttime and single-vehicle nighttime crashes were attempted. Results can be found in Table G1.

Issue 6—Should Only Statistically Significant Results be Considered?

When reporting the regression parameters of their models, Avelar and Carlson (2014, Table 2) list and use all estimates, including those that are not statistically significant.¹⁵ Thus, while their Sum predictor variable was not statistically significant, its value was reported and, moreover, the same predictor variable and parameter estimate were then used in their CMF. In contrast, Carlson et al. (2013) neither report on nor make use of any regression parameter that is not statistically significant.

¹² WEdge = −22 mcd/month, WLane = −22.3 mcd/month, YCntr = −8.9 mcd/month, and YEdge = −10.7 mcd/month.
¹³ Not only does the imputation create uncertainty (variance), it may also create confounding and bias. For example, the imputed monthly amount by which retroreflectivity is assumed by Carlson et al. (2013) to diminish does not depend on the amount of traffic. This may make high-traffic road sections seem to have unrealistically high PMRs. Since high-traffic roads differ from low-traffic roads in many safety-related traits, conditions for confounding exist.
¹⁴ The retroreflectivity data consist of the measurements of pavement markings representing white edge lines (WEdge), white lane lines (WLane), yellow edge lines (YEdge), and yellow centerlines (YCntr).
¹⁵ They describe the statistical significance of parameter estimates by a superscript code: ° when α < 0.1, * when α < 0.05, ** when α < 0.01, *** when α < 0.001.

Statistical significance testing has been controversial for nearly a century. Experts disagree about the soundness of its logical foundations and about its usefulness in science and practice. A common misinterpretation is the notion that if a parameter estimate is not statistically significant it is best to assume that it is zero. On the contrary, unless there is some prior ground for assuming that the magnitude of a parameter is 0, when a data-based parameter estimate is not zero there is no statistical reason to assume that it is. Statistical significance is not a means for separating Effect from No Effect. A related misinterpretation is that the level of significance represents the probability that the parameter is not zero; that is, if, for example, α = 0.05, then there is a 95% chance that the parameter is not 0.¹⁶

The estimate of a parameter is usually related to effect size. The absence of statistical significance does not mean that the effect is 0, that it is likely to be 0, that no effect exists, that the effect is not real, or that it is unimportant.

Review of Carlson et al. (2013), Conclusion

In Table G1, the authors caution that “Because there were almost no nighttime fatal plus injury crashes and single-vehicle nighttime fatal plus injury crashes . . . models . . . could not be estimated reliably” (pp. 63–64). It is unknown how the segment length, right shoulder width, lane width, terrain, and AADT predictor variables feature in the model.
Also unknown are the corresponding βs for all WEdge, WLane, YEdge, and YCntr predictor variables, as well as information about the quality of fit, log-likelihood, AIC, BIC, etc.

The βs that Carlson et al. (2013) report are negative. A negative β supports the belief that roads found to have better PMR tend to have fewer crashes; a positive β would be an indication to the contrary. Using data similar to those of Carlson et al. (2013), Avelar and Carlson (2014) found for white PMRs a statistically significant positive β. Other researchers using data from different states also typically found a mixture of positive and negative β estimates.

Table G1. Summary of reported findings.
Database A, two-lane roads:
  (1) Data: All, two periods. β_WEdge and β_WCntr not significant.
  (2) Data: YCntr < 200. β_WEdge and β_WCntr not significant.
  (3) Data: YCntr < 150. Nighttime β_WCntr = −0.0066 (α = 0.1); single-vehicle nighttime β_WCntr = −0.0061 (α = 0.05).
Database A, freeways:
  (1) Data: All and all subsets. None of the βs was statistically significant at α = 0.1.
Database B, two-lane roads:
  (1) Data: All, monthly. β_WEdge (α = 0.1), estimate not given.
  (2) Data: All, two periods. β_WEdge = −0.001 (α = 0.1).
  (3) Data: Subsets, two periods. Results similar to those provided above.
Database B, freeways:
  (1) Data: All, monthly. For single-vehicle nighttime, β_WEdge = −0.0009 (α = 0.1).
  (2) Data: All, two periods. For single-vehicle nighttime, β_WEdge = −0.0013 and β_YEdge = −0.0009 (α = 0.1).
  (3) Data: PMR < 200, two periods. β_WLane = −0.0027 (α = 0.1).

¹⁶ “The casual view of the P value as posterior probability of the truth of the null hypothesis is false and not even close to valid under any reasonable model, yet this misunderstanding exists even in high-stakes settings. . .” (Gelman 2013).

Segment length, right shoulder width, lane width, terrain, AADT, and PMRs¹⁷ served as predictor variables. However, differences between months or periods (April–May and September–October) were not represented in the models.¹⁸ In Michigan, most low PMR values occur in the April–May period (before restriping) while most of the high PMR values are in the (after-restriping) September–October period; the effects of retroreflectivity and seasonal differences therefore cannot be separated. As in the Avelar and Carlson (2014) paper reviewed earlier, due to the presence of OVB, the estimates of the βs represent the joint effect of retroreflectivity and of seasonal variations in ADT, precipitation, and other factors.

The estimates in Table G1 seem to indicate that when PMR is low (less than 150 mcd/m²/lx), roads found to have higher PMR tend to have fewer target crashes. Taking these results at face value, one might conclude, as Carlson et al. (2013) do, that the findings “lend support to the positive safety effects of maintaining retroreflectivity of pavement markings” and that “the findings provide evidence that maintenance of pavement marking retroreflectivity can have a positive effect on safety.”

Karwa et al. (2011)

Avelar and Carlson (2014) and Carlson et al. (2013) do not justify the transition from the description of associations to the making of statements about the causal effect of treatments. In contrast, Karwa et al. (2011) state that “[i]t is well known that causal propositions . . . are best estimated from randomized experiments” (p.
1) but recognize that “[t]he types of data available in transportation safety studies are primarily observational, which makes it difficult to consistently estimate causal effects of countermeasures.” Furthermore, they observe that “[c]ausal inference methods in transportation safety studies have received little attention,” but note that there are “two commonly used causal inference frameworks” that can be applied to observational studies in transportation safety. Karwa et al. (2011) apply both frameworks to the estimation of the causal effect of PMR on target crashes. Because causal inference is the main motivation for research on road safety, and because Karwa et al. (2011) discuss the possibility of drawing causal conclusions from observational road safety data, their work is seminal for this paper.

The Karwa et al. (2011) data come from North Carolina and include the PMR, crash counts per month, and 12 other predictor variables (shoulder width, number of lanes, presence of a median, monthly traffic volumes, percentage of trucks, geographic district, urban or rural setting, terrain type, and others) for 192 road segments. The setting is conceptualized as a hypothetical experiment in which the treatment is a change of PMR from (a) Low to Medium, (b) Medium to High, or (c) Low to High.¹⁹ The effect of these treatments is estimated by Karwa et al. (2011) using both the Potential Outcomes (PO) and Causal Diagrams (CD) frameworks.

The PO framework is gaining prominence in those disciplines and circumstances in which the conduct of experiments is difficult and observational data are easier to come by. The core of the PO framework is the treatment of observational data as if they came from a randomized experiment, which would be justified if, for each unit (road segment in the PMR context), one knew the probability of its being treated or being left as a control.²⁰ This probability is called the propensity score.²¹ The propensity score is assumed to depend on the traits of units and is estimated using their known traits (predictor variables). Within the PO framework the propensity score is used to match comparable units, to create subpopulations with similar propensity scores, or to compute appropriate weights when calculating the CMF.²²

¹⁷ The retroreflectivity data consist of the measurements of pavement markings representing white edge lines (WEdge), white lane lines (WLane), yellow edge lines (YEdge), and yellow centerlines (YCntr).
¹⁸ The model equations are not given in the paper.
¹⁹ Values are in mcd/m²/lx. Low if 139 < PMR < 200, Medium if 200 < PMR < 280, High if 280 < PMR < 447. Instead of the hypothetical treatment being “from Low to High,” it can be equivalently “from High to Low.”

When units are assigned to the treated and to the control groups by randomization, and if the number of units in both groups is very large, the distribution of outcome-affecting unit traits in both groups will be similar. In observational studies, the danger is that the difference in outcome between the treated and non-treated groups is partly due to the treatment and partly due to differences between the two groups in the distribution of outcome-affecting traits. The hope is that by using propensity scores in observational studies, a balance of outcome-affecting traits is achieved, like that obtainable by randomization. Without using the propensity scores in observational studies, the transition from association to causation is unjustified. The estimation of propensity scores is a distinctive feature of the PO framework.²³

The description of the specific implementation of the PO framework by Karwa et al. (2011) is limited. Usually, one learns only what approach or software package was used. Thus, one can know that the propensity scores and the safety outcome model²⁴ were estimated by the “Generalized Boosting Model [GBM; McCaffrey, Ridgeway, and Morral 2004], a multivariate nonparametric technique. . . .” using “the ’twang’ package in R [Ridgeway, McCaffrey, and Morral, 2006].” At the core of these techniques is the consecutive splitting of the data, predictor variable after predictor variable, into subgroups that are as homogeneous in the outcome variable as possible. Karwa et al. (2011) say that to predict the safety of a road segment, “[w]e again make use of GBM²⁵ to estimate a single regression model by using an indicator for the treatment. All covariates listed in Figure 1²⁶ are included in the model, and we choose the interactions implicitly. Overfitting is avoided by using 10-fold cross-validation and out-of-bag estimation (Ridgeway, 2007).” Being non-parametric at its core, the model has the advantage that the outcome prediction is not forced to fit a regression equation.

For the PO framework, the average causal effect (ACE) estimates are in Figure G3. To illustrate, the ACE of reducing PMR from High to Low is to increase the probability of a road section to have more than zero accidents by a factor of 2.53 (when the Individual prediction²⁷ method was used).²⁸

²⁰ Thus, e.g., in the Karwa et al. (2011) paper, treated could refer to having a high PMR and control to having a low PMR.
²¹ Karwa et al. (2011) comment that “[i]t is important to understand the role propensity scores play in the PO framework. We want to compare two segments that are exactly the same in all possible ways except for the level of PMR. We can do this by matching segments that have the exact same value of covariates. However, there are many different covariates and with the limited amount of data, we may be left with only one data point that matches exactly. Hence to avoid this issue, we need to find a proxy variable to match on. Rubin and others have shown that matching on true propensity scores is equivalent to matching on all covariates.”
²² The CMF is usually called the average causal effect (ACE) or average treatment effect (ATE).
²³ The PO framework is the subject of a vast literature, such as Rubin (1997).
²⁴ “The safety of a segment is stochastic and each segment has a fixed probability p of at least one target crash occurring, which is assumed to be an inherent property of the road segment.”
²⁵ The GBM mentioned earlier.
²⁶ The listed predictor variables are: Urban, Truck, Terrain, Rt. Shoulder, Multilane, Median, Age, and ADT.
²⁷ “. . . in the combined method, the complete data are used to estimate the potential outcomes under treatment and control application, irrespective of the actual assignment, whereas in the individual method the potential outcomes under treatment and control are estimated by using that part of the data which actually received the treatment and control assignment, respectively.”
²⁸ Let µ_high be the expected target crash frequency of a segment when PMR is High and µ_low the corresponding value when PMR is Low. If µ_high = 0.1 target crashes/year, what must µ_low be for the ACE to be 2.53? Assuming that target crashes are Poisson distributed, µ_low must be 0.27, almost a threefold increase. If µ_high = 0.2, the increase is by a factor of 13. These numbers may indicate a problem.
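The first calculation in footnote 28 can be reproduced with a few lines. The sketch below assumes, as the footnote does, that target crashes are Poisson distributed, so that the probability of at least one crash is 1 − exp(−µ). It also assumes that the ACE is read as the ratio of these probabilities under Low and High PMR. The function name is hypothetical.

```python
import math

def mu_low_for_ace(mu_high, ace):
    """Expected crash frequency under Low PMR implied by a given ACE.

    Assumes Poisson crash counts and that ACE = P(at least one crash | Low)
    divided by P(at least one crash | High).
    """
    p_high = 1.0 - math.exp(-mu_high)   # P(at least one crash) when PMR is High
    p_low = ace * p_high                # implied probability when PMR is Low
    if p_low >= 1.0:
        raise ValueError("ACE times p_high must be below 1 to be a probability")
    return -math.log(1.0 - p_low)       # invert p = 1 - exp(-mu)

mu_low = mu_low_for_ace(0.1, 2.53)
ratio = mu_low / 0.1
print(f"mu_low = {mu_low:.3f}, ratio to mu_high = {ratio:.2f}")
```

Under this reading the product of the ACE and the High-PMR probability must stay below 1, so a factor of 2.53 is attainable only when µ_high is below about 0.5.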

The PMR ranges for the three categories are in footnote 19. The implication of the evident dose-response relationship is that the higher the PMR, the lower the probability of target accident²⁹ occurrence.

The other framework for justifying causal inferences from observational data is referred to by Karwa et al. (2011) as Causal Diagrams or CDs. In the implementation of the CD framework, the causal relationships extracted from the North Carolina data are shown in Figure G4. Thus, the data indicate that Age and % Trucks are causally related to PMR, and Terrain to Right shoulder width. While PMR is causally related to Safety, ADT is not.³⁰

The ACE estimates for the CD framework are in Figure G5. To illustrate, the ACE of reducing PMR from High to Low is estimated to increase the probability of a road section to have more than zero accidents by a factor of 3.12. The estimates in Figure G5 are similar to those in Figure G3.

The absence in Figure G4 of a causal link between ADT and safety demands explanation. Karwa et al. (2011)³¹ set the ADT predictor variable to 0 when the annual average daily traffic (AADT) was less than 30,000 and to 1 otherwise, for both the PO and the CD computations. The net effect of making ADT into a binary predictor variable is that for all two-lane roads³² (comprising about half of the sample) and some of the four-lane roads, ADT = 0. This may explain the absence of the causal link between ADT and safety. All other predictor variables were also categorical.³³ The Trucks predictor variable was 0 if the percentage of trucks was ≤ 18 and 1 otherwise, Shoulder was 0 if ≤ 7 ft and 1 otherwise, etc.

Karwa et al. (2011) provide 95% confidence limits for their ACE estimates. However, these limits are computed as if all modeling assumptions were correct, and if the assumptions are many, the estimated limits may be too tight.³⁴

Figure G3. ACE estimates for the Potential Outcomes framework.

²⁹ Accident that “occurred during dusk, dawn or at night; dry roadway surface conditions; ran-off-the-road crashes; fixed object crashes (off-road); and opposite- or same-direction sideswipe crashes.”
³⁰ On this the authors note that: “[i]t was surprising to see the safety of a segment unaffected by ADT in this particular DAG, since it is commonly observed that the higher the ADT, the higher the probability of a target crash on a segment. However, two of the top 10 graphs show that ADT does indeed affect safety. Other possible reasons could be the fact that ADT has been discretized into two levels, and hence possess high correlation with the Multilane variable. Similar problems were encountered in the PO framework. Segments with more than two lanes generally have high ADT. This could be the reason why the Multilane indicator affects safety in 8 out of the 10 highest scoring models.” In the matching-based estimators where the continuous version of ADT is used, ADT is a significant predictor of safety.
³¹ Described in their Table 2.
³² Two-lane roads typically carry about 2,000–3,000 vehicles per day (vpd); very few have traffic volumes exceeding 10,000 vpd.
³³ The reason for the use of categorical predictor variables is that the purpose of this research was to compare results obtained by the PO and CD approaches and, at the time the research was done, only discrete predictor variables could be used in the Causal Bayesian Network Learning algorithms.
³⁴ Karwa et al. (2011) comment that “. . . one cannot draw causal conclusions without any assumptions. . . . A good practice in causal modeling is to acknowledge the need for assumptions and mention them explicitly.”
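To make the effect of the discretization described above concrete, the following sketch applies the stated thresholds to a few road segments. The thresholds for ADT, Trucks, and Shoulder and the PMR category bounds (footnote 19) are taken from the text. The segment names and traffic volumes, the function names, and the handling of values that fall exactly on a PMR boundary are hypothetical.

```python
def binarize(adt, pct_trucks, shoulder_ft):
    """Binary coding of three predictor variables using the thresholds in the text."""
    return {
        "ADT": 0 if adt < 30_000 else 1,
        "Trucks": 0 if pct_trucks <= 18 else 1,
        "Shoulder": 0 if shoulder_ft <= 7 else 1,
    }

def pmr_category(pmr):
    """PMR category per footnote 19 (values in mcd/m2/lx).

    The footnote uses strict inequalities; assigning the boundary values
    200 and 280 to the upper category is an assumption made here.
    """
    if 139 < pmr < 200:
        return "Low"
    if 200 <= pmr < 280:
        return "Medium"
    if 280 <= pmr < 447:
        return "High"
    return None  # outside the ranges given in the footnote

# Hypothetical segments spanning a wide range of traffic volumes.
segments = [
    ("two-lane A", 2_000),
    ("two-lane B", 9_500),
    ("four-lane C", 28_000),
    ("four-lane D", 45_000),
]
adt_codes = [binarize(adt, pct_trucks=10, shoulder_ft=6)["ADT"] for _, adt in segments]
print(adt_codes)
```

In this illustration three segments whose traffic volumes differ by more than a factor of ten receive the same ADT code, which is the loss of information the text attributes to binarizing ADT.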

In regression models, ADT usually accounts for about 80% of the variance. With ADT reduced to a binary variable, the researchers in effect attempted to make the counterfactual safety prediction independent of traffic volume. The Boosting Model that Karwa et al. (2011) used to generate the propensity scores and outcome predictions relies on optimally splitting data into ever more homogeneous subpopulations. But with only two categories for most predictor variables,³⁵ the benefit of optimal splitting is circumscribed. Furthermore, the counterfactual outcome predictions are bound to have a large variance. When the variance of counterfactual predictions is large, and when similarly uncertain and possibly correlated propensity scores are used as weights, bias is possible.

Another potential source of bias is the neglect of seasonal variation. In the data used by Karwa et al. (2011), time (the Age predictor variable) is divided into months, and the PMR estimates are also month-by-month. However, by making ADT into a binary predictor variable, the monthly differences in traffic are not used; if the binarized ADT of a certain road section is 0 (1) in one month, it will almost always be 0 (1) in all other months. Yet there are real monthly changes in traffic flow, precipitation, and many other factors, and these cause real seasonal differences in road safety (e.g., see Roess et al. 2004). Because Karwa et al. (2011) do not have predictor variables to represent these factors, what they ascribe to the causal effect of PMR is partly due to the factors that cause seasonal variation in safety. Therefore, the ACE estimates in Figure G3 and Figure G5 cannot be ascribed to PMR alone.

Figure G4. The best scoring Directed Acyclic Graph from Figure 3 in Karwa et al. (2011).
Figure G5. ACE estimates for the Causal Diagram framework.

³⁵ Only the PMR AGE predictor variable has three categories: 0 if between 0 and 9 months, 1 if between 10 and 20, and 2 otherwise.

The missing causal arrow in Figure G4 may be an indication of potential problems due to binarizing ADT (and most other predictor variables) in the Propensity Score and CD estimates. The consequent potential for confounding was not made clear in the paper.

Issue 7—Choices, Assumptions, and Uncertainties

Modeling is the transformation of data into findings. While modeling, one makes choices and assumptions. Thus, the modeler may choose between maximizing likelihood or minimizing squared differences, decide whether Terrain should or should not be a predictor variable, etc. The modeler may assume that safety in the population of units is gamma distributed, and assume that a log-normal model equation correctly represents the dependence of safety on the available predictor variables. In this sense the findings and conclusions are a function of the choices and assumptions that the modeler makes.

Review of Karwa et al. (2011), Conclusion

The main finding by Karwa et al. (2011) is that the more retroreflective the markings, the lower the probability of target accident occurrence. Thus, Karwa et al. (2011) find that increasing PMR from Low (139 < PMR < 200 mcd/m²/lx) to High (280 < PMR < 447 mcd/m²/lx) is estimated to lessen this probability by a factor of about 3. This finding is different from the findings of Avelar and Carlson (2014) who, using data from Michigan, found that the increase in accident frequency as a function of PMR deterioration is noticeable mainly when retroreflectivity is below 200 and 150 mcd/m²/lx. Karwa et al.
(2011) emphasize justifying the causal interpretation of observational road-safety data. Unlike most other researchers of road safety, Karwa et al. (2011) confront the difference between association and causation and show that, to allow the estimation of Crash Modification Factors (ACEs), the additional modeling work required by the PO or the CD framework must be undertaken.

Donnell et al. (2009)

Donnell et al. (2009) rely on the same data as Karwa et al. (2011), but their focus and approach are different. As in Avelar and Carlson (2014) and Carlson et al. (2013), reviewed earlier, the Donnell et al. (2009) approach is to fit a single-equation regression model³⁶ to the cross-section data and thereby aim to provide “[a]n assessment of the statistical association between pavement marking retroreflectivity and expected nighttime crashes. . .” (p. 50).

To develop the requisite database, Donnell et al. (2009) developed a procedure³⁷ for predicting how PMR degrades over time. The monthly estimates of PMR from this model were then appended to roadway inventory and target crash frequency data.

The strength of the Donnell et al. (2009) modeling effort is twofold. First, as noted in the review of Carlson et al. (2013), the assignment of a PMR value to a segment is problematic as it often requires various assumption-based imputations to be made. Here, in contrast, at each of the 19 sites (182 miles, 313 target crashes) that make up the Donnell et al. (2009) database, the PMR was measured at least twice over a minimum of one year. Second, the database and its analysis are month-by-month. This ensures that at least a part³⁸ of the concern about the seasonal-factor confounding³⁹ is eliminated.

³⁶ In the chosen model, the expected number of target crashes is proportional to segment length, to monthly ADT raised to the power β1, and to the product of the terms e^(βi xi), where the βs are regression parameters and the xi are the predictor variables % trucks, width of right shoulder, and the PMRs of the white edgeline, white skipline, and yellow edgeline.
³⁷ The procedure is based on the use of artificial neural networks.

The results obtained by Donnell et al. (2009) are summarized in Table G2. Most regression coefficients for PMR are negative. This indicates that after the influence of ADT, % trucks, and right shoulder width is accounted for, segments found to have more retroreflective pavement markings tend to have fewer target crashes. The exception is the estimate of β for yellow edgelines on multilane highways, where the opposite holds.⁴⁰

In much of the paper, Donnell et al. (2009) use the non-causal term association and refer to PMR being “related to” an increase or decrease in crash frequency. However, Donnell et al. (2009) also say that “. . . a 50-unit increase in pavement marking retroreflectivity of white edgelines on multilane roadways reduces the expected nighttime target crash frequency by approximately 18 percent. . .” (p. 58). To be consistent, Donnell et al. (2009) would also have to conclude that similarly increasing the retroreflectivity of yellow edgelines increases the expected nighttime target crash frequency. Using data from Michigan, Avelar and Carlson (2014) obtained the opposite result: a positive β_White and a negative β_Yellow.

Issue 8—Consistency in Regression Coefficients

It is possible but unlikely that increasing the visibility of white edgelines on multilane roads in North Carolina reduces accidents, whereas increasing the visibility of yellow edgelines has the opposite effect. It is also unlikely but possible that the opposite is true in Michigan.
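The 18 percent figure quoted from Donnell et al. (2009) follows from the exponential form of their model: a change Δ in a predictor with coefficient β multiplies the expected crash frequency by exp(β × Δ). The sketch below applies this to the multilane PMR coefficients reported in Table G2 (−0.004 for white edgelines and 0.007 for yellow edgelines). The function name is hypothetical, and the calculation describes the fitted association only, not a causal effect.

```python
import math

def percent_change(beta, delta):
    """Percent change in expected crash frequency implied by exp(beta * delta)."""
    return (math.exp(beta * delta) - 1.0) * 100.0

# Multilane coefficients from Table G2, evaluated for a 50-unit increase in PMR.
white_edgeline = percent_change(-0.004, 50)   # a reduction of roughly 18 percent
yellow_edgeline = percent_change(0.007, 50)   # an increase of roughly 42 percent

print(f"white edgelines:  {white_edgeline:+.1f}%")
print(f"yellow edgelines: {yellow_edgeline:+.1f}%")
```

Read in the same way as the white edgeline coefficient, the yellow edgeline coefficient implies an increase of about two-fifths in expected target crashes, which is the inconsistency that Issue 8 addresses.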
As discussed in Issue 4, the regression coefficients may not be a trustworthy indication of the safety effect of predictor variable manipulation; that is, as Karwa et al. (2011) show, the transition from association to causation requires modeling attention.

Table G2. Parameter estimates.

Predictor variable                       Multilane                        Two-lane
                                         β        Std. error   p-value    β        Std. error   p-value
β1 in ADT^β1                             0.44     0.20         0.03       1.10     0.29         <0.01
β2 in exp(β2 × (% trucks))               -0.02    0.01         0.23       0.06     0.03         0.02
β3 in exp(β3 × right shoulder width)     -0.08    0.04         0.04       -0.11    0.05         0.02
PMR white edgelines                      -0.004   0.002        0.04
PMR white skiplines                      -0.002   0.001        0.10
PMR yellow edgelines                     0.007    0.002        <0.001
PMR white edgeline                                                        -0.001   0.004        0.71
PMR yellow centerline                                                     -0.007   0.005        0.17

38 The monthly seasonal factors from a handbook (Roess et al. 2004) were applied to annual ADT estimates.
39 See the discussion of OVB above.
40 Donnell et al. (2009) say that “[t]his finding was unexpected and future research is recommended. . .” (page 57).

Review of Donnell et al. (2009), Conclusion

The Donnell et al. (2009) research provides credible estimates of PMR in the database, and the month-by-month modeling partly accounts for the seasonal variation. The results pertain only to PMRs larger than about 100 mcd/m²/lux. In this range, four of five regression parameters indicate that the larger the PMR, the fewer the nighttime target crashes; the one remaining parameter indicates the contrary. Donnell et al. (2009) say that “[b]ased on the findings from the present experiment, the statistical relationship between pavement marking reflectivity and nighttime crash frequency is relatively weak” (p. 58).

Smadi et al. (2010)

In their Problem Statement, Smadi et al. (2010) say that “It is assumed that lower retroreflectivity values are a contributing factor in some crashes (such as nighttime, single vehicle, and ROR41 crashes); however, a statistically significant relationship has not yet been determined” (p. 11). Therefore, “. . . a study utilizing measured retroreflectivity data accounting for the deterioration of pavement markings over time along with a sufficient amount of crash data is needed to provide a relationship between pavement marking retroreflectivity and safety performance” (p. 10). The practical importance of this study follows from the statement that “[i]f a statistically reliable relationship can be identified, agencies can improve their pavement marking strategies to reduce the number of nighttime crashes where low pavement marking retroreflectivity values are a contributing factor” (p. 11).

In the papers reviewed in the preceding sections, the data were from Michigan and North Carolina. Smadi et al. (2010) use data from Ohio. Estimates of the representative retroreflectivity (RR) that might have prevailed on each 1-mile road segment during two 4-month periods of a year were obtained.42 These estimates were produced from retroreflectivity measurements conducted once in the spring, once in the fall and, if the pavement markings were repainted in that year, from an additional retroreflectivity measurement (the Paint) done after the remarking.
How the RR thought to prevail in a period was determined is shown in Figure G6.43

[Figure G6. From Figure 7 in Smadi et al. (2010). Timeline diagram for two cases, No Re-Striping (April 1st, August 1st, December 1st) and Re-Striping (April 1st, Paint Date, December 1st), showing Periods 1 to 4 with the labels: Spring Retroreflectivity; Average of Spring & August* Retroreflectivity; Average of August* & Fall Retroreflectivity; Average of Paint & Fall Retroreflectivity. *August Retroreflectivity = Average of Spring & Fall Retroreflectivity.]

41 Run off the road.
42 This task is documented in the Database Preparation chapter and illustrated by the 16-step Crash and Retroreflectivity Assignment Procedure (Smadi et al. 2010).
43 For Period 3, the RR = 0.75 × (Spring Retroreflectivity) + 0.25 × (Fall Retroreflectivity); for Period 4, the RR = 0.25 × (Spring Retroreflectivity) + 0.75 × (Fall Retroreflectivity).

The safety effect of retroreflectivity was estimated via a regression. The data set used contained representative retroreflectivity values for each one-mile road section with accompanying predictor variables: vehicle miles traveled,44 line type, direction, road type,45 route number, and crash information. These were used in a logistic regression46 model of the form

\[ \frac{P(\text{crash})}{1 - P(\text{crash})} = \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n) \]

in which the βs are parameters and the xs are non-negative predictor variables. In this expression, a positive β is an indication that road sections found to have a larger value of x tend to have a higher probability of a crash occurring. Conversely, a negative β means that roads found to have a larger x tend to have a lower crash probability.

When data for all road types were used, the parameter estimates in Figure G7 were obtained. The authors note that “[n]either retroreflectivity nor VMT is (statistically) significant with high p-values”47 (p. 42). Smadi et al. (2010) say that “. . . only road type interstate and multilane divided are significant with p-values lower than 0.05. . . .” (p. 42). To illustrate the problem, consider the βs for Road Type in Figure G7, which express the safety of a road type relative to a two-lane road. Thus, the statistically significant estimate of βinterstate = 1.3354 would mean that in Ohio, on a mile of an interstate road, the probability of one or more target crashes occurring is 3.8 (= e^1.3354) times that of a two-lane road with the same vehicle miles traveled (VMT). Similarly, the β estimates for multilane highways imply that under otherwise identical conditions, the probability of a target crash on a mile of a divided highway is about twice (e^(0.9196−0.1384) = 2.2) that on a mile of an undivided multilane highway. These findings run counter to the Highway Safety Manual (AASHTO 2010).48 There may be issues with the data, the regression, or both.

Figure G7. All road types. From Table 13 in Smadi et al. (2010).
44 (AADT/2) × (Number of days)
45 The four road types were freeway, multilane divided, multilane undivided, and two-lane.
46 Logistic regression is used when the outcome is binary. Here the binary outcome is No Crash or One or More Crashes. Since on some of these 1-mile segments there must have been two or more crashes in a four-month period, the transformation of crash counts into a binary variable and the use of the logistic regression may amount to unnecessary loss of information.
47 Smadi et al. (2010) do not show p-values. However, the values in the rightmost column can serve a similar purpose. If the standardized variable Z in the adjacent column is larger than 2, then the parameter estimate is significant at a level of significance of less than about 0.05.
48 Interstates are safer than two-lane roads, and divided highways safer than undivided ones (e.g., Figures 11-3 and 11-4 in the first edition of the Highway Safety Manual).
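The multipliers quoted above for the Road Type coefficients can be reproduced with a few lines of arithmetic. In a logistic model, exp(β) is strictly a ratio of odds; it approximates a ratio of probabilities only when the crash probability is small. The sketch below shows both readings. The coefficient values are those quoted in the text; the reference probability of 0.05 and the helper name are hypothetical and serve only as illustration.

```python
import math

# Coefficients quoted in the text (relative to two-lane roads).
beta_interstate = 1.3354
beta_divided = 0.9196
beta_undivided = 0.1384

# exp(beta) is the odds ratio implied by a logistic model.
or_interstate = math.exp(beta_interstate)
or_divided_vs_undivided = math.exp(beta_divided - beta_undivided)

def probability_ratio(odds_ratio, p_reference):
    """Ratio of crash probabilities implied by an odds ratio, given the
    crash probability on the reference road type (hypothetical input)."""
    odds_reference = p_reference / (1.0 - p_reference)
    odds_new = odds_ratio * odds_reference
    p_new = odds_new / (1.0 + odds_new)
    return p_new / p_reference

print(f"interstate vs two-lane odds ratio:  {or_interstate:.1f}")
print(f"divided vs undivided odds ratio:    {or_divided_vs_undivided:.1f}")
# With a hypothetical 5% reference probability the probability ratio
# is somewhat smaller than the odds ratio.
print(f"probability ratio at p_ref = 0.05:  {probability_ratio(or_interstate, 0.05):.2f}")
```

The two odds ratios round to 3.8 and 2.2, matching the figures in the text; the gap between odds ratio and probability ratio shrinks as the reference probability approaches zero.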

In Figure G7 (and in all other parameter estimate tabulations in Smadi et al. 2010), the RR is a single number. Every road direction has at least two pavement markings. Thus, a two-lane road has a white edgeline and a yellow centerline, a multilane divided road has a white and a yellow edgeline (as well as skiplines, which are not included in this data set), etc. The choice of RR was based on the kind of crash. Thus, “[r]uns-off-the-road right and ROR straight crashes were assumed to potentially be white edge line related. Cross-center line and ROR left crashes were assumed to potentially be yellow center line or yellow edge line related” (p. 20). This procedure does not make clear which RR to use when a 1-mile segment had no crashes. Similarly, it is uncertain how to proceed when one crash on a segment is related to the white edgeline and another to, say, the yellow centerline. In this case, the decision was to represent that 1-mile segment by two records, each with one crash, the same VMT, but with different RRs. Inasmuch as the retroreflectivity of yellow markings is usually about 200 mcd/m²/lux less than that of white markings of the same age, the regression software cannot distinguish between a data point representing a faded white marking and one representing a fresh yellow one.

Smadi et al. (2010) tried separate analyses for various subsets of the data. The corresponding estimates of βRR are collated in Table G3. The estimates fluctuate around 0, and no clear conclusions can be drawn. However, Smadi et al. (2010) focus on estimates that did reach statistical significance.
Thus, in their Conclusions section they say that “[r]etroreflectivity was found to be a statistically significant factor in crash probability occurrence at a 90% confidence level for the interstate data subset, but the positive parameter estimate suggested increasing crash probability with increasing retroreflectivity values.”

Later in the Conclusions section they assert that “retroreflectivity was found to be a significant parameter for all line types—at 90% confidence level for white edge lines, at 95% confidence level for yellow edge lines, and at 99% confidence level for yellow center lines. For white edge lines and yellow center lines, crash occurrence probability was found to increase by decreasing values of longitudinal pavement marking retroreflectivity”50 (p. 55).

Table G3. βRR estimates for various data subsets.

Kind of data and analysis                            Estimate of βRR   Standard error
Interstate roads only                                0.0010*           ±0.0006
Two-lane roads only                                  -0.0005           ±0.0004
Multilane undivided roads only                       0.0019            ±0.0020
Multilane divided roads only                         -0.0008           ±0.0007
High-crash routes in 2008                            0.0033            ±0.0014
High-crash routes in 2007                            -0.0000           ±0.0009
...                                                  …                 …
Low PMR segments (less than 200 mcd/m²/lx)           -0.0001           ±0.0009
High PMR segments (more than 200 mcd/m²/lx)          -0.0016           ±0.0031
Very low PMR segments (less than 100 mcd/m²/lx)      0.0009            ±0.0006
White edgeline (with subject effect49)               -0.0005*          ±0.0003
Yellow edgeline (with subject effect)                0.0021*           ±0.0010
Yellow centerline (with subject effect)              -0.0022*          ±0.0008
* Statistically significant

49 The subject effect accounts for the correlation of readings taken on the same Route.
50 The estimates in the three bottom rows of Table G3 are coupled with estimates of βRoad type. Thus, the βRR for white edgelines is coupled with the significant estimate of βinterstates, which implies that, relative to two-lane roads with the same VMT, interstates have a 3.6 times higher probability of one or more target crashes.

The question whether single-equation regressions built from cross-section data can be used to predict the effect of manipulations51 was discussed under Issue 4. However, the Smadi et al. (2010) regression is problematic in several other ways. First, a logistic regression was used when the crash count outcome is not binary. This amounts to a loss of information. Second, no argument was given to support the choice of functional form exp(β0 + β1 x1 + . . . + βn xn) and, in fact, it would be difficult to do so. To explain, as VMT → 0, the e^(β_VMT × VMT) factor in the model equation approaches 1. However, when there is very little traffic that factor should approach 0, as one expects the probability of a crash to be very small; the chosen functional form does not allow it to be so. Third, it is unclear if an attempt was made to improve the model by using a more appropriate functional form. The modelers used a postulated functional form. Fourth, no predictor variables representing lane width, shoulder width, curvature, illumination, presence of rumble strips, etc. were used. When relevant predictor variables are few, the variance of parameter estimates is bound to be large. Fifth, there is the OVB. As in the research reviewed earlier, Periods 1 and 2 and Periods 3 and 4 in Figure G6 differ not only in retroreflectivity but also in the seasonal factors that reflect Spring to Fall changes in traffic, weather, etc. The Ohio seasonal factors are listed in Table G4. Since seasonal factors are not in the model equation, the regression cannot distinguish between what is due to the change in retroreflectivity and what is due to seasonal variation. Sixth, crash frequency is usually proportional to period duration, but its relationship with AADT is not one of proportionality. By combining AADT and duration into one predictor variable (the VMT), this difference is lost.
Seventh, as discussed earlier, using data for the same section twice, once with a white and once with a yellow PMR, confuses the regression.

Smadi et al. (2010) conclude that “these findings support increased investment in marking application and maintenance” (p. 55). Yet the estimates of βRR in Table G3, whether statistically significant or not, oscillate around zero. As evident from this and the previous sections, the leitmotif of this review is that the accumulation of past research results did not lead to consensus. The Smadi et al. (2010) research exemplifies the difficulty.

Smadi et al. (2008)

Preceding the Smadi et al. (2010) paper is the Smadi et al. (2008) paper. It is based on similarly collected but less extensive Ohio data52 and the same analysis method was used. Separate models were estimated for the entire database, for two-lane roads, and for low-retroreflectivity records. The comparison of the βRR estimates in 2008 and 2010 for two-lane roads is in Table G5. It is not possible to determine their magnitudes or differences. Using five years of data instead of three reversed the sign of the βRR estimate for two-lane roads.

Table G4. Seasonal factors in Ohio.

                   April–July   August–November
Rural two-lane     0.83         1.06
Rural multilane    0.84         1.15
Rural freeway      0.93         1.04

51 The word manipulation is used here in a generic sense, representing words such as treatment, intervention, design alternative, and change of trait.
52 The 2010 report used data from 2004 to 2008 whereas the 2008 paper used data from 2004 to 2006.
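The second objection raised in the review of Smadi et al. (2010), that the exponential factor for VMT cannot approach 0 as traffic vanishes, can be seen numerically. The sketch below compares the factor e^(β × VMT) with a power-form factor VMT^β, the kind of term that appears for ADT in Table G2. The power form is offered only as one illustrative alternative, not as the form the reviewed authors should have used, and both β values are hypothetical.

```python
import math

def factor_exponential(vmt, beta):
    """Traffic factor of the form exp(beta * VMT), as in the logistic model."""
    return math.exp(beta * vmt)

def factor_power(vmt, beta):
    """Traffic factor of the form VMT ** beta (illustrative alternative)."""
    return vmt ** beta

# Hypothetical parameter values, chosen only to show the limiting behavior.
BETA_EXP = 1e-6
BETA_POW = 0.5

for vmt in (100000.0, 1000.0, 10.0, 0.1, 0.0):
    print(f"VMT={vmt:>9}: exp form={factor_exponential(vmt, BETA_EXP):.4f}, "
          f"power form={factor_power(vmt, BETA_POW):.4f}")
```

As VMT falls to zero the exponential factor settles at 1, so the model still predicts a nonzero crash probability on a road with no traffic, whereas the power factor falls to 0.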

When road type, VMT, and marking color were used as predictors and only records with retroreflectivity below 200 mcd/m²/lx were included, the results in Figure G8 were obtained. The corresponding estimates from Smadi et al. (2010), where two more years of data were used, are in Figure G9.

Smadi et al. (2008) say that because edgelines were found to be safety effective, “intuition leads one to assume that pavement marking visibility and retroreflectivity would also have a positive effect on safety performance. However, models of the entire database and the two-lane records did not show that lower pavement marking retroreflectivity correlating to a higher crash probability. But, when records with only low retroreflectivity values were analyzed (≤200 mcd/m²/lx), a negative correlation was found to be statistically significant.”

Table G5. Comparing βs in 2008 and 2010.

Kind of data           2010 Estimate of βRR   2008 Estimate of βRR
All road types         -0.0002                -0.0005
Two-lane roads only    -0.0005                0.0006

Figure G8. Copy of Figure 4 in Smadi et al. (2008).

Figure G9. Copy of Table 27 from Smadi et al. (2010).

The β estimates in Figures G8 and G9, significant or not, are often different. Thus, with three years of data in Smadi et al. (2008), the estimate of βRR is a statistically significant −0.0021, whereas with two more years of data in Smadi et al. (2010), the same parameter estimate is a statistically insignificant −0.0001. The 2008 analysis shares not only the data structure and analysis approach but also the regression problems listed earlier in the review of Smadi et al. (2010). Thus, also in Smadi et al. (2008), a freeway has a larger probability of target accident occurrence than all other road types. If the estimates of βRoad type are not believable, then the βRR of −0.0021 is also in doubt.

The relationship between retroreflectivity and safety remains elusive. As Smadi et al. (2008) note, “It has been shown in previous research that greater longitudinal pavement marking retroreflectivity levels increase drivers’ visibility and detection distance. However, increased visibility may also cause drivers to feel too comfortable during nighttime conditions and drivers may then pay less attention and/or operate at unsafe speeds” (Abstract). The net effect can go either way, and the direction it went is not determined by the reported results.

Even so, the authors conclude that “[t]his study identified a statistically significant relationship between low pavement marking retroreflectivity levels and safety performance. With this new information, it is hoped that agencies can make more informed decisions about their pavement marking management programs and achieve the ultimate goal of reducing the number of nighttime crashes where low pavement marking retroreflective values are a contributing factor.” This is the same predisposition to believe that more visible markings must be good for safety.
It may have been more accurate to say that while one β was statistically significant, others were not, and that the one significant β comes from a regression with several unexpected results.

Dravitzki et al. (2006)

The question in all these studies is how a change of PMR is likely to affect safety. The studies reviewed used observational cross-section data to determine how crash frequency or probability depends on unit traits, PMR being one such trait.53 The Dravitzki et al. (2006) research report used a different approach. At its center is a treatment that, when applied to certain units, separates time into Before and After treatment periods.

The treatment in Dravitzki et al. (2006) was the implementation in New Zealand of PMR performance standards54 expressed in the use of reflectorized instead of non-reflectorized markings. Dravitzki et al. (2006) note that “the improvement of reflectorised markings over non-reflectorised markings is about 20 to 40%; that is by about 0.5 to 1.0 seconds [which the reflectorization adds to the drivers’ preview time]” (p. 15). The 3-year Before period was separated from 1- or 2-year After periods by about a half-year Installation period. The data pertain to 5,133 km of state roads and 4,344 non-intersection crashes.

To describe the target crashes, Dravitzki et al. (2006) say that “[f]or this study a range of crash types will be used. First, all mid-block crashes will be considered; then second, obvious crash types not affected by delineation will be removed from the data set (such as rear end crashes); and finally, only single-vehicle loss of control/run off road type crashes will be considered” (p. 18).

53 The exception is the CD approach in Karwa et al. (2011), where conditional distributions were used instead of regression.
54 The standards were developed by Transit New Zealand which, between 1989 and 2008, was the agency responsible for operating and planning the New Zealand State Highway network.

The effect of the treatment was sought in the:
(a) Change in average number of crashes.
(b) Change in the ratio of day to night crashes.
(c) Comparison of crash rate in regions where treatment was implemented and other regions where not.
(d) Change in the ratio of crashes on straight sections of road versus crashes on curves.

About (a), the change in the average number of midblock crashes from the Before to the After period, Dravitzki et al. (2006) say that “[a]t this coarse level any effect of the change in marking brightness cannot be seen. . . . The differences vary markedly between regions, giving no overall trend” (p. 26).

About (b), Dravitzki et al. (2006) say that “[c]omparing the ratios of day to night crashes (that could be affected by the improved PMR) helps to reduce effects arising from moderate changes in traffic volume over the time of this study, as it does also for the split in traffic travelling at night compared to the day for the different sites. Due to the low number of crashes in many groups, comparisons are valid for only the ‘rural, no street lighting’ group. . . . The hypothesis is that bright roadmarkings should have little impact on crashes in daylight but should reduce night-time crashes” (p. 26). The crash counts are in Table G6.

If one assumes that had there been no treatment the number of crashes at night would have increased in the same proportion as the number of day crashes did, one should expect in the After period 250 × (585/517) = 283 night crashes. Since, with the treatment implemented, 322 such crashes occurred, Dravitzki et al. (2006) say that this “implies that the intervention has not been successful” (p. 27). Post treatment, about 14% more crashes than expected have occurred, indicating harm rather than absence of success.
This assumption might be invalid if there was a decline in drinking and driving between the Before and After periods. With blood alcohol concentration (BAC) more of a factor at night, were the decline in BAC considered, fewer than 283 nighttime crashes would be expected; if so, the estimated harm attributed to improved PMR would be even larger. Conversely, suppose that in the After period more police officers were assigned to nighttime duty and, as a result, a larger proportion of crashes got reported. Had the increase in reporting been accounted for, more than 283 nighttime crashes should be expected, and the estimated harm attributed to PMR would have been smaller.

In addition, there is the statistical (in)accuracy of the estimate. The number of crashes in Table G6 is relatively small so that, even if what was assumed were true, the standard error of the estimated increase (by 14%) is ±12%.

About (c), Dravitzki et al. (2006) say that “[a] comparison of crash rates . . . within the treated regions with crash rates . . . on all state highway open roads was made. This gives context to the crash rates on the treated regions” (p. 28). They conclude that “the trend found for the treated state highways is mirrored almost exactly by these comparable roads and datasets. This again points to the brighter roadmarkings having little impact on crash rates” (p. 30).

Their last approach, (d), was to examine the day/night ratios for straights (tangents) and curves. The rationale for “. . . this type of analysis is the commonly-held belief (supported by the research discussed in the executive summary) that delineation has a particular role in assisting safe driving around curves at night-time” (p. 32). This kind of comparison “. . . normalises for traffic volume, weather conditions, and driver attributes since each driver making a particular journey needs to successfully negotiate each straight and curve along the route” (p. 33).

Table G6. Crash counts.

         Before   After
Night    250      322
Day      517      585
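The arithmetic behind the 283 expected night crashes, the 14% excess, and a standard error of roughly 12% can be retraced from Table G6. The source does not say how the ±12% was obtained; the sketch below assumes one common approximation, treating the four counts as independent Poisson variables and applying the delta method to the logarithm of the ratio. That method is an assumption of this sketch, although it happens to give about the same value.

```python
import math

# Crash counts from Table G6.
night_before, night_after = 250, 322
day_before, day_after = 517, 585

# Night crashes expected in the After period had night crashes grown
# in the same proportion as day crashes.
expected_night_after = night_before * day_after / day_before

# Ratio of observed to expected night crashes in the After period.
ratio = night_after / expected_night_after

# ASSUMPTION: independent Poisson counts; Var(log ratio) is then
# approximately the sum of reciprocals of the four counts.
var_log_ratio = (1 / night_before + 1 / night_after
                 + 1 / day_before + 1 / day_after)
se_ratio = ratio * math.sqrt(var_log_ratio)

print(f"expected night crashes: {expected_night_after:.0f}")
print(f"estimated increase:     {100 * (ratio - 1):.0f}%")
print(f"approx. standard error: ±{100 * se_ratio:.0f}%")
```

With a standard error nearly as large as the estimate itself, the 14% increase is barely more than one standard error away from no change.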

As in (a), (b), and (c), Dravitzki et al. (2006) conclude that “[c]omparing the ‘before’ and ‘after’ periods . . . and comparing first all open roads . . . and second all state highway open roads . . . with the improved state highway open roads . . . , shows no noticeable change caused by the brighter roadmarkings. This finding, though unexpected, is explained by the same discussion given previously, that of drivers’ adaptive behaviour” (p. 36).

Dravitzki et al. (2006) summarize by saying that “[c]omparisons were completed several ways: annual average crash frequency, ratio of light to dark crashes, and crashes on curves compared to straights. In all the methods no significant trend of brighter roadmarkings having an effect on reducing crash rates could be isolated” (p. 39). The authors offer several qualifications to this finding: that the number of crashes was small, that the effects of improved PMR may have been nullified (on some roads) by the presence of other delineation devices such as reflectorized raised pavement markers and edge marker posts, that the treatment did not increase retroreflectivity by much, and that there may have been changes in crash reporting that obscure the real effect.

The Dravitzki et al. (2006) study has two main messages:
• Improving PMR was not associated with a reduced crash frequency, and perhaps there was an increase.
• Improved PMR may not translate directly into better safety because of driver adaptation (perhaps via higher speed, less vigilance, and others).

The Dravitzki et al. (2006) report is the first before-after study in this review and follows six cross-section type studies. Dravitzki et al. (2006) find that improved retroreflectivity did not reduce crash frequency. The findings of the cross-section type studies are variable. Karwa et al.
(2011) conclude that increasing retroreflectivity from Low to Medium and from Medium to High does reduce the target crash frequency substantially. Donnell et al. (2009) also find, similarly, that except for yellow edgelines on multilane highways, road segments with more retroreflective markings have fewer target crashes. In their 2010 report, Smadi et al. (2010) find retroreflectivity to be a statistically significant factor in crash occurrence and claim that their results support increased investment in the retroreflectivity of pavement markings. In their 2008 study, which uses the first three years of the data used in the 2010 study, Smadi et al. (2008) also speak of a statistically significant relationship, but only for low retroreflectivity levels.

The conclusions of the before-after study by Dravitzki et al. (2006) differ from the conclusions of the preceding six cross-section studies. Which kind of study is more trustworthy under what conditions? A discussion of the progenitors of observational before-after and cross-section studies, of their conceptual underpinnings, and of their strengths and weaknesses is in Chapter 2.

Bahar et al. (2006) and Masliah et al. (2007)

The observational cross-section regression studies herein used data about predictor variables such as PMR, Segment Length, AADT, Speed Limit, Horizontal Alignment, etc. to estimate the expected number of target accidents of road segments using a regression equation. These studies aimed to separate the contribution to the expected target accidents of the PMR from that of the other predictor variables. The distinctive feature of this approach is that the dependent variable (number of accidents) is for a certain period and the predictor variables are for the same period; one road segment, one data point.

Bahar et al. (2006) (and Masliah et al. 2007 in the derived paper) take a different tack. For each road segment in their data, they have month-by-month information about the number of
target accidents55 and about how long it has been since the most recent repainting. Models were developed to predict the decline in PMR as a function of time since last repainting.56 The PMR (by color and function) that prevailed in the month when the count of accidents materialized was predicted for each road segment. And so, each road segment had a time series of data points. Table G7 (an excerpt from Table 3 in Masliah et al. 2007) is an illustration of what the data for one road section would look like. The data pertain to California, and their extent is shown in Figure G10.

Table G7. Data for one road segment.

Year   Month      Months after repainting   PMR   Crash count
1998   January    7                         386   0
       February   8                         335   3
       …          …                         …     …
1999   June       13                        129   2
       July       0                         386   0
       August     1                         335   3
       …          …                         …     …
2000   November   4                         239   1
       December   5                         218   1

Figure G10. Tables 61 and 62 from Bahar et al. (2006).

The advantage of this time-series approach is that, because the fixed traits of a segment (such as Segment Length, AADT, Speed Limit, Horizontal Alignment) did not change over time, their influence on the expected number of target crashes did not have to be modeled or accounted for. The influence of month-by-month changes in traffic, precipitation, type of traveler, etc. was represented by monthly multipliers. The power of the method to correctly estimate the effect of PMR on safety where one exists was checked and verified using artificially created realistic data.

Estimates of the effect of the PMR on the expected target accident frequency were expressed as CMFs. To explain: a CMF of 1 indicates no effect and a CMF of 0.95 indicates a 5% decrease. The results grouped into retroreflectivity bins are in Figure G11 (the notation for the CMF is qr). For all three highway types and for both white and yellow markings, the CMFs fluctuate around 1. Thus, if a safety effect exists, it is very small. Nor is there, within the range of available data, a detectable dose-effect relationship.

Figure G11. Figures 65 and 67 from Bahar et al. (2006).

Issue 9—Politics and Criticism

At the time the findings by Bahar et al. (2006) were published, the Federal Highway Administration (FHWA) was in the process of introducing into the Manual on Uniform Traffic Control Devices (MUTCD)57 a requirement to maintain minimum retroreflectivity levels. Acting on the congressional direction, FHWA said in the Federal Register that “[t]he FHWA is now proposing the establishment of minimum pavement marking retroreflectivity levels in the MUTCD. The FHWA has analyzed and considered technical research results as well as input from participants of FHWA-sponsored workshops (as discussed later in this document) and developed proposed minimum maintained pavement marking retroreflectivity levels for the MUTCD.”

55 Non-daylight, non-intersection.
56 The models were based on information collected by The National Transportation Product Evaluation Program that collects data at test decks in various states. Separate prediction models by Color, Material, Climate Region, and Snow Removal were developed; Pavement Space and Traffic Volume variables did not improve prediction.
57 The MUTCD, which has been administered by the FHWA since 1971, is a compilation of national standards for all traffic control devices, including road markings, highway signs, and traffic signals. It is updated periodically to accommodate the nation’s changing transportation needs and address new safety technologies, traffic control tools, and traffic management techniques.

To document the No Effect finding by Bahar et al. (2006), FHWA funded the development of a synthesis of research findings on the benefits of pavement markings by Carlson et al. (2009) that “included a critical review of the results” by Bahar et al. (2006) (Federal Register, p. 20936).58

CPA state that the Bahar et al. (2006) study has “significant limitations” (pp. 12, 21): (1) the study used data only from California, (2) the retroreflectivity values used are modeled, not measured, (3) the study presupposes that markings in California never reach a value where there is an adverse impact on safety, (4) the PMR bins were not selected on a basis of a logarithmic scale, and (5) all bins used acceptable or “above minimum” PMR levels. CPA state that “[c]ombined, these limitations and concerns seriously challenge the quoted concluding remarks shown above.”59 Carlson et al. (2013, p. 59) repeat these criticisms and say that they “limit [the] acceptability” of the zero-effect conclusion.

Limitation (1) was that Bahar et al. (2006) used data from only one state (California). The same limitation is present in all the other reviewed studies. Avelar and Carlson (2014) and Carlson et al. (2013) used data from Michigan only; Karwa et al. (2011) and Donnell et al. (2009) used only North Carolina data; Smadi et al. (2008, 2010) used only Ohio data. For Bahar et al. (2006), the main attraction of data from California was the availability of the repainting date.

Limitation (2) was that Bahar et al. (2006) used data-based models to predict retroreflectivity and its decline instead of the measured retroreflectivity levels used by Carlson et al. (2013). While Carlson et al.
(2013) did use measured values of PMR, seldom were these for the road segment and period of interest.60 To match PMRs and crash data to a certain road segment and period, Carlson et al. (2013) had to use various model-based imputation rules.61 Thus, both Bahar et al. (2006) and Carlson et al. (2013) used models to estimate the PMR of the road section to which the crash data pertain. It is unclear which of the two approaches to PMR estimation yields the more appropriate retroreflectivity values. The third limitation CPA note seems to be a misconception. CPA say that the Bahar et al. (2006) study presupposes that retroreflectivity never declines below a level that can adversely impact safety. Limitation (4) is, as CPA note, about the use of “binning.” Bahar et al. (2006) examined the safety effect of retroreflectivity by grouping roads segments with similar PMR values into bins.62 They chose the bin boundaries to ensure that each bin contains enough data for reliable safety effect estimation. CPA think that the bin boundaries ought to be a logarithmic function of PMR because “. . . the performance of retroreflectivity has been repeatedly shown to be best modeled logarithmically.” (p. 13 and Carlson et al. 2013, p. 59). 58 Federal Register, Vol. 75, No. 77, April 22 2010/Proposed Rules, 20935–20941 59 The concluding remarks by Bahar et al. (2006) to which CPA refer are that “. . .the difference in safety between new markings and old markings during non-daylight conditions on non-intersection locations is approximately zero.” 60 According to Table 1 (page 62) in 3% to 4% of the cases. 61 To illustrate, nearly all retroreflectivity measurement are taken in September–October, at the end of the re-marking season. To get a temporarily imputed value for, April, for example, Carlson et al. (2013) used the measured value for the previous end of season and applied a fixed monthly degradation value (e.g., –22 mcd/month for white edgelines). 
If the result was negative, the PMR was set to 50 mcd/m2/lx. Similarly, if a segment had no measured value (40%–50% of cases), an imputed value from within 2 miles was used. 62 Thus, from Table 69 (Bahar et al. 2006, page 158), for white markings on two-lane roads the bins were 21–184, 185–204, 205–225, 226–250, 251–263, 264–292, 293–328, 329–341, and 342–413 mcd/m2/lux; and from Table 71 (Bahar et al. 2006, page 159), for yellow markings on two-lane roads the bins were 15–82, 83–100, 101–115, 116–131, 132–149, 150–165, 166–187, 188–201, and 202–238 mcd/m2/lux.
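
The temporal imputation rule described in footnote 61 can be sketched in a few lines. This is only an illustration of the rule as summarized there, not the procedure actually coded by Carlson et al. (2013): the function name `impute_pmr` and its arguments are hypothetical, and the only values taken from the footnote are the example loss of 22 mcd per month for white edgelines and the 50 mcd/m2/lx substitute for negative results.

```python
def impute_pmr(last_measured, months_elapsed, monthly_loss=22.0, fallback=50.0):
    """Estimate PMR some months after the last end-of-season measurement.

    Hypothetical helper: subtracts a fixed monthly loss from the last
    measured value and substitutes `fallback` when the result is negative.
    """
    estimate = last_measured - monthly_loss * months_elapsed
    return fallback if estimate < 0 else estimate


# Measured in October, imputed for the following April (6 months later).
print(impute_pmr(180.0, 6))  # 180 - 22*6 = 48.0
print(impute_pmr(100.0, 6))  # 100 - 132 is negative, so 50.0
```

Under these assumed inputs, the segment that started with the higher measured value ends up with the lower imputed value (48 versus 50), which shows how sensitive an imputed PMR can be to the rule chosen.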

The last limitation that CPA note is that in the California data, the PMR is seldom less than 100 mcd/m2/lx.63 For yellow markings, the leftmost bins contain retroreflectivities of 15–79 mcd/m2/lx a. The reverse is also true. Since data with white markings and PMR > 100 mcd/m2/lx and yellow markings with PMR > about 80 mcd/m2/lx were plentiful, the zero-effect conclusion in these ranges should stand.

Segments with low PMRs are rare in both the Bahar et al. (2006) (California) and the Carlson et al. (2013) (Michigan) data. Like Bahar et al. (2006), Carlson et al. (2013) did their analysis separately “using a subset of the data having YCntr retroreflectivity values satisfying the following thresholds: ≤200 mcd/m2/lx, ≤150 mcd/m2/lx, and ≤100 mcd/m2/lx” (p. 63). Carlson et al. (2013) assert that “[w]hen a subset of data in Database B was used, with WLane ≤ 200 mcd/m2/lx, the effects of WLane retroreflectivity on nighttime crashes and single-vehicle nighttime crashes were found to be statistically significant. The negative coefficients suggested that expected nighttime crash frequency and single-vehicle nighttime crash frequency decrease as WLane retroreflectivity increases, for low retroreflectivity values of WLane” (p. 65). Were a similar “negative coefficient” present in the Bahar et al. (2006) data, they too should have found a non-zero safety effect in their tables. No such effect is present in the Bahar et al. (2006) safety-effect estimates for the low PMR bin. That the California and Michigan data sets lead to different conclusions about the low PMR bin may reflect different re-marking practices in the two states. Bahar et al. (2006) stated that their conclusions pertain to “. . . roads that are maintained at the level implemented by California” (p. 3). It is therefore unclear why the results by Bahar et al. (2006) for the lowest PMR bin are found objectionable when the results by Carlson et al. (2013) for the same bin are not.

In Carlson et al. (2013), the question was “whether a correlation between pavement marking retroreflectivity and safety can be established” (p. 59). Bahar et al. (2006) ask, similarly, “. . . how non-intersection, non-daylight (night, dawn, and dusk) safety is impacted by the change in retroreflectivity of longitudinal pavement markings and markers” (Bahar et al. 2006, p. 4). Bahar et al. (2006) say that within the range of PMRs found in California, there is no evidence that more reflectivity is associated with fewer target accidents. While this finding may not say much about minimum acceptable levels of PMR, none of the limitations noted by Carlson et al. (2013) casts doubt on the validity of the Bahar et al. (2006) findings.

Review of Bahar et al. (2006), Conclusion

There were also some earlier studies. Using data from Michigan, Lee et al. (1999) found no evidence that nighttime crash frequency is sensitive to pavement marking retroreflectivity levels. Migletz and Graham (2002), in a before-and-after study, set out to determine whether “longer lasting more retroreflective materials reduced crashes.” Their results were mixed and inconclusive. Drivers adjust their behavior to circumstances in both visible and hard-to-discern ways, making the safety effect of treatments difficult to anticipate.

Lessons of Past Research: Summary

Eight recent and extensive research studies were reviewed.

Using Michigan data, Avelar and Carlson (2014) concluded that target crash frequency is a function of the Sum and the Difference of the white and yellow PMRs. Tabulation of their final model shows that for any PMRYellow, increasing PMRWhite will cause an increase in crash frequency. Using the same data as Avelar and Carlson (2014) but contrary to their findings, Carlson et al. (2013) report that the statistically significant regression coefficients all indicate that increasing the PMR (white or yellow) is expected to reduce the crash frequency.

Karwa et al. (2011) use data from North Carolina. Unlike Avelar and Carlson (2014) and Carlson et al. (2013), they distinguish between association and causation. Applying five different variants of causal inference approaches, they consistently find that a decrease in PMR causes an increase in the probability of target crash occurrence. Donnell et al. (2009) use the same observational cross-section data from North Carolina as Karwa et al. (2011), but their conclusions are based on fitting a model equation to it. The results pertain only to PMRs larger than about 100 mcd/m2/lx. In this range, four of five regression parameters indicate that the larger the PMR, the fewer the nighttime target crashes; the one remaining parameter, the most accurate one, indicates the contrary.

Smadi et al. (2010) use data from Ohio. They too fit a model equation to cross-section data. Some PMR regression coefficients are positive, and some are negative. Regression coefficients of some other predictor variables are incorrect in magnitude. Smadi et al. (2008) use similarly collected but less extensive Ohio data. They also fit a model equation to observational cross-section data and use the same analysis method as Smadi et al. (2010).

Unlike the aforementioned studies, that of Dravitzki et al. (2006) did not fit a model equation to cross-section data but compared target crash frequencies before and after a change in PMR. The effect was sought in the change in the average number of crashes, the change in the ratio of day to night crashes, the comparison of crash rates in regions where the treatment was implemented and in regions where it was not, and the change in the ratio of crashes on straight sections of road to crashes on curves. In none of these could they see that brighter road markings reduce crash rates.

Bahar et al. (2006) and Masliah et al. (2007) use monthly target crash counts and PMR estimates for California road segments to determine whether the number of crashes increases as the PMR diminishes over time. As no such increase could be found, they are led to a No Effect conclusion.

These eight studies used observational data and produced diverse and often contradictory results. As a result, there is no evidence-based consensus about the safety effect of PMR. Does this mean that observational data cannot produce trustworthy CMFs? Chapter 2 will examine this question.

63 The same limitation applies to the data used by Carlson et al. (2013). In their Figure 2 (page 61), markings with less than 100 mcd form a minuscule proportion.

CHAPTER 2
Ways to Determine the Effect of a Cause

Chapter 1 is a review of several studies that tried to determine how changing the retroreflectivity of pavement markings is likely to affect the expected frequency of target accidents. The studies are of either the Observational Cross-Section or the Observational Before-After kind. The results are mixed and inconclusive. The cross-section studies tended to conclude that increasing the PMR seems to reduce the frequency of target crashes, while the observational before-after studies did not find such reductions.

Introduction

The question of what change in target crashes is caused by a change in PMR is an instance of the broader question of what change in the safety64 of some unit (a road segment, an intersection, a driver, etc.) is caused by some manipulation. If the unit is a road segment, then traffic flow, road geometry, prevailing speed, and amount of precipitation are some of its many safety-related traits. If the unit is a driver, then gender, age, annual mileage, and personality are some safety-related traits.65

One can broaden the same reasoning by replacing the safety property of a unit with a property of interest (PoI). In this more general setting, units have PoI-related traits that, if manipulated, will affect the PoI. To illustrate, the unit may be a field, with average crop yield as the PoI, soil quality and amount of sunshine as some PoI-related traits, and fertilization as the manipulation. Such a broadening of terminology makes it clear that concern about cause and effect in road safety is an instance of the general concern about the effect of manipulations in agriculture, education, epidemiology, economics, and other applied disciplines.

There are many ways to estimate what change in the PoI is caused by a manipulation. Figure G12 shows five prototype ways. Whichever way is used, to determine what change in the PoI is caused by the manipulation of some trait or traits of interest, one must compare what the PoI was with the manipulation to what the PoI would have been at the same time without the manipulation. What the PoI was with the manipulation is based on observation, measurement, and statistical estimation. What the PoI would have been at the same time but without the manipulation cannot be observed and measured since such a state never existed; it can only be predicted with varying degrees of plausibility and confidence. This comparison between what can be estimated and what can only be predicted is shown in Figure G13.

64 The safety property of a unit is its expected accident frequency by severity.
65 Some safety-related traits are fixed (e.g., the length of a road segment, or the gender of a driver) and some vary over time (e.g., the traffic flow of a road segment or the annual mileage of a driver).
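
A minimal numerical sketch of the essential comparison may help fix ideas. The numbers are invented, the function name is hypothetical, and expressing the effect as a ratio of the estimate to the prediction (the form in which crash modification factors are commonly reported) is an assumption of this example rather than something prescribed in the passage above.

```python
def effect_of_manipulation(poi_with, poi_without_predicted):
    """Compare What Was (estimated) with What Would Have Been (predicted).

    Returns the reduction in the PoI and the ratio of the two values.
    Both inputs are assumed to refer to the same unit and the same period.
    """
    reduction = poi_without_predicted - poi_with
    ratio = poi_with / poi_without_predicted
    return reduction, ratio


# Invented example: 8 target crashes per year estimated with the manipulation,
# 10 per year predicted for the same period without it.
reduction, ratio = effect_of_manipulation(poi_with=8.0, poi_without_predicted=10.0)
print(reduction, ratio)  # 2.0 0.8
```

The arithmetic is trivial; the difficulty lies entirely in the second argument, which cannot be observed and whose trustworthiness depends on how the prediction was produced.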

Figure G12. Five study prototypes.
Figure G13. The essential comparison.

The “at the same time” phrase makes it clear that PoI-related traits unaffected by the manipulation are the same when the What Was and the What Would Have Been are compared. The need to keep these traits unchanged or to correct for what change did occur is a nuisance. This is why the PoI-related traits that are unaffected by the manipulation will be called the nuisance traits. The manipulation can be said to have caused the change in the PoI only if the nuisance traits under the circumstances to which the estimate and the prediction pertain are the same.

The five prototypes in Figure G12 differ in the way the Prediction is produced. They were placed on a slanting loss-of-control axis. The less control one has over the source of data, the less confident one can be in the prediction.

Summary and Lessons

Several studies of the safety effect of PMR were reviewed in Chapter 1. They all used observational data of the before-after or of the cross-section kind. The studies did not lead to consensus.

The cause-effect determination issue is described as follows: There is a PoI and a PoI-related trait of units that can be manipulated. It is the effect on the PoI of this manipulation that is of interest. Units also have several nuisance traits that, if changed, will affect their PoI. To determine what change in the PoI is caused by the manipulation of the trait of interest, one compares the PoI with the manipulation in place to what the PoI would have been without the manipulation if the nuisance traits in the with and without manipulation conditions stayed at the same level. This is the ceteris paribus condition.

The cleanest way to create ceteris paribus conditions is in the laboratory, when the same unit can be repeatedly manipulated without changing its PoI-related traits and where all nuisance traits can be held constant. Next in trustworthiness is the laboratory experiment in which the same unit cannot be manipulated repeatedly. Here, small differences in nuisance traits between the units exist, and these are accounted for by a correction. The determination of a correction requires the level of the nuisance traits with and without manipulation to be measured and the function telling how the PoI depends on the nuisance traits to be known.

Moving from the laboratory to the field involves an inevitable loss of control over the homogeneity of nuisance traits. Here the strategy is to randomly assign units to either the treatment or the control group and to make these groups large enough so that the difference between the PoI of the treatment group of units, were they not treated, and that of the control group of (untreated) units is likely to be sufficiently small.

The next step is from experiments to observational studies. In an experiment, the decision of which unit is to be treated and how derives from the research aim: the question of what change in the PoI is caused by the manipulation. In an observational study, the same decision about which unit is treated and how is made for a variety of reasons unrelated to the research question.

The observational before-after study is a mélange of elements from the laboratory and field experiments. The main challenge is to come up with a correction to account for the difference in those nuisance traits that do change over time. There are several ways to do so. One option is to use a comparison group of similar units. But a comparison group is not a control group. One cannot be confident that the nuisance traits in both groups changed over time in a similar fashion and cannot say how accurate the correction is likely to be. And so, because of the absence of randomization, the similarity of the control and comparison group is inconsequential. One might be able to partly compensate for the absence of randomization using Propensity Scores.
This option has rarely been tried in road safety research and therefore deserves research attention. It too, however, may be hampered by the fact that there are reasons why one unit is treated one way and another differently. These reasons cannot be easily considered in the construction of comparison groups or in the determination of propensity scores.

While the observational before-after study borrows ideas from the tradition of scientific experimentation, the observational study that fits a single equation to cross-section data is a break with it. It is the tradition of scientific experimentation to minimize confounding by eliminating nuisance traits, by keeping them unchanged, and by applying a correction for what change does occur. In contrast, when fitting an equation to cross-section data, modelers aim to include in the model equation all nuisance traits and for each trait to have data at widely varying levels.

In Chapter 1, I concluded that despite persistent attempts there is little progress toward an evidence-based consensus about the effect of PMR on safety. In Chapter 2, I attempted to explain the obstacles to progress. Building on the lessons of Chapters 1 and 2, I think that there are at least four directions for progress.

1. In fields not dissimilar to road safety, the conduct of randomized experiments is the gold standard and engine for progress toward evidence-based consensus. In research about the safety effect of manipulations, this option is nowadays not entertained. I think it should be given a fair hearing.
2. To attribute cause to effect, one must account for the effect of nuisance influences. It is easiest to do so accurately when the number of nuisance influences is limited, when they are well measured, and when the function linking them to target crashes is known. This principle should guide us in seeking opportunities for fruitful and trustworthy research; the same principle should serve for identifying those research approaches that are less likely to produce trustworthy results.
3. Other disciplines benefited from research directions that are seldom used in road safety. Specifically, the possibilities embedded in Structural Equations Modeling, in PO Causal Inference, and in CD modeling should be pursued.
4. Central to research about the safety effect of manipulations is the ability to predict what would have happened without the manipulation. The various study types surveyed in Chapter 2 differ mainly in the way predictions of this kind are produced. Empirical research about which approach to prediction is best is limited. Such a research program is feasible and should be pursued.
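
The contrast between randomized and observational assignment drawn in this chapter can be illustrated with a small simulation. It is a sketch under invented assumptions, not an analysis from any of the reviewed studies: the single nuisance trait (`flow`), its range, the crash function, the assignment rule, and the true effect ratio of 0.8 are all hypothetical.

```python
import random


def naive_ratio(randomized, n_units=20000, true_ratio=0.8, seed=1):
    """Ratio of mean PoI in treated units to mean PoI in untreated units.

    Each unit has one nuisance trait, `flow`, and an expected crash
    frequency proportional to it. Treatment multiplies that frequency
    by `true_ratio`. All numbers are invented for illustration.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n_units):
        flow = rng.uniform(1.0, 10.0)
        if randomized:
            is_treated = rng.random() < 0.5          # coin flip
        else:
            is_treated = rng.random() < flow / 11.0  # busy units treated more often
        poi = 0.5 * flow * (true_ratio if is_treated else 1.0)
        (treated if is_treated else untreated).append(poi)
    return (sum(treated) / len(treated)) / (sum(untreated) / len(untreated))


print("randomized:   ", round(naive_ratio(randomized=True), 2))
print("observational:", round(naive_ratio(randomized=False), 2))
```

With random assignment the two groups have nearly the same distribution of the nuisance trait, so the simple ratio lands close to the assumed true value of 0.8. When treatment is given preferentially to high-flow units, the same simple comparison is biased upward and, under these assumed numbers, can suggest that a beneficial treatment increases crashes.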

References

AASHTO. 2010. Highway Safety Manual, American Association of State Highway and Transportation Officials, Washington, DC.
Avelar, R. and P. Carlson. 2014. Link Between Pavement Marking Retroreflectivity and Night Crashes on Michigan Two-Lane Highways, Transportation Research Record 2404, 59–67.
Bahar, G., M. Masliah, T. Erwin, E. Tan, and E. Hauer. 2006. NCHRP Web-Only Document 92, Pavement Marking Materials and Markers: Real-World Relationship Between Retroreflectivity and Safety over Time. http://onlinepubs.trb.org/onlinepubs/nchrp/nchrp_webdoc_92.pdf. Accessed December 22, 2014.
Carlson, P. J., E. S. Park, and C. K. Andersen. 2009. Benefits of Pavement Markings: A Renewed Perspective Based on Recent and Ongoing Research, Transportation Research Record 2107, Transportation Research Board, Washington, DC, 59–68.
Carlson, P. J., E. S. Park, and D. H. Kang. 2013. Investigation of Longitudinal Pavement Marking Retroreflectivity and Safety, Transportation Research Record 2337, Transportation Research Board, Washington, DC, 59–66.
Donnell, E. T., V. Karwa, and S. Sathyanarayanan. 2009. Analysis of the effects of pavement marking retroreflectivity on traffic crash frequency on highways in North Carolina. Transportation Research Record 2103, 50–60.
Dravitzki, V. K., S. M. Wilkie, and T. J. Lester. 2006. The Safety Benefits of Brighter Road Markings. Research Report 310. Land Transport New Zealand, Wellington, NZ.
Ellenberg, J. 2014. How Not to Be Wrong, The Penguin Press, New York.
Gelman, A. 2013. P values and statistical practice. Epidemiology, 24, 1, 69–72.
Karwa, V., A. B. Slavkovic, and E. T. Donnell. 2011. Causal inference in transportation safety studies: Comparison of potential outcomes and causal diagrams, The Annals of Applied Statistics, Vol. 5, No. 2B, 1428–1455.
Lee, J. T., T. L. Maleck, and W. C. Taylor. 1999. Pavement marking material evaluation study in Michigan, ITE Journal (July): 44–51.
Masliah, M., G. Bahar, and E. Hauer. 2007. Application of Innovative Time Series Methodology to Relationship Between Retroreflectivity of Pavement Markings and Crashes, Transportation Research Record 2019, 119–126.
McCaffrey, D. F., G. Ridgeway, and A. R. Morral. 2004. Propensity Score Estimation With Boosted Regression for Evaluating Causal Effects in Observational Studies. Psychological Methods, 9, 403–425.
Migletz, J. and J. Graham. 2002. Long-term pavement marking practices. In NCHRP Synthesis 306: Traffic Crashes and Pavement Markings (Ch. 4). Transportation Research Board, Washington, DC.
Ridgeway, G., D. McCaffrey, and A. Morral. 2006. TWANG: Toolkit for Weighting and Analysis of Nonequivalent Groups, Software for using matching methods in R, RAND Corporation, Santa Monica, California.
Ridgeway, G. 2007. Generalized boosted models: a guide to the GBM package. Available at http://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf.
Roess, R. P., E. S. Prassas, and W. R. McShane. 2004. Traffic Engineering, 3rd edition, Prentice Hall.
Rubin, D. B. 1997. Estimating Causal Effects from Large Data Sets Using Propensity Scores, Annals of Internal Medicine, Volume 127, Number 8 (Part 2), 757–763.
Smadi, O., R. R. Souleyrette, D. Ormand, and N. R. Hawkins. 2008. Pavement Marking Retroreflectivity: Analysis of Safety Effectiveness, Transportation Research Record 2056, 17–24.
Smadi, O., N. Hawkins, I. Nlenanya, and B. Aldemir-Bektas. 2010. Pavement Markings and Safety. Center for Transportation Research and Education, Iowa State University, IHRB Project TR-580. November 2010.

Abbreviations and acronyms used without definitions in TRB publications:

A4A  Airlines for America
AAAE  American Association of Airport Executives
AASHO  American Association of State Highway Officials
AASHTO  American Association of State Highway and Transportation Officials
ACI–NA  Airports Council International–North America
ACRP  Airport Cooperative Research Program
ADA  Americans with Disabilities Act
APTA  American Public Transportation Association
ASCE  American Society of Civil Engineers
ASME  American Society of Mechanical Engineers
ASTM  American Society for Testing and Materials
ATA  American Trucking Associations
CTAA  Community Transportation Association of America
CTBSSP  Commercial Truck and Bus Safety Synthesis Program
DHS  Department of Homeland Security
DOE  Department of Energy
EPA  Environmental Protection Agency
FAA  Federal Aviation Administration
FAST  Fixing America’s Surface Transportation Act (2015)
FHWA  Federal Highway Administration
FMCSA  Federal Motor Carrier Safety Administration
FRA  Federal Railroad Administration
FTA  Federal Transit Administration
GHSA  Governors Highway Safety Association
HMCRP  Hazardous Materials Cooperative Research Program
IEEE  Institute of Electrical and Electronics Engineers
ISTEA  Intermodal Surface Transportation Efficiency Act of 1991
ITE  Institute of Transportation Engineers
MAP-21  Moving Ahead for Progress in the 21st Century Act (2012)
NASA  National Aeronautics and Space Administration
NASAO  National Association of State Aviation Officials
NCFRP  National Cooperative Freight Research Program
NCHRP  National Cooperative Highway Research Program
NHTSA  National Highway Traffic Safety Administration
NTSB  National Transportation Safety Board
PHMSA  Pipeline and Hazardous Materials Safety Administration
RITA  Research and Innovative Technology Administration
SAE  Society of Automotive Engineers
SAFETEA-LU  Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users (2005)
TCRP  Transit Cooperative Research Program
TDC  Transit Development Corporation
TEA-21  Transportation Equity Act for the 21st Century (1998)
TRB  Transportation Research Board
TSA  Transportation Security Administration
U.S. DOT  United States Department of Transportation

Transportation Research Board, 500 Fifth Street, NW, Washington, DC 20001
ISBN 978-0-309-68686-0


Crash modification factors (CMF) provide transportation professionals with the kind of quantitative information they need to make decisions on where best to invest limited safety funds.

The TRB National Cooperative Highway Research Program's NCHRP Research Report 991: Guidelines for the Development and Application of Crash Modification Factors describes a procedure for estimating the effect of a proposed treatment on a site of interest.

Supplemental to the report are a CMF regression tool, a CMF combination tool, a slide summary, and an implementation memo.
