National Academies Press: OpenBook

A Multivariate Analysis of Crash and Naturalistic Driving Data in Relation to Highway Factors (2013)

Chapter: Chapter 6 - Statistical Analysis: An Approach Using Extreme Value Theory

Suggested Citation:"Chapter 6 - Statistical Analysis: An Approach Using Extreme Value Theory." National Academies of Sciences, Engineering, and Medicine. 2013. A Multivariate Analysis of Crash and Naturalistic Driving Data in Relation to Highway Factors. Washington, DC: The National Academies Press. doi: 10.17226/22849.


Extreme Value Analysis

Another way to analyze crash surrogates is in terms of the underlying continuous measurements rather than discrete surrogate events. This offers a possible way to estimate the probability of a crash or near crash from the frequency of small crash margins. Extreme value theory provides a robust statistical method by which probability levels in the tails of observed distributions of crash margins can be estimated. One potential advantage is the ability to link absolute exposure values (AADT) to actual crash numbers via surrogate-estimated frequencies. Another advantage is that an objective surrogate threshold can be used in place of the more arbitrary percentile thresholds used previously.

Gumbel (1958) laid the foundation for the study of extreme values. Since then, extreme value theory has received much attention and undergone many changes. The idea is to model rare events that lie outside the range of available observations. The problem begins by selecting the largest (smallest) observation from each of many samples. The resulting sample of maximum (minimum) values is the sample of extreme values for analysis. Gumbel showed that for large samples, depending on the parent distribution, distributions of extremes can follow one of three asymptotic distributions. The three asymptotic distributions that Gumbel referred to as the first, second, and third asymptotes are now commonly called the Gumbel, Fréchet, and Weibull distributions, respectively.

A generalized extreme value (GEV) distribution contains a parameter for accommodating all three solutions simultaneously, so that a sample of extreme values can be fit to one distribution without considering the three cases separately (see, for example, Coles 2001). One approach is to make use of extremal probability paper based on the first asymptotic solution (the Gumbel distribution).
If the observed data follow the Gumbel distribution, the data should plot as a straight line. If, however, the data follow the Fréchet or Weibull distribution, the points will plot as a curve.

The distinguishing feature of an extreme value analysis is the objective to quantify the stochastic behavior of a process at unusually large (small) levels. Extreme value analyses usually require estimation of the probability of events that are more extreme than any that have already been observed. As an example, traffic crashes are generally regarded as rare events, and few, if any, actual crashes may be observed during a field operational test using instrumented vehicles in a naturalistic driving study. Instead, surrogate measures may be defined that approach actual crashes if extrapolation is permitted from observed levels of the surrogate measures to unobserved levels.

The time to edge crossing (TTEC) variable is used in this demonstration of the use of extreme value theory in the search for crash surrogates. With TTEC, the event of concern is road departure, a necessary but not sufficient condition for a road departure crash. However, extreme value analysis can help to explore relationships between road departures and road departure crashes by providing a surrogate for a surrogate, thus leading to a better understanding of the road departure crash.

To demonstrate this approach it was necessary to identify a length of roadway traveled by a large number of subjects. An extreme value distribution can be fit to a sample of the minimum TTEC for each driver. By plotting the values on Gumbel probability paper, the rate at which vehicles run off that particular roadway can be estimated. Screening the analysis database for adjacent HPMS segments traversed by a large number of the same subjects was not fruitful because the total number of drivers was not large, and their trips were distributed over a very large area.
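
The probability-paper check described above can be sketched in a few lines of Python. This is only a minimal illustration, not the research team's procedure: the function names, the plotting position rank/(n + 1), and the example TTEC values are all assumptions introduced here. The sketch reverses the sign of the minima so that a standard analysis of maxima applies, computes the Gumbel reduced variate -ln(-ln p) for each ordered value, and fits a least-squares line through the points.

```python
import math


def gumbel_paper_points(minima):
    """Return (reduced_variate, negated_value) pairs for a Gumbel probability plot."""
    # Reverse the sign so that the smallest minima become the largest maxima.
    maxima = sorted(-m for m in minima)
    n = len(maxima)
    points = []
    for rank, x in enumerate(maxima, start=1):
        p = rank / (n + 1)             # assumed plotting position
        y = -math.log(-math.log(p))    # Gumbel reduced variate
        points.append((y, x))
    return points


def fit_line(points):
    """Least-squares line x = intercept + slope * y through the plotted points."""
    n = len(points)
    mean_y = sum(y for y, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    sxy = sum((y - mean_y) * (x - mean_x) for y, x in points)
    syy = sum((y - mean_y) ** 2 for y, _ in points)
    slope = sxy / syy
    return mean_x - slope * mean_y, slope


# Hypothetical minimum TTEC values in seconds, one per traversal (not the study data).
example_minima = [1.2, 2.5, 0.8, 3.1, 1.9, 2.2, 4.0]
points = gumbel_paper_points(example_minima)
intercept, slope = fit_line(points)
```

If the plotted points track the fitted line closely, a Gumbel model may be adequate. Systematic curvature away from the line would instead point toward a Fréchet or Weibull tail, in which case a GEV fit is the more natural choice.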
A search was conducted to find a single long road segment that was traversed by the largest number of drivers. The search yielded a 2.3-mi segment of US-23, a freeway in the Ann Arbor, Michigan, area, which had 117 traversals by 43 different drivers. The segment has two lanes in each direction, a center median, a long horizontal reverse curve, and 12-ft shoulders. The minimum TTEC values for the 117 trips were used to fit the extreme value distribution. Treating traversals as observations may violate the assumption of independence but results in a larger sample size. It is possible that the normal differences in the environmental and traffic conditions encountered by the same driver on this road segment would justify treating each traversal as independent. An attempt was made to find differences due to night and rain conditions, but the numbers of traversals at night and during rain were too few for meaningful analysis.

Figure 6.1 shows a kernel density estimate of the minimum TTEC for the 117 observations. Standard extreme value analyses usually consider maximum values, but interest in TTEC is related to analysis of minimum values. One way to proceed with an extreme value analysis for minimum values is to reverse the sign of TTEC and simply conduct a standard extreme value analysis for maximum values. This was done, and hence the TTEC values shown are negative (the actual distribution is the mirror image of the one shown, reflected about zero). The smoothed plot covers the value 0.

Figure 6.1. Kernel density plot of TTEC.

Figure 6.2 shows the fit of a GEV distribution to the minimum TTEC data. The data are plotted on extreme value probability paper. Because the data plot as a curve that is increasing at a decreasing rate, the data tend to follow a Weibull distribution. The assumption is that a road departure event occurs when TTEC = 0, so interest focuses on the intersection of the fitted line and the horizontal line where TTEC = 0. For these data, no observations resulted in actual events. Solving for the point where the fitted curve crosses TTEC = 0 gives a return period of approximately 2 million. The return period is the reciprocal of the probability and can be used to estimate the expected number of observations required to attain a certain level of TTEC. For this example, one would expect to record approximately 2 million extreme observations before seeing one road departure.

Figure 6.2. Fit of the generalized extreme value distribution to TTEC. (Vertical axis: time to edge crossing. Horizontal axis: reduced variate, with corresponding probability and return period scales.)

The AADT for this road segment obtained from the HPMS data files is 65,755 vehicles. Thus, dividing the return period by the daily traffic volume, a road departure on this segment can be expected about once every 30.4 days, or about 12 times a year. There were nine roadway departure crashes on this road segment from 2001 to 2005, or an average of 1.8 crashes per year. This indicates that on this particular road segment, about 15% of road departures (1.8 of about 12 per year) resulted in a road departure crash. While the TTEC data are quite sparse, and these calculations cannot be seen as definitive, the resulting estimates appear very reasonable. This supports the potential future value in conducting an in-depth study based on extreme value theory and TTEC.

Figure 6.3 shows the distribution of crashes and TTEC extremes on the 2.3 mi of US-23 near Ann Arbor. Here the research team has not attempted to correlate TTEC extremes with actual crash locations but noted that for this segment, the crashes are neither uniformly spaced nor clustered around particular points. It therefore seems unlikely on the one hand that further analysis will establish meaningful trends of crash and surrogate event locations within individual segments. On the other hand, the relationship between frequencies of TTEC-based excursions (TTEC ≤ 0) and actual crash numbers across different segments may be fruitful, because that relationship offers the possibility of deriving a nonparametric relationship between crashes and surrogate events. The TTEC ≤ 0 surrogate event frequency, which is derived from extreme value theory, may be viewed as a higher-order surrogate event to be modeled via SUR in the same way other surrogates were considered in the previous section. As with several other aspects of the team’s exploratory analysis, the possibilities are deferred to future studies based on the larger NDD to be collected under SHRP 2.

Figure 6.3. Crashes and TTEC extremes on 2.3 miles of US-23. Source: Google Earth.

This section explored the use of extreme value analysis for examining properties of a surrogate. In contrast to the SUR approach that required categorical variables and used crash data, highway data, and NDD, the extreme value analysis presented here used only continuous natural use data. The linkage to crashes was done afterward by estimating the number of observations one would expect to record before seeing one road departure. The potential of the extreme value approach is in the further exploration of driver behaviors associated with small crash margin events in general or by specific roadway features.
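
The linkage from a fitted distribution to crash numbers can be illustrated with a short Python sketch. The GEV exceedance formula uses the standard location, scale, and shape parameterization (see Coles 2001), and the arithmetic uses the figures reported in this chapter: a return period of about 2 million, an AADT of 65,755, and nine crashes over five years. The function names and the GEV parameter values are hypothetical, since the chapter does not report the fitted parameters. The sketch also assumes, as the chapter's calculation appears to, that each vehicle counted in the AADT contributes one extreme observation and that a year has 365 days.

```python
import math


def gev_exceedance_probability(level, mu, sigma, xi):
    """P(X > level) for a GEV with location mu, scale sigma > 0, and shape xi."""
    z = (level - mu) / sigma
    if xi == 0.0:
        s = math.exp(-z)  # Gumbel case
    else:
        t = 1.0 + xi * z
        if t <= 0.0:
            # Level lies outside the support: exceedance is certain below the
            # lower endpoint (xi > 0) and impossible above the upper endpoint (xi < 0).
            return 1.0 if xi > 0.0 else 0.0
        s = t ** (-1.0 / xi)
    return -math.expm1(-s)  # equals 1 - exp(-s), accurate for small s


def departure_linkage(return_period, aadt, crashes, years):
    """Convert a return period into departures per year and a crash share."""
    days_between = return_period / aadt          # one observation per vehicle (assumed)
    departures_per_year = 365.0 / days_between
    crashes_per_year = crashes / years
    crash_share = crashes_per_year / departures_per_year
    return days_between, departures_per_year, crashes_per_year, crash_share


# Hypothetical GEV parameters for sign-reversed TTEC, not the study's fitted values.
p_departure = gev_exceedance_probability(0.0, mu=-3.0, sigma=1.0, xi=-0.2)
hypothetical_return_period = 1.0 / p_departure

# Values reported in the chapter.
days, per_year, crashes_per_year, share = departure_linkage(2_000_000, 65_755, 9, 5)
print(f"{days:.1f} days between departures, {per_year:.1f} per year, crash share {share:.0%}")
```

Because the analysis is run on sign-reversed TTEC, exceeding the level 0 corresponds to TTEC ≤ 0, the assumed road departure event. A negative shape parameter corresponds to the Weibull-type behavior noted for Figure 6.2, where the distribution has a finite upper endpoint.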

TRB’s second Strategic Highway Research Program (SHRP 2) Report S2-S01C-RW-1: A Multivariate Analysis of Crash and Naturalistic Driving Data in Relation to Highway Factors explores analysis methods capable of associating crash risk with quantitative metrics (crash surrogates) available from naturalistic driving data.

Errata: The foreword originally contained incorrect information about the project. The text has been corrected in the online version of the report. (August 2013)
