Data analysis focused on developing models by design element for assessing safety impacts from trade-offs among values of each design element and predicting the potential safety consequences expressed as number of crashes per unit time. Models for predicting crashes by severity level were also developed. However, models for specific crash types were not developed due to lack of available crash data. For the statistical modeling, GLMs were used because they are considered more appropriate for variables that are not normally distributed. Such models use a maximum likelihood function to determine which variables are significant and how well the model fits the data. Crashes are considered random events that follow a Poisson distribution; therefore, the use of GLMs is appropriate. Such models are derived using a relatively recent statistical approach; the literature suggests they have been gaining popularity among researchers (39–41).

The SAS statistical software was used to develop the prediction models and to determine their coefficients (46). The Generalized Modeling procedure (GENMOD) was implemented, and the model coefficients were estimated through the maximum-likelihood method. This approach is well suited to the development of models that have predictors that are either continuous or categorical². The residual deviance statistics were used to assess the model's goodness-of-fit. Initially, all variables of concern were included in the model, and variables with coefficients that were not statistically significant (at the 5% level) were removed from the model. This process was followed until a model was obtained in which all variables entered were statistically significant. The signs of the coefficients were also evaluated to determine whether they reflected previously observed crash trends.

² A categorical predictor variable is a variable whose categories identify class or group membership, which is used to predict responses on one or more dependent variables (from http://www.statsoft.com/textbook/glosc.html).

A desirable outcome from such a model is the determination of the relative safety impact of specific geometric elements. This requires the availability of adequate data to establish such comparisons as well as the isolation of the impact of each element. There are potential problems that should be considered when a model is developed. First, specific elements may not be easily isolated and examined alone since the literature has indicated that there are elements that interact. Second, there is the potential for significant variability among the various roadway segments included in the database such that, even if an element can be isolated, there may be other variables (such as traffic volume, number of lanes, and functional class) that could also require attention and, thus, require an additional data classification, further reducing a model's strength in reaching statistically sound conclusions.

The models developed in this research predict the number of crashes for a given condition. This decision was reached during the project panel meeting, during which the appropriateness of crash rates and number of crashes was discussed. The decision was based on the need to develop results that could be eventually used in the HSM. The rationale for this decision is that the current trend is to avoid the use of crash rates because of potential problems arising from the implicit assumption of linearity between volume and crashes as well as the possible misuse by unaware users who may assume that a change in traffic volumes could proportionally affect the number of crashes. It was therefore decided to separate the data into divided and undivided segments and to develop separate models for each group.

Models developed in this research were validated to determine their goodness-of-fit. The available data were randomly divided into two sets: one was used in the model development, while the second was used for the evaluation of the strength of the model to predict the number of crashes. This is an accepted approach to determine the goodness-of-fit of a model, even though it reduces the data available for developing the model by one-half.

Prediction Models

Models were developed and evaluated for their applicability and ability to produce predictors with reasonable coefficient signs. Initially, models were developed where the exposure was considered as the product of length and traffic volume. However, these models produced consistently counterintuitive results: the coefficient signs were opposite to a priori expectations based on past research. Therefore, a second round of models was produced that used volume as a predictor with the goal of obtaining more robust models with coefficients more in accordance with past work. These new models had a better fit, and most coefficients were in agreement with past research findings. The general form of these models was as follows:

$$E[N]_i = L \, e^{\, b_0 - \ln 12 + b_1 \ln \mathrm{ADT} + b_2 X_1 + b_2 X_2 + \cdots + b_n X_n} \qquad (5)$$

where
E[N]_i = expected crash frequency per year for Condition i;
L = segment length (mile);
b_i = model coefficients;
ADT = average daily traffic (vehicles/day); and
X_i = predictors (various variables).

The predictor variables varied for each condition (divided and undivided segments; single-vehicle, multi-vehicle, and all crashes) and are discussed in the following paragraphs. The term ln 12 is included in each model to provide the results in units of crashes per year (as 12 years of data were used for estimating the model).
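
As an illustration of how the general model form can be evaluated, the following minimal Python sketch computes the expected crashes per year for one segment. It is not the SAS implementation used in the research. The function name, the argument layout, and all coefficient and predictor values are hypothetical placeholders and are not fitted values from this study. The sketch also assumes that each coefficient after b_1 is paired, in order, with one predictor X_i.

```python
import math


def expected_crashes_per_year(length_mi, adt, coefs, predictors, years=12):
    """Evaluate E[N] = L * exp(b0 - ln(years) + b1*ln(ADT) + sum(b_k * X_k)).

    length_mi  : segment length L, in miles
    adt        : average daily traffic, in vehicles/day
    coefs      : [b0, b1, then one coefficient per predictor] (hypothetical)
    predictors : values of the predictors X_i, in the same order
    years      : years of crash data behind the model (12 in the text)
    """
    b0, b1, *b_rest = coefs
    if len(b_rest) != len(predictors):
        raise ValueError("need exactly one coefficient per predictor")
    # Subtracting ln(years) inside the exponent is the same as dividing
    # the multi-year crash count by the number of years.
    linear = b0 - math.log(years) + b1 * math.log(adt)
    linear += sum(b * x for b, x in zip(b_rest, predictors))
    return length_mi * math.exp(linear)


# Hypothetical inputs, for illustration only.
example_coefs = [-5.0, 0.8, -0.05, 0.02]
example_predictors = [8.0, 30.0]
print(expected_crashes_per_year(1.5, 20000, example_coefs, example_predictors))
```

Because segment length multiplies the exponential term, the prediction scales in direct proportion to L. Traffic volume enters through b_1 ln ADT, so the prediction is proportional to ADT only when b_1 equals 1.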