Suggested Citation:"Appendix C: Item Response Theory Example Using Motor Carrier Management Information System Data - Jacob Spertus." National Academies of Sciences, Engineering, and Medicine. 2017. Improving Motor Carrier Safety Measurement. Washington, DC: The National Academies Press. doi: 10.17226/24818.

Appendix C

Item Response Theory Example Using Motor Carrier Management Information System Data

Jacob Spertus

The following provides a brief worked example of an item response theory (IRT) model, as discussed in Chapters 3, 4, and 5, applied to actual Safety Measurement System (SMS) data. It uses a very small proportion of the full data, purposefully selected for quality and carrier size. It is not intended to be a comprehensive, unbiased analysis but merely a proof of concept. A full implementation would look roughly similar, but with many more carriers and violations included.

This appendix is included to show that the proposed model is relatively easy to implement in the Motor Carrier Management Information System. However, the model fit here is in many ways a toy model. It is applied to only 865 carriers (not 200,000); only unsafe driving violations that appeared for 5 or more carriers are included; none of the additional features, such as multidimensionality, are present (in fact, only one BASIC is used); and the priors have not been updated over time. Therefore, the quality of the fit here should not be taken as an indication of the quality of the fit of the full model as represented in Chapters 4 and 5.

Suppose for each carrier i ∈ {1,…,C} we have a measure of exposure E_i, a count of eligible inspections n^k_i, and a count of violations y^k_i, where k ∈ {1,…,K} indicates the type of violation. Note that inspections are superscripted with k because the type of inspection determines which violations can be recorded. We propose the following model:

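\begin{align}
P(N_i = n_i \mid E_i) &= \mathrm{Poisson}(\lambda \cdot E_i) \tag{C-1} \\
P(Y_i^k = y_i^k \mid n_i^k, p_{ik}) &= \mathrm{Binomial}(n_i^k, p_{ik}) \tag{C-2} \\
\mathrm{logit}(p_{ik} \mid \theta_i) &= \beta_k + \alpha_k \theta_i \tag{C-3} \\
\theta_i &\sim N(0, 1) \tag{C-4} \\
\beta_k &\sim N(0, 3^2) \tag{C-5} \\
\alpha_k &\sim \log N(1, 1) \tag{C-6}
\end{align}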

The parameter λ in equation (C-1) is the population-wide rate of inspections per unit of vehicle miles traveled (VMT). In equation (C-2), p_ik is the probability that carrier i receives violation k at a given inspection. Equation (C-3) models p_ik as a logistic function of the latent safety level θ_i, with a prevalence (or difficulty) parameter β_k, which reflects the marginal prevalence of violation k in the data, and a discrimination parameter α_k, which reflects the strength of the association between violation k and latent safety. Thus, for a given violation k, a higher value of β_k indicates that it is observed more frequently, and a higher value of α_k indicates that it is more strongly associated with latent safety (here, with less safe carriers, since larger values of θ correspond to less safe carriers).
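To make equation (C-3) concrete, the following minimal Python sketch evaluates p_ik for a few values of θ. The function names are illustrative, not from the source, and the (α, β) pairs are two rows read from Table C-1.

```python
import math


def inv_logit(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def violation_probability(theta: float, alpha: float, beta: float) -> float:
    """Equation (C-3): p_ik = inv_logit(beta_k + alpha_k * theta_i)."""
    return inv_logit(beta + alpha * theta)


# (alpha, beta) pairs copied from two rows of Table C-1.
ITEMS = {
    "Following too close": (0.871, -5.5),
    "Improper passing": (0.426, -7.3),
}

if __name__ == "__main__":
    for name, (alpha, beta) in ITEMS.items():
        # Evaluate the curve at a safer, an average, and a less safe carrier.
        probs = [violation_probability(t, alpha, beta) for t in (-2.0, 0.0, 2.0)]
        print(name, ["%.5f" % p for p in probs])
```

At θ = 0 the probability depends on β_k alone, and a larger α_k makes it rise faster as θ increases.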

To implement this model on a small scale, we use SMS data from 2014 to 2015, selecting a subset of the carrier population and including only unsafe driving violations. The carrier subpopulation consists of those carriers with 100 or more average power units (APUs), more than 80,000 VMT per APU reported, and less than 200,000 VMT per APU reported (medium to high utilization carriers according to SMS methodology). This leaves only 865 carriers. Finally, in order to set a lower bound on sparsity, we drop violations that occur in fewer than 5 carriers. After this processing, 35 different types of unsafe driving violations (features) remain for 865 carriers (observations).
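The selection rules above can be sketched in a few lines of Python. The record layout (`apu`, `vmt`, and a dictionary keyed by carrier and violation code) is a hypothetical stand-in, since the source does not describe the actual SMS file format.

```python
from collections import defaultdict

MIN_APU = 100                    # "100 or more average power units"
MIN_VMT_PER_APU = 80_000         # strictly more than 80,000 VMT per APU
MAX_VMT_PER_APU = 200_000        # strictly less than 200,000 VMT per APU
MIN_CARRIERS_PER_VIOLATION = 5   # drop violations seen in fewer than 5 carriers


def select_carriers(carriers):
    """Keep carriers meeting the size and utilization thresholds.

    carriers: dict mapping carrier id -> {"apu": float, "vmt": float}
    (hypothetical layout).
    """
    kept = {}
    for cid, rec in carriers.items():
        if rec["apu"] < MIN_APU:
            continue
        vmt_per_apu = rec["vmt"] / rec["apu"]
        if MIN_VMT_PER_APU < vmt_per_apu < MAX_VMT_PER_APU:
            kept[cid] = rec
    return kept


def drop_sparse_violations(counts, carrier_ids):
    """Keep only violation types observed in enough of the selected carriers.

    counts: dict mapping (carrier id, violation code) -> violation count.
    """
    carriers_with = defaultdict(set)
    for (cid, code), y in counts.items():
        if cid in carrier_ids and y > 0:
            carriers_with[code].add(cid)
    keep = {c for c, s in carriers_with.items()
            if len(s) >= MIN_CARRIERS_PER_VIOLATION}
    return {(cid, code): y for (cid, code), y in counts.items()
            if cid in carrier_ids and code in keep}
```

Applying `select_carriers` first and then `drop_sparse_violations` mirrors the order described in the text, so sparsity is judged within the selected subpopulation.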

We used the ‘rstan’ interface to Stan, which implements Hamiltonian Monte Carlo, to fit the fully Bayesian models described above (Stan Development Team, 2016). In the first step, a simple Poisson model (equation (C-1)) is fit. The exposure E_i is measured in units of 100,000 VMT. λ is given a noninformative prior, so it is essentially the average number of inspections per 100,000 VMT. The observed mean in the population is 2.014, while the posterior mean (95% CI) of λ is 1.806 (1.802, 1.811).
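As a rough illustration of this first step, the sketch below uses the closed-form Gamma posterior for a Poisson rate in place of the Hamiltonian Monte Carlo fit; it is a swapped-in shortcut, not the rstan model that was run. The flat prior and the three-carrier data set are assumptions made purely for illustration. A pooled Poisson rate weights carriers by exposure, so it need not equal an unweighted mean of per-carrier rates; the source does not say whether this accounts for the gap between 2.014 and 1.806.

```python
import math


def pooled_rate(inspections, exposures):
    """Maximum likelihood estimate of lambda: total inspections / total exposure."""
    return sum(inspections) / sum(exposures)


def gamma_posterior(inspections, exposures, prior_shape=1.0, prior_rate=0.0):
    """Posterior Gamma(shape, rate) for lambda under a Gamma prior.

    The defaults (shape 1, rate 0) are an improper flat prior, assumed here
    as one reading of "noninformative"; the source does not give the prior.
    """
    shape = prior_shape + sum(inspections)
    rate = prior_rate + sum(exposures)
    return shape, rate


def gamma_mean_sd(shape, rate):
    """Mean and standard deviation of a Gamma(shape, rate) distribution."""
    return shape / rate, math.sqrt(shape) / rate


if __name__ == "__main__":
    # Hypothetical carriers; exposures are in units of 100,000 VMT.
    n = [30, 10, 400]
    e = [10.0, 8.0, 250.0]
    per_carrier = [ni / ei for ni, ei in zip(n, e)]
    print("mean of per-carrier rates:", sum(per_carrier) / len(per_carrier))
    print("pooled rate:", pooled_rate(n, e))
    print("posterior mean, sd:", gamma_mean_sd(*gamma_posterior(n, e)))
```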

The second step uses equations (C-2) and (C-3), and priors (C-4), (C-5), and (C-6), to fit a binomial IRT model. The prior on θ_i is selected to enforce identifiability of the model. The log-normal prior on α_k, which has support only over the positive real numbers, encodes the assumption that α_k is strictly positive. We provide default choices for the priors on β_k and α_k by matching the specification in the ‘edstan’ R package for IRT models (Furr, 2017). Note that substantive knowledge about the effect of certain violations on safety could be included via the prior (C-6). The ‘trials’ for this IRT model are inspections eligible for unsafe driving violations, that is, driver inspections (level 1, 2, 3, or 6). A brief caveat: unsafe driving violations (e.g., speeding) often precipitate an inspection, so inspections might not be the best denominator for these violations. VMT might be a better exposure variable, requiring a Poisson model with VMT as offset.
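The quantity that the sampler explores in this second step is the joint log posterior of (θ, α, β). A minimal Python sketch of that function follows, assuming that (C-5) denotes a normal distribution with standard deviation 3 and that (C-6) means log α_k ∼ N(1, 1). The function names and the nested-list data layout are illustrative; this is not the Stan program used for the analysis.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def softplus(x):
    """log(1 + exp(x)), computed stably."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_binomial_logit(y, n, eta):
    """Binomial log pmf, equation (C-2), with success probability inv_logit(eta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    # log p = -softplus(-eta), log(1 - p) = -softplus(eta)
    return log_choose - y * softplus(-eta) - (n - y) * softplus(eta)


def log_posterior(theta, alpha, beta, y, n):
    """Joint log posterior for equations (C-2) to (C-6).

    theta: C carrier values; alpha, beta: K item values;
    y[i][k], n[i][k]: violations and eligible inspections.
    """
    lp = 0.0
    for t in theta:  # (C-4): theta_i ~ N(0, 1)
        lp += -0.5 * t * t - LOG_SQRT_2PI
    for b in beta:   # (C-5): beta_k ~ N(0, 3^2)
        lp += -0.5 * (b / 3.0) ** 2 - math.log(3.0) - LOG_SQRT_2PI
    for a in alpha:  # (C-6): log alpha_k ~ N(1, 1), assumed parameterization
        if a <= 0.0:
            return float("-inf")  # outside the log-normal support
        la = math.log(a)
        lp += -la - 0.5 * (la - 1.0) ** 2 - LOG_SQRT_2PI
    for i, t in enumerate(theta):
        for k, (a, b) in enumerate(zip(alpha, beta)):
            lp += log_binomial_logit(y[i][k], n[i][k], b + a * t)  # (C-3)
    return lp
```

Returning negative infinity for non-positive α_k reflects the positivity constraint that the log-normal prior imposes.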

Figure C-1 shows the item characteristic curves: the probability of receiving a given violation at an inspection is plotted against θ. Less safe carriers (with higher θ) receive more violations overall, and receive certain violations with especially high frequency.

The top 20 most discriminating violations (highest α) are shown in Table C-1.

Figure C-2 plots unsafe driving violations per inspection against θ. There is a very close mapping between these two metrics, such that θ more or less returns the number of unsafe driving violations per inspection.

Figure C-3 plots crashes per 100,000 VMT against θ. There is a positive relationship but also very high variance about the least squares line (low R²). The Spearman correlation between crashes per 100,000 VMT and the θ estimates from the binomial IRT model is 0.23.
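For reference, a Spearman correlation is the Pearson correlation of the ranks of the two variables, with tied values given their average rank. The sketch below implements it with the Python standard library; the carrier values in the example are made up and are not the appendix data.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2.0 + 1.0
        for j in range(start, end + 1):
            ranks[order[j]] = avg
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical theta estimates and crash rates for five carriers.
    theta_hat = [-1.2, -0.3, 0.1, 0.8, 1.5]
    crashes_per_100k_vmt = [0.02, 0.05, 0.01, 0.04, 0.06]
    print(spearman(theta_hat, crashes_per_100k_vmt))
```

Because it depends only on ranks, this statistic is insensitive to the scale of θ and to outlying crash rates.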

**FIGURE C-1** Characteristic curves showing the probability of receiving a given violation (at a single inspection) on the y-axis and safety level θ on the x-axis. Parameters are generated from binomial item response theory. Larger values of θ correspond to less safe carriers.

TABLE C-1 α, β, and Prevalence for the 20 Violations with the Highest α, Generated from Binomial Item Response Theory

Description	Alpha	Beta	Prevalence
Speeding	0.931	–11.2	10
Following too close	0.871	–5.5	3010
Lane restriction violation	0.839	–4.5	8328
State-local laws operating a CMV while texting	0.833	–8.9	95
Inattentive driving	0.771	–11.0	12
State-local laws speeding 6-10 miles per hour over limit	0.760	–3.6	18946
Operating a CMV while texting	0.734	–9.6	47
Driving a CMV while texting	0.635	–8.8	111
Improper lane change	0.631	–5.3	3614
Failing to use seat belt while operating a CMV	0.627	–4.7	6532
State-local laws speeding 11-14 miles per hour over limit	0.620	–4.7	6629
Using a handheld mobile telephone while operating a CMV	0.568	–5.5	2752
Allowing or requiring driver to use a handheld mobile telephone	0.565	–10.4	20
Failure to slow down approaching a railroad crossing	0.540	–11.4	7
State-local laws speeding in work/construction zone	0.530	–5.8	2134
Failing to stop at railroad crossing bus	0.489	–11.3	8
State-local laws speeding 15+ miles per hour over limit	0.459	–5.3	3549
Railroad grade crossing violation	0.441	–10.4	21
Improper passing	0.426	–7.3	463
Scheduling run to necessitate speeding	0.400	–10.7	15

NOTE: CMV, commercial motor vehicle.

**FIGURE C-2** Unsafe driving violations per inspection plotted against θ generated from binomial item response theory. Larger values of θ correspond to less safe carriers.

**FIGURE C-3** Crashes per 100,000 vehicle miles traveled by safety level θ for each carrier in the subpopulation, estimated from binomial item response theory. The x-axis is θ, the y-axis is crashes per 100,000 vehicle miles traveled, and color is the log base 10 of the average number of power units. Larger values of θ correspond to less safe carriers.

REFERENCES

Furr, D.C. (2017). edstan: Stan models for item response theory, version 1.0.6. Available: https://CRAN.R-project.org/package=edstan [July 2017].

Stan Development Team. (2016). RStan: The R interface to Stan, version 2.14.1. Available: http://mc-stan.org [July 2017].
