Every year, roughly 100,000 fatal and injury crashes occur in the United States involving large trucks and buses (Federal Motor Carrier Safety Administration, 2015). The Federal Motor Carrier Safety Administration (FMCSA) in the U.S. Department of Transportation has as its mission: “. . . to reduce crashes, injuries, and fatalities involving large trucks and buses” (Federal Motor Carrier Safety Administration, 2017). FMCSA believes that a sizable fraction of these crashes could be prevented through better safety management. This enhanced emphasis could include instituting carrier policies that reduce fatigued driving in various ways, that motivate drivers to drive under the speed limit, or that dictate conscientious attention to maintenance. In addition, a carrier could institute better hiring practices; more regularly monitor the health of its drivers, including their use of drugs and alcohol; give greater priority to driver training; and acquire technology to assist in collision avoidance.
FMCSA uses information that is collected on the frequency of approximately 900 different violations of safety regulations discovered during (mainly) roadside inspections to assess motor carriers’ compliance with Federal Motor Carrier Safety Regulations (FMCSRs), as well as to evaluate their compliance in comparison with their peers. Through use of this information, FMCSA identifies carriers to receive its available interventions in order to reduce the risk of crashes across all carriers. Approximately 3.5 million commercial motor vehicle (CMV) roadside inspections are conducted each year by specially trained inspectors, often state police and other enforcement officers, who are guided by the inspection standards and certification requirements of the Commercial Vehicle Safety Alliance (CVSA). The inspections, which encompass six different levels, involve checking a subset of all possible violations to assess various aspects of the compliance of the driver and vehicle. (See the Glossary for a description of the six inspection levels.) The violations identified in these inspections and all associated data (including time and date of inspection) are input into the Motor Carrier Management Information System (MCMIS) and used by FMCSA to identify those carriers that are out of compliance with federal regulations and that are viewed as the best candidates for interventions. MCMIS, primarily an administrative database that contains the results of carrier registrations, inspections, inspection violations, investigations, and crash involvements, provides all of the data input into these assessments. FMCSA deserves considerable credit for making use of this administrative dataset to discriminate between safe and unsafe motor carriers.
Carriers found to have frequent violations are subject to interventions from FMCSA. These can be in the form of warning letters (about 20,000 sent out a year), further monitoring, different types of investigations (about 15,000 a year), as well as suspension or revocation of the right to operate (about 800 a year). In addition, CMV crashes of a certain gravity are also input into MCMIS, and used by FMCSA both as an additional predictor for safety performance that can result in an intervention and as a measure of the future crash risk of a carrier that can be used to validate the assumption that increases in safety performance can lead to reductions in crash risk. Since crash risk is dependent on the amount of driving done, crash rate (crashes divided by vehicle miles traveled [VMT]) is the metric that FMCSA uses to assess crash risk.
FMCSA’s Safety Measurement System (SMS) identifies carriers for intervention. SMS uses MCMIS data to produce metrics to discriminate between carriers that are least compliant with regulations and therefore in need of an intervention, and those carriers that are operating more safely by generally complying. SMS is used to evaluate about 550,000 active motor carriers (see Table 6-1 in Chapter 6), a population that is heavily skewed by carrier size, ranging from those with a single motor vehicle to those with tens of thousands of vehicles. Owner-operator companies account for 44 percent of the active motor carriers. Also, the largest 5 percent of motor carriers accounts for 67 percent of the CMV fleet.
The total of 550,000 active carriers entails a good deal of churn, as about 30,000 of these carriers go out of existence each year, to be replaced by about the same number of new carriers. Given the data sufficiency standards used in SMS, FMCSA generates SMS percentiles for roughly 200,000 of the 550,000 active carriers.
In SMS, the approximately 900 types of violations are grouped into six Behavior Analysis and Safety Improvement Categories (BASICs). The frequencies of violations in these categories, weighted by time and severity (which is an assessment of how relevant a violation is to future crash frequency), are divided either by the time-weighted total number of relevant inspections for that BASIC, or by an estimate of VMT for the carrier, to produce the six measures. The seventh BASIC looks at weighted (by crash severity) crash frequency over the previous 2-year period. Carriers are grouped essentially by size categories and the degree to which the carrier uses combination vehicles, so they are compared to other carriers of a similar size and type. Then, within these groupings, the BASIC measures for carriers are ranked from low to high, and each carrier is assigned its resulting percentile rank, which is the rank converted to a percentile between 0 and 100. High BASIC percentiles, which indicate that a carrier was worse than a large percentage of similarly sized carriers, result in an intervention, with the overall goal being to incentivize carriers to adopt safe practices that will reduce their frequency of serious crashes in the future. Since some of these interventions are resource-intensive, SMS is a workforce prioritization tool for FMCSA.
Some stakeholders and outside reviewers have critiqued the SMS program concerning various aspects of its algorithms and structural assumptions. For instance, one issue that has been raised is that the measure used should be an absolute, rather than a relative, assessment of safety behavior, instead of a percentile rank. (Examples of absolute measures of safety are crash rate, number of crashes per mile, violation rate, and number of violations per inspection.) Otherwise, a carrier could be improving its safety performance, but if its peers are all showing greater improvements, the carrier would have increasingly high percentile ranks, suggesting that the carrier is becoming less safe over time. Further, little information is available on the safety performance of small carriers, especially owner-operators, which typically involve a single driver operating a single truck. Such a carrier would likely have very few inspections over a 2-year period. As a result, the minimum amount of information needed to calculate BASIC measures using the SMS algorithm has been questioned. In addition, concerns have been put forward about the role of state enforcement priorities and their effects on the frequency with which violations are issued, whether the algorithm should stratify based on the types of vehicles the carrier operates (e.g., separate bus and truck strata), and whether crashes generally viewed as nonpreventable should be ignored or at least downweighted by the algorithm.
The overall concern raised by these various criticisms has motivated this review, which was written into Section 5221 of the Fixing America’s Surface Transportation (FAST) Act of 2015, mandating that the U.S. Department of Transportation request that the National Academies of Sciences, Engineering, and Medicine conduct a study analyzing SMS. The National Academies units tasked to carry out this study were the Committee on National Statistics, as the lead, with the assistance of the Transportation Research Board. The resulting expert panel, the Panel on the Review of the Compliance, Safety, and Accountability Program of the Federal Motor Carrier Safety Administration, was convened in March 2016, and was charged by this congressional mandate to evaluate the accuracy and sufficiency of the data used by SMS, and to assess whether other approaches to identifying unsafe carriers would identify high-risk carriers more effectively. In addition, the panel was asked to examine whether the percentile ranks produced by SMS are effective for identifying high-risk carriers and, if not, what alternatives might be preferred, and to reflect on how members of the public use the SMS and what effect making the SMS information public has had on reducing crashes. The panel first met in June 2016, and then three additional times prior to issuing this report. The agendas for the open parts of the first three meetings are provided in Appendix A. In addition, FMCSA provided the panel with a version of the MCMIS data and the SMS algorithm, details of which are contained in Appendix B.
The following is the statement of task that the panel was charged to address:
An ad hoc panel will carry out a consensus study in response to Section 5221 of the Fixing America’s Surface Transportation (FAST) Act of 2015. The purpose of this study is to analyze
- The accuracy with which the Behavior Analysis and Safety Improvement Categories (BASICs) safety measures used by the Compliance, Safety, Accountability (CSA) Safety Measurement System (SMS):
  - identify high-risk carriers.
  - predict or are correlated with future crash risk, crash severity, or other safety indicators for motor carriers, including the highest risk carriers.
- The methodology used to calculate BASIC percentiles and identify carriers for enforcement, including the weights assigned to particular violations, and the tie between crash risk and specific regulatory violations, with respect to accurately identifying and predicting future crash risk for motor carriers.
- The relative value of inspection information and roadside enforcement data.
- Any data collection gaps or data sufficiency problems that may exist and the impact of those gaps and problems on the efficacy of the CSA program.
- The accuracy of safety data, including the use of crash data from crashes in which a motor carrier was free from fault.
- Whether BASIC percentiles for motor carriers of passengers should be calculated separately than for motor carriers of freight.
- The differences in the rates at which safety violations are reported to FMCSA for inclusion in the SMS by various enforcement authorities, including States, territories, and Federal inspectors.
- How members of the public use the SMS and what effect making the SMS information public has had on reducing crashes and eliminating unsafe motor carriers from the industry.
The study should also consider
- Whether the SMS provides comparable precision and confidence, through SMS alerts and percentiles, for the relative crash risk of individual large and small motor carriers.
- Whether alternatives to the SMS would identify high-risk carriers more accurately.
- The recommendations and findings of the Comptroller General of the United States and the Inspector General of the Department of Transportation, and independent review team reports, issued before the date of the act.
The panel will issue a report with findings and recommendations at the end of the study.
For clarification, the Compliance, Safety, Accountability (CSA) Program has components in addition to SMS. CSA is an overall agency, data-driven, safety compliance and enforcement program. Its objectives are to assess and intervene with a large segment of the industry, maximize the impact on large truck and bus safety, respond early to unsafe operation using a broad array of interventions, and make more effective use of resources. The components of CSA are (1) SMS, (2) the interventions process, and (3) safety fitness determinations that identify carriers that are not fit to operate CMVs. CSA is based on a carrier’s performance in seven BASICs and investigation results. The idea is that a carrier’s significant pattern of noncompliance would be documented. SMS, which produces the BASIC percentiles, is itself composed of a Carrier Safety Measurement System (CSMS) and a Driver Safety Measurement System (DSMS). This study is not concerned with non-SMS aspects of CSA, and it is concerned only with CSMS, not with DSMS, but we will refer to our topic as SMS in the remainder of this report.
SMS partitions nearly 900 possible violations that can arise from primarily roadside inspections into six groups, BASICs. The violations in each BASIC are grouped together because they are associated with similar types of unsafe practices: (1) Unsafe Driving, (2) Hours-of-Service Compliance, (3) Vehicle Maintenance, (4) Controlled Substances/Alcohol, (5) Hazardous Materials Compliance, and (6) Driver Fitness. For each carrier that has enough data to meet FMCSA’s data sufficiency standards for each BASIC, the SMS algorithm produces measures which, for five of these six noncrash BASICs, are weighted violation frequencies divided by weighted counts of inspections, for any violations within a given BASIC. Unsafe Driving uses a different denominator, which is essentially an estimate of VMT. The seventh BASIC, referred to as the Crash Indicator BASIC, is a weighted crash frequency, where the weights are time weights and crash severity weights. Then, within groups of similarly sized carriers, referred to as safety event groups, the carriers are ranked from low to high for each BASIC, and the percentile ranks (expressed between 0 and 100%) for each of the seven BASICs are computed for each carrier. Separately for each BASIC, those carriers with percentile ranks above thresholds set by FMCSA may receive interventions, which range from a warning letter to an on-site investigation. Finally, FMCSA has, until recently, published the percentile ranks for five of the seven BASICs. There may be important implications in making the measures public, since those carriers that have measures that exceed the thresholds can lose business and have their insurance rates increased.
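The ranking step described above can be illustrated with a minimal sketch. It is not the SMS algorithm itself (see Appendix B for that): the function names, the toy carrier records, the tie handling, the exact percentile formula, and the threshold value of 65 are all assumptions made for illustration only.

```python
from collections import defaultdict

def basic_measure(weighted_violations, weighted_inspections):
    """Weighted violation frequency divided by the weighted inspection count."""
    return weighted_violations / weighted_inspections

def percentile_ranks(carriers):
    """Rank carriers from low to high within each safety event group.

    carriers: list of dicts with keys 'id', 'group', 'measure'.
    Returns {carrier id: percentile between 0 and 100}.
    The percentile formula used here is an assumed, simplified convention.
    """
    by_group = defaultdict(list)
    for c in carriers:
        by_group[c["group"]].append(c)
    ranks = {}
    for members in by_group.values():
        n = len(members)
        for c in members:
            below = sum(1 for other in members if other["measure"] < c["measure"])
            ranks[c["id"]] = 100.0 * below / (n - 1) if n > 1 else 0.0
    return ranks

def flagged(ranks, threshold):
    """Carrier ids whose percentile rank exceeds a (hypothetical) threshold."""
    return sorted(cid for cid, pct in ranks.items() if pct > threshold)

# Toy data: two safety event groups with made-up weighted counts.
carriers = [
    {"id": "A", "group": 1, "measure": basic_measure(2.0, 4.0)},
    {"id": "B", "group": 1, "measure": basic_measure(4.0, 4.0)},
    {"id": "C", "group": 1, "measure": basic_measure(8.0, 4.0)},
    {"id": "D", "group": 2, "measure": basic_measure(30.0, 10.0)},
    {"id": "E", "group": 2, "measure": basic_measure(1.0, 10.0)},
]
ranks = percentile_ranks(carriers)
candidates = flagged(ranks, threshold=65.0)
```

In this toy example, carriers C and D receive the same percentile rank even though their measures differ, because each is ranked only against the other members of its own group.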
SMS has been evaluated by the U.S. Government Accountability Office (2014), the American Transportation Research Institute (2012, 2014, 2015), the Independent Review Team (2014) of the U.S. Department of Transportation, Green and Blower (2011), and others. Before we present the issues that have been raised in these various reviews, we provide some perspective on the nature of what FMCSA is attempting to accomplish and a sense of what level of performance is likely.
We assert that a direct approach to the problem of predicting which motor carriers will have the highest future crash risk is extremely difficult. This is primarily because crashes are a rare phenomenon, each crash is associated with multiple contributing factors, and the data associated with many of these factors are either unrecorded or unidentifiable. With SMS, FMCSA has instead adopted a related approach focused on prevention and not prediction. FMCSA identifies carriers engaging in observable behaviors that have been shown to be associated with future crash risk, and it intervenes with those carriers to encourage them to adopt safer practices in the hopes of reducing future crashes. The objective of FMCSA’s SMS is to identify carriers that are giving too little priority to behaviors and practices indicative of safety performance. FMCSA uses data on how frequently inspections of CMVs find violations of various safety regulations, together with the results of investigations and the frequency of crashes, thereby directly measuring important components of carriers’ safety practices. By doing this, FMCSA’s objective becomes an easier problem of discriminating between those carriers that do and do not emphasize various aspects of safe operations.
A number of specific criticisms of SMS were documented in past reviews and repeated in presentations to the panel.
Not All BASICs Are Predictive: Evaluations of SMS have shown that two of the seven BASICs—Driver Fitness and Controlled Substances/Alcohol—have low correlations with future crash rates, which raises a question about their utility as part of SMS.
Data Sufficiency Standards: As noted above, FMCSA does not apply SMS to carriers that have not had a sufficient number of inspections, violations, and crashes. There is no question that SMS measures and percentiles for carriers with just a few inspections would be extremely variable. However, FMCSA points out that if it adopts a more stringent standard, it will exclude a very large fraction of the CMV population from its purview. This is the trade-off that must be considered.
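The variability concern can be made concrete with a minimal sketch. It assumes, purely for illustration, that each inspection independently finds a violation with the same probability, so that the observed violation rate has a binomial standard error; this simplification is ours and is not part of the SMS algorithm, and the inspection counts and the rate of 0.2 are made up.

```python
import math

def violation_rate_se(true_rate, n_inspections):
    """Binomial standard error of an observed violation rate.

    Assumes independent inspections with a common violation probability,
    which is an illustrative simplification rather than the SMS model.
    """
    return math.sqrt(true_rate * (1.0 - true_rate) / n_inspections)

# Same underlying rate, very different amounts of data.
se_small_carrier = violation_rate_se(0.2, 3)    # a carrier with 3 inspections
se_large_carrier = violation_rate_se(0.2, 100)  # a carrier with 100 inspections
```

Under these assumptions the standard error shrinks with the square root of the number of inspections, so a measure based on three inspections is several times noisier than one based on a hundred.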
Use of an Absolute Versus a Relative Metric: The use of percentile ranks to decide which carriers receive interventions is a relative scoring method, where each carrier has to do better than its peers to receive a lower (improved) ranking. In contrast, absolute measures inform users about a carrier’s change in performance over time, not in comparison with any peers. Given the constant improvement in various aspects of CMV driving, including better technologies, a relative rank has the important advantage that such improvements do not make the metric irrelevant over time. Also, relative ranks are consistent with FMCSA’s fixed budget. However, carriers that are improving against an absolute standard feel that the improvement should relieve them from receiving interventions.
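The tension between the two kinds of metric can be seen in a small numerical sketch. All values are invented, and the percentile formula (the share of peers with a lower measure) is an assumed simplification rather than the SMS definition.

```python
def percentile_among_peers(value, peer_values):
    """Percentage of peers whose measure is lower (better) than this value."""
    below = sum(1 for v in peer_values if v < value)
    return 100.0 * below / len(peer_values)

# Hypothetical violation rates (violations per inspection) in two years.
carrier_year1, peers_year1 = 2.0, [1.0, 1.5, 2.5, 3.0]
carrier_year2, peers_year2 = 1.5, [0.5, 0.8, 1.2, 1.4]

absolute_improved = carrier_year2 < carrier_year1
pct_year1 = percentile_among_peers(carrier_year1, peers_year1)
pct_year2 = percentile_among_peers(carrier_year2, peers_year2)
relative_worsened = pct_year2 > pct_year1
```

Here the carrier's absolute rate falls from 2.0 to 1.5, yet its percentile rank rises because every peer improved by more, which is exactly the situation described by carriers that object to a purely relative measure.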
Use of Data from Nonpreventable Crashes: The American Transportation Research Institute (2015) has raised the point that many crashes involving CMVs are not the fault of the CMV drivers, such as when they get rear-ended. It can be argued that such crashes do not help discriminate between unsafe and safe carriers and therefore should be removed from the SMS algorithm. The argument for the retention of such crash data is that, based on effectiveness testing, all crashes are useful in discriminating between safe and unsafe carriers. Further, it is not always easy to determine preventability, and doing so would require substantial additional resources.
State Differences in Rate of Inspections and Violations: There are differences from state to state in road type, congestion, and in prevalence of ice, degree of visibility, and other conditions. Since the driving environment varies state by state, this can have an impact on crash frequency. Further, the American Transportation Research Institute (2014) has provided strong evidence that there are significant differences among states in administration of the CVSA inspection system. This raises the question as to whether SMS is unfair to carriers that operate in states that issue more frequent violations (or that issue a higher percentage of violations that have a higher severity weight).
Stratification of SMS in Addition to Safety Event Groups: Besides safety event groups, which as we have noted are essentially based on carrier size, the only formal stratification that SMS makes use of is the stratification of the carrier population into carriers where, for truck carriers, more than 70 percent of the trucks are “combination” as opposed to “straight,” with a related stratification for motorcoach carriers. (See Glossary for explanation of these terms.) In addition, there are different thresholds for interventions for different types of carriers, including passenger carriers and hazardous material carriers, which is also a form of stratification. Given the heterogeneity of the trucking and motorcoach industry, a greater degree of stratification has been suggested by critics of SMS so that carriers are compared to peers who have similar operations. FMCSA counters such suggestions by pointing out that the greater the degree of stratification used, the fewer peers a carrier can be compared to and the less useful the ranks in the tails of the distributions become.
Better Measures of Exposure: A relatively fair comparison between two carriers requires that the total number of violations for unsafe driving or the total number of crashes be standardized by the number of VMT. That is, the greater the miles traveled, the greater the likelihood of an inspection or violation, and the greater the likelihood of a crash. Therefore, the number of violations or the number of crashes is divided by the total VMT for a carrier, resulting in violations or crashes per mile. Unfortunately, the current data from MCMIS on VMT for a carrier are often missing, and so FMCSA uses a more highly reported figure on the average number of power units, multiplied by a utilization factor (see Appendix B) that is a function of VMT, as a proxy for VMT. The resulting denominator is often based on the proxy rather than actual mileage, and therefore it is important to improve the quality of data on VMT so that this normalization is an accurate reflection of the motor carrier’s operations.
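The normalization described in this paragraph can be sketched as follows. The function names, the fallback rule, and the numeric utilization factor are hypothetical; the actual utilization factor is defined in Appendix B and is not reproduced here.

```python
def exposure_miles(reported_vmt, avg_power_units, utilization_factor):
    """Return reported VMT when available, otherwise a proxy.

    The proxy is average power units times a utilization factor; the
    factor's value is an assumption supplied by the caller.
    """
    if reported_vmt is not None and reported_vmt > 0:
        return float(reported_vmt)
    return avg_power_units * utilization_factor

def rate_per_million_miles(event_count, miles):
    """Events (crashes or violations) per million miles of exposure."""
    return event_count / miles * 1_000_000

# Carrier with missing VMT: 10 power units and a made-up factor of
# 50,000 miles per power unit.
miles = exposure_miles(None, avg_power_units=10, utilization_factor=50_000.0)
crash_rate = rate_per_million_miles(2, miles)
```

Because the rate scales inversely with the denominator, any error in the proxy passes directly into the resulting rate, which is why the quality of the VMT data matters for this comparison.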
Quality of MCMIS Crash Data: While crash data collection is not solely FMCSA’s responsibility, evaluations of state reporting have shown significant underreporting of qualifying crashes by the states. The underreporting varies by crash severity (with fatal crashes less likely to be missed), truck type (where crashes with smaller qualifying trucks are less likely to be reported), type of enforcement agency that covered the crash, and other factors.
Appropriateness of Severity Weights and Violation Coding: The severity weights were established by FMCSA using a combination of subject-matter expertise and empirical work. They are used to give additional weight to violations that are more closely associated with future crash risk. Severity weights and violation coding have been criticized, since essentially equivalent violations can be assigned to different violation codes, which can result in severity weights that can differ by two or more times, depending on the specific violation codes cited.
Currently Uncollected Variables That Might Substantially Improve SMS: MCMIS is relatively effective at capturing many characteristics of the commercial driver and the vehicle and some aspects of the environment. However, it is incomplete regarding carrier operations. SMS is in some sense based on the belief that some crashes are due in part to carrier operations. Hence, it is important to gain knowledge of those carrier factors when considering improvements to SMS, as these factors are what FMCSA is attempting to change. For example, it would be extremely useful, though currently not feasible, to have information on all 550,000 motor carriers as to their primary business, how they schedule drivers, and their driver turnover rate.
Sparsity of Some Violations: Many of the 899 violations are only occasionally cited on any inspections and so have little tangible impact on SMS for the great majority of carriers. It seems reasonable to believe that the greater the number of different violations for which data are collected, the lower the quality of the information, which would advocate for the removal of such violations from data collection. On the other hand, such violations, for circumstances that are admittedly somewhat rare, might still be extremely predictive of unsafe carriers for those circumstances, and therefore very important information when they occur.
Selection Effects: The reason that some trucks are more frequently pulled over for inspections and others are not depends both on state guidance and the inspectors’ discretion. As a result, there may be factors that cause some carriers to be inspected more than others, and these factors may or may not be closely related to safety performance.
Transparency of the SMS Algorithm: The panel did not carry out any formal surveys or even informal interviews, but the presentations suggested many carriers find SMS relatively complicated. As a result, they are uncertain whether their score or percentile ranks will increase or decrease based on their most recent months’ pattern of violations and crashes. Some degree of transparency would enhance the reproducibility of SMS measures and give carriers greater trust in the measures and resulting percentiles. This will make it clearer to the carriers what factors could reduce their future chances of having percentile ranks that generate an intervention. In addition, greater access to MCMIS data and the SMS algorithm would facilitate research on SMS and its alternatives by the academic research community.
Making Percentile Ranks Public: Until recently, the SMS measures for the BASICs except for Hazardous Materials Compliance and Crash Indicator were made public, as were the percentile ranks. However, as part of the FAST Act, those percentile ranks are no longer released for property-carrying motor carriers. Releasing them would have the benefit of increasing the incentives for carriers to improve their safety performance, since public reporting can result in lost business and in increased insurance rates. On the other hand, it can be argued that the worse SMS does in discriminating between low- and high-risk carriers, the weaker the argument for making the percentile ranks public. If the difference in future crash risk between the carriers that SMS identifies and the remaining carriers is not substantial, that would provide further justification for not publishing such ranks. FMCSA has examined the future crash risk of the carriers identified for interventions by SMS and that of the remaining carriers, and the differences (as discussed in Chapter 2) are substantial.
Given the stakes involved, it is important to examine SMS to see whether these criticisms are valid, and to see whether improvements can be made to the input data or to the approach used to discriminate between safe and unsafe motor carriers. FMCSA has been forthcoming in incorporating changes in response to suggestions made by various stakeholders or in providing its reasons for not making changes. Since its implementation more than 6 years ago, SMS has had several revisions to fine-tune the methodology.
The rest of this report is organized as follows. Chapter 2 provides a brief description of the SMS methodology, the evaluations of SMS that have been carried out, the reviews and critiques that have been presented, and our summary assessment of the evaluations and critiques. Chapter 3 describes the statistical modeling issues faced by FMCSA in developing SMS and the opportunity to use item response theory (IRT) to assess safety in the trucking industry. Chapter 3 also describes previous uses of IRT to assess hospital performance and the effectiveness of teachers in elementary schools. Further, it discusses whether SMS percentile ranks should be made public, and the benefits of transparency of SMS or alternatives. Chapter 4 contains the details of the IRT model as it could apply to motor carrier safety, starting with its conceptual basis and continuing through its technical description. Chapter 5 contains some extensions to the model, including multivariate responses, as well as the ability of the model to accommodate exposure measures, provide absolute instead of relative metrics, represent the uncertainty of measures, and identify carriers for interventions. Chapter 6 describes what variables currently on MCMIS need to be improved for use in SMS and in the new model, and then what additional variables, if collected, could potentially improve the performance of the proposed model in the future. The panel’s six recommendations are presented in the relevant chapters, specifically Chapters 3, 4, and 6. Appendix A summarizes the agendas of the panel’s public meetings, Appendix B details the current SMS algorithm, Appendix C provides examples of simple IRT models applied to the MCMIS data, and Appendix D contains biographical sketches of the panel members and staff.