The causes of a crash can be related to the characteristics of the driver; driver behavior/performance; the vehicle; the fleet; or the environment in which the driver and the vehicle, as well as other vehicles, were traveling. For a given crash, multiple factors can and often do play a part. Any analysis of individual causal factors, including driver fatigue, can be biased by a failure to represent the impacts of various confounding influences. Therefore, assessing the role of driver fatigue in highway safety requires a comprehensive approach that accounts for the contribution of all the important factors that can cause changes in the degree of driver fatigue and in crash frequency. To this end, one must have data on the contributions of these other causal factors.
This requirement in turn prompts an assessment of what data on the various causal factors exist, at what level of detail, for what populations, and how readily these data can be linked to other variables. Such an assessment can provide a better understanding of the existing data gaps and how they might be filled to develop a comprehensive database that can support more conclusive research on the role of driver fatigue and hours-of-service (HOS) regulations in highway safety. With such a database, research could identify which factors play more or less important roles in causing driver fatigue, and which factors, including fatigue, play more or less important roles in the frequency of crashes. Along with other considerations, such as the feasibility of implementing policy changes that can alter various causal factors, such research should greatly assist the Federal Motor Carrier Safety Administration (FMCSA) and other government agencies in determining how to focus their efforts to reduce crashes due to fatigued driving and, more generally, to improve highway safety.
Specifically, one needs to collect data from commercial motor vehicle (CMV) drivers on their degree of fatigue (e.g., hours of service, work hours, work demands); other driver factors, such as medication use, drug use, and body mass index (BMI); vehicle factors (e.g., noise, distraction, quality of brakes, visibility); carrier factors (e.g., scheduling, compensation methods); and environmental factors (e.g., time of day, road type, traffic conditions). Table 10-1 represents an initial attempt to outline the various predictors that might be expected to have an association with crashes or other safety outcomes. The table includes driver fatigue as both a predictor and an outcome because it is necessary to understand not only the extent to which driver fatigue causes crashes but also, assuming that fatigue is an important causal factor for crashes, what factors cause drivers to be fatigued.
The table also includes traffic density and whether driving occurs during the nighttime as important factors in crash risk. Both are linked to increased crash risk. Moreover, since these two factors tend to be negatively correlated, their joint impact on crash risk can be difficult to anticipate. That is, the time that is worst for staying awake—nighttime—is the best for avoidance of other vehicular traffic. Therefore, any assessment of crash risk needs to include the contribution of both factors. This is a simple example arguing for a comprehensive approach to the question of fatigue as a causal factor for CMV crashes.
Several points need to be emphasized about Table 10-1. First, it is incomplete as to the potential factors that might be included: confirmatory research has not yet been conducted that would enable a comprehensive understanding of the causal structure underlying crash risk. The table is also incomplete regarding data sources and their availability. Finally, it is incomplete with respect to the information one might want to know about each data source. Completing the column on predictors is a challenge, but for existing data sources, columns 4, 5, and 6 could easily be filled in. The panel believes FMCSA can use this table as a starting point for a living compendium of the factors that affect crash risk and the information available about them. FMCSA is best positioned to complete the table because it has the most thorough understanding of the various data sets and what information they do and do not include. Also, analysts using these data to draw inferences about fatigue, hours of service, and highway safety may find that the table needs to be augmented to include columns representing other important data features.
It is also important to note that determining which factors can raise or lower crash risk is important in and of itself, and is necessary to support further causally relevant research on the interrelationship among fatigue,
hours of service, and crash risk. There is a chicken-and-egg problem here in that until one knows which factors to include as confounders, one cannot know whether a variable is in fact a causal factor or simply correlated with a true causal factor. The panel believes that for now, it is better to err on the side of including variables whose status regarding causal impact on crash risk is unclear.
The column labeled “Public/Private” is a reminder that much of the data collected now is not easily available to researchers (see Chapter 5). In addition, as the columns “Level of Aggregation” and “Populations for Which Available” suggest, the data often are not available at the necessary level of aggregation or for an appropriately representative population of CMV drivers. With respect to level of aggregation, some predictors, such as degree of driver fatigue, degree of precipitation, and traffic density, will need to be linked to each trip segment (possibly every few minutes of each trip) and thus collected at a very detailed level. Other variables, such as all fleet variables and many environmental factors, including number of lanes and road geometry, are stable over long periods of time and can be collected infrequently.
Various definitional and measurement complications result from the need to represent many of the above concepts. With respect to outcomes, what is a crash? Does one count curb strikes? For serious crashes, does one require a threshold on damages, and should one use only avoidable crashes? Should the focus be on crashes that result in fatalities? What safety-critical events (SCEs) are relevant to analysis of fatigue and highway safety? This lack of clarity in measurement is true for some of the predictors as well. For example, previous chapters have addressed the difficulty of defining and measuring driver fatigue, but how should safety culture be defined and measured? While additional work in this area is needed, definitional vagueness and measurement complexity often can be assigned lower priority since inferences are frequently robust to the precise definitions and measurements used.
Filling out the remaining columns of Table 10-1 would clarify what data are available to FMCSA and academic/industry researchers and identify existing data gaps. FMCSA then would need to determine the relative priority of each gap and the best means of closing these gaps. FMCSA would need to determine whether private sources exist, and if so, whether various techniques could be effective for making such sources public while avoiding disclosure of individual data. Further, FMCSA would need to determine the level of aggregation at which the data exist and whether they are sufficiently detailed, as well as whether the data exist for a subset or for the entire population of CMV drivers.
Finally, the data need to be linked so that rapidly changing information on drivers (e.g., whether they obtained sufficient sleep) and the
TABLE 10-1 Factors Associated with Outcomes Representing Crash Risk
| Predictor Domain | Predictors/Variable Set | Data Source | Public/Private | Level of Aggregation | Populations for Which Available | Outcomes |
| --- | --- | --- | --- | --- | --- | --- |
| Trucks and Buses | | | | | | |
environment (e.g., degree of precipitation) can be merged in a way that is amenable to statistical modeling of the most relevant information. (The assumption here is that carrier and vehicle data are less changeable, but there are likely exceptions to that assumption.) Linking such information would be greatly facilitated by some means of identifying unique drivers and trips that could be used across databases, since the information described here is not likely to reside in a single database.1
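To make the linkage concrete, the following minimal sketch (in Python, with invented field names and made-up values) shows how per-trip driver, vehicle, and environment records keyed by a shared driver and trip identifier could be merged into a single analysis record. It illustrates the idea only; real databases would differ in structure and content.

```python
# Illustrative sketch: merging per-trip records from three hypothetical
# sources on a shared (driver_id, trip_id) key. Field names are invented
# for illustration; real databases would differ.

def link_trip_records(driver_logs, telematics, weather):
    """Merge three record sets keyed by (driver_id, trip_id)."""
    merged = {}
    for rec in driver_logs:
        key = (rec["driver_id"], rec["trip_id"])
        merged[key] = dict(rec)  # start from the driver-side fields
    for rec in telematics:
        key = (rec["driver_id"], rec["trip_id"])
        if key in merged:
            merged[key]["hard_brakes"] = rec["hard_brakes"]
    for rec in weather:
        key = (rec["driver_id"], rec["trip_id"])
        if key in merged:
            merged[key]["precip_mm"] = rec["precip_mm"]
    return list(merged.values())

# Made-up single-trip example
driver_logs = [{"driver_id": 1, "trip_id": "A", "hours_slept": 5.5}]
telematics = [{"driver_id": 1, "trip_id": "A", "hard_brakes": 3}]
weather = [{"driver_id": 1, "trip_id": "A", "precip_mm": 12.0}]

linked = link_trip_records(driver_logs, telematics, weather)
# Each linked record now combines sleep history, vehicle events, and weather
```

A shared key of this kind is exactly what a cross-database driver and trip identifier would supply.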
The fact that FMCSA will discover many data gaps should be no surprise. As is clear from the review in Chapter 5, none of the data sets currently available for assessing the role of driver fatigue in highway safety provides all the information necessary to draw rigorous inferences for the full population of interest. Crash data sets underrepresent the degree to which fatigue contributes to crashes because they are based on after-the-fact police reports and therefore include incomplete information on the driver’s recent sleep history. Further, since comprehensive exposure data are not available, crash counts cannot be translated into crash rates and therefore cannot support risk assessments for individual factors. Naturalistic driving data sets typically include few crashes because crashes are rare events. Moreover, since all such research requires participants’ informed consent, it collects data only on volunteers, who may be more or less subject to certain risk factors relative to the general CMV driver population. Also, the various special-purpose data sets comprising small surveys, driver logs, output from various devices, and data collections for particular populations (e.g., drivers for large fleets) are available only for subpopulations and for a subset of the necessary causal factors. As discussed in this chapter, new data collection efforts and new technologies for automatically collecting relevant data could provide much of the needed data not currently available.
The remainder of this chapter presents the panel’s analysis and recommendations with respect to the most important directions for future research on fatigue among CMV drivers and highway safety. It describes in turn (1) survey data that could be collected on CMV drivers to help reduce some of the major data gaps on drivers that inhibit this research; (2) data available from vehicles themselves that could help close existing gaps in data on drivers, vehicles, and the environment; (3) other data sources that could be tapped for this research; (4) examples of key research questions that could be investigated with better data; and (5) methodological and statistical issues entailed in research on driver fatigue and highway safety and how these issues can be addressed.
1 Longitudinal Employer-Household Dynamics (LEHD) is an example of a data set in which employer-employee information is linked and therefore is useful for tracing the employment history of a worker. See http://lehd.did.census.gov/ [June 2016].
The population of CMV drivers in the United States is large and diverse (see Chapter 2). Currently, there are approximately 3 million CMV drivers (depending on the definition used) for whom knowledge remains incomplete.2 Given that FMCSA is charged with establishing HOS regulations, setting standards for medical certification, and communicating the dangers of driver fatigue, the agency would greatly benefit from knowing important characteristics about this population. For instance, truck drivers who drive locally on set routes are less likely than long-haul drivers to be affected by changes to HOS regulations. Given the lack of a continuing survey, however, much about the population of CMV drivers remains undocumented, even on matters of simple demographics. This includes the number of drivers engaged in various types of employment (e.g., the number of local versus long-distance drivers); the number of hours in a day and number of days in a week that drivers with various types of employment drive; how drivers are compensated; and their age, gender, and race. This information is lacking in particular for the large number of independent drivers who are not directly employed by carriers. Absent this information, it is difficult to estimate accurately how many drivers are directly affected by FMCSA’s guidelines and policies and how many might benefit from various proposed changes.
Capturing many of the measurements needed for research on fatigue and on the health and welfare of CMV drivers can be viewed as somewhat intrusive. Such data collection often is dependent upon CMV drivers who volunteer this kind of personal information. It is therefore common for various subsets of the truck and bus driver population to be underrepresented in such research efforts. Having information on the total CMV driver population and some of its characteristics would support appropriate weighting of samples so that findings could be generalized to the full population of drivers (based on certain assumptions).
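To illustrate how population counts would support such weighting, the following sketch (in Python, with entirely made-up numbers) applies simple post-stratification: each stratum of a volunteer sample is weighted by the ratio of its population count to its sample count before estimates are combined.

```python
# Illustrative sketch of post-stratification weighting. Population counts,
# sample counts, and stratum means are all made-up numbers; the point is
# only the mechanics: weight = population count / sample count per stratum.

def poststratification_weights(pop_counts, sample_counts):
    """Per-stratum weight = population count / sample count."""
    return {s: pop_counts[s] / sample_counts[s] for s in pop_counts}

pop_counts = {"local": 1_800_000, "long_haul": 1_200_000}
sample_counts = {"local": 300, "long_haul": 700}  # long-haul over-represented

weights = poststratification_weights(pop_counts, sample_counts)

# Population-weighted estimate of, e.g., mean nightly sleep hours
stratum_means = {"local": 7.1, "long_haul": 6.2}
total = sum(pop_counts.values())
weighted_mean = sum(pop_counts[s] * stratum_means[s] for s in pop_counts) / total
# weighted_mean reflects the 60/40 population split, not the 30/70 sample split
```

This correction is only as good as its assumptions: it requires accurate stratum counts for the full driver population, which is precisely what a continuing survey would provide.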
Commercial Driver’s License Information System Database
Generally speaking, a person wishing to legally operate a vehicle with a gross vehicle weight rating (GVWR) of greater than or equal to 26,001 lb or to transport 16 or more passengers needs to have a commercial driver’s license (CDL). Obtaining a CDL requires passing a written test and a road test. Information on drivers with CDLs, along with their driving
2 For purposes of issuing a CDL and for drug testing, one includes only drivers of vehicles greater than 26,000 lb gross vehicle weight rating. For medical qualification, the definition is based on FMCSA regulation 390.5 and is for drivers of vehicles greater than 10,000 lb gross vehicle weight rating.
records, is collected by the individual states and aggregated in the Commercial Driver’s License Information System (CDLIS). The purpose of this database is to serve as a clearinghouse and repository of information pertaining to the licensing and identification of CMV drivers. State driver licensing agencies use the CDLIS to transmit information on out-of-state convictions, transfer the driver record when a license holder moves to another state, and respond to requests for driver status and history. Prior to the development of the CDLIS, drivers could obtain multiple CDLs and could use that capability to conceal violations from law enforcement personnel or prospective employers. Each state separately administers its own portion of the CDLIS, and all of the system’s files are linked together in a national relational database.
In addition to information on a driver’s safety record, the CDLIS contains information on a driver’s age, gender, height, and weight. As of February 2008, the CDLIS contained more than 13 million CMV driver records (see Federal Motor Carrier Safety Administration, 2008). Since it is generally believed that there are about 3 million active truck and bus drivers, the CDLIS contains a substantial number of records for individuals who are no longer active CMV drivers. In addition, the currency of many of the addresses in the database is a concern. (As noted above, part of the vagueness comes from how a CMV driver is defined; Sections 383.5 and 390.5 of the FMCSA Safety Regulations differ in their definitions.) Moreover, given that access to this database is limited to FMCSA and its contractors, using it either as a sampling frame or to compute demographic summary statistics (age, gender, race) would raise privacy concerns. Thus, while the CDLIS appears to offer a source of data on CMV drivers, it has a number of limitations that constrain its use for gathering information on CMV drivers and crash risk.
Commercial Motor Vehicle Driver Surveys
Surveys of the population of CMV drivers have been attempted. Recently, the National Institute for Occupational Safety and Health (NIOSH) conducted a survey of long-haul truck drivers. The survey design involved randomly selecting limited-access highway segments and then randomly selecting truck stops within those segments. Truck drivers entering those truck stops during a 3-day interview period were recruited for the survey. Drivers also could be approached at fueling stops or weigh stations. This survey was limited to long-haul truck drivers. (See Sieber et al. for details.) Another survey was recently carried out as part of the Behavioral Risk Factor Surveillance System in Washington State (for details, see Bonauto et al.). Respondents for this survey were contacted via telephone, which raises the concern that long-haul
truck drivers were underrepresented. This is hardly a comprehensive list of surveys of CMV drivers, but it is safe to say that such studies are not carried out on a regular basis, and they generally are available only for specific subgroups.
Until 2002, the Census Bureau conducted a Truck (or Vehicle) Inventory and Use Survey,3 which collected some of the information needed to populate Table 10-1. The sampling frame was constructed from files of truck registrations identified as being active. Sampling was stratified by state and by truck characteristics. Body type and GVWR determined the following five truck strata: (1) pickups; (2) minivans, other light vans, and sport-utilities; (3) light single-unit trucks (GVWR under 26,000 lb); (4) heavy single-unit trucks (GVWR over 26,000 lb); and (5) truck-tractors. Within each stratum, a simple random sample of truck registrations was selected without replacement, producing a sample of approximately 136,000 truck registrations. Questionnaires were mailed to the addresses corresponding to these registrations, and the results were tabulated. This survey provided information relevant to the present study, including whether a vehicle had been involved in a crash, the vehicle type, the jurisdiction, the type and size of the carrier, the distance traveled, and the range of operation. Unfortunately, this Census Bureau survey was discontinued after 2002.4
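The stratified design described above can be sketched as follows (in Python, with invented strata and sizes): a simple random sample is drawn without replacement, independently within each stratum.

```python
# Illustrative sketch of stratified simple random sampling without
# replacement. Strata, frame sizes, and sample sizes are invented; the
# real survey stratified truck registrations by state and truck type.

import random

def stratified_sample(frame_by_stratum, n_by_stratum, seed=0):
    """Draw an independent simple random sample within each stratum."""
    rng = random.Random(seed)
    sample = {}
    for stratum, units in frame_by_stratum.items():
        n = min(n_by_stratum[stratum], len(units))
        sample[stratum] = rng.sample(units, n)  # without replacement
    return sample

frame = {
    "pickups": [f"P{i}" for i in range(1000)],
    "heavy_single_unit": [f"H{i}" for i in range(200)],
    "truck_tractors": [f"T{i}" for i in range(300)],
}
take = {"pickups": 10, "heavy_single_unit": 5, "truck_tractors": 8}

sample = stratified_sample(frame, take)
# Each stratum is sampled at its own rate and contains no duplicates
```

Stratification of this kind lets rarer but more relevant strata (e.g., truck-tractors) be sampled at higher rates than their share of the frame.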
The Need for a Continuing Survey of Truck and Bus Drivers
The panel contends there is a need for information on CMV drivers, their vehicles, their routes, and their employers. Given that attempts to survey CMV drivers have been infrequent and limited in their population coverage, the panel believes that a regular survey of a sample of CMV drivers is needed to collect information on the drivers (age, gender, race), their basic health characteristics, their employment, how much they drive, their vehicles, and their routes. This survey would need to be repeated on a regular basis because of the high turnover rates among the CMV driver population, as well as regular changes in the particular types and characteristics of vehicles being driven, the driving environments, the types of employment, and other variables. Repeating such a survey every 5 to 10 years would help inform the modification of guidelines for CMV drivers over time.
Since many long-haul truck drivers are not easily reached, such a
3 See https://www.census.gov/svsd/www/vius/products.html [March 2016].
4 In addition to the Vehicle Inventory and Use Survey, the Bureau of Transportation Statistics used to collect substantial data on Form M that provided details on compensation and benefits across a wide range of trucking occupations.
survey would not be easy to conduct. However, area sampling methods, such as those used in the NIOSH survey, could provide quality information. (NIOSH also used incentive payments, which would be important given that respondents would be on the clock.) While the panel is more optimistic about an area sampling approach, the CDLIS could possibly serve as a sampling frame for a mail or Internet survey, provided that (1) it could be kept current and include information on whether the individual was employed as a truck or bus driver for more than a specified number of months per year, and (2) contact information could also be kept current and could include mail and Internet addresses.
The panel believes that NIOSH is best positioned to support such a data collection effort, at least in part, given the relevance of its charge to assess occupational safety and health. Because of the cost of conducting such surveys, however, other agency support undoubtedly would be needed.
RECOMMENDATION 1: The National Institute for Occupational Safety and Health should be enlisted to design and conduct a regularly scheduled survey every 5 to 10 years to gather information needed to better understand the demographics and employment circumstances of all commercial motor vehicle drivers in various industry segments.
As detailed in Chapter 9, a number of new devices currently are being used or have been proposed for use with commercial motor vehicles. These devices include electronic on-board recorders that collect information on when a vehicle is in operation. Further, a number of technologies enable the collection of real-time video and telematics data, which can be used to alert a fleet manager when a vehicle is, or recently has been, speeding, hard braking, or swerving. There also are various on-board safety systems already in use or proposed for use, including indicators of weaving out of one’s lane and automatic collision avoidance systems. Such systems can provide, in real time, much of the information relevant to the driver and the vehicle listed in Table 10-1. One prime example is technology for assessing whether the driver is fatigued by measuring PERCLOS (percentage of eyelid closure). Some systems collect other, more indirect measures of fatigue, such as quick steering wheel motions or multiple lane departures. In addition, information can be collected on the environment because cameras can be trained on the road ahead. When collected, however, much of this type of data is proprietary, so legal barriers currently limit researchers’ ability to acquire the data to
examine the role of fatigue in highway safety. Although such data likely suffer from a lack of representativeness, the panel believes they still could play an important role in this research. For instance, they could be used to estimate upper and lower bounds for various key parameters for the entire population of CMV drivers.
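As an illustration of the PERCLOS idea, the following sketch (in Python, with made-up closure values) computes the fraction of samples within a window during which eyelid closure is at or above 80 percent. Production systems derive closure from video and differ in sampling rates and windowing details.

```python
# Illustrative sketch of a PERCLOS computation over a fixed window.
# Closure values (0 = fully open, 1 = fully closed) are made up; real
# systems estimate them from video of the driver's eyes.

def perclos(closure_series, threshold=0.8):
    """Fraction of samples with eyelid closure at or above the threshold."""
    if not closure_series:
        return 0.0
    closed = sum(1 for c in closure_series if c >= threshold)
    return closed / len(closure_series)

# One sample per second over a 10-second window (illustrative values)
window = [0.1, 0.2, 0.9, 1.0, 0.95, 0.3, 0.1, 0.85, 0.2, 0.1]
score = perclos(window)  # 4 of 10 samples at or above 0.8, so 0.4
```

A system would typically compare such a score against a calibrated cutoff before flagging a driver as fatigued.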
It would therefore be useful to explore ways in which such data might be made available to the research community. The past 25 years have seen the growth of effective disclosure limitation techniques. Use of these techniques can greatly limit the risk of disclosure of individually identifiable information while allowing researchers to use the protected version of the data, which retains the great majority of the information. There also are arrangements whereby confidential data can be made available through research data centers such that access is allowed only for a preapproved set of researchers, who cannot take any sensitive information from the center when they have completed their analysis. Given the successful use of such techniques in many contexts, the panel believes FMCSA could benefit from working with the owners of these data sets to help make them researcher-accessible and disclosure-limited. Such data sharing could be facilitated either by applying techniques such as cell suppression, noise addition, or production of synthetic data sets before granting researchers access, or by establishing research data centers that provide researchers limited access to such data for summary analyses. The following subsections describe the vehicle-based data sources currently available or on the horizon.
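Two of the disclosure limitation techniques mentioned here, cell suppression and noise addition, can be sketched as follows (in Python; the suppression threshold and noise scale are arbitrary illustrative choices, not prescribed values).

```python
# Illustrative sketch of cell suppression and noise addition. The
# suppression threshold (5) and Laplace noise scale (2.0) are arbitrary
# choices for illustration, not prescribed values.

import math
import random

def suppress_small_cells(table, min_count=5):
    """Suppress cells with fewer than min_count contributing drivers."""
    return {cell: (n if n >= min_count else None) for cell, n in table.items()}

def add_laplace_noise(table, scale=2.0, seed=0):
    """Add zero-mean Laplace noise to each released (non-suppressed) count."""
    rng = random.Random(seed)
    noisy = {}
    for cell, n in table.items():
        if n is None:
            noisy[cell] = None
        else:
            u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
            noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
            noisy[cell] = max(0, round(n + noise))
    return noisy

crash_counts = {("night", "long_haul"): 42, ("night", "local"): 3}
protected = add_laplace_noise(suppress_small_cells(crash_counts))
# The 3-driver cell is suppressed entirely; the larger count is perturbed
```

Synthetic data generation and restricted-access research data centers serve the same goal by different means: protecting individual records while preserving most of the analytic value of the data.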
Data from Electronic Logging Devices
In the past few years, many carriers have decided to replace their paper logs with electronic logging devices. In the near future, all carriers may be legally obligated to do so. At a minimum, these devices record when and where the truck or bus was in operation. Therefore, in the event of a crash, the number of hours the vehicle was in operation over at least the preceding 24 hours, and likely much longer, will have been automatically documented, and such data can then be used to assess whether the driver violated the HOS regulations.
It has been suggested that electronic logging devices (similar to electronic on-board recorders) provide higher-quality documentation of the number of hours a truck or bus was in operation relative to the paper logs currently used by many CMV drivers because they are more difficult to tamper with, and they document hours automatically instead of requiring the driver to do so (see also Chapter 5). The panel finds this argument very persuasive. By providing higher-quality assessments of when the truck or bus was in operation, electronic logging devices are better able
to document compliance with HOS regulations, and will also provide better inputs for research examining the linkage among hours of service, fatigue, and highway safety. Accordingly, the panel believes it would be beneficial for FMCSA to compare these devices with paper logs to determine whether their use reduces HOS violations. To address the fact that those carriers with electronic logging devices may not provide representative subsets of the population of CMV drivers, this comparison could use an interrupted time series design, as described in Chapter 6. Further, the importance of switching from paper logs could be demonstrated by research showing lower crash rates for carriers that have installed these devices compared with those that have not, controlling for confounding factors with a technique such as propensity scoring.
RECOMMENDATION 2: The Federal Motor Carrier Safety Administration should conduct an evaluation to determine whether commercial motor vehicle drivers’ use of electronic on-board recorders correlates with reduced frequency of hours-of-service violations and reduced frequency of crashes compared with those drivers who do not use such instruments.
If either of these reductions were established, the argument for requiring the widespread use of electronic logging devices would be significantly enhanced.
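The propensity-score adjustment suggested above can be sketched as follows (in Python, with made-up carrier records). Each carrier’s propensity score, the estimated probability that it adopted electronic logs given its characteristics, is assumed to come from a separately fitted model such as a logistic regression; inverse-probability weighting then compares adjusted crash rates.

```python
# Illustrative sketch of inverse-probability weighting with propensity
# scores. Each carrier's 'pscore' (probability of having adopted
# electronic logs, given its characteristics) is assumed to come from a
# separately fitted model; all numbers here are made up.

def ipw_crash_rates(carriers):
    """Return propensity-weighted mean crash rates for adopters and others."""
    w_t = w_c = r_t = r_c = 0.0
    for carrier in carriers:
        if carrier["eld"]:
            w = 1.0 / carrier["pscore"]            # weight for adopters
            w_t += w
            r_t += w * carrier["crash_rate"]
        else:
            w = 1.0 / (1.0 - carrier["pscore"])    # weight for non-adopters
            w_c += w
            r_c += w * carrier["crash_rate"]
    return r_t / w_t, r_c / w_c

carriers = [
    {"eld": True,  "pscore": 0.8, "crash_rate": 1.1},
    {"eld": True,  "pscore": 0.4, "crash_rate": 1.5},
    {"eld": False, "pscore": 0.7, "crash_rate": 2.0},
    {"eld": False, "pscore": 0.2, "crash_rate": 1.8},
]
adopter_rate, other_rate = ipw_crash_rates(carriers)
# The two weighted rates can then be compared on a more like-for-like basis
```

The adjustment removes only confounding captured by the covariates in the propensity model; unmeasured differences between adopting and non-adopting carriers would still bias the comparison.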
A provision of Title 49 of the U.S. Code, Section 31137, mandates the use of electronic logging devices but appears to prevent the use of data from such devices for research, stating: “The Secretary may utilize information contained in an electronic logging device only to enforce the Secretary’s motor carrier safety and related regulations…The Secretary shall institute appropriate measures to ensure any information collected by electronic logging devices is used by enforcement personnel only for the purpose of determining compliance with hours of service requirements” [emphasis added]. A recent FMCSA final rule likewise mandates the use of electronic logging devices but could not mandate that the resulting data be used in research studies. The panel is therefore led to the following:
RECOMMENDATION 3: Given the potential research benefits of the use of data from electronic logging devices, Congress should consider modifying Title 49 of the U.S. Code to permit the use of such data for research purposes in a manner that protects individualized confidential data from disclosure, and if such a change
is made, the Federal Motor Carrier Safety Administration should make parallel provisions in its regulations.5
Data Collected by Electronic Data Recorders, Vehicle Telematics, and Collision Avoidance and Fatigue Detection Systems
Many trucks and buses are equipped with electronic data recorders to record information on driver actions. Electronic data recorders are different from the electronic logging devices discussed above. They record data that are saved continuously or in response to such triggers as an acceleration exceedance (e.g., from braking, accelerating, or swerving) or a change in vehicle status (e.g., electronically sensed engine or wheel speed changes). These devices can record a wide range of data, including vehicle speed, application of brakes or clutch, steering angle, seat belt use, airbag deployment, and g-force measures associated with impact during crash sequences.
Vehicle telematics refers to various on-board technologies and wireless devices, described in Chapter 5, that transmit data to an organization in real time on how a vehicle is functioning, such as speed, acceleration and braking, airbag deployment, and crashes. They can also collect information on vehicle location. Some companies, such as Lytx, collect video data for clients, such as parents of new drivers and trucking companies, that can be used to monitor remotely how a car or truck is being driven.
The companies that currently collect such data could be encouraged to provide properly disclosure-protected data for use by researchers in examining such questions as what percentages of trip segments with and without crashes involved fatigued drivers, as measured indirectly through the operation of the vehicle or possibly more directly by adding a data capture feature for assessment of PERCLOS. In this way, research could approximate, in some sense, a continuous version of the Large Truck Crash Causation Study.
Newer devices that warn about such behaviors as lane departures (e.g., Iteris AutoVue®), driving too close to the vehicle ahead (collision avoidance systems), or abrupt steering motions have been promoted for use in alerting distracted or fatigued drivers (see also Chapter 9). While the data such devices collect depend on the specific system and are proprietary, it is reasonable to believe that they are similar to those collected by electronic logging devices and telematics systems, and as such are likely relevant to research assessing the increased crash risk associated with fatigued driving. It should be noted that drivers’ acceptance of any
5 A change has been made from the prepublication copy to update language to make it clear that FMCSA cannot change the law but it can modify its regulations.
system that is recording their performance or their personal information needs to be addressed.
Data Collected Prior to Serious Crashes
Black boxes on airplanes can provide detailed information on the sequence of events leading to a crash, which can then be used to suggest design modifications or implementation changes to reduce the chances of a repeat of that event. Since many of today’s trucks and buses are equipped with electronic devices that record information on driver actions preceding a crash, such data (especially video data on driver physiognomy) could help in determining what factors contributed to a crash, including those related to driver fatigue or distraction. Consequently, such data would be valuable to investigators and to researchers examining relationships between driver fatigue and highway safety.
RECOMMENDATION 4: When commercial trucks and buses containing electronic data recorders that record data on the functioning of the driver and the truck or bus are involved in serious crashes, the relevant data should be made available to investigators and to safety researchers.
Other data sources that could be tapped for research on CMV driver fatigue and highway safety include vehicle inspection reports, data from large carriers, and American Transportation Research Institute (ATRI) data.
Vehicle Inspection Reports
While no new on-board technology is involved, an existing safety inspection program examines 4 million commercial motor vehicles each year in North America to ensure that trucks and buses are operating safely. These inspection data are a component of the Motor Carrier Management Information System (MCMIS), compiled, maintained, and funded by FMCSA (see Chapter 5). Trained inspectors in each state examine vehicles based on criteria developed by the Commercial Vehicle Safety Alliance (CVSA). CVSA inspectors examine drivers and trucks for unsafe driving practices, HOS compliance, driver fitness, use of controlled substances, vehicle maintenance, hazardous materials compliance, and crash indications. As part of the most comprehensive Level I inspection, a driver’s certificate from his or her medical examiner is checked, as is the driver’s record of duty status and
adherence to HOS regulations. Relevant to the present study, drivers are checked for visible signs of fatigue (although it is easy to imagine that this process misses a large fraction of cases).
In the context of this study, the utility of the data in such reports is limited because there is no direct measurement of fatigue for drivers of commercial trucks and buses. Also, the criteria used to select drivers and vehicles for inspection may differ from state to state. However, it still may be profitable to study such data on the inspection and crash history of drivers. For instance, one could determine which types of drivers, driving for which kinds of carriers and with which types of logging devices, are found under which inspection criteria to have more or fewer HOS violations. This information would have the usual deficiencies of data that are not controlled for confounding factors but might be useful for generating hypotheses.
Data from Large Carriers
Most carriers collect information on their truck drivers for such purposes as compensation, management, supervision, and monitoring. Many large carriers collect additional information on various aspects of drivers' health, their crash rates, their schedules, their routes, and their vehicles. They collect such data for various reasons, especially economic ones: beyond concern about the health of their drivers, drivers who are frequently involved in crashes are costly for their employers.
Driver health data can include medical information from company-sponsored clinic visits and screening exams, as well as treatment information for a variety of medical conditions, including obstructive sleep apnea (see Chapter 8). Some of the largest carriers use such data to assess the effectiveness of their health and wellness programs and to assess the economics of their insurance coverage plans.
Schneider National, for example, collects electronic log data from many trucks in its fleet that can be used to track work shift variability and the number of days since a truck was last at the home terminal. Data on such critical events as hard braking, roll stability control activation, and collision mitigation also are collected. Data collected on collisions include the time of day the crash occurred, the number of hours since the last break for the driver, and the location of the crash and the roadway type at that location. Some large carriers have relatively sophisticated data collection systems for loss events. Their loss files may include incidents that are not police-reportable or MCMIS-reportable traffic crashes. Carrier-based data also can include the history of noncrash driving, thereby addressing the lack of exposure data that exists for all trucks collectively. Crash
data can be linked to personnel/work records, as well as to equipment manifests.
Clearly, such carrier-collected data could offer a rich opportunity for analysis of various questions of interest concerning HOS regulations, fatigue, and crash frequency. If data from a number of large carriers across the commercial trucking industry could be collected, organized in a database, and made available to researchers, these data could represent an important segment of the trucking industry. Such a database also would supply important information on topics on which little is otherwise known. However, it must be acknowledged that such data would exclude a large fraction of the trucking population, especially independent owner-operators, so the results of this research would not be fully generalizable.
American Transportation Research Institute Data
ATRI collects truck data through collaboration with participating carrier fleets. The ATRI database currently contains billions of data points from electronic on-board records for several hundred thousand unique vehicles spanning more than 7 years. These data, which include time, location, speed, and anonymous unique identification information, are used by ATRI researchers to produce various indicators on truck movement, highway bottlenecks, crossing times and delays, demand for truck routes, and facilities on highways. Knowing the location of a truck or bus prior to a crash could make it possible to detect whether the driver violated HOS regulations. However, this database is not available to researchers, and the data currently are collected for only a modest fraction of all trucks and buses in the United States. Therefore, these data would be of limited utility for nationally comprehensive surveillance studies. On the other hand, with such large sample sizes, the data could be useful for assessing some factors associated with an increase in crashes.
To summarize, there are or soon will be a number of data sources that could potentially provide the location of trucks and buses continuously and therefore indicate how long a CMV driver had been driving prior to a crash. (The panel would be surprised if insurance companies did not also have relevant data.) Some direct measurement of fatigue could even be derived from assessment of PERCLOS (percentage of eyelid closure) data and indirectly from other measures. Data also will be collected on the driving environment. All of this information will exist for some subsets of the population of CMV drivers. In addition, considerable information collected on trucks and buses will provide data on crash-free driving, or exposure data. Some sources may even occasionally provide information on other characteristics listed in Table 10-1 that are needed to undertake a comprehensive assessment of the factors that cause crashes, such as health status, sleep,
diet and exercise habits, number of years employed, and demographics. While these data sources will not be representative of the full population of truck and bus drivers, they still could be useful in analyzing relationships for important subsets.
A large proportion of these data is currently proprietary. Of course, even if such data could be released in a protected form, the lack of standardized fields, terminology, and definitions would be problematic. To address this problem, these elements could be standardized and disseminated, and carriers could be asked to try to match the standard format prior to data submission. Efforts to make data sets more comparable would likely be a natural by-product of the establishment of such a collective database.
In such data sets, the names of the drivers and the fleets and even the specific routes would have to be suppressed to preserve anonymity. In addition, a variety of techniques—including error inoculation, data swapping, and creation of synthetic data sets—could be applied to further protect against disclosure (see Duncan et al., 2011). Also, as discussed earlier, research data centers could provide access only to approved researchers, and review and limit the data a researcher could remove from the premises. The panel is fairly certain that such techniques could be applied successfully to data sets from these and other similar sources to permit their use for research purposes.
It must be acknowledged that there are few examples of the use of such unstructured cooperation among the private sector, nonprofits, and government agencies to substitute for what should be mainly a federal data collection program. However, the magnitude of the existing data gaps makes it unlikely that these deficiencies will be addressed in the next several years without such cooperation. The fact that relevant data are currently being collected and could be shared with researchers with almost no chance of disclosure motivates the panel’s call for what is clearly a nonstandard approach to establishing a research database. In the short term, this appears to be one of the very limited possible means of acquiring this type of information that is collected by certain large carriers, as well as by ATRI.
RECOMMENDATION 5: The Federal Motor Carrier Safety Administration should incentivize those that capture driver performance data (e.g., large fleets, independent trucking associations, companies that collect telematics data, insurance companies, researchers) to increase the availability of those data relevant to research issues of operator fatigue, hours of service, and highway safety. Any such efforts should ensure that data confidentiality is maintained, perhaps through restricted access arrangements or use of statistical techniques for disclosure protection.
The data described in the preceding sections could be used to answer many important questions related to CMV driver fatigue and highway safety. This section focuses on two specific needs for additional data collection (and analysis): (1) data on exposure and (2) driver decision making.
One important need to support research in this area, described in greater detail in Chapter 5, is data on exposure (time spent on the road) and its relationship to other factors associated with driver fatigue and highway safety. Both exposure per hour of day—needed to compute crash rates and risk by time of day—and trip lengths and driving hours—as predictors of driver fatigue and crash risk—need to be investigated.
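The reasoning here can be made concrete with a brief sketch (all crash counts and mileage figures below are hypothetical): crash counts alone can be misleading, and dividing by exposure yields rates that are comparable across hours of the day.

```python
# Illustrative sketch: crash *counts* by hour are misleading without
# exposure data; dividing by vehicle-miles traveled (VMT) in each hour
# yields a rate that can be compared across times of day.
# All numbers below are hypothetical.

crashes_by_hour = {2: 30, 10: 90, 15: 120}        # crash counts per hour of day
vmt_by_hour = {2: 1.0e6, 10: 9.0e6, 15: 15.0e6}   # miles driven in that hour

def crash_rate_per_million_miles(crashes, vmt):
    """Crash rate per million vehicle-miles for each hour of day."""
    return {h: crashes[h] / (vmt[h] / 1e6) for h in crashes}

rates = crash_rate_per_million_miles(crashes_by_hour, vmt_by_hour)
# Although 2 a.m. has the fewest crashes in this sketch, it has the
# highest *rate* once the much lower nighttime exposure is accounted for.
```

This is why exposure data at fine levels of aggregation are a prerequisite for computing time-of-day risk rather than merely tabulating crash counts.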
RECOMMENDATION 6: The Federal Motor Carrier Safety Administration should work to improve the collection of and/or access to baseline data on driving exposure by including in its data collection efforts greater detail on the driving environment and by providing these data at low levels of geographic aggregation—even for individual highway segments. Comparisons enabled by the availability of these baseline data would benefit several proposed lines of new research.
Driver Decision Making
Many factors contribute to CMV drivers' decisions about whether to continue driving when they recognize they are fatigued. For example, the nature of drivers' compensation likely affects their assessment of the economic consequences (their pay) of stopping for needed rest. FMCSA is currently examining this issue in its study "Impact of Driver Compensation on Commercial Motor Vehicle Safety," a study the panel strongly supports. (The panel understands that a modification to the study design is being considered to address the possibility that the type of compensation scheme in use may be associated with other aspects of a fleet's operation; the panel supports this modification as well.) Factors that might be included in research on drivers' compensation are how work scheduling is carried out, whether time spent loading and unloading is treated as a separately paid component, how time spent waiting to load or unload is handled, and commuting time.
Drivers’ decision making—including their ability to determine whether they are too drowsy to drive safely—can be compromised if they are fatigued (see Mitler et al., 1988). Thus driver decision making could be affected by the availability and location of the nearest rest area, truck stop, or parking area and by the delivery deadline for picking up or dropping off the next load.
RECOMMENDATION 7: The Federal Motor Carrier Safety Administration should support research aimed at better understanding the factors associated with driver behavior related to fatigue and sleep deficiency, including what motivates drivers’ decisions about whether to continue driving when they feel fatigued.
Such research could encompass (1) barriers to healthy practices, such as compensation schemes; (2) whether management’s adoption of a safety culture is beneficial in reducing fatigued driving; and (3) the impact of education and training on the degree of fatigue a driver experiences, its causes, and the possible results of fatigued driving.
As discussed in Chapter 7, data collection in studies of hours of service, fatigue, and crash risk commonly makes use of relatively standard population study designs and statistical approaches, including case-control designs, regression adjustment, and nonequivalent comparison groups. These methods can permit the intrusion of confounding factors that coincide with the intervention of interest. More generally, research on highway safety and fatigue is commonly based on observational studies, which are prone to confounding influences related both to the indicator of receiving or having the "treatment" of interest and to the outcome under study (see Chapter 6). Researchers in this area, including FMCSA staff, therefore need to consider greater use of nonexperimental study designs that yield data better able to support clear causal statements. They also need to make use of statistical analysis techniques that make better use of data from existing observational studies, again to counter the influence of confounding factors.
An important point emphasized in this report is that truck and bus drivers are heterogeneous with respect to the types of driving they do (see Chapter 2). Therefore, the impact of changes in HOS regulations can vary significantly for different drivers. Further, the type of driving can affect
the degree to which a driver is fatigued. Thus both the nature of drivers’ work-rest schedules and the type of driving they do are likely important causal factors for fatigue and for crash risk that need to be represented in any statistical model of crash risk. One way to accommodate this need is for the analysis to be stratified by employment type, but other constructs are possible as well.
It should also be noted that, as described in Chapter 6, evaluating the effects of causes rather than the causes of effects is often a more feasible and more policy-relevant goal. For example, evaluating the effect of a program designed to reduce crashes is more feasible and policy-relevant than evaluating the underlying cause of a crash (see Rosenbaum, 2002, 2009; Shadish et al., 2002). Moreover, learning about the effects of causes can help provide some understanding of the causes of effects; for example, seeing that a fatigue management program reduces collisions provides some information on the extent to which crashes are caused by fatigue.
Study Designs and Associated Data Sets
Several different types of data collection are relevant to research on driver fatigue and highway safety. Table 5-2 in Chapter 5 indicates the advantages and disadvantages of these various data sources, which vary in cost; fidelity to field operations; and ease of investigating specific driving scenarios, such as crashes. The main sources at present are crash data sets; naturalistic driving studies; simulator studies; and vehicle instrumentation, including electronic logging devices. In considering which kind of data collection to undertake to answer a specific research question, one needs to acknowledge tradeoffs in terms of control versus real-world relevance. These tradeoffs will determine whether one conducts laboratory studies and simulations, which can focus on specific situations a driver might confront, or uses field operational tests, naturalistic driving studies, and crash data sets, which can focus on how a change will be implemented in the real world and for the general population of CMV drivers. The former, more controlled experimental designs can include more (simulated) crashes and drivers with and without specific predictors. The latter, less controlled designs may offer greater fidelity but are dominated by event-free driving. In some cases, one might need to have the advantages of both to address research questions related to specific driving circumstances but applicable to the general CMV driver population. The focus here is on how designs of naturalistic driving studies, simulator studies, field experiments, and observational studies (accident reports) can be improved to better support research going forward.
Improving the Utility of Naturalistic Driving Studies
As a primary source of data on driving during noncrash periods—referred to here as exposure data—naturalistic driving studies provide an important tool for research in this field. As described in Chapter 5, such studies provide the opportunity to collect extensive data on drivers who are engaging in their typical truck and bus operations. These data on noncrash driving are extremely useful, and when crashes do occur, their cause or causes can be readily determined. Thus, naturalistic driving studies occupy an important position between crash data and simulator studies.
Several issues could be addressed to increase the utility of naturalistic driving studies. First, as with many observational studies, naturalistic driving data are collected only from volunteers, so findings may not generalize to the entire population of truck and bus drivers. Second, because crashes are rare events, such studies often include fewer crashes than are needed to support estimation of exposure-outcome relationships with statistical models. This limitation motivates the use of high-kinematic events, or safety-critical events (SCEs), as surrogate outcomes of interest. Some of these kinematic events, such as hard braking or swerving to avoid a collision caused by another driver, may reflect a driver's alertness rather than his or her fatigue (or distraction). Therefore, these events are not necessarily appropriate surrogate outcomes for studies of fatigued driving. (Assessing whether an individual event is related to fatigue is complicated because driving behaviors such as speed selection, lane keeping, car following, and gap acceptance all can be involved; it is similarly unclear when a crash involves driver fatigue.) Another issue is that a large amount of video data is collected, and additional research is needed on how best to identify relevant subsets of these data. The following subsections consider how these issues might be addressed.
Self-selection. To generalize any observed findings to the complete population of truck or bus drivers, one would need to collect sufficient covariate information on both the participants in a naturalistic driving study and the parent population to which one would like to generalize. Such information would have to be sufficient to enable construction of a fairly strong predictive model of which drivers would and would not volunteer for such a study. Techniques that could then be used include those based on propensity scores, poststratification, and similar methods. Some of these techniques, described in Chapter 6, require information on the parent population that does not currently exist, but could exist if the panel’s Recommendation 1 on a regular sample survey were implemented.
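As an illustration of one such reweighting technique, the poststratification step can be sketched as follows, under the assumption (with purely hypothetical shares) that the composition of both the volunteer sample and the parent population is known for a relevant covariate, here an invented long-haul/short-haul employment split:

```python
# Illustrative sketch of poststratification: reweight naturalistic-study
# volunteers so the weighted sample matches the parent population's
# composition on a known covariate. The strata and shares below are
# hypothetical, not from any actual study.

population_share = {"long_haul": 0.60, "short_haul": 0.40}
sample_share = {"long_haul": 0.30, "short_haul": 0.70}  # volunteers skew short-haul

def poststrat_weights(pop, samp):
    """Weight for each stratum = population share / sample share."""
    return {s: pop[s] / samp[s] for s in pop}

weights = poststrat_weights(population_share, sample_share)
# Long-haul volunteers are up-weighted (0.60 / 0.30 = 2.0), and
# short-haul volunteers are down-weighted (0.40 / 0.70, roughly 0.57),
# so weighted estimates better reflect the parent population.
```

Propensity score methods generalize this idea by modeling the probability of volunteering as a function of many covariates at once; both require the population-level information discussed above.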
Use of safety-critical events (SCEs). The SCEs used as surrogates for crashes in naturalistic driving studies are in theory events that in slightly different circumstances would have resulted in a crash. In practice, these events often entail the driver’s causing strong g-forces on the vehicle’s passengers by sudden braking or sudden veering, frequently used to avoid a collision. However, some near-crashes are not high-g-force events since, for example, another driver may have used sudden braking to prevent a crash with one of the instrumented vehicles in the study. The decision as to which SCEs to include in an analysis of a given research question and the interpretation of results using such surrogate outcomes often are difficult and subject to the criticism that the SCEs used are not representative of the same phenomenon as crashes. Guo and colleagues (2010) showed that one can obtain biased estimates of parameters when near-crash data are added to crash data.
Moreover, not all crashes should be treated equally. Some may be relatively mild bumps into a curb, while others are much more substantial and result in much greater damage and injury. There also are at-fault and not-at-fault crashes. Certainly not all crashes, whether mild or severe, are reflective of problems with driver fatigue (perhaps as many as 90% are not). Therefore, the decision as to which crashes or SCEs are indicative of problems with fatigue and should be included as outcomes and which should not be used is quite complicated and depends on the research question of interest. The crucial issue here is the need to measure outcomes that are meaningful for the research problem under investigation.
The panel advocates use of the following principles to help determine the validity of utilizing specific types of SCEs as crash surrogates. Use of SCEs is warranted, first, if they can be shown to have causal factors identical to those of crashes and, second, if there is a strong correlation in their frequency over different driving scenarios (see Guo et al., 2010). As an example, Guo and colleagues (2010) found that near-crashes provided useful information for distraction risk assessment and that use of such near-crash incidents would generate conservative estimates of risks. And Kim and colleagues (2013) found that when a g-force was beyond a prespecified threshold, it was a good predictor of crash risk.
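The second principle, correlation of frequencies across driving scenarios, amounts to a simple computation once per-scenario counts are in hand; the sketch below uses invented counts for five hypothetical scenarios (e.g., road type by time-of-day cells):

```python
# Illustrative check of the second principle: across driving scenarios,
# a candidate SCE type should correlate strongly in frequency with
# crashes. All counts below are hypothetical.

crashes_per_scenario = [2, 5, 9, 14, 20]
sces_per_scenario = [40, 95, 180, 300, 410]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

r = pearson_r(crashes_per_scenario, sces_per_scenario)
# An r near 1 supports, but does not by itself establish, the validity
# of this SCE type as a crash surrogate; the first principle (shared
# causal factors) must also hold.
```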
Another reason to analyze SCEs, apart from their inclusion with crashes as outcomes of interest, is to predict future driving risk. It has been shown that high-g-force events and other SCEs are good predictors of driving risk (see Guo and Fang, 2013). Fleets (e.g., Lytx) also use telematics data on noncrashes to improve safety. Existing data sets, such as that of the Strategic Highway Research Project (SHRP) 2, can potentially be used to evaluate whether different types of SCEs are valid crash surrogates for the study of driver fatigue.
Finally, the issue of surrogate outcomes has a rich history in biostatistics and epidemiology, and recent advances in these fields could be examined for their relevance in refining understanding of the use of surrogate outcomes. (Good overviews can be found in Joffe and Green and in Wittes et al. Engineering approaches to this problem can be found in Jonasson and Rootzén and in Tarko.)
Use of feature extraction for video data. Research on feature extraction for video data is of continuing interest in computer science and other fields. Advances in this area could provide an alternative to kinematically defined SCEs for driving researchers. One would like to identify patterns in which behaving one way greatly lowers or raises the risk of a crash, and not behaving that way either does not affect the risk or moves it in the opposite direction. Since manually coding the continuous behavior of a driver throughout the many hours of a naturalistic driving study is extremely difficult, kinematic and similar events currently are used to identify short segments of the complete video capture for which one can code the behaviors that preceded an accident or an SCE. However, this is the same as examining crashes to assess the impact on crash risk of behaviors that preceded the crashes when one does not know how common those behaviors were during driving time when no crash or SCE occurred. It would be extremely helpful if software were developed that could alert the analyst to behaviors or situations that were good at discriminating between times of safe and unsafe driving.
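The pitfall noted above, judging a behavior from pre-SCE segments alone, can be made concrete with a small sketch (hypothetical counts): only by comparing against sampled baseline segments does a measure of association, such as an odds ratio, emerge.

```python
# Illustrative sketch (hypothetical counts) of why baseline prevalence
# matters: the share of SCEs preceded by a behavior says little by
# itself; comparing it with the behavior's prevalence in randomly
# sampled baseline (no-SCE) segments yields an odds ratio.

sce_segments = {"behavior": 30, "no_behavior": 70}       # pre-SCE video segments
baseline_segments = {"behavior": 10, "no_behavior": 90}  # random no-SCE segments

def odds_ratio(cases, controls):
    """Odds of the behavior in case segments relative to control segments."""
    odds_case = cases["behavior"] / cases["no_behavior"]
    odds_ctrl = controls["behavior"] / controls["no_behavior"]
    return odds_case / odds_ctrl

behavior_or = odds_ratio(sce_segments, baseline_segments)
# (30/70) / (10/90), roughly 3.9: in this sketch the behavior is far
# more common before SCEs than during ordinary driving. Without the
# baseline denominator, no such comparison is possible.
```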
Impressive research is currently being carried out on feature extraction from video. Two key research areas are inferring the degree to which a driver is fatigued from data captured by a camera focused on his or her physiognomy, and learning about the driver's external situation from a camera trained on the road ahead (for details, see Hebert). One challenge for such research is that any fleet considering installing cameras to collect such video data would likely use relatively low-cost cameras, which would not provide high enough resolution to see the driver's eyelids in all lighting conditions or to see much more than 100 feet ahead to identify anomalous events. It would therefore be difficult to determine whether a driver's actions, such as swerving or braking, were warranted.
Naturalistic driving studies, such as SHRP 2, collect thousands of hours of driving data, so human assessment of such data is nearly impossible. While the time during which no crashes (or no SCEs) occur is valuable for providing baseline comparisons, the density of features of interest during noncrash time may not be very high. One therefore has at least two options. First, one can look only at interesting events that are easy to search for automatically, such as high-kinematic events and crashes. This approach can provide hypotheses for behaviors and situations that
are risky. However, it cannot support assessments of increased risk since one is ignoring overall exposure data. Second, one can sample from the noncrash time periods and examine those periods for features of interest. Work by McDonald and colleagues (2013) greatly reduces the amount of video data by utilizing an alphabet indicating what actions took place. However, this data reduction approach requires that a human perform the interpretation and scoring, and until these tasks can be automated, its utility is somewhat limited. Therefore, it is difficult to determine what driver behaviors unanticipated by the researcher might be linked to increased crash risk. This is an important area for future research as cameras and pattern recognition techniques improve.
Improving Crash Reports and Data from Observational Studies in General
Most currently available data on traffic safety are collected in observational studies, which do not benefit from the relatively equal distribution of behaviors or characteristics across treatment groups that characterizes a randomized experiment. As Chakraborty and Murphy (2013) note, “In observational data associations observed in the data (e.g., between treatment and outcome) may be partially due to the unobserved or unknown reasons why individuals receive differing treatments as opposed to the effects of the treatments. Thus to conduct inference, assumptions are required.” Accordingly, these data sets are limited with respect to what can be learned about the impact of a change in a regulation or the institution of a new or modified policy or program.
To address this problem, various types of observational designs can be used to enable more valid inferences about what changes might reduce crash frequency. Further, certain techniques can be applied to data from observational studies to provide the needed balancing of confounders so that the comparison between the "treatment" and "control" groups approximates the comparison that would be possible with random assignment (see, e.g., Hernán and Robins, 2008). Examples of these approaches include encouragement designs, regression discontinuity, and interrupted time series designs. There are also many methods, such as difference-in-differences, propensity scoring, and instrumental variables, that can be used on observational data to account for unbalanced confounders. Chapter 6 introduces some of these techniques. Here the focus is on how they might be applied in the context of driving studies.
The approaches that could be attempted include both experimental and nonexperimental designs—studies in which a number of factors are randomly selected for or in which little random selection of factors is conducted, respectively. FMCSA could play a role in the design of experiments since policy interventions can be conducted in a way that facilitates
randomization. For example, if FMCSA were considering changes to HOS regulations, to the North American Fatigue Management Program, or to certified medical examinations, it could first pilot the changes by rolling them out in ways that would allow for rigorous evaluation, such as by randomly selecting truck drivers to receive some new technology.6
There are also many scenarios in which nonexperimental designs, which often are more feasible than experiments, could be used to help isolate the causal effects of interventions. For example, an existing variation in medical examiners’ policies regarding which drivers to approve could be used to look at the effects of the relevant health conditions on crash rates (under the assumption that those being evaluated were otherwise similar). This would be analogous to medical studies using physician prescribing preference as an instrumental variable for examining the effects of specific drugs on outcomes. Similarly, trends in, for example, crash rates for specific entities could be used for comparative interrupted time series designs comparing states or companies with and without policy changes.
In observational studies that compare a "treated" and an "untreated" group, one always needs to be careful about confounding due to differences between the two groups on baseline characteristics that are also associated with outcomes. For example, comparison of different compensation strategies could be problematic if the trucking companies that implement one strategy differ on other factors (such as driver tenure, types of trucks, types of routes, and driving conditions) from those that implement the control strategy. One often can adjust for such confounding variables by using methods such as propensity scoring. Another approach is a comparative difference-in-differences analysis, which is closely related to the comparative interrupted time series design. (See the discussion of methods in Chapter 6.)
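The core difference-in-differences contrast can be sketched in a few lines (all crash rates below are hypothetical): the change in the treated carrier's crash rate is netted against the change in a comparison carrier, so that a time trend common to both carriers is not mistaken for a policy effect.

```python
# Minimal difference-in-differences sketch for comparing a carrier that
# changed its compensation strategy ("treated") with a similar carrier
# that did not ("control"). Crash rates per million miles are invented.

treated = {"pre": 5.0, "post": 3.5}   # carrier that changed its policy
control = {"pre": 4.8, "post": 4.5}   # comparison carrier, no change

def diff_in_diff(t, c):
    """Change in the treated group net of the change in the control group."""
    return (t["post"] - t["pre"]) - (c["post"] - c["pre"])

effect = diff_in_diff(treated, control)
# effect = (3.5 - 5.0) - (4.5 - 4.8) = -1.5 - (-0.3) = -1.2
# crashes per million miles attributable to the policy change, under the
# key assumption that the two carriers would otherwise have followed
# parallel trends.
```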
Perhaps more relevant to the health and wellness issues discussed in Chapter 11 are sequential multiple assignment randomized trial (SMART) designs and observational study analogues, developed by Susan Murphy and her colleagues (see, e.g., Almirall et al., 2014). Given the heterogeneous nature of truck drivers and the likelihood that they will use a variety of interventions over time, researchers in this field may find that there are applications for this new class of experimental designs.
For example, consider a naturalistic study of the effects of one versus two nights of rest after a workweek for truck drivers. A driver would
6 For examples of other agencies and government groups using similar techniques, see http://www.mathematica-mpr.com/our-publications-and-findings/publications/smarter-better-faster-the-potential-for-predictive-analytics-and-rapidcycle-evaluation-to-improve [March 2016].
sometimes elect to drive after 1 day of rest and sometimes after 2 days. Here each driver is adapting the “treatment” based on his or her characteristics and previous experiences with each type of rest duration, as well as the immediately prior workweek, and these variations could be analyzed with a SMART design. The goal would be to identify optimal dynamic treatment regimes (i.e., the optimal rest patterns that adapt to previous rest). The panel believes this may be a fruitful approach for research.
Finally, as has been discussed, several different sources of data are relevant to research on driver fatigue and highway safety, and their use entails trade-offs in terms of control versus real-world relevance. Each source has advantages and disadvantages with respect to support for causal inference and generalizability. Researchers could consider combining these sources to retain their advantages and mitigate their disadvantages.
An example is the case in which policy questions can be divided into two or more components. Consider the question of whether any among several medications are fatigue inducing, such that those drivers using such medications at various dosages might be at greater crash risk. First, one might test in a simulator whether those drivers taking various doses of the different drugs had slower response times or performed worse in applying defensive driving techniques in induced crash-likely scenarios. The results would indicate which medications were worrisome and at what levels. Then, the question would be whether these findings could be generalized to the general CMV driver population and to real-world driving. To investigate that question, one might request a set of observational data from a large carrier listing the specific medications used by their drivers that had been found to be worrisome to see whether those drivers taking doses above and below the worrisome level had different crash frequencies. The simulator study would have helped reduce the chances of false positives and would have identified the medications and doses on which to focus. And to balance those drivers above and below the threshold dose for confounding factors for any given drug, techniques such as propensity scoring might be used. Much more generally, such phased approaches to research in this area might be useful.
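The observational phase of such a two-stage design reduces, at its simplest, to a rate comparison between the two dose groups; the sketch below uses invented figures and omits the confounder balancing (e.g., propensity score weighting) that a real analysis would require.

```python
# Sketch of the observational phase described above: compare crash
# frequency per mile for drivers above vs. below a dose threshold
# flagged in the simulator phase. All figures are hypothetical, and a
# real analysis would first balance the groups on confounders.

above = {"crashes": 12, "miles": 4.0e6}   # drivers above the flagged dose
below = {"crashes": 10, "miles": 8.0e6}   # drivers at or below it

def rate_ratio(exposed, referent):
    """Crash-rate ratio of the exposed group relative to the referent group."""
    r_exp = exposed["crashes"] / exposed["miles"]
    r_ref = referent["crashes"] / referent["miles"]
    return r_exp / r_ref

rr = rate_ratio(above, below)
# rr = (12 / 4e6) / (10 / 8e6) = 2.4 in this sketch: the higher-dose
# group crashes 2.4 times as often per mile, a signal that would then
# be probed for confounding.
```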
Assessment of New Technologies for Reducing Fatigued Driving
A previous section described how electronic data recorders, driver monitoring technology, and fatigue detection technology can be used to provide relevant data for research on crash risk. Obviously, these technologies, if successful, could also accomplish their intended purpose, which is to reduce the frequency of drowsy driving and enhance driver and vehicle safety. However, the claims for new technology can sometimes exceed the realized benefits. Therefore, the panel notes here some issues that need to be addressed in the proper evaluation of such devices.
First, it is necessary that any device designed to help avoid collisions by alerting fatigued drivers be adequately field tested, including whether and how alerts are communicated to the driver and/or the fleet and what incentives, if any, exist for complying with the warnings. Such testing is complex and requires a human engineering approach and a human-systems integration perspective, along with expertise in the study of fatigue. This is because identifying effective methods for alerting drivers in a way that will capture their attention requires determining what false-alarm rates most drivers would view as tolerable, as well as other aspects of the interaction between the driver and the system.
Furthermore, the panel believes such testing would need to adhere to the following two principles: (1) tests need to be carried out by third parties to reduce the opportunity for biased assessments, and (2) care is necessary in constructing valid comparison groups, as well as in studying the contrasts between those with and without the device in question (potentially including the use of such techniques as propensity score matching). To communicate what a valid testing scheme would entail, it would be helpful for FMCSA and the National Highway Traffic Safety Administration to develop and issue a joint report indicating what they view as necessary features of an effective testing program.
RECOMMENDATION 8: Using a human-systems integration framework, the Federal Motor Carrier Safety Administration and the National Highway Traffic Safety Administration, in consultation with the Centers for Disease Control and Prevention and the National Institutes of Health, should develop evaluation guidelines and protocols for third-party testing, including field testing, conducted to evaluate new technologies that purport to reduce the impact of fatigue on driver safety.
Complex Correlation Structures
Researchers often employ standard statistical models on crash data. One can find examples of the use of logistic regression (for, say, accidents versus no accidents) or modeling of the number of accidents using Poisson regression, sometimes zero-inflated to deal with a large percentage of zero values or adjusted to accommodate overdispersion. Because fatigue has transient effects on driving safety, analysis typically needs to be conducted at a detailed level, such as that of a trip or a driver. At
this level of analysis, standard statistical models can fail to accommodate the correlation structure that is typical of data from naturalistic and other observational studies. This correlation structure is due to the fact that observations from the miles traveled by a given driver, everything else being fixed, are more highly correlated with results from other miles traveled by the same individual than with miles driven by someone else. Similarly, observations on a given segment of highway, everything else being fixed, are more highly correlated with other driving on that segment than with results from other segments. Temporal correlations may also be present, with observations at a given point in time being highly correlated with observations from nearby points in time. Such correlation structures can be handled using models with random effects for individual drivers and for road segments or through the use of models that integrate time series components. Models of this type, known as mixed-effects or hierarchical models, warrant more frequent consideration in the analysis of CMV driving data. A study by Kim and colleagues (2013) is a relevant example from the passenger vehicle literature, in which the number of high-g-force events on a trip is modeled with driver random effects and a latent temporal structure.
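To make the motivation for driver random effects concrete, the following sketch (with entirely synthetic data; the rates are invented) simulates crash counts in two ways: from a single Poisson rate shared by all drivers, and from a gamma-Poisson mixture in which each driver carries a multiplicative random effect with mean one. The mixture produces the overdispersion, relative to the Poisson's variance-equals-mean property, that mixed-effects or negative binomial models are designed to capture.

```python
import math
import random

random.seed(42)

def poisson(lam):
    """Draw a Poisson variate via Knuth's algorithm; fine for small rates."""
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= l:
            return k
        k += 1

base_rate = 0.5  # expected crash-relevant events per driver-period (invented)

# (a) Homogeneous drivers: plain Poisson counts, variance ~ mean.
homogeneous = [poisson(base_rate) for _ in range(5000)]

# (b) Heterogeneous drivers: each draw applies a gamma-distributed
# multiplicative driver effect (shape 2, scale 0.5, so mean 1),
# i.e., a gamma-Poisson (negative binomial) mixture.
heterogeneous = [poisson(base_rate * random.gammavariate(2.0, 0.5))
                 for _ in range(5000)]

def dispersion(counts):
    """Variance-to-mean ratio; about 1 for Poisson, above 1 if overdispersed."""
    m = sum(counts) / len(counts)
    v = sum((c - m) ** 2 for c in counts) / (len(counts) - 1)
    return v / m

print("variance/mean, homogeneous drivers:  ", round(dispersion(homogeneous), 2))
print("variance/mean, heterogeneous drivers:", round(dispersion(heterogeneous), 2))
```

A standard Poisson regression applied to data like (b) would understate the uncertainty in its estimates; a model with driver random effects would not.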
Another approach to handling heterogeneity in driving behavior is through the use of latent variable or hidden Markov models. These models represent heterogeneity over time within a driver through the use of assumed hidden or latent states (e.g., a high-risk driving and a low-risk driving state) whose value is inferred as part of the statistical analysis. For example, Jackson and colleagues (2015) use a two-state hidden Markov model in the analysis of trip-level driving data from a naturalistic study of teenage drivers. Methodologies drawn from survival analysis, such as recurrent event models and time-to-event models, may also be useful for the analysis of driving safety data.
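As a hedged illustration of the two-state idea (not the actual model of Jackson and colleagues), the sketch below implements the forward algorithm for a hidden Markov model with a low-risk and a high-risk latent state, where the observed per-trip count of harsh-braking events is Poisson with a state-specific rate. All parameter values are invented for illustration; in practice they would be estimated from data.

```python
import math

# State 0 = low-risk driving, state 1 = high-risk driving (labels illustrative).
start = [0.8, 0.2]            # initial state probabilities
trans = [[0.9, 0.1],          # transition probabilities from the low-risk state
         [0.3, 0.7]]          # transition probabilities from the high-risk state
rates = [0.2, 1.5]            # Poisson rate of harsh-braking events per trip

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def forward(obs):
    """Log-likelihood of a sequence of per-trip event counts under the HMM,
    with per-step rescaling to avoid numerical underflow."""
    alpha = [start[s] * poisson_pmf(obs[0], rates[s]) for s in (0, 1)]
    loglik = 0.0
    for t in range(1, len(obs) + 1):
        c = sum(alpha)
        loglik += math.log(c)
        alpha = [a / c for a in alpha]
        if t < len(obs):
            alpha = [sum(alpha[r] * trans[r][s] for r in (0, 1))
                     * poisson_pmf(obs[t], rates[s]) for s in (0, 1)]
    return loglik

calm = [0, 0, 1, 0, 0]    # trips with few harsh-braking events
risky = [3, 2, 4, 3, 2]   # trips with many such events
print("log-likelihood, calm trip sequence: ", round(forward(calm), 3))
print("log-likelihood, risky trip sequence:", round(forward(risky), 3))
```

In a full analysis, the same forward recursion sits inside an expectation-maximization loop that estimates the transition and emission parameters, and posterior state probabilities can then flag periods of elevated risk.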
Mannering and Bhat (2014) provide additional discussion of methodological approaches that have been applied to driver safety studies, together with a large number of salient references.
Power of Studies
Studies of truck and bus drivers often must make use of relatively small sample sizes, and the panel was asked to comment on the assessment of power for such studies. Before doing so, it is worth noting that the framework proposed in this chapter emphasizes estimating the magnitude of the effects of different factors on crash risk or other safety outcomes. Estimation of effect sizes, supplemented with confidence intervals that convey the effects of sampling variability, is a useful way to summarize the information available
in a given study. Occasionally, statistical studies are summarized using a significance test; this approach is most appropriate in assessing whether a particular intervention provides any improvement over the status quo. If the goal is substantial improvement, however, significance tests alone will be insufficient, and a confidence interval will be needed as well. When a significance test is appropriate, the power of the study, given the sample size and the magnitude of the effect sought, needs to be stated explicitly.
If one is using the significance of a hypothesis test as support for a research finding that a factor is (likely) causal, one must be aware of both error rates associated with the test: the probability of observing a significant result when the factor has no effect, which is often set to a small value such as 1 or 5 percent, and the probability of not observing a significant result when the alternative hypothesis is true, that is, when the factor does have an effect. This second error rate is one minus the power of the test. The power of a hypothesis test is sometimes ignored when a study is being designed, but a test with power of less than, say, 75 percent carries at least a 25-percent chance of failing to find a significant result when a real effect exists. Therefore, it is important to design a study so that the power against the alternative hypothesis is sufficiently high for the results to be trusted. This section provides some general guidance on power analyses.
First, the alternative hypothesis often encompasses a range of possibilities. For example, the alternative hypothesis might be that a program had some beneficial effect in reducing driver fatigue, versus the null hypothesis that the program had no effect or a detrimental one. It will not be possible to obtain high power over the entire range of the alternative, since a program with a very small effect will be difficult to distinguish from one with no effect. Moreover, it may not be of practical importance whether a program has a very small effect or none at all. One therefore needs to decide what magnitude of effect is of practical importance and choose the sample size so that the power of the study is high (e.g., greater than 80 percent) for all effects of that magnitude or larger.
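The following sketch illustrates this kind of a priori calculation for a simple design: a one-sided two-sample z-test comparing crash proportions, computed with the usual normal approximation. The crash proportions (10 versus 5 percent), the significance level, and the grid of sample sizes are all invented for illustration; a real study would tailor these to the practically important effect size identified in advance.

```python
import math
from statistics import NormalDist

def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a one-sided two-sample z-test for proportions,
    with n observations per group, via the normal approximation."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)
    pbar = (p1 + p2) / 2
    se0 = math.sqrt(2 * pbar * (1 - pbar) / n)              # SE under the null
    se1 = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)  # SE under the alternative
    return nd.cdf(((p1 - p2) - z_alpha * se0) / se1)

# Hypothetical design question: how many drivers per group are needed to
# detect a drop in crash proportion from 10% to 5% with 80% power?
n = 50
while power_two_proportions(0.10, 0.05, n) < 0.80:
    n += 10
print("per-group sample size for 80% power:", n)
print("power at that sample size:", round(power_two_proportions(0.10, 0.05, n), 3))
```

Searching over the sample size in this way, before any data are collected, is what distinguishes a genuine power analysis from the post hoc calculations cautioned against below.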
In addition, calculating power post hoc is not the same as carrying out a power analysis prior to a study to determine whether it is worth conducting. Post hoc ("observed") power is computed from the observed effect size and is therefore a direct reflection of the significance result itself: a result that was significant merely by chance will still appear to have had high power. For details, see Hoenig and Heisey (2001).
Finally, in the case of observational studies, there is likely to be bias due to imbalance in various confounding factors. If a sensitivity analysis can provide a bound on the collective impact of the confounders, one would like to have sufficient power to reject the null hypothesis even allowing for confounding bias up to that bound (see Rosenbaum, Small and Rosenbaum, and Zubizarreta et al. for examples). The panel acknowledges that the ability to do this depends on having plausible ranges for the imbalance and impact of confounders. Those assumptions can be informed by research on the confounders and their associations both with each other and with the outcomes of interest. For details, see Hsu and Small (2013) and Shepherd et al. (2007).
As the panel has argued, research on the association between CMV driver fatigue and crash risk, while often praiseworthy, has not always reflected current statistical methods in either study design or analysis. Because the number of staff that FMCSA can devote to writing requests for proposals and to reviewing submissions is limited, and because it likewise has few statisticians on staff, instituting a peer review system would help in formulating requests for proposals for research projects that make greater use of the latest statistical methods, which, given the complex nature of data on crashes, often provide an advantage. Further, such a peer review system could be used to evaluate the resulting proposals and to monitor progress after awards are made. Examples of agencies with such peer review systems include the U.S. Department of Education and the Agency for Healthcare Research and Quality.
FMCSA also makes use of indefinite delivery/indefinite quantity (IDIQ) contracts to facilitate contract awards. The panel believes such IDIQs need to include a broader collection of researchers with statistical expertise. Finally, for investigator-initiated studies, FMCSA needs more infrastructure to support greater interaction with such researchers while they are designing and carrying out their studies.
RECOMMENDATION 9: The Federal Motor Carrier Safety Administration should make greater use of independent peer review in crafting requests for proposals, assisting in decisions regarding awards, and monitoring the progress of projects (including in the study design and analysis stages). Peer review should include expertise from all relevant fields, including epidemiology and statistics—especially causal inference—to address appropriate design and analysis methods.