National Academies Press: OpenBook

Integration of Analysis Methods and Development of Analysis Plan (2012)

Chapter: Chapter 5 - Work Plan Requirements

Suggested Citation: "Chapter 5 - Work Plan Requirements." National Academies of Sciences, Engineering, and Medicine. 2012. Integration of Analysis Methods and Development of Analysis Plan. Washington, DC: The National Academies Press. doi: 10.17226/22847.


This chapter outlines the team’s recommendations for elements to be included in proposals submitted in response to the S08 RFP. The work plan elements described reflect the key information needed to assess the merits of a particular research question or approach. These elements provide a common format and template for guiding the development of RFPs and the review of research proposals.

The following sections describe the recommended requirements for S08 work plans. Example work plans, which briefly demonstrate how a researcher might address all of the key elements in a proposal, are provided in Chapter 6.

Defining a Specific Research Question

SHRP 2 will determine the topic areas to be examined for Project S08. For each research topic to be examined, a specific research question will be addressed by each proposer. Therefore, it is recommended that the first section of an S08 proposal, regardless of the topic area, include a well-defined specific research question. This question will most likely fall within one of the proposed research topic areas and should include a rationale for its selection.

Each specific research question should take into account what crash types (e.g., ROR, rear end, intersection) will be examined. The crash type does not necessarily have to fit into the traditional crash categories. Regardless of the crash type chosen, the proposer needs to explain the potential safety benefit of the research. The proposer should consider the possibility that there may not be enough samples of the events to be considered (e.g., intersection crashes, lane departures, or exposure to roundabouts) in a 2-year NDS to capture meaningful outcomes related to a particular crash type. Hence, potential crash surrogate measures and the rationale for their selection should be proposed.
Value and Limits of Naturalistic Driving Data for the Proposed Research Question

It is very important that the proposal demonstrate the value of using naturalistic driving data to study the selected question, explain how using naturalistic driving data could offer substantial insights unavailable using other methods, and address why the question cannot be answered using another data collection method.

There are many ways to examine driver safety, and each data collection method provides a unique opportunity to address a specific research question. Data collection methods can range from high experimental control with low levels of external validity/realism to high levels of external validity/realism with no experimental control (see Figure 5.1). There are advantages and disadvantages to each. For example, bench laboratory studies, such as assessing driver choice reaction time, have extremely high experimental control and the ability to assess response time, but typically have little external validity/realism because they lack driving context. On the other extreme, crash data have high validity/realism but lack specific driver performance data and depend on the reporting of driving situations and environmental characteristics. Naturalistic driving studies offer strong external validity/realism but lack experimental control.

As a result, the use of naturalistic driving data is most appropriate for research questions that cannot be answered with more readily available methods. For example, a research question that attempts to relate crash rates to road geometry might be best answered by using existing crash and roadway data; it does not require the information about driver behavior that NDS data can supply. Similarly, a research question about where drivers need to look to safely traverse a horizontal curve may be more appropriate for a simulator study.

Figure 5.1. External realism and experimental control inhabit contradictory continuums. (Methods shown along the continuum: laboratory, simulation, test track, field experiments, event-triggered video, field operational tests, naturalistic driving, crash data.)

Considerations for Data Analysis Plans

This section outlines the major considerations for the development of data analysis plans. S08 researchers will need to consider what variables and data sampling plans are needed to answer their proposed research questions. Data variables can be selected based on driver, vehicle, roadway, or environmental information. Within each variable, the data can be segmented into continuous data, sequential blocks, and sample- or event-based data. A description of each method is provided below. Data analysis plans must consider and justify the selected variables, sampling plans, data aggregation methods, and analytical tools. These are critical aspects of research using naturalistic driving data.

Data Levels and Variables

While approximately a petabyte of data will be collected from the SHRP 2 instrumented vehicles, the full data set will not be available to S08 researchers. Thus, they will need to determine what data they need, how they will process or extract it in preparation for analysis, and how it will be formatted. Data storage and computational limits will also affect S08 researchers. Some analyses may require continuous data (i.e., the data collected at the highest sampling rate), others are likely to require data sampled around specific events, and still others may need the data that describe each trip. If events, features, or other triggered events are desired, the researcher must clearly define how these will be used to extract the required data. Proposers should demonstrate that they have the ability to work with spatial and temporal data sources. They should identify the challenges associated with naturalistic driving data that involve information such as roadway characteristics.

The data can be systematically separated into driver, vehicle, roadway, and environmental elements according to the model of the interactions between these factors (see Figure 5.2). Each of these elements is discussed below. The data collected from the SHRP 2 project will contain various data formats (e.g., video, numeric, and text contained in relational databases).

Driver characteristics include attention, perception, situation assessment, and motor control (Lee 2006). These characteristics vary among drivers and are influenced by individual differences such as age and driving experience. Drivers’ psychological functioning also varies across time as a function of fatigue and impairment caused by alcohol or drug use (these factors are identified as the driver state in Figure 5.2). In addition, nondriving-related activities, especially those leading to driver distraction, influence driver attention, perception, situation assessment, and motor control. Technology (e.g., cell phones, MP3 players, and Internet connectivity) enables a wide range of nondriving activities that can distract drivers. The effect of such technology on crash risk clearly depends on more than vehicle characteristics. It is very dependent on how drivers use and react to the technology according to the roadway characteristics (Lee 2006).

Vehicle characteristics also influence driver behavior (e.g., advanced braking systems influence the braking effectiveness of the driver). Rear-end collision avoidance systems have been shown to have a safety effect in reducing crash frequency (Lee et al. 2002). Other technologies, such as adaptive cruise control or crash warning systems, change the driving task more fundamentally and may lead drivers to disengage from the driving task and lose situation awareness (Stanton and Young 2005; Young and Stanton 2004). The interaction between the driver and the vehicle is a critical aspect of the safety of the driver–vehicle system. Although advanced vehicle technology represents a critical emerging issue for driving safety, assessment with SHRP 2 NDS data will depend on the market penetration of such systems at the time of data collection and the number of participants who have cars with such systems.

Roadway characteristics influence the safety consequences associated with changes of vehicle position on the roadway. A narrow lane or shoulder magnifies the safety consequences of a deviation from the center of the lane. Isolated roadway characteristics such as shoulder width and curve geometry can influence crash risk, but the interaction of these factors with driver characteristics may have a more powerful influence on risk. Lane width and shoulder treatment influence lane-keeping behavior, and drivers’ ability to anticipate curves based on signage, geometry, and other factors influences road-departure crashes. The overrepresentation of older drivers in intersection crashes also demonstrates how roadway characteristics interact with driver characteristics.

Environmental characteristics influence the safety consequences of driver characteristics, vehicle dynamics, the demands of nondriving tasks, and roadway characteristics (Lunenfeld and Alexander 1990). Environmental characteristics include traffic density, ambient lighting, weather conditions, and pavement surface conditions. These elements not only represent situational factors considered (or ignored) by the driver in planning behavior (or triggering errors), but they also define boundaries for the operation of the vehicle on a particular roadway (e.g., ice reduces the speed at which a vehicle can negotiate a curve without sliding). Thus, research questions must be framed to consider the relevant context of the driving environment.

Data Sampling and Aggregation Plans

Research questions can be examined in various ways depending on the chosen analytical method and will therefore require different data sampling plans. Each of the data elements described above can be sampled in different ways. Table 5.1 demonstrates how the data can be described in the proposal to address each specific research objective.
Some research questions may require data sampled from a specific cell of the matrix, while others may require a grouping of the cells. For example, to address a specific research question that relates vehicle kinematics to curves, only vehicle data would be needed at the event level when the events are defined as curves. However, a research question related to driver behavior in response to lane departures would also require event-based data related to the driver.

Table 5.1. Matrix of Data Elements and Data Sampling Levels

                 Data Sampling Strategies
Data Elements    Random Epochs    Event Epochs    Periodic    Trip
Driver
Vehicle
Roadway
Environment

Figure 5.2. The dynamic relationships between driver, vehicle, roadway, and environment and the resulting safety consequences. (Labels in the figure: Driver; perceptual, cognitive, and motor characteristics; driver state; distractions; nondriving activities. Vehicle; vehicle state; dynamics and crash characteristics. Roadway; geometry, vehicles, and control devices. Environmental; traffic density, weather, season, and time of day. Proximity to safety boundaries. Safety consequences; intersection incidents, lane departures, roadway departures.)

Figure 5.3 puts the specific sampling strategies in Table 5.1 into a more general context. Sampling can be defined in terms of the factors that define the period of interest and the way the data are aggregated from these periods. Often the period of interest can be defined in terms of a triggering condition or a window around an event. Figure 5.3 shows sample triggering conditions ranging from random instances and specific events (e.g., acceleration threshold exceedance or intersection traversal) to periodic samples that occur at a set number of seconds or miles traveled. The initiation of a trip can also serve as a triggering event.

Figure 5.3 also shows how the windows around events that define the data of interest can vary in length. An extremely small window will capture only the instantaneous state of the vehicle (as shown at the top of the figure); a long window can almost encompass the entire trip (as shown at the bottom of the figure). The data window for the trip excludes the start and end of the trip to protect the anonymity of the participant. As the second example in Figure 5.3 shows, the window around critical events is typically asymmetrical because more data are collected before the event than after the event. Some algorithms to identify driver state might use overlapping windows, as shown in the fourth example.

Once the data sampling strategy is characterized, the aggregation and transformation of the data over each sampled window will also need to be defined. If the instantaneous state of the driver, vehicle, roadway, or environment is of interest, then a single measure at the point of the event may be all that is required.
In the other sampling plans shown in Figure 5.3, the data associated with each variable from the sampled window will most likely need to be aggregated into a single descriptive number (e.g., the mean speed over the window or the standard deviation of the speed or lane position). In some instances the raw data are of interest, such as in the University of Minnesota study described above, in which microscopic models of driver behavior were fit to the continuous data.

The following paragraphs describe some possible combinations of data sampling and aggregation strategies. The number of combinations defined by the triggering condition, window, and aggregation strategies is very large, and the specific selection must be tailored to the specific research issue.

Continuous data encompass the raw data and are not aggregated. Figure 5.4 shows how continuous data could be used to construct a speed profile of a trip for a single driver. Some research questions of interest can only be addressed with this fine-grained detail.

Figure 5.3. Data sampling strategies (figure not to scale).

Figure 5.4. Example of a continuous data stream used to produce a speed profile.
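The windowing and aggregation steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than any SHRP 2 tooling: the 10 Hz rate, the synthetic speed trace, the event index, and the function names `window_around` and `summarize` are all assumptions made for the example.

```python
from statistics import mean, pstdev

SAMPLE_HZ = 10  # assumed sampling rate for this illustration


def window_around(samples, event_index, seconds_before, seconds_after, hz=SAMPLE_HZ):
    """Return the samples in a (possibly asymmetric) window around an event.

    The window is clipped at the start and end of the trace.
    """
    start = max(0, event_index - int(seconds_before * hz))
    stop = min(len(samples), event_index + int(seconds_after * hz) + 1)
    return samples[start:stop]


def summarize(window):
    """Aggregate one window into single descriptive numbers."""
    return {"n": len(window), "mean": mean(window), "sd": pstdev(window)}


# Hypothetical speed trace in m/s: 40 s steady, then 20 s of deceleration.
speeds = [25.0] * 400 + [25.0 - 0.1 * i for i in range(200)]

# Asymmetric window: more data before the (hypothetical) event than after it.
event_index = 450
win = window_around(speeds, event_index, seconds_before=10, seconds_after=2)
stats = summarize(win)
print(stats)
```

Setting both window lengths to zero returns only the sample at the event, which corresponds to the instantaneous-state case; lengthening them moves toward the trip-level case.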

Random epochs provide random snapshots of the state of the driver and vehicle. Data samples might be collected every minute, every 5 minutes, or every hour. For each random epoch, speed or lane position would be available for a specified period around that epoch. For example, for a specific epoch, a researcher may be interested in the 20 seconds before and after the epoch (see Figure 5.3). The analyst could aggregate these snapshots to generate mean speeds or other descriptive statistics.

Unlike epochs based on random or periodic sampling, event-based epochs provide a means for examining data for a predefined event or feature of interest. This method is useful for addressing research questions related to critical or noncritical events. For example, the number of lane departures at specific curves of specific radii may be considered at the event level (see Figure 5.5). Examples of events of interest include a lane departure, a headway distance of less than 20 feet, dialing a cell phone, or an intersection crash. A specific interest may also be a feature, such as curves greater than 500 or 1,000 feet or mountainous roads with no guardrails. As with all the other types of data-sampling schemes (except continuous), the researcher will need to develop a plan for reducing the data to these triggers, events, or features of interest.

Periodic sampling provides data aggregated for a given block of time (e.g., seconds, minutes, or hours) that can include measures of interest such as mean speeds or standard deviation of lane position for each block of time (see Figure 5.6). This level of data sampling provides researchers with more manageable data sets while still capturing insights for research questions that require observations over time. As observed in Figure 5.6, the blocks of time associated with the period of interest may actually overlap, and the data analyst will need to take this overlap into account.
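As a rough sketch of periodic sampling, the following Python snippet aggregates a signal into fixed-length blocks and shows how a step shorter than the block length produces overlapping blocks. The lane-offset values and the names `periodic_blocks` and `block_summaries` are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def periodic_blocks(values, block_len, step):
    """Yield (start_index, block) pairs; step < block_len gives overlapping blocks."""
    if block_len <= 0 or step <= 0:
        raise ValueError("block_len and step must be positive")
    for start in range(0, len(values) - block_len + 1, step):
        yield start, values[start:start + block_len]


def block_summaries(values, block_len, step):
    """Mean and standard deviation for each block of samples."""
    return [
        {"start": start, "mean": mean(block), "sd": pstdev(block)}
        for start, block in periodic_blocks(values, block_len, step)
    ]


# Hypothetical lane-position offsets (m) from the lane center.
lane_offset = [0.0, 0.1, 0.2, 0.1, 0.0, -0.1, -0.2, -0.1]

non_overlapping = block_summaries(lane_offset, block_len=4, step=4)
overlapping = block_summaries(lane_offset, block_len=4, step=2)
print(len(non_overlapping), len(overlapping))
```

Because overlapping blocks share samples, their summaries are not independent observations, which is one reason the overlap has to be accounted for in the analysis.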
Trip-level samples in which the trip start time and end time define the duration of the trip will also be available to S08 researchers. Figure 5.7 is an example of trip-level data.

Figure 5.5. Event epochs of roadway departures at given speeds for curves with different radii.

Figure 5.6. Periodic sampling with hourly collection of mean speeds.

Figure 5.7. Sample trip-level data. (Axes: trip number; mean speed in m/s.)

Coding Video Data

Because transforming video into numerical data that can be included in an analysis can require significant resources, this process merits particular attention. Before embarking on the analysis of video data, such as eyeglances, the researcher should consider the magnitude of work that may be involved. Depending on the research question, various elements may need to be reduced from the video data that will require either manual reduction or the development of some automated functions by S08 researchers. Potential elements of interest that may be reduced from the data include information related to driver distraction, as well as roadway, environmental, lane position, and traffic operation elements.

Although most eyeglance data reduction is automated, manual reduction may occasionally be required. Such manual coding allows for a more refined, context-specific analysis. Specialty analyses might be required around events of interest such as extreme acceleration, crashes, or near crashes. Other incidents of interest may be related to dialing on a cell phone, typing or reading text messages, or a positive indication on the passive alcohol sensor.

Visual scanning measures of the interior of the vehicle and the view outside of the vehicle will be accomplished by placing cameras throughout the vehicle as specified by VTTI (Figure 5.8). From these videos of the driver’s eyes, face, and head, eyeglance direction can be estimated. Eyeglance estimations demonstrated in research conducted at the University of Iowa and the Crash Avoidance Metrics Partnership (Angell et al. 2006) classified glances into nine zones (see Figure 5.9). A trained data reductionist can take the context of an eyeglance and assign it a position in one of these nine areas:

1. Forward (road scene);
2. Center rearview mirror;
3. Up (including visor and road scene);
4. Left (including outside mirror);
5. Right (including outside mirror);
6. Steering wheel/cluster area (includes meters, e.g., speedometer, tachometer);
7. Center stack (e.g., radio, CD player);
8. Down (below steering wheel and center stack); and
9. Other (all glances that are not assigned to one of the above eight zones, e.g., a glance toward passengers).

Figure 5.8. VTTI example video views (based on presentation material from SHRP 2 July 2009 meeting).

Figure 5.9. Nine eyeglance zones.

Video with assigned areas of interest can be created with specialty packages such as Noldus Observer or custom software that enables analysts to view and code VTTI digital videos frame by frame (i.e., 1/30th of a second). Video reductionists may also want to use a programmable jog-and-shuttle video editor to help expedite data entry and scrolling to specific locations of interest. Jog-and-shuttle keyboards have programmable keys so that single keystrokes can be made for each of the physical visual locations of interest.

The level of effort for coding manual data can vary substantially across projects and data sources. For estimating purposes, manual video reduction may take anywhere from five to six times as long as the actual video clip. For example, if a video clip of interest is 30 seconds long, it will usually take 150 to 180 seconds to manually code the clip.

VTTI’s MASK software is expected to be able to automatically and accurately code a high percentage of the eyeglance data. Manual coding will only be required when the automatic coding is not able to code the data or when specific behaviors need to be logged. The effort associated with this will depend on the precision and reliability of the MASK system.

From this more precise frame-by-frame coding, the frequency and duration of eyes-off-road can be computed for each event of interest, and various descriptive statistics can be applied to these reduced data. Specific variables of interest are the number and duration of glances to complete a task. Total glance times can then be computed for each task.

Issues Related to Time-Dependent Variables

The NDS data that will be collected will have inherent time dependencies that will occur at all sampling levels from seconds to weeks. Several time domain methods could be used to account for this dependency, including moving averages, exponential smoothing, autoregressive moving average models, and distributed lags analysis. Frequency domain methods (e.g., Fourier transforms) can also be used to examine time-dependent outcomes.

When considering a time-series analysis, researchers will need to demonstrate an understanding of how to examine the underlying patterns (e.g., trends, serial correlation, seasonal effects, and residuals) in the data, because analyses that do not account for these patterns may be confounded and lead to inappropriate conclusions. For example, if a researcher is considering an autoregressive moving average model for the analysis plan, trends and cyclical effects anticipated from the segmented data should be identified.
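Two of the time domain methods named above, together with a simple check for serial correlation, can be sketched as follows. The daily event counts are invented for illustration, the smoothing constant is arbitrary, and the function names are hypothetical; a real analysis plan would justify each of these choices.

```python
def moving_average(series, window):
    """Trailing moving average; the first value covers series[0:window]."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, initialized with the first observation."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def lag1_autocorrelation(series):
    """Sample autocorrelation at lag 1, a basic check for serial correlation."""
    n = len(series)
    m = sum(series) / n
    num = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, n))
    den = sum((x - m) ** 2 for x in series)
    return num / den


# Hypothetical counts of events (e.g., lane departures) on consecutive study days.
events_per_day = [12, 9, 8, 11, 8, 7, 10, 7, 6, 9, 6, 5]

ma = moving_average(events_per_day, window=3)
es = exponential_smoothing(events_per_day, alpha=0.5)
r1 = lag1_autocorrelation(events_per_day)
print(ma[0], ma[-1], es[:3], r1)
```

In this made-up series the smoothed values fall over time, which is the kind of trend a proposer would be expected to identify before fitting a model that assumes a stationary series.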
The researcher’s explanation should include an example of a time-series plot of anticipated events (e.g., number of text messages, cell phone calls, or even lane departures). Figure 5.10 shows an example plot of events over several study days. A preliminary observation of this plot indicates the existence of a downward trend and the appearance of cyclical patterns (with a higher number of events at the start of each day). As random variation in data is inevitable, researchers should identify how they will observe emerging patterns within a time period (e.g., by using moving averages, exponential smoothing, or some other technique). The data may also exhibit cyclical or seasonal patterns that will need to be considered in the model by differencing or accounting for the order of the data.

Figure 5.10. Number of events per day.

Issues Related to Spatially Dependent Variables

In addition to time dependency, certain spatial effects will be inherent in the data collected in the SHRP 2 NDS. An analysis with intrinsic geographic, geometric, or topological relationships needs to account for the spatial dependencies related to roadway characteristics, travel routes, or region. Spatial relationships can be defined and measured in many ways (e.g., travel start and end points, travel distances, crash or incident locations, and the relationship between crash or incident locations). Researchers need to address these issues in their proposal or describe the analytical methodology that accounts for these factors.

Crash- or incident-migration behavior is an example of a research area that requires an understanding of spatial relationships. Several spatial methods can be used to address this complexity, including spatial cluster analysis, T-squared sampling, artificial neural networks, and geographically weighted regression.
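One common way to check whether values observed at nearby locations tend to be similar is global Moran's I. The chapter does not name this statistic; it is used here only as an illustrative sketch of measuring spatial dependence. The coordinates, incident counts, distance threshold, and the function name `morans_i` are all assumptions made for the example.

```python
from math import hypot


def morans_i(points, values, threshold):
    """Global Moran's I with binary weights: w_ij = 1 if distance <= threshold."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    num = 0.0
    w_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if d <= threshold:
                num += dev[i] * dev[j]
                w_sum += 1.0
    den = sum(d * d for d in dev)
    if w_sum == 0 or den == 0:
        raise ValueError("no neighbors within threshold, or values are constant")
    return (n / w_sum) * (num / den)


# Hypothetical incident counts at four sites spaced one unit apart on a line.
sites = [(0, 0), (1, 0), (2, 0), (3, 0)]
clustered = morans_i(sites, [10, 10, 0, 0], threshold=1)    # similar neighbors
alternating = morans_i(sites, [10, 0, 10, 0], threshold=1)  # dissimilar neighbors
print(clustered, alternating)
```

Positive values indicate that neighboring sites tend to have similar values and negative values indicate the opposite; the choice of neighbor definition (here a simple distance threshold) is itself something a proposal would need to justify.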

In assessing the appropriate spatial technique, the proposal should explain the types of geospatial data that will be used and how these data will affect the segmented data that will be used in the analysis. Researchers should take care to recognize that some spatial data analysis techniques (e.g., T-squared sampling) may not be appropriate for transportation research because of constraints in intersection configuration. Techniques that have demonstrated usefulness in transportation research include spatial cluster analysis (Miller and Wentz 2003) and spatial autocorrelation-corrected regression modeling (LaScala et al. 2000). A cluster is a group of data points that exhibit some similarity to some subset of the rest of the data and dissimilarity to all of the other data (Jacquez 2008). There are many methods for determining clusters and evaluating the statistical difference between them. However, this is only one of many possible techniques. In general, the researcher needs to explain and justify the spatial method chosen. Because GIS data will be needed to complete the spatial analysis, proposers should document how the databases will be combined and the data analyzed.

Model Formulation

Analysis models developed for Project S08 will most likely be formulated with both static and dynamic variables. Within each sampling plan, some variables will generally be static for an entire trip or study period, while other variables will vary greatly within trips. Thus, variables’ characteristics can range along a continuum from static to dynamic. One would expect static variables such as the driver’s age and gender, the vehicle make and model, and the region to remain unchanged at all sampling levels. At the epoch level (random or event), road type, pavement markings, and visibility should also be static. (An event epoch is a 5- to 15-second period surrounding a notable state change.)
In contrast, the characteristics of dynamic variables can change during the time period of interest. For example, a driver might be distracted for only part of a trip; hence distraction is a dynamic variable within a trip. At the trip level, unlike at the epoch level, the curvature of the road, the pavement markings, and the speed of the vehicle may change.

The distinction between dynamic and static is based on the time constant of the variables relative to the time period over which the data are aggregated. A variable could be dynamic in one sampling plan but static in another. For example, at the trip level, there are no static roadway variables because the roadway is expected to change continuously. However, a researcher might wish to examine event epochs such as lane departures; for these events roadway variables such as road curvature and pavement markings are static. Similarly, environmental variables are expected to be static at the event epoch level. Visibility can change on the order of minutes, making it a dynamic variable at the trip level, but at the event level visibility is considered static because it is likely to change minimally if at all.

While some variables are essentially constant (e.g., driver age), others are constantly changing (e.g., speed without cruise control). Still other variables will change at different rates, some gradually, others essentially instantaneously. Thus, the research question must explicitly define the level of sampling for the analysis (see Chapter 6). It is then necessary to identify which variables of interest are static and which are dynamic within these levels. The answer may differ from level to level; for example, speed may be constant in some, variable in others.

Defining Crash Surrogates

Although crashes will occur in this study, their rarity makes it difficult to directly address most safety issues whose analysis requires crash data. Hence, crash risk will have to be estimated using crash surrogates.
Crash surrogates represent events that are equivalent to crashes except that a crash was avoided. Researchers have identified variables to define potentially viable crash surrogates. Some of these are described in S01 reports (e.g., lane departures or a specific time period for roadway departure, and drops below a threshold value of time to lane crossing). Project S08 proposers will need to define appropriate crash surrogates and a sampling approach to ensure there are enough surrogate events to justify the analytical approach. They must also demonstrate that the surrogate event belongs to the same equivalence class as the crash it is used to represent. Justifying membership in the same equivalence class as a crash is currently based on the somewhat informal judgment that a crash would have occurred had the driver not intervened. Analytic justification of crash surrogates is a critical issue facing the interpretation of naturalistic driving data.

The proposer will also need to demonstrate the relationship of the surrogate measure to the safety outcome being examined. Examples of this include the association between close headway distance and the likelihood of rear-end crashes or the relationship between variation in lane position or speed for alcohol-, fatigue-, or distraction-related crashes. Considering drivers’ dynamic adaptation at the level of the second or even millisecond also has important consequences for defining safety surrogates. For example, when examining driver adaptation, one might expect a linear relationship between lane-keeping performance, lane departure, and roadway departure. A perspective that acknowledges driver adaptation may clarify the factors that lead to safety boundary violations and how these factors affect drivers’ subsequent ability to recover and avoid a more severe incident. The crash surrogate should also support interventions for roadway and vehicle design, as well as changes in driver behavior that might be induced through training, policy, or regulation.

Identification and Justification of Analytical Approach

The range of approaches to data sampling makes selecting the appropriate statistical technique a challenge. The traditional techniques commonly used in simulator studies are often inappropriate for the temporal and spatial nature of data generated by naturalistic studies. Each research question can be answered from different analytical perspectives depending on whether an outcome of interest is to be discovered, confirmed, or further explored. S08 project proposers will need to justify the analytic technique they plan to use and explain why the chosen technique is the best choice to address the specific research question. For example, a researcher may be interested in examining the factors that relate to speed propensity in order to address the research question, “How does speeding behavior influence the likelihood of a crash?” A potential technique could be factor analysis, in which latent variables can be uncovered from several seemingly unrelated variables. The proposer would have to justify why factor analysis was superior to other techniques, such as cluster analysis or principal components analysis. Finally, proposers should also justify the sample size and how they will validate the model or approaches that they propose.

Pitfalls and Limitations That May Be Encountered and How to Address Them

Potential limitations associated with using the SHRP 2 NDS data will also need to be discussed in the Project S08 proposals. For example, if a proposal uses an uncommon crash surrogate, finding an answer to a research question may not be feasible because the data may be insufficient.
Initial results from the VTTI 100-car study suggest that approximately 10,000 cases (crash surrogates and baseline events) are likely to be needed to achieve sufficient statistical power. Therefore, proposers should address sample size and power issues, how the power of an analysis might affect their ability to reach conclusions, and how much they expect to rely on surrogates. In other cases, there may be substantial challenges in linking data sets, such as linking GIS data to the vehicle data. These limitations of data availability, data format, and linkages between data sets require that researchers demonstrate their understanding of the issues and provide a plan for how they will be addressed.

It should be noted that the discussion here involves some early assumptions about data availability and formatting. Significant additional information should be available by the time the first round of S08 RFPs is issued.

Data Format

Initially, VTTI will be the steward of the SHRP 2 NDS data. They will also manage access to the data by S08 researchers. While some quality assurance will be performed, the naturalistic driving data that will be provided will essentially be in the form of raw video and data streams from sensors located in the instrumented vehicles. The list of data elements that will be reported from the in-vehicle sensors is included in the Project S05 final report. It is expected that the data from vehicle sensors will be reported at 10 Hz, even if the data are collected at a higher or lower frequency.

In addition to providing the raw continuous data, it is expected that VTTI will aggregate vehicle sensor data to the trip level. These data can then be manipulated and aggregated to various levels by S08 researchers to suit their particular purposes.
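As a rough illustration of the two steps described above (reporting a sensor stream on a fixed 10 Hz grid, then aggregating it to the trip level), the following minimal sketch may be useful. It is not the procedure VTTI uses: the function names, the choice of a zero-order hold (carrying the most recent reading forward to each grid point), and the summary fields are all assumptions made for this example.

```python
from statistics import mean


def resample_to_grid(readings, rate_hz=10.0):
    """Place one trip's (time_s, value) readings on a fixed-rate grid.

    Assumption for illustration: a zero-order hold, i.e. each grid point
    takes the most recent reading at or before that time.
    """
    readings = sorted(readings)
    if not readings:
        return []
    t0 = readings[0][0]
    n_ticks = int(round((readings[-1][0] - t0) * rate_hz)) + 1
    grid, i = [], 0
    for k in range(n_ticks):
        tick = t0 + k / rate_hz
        # Advance to the latest reading that is not after this grid point.
        while i + 1 < len(readings) and readings[i + 1][0] <= tick + 1e-9:
            i += 1
        grid.append((round(tick, 6), readings[i][1]))
    return grid


def summarize_trip(grid):
    """Collapse a gridded stream into hypothetical trip-level measures."""
    values = [v for _, v in grid]
    return {
        "n_samples": len(grid),
        "duration_s": grid[-1][0] - grid[0][0],
        "mean_value": mean(values),
        "max_value": max(values),
    }


# Speed readings (seconds, m/s) collected at an irregular, lower rate.
raw_speed = [(0.0, 10.0), (0.25, 20.0), (0.5, 30.0)]
speed_10hz = resample_to_grid(raw_speed)
trip_level = summarize_trip(speed_10hz)
```

A different resampling rule (for example, linear interpolation, or averaging when the sensor runs faster than 10 Hz) would change the trip-level values, which is one reason the level and method of aggregation need to be stated explicitly.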
However, it is important to recognize that although the data will be available in several formats (continuous or raw data, reduced data sets, and trip-level data), S08 researchers who require the data at any other level of aggregation will need to perform the aggregation themselves or request that the reduction be performed by VTTI. Some data reduction may also be necessary to prepare the data for use in the appropriate format. As a result, researchers should describe and justify the level of data aggregation desired for their specific research question. Additional details on data sampling levels are provided in the next sections.

Several sources of data will be available to S08 proposers and could be used in conjunction with the data collected from the instrumented vehicles. The roadway data will consist of existing data sets, as well as data collected as part of mobile mapping data collection in Safety Project S04B, Mobile Data Collection. Existing data sets may include roadway centerline and attribute data, crash data, aerial imagery, roadway weather information system data, video log data, automatic traffic recording data, and archived weather data. Researchers for Safety Project S04A, Roadway Information Database Development and Technical Coordination and Quality Assurance of the Mobile Data Collection Project, will organize existing data and reduce collected mobile data sets into a database format to be determined by the S04A team. Although the final format for this database is currently unknown, S08 proposers may assume that the final data sets will be provided in the form of a GIS. It is expected that preliminary lists of data elements to be collected will be available for release with the S08 RFP.

S08 researchers should address the methods they will use to transform data into the final format required to answer their specific research question. This explanation includes demonstrating the scope of work required to accomplish this. Proposers should also demonstrate their ability to understand how to use and manipulate the various other data sets. Finally, they should articulate the limitations inherent in these data sets.

Data Availability

Because the field data collection will not be complete until 2013, the initial S08 researchers will have only a few months of naturalistic driving data available to them. Additional data will become available while the S08 projects are under way.

There will also be some limitations on the availability of roadway data. Assuming S04A researchers are selected and under contract by March 2010, existing data sets will be acquired from NDS sites by December 2010. While it will be necessary to process these databases into consistent formats, the data sets may be usable by S08 researchers in the interim. Mobile mapping data collection will commence in approximately mid-2011, and results from initial sites may be available early in 2012.

Thus, S08 researchers initially will not have access to final formatted roadway data sets. If access is provided to existing data sets before formatting and processing are completed, S08 researchers may need to overlay and link instrumented vehicle data with the roadside data, and they should demonstrate that they understand how to do so. They will also need to consider other sources for the necessary data until roadway data sets are available. For instance, some roadway features can be determined from the instrumented vehicles’ forward video. Aerial imagery such as that available on Google could be used, although it is likely to require additional data reduction. In brief, S08 researchers will need to demonstrate their understanding of the potential limitations in data availability and have a plan to address associated uncertainties.

Linking Between Data Sets

Researchers should address how different data sets will be linked to combine or extract information. As noted above, there are a number of existing data sets that S08 proposers may want to consider, including those listed in Chapter 6. It is expected that all databases will be available in a spatial format that can be linked and manipulated in a GIS. It is important that proposers planning to use these data sets identify how this data linking will be accomplished. Researchers may wish to use additional databases outside those collected by Project S04A. It is recommended that these proposers consider submitting a letter of collaboration or agreement from the organization that maintains the external data confirming that data can be accessed.

Documentation of Results and Data Warehousing

This section describes the elements that should be addressed regarding documentation of the research results and data sharing between contractors and the data warehouse.

Documentation of Research Results

Proposers should outline the way research results will be reported to SHRP 2 according to the set guidelines for report formats. If the results of the research will be presented in technical briefs, conference presentations, or journal articles, this should also be discussed. If an analysis tool is developed, researchers will be expected to fully describe the tool and discuss its availability to other researchers.

Warehousing and Data Sharing

New data sets are anticipated from the S08 studies. Proposers should describe their plan for preparing the data they have aggregated or extracted so that it can be accessed by other researchers. The plan should include what data will be available; descriptions of how data will be reduced, extracted, or aggregated; and a data dictionary describing the data elements.

Data sharing between the S08 researchers and the data warehouse is a significant issue that must be carefully addressed. Details of the data-sharing agreements must be determined and mechanisms created for data transfers. An important concern with data sharing is protecting the privacy of participants. Any data sharing must conform to the requirements of the institutional review boards (IRBs) of the organizations involved. Privacy issues are particularly acute if face video or identifying global positioning system (GPS) data are needed. Data use may require that researchers physically travel to the site that hosts the data.

Another potential concern involves data reduction. Because raw data require substantial processing before they can be interpreted, the algorithms for data reduction play a critical role in deriving meaning from the data. Extracting seemingly simple measures such as distance to the vehicle ahead or time to lane crossing can involve substantial processing of radar and video data. Such processing might involve proprietary algorithms or data coding that make it difficult or impossible to replicate studies.

Expected Outcome

The outcomes of the research conducted as part of the S08 projects should relate to the goals of SHRP 2. More specifically, S08 proposals should be able to clearly demonstrate that project outcomes directly contribute to enhancing driving safety. It is the understanding of the Project S02 team that S08 projects should result in outcomes that can be directly used by engineers and policy makers to guide decision making. This may include information that

• Leads to better selection and application of roadway or vehicle design and countermeasures; and
• Leads to more informed regulation and policies.

The focus of SHRP 2 is on identifying safety interventions, not on developing statistical methods or understanding driver behavior not specifically related to safety. Given that resources are limited, research projects with the potential for large safety impacts are vital for SHRP 2 goals. Potential safety-related outcomes of S08 projects include identifying factors related to possible reductions in crash fatalities, reductions in crashes, improved policies and infrastructure, and improved driver licensing protocols (e.g., graduated driver licensing). It is strongly recommended that S08 researchers include a section that clearly delineates the expected safety outcomes of their project, the stakeholders affected, and the method by which research results will be transmitted to relevant stakeholders.

TRB’s second Strategic Highway Research Program (SHRP 2) Report S2-S02-RW-1: Integration of Analysis Methods and Development of Analysis Plan provides an analysis plan for the SHRP 2 Naturalistic Driving Study (NDS) to help guide the development of Project S08, Analysis of In-Vehicle Field Study Data and Countermeasure Implications, and to assist researchers planning to use the SHRP 2 NDS data.

This publication is only available in electronic format.
