National Academies Press: OpenBook
Suggested Citation:"Chapter 3 - Data Collection." National Academies of Sciences, Engineering, and Medicine. 2014. Analysis of Naturalistic Driving Study Data: Roadway Departures on Rural Two-Lane Curves. Washington, DC: The National Academies Press. doi: 10.17226/22317.


Chapter 3 - Data Collection

This chapter describes the process for obtaining and reducing the different NDS data sets used in the present analysis. A further description of the data reduction effort for each specific analysis is included within each corresponding section.

Identification of Data Needs

Before requesting data, the research team identified the desired driver, roadway, and vehicle characteristics necessary to answer the stated research questions. To accomplish this, the team conducted a number of literature reviews regarding factors related to rural roadway departures and researched the impact of various countermeasures. This information and the team's expertise were used to develop a list of roadway, driver, and environmental data elements necessary to answer the stated research questions. An in-depth summary can be found in Hallmark et al. (2011). In addition, the team completed an assessment of the SHRP 2 NDS and RID data to determine the likely sources of data and the accuracy of that data. This list of desired data elements was used to guide the data requests, which are described in the following section.

Data Requests

New Institutional Review Board (IRB) and data-sharing agreements were necessary to initiate data requests. Both the Center for Transportation Research and Education at Iowa State University (CTRE/ISU) and University of Iowa (UI) teams obtained the appropriate IRB approval from their respective home institutions and then submitted data-sharing agreements to the Virginia Tech Transportation Institute (VTTI).

Both the NDS and RID data were still being collected at the time this analysis was being conducted. In addition, the NDS and RID data had not yet been linked at the time the data requests were initiated. As a result, it was necessary for the CTRE/ISU team to manually identify potential curves of interest using the method described below.
The team focused on Florida (FL) for Phase 1 because NDS and roadway data collection were the most advanced in that area at the time Phase 1 commenced. Phase 2 included North Carolina (NC), Indiana (IN), New York (NY), and Pennsylvania (PA) because those states had the greatest share of rural roadway data. Table 3.1 provides a summary of the data that were ultimately collected for the RID by state.

To identify potential curves of interest, the project team made use of weighted section maps that VTTI prepared during the early stages of the NDS data collection to help the team working on SHRP 2 Safety Project S04A, Roadway Information Database Development, to focus mobile mapping on roadway sections where NDS trips were occurring. A data-sharing agreement and data request were made so that the weighted section maps could be used to identify curves of interest.

The team manually reviewed the weighted section maps and the RID and identified segments of rural two-lane paved roadways with curves. Rural was defined as approximately 1 mile from an urban or built-up area. A segment consisted of a continuous stretch of roadway with no major changes in roadway cross section. Each segment included one or more curves. Each segment also included a tangent distance of at least 0.5 miles upstream or downstream of the first or last curve included in the respective segment.

A buffer was created around each segment so that trips through the segment could be identified by VTTI. A buffer was used so that vehicle activity passing along the corresponding roadway section could be "clipped" out using GIS overlay functions. Trips through a buffer are referred to in this report as traces because the term trip is used in the NDS to indicate a journey from origin to destination. Traces are thus portions of a trip. An example of a buffer section and corresponding traces is shown in Figure 3.1.
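The clipping of trips into traces described above can be illustrated with a small sketch. This is not the GIS workflow the team or VTTI used; it is a minimal pure-Python approximation that assumes GPS points have already been projected to planar coordinates in meters, that a segment is represented by a centerline polyline, and that the buffer is a fixed half-width around that centerline. The function names, the 15 m half-width, and the sample coordinates are all hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the line segment a-b (meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_buffer(p, centerline, half_width):
    """True if p lies within half_width of any piece of the centerline."""
    return any(point_segment_distance(p, a, b) <= half_width
               for a, b in zip(centerline, centerline[1:]))

def clip_trace(trip_points, centerline, half_width):
    """Keep only the trip points that fall inside the buffer (the 'trace')."""
    return [p for p in trip_points if in_buffer(p, centerline, half_width)]

# Hypothetical segment centerline and one trip passing along it.
centerline = [(0.0, 0.0), (100.0, 0.0), (200.0, 50.0)]
trip = [(-80.0, 3.0), (10.0, 2.0), (90.0, -4.0), (150.0, 27.0), (150.0, 120.0)]
trace = clip_trace(trip, centerline, half_width=15.0)
print(trace)
```

In practice a GIS package would perform this overlay on geographic coordinates; the sketch only shows why a buffer lets vehicle activity along one roadway section be separated from the rest of a trip.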
Identified segments were evaluated and were removed from further consideration when they included turning or passing lanes in the curve, stop- or signal-controlled intersections, or a significant portion in which the surrounding area appeared to be more urban than rural. Fourteen buffers and 32 curves were identified in Florida in Phase 1, and the remaining were identified in Phase 2. A total of 217 buffers and 739 curves were identified, as shown in Table 3.2. Segment locations identified for Phases 1 and 2 (Florida, New York, Pennsylvania, North Carolina, and Indiana) are shown in Figure 3.2.

A sampling plan was developed after initial analyses in Phase 1, and a total request for about 1,000 traces was planned for Phase 2. This request was based on an estimate of an ideal sample size for the statistical methods identified for the research questions and was balanced against what the CTRE/ISU and UI teams could reasonably accomplish in Phase 2. The resources needed for VTTI to complete the request were also considered in selection of the sample size.

Data were requested in three separate requests across the two phases, as described in the following sections. Each trace represents one trip through one buffer segment by one driver. Only traces in which the driver traversed the entire segment were requested, because in other traces the driver may have turned onto or off of the segment.

VTTI typically first provided a data file with output from the in-vehicle data acquisition system (DAS) along with static driver/vehicle characteristics such as age, gender, and vehicle type. The CTRE/ISU team reviewed the raw DAS file to determine whether sufficient data were available for variables of interest; VTTI then provided a video with a forward and rear view of the roadway only for these traces. The data elements to be provided were specified in the data request.

Driver face and steering wheel position/hand position videos were viewed and reduced by the team at the VTTI secure data enclave, as described in Chapter 4.

Information such as vehicle speed, acceleration, pedal position, and wiper blade position was provided in the data file.
Additional information about available variables is provided in Chapter 4, but an example of a raw data file is shown in Table 3.3. Data were provided at 10 Hz (0.1 s intervals). GPS location was also provided so that the data could be imported into a geographic information system (GIS) program and overlaid with the RID and aerial imagery. These data are referred to as time series DAS data. Video and time series data are linked using timestamps.

An example of the video views available is shown in Figure 3.3. The forward and rear roadway video views were provided along with the time series data and could be viewed in-house. The driver face and steering wheel/hand position videos could only be viewed at the VTTI secure data enclave. A still cabin view was also available at the enclave, which showed a blurred view of the cabin that could be used to indicate passengers.

Table 3.1. Rural/Urban Split for RID Data Collection by State

Study State | Miles Collected for RID | Rural/Urban Split
FL          | 4,366                   | 45% rural/55% urban
IN          | 4,635                   | 64% rural/36% urban
NC          | 4,558                   | 59% rural/41% urban
NY          | 3,570                   | 68% rural/32% urban
PA          | 3,670                   | 83% rural/17% urban
WA          | 4,277                   | 31% rural/69% urban

Table 3.2. Location of Buffer Segments

Study State | Buffers | Curves
IN          | 80      | 375
NY          | 71      | 173
NC          | 20      | 58
PA          | 32      | 101
FL          | 14      | 32
Total       | 217     | 739

Figure 3.1. Example of traces through buffer segment. Source: World_Imagery (Esri, DigitalGlobe, GeoEye, i-cubed, USDA, USGS AEX, Getmapping, Aerogrid, IGN, IGP, swisstopo, and the GIS User Community).

Figure 3.2. Location of identified segments by study state (panels: Florida, New York, Pennsylvania, North Carolina, Indiana). Source: World_Topo_Map (Esri, HERE, DeLorme, TomTom, Intermap, Increment P Corp., GEBCO, USGS, FAO, NPS, NRCAN, GeoBase, IGN, Kadaster NL, Ordnance Survey, Esri Japan, METI, Esri China [Hong Kong], swisstopo, MapmyIndia, © OpenStreetMap contributors, and the GIS User Community).

Data Request 1 (Phase 1)

The first data request was in Phase 1 and was made in the initial stages of the NDS and RID data collection. Only Florida was included in this request because collection of NDS and RID data was the most advanced in Florida at that time. Using the trip maps provided by VTTI, 50 rural two-lane curves in Florida were identified. Buffers representing geographic boundaries around each curve were developed and submitted to VTTI. After the initial data request was made (May 2012), it was determined that data were only available for eight buffer sections in Florida, which resulted in data for 14 curves because some buffers contain more than one curve.

The team requested that all vehicle activity through each identified buffer area be provided. The specific data elements from the DAS were included in the data request. The team worked with VTTI via e-mail and phone to refine the data request. Members from the CTRE/ISU and UI teams also visited the secure data enclave in Blacksburg, Virginia, in May 2012. This visit provided the opportunity to actually see the data and get a sense of the quality of data. This was particularly important for the UI team, whose members would later make a second visit to reduce data from driver videos.

The team received almost 400 traces. Traces were imported into a GIS program and reviewed, and some were removed from the data set because of the following issues:

• Potentially identifiable data were present.
• Drivers turned onto or off the roadway of interest within the curve.
• Construction zones were present.
• Traffic control was present within the curve (not identified earlier).
• Lane position was not available or was highly unreliable.
• Forward or face video data were missing (indicated by VTTI).

Figure 3.3. Example of video views. Source: VTTI.

Table 3.3. Raw Data Output System.
Time | vtti.accel_x | vtti.accel_y | vtti.accel_z | vtti.pedal_gas_position | vtti.gyro_y | vtti.gyro_x | vtti.wiper | vtti.gyro_z | speed
205 | 0.0116 | -0.0087 | -1.0063 | 12.54902 | 0 | -0.3252 | | -0.3252 | 23.33335
206 | 0.0174 | -0.0174 | -0.9976 | 12.54902 | 0 | -0.3252 | | -0.3252 | 23.33335
207 | 0.0203 | -0.0058 | -0.9947 | 12.54902 | -0.3252 | 0 | | -0.3252 | 23.33335
208 | 0.0319 | -0.0174 | -1.0092 | 12.54902 | 0.325195 | 0 | | -0.3252 | 23.05557
209 | 0.0029 | -0.0174 | -0.9976 | 12.54902 | 0 | -0.3252 | | -0.3252 | 22.7778
210 | 0.0261 | -0.0029 | -0.9918 | 12.54902 | 0 | -0.65039 | | 0 | 22.7778
211 | 0.0145 | 0.0029 | -0.9947 | 12.54902 | | | | | 22.7778
212 | 0.0058 | 0.0029 | -0.9976 | 12.81046 | 0 | 0 | | 0 | 22.7778
213 | 0.0203 | -0.0232 | -0.9715 | 13.46406 | -0.65039 | 0 | | -0.3252 | 22.7778
214 | 0.0029 | -0.0232 | -0.9831 | 13.92157 | 0 | 0 | | 0 | 22.7778
215 | 0.0145 | -0.0116 | -0.9831 | 14.31373 | 0 | 0 | | -0.3252 | 22.7778
216 | 0.0145 | -0.029 | -1.0034 | 15.09804 | 0 | -0.65039 | | -0.3252 | 22.7778
217 | 0.0232 | -0.0203 | -1.0005 | 15.55556 | 0.650391 | -0.65039 | | -0.3252 | 22.7778
218 | 0.029 | -0.0145 | -0.9802 | 16.33987 | -0.65039 | 0 | | -0.3252 | 22.7778
219 | 0.0174 | -0.0116 | -0.9715 | 16.60131 | -0.97559 | 0 | | -0.3252 | 22.7778
220 | 0.0058 | -0.0261 | -1.0034 | 16.86275 | 0 | 0 | | -0.65039 | 22.7778
221 | 0.0261 | -0.0261 | -1.0063 | 17.12419 | | | | | 22.7778
222 | 0.0145 | -0.0116 | -1.0295 | 17.25491 | 0.650391 | -0.3252 | | -0.65039 | 22.7778
223 | 0.0348 | -0.0116 | -0.9947 | 17.25491 | 0 | 0 | | -0.3252 | 22.7778
224 | 0.0377 | -0.0232 | -0.9686 | 17.25491 | -0.65039 | 0 | | -0.3252 | 22.7778

After removing traces with problematic data, a total of 137 initial usable traces through the various curves were identified. Researchers realized that requesting one forward roadway view for each segment would have allowed the team to identify locations where roadway conditions had changed or construction was present. This detail would have better guided the data request and was used in Phase 2.

Data Request 2 (Phase 2)

After data were collected and reduced for Phase 1, the data requests were refined. A sampling plan was developed based on the expected number of samples necessary for the statistical analyses, the time and resources available to reduce the data, and the ability of VTTI to provide the data in a timely manner given that the NDS data collection was concurrent with this research effort. Reduction of the driver face video at the VTTI secure data enclave was expected to be the limiting factor. It was originally estimated that about 1,000 traces could be reduced.

Given that this research project was one of the first applications of the NDS data, and based on the experience with the data in Phase 1, it was expected that unknown issues were likely to arise that would alter the sampling plan. As a result, two data requests to VTTI were planned for Phase 2. The first (Data Request 2) was for about 200 traces. The goal was to reduce these data, identify additional issues, and then use this information to make a more targeted final data request of about 800 traces (Data Request 3).

The 203 buffer segments identified for Phase 2 were provided to VTTI as ArcGIS shapefiles. When the second data request was completed (early November 2013), only 10% to 15% of the available NDS data had been processed. Trips were found for about 80 of the buffers in this data request. One forward-view video for each of the 80 buffers was provided and reviewed by the CTRE/ISU team.
Review of a single forward view for a roadway segment provided information such as presence of construction or other changes not evident in the RID or Google View. Issues were found with four buffers, and they were removed. VTTI identified about 1,455 traces across the remaining 76 buffers.

The CTRE/ISU team then worked with VTTI to set criteria for selection of about 200 of these traces for the second data request. The criteria are summarized in the following general terms:

• Step 1. Exclude traces in which the driver does not traverse at least 75% of the buffer. In some cases, the driver turns onto or off the selected segment; these instances do not constitute through trips.
• Step 2. Exclude traces in which speed or lateral position was problematic or not working or in which GPS appeared problematic.
• Step 3. Excluding traces identified in Step 1 or 2, select traces in which the following conditions are met:
  - Side or forward acceleration ≥0.3 g;
  - Speed ≥100 km/h (62 mph);
  - A crash/near crash had occurred; and
  - High alcohol readings were present.
  The intention of this step was to identify locations where a potential roadway departure had occurred or where other driver behaviors of interest were present.
• Step 4. From the remaining traces, select traces to balance age and gender for a total of 200 traces. When possible, include traces in which pedal position and steering wheel position are identifiable (both variables are used to determine when a driver reacts to the curve).

VTTI and the CTRE/ISU and UI teams communicated back and forth to set filters for the above conditions. However, there was no easy method to identify when lane offset was not reliable. Unless only null values are present, it is not a simple task to determine that the lane-tracking system is not functioning properly or is producing erroneous data.
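The four selection steps listed above can be sketched in code. The following is an illustration only, not the filter that VTTI or the research team implemented: the `Trace` fields, the reading of the Step 3 conditions as "any one condition qualifies," and the round-robin balancing in Step 4 are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Trace:
    # All field names are hypothetical summaries of one trace.
    trace_id: str
    fraction_traversed: float  # share of the buffer driven, 0..1
    sensors_ok: bool           # speed, lateral position, and GPS look usable
    max_accel_g: float         # largest side or forward acceleration, in g
    max_speed_kmh: float
    crash_or_near_crash: bool
    high_alcohol: bool
    age_group: str
    gender: str

def is_viable(t):
    """Steps 1 and 2: through trips (at least 75% of buffer) with working sensors."""
    return t.fraction_traversed >= 0.75 and t.sensors_ok

def is_of_interest(t):
    """Step 3, read here as: any one of the listed conditions qualifies."""
    return (t.max_accel_g >= 0.3 or t.max_speed_kmh >= 100.0
            or t.crash_or_near_crash or t.high_alcohol)

def select_traces(traces, target):
    viable = [t for t in traces if is_viable(t)]
    selected = [t for t in viable if is_of_interest(t)]
    cells = defaultdict(list)
    for t in viable:
        if not is_of_interest(t):
            cells[(t.age_group, t.gender)].append(t)
    # Step 4: take one trace per age/gender cell in turn until the target is met.
    while len(selected) < target and any(cells.values()):
        for key in sorted(cells):
            if cells[key] and len(selected) < target:
                selected.append(cells[key].pop(0))
    return selected[:target]

traces = [
    Trace("A", 1.0, True, 0.35, 80.0, False, False, "16-24", "F"),
    Trace("B", 0.5, True, 0.10, 70.0, False, False, "16-24", "M"),   # fails Step 1
    Trace("C", 0.9, False, 0.10, 70.0, False, False, "65+", "F"),    # fails Step 2
    Trace("D", 0.9, True, 0.10, 70.0, False, False, "16-24", "M"),
    Trace("E", 0.8, True, 0.10, 70.0, False, False, "65+", "F"),
    Trace("F", 1.0, True, 0.10, 70.0, False, False, "65+", "F"),
    Trace("G", 1.0, True, 0.10, 70.0, False, False, "16-24", "M"),
]
picked = [t.trace_id for t in select_traces(traces, target=3)]
print(picked)
```

The round-robin draw is only one way to balance driver characteristics; the report does not state how the balancing was carried out.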
The DAS does provide a variable for the probability that the lane-tracking system is correctly interpreting right- or left-side lane markings. This probability ranges from 0 to 1,024, with higher values indicating a higher probability. However, no guidance is currently available regarding when the probability is low enough that the data should be discarded. As a result, it was difficult to know at what point to set the threshold. A threshold of 500 was set in Phase 1 to indicate reliable versus unreliable data and was refined to 512 in consultation with VTTI for Phase 2. Similar problems were present in determining when data such as speed or acceleration were valid.

Due to resource constraints, it was difficult for VTTI to check the data to determine how to better set filters and identify traces in which key output such as speed or offset was not available or reliable. It was decided that the most expeditious way to get data was for VTTI to provide the CTRE/ISU team with a data file for each of the 1,455 traces. Each spreadsheet contained DAS data such as position, offset, speed, and acceleration. The CTRE/ISU team reviewed all of the data and selected 200 traces for the second data request. Once the 200 traces of interest were identified, VTTI provided the forward and rear roadway videos.

Data Request 3 (Phase 2)

The team intended to use steering wheel position to indicate drowsy driving and to identify the point at which drivers began reacting to the curve. The research team also intended to include a subset of potentially impaired drivers, as indicated by the alcohol sensor. However, steering wheel position data were much less available than expected because this variable could not be downloaded by the DAS in certain types of vehicles. Significant noise was also present in the alcohol measure, and it was not certain whether potentially alcohol-impaired drivers could be identified. As a result, presence of alcohol was not used as a filtering criterion. In addition, it was decided not to bias the sample toward vehicles for which steering wheel position data were available, because that might have resulted in oversampling of certain vehicle types.

By mid-March 2014, about one-third of the NDS data had been processed. VTTI queried the processed data using the provided buffers and identified an additional 2,647 traces. It was again determined that the most expeditious way to move the data request forward was for the CTRE/ISU team to review all of the available data and select a subset of about 800 traces. A Microsoft Excel macro was developed that summarized speed, lane position probability, side acceleration, and forward acceleration for each of the traces.

One forward-view video was also requested for each segment so that any unusual situations such as construction or recent changes to the roadway could be identified and those segments excluded.

Traces meeting the exclusion criteria in Step 1 or Step 2 were identified and removed from further consideration. About 60 of the viable traces met the criteria for side acceleration, forward acceleration, or speed in Step 3 and were selected. The remaining viable traces were sorted into a matrix by curve and driver characteristics. An additional 720 traces were selected to balance curve and driver characteristics, resulting in a total of 787 traces. VTTI provided the forward and rear roadway view for these traces. During later processing some additional issues were identified with the data files, resulting in some additional attrition.

Summary of Data Received and Limitations

Three separate data requests were made, as described in the previous section.
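The per-trace screening summary described for Data Request 3 (speed, lane position probability, side acceleration, and forward acceleration) might look roughly like the following. The team used a Microsoft Excel macro; this Python version is a stand-in written for illustration. The dictionary keys and sample rows are hypothetical, and the default threshold of 512 is the Phase 2 lane-tracking probability cutoff mentioned in the text.

```python
def summarize_trace(rows, prob_threshold=512):
    """Summarize one trace; rows are dicts with possibly missing (None) values."""
    def present(key):
        return [r[key] for r in rows if r.get(key) is not None]

    speeds = present("speed")
    probs = present("lane_prob")
    side = present("side_accel_g")
    forward = present("forward_accel_g")
    reliable = sum(p >= prob_threshold for p in probs)
    return {
        "n_rows": len(rows),
        "mean_speed": sum(speeds) / len(speeds) if speeds else None,
        "max_speed": max(speeds) if speeds else None,
        # Rows with a missing probability count as not reliable.
        "share_lane_reliable": reliable / len(rows) if rows else 0.0,
        "max_abs_side_accel_g": max((abs(a) for a in side), default=None),
        "max_abs_forward_accel_g": max((abs(a) for a in forward), default=None),
    }

# Four hypothetical 10 Hz rows from one trace.
rows = [
    {"speed": 20.0, "lane_prob": 600, "side_accel_g": 0.10, "forward_accel_g": 0.02},
    {"speed": 22.0, "lane_prob": 512, "side_accel_g": -0.35, "forward_accel_g": -0.10},
    {"speed": None, "lane_prob": 100, "side_accel_g": 0.00, "forward_accel_g": 0.00},
    {"speed": 24.0, "lane_prob": None, "side_accel_g": 0.05, "forward_accel_g": None},
]
summary = summarize_trace(rows)
print(summary)
```

A one-row-per-trace summary of this kind lets a reviewer sort several thousand traces quickly, which matches the purpose the text gives for the macro.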
A total of 137 traces were identified and reduced for Phase 1. In Phase 2, about 900 additional traces were determined to be viable after data screening and quality assurance were conducted, as described in the previous section. The data sets provided in-house for each trace included the following:

• One Excel file with GPS location and vehicle kinematic data;
• One forward video; and
• One rear video.

A total of 739 curves were initially identified and supplied in the full data request. Data were available for some curves, but trips did not traverse the entire segment. Additionally, only about one-third of the full NDS data set was available for query at the time the final data request was made. As a result, data were only available for 148 curves, which limited the number of curve characteristics that could be represented.

Although a large number of potential trips were ultimately available during Phase 2, there were a number of issues with the data (as is expected with this type of data collection). As a result, only a subset of traces was viable. The following issues with some of the time series data were encountered:

• Missing values. In these cases, data such as speed are missing for all or portions of the trace.
• Repeat values. In these cases, the same value is repeated over multiple rows. This error is easy to identify because even travel at constant speed will produce minor fluctuations in values from row to row.
• Erroneous values. Values are reported incorrectly. This is usually evidenced by unusually high or low values. For instance, lateral acceleration is >0.3 g for 30 rows.
• Irregularly reported values. In most cases, variables such as speed, acceleration, lane offset, and pedal position are reported or averaged at 0.1-s intervals. In a number of cases, values were missing for a number of rows (e.g., reported every eighth interval). If the values are correct but less frequent, they can still be used in event-level analyses, but they are problematic when time series data are needed.
• Data not available for all vehicles. Because the DAS interfaces with the vehicle computer, some data, such as steering wheel position, could not be downloaded from all vehicles. Steering wheel reversal can be used to indicate traces in which drivers may have been drowsy and to indicate at what point a driver began reacting to the curve, but this information was only available for a fraction of vehicles.
• Sensor accuracy unknown. The accuracy of the head pose, lane offset, and alcohol sensor data had not been reported at the time data requests were made. Some indication of lane offset accuracy could be determined using lane line probability, plotting the data, and reviewing the forward view. Head pose could not be confirmed, and there was some indication that the alcohol sensor was not reporting consistently. As a result, neither the head pose nor alcohol data were included in any of the analyses.

Issues with the various video views include the following:

• Views are blurry due to glare and other factors.
• Views are missing.
• The driver's face or eyes cannot be seen because of glare, sunglasses, or other reasons.

Many of the variables were critical to the analyses, so it was important to screen out problem traces. Several attempts were made to set filters with VTTI so that problematic data were not included. However, as explained in the description of data requests, the filtering process could not be fine-tuned as much as needed, so the CTRE/ISU team ended up reviewing a large number of traces that ultimately were not viable. This problem will likely have been addressed by the time final data quality assurance has been conducted by VTTI.

In some cases, a trace could be used for one research question but not others because necessary data were missing. For instance, pedal position was irregularly reported or missing in a number of traces. Consistent values were needed for Research Questions 1, 2, and 4 but not for Research Question 3. Research Questions 2 and 4 required consistent, accurate lane-tracking information, which further reduced the available traces. As a result, more traces were available for Research Question 3 than for the other research questions. The data request and reduction process is shown in Figure 3.4.

Figure 3.4. Data request and reduction process.

• Received 4,102 raw traces. Evaluated key variables.
• Removed traces when key variables not functioning.
• Developed comparison matrix by driver age/gender and curve characteristics.
• Identified traces of interest (123 total):
  - acceleration ≥ 0.3 g.
  - speed ≥ 100 km/h.
• Requested forward view for traces of interest (123), and an additional 864 traces selected to balance driver/roadway characteristics.
• Driver/glance location reduced for 515 traces.
• Due to reliability of data, not all traces were used for all research questions.
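The time series issues listed in the summary above (missing, repeated, erroneous, and irregularly reported values) lend themselves to simple automated screening. The sketch below is illustrative rather than the team's procedure: it assumes one variable from one trace is held as a list with `None` for missing rows, treats the "30 rows" example as consecutive rows, and leaves the choice of cutoffs to the analyst. All function names and sample values are hypothetical.

```python
def missing_share(values):
    """Fraction of rows with no reported value."""
    return sum(v is None for v in values) / len(values) if values else 0.0

def longest_repeat_run(values):
    """Length of the longest run of identical consecutive reported values."""
    best = run = 0
    prev = None
    for v in values:
        if v is not None and v == prev:
            run += 1
        else:
            run = 1 if v is not None else 0
        best = max(best, run)
        prev = v
    return best

def longest_run_above(values, limit):
    """Length of the longest run of consecutive rows with |value| > limit."""
    best = run = 0
    for v in values:
        run = run + 1 if v is not None and abs(v) > limit else 0
        best = max(best, run)
    return best

def reporting_gaps(values):
    """Row gaps between consecutive reported values (1 means every row)."""
    idx = [i for i, v in enumerate(values) if v is not None]
    return [b - a for a, b in zip(idx, idx[1:])]

# Hypothetical columns from one trace sampled at 10 Hz.
speed = [22.8, 22.8, 22.8, None, 22.7, 22.6]
lateral_accel_g = [0.10, 0.35, 0.40, -0.31, 0.20]
pedal = [12.5] + [None] * 7 + [13.0] + [None] * 7 + [13.4]

print(missing_share(speed))                     # share of missing speed rows
print(longest_repeat_run(speed))                # repeated-value run length
print(longest_run_above(lateral_accel_g, 0.3))  # rows above 0.3 g in a row
print(reporting_gaps(pedal))                    # irregular reporting pattern
```

A trace could then be flagged for manual review when, for example, the longest run above 0.3 g approaches the 30 rows mentioned in the text, or when the reporting gaps are consistently larger than one row.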

TRB’s second Strategic Highway Research Program (SHRP 2) Report S2-S08D-RW-1: Analysis of Naturalistic Driving Study Data: Roadway Departures on Rural Two-Lane Curves analyzes data from the SHRP 2 Naturalistic Driving Study (NDS) and Roadway Information Database (RID) to develop relationships between driver, roadway, and environmental characteristics and risk of a roadway departure on curves.
