Suggested Citation:"Chapter 3 - Findings and Applications." National Academies of Sciences, Engineering, and Medicine. 2015. Naturalistic Driving Study: Linking the Study Data to the Roadway Information Database. Washington, DC: The National Academies Press. doi: 10.17226/22200.


Chapter 3
Findings and Applications

As each file was processed, the results of the matching solution were written to a database table referred to as the Linking Table. For each link traversed, the table lists the link's unique LINK_ID (shared with the RID), the timestamp within the file when the vehicle entered the link, the timestamp when the vehicle left the link, and the vehicle-to-node distance at the start of the link. Each timestamp in the table is that of the GPS point closest to the start (or end) of the link. The following sections describe the findings from the Matching Process and the data developed through the process.

Linking Table

The primary output of the matching work is the Linking Table. It lists the LINK_IDs on which participants drove, the timestamps at the beginning and end of each link traversal, and the file that holds the rest of the vehicle data describing these epochs. Researchers can use this table to index into the vehicle data for analysis of driving data. The Linking Table contains approximately 305 million rows. See Table 3.1 for a small sample of the Linking Table.

Fields in the Linking Table

The fields in the Linking Table are defined as follows:

• FILE_ID. The unique number identifying the trip file within the SHRP 2 NDS database. One trip file is typically one key-on to key-off ignition cycle, but sometimes trips are divided into multiple files.
• SEQUENCE_NO. A within-file sequence number identifying the order in which the LINK_IDs were traversed in the file.
• LINK_ID. A permanently unique ID for a roadway link within the digital map databases. A road or road segment is made up of multiple links.
• TIMESTAMP_START. The time (ms) into the file at which the vehicle began traversing the link.
• TIMESTAMP_END. The time (ms) into the file at which the vehicle finished traversing the link.
• VEHICLE_TO_NODE_DIST. The estimated straight-line distance (ft) between the latitude/longitude location reported by the vehicle GPS when passing closest to the start of the link and the latitude/longitude location of the start of the link in the digital map. This value provides a measure of how closely the vehicle GPS output is following the digital map.

Table 3.1. Linking Table Sample

FILE_ID   SEQUENCE_NO   LINK_ID     TIMESTAMP_START   TIMESTAMP_END   VEHICLE_TO_NODE_DIST
1000019   13            771927354   181461            191461          29.91060377
1000019   14            771927355   191461            192461          31.48441077
1000019   15            122666697   192461            202461          25.67103791
1000019   16            771927351   202461            213462          43.94671698
1000019   17            41839022    213462            246462          19.82166502
1000019   18            41835835    246462            248462          6.555630753

The example in Table 3.1 shows six links traversed within FILE_ID 1000019. Over the entire file, these are the 13th through 18th links traversed. The vehicle entered the 17th link, with LINK_ID 41839022, 213.5 seconds into the file (TIMESTAMP_START 213462 ms). The vehicle left the link 33 seconds later (TIMESTAMP_END 246462 ms). The straight-line distance between the start of the digital map link and the position measured by the vehicle GPS was 19.8 ft.

Linking Maps

Maps provide a helpful tool for visualizing the data stored in the Linking Table. In Figure 3.1 through Figure 3.6, the number of participant traversals of each link is color coded on a map. Links shown in green have the fewest traversals, and links in red have the most. These counts were determined by querying the 305 million rows in the Linking Table to count the number of traversals and the number of participants on each unique link, restricting the result to links with more than 10 participants and 30 traversals, joining these counts to the geospatial data for each link, and portraying the result in a GIS application, in this case ArcGIS.

Figure 3.1. Buffalo, NY–area map indicating links and file count (green = low, red = high).
Figure 3.2. Raleigh/Durham, NC–area map indicating links and file count (green = low, red = high).
Figure 3.3. Tampa, FL–area map indicating links and file count (green = low, red = high).
Figure 3.4. Seattle, WA–area map indicating links and file count (green = low, red = high).
Figure 3.5. Bloomington, IN–area map indicating links and file count (green = low, red = high).
Figure 3.6. State College, PA–area map indicating links and file count (green = low, red = high).

Incident to Roadway Tables and Crash/Near-Crash Count Maps

Through the roadway LINK_IDs, connections between incidents and many roadway-related descriptors can be created to explore the relationships between roadway attributes and different incident types. Attributes that can be joined include Highway Safety Information System (HSIS) category, number of lanes, functional class, speed limit, and lane width. This incident data can also be visualized on maps. In Figure 3.7, links with different numbers of incidents (crashes or near crashes) are color coded. Roadway-related attributes for the links are listed in the boxes in the figure. For example, the red link has two lanes, is Functional Class 1, and is classified as a ramp.

Figure 3.7. Illustration of links coded with the number of crashes or near crashes identified.

Measures of Amount Matched

As of December 2014, the main data processing effort has been completed. Approximately 3.7 billion latitude/longitude pairs from 5,512,844 trips by consented drivers were analyzed, and the roadways on which the participants drove were cataloged. This catalog is the main deliverable of this work and constitutes approximately 305 million rows in a database table (one row per link traversal). In total, the database identifies the roadway for approximately 1,026,000 hours of driving. The following sections describe the results of the data processing in more detail, with emphasis on quantifying what has not been matched.

Percentage of All Files

The number of files not processed constitutes approximately 1.8% of the total files to which algorithm code was applied. This 1.8% is made up of cases in which GPS data were never present or never reached the accuracy criteria, the file was too short for GPS to begin reporting, the vehicle was started but never moved, out-of-memory errors occurred for some very long trips, a succession of road links could not be established for the file for some other reason, and so on.

Percentage of Time Within Files Matched to Roads

For the files that were processed, an analysis was conducted to determine what percentage of the driving time in each file was matched to roadways by the algorithm. Twenty files that were manually reviewed and assigned to roads (see Chapter 2) were checked to determine the amount of time that the participant was on roads; that is, time spent on a driveway or in a parking lot was omitted. The amount of time that the Matching Algorithm allocated to links was also computed. For each file, the time assigned by the Matching Algorithm was divided by the time on roads determined manually. The median percentage allocated by the algorithm was 87% of the time spent driving on roads within a file. Time was generally lost at the beginning of the trip as the GPS came online and established reception with a sufficient number of satellites to provide good positioning.

Sample Code and Instructional Materials

Bulk Inserting CSV Files into a Database

Once methods are developed for masking PII, the Linking Table will be made available in multiple CSV files, which in total will hold 305 million rows of data in six columns. A sample of this table is shown in Table 3.1. Though the details of a procedure to import CSVs into a database depend on the operating system and database platform, importing CSVs into a database is a standard task. In general, the technique loops over the files that meet a specification and runs the database's command-line bulk-insert utility for each file.
For example, if the map link data are written to CSV files named MAPLINKxxxx.csv, where xxxx is an integer from 0000 to 9999, then the following is an outline of how such a script would look:

    for filename in MAPLINK*.csv
        bulkload filename

If the operating system is Windows and the target database engine is MS SQL Server, then the following is a more specific example of a command to load the CSV files:

    for %f in ( MAPLINK*.csv ) do bcp database.dbo.map_linking_table in %f -T -c -t,

where database is the name of the target database on the local machine and map_linking_table is the name of the target table in that database. The -T option uses a trusted (Windows) connection, -c loads the data in character format, and -t, sets the field terminator to a comma (in character mode bcp otherwise expects tab-separated fields). When the command is placed in a batch file rather than typed at the command prompt, %f must be written as %%f. If the database is on a remote machine, the server can be specified using the -S server_name option.

Counting Number of Files Traversing a Link

Once the table is imported into a database, using it to determine, for example, how many files include traversals of a link is straightforward. The SQL code below lists the links with the most frequently traversed at the top and provides a count of the files that include traversals on each link. It assumes the table name used in the loading example above.

    select link_id, count(distinct file_id) as file_count
    from map_linking_table
    group by link_id
    order by file_count desc

Six lines of output from the query above are shown in Table 3.2.

Table 3.2. Sample Output from Link Count Query

LINK_ID     FILE_COUNT
126684274   22375
126684273   22360
34129178    22356
779335195   22295
824914307   22292
824914306   22286

Visualizing the Link Counts in ArcGIS

It is also sometimes helpful to portray the number of traversals on roadways. The results of a query such as the one shown above, restricted to a specific location, could be portrayed in GIS similarly to the map in Figure 3.1 at the beginning of this chapter. Detailed instructional materials for this type of task are provided in Appendix B.
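For readers without access to SQL Server, the load-then-count workflow described in this section can be tried on a small scale. The following is a minimal Python sketch, not the procedure used in the study. It assumes that SQLite stands in for the target database, that the CSV files carry no header row, and that their columns follow the field order of Table 3.1. The table name map_linking_table is borrowed from the bcp example; the function names and the temporary MAPLINK files, filled with the six rows of Table 3.1, are hypothetical.

```python
import csv
import glob
import os
import sqlite3
import tempfile

# The six rows of Table 3.1, used here only as demonstration input.
SAMPLE_ROWS = [
    (1000019, 13, 771927354, 181461, 191461, 29.91060377),
    (1000019, 14, 771927355, 191461, 192461, 31.48441077),
    (1000019, 15, 122666697, 192461, 202461, 25.67103791),
    (1000019, 16, 771927351, 202461, 213462, 43.94671698),
    (1000019, 17, 41839022, 213462, 246462, 19.82166502),
    (1000019, 18, 41835835, 246462, 248462, 6.555630753),
]


def create_table(conn):
    """Create the linking table (column order assumed from Table 3.1)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS map_linking_table ("
        "file_id INTEGER, sequence_no INTEGER, link_id INTEGER, "
        "timestamp_start INTEGER, timestamp_end INTEGER, "
        "vehicle_to_node_dist REAL)"
    )


def bulk_load(conn, pattern):
    """Insert every CSV file matching pattern; return the rows loaded."""
    total = 0
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            # Assumption: no header row, six fields per line.
            rows = [
                (int(r[0]), int(r[1]), int(r[2]), int(r[3]), int(r[4]), float(r[5]))
                for r in csv.reader(handle) if r
            ]
        conn.executemany(
            "INSERT INTO map_linking_table VALUES (?, ?, ?, ?, ?, ?)", rows
        )
        total += len(rows)
    conn.commit()
    return total


def file_counts(conn):
    """Count the distinct files traversing each link, busiest links first."""
    return conn.execute(
        "SELECT link_id, COUNT(DISTINCT file_id) AS file_count "
        "FROM map_linking_table GROUP BY link_id "
        "ORDER BY file_count DESC, link_id"
    ).fetchall()


def traversal_seconds(conn, file_id, link_id):
    """Duration in seconds of each traversal of link_id within file_id."""
    cursor = conn.execute(
        "SELECT timestamp_start, timestamp_end FROM map_linking_table "
        "WHERE file_id = ? AND link_id = ? ORDER BY sequence_no",
        (file_id, link_id),
    )
    return [(end - start) / 1000.0 for start, end in cursor]


def demo():
    """Write the sample rows to two hypothetical CSV files and load them."""
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    with tempfile.TemporaryDirectory() as folder:
        for index, start in enumerate(range(0, len(SAMPLE_ROWS), 3)):
            path = os.path.join(folder, "MAPLINK%04d.csv" % index)
            with open(path, "w", newline="") as handle:
                csv.writer(handle).writerows(SAMPLE_ROWS[start:start + 3])
        loaded = bulk_load(conn, os.path.join(folder, "MAPLINK*.csv"))
    return conn, loaded


if __name__ == "__main__":
    connection, row_count = demo()
    print(row_count, "rows loaded")
    print(file_counts(connection))
    print(traversal_seconds(connection, 1000019, 41839022))
```

With only one FILE_ID in the sample, every link reports a file count of 1, and the traversal of LINK_ID 41839022 comes out at 33 seconds, matching the worked example for Table 3.1. For the full 305 million rows, a server database and its native bulk loader, as in the bcp example, would likely be a better fit than row-by-row inserts.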


TRB's second Strategic Highway Research Program (SHRP 2) Report S2-S31-RW-3: Naturalistic Driving Study: Linking the Study Data to the Roadway Information Database details the methodology used to link the second Strategic Highway Research Program (SHRP 2) Naturalistic Driving Study (NDS) data to the SHRP 2 Roadway Information Database (RID), the final critical step in completing the SHRP 2 Safety database. The NDS data set contains detailed data collected continually from more than 5.5 million trips taken by the instrumented vehicles of 3,147 volunteer drivers in six sites.

The RID contains detailed data on 25,000 centerline miles of roadways in these six sites, less detailed data on 200,000 centerline miles of roadways in the six states in which the sites were located, and supplemental data on topics such as crash histories, travel volumes, construction, and weather in the six states.
