Chapter 4

Conclusions and Limitations

The SHRP 2 NDS contains a record of personal vehicle routes of travel on a scale well beyond what has been collected in the past. The data contain more than 5.5 million trip records from consented drivers. These trips total more than 1 million hours of driving and generated approximately 3.7 billion latitude/longitude pairs. The separation between consecutive points varied with vehicle speed, and the precision of each point depended on GPS reception and the geometry of the visible satellite constellation as the vehicle moved through the network.

The roadway database making up the RID includes 1,820,954 links. The objective of this project was to create a catalog of driving epochs and identify on which of the 1.8 million links they occurred. The catalog, exported to the Linking Table, contains 305,000,000 link traversals and the times at which the participant vehicle entered and exited each link. Through the Linking Table, researchers can locate, in seconds, as many as 27,000 traversals on a single link out of more than 1 million hours of data.

Limitations

The output of any large-scale data mining effort should be used cautiously. When processing large amounts of data, it is often difficult to detect how the process might be affecting the output. The Linking Table itself should not be considered completely representative of all the SHRP 2 participants and all their trips. As previous sections have indicated, approximately 1.8% of the trips were not processed. Within trips, the amount that could be processed varied. Some vehicles may have had a faulty GPS for a period of time. Some participants may live in locations that have exceptionally good or exceptionally poor GPS coverage compared with other participants.

Researchers are encouraged to check their sample for biases before conducting analyses. This can be done by checking trip counts for different participants against each other, or by spot-checking each participant's data to establish the rate of missed files and determine whether it is higher or lower than that of other participants.
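The bias check described above could be sketched roughly as follows. This is a minimal illustration, not the project's procedure: the input format (pairs of participant identifier and a processed flag), the function names, the sample data, and the tolerance value are all hypothetical, and the actual Linking Table schema is not assumed here.

```python
from collections import defaultdict


def missed_rate_by_participant(trips):
    """Return {participant_id: fraction of that participant's trips not processed}.

    `trips` is a hypothetical iterable of (participant_id, was_processed) pairs.
    """
    totals = defaultdict(int)
    missed = defaultdict(int)
    for participant_id, was_processed in trips:
        totals[participant_id] += 1
        if not was_processed:
            missed[participant_id] += 1
    return {p: missed[p] / totals[p] for p in totals}


def overall_missed_rate(trips):
    """Return the fraction of all trips in the sample that were not processed."""
    trips = list(trips)
    return sum(1 for _, ok in trips if not ok) / len(trips)


def flag_outliers(rates, overall_rate, tolerance):
    """List participants whose missed rate differs from the overall rate
    by more than `tolerance` (an arbitrary threshold chosen by the analyst)."""
    return sorted(p for p, r in rates.items() if abs(r - overall_rate) > tolerance)


# Made-up sample: three participants with ten trips each.
sample = (
    [("P1", True)] * 9 + [("P1", False)] * 1
    + [("P2", True)] * 5 + [("P2", False)] * 5
    + [("P3", True)] * 10
)

rates = missed_rate_by_participant(sample)
overall = overall_missed_rate(sample)
flagged = flag_outliers(rates, overall, tolerance=0.15)
print(rates, overall, flagged)
```

In this made-up sample, a participant is flagged whether the missed rate is unusually high or unusually low, since either case may indicate that the participant is over- or under-represented relative to the rest of the sample.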