CHAPTER 7

Potential Problems and Issues in Data Reduction

Because of the limitation of hardware or software design, data elements in candidate data sets may or may not be sufficiently accurate and verifiable for the analysis of driver behavior associated with crashes and near crashes and travel time reliability. The derivation of data elements from raw data sets may not meet the needs of the study purpose and may require modifications. In some cases, the data element may not be accurate but can be easily transformed. An example of such a case is when an accelerometer box is installed backward in a vehicle. In this case, the data can be quickly converted through a simple mathematical procedure without compromising data integrity. In other cases, the method to collect the data may be inaccurate for a particular purpose (e.g., GPS-calculated vehicle speed compared with speed measured directly from the CAN bus). Similarly, it may be that some portions of the data are useful but others are not. Data need to be reviewed to determine the suitability for nonrecurrent congestion research purposes. Potential problems, risks, and limitations for data collection and reduction are discussed in this section.

Overall Data Collection

A common problem associated with naturalistic studies is the proper identification of drivers. Normally, the data collection process tries to ensure the exclusive use of an equipped vehicle by the assigned driver whose demographic information has been recorded. It is not unusual that the equipped vehicle would be used by family members or friends. The different driving habits and behaviors of a second driver and the mismatched demographic information can bias the data. An elaborate scheme to correctly match driver information with the actual driver can improve the situation.

Multiple pieces of equipment are on board the test vehicles. Different devices can be chosen for different research purposes. The DAS adopted in Project 2 and Project 5 is shown in Figure 7.1 (1). The basic arrangement of the DAS in Project 7 by VTTI is illustrated in Figure 7.2 (2). A similar setup was used in the other three VTTI projects. These differences in equipment result in variations in types of data collected and the associated data storage and computation requirements. Customized data computation software and hardware are needed for individual studies. UMTRI developed a system called Debrief View, as shown in Figure 4.4, to conduct data reduction. VTTI developed DART software, which visualizes and reduces data, as shown in Figure 4.3.

Even with similar types of equipment, different settings might apply according to the varied research purposes. For example, although all the candidate studies recorded video data at a predefined stable frequency (i.e., continuous), the frequency set by each study was not the same. Project 2 and Project 5 saved data at a relatively low frequency (as shown in Table 4.2) unless an event trigger was sent to the DAS to initialize a continuous video recording for 8 s. The fact that the purpose of these studies was to test the effectiveness of an ACA-RDCWS warrants this lower frequency. The disadvantage, as stated in the UMTRI reports, was that several alerts of very short duration that occurred below the 10-Hz recording rate may be omitted (1). On the contrary, most VTTI studies collected continuous video data at approximately 30 Hz. Details of drivers' behavior are essential to the studies conducted by VTTI. A relatively higher video recording frequency will provide researchers with a better opportunity to closely observe drivers' driving habits and distractions. A higher frequency generates data sets that are larger in size and brings more challenge to data reduction. When postprocessing and analyzing the same type of data from different sources, as was done for this study, special attention should be paid to the differences in data collection rates. Conversion or inference of data may be necessary.

Another common problem is data dropping out; for example, GPS outages because of urban canyons. As shown in Figure 7.3, when high buildings in downtown areas block out satellite signals, the resulting GPS data collected has gaps between the points. Postprocessing such GPS data to trace the real route traveled by the vehicle usually leads to an error, as