Pilot Testing of SHRP 2 Reliability Data and Analytical Products: Washington (2014)

Chapter: CHAPTER 3: Data Compilation and Integration

Suggested Citation:"CHAPTER 3: Data Compilation and Integration." National Academies of Sciences, Engineering, and Medicine. 2014. Pilot Testing of SHRP 2 Reliability Data and Analytical Products: Washington. Washington, DC: The National Academies Press. doi: 10.17226/22254.


CHAPTER 3: Data Compilation and Integration

3.1 Test Site Selection

Table 3.1 lists the reliability products selected for testing and their test objectives. To cover the testing needs of all the products, the SHRP 2 L38D research team and its steering committee met and generated a list of candidate test sites. From the qualified candidates, a set of test sites was selected that is considered representative of typical roadway conditions in Washington. A brief description of each site is given below.

Table 3.1. Reliability Products Selected for Testing and Their Test Objectives

  L02 — Establishing monitoring programs for travel time reliability.
        Test objectives: Effectiveness
  L05 — A guide for state DOTs and MPOs to fully integrate reliability performance measures and strategies into the transportation planning and programming processes.
        Test objectives: Usability, Performance
  L07 — Evaluation of the cost-effectiveness of geometric design treatments, such as alternating shoulders and emergency pull-offs, in reducing nonrecurrent congestion.
        Test objectives: Operability, Usability, Performance
  L08 — Guidance on incorporating travel time reliability into Highway Capacity Manual (HCM) analyses.
        Test objectives: Operability, Usability, Performance
  C11 — Development of improved economic analysis tools based on recommendations from project C03.
        Test objectives: Usability, Performance

• Test Site A: I-5 between its interchanges with I-405. This facility operates in oversaturated conditions during both morning and afternoon peak periods near downtown Seattle. Loop detectors are deployed every half-mile on the mainline lanes and on the on- and off-ramps. This test site is used for testing products L02, L07, and L08.

• Test Site B: I-405 between its interchanges with I-5. This facility also operates in oversaturated conditions during morning and afternoon peak periods near downtown Bellevue. Loop detectors are deployed every half-mile on the mainline lanes and on the on- and off-ramps.
This test site is used for testing products L02 and L08.

• Test Site C: I-5 at Joint Base Lewis-McChord (JBLM). As the single largest employer in Pierce County and the third largest in Washington State, JBLM plays an important role in the region's communities. I-5 at JBLM is the major thoroughfare for freight and commuter traffic in the region. In recent years, traffic congestion has increased significantly with regional growth, bringing longer commute times, longer durations of congestion, and impacts on freight movement, military operations, and the overall economy. This test site is used for testing products L05 and C11.

• Test Site D: SR-522 between the intersections with 68th Avenue NE and 83rd Place NE. This busy signalized corridor serves as an alternative to I-90 and SR-520 for traffic crossing Lake Washington and also connects I-5 and I-405. It is congested during peak hours and carries relatively low demand at night. This test site is used for testing product L08.

3.2 Data Set Creation

Based on the selected test sites and the data needs of the tests, the L38D research team reviewed the traffic data available at each site and developed further data collection plans to ensure data coverage and quality. In general, study data are collected from two types of facilities: urban freeways and signalized arterials.

• Urban freeway data: WSDOT maintains a loop detector station approximately every half-mile on the central Puget Sound area freeways. Urban freeway traffic volume and occupancy data are obtained from the WSDOT loop detector network via the STAR Lab fiber connections to the WSDOT Northwest Region traffic system management center (TSMC), where loop data are stored and disseminated. In addition to the loop detector data, INRIX probe vehicle speed data, traffic incident data, weather data, and roadway geometric data are archived and used for urban freeway analysis.

• Signalized arterial data: Signalized arterial traffic data are acquired from two sources: in-road loop detectors and automated license plate readers (ALPRs). Loop detectors provide volume and occupancy data; ALPRs provide travel time measurements. Weather and roadway geometric data are also obtained and used in the analysis of signalized arterials. However, these existing data sets are not sufficient for arterial analysis.
Video-based on-site data collection was conducted to obtain directional vehicle movements at signalized intersections on this corridor.

Details of the data sets created for this research project follow.

3.2.1 Data Set A: Loop Detector Data

Data Set A consists of direct loop detector measurements (volume and occupancy for single loops; traffic speed and bin volumes for dual loops) and delay estimates based on loop detector data for Test Sites A (I-5), B (I-405), and D (SR-522). Creating the data set involves obtaining, cleaning, and integrating the data collected by the research team. This process poses several challenges, among them processing, reviewing, and reducing raw data into summaries suitable for analysis, and conflating traffic data with geospatial data.

Inductive loop detectors are widely deployed in Washington State for monitoring traffic conditions and freeway performance. WSDOT maintains and manages loop detectors on state highways as well as on Interstate freeways within Washington State. For traffic management purposes, the State of Washington is divided into six regions: Northwest, North Central, Eastern, South Central, Southwest, and Olympic. Relevant to this project are the approximately 4,200 single- or dual-loop detectors installed in the Northwest region to monitor traffic conditions around the Seattle metropolitan area.

There are two general types of loop detectors in Washington State: single loop and dual loop. Single-loop detectors can detect only whether a vehicle is present or absent, which allows volume and occupancy to be measured directly. Dual-loop detectors are composed of two single-loop detectors placed a short distance apart, allowing travel speed to be estimated from the difference in arrival times at the upstream and downstream loops. Vehicle length can also be estimated from dual-loop detector data, based on the estimated speed and the measured detector occupancy.

Loop detector data in Washington State are available at both 20-second and 5-minute aggregation intervals. All data are collected at the 20-second level and further aggregated into 5-minute periods. The key information for the 20-second and 5-minute intervals is listed in Table 3.2 and Table 3.3, respectively. WSDOT primarily uses the 5-minute loop data for freeway performance monitoring and reporting (Wang et al. 2008).

The LOOPID field in Table 3.2 and Table 3.3 is a unique identifier for each loop detector that can be matched to a detector cabinet; multiple loop detectors can be connected to a given cabinet. A cabinets table contains descriptive and location information for each cabinet, so associating loops with their cabinets facilitates locating the loops by cabinet milepost and route. The key information contained in the cabinets table is listed in Table 3.4.

Table 3.2.
20-Second Freeway Loop Data Description

Table: SingleLoopData and StationData (Single Loop)

  LOOPID  (smallint)  Unique ID number assigned in order of addition to the LoopsInfo table
  STAMP   (datetime)  24-hour time as YYYYMMDD hh:mm:ss, in 20-second increments
  DATA    (tinyint)   Indicates whether a record is present
  FLAG    (tinyint)   Validity flag (0-7): 0 = good data; otherwise, bad data
  VOLUME  (tinyint)   Integer volume observed during the 20-second interval
  SCAN    (smallint)  Number of scans during which the loop is occupied in each period (60 scans per second × 20 seconds per period = 1,200 scans)

Table: TrapData (Dual Loop)

  SPEED   (smallint)  Average speed for each 20-second interval (e.g., 563 means 56.3 miles per hour)
  LENGTH  (smallint)  Average estimated vehicle length for each 20-second interval (e.g., 228 means 22.8 feet)

In addition to reporting the single- and dual-loop detector observations at the individual loop level, loop detector data are aggregated at the cabinet level into a loop group, or station. For each cabinet, the station volume is the sum of the volumes of the associated loops, and the station occupancy (or scan count) is the average over the associated loops. Note in Table 3.2 and Table 3.3 that both detector-level (SingleLoopData and STD_5Min) and station-level (StationData and STN_5Min) data are reported for single-loop detectors.

Table 3.3. 5-Minute Freeway Loop Data Description

Table: STD_5Min and STN_5Min (Single Loop)

  LOOPID     (smallint)  Unique ID number assigned in order of addition to the LoopsInfo table
  STAMP      (datetime)  24-hour time as YYYYMMDD hh:mm:ss, in 5-minute increments
  FLAG       (tinyint)   Good/bad data flag, with 1 = good and 0 = bad (simple diagnostics supplied by WSDOT)
  VOLUME     (tinyint)   Integer volume observed during each 5-minute interval
  OCCUPANCY  (smallint)  Percentage occupancy expressed in tenths to obtain integer values (6.5% = 65)
  PERIODS    (smallint)  Number of 20-second readings incorporated into the 5-minute record (15 is ideal; fewer than 15 almost always indicates that volume data are unusable unless adjusted to account for missing intervals)

Table: TRAP_5Min (Dual Loop)

  SPEED   (smallint)  Average speed for each 5-minute interval (e.g., 563 means 56.3 miles per hour)
  LENGTH  (smallint)  Average estimated vehicle length for each 5-minute interval (e.g., 228 means 22.8 feet)

WSDOT makes the 20-second and 5-minute loop detector data available for download from an online FTP site.
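The encodings in Tables 3.2 and 3.3 imply straightforward conversions: a SCAN count out of 1,200 gives fractional occupancy, SPEED and LENGTH are stored in tenths, and a short 5-minute record can be scaled up by its PERIODS count. The sketch below is illustrative only (the field handling and the scaling rule are assumptions, not WSDOT's actual processing code):

```python
def occupancy_from_scan(scan: int) -> float:
    """Convert a 20-second SCAN count to fractional occupancy.

    Loops are sampled at 60 scans per second, so a 20-second
    interval holds at most 60 * 20 = 1,200 scans (fully occupied).
    """
    return scan / 1200.0

def decode_tenths(value: int) -> float:
    """SPEED and LENGTH are stored as integers in tenths
    (e.g., 563 -> 56.3 mph, 228 -> 22.8 ft)."""
    return value / 10.0

def five_minute_volume(volumes_20s, periods_expected=15):
    """Aggregate 20-second volumes into one 5-minute total.

    Per the PERIODS field, 15 readings are expected; if fewer are
    present, the total is scaled up to account for the missing
    intervals (an assumed adjustment, not necessarily WSDOT's rule).
    """
    periods = len(volumes_20s)
    if periods == 0:
        return None  # no usable readings in this 5-minute window
    return sum(volumes_20s) * periods_expected / periods

# Example: a fully reported interval vs. one missing 3 readings
full = five_minute_volume([10] * 15)     # complete interval: 150.0
partial = five_minute_volume([10] * 12)  # scaled for gaps: 150.0
```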
Detector data are periodically retrieved from the FTP site, formatted, and stored in the STAR Lab Microsoft SQL Server databases by an automated computer program written in Microsoft Visual C#. For the pilot testing of the SHRP 2 L02, L07, L08, and C11 products, traffic volume data were collected along the Test Site A, B, and D corridors. Figure 3.1 illustrates most of the loop locations along I-5 and I-405 in the Northwest region of Washington State. Five-minute traffic volume data were collected for the period from January 2009 to June 2013. Figure 3.2 illustrates the traffic flow map based on the 5-minute loop data collected at 5:30 p.m. on December 11, 2012. Loop detectors along SR-522 are shown together with the other available sensors in Figure 3.3.

Table 3.4. Cabinet Data Description

  CabName    (varchar)   Unique ID for each cabinet
  UnitType   (varchar)   Type of each loop (i.e., main, station, speed, or trap)
  ID         (smallint)  Unique ID number assigned to match the loop data table
  Route      (varchar)   State route ID (e.g., 005 = Interstate 5)
  direction  (varchar)   Direction of the state route
  isHOV      (tinyint)   Bit indicating whether the loop detector is on an HOV lane (1 = HOV, 0 = not HOV)
  isMetered  (tinyint)   Bit indicating whether the loop detector is on a metered ramp (1 = metered, 0 = not metered)

3.2.2 Data Set B: Intelligent Transportation Systems (ITS) Data

Data Set B consists of ALPR data from roadway surveillance systems along the SR-522 corridor chosen for this study (Test Site D), as shown in Figure 3.3. On this section of SR-522, ALPR data have been archived since September 1, 2012. The ALPR data were selected in particular for use in testing the STREETVAL software application designed by the SHRP 2 L08 research team.

Figure 3.1. Loop detectors in Northwest Washington State. (© OpenStreetMap contributors)

Figure 3.2. Traffic flow map based on loop detector data.

Figure 3.3. Traffic detectors along the SR-522 corridor. (© OpenStreetMap contributors)

ALPR technology uses high-definition cameras, typically mounted atop traffic signal gantries directly over the roadway so that the appropriate sight angle can be achieved (Figure 3.4 shows a mounted ALPR camera). The cameras collect video data, which are processed in real time by a license-plate-reading algorithm. Each time a plate is identified, it is stored in memory along with the time stamp of its identification. For travel time data collection, these plate-reading cameras are installed at several intersections along the test site corridor. Link travel times are then obtained by comparing the data collected at two different intersections: if a plate is identified in both data sets, the travel time is computed as the difference between the time stamps at the two intersections.

Approximately 8 months of travel time data, spanning August 16, 2013, to March 31, 2014, were available and downloaded from the WSDOT database. These data were uploaded to the STAR Lab database, where they were queried and analyzed. Table 3.5 shows the fields and basic data types available in the ALPR data set. Because these data are used for test verification, it was verified that the data collection times match the selected study period and the reliability reporting period defined in the project's temporal scope.

Figure 3.4. ALPR cameras mounted at the 61st Avenue NE and SR-522 intersection.
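The plate-matching computation just described can be sketched as follows. The plate values, record layout, and the 30-minute matching window are illustrative assumptions, not the WSDOT implementation:

```python
from datetime import datetime

def match_travel_times(upstream, downstream, max_gap_s=1800):
    """Compute link travel times by matching license-plate reads.

    upstream/downstream: lists of (plate, datetime) tuples from two
    ALPR stations. For each upstream read, the first later downstream
    read of the same plate is used; gaps over max_gap_s are discarded
    as likely misreads or interrupted trips (assumed rule)."""
    down_by_plate = {}
    for plate, t in downstream:
        down_by_plate.setdefault(plate, []).append(t)
    travel_times = []
    for plate, t_up in upstream:
        for t_down in sorted(down_by_plate.get(plate, [])):
            gap = (t_down - t_up).total_seconds()
            if 0 < gap <= max_gap_s:
                travel_times.append(gap)
                break
    return travel_times

# Hypothetical reads at two SR-522 intersections
up = [("ABC123", datetime(2013, 8, 16, 17, 0, 0)),
      ("XYZ789", datetime(2013, 8, 16, 17, 0, 20))]
down = [("ABC123", datetime(2013, 8, 16, 17, 2, 5)),
        ("JKL456", datetime(2013, 8, 16, 17, 1, 0))]
times = match_travel_times(up, down)  # one match: 125.0 seconds
```

Only plates seen at both stations contribute, which is why the UpCount/DownCount fields matter: the matched sample is a subset of all traffic.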

Table 3.5. ALPR Data Description

  Stamp       (datetime)  Date and time of observation
  ID          (int)       Unique ID for each route, defined by a unique origin-destination location pair
  TravelTime  (int)       Travel time on the section, in seconds
  Trips       (int)       Number of trips during the observation period
  UpCount     (int)       Number of license plates read by the upstream reader
  DownCount   (tinyint)   Number of license plates read by the downstream reader
  Lanes                   Number of lanes
  Flag        (tinyint)   Error identification flag

3.2.3 Data Set C: INRIX Data

INRIX is an international traffic analytics and data company located in Kirkland, Washington. It gathers traffic information from around 100 million GPS-equipped vehicles traveling the roads in 32 countries. Rather than depending on a single data source, INRIX combines multiple data feeds to provide drivers with more comprehensive travel advice. It collects data streams from local transportation authorities, sensors on road networks, fleet vehicles such as delivery vans, long-haul trucks, and taxis, as well as consumer users of the INRIX traffic apps. INRIX processes these data and translates them into easy-to-understand travel advice that drivers can access through radio reports, real-time satellite navigation systems in cars, and INRIX's apps.

This data set consists of 1-minute resolution probe vehicle speed data, provided by INRIX, for the section of I-5 south of Seattle between SR 510 and SR 512. To aggregate and fuse heterogeneous transportation data, INRIX developed a series of statistical models to compute real-time traffic information such as speed and travel time from measurements taken by GPS devices, cellular networks, and loop detectors. The resulting speed data were aggregated into 5-minute intervals for 2008, 2009, and 2010 and into 1-minute intervals for 2011 and 2012.
WSDOT was authorized to use and archive the data from January 1, 2009, to December 31, 2012, in the STAR Lab database. The key information for INRIX data is presented in Table 3.6. A traffic speed map based on the INRIX data for Northwest Washington State at 5:30 p.m. on December 11, 2012, is shown in Figure 3.5.

Table 3.6. INRIX Data Description

  DateTimeStamp  (datetime)  24-hour time as YYYYMMDD hh:mm:ss
  SegmentID      (varchar)   Unique ID for each segment: the Traffic Message Channel (TMC) code
  Reading        (smallint)  Average speed for each segment

Figure 3.5. Traffic speed map based on INRIX data.

INRIX has adopted the Traffic Message Channel (TMC), a common industry convention developed by leading map vendors, as its base roadway network. Each unique TMC code identifies a specific road segment. For example, in Table 3.7, TMC 114+05099 represents the WA-522 road segment with start location (47.758321, -122.249705) and end location (47.755733, -122.23368). WSDOT roads, however, follow a linear referencing system based on mileposts, so substantial work is required to combine these two sources of data. This conflation was completed using geographic information system (GIS) software, and the results were stored in the DRIVE Net database.
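As a rough consistency check on the TMC geometry, the great-circle distance between a segment's endpoints can be compared with its posted length; for the nearly straight TMC 114+05099 segment in Table 3.7, the straight-line distance comes out close to the posted 0.768734 miles. A sketch (standard haversine formula; not part of the project's GIS workflow):

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    r = 3958.8  # mean Earth radius, miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Endpoints of TMC 114+05099 from Table 3.7
d = haversine_miles(47.758321, -122.249705, 47.755733, -122.23368)
# d is roughly 0.77 miles, close to the posted 0.768734-mile length
```

A straight-line distance slightly below the posted mileage is expected, since the posted value follows the road geometry.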

Table 3.7. TMC Code Examples

  TMC: 114+05099   Roadway: 522   Direction: Eastbound   Intersection: 80th Ave
  County: King     Zip: 98028
  Start point: 47.758321, -122.249705   End point: 47.755733, -122.23368   Miles: 0.768734

  TMC: 114-05095   Roadway: 522   Direction: Westbound   Intersection: WA-523/145th St
  County: King     Zip: 98115
  Start point: 47.753417, -122.27005   End point: 47.733752, -122.29253   Miles: 1.608059

3.2.4 Data Set D: Incident Data

This data set was extracted from the Washington Incident Tracking System (WITS) and describes the basic characteristics of traffic incidents. WITS data provide a standardized source of information for traffic incidents in Washington State and include the majority of incidents that occur on its freeways and state highways (totaling 550 and 376, respectively, by March 2013). For each incident, the Washington State incident response (IR) team logs details such as the incident location, notified time, clear time, and lane closures. For this project, the WITS data sets from 2002 to 2013 were obtained and integrated into the DRIVE Net database. Several key columns are listed in Table 3.8.

Table 3.8. WITS Data Description

  SR             (varchar)   State route ID (e.g., 005 = Interstate 5)
  Direction      (varchar)   Route direction (NB = northbound, SB = southbound, WB = westbound, EB = eastbound)
  MP             (float)     Milepost
  Notified_Time  (datetime)  Time when the incident was reported to the IR program
  Arrived_Time   (datetime)  Time when an IR truck arrived at the incident location
  Clear_Time     (datetime)  Time when all lanes became open to traffic
  Open_Time      (datetime)  Time when the incident had been fully cleared and the IR teams left the incident scene

3.2.5 Data Set E: Weather Data

This data set consists of weather data from stations in Washington State. Weather data were sourced from a website maintained by the UW Atmospheric Sciences Department, which provides access to hourly observations from 209 weather stations through the National Oceanic and Atmospheric Administration.
Weather data are automatically fetched from the website and stored in a STAR Lab database by a Java-based computer program written for this purpose. Several key fields are shown in Table 3.9. Weather data are visualized geographically on the DRIVE Net system using the latitude and longitude of each weather station and can be viewed at www.uwdrive.net.

Table 3.9. Weather Data Description

  name            (varchar)   Weather station identifier
  timestamp       (datetime)  24-hour time as YYYYMMDD hh:mm:ss
  visibility      (smallint)  Visibility in miles
  temp            (smallint)  Temperature in degrees Fahrenheit
  dewtemp         (smallint)  Dew point temperature
  wind_direction  (smallint)  Direction the wind is coming from, in degrees (e.g., from the south is 180)
  wind_speed      (smallint)  Wind speed in knots
  pcpd            (smallint)  Total 6-hour precipitation at 00z, 06z, 12z, and 18z; 3-hour total at other times; amounts in hundredths of an inch

3.2.6 Data Set F: Roadway Geometric Data

This data set contains roadway geometry sourced from WSDOT's GIS and Roadway Data Office (GRDO). The GeoData Distribution Catalog is maintained by GRDO to promote data exchange and can be accessed online at http://www.wsdot.wa.gov/mapsdata/geodatacatalog/. The data are provided as Esri shapefiles, an industry-standard digital format for geospatial data. Available geometric data sets include lane counts, roadway widths, ramp locations, shoulder widths, and surface types. To allow geometric elements to be located using the WSDOT linear referencing system, state route identification and milepost information are included in this data set. A substantial quantity of such geometric data has been obtained, stored in a spatial database as part of the STAR Lab DRIVE Net system, and made available for this project.

3.3 Data Quality Control

For this project, a great deal of emphasis has been placed on data quality control (DQC). Fortunately, much of the necessary data quality assurance procedure had previously been developed and implemented in the DRIVE Net system. Most notably, a two-step DQC procedure for loop detector data was developed, as illustrated in Figure 3.6. The raw loop data are first subjected to a series of error detection tests to identify missing and erroneous data.
These data are flagged for further correction. Several statistical algorithms were developed to estimate missing data and replace erroneous records. The corrected data are periodically stored in the database for use in further analysis. The 20-second and 5-minute loop data, as well as the ALPR data, are all processed for quality control.

3.3.1 Loop Detector DQC Procedure

Figure 3.6 shows how incoming loop detector data are processed in the DQC procedure. Error detection algorithms identify and remove erroneous observations based on controller hardware

diagnostics and value thresholding, and then sensitivity issues are detected and corrected using a Gaussian Mixture Model algorithm. All loop detector quality control follows the methodologies outlined in Wang et al. (2013). Raw (unadjusted) loop detector data are retained throughout the process, both as a backup and to quantify the efficacy of the quality control algorithms. These raw data also serve as a benchmark for comparison in performance measurement and in assessing the effectiveness of the data quality control algorithms (Wang et al. 2013).

When data are retrieved from the WSDOT FTP site, basic error detection results are already present in the form of simple hardware diagnostic error flags. This process is run at the cabinet level and reports the presence of common loop detector quality issues such as short pulses, loop chatter, and values outside the allowable volume/occupancy ranges, as well as whether the loop has been manually deactivated (Ishimaru and Hallenbeck 1999). Based on these flags, a loop reporting at least 90% good data is considered acceptable for use in analysis (with the erroneous data removed).

A series of additional error detection procedures, primarily based on value thresholding, are performed on the data before uploading into the DRIVE Net platform. These procedures are outlined below; for additional information, see Wang et al. (2013). Values outside the established thresholds are marked as missing, though in many cases this does not mean the observations are the result of a hardware malfunction. For example, when no vehicles pass over the detector in a given interval (which frequently happens during low-volume periods), the volume, occupancy, and speed are all reported as zero. This simply means that no data are available for that interval, and in this case the data must be marked as missing. The thresholding criteria, based on Chen et al. (2003), are as follows:

A. Volume is reported as zero, with occupancy greater than zero.
B. Volume and occupancy are reported as zero (between 5:00 a.m. and 8:00 p.m.).
C. Reported occupancy exceeds 0.35.
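The three criteria translate directly into a per-record check. The sketch below assumes fractional occupancy and a local-hour field; it is illustrative code, not the DRIVE Net implementation:

```python
def flag_record(volume, occupancy, hour):
    """Apply the Chen et al. (2003) thresholding criteria to one
    loop record; returns the error type ("A", "B", or "C") or None
    if the record passes.

    volume: vehicle count for the interval
    occupancy: fractional occupancy (0.0-1.0)
    hour: local hour of day, 0-23
    """
    if volume == 0 and occupancy > 0:
        return "A"  # occupancy reported without any volume
    if volume == 0 and occupancy == 0 and 5 <= hour < 20:
        return "B"  # silent detector inside the 5 a.m.-8 p.m. window
    if occupancy > 0.35:
        return "C"  # implausibly high occupancy
    return None  # record passes all three criteria
```

Zero volume and occupancy outside the 5 a.m. to 8 p.m. window is deliberately not flagged, matching the rationale given for restricting the retrieval hours.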

Figure 3.6. Loop data quality control flow chart (Wang et al. 2013).

Loop data are retrieved between the hours of 5:00 a.m. and 8:00 p.m., as the threshold criteria listed above are not particularly instructive at night, when volume and occupancy are consistently very low. During this time window, there are 2,700 20-second records and 180 5-minute records per detector per day. Because the number of zero-volume/zero-occupancy intervals is expected to be low during the reporting period, a basic measure of loop detector health can be built from the number of type A, B, and C errors reported. On this basis, loop detectors reporting a high number of these error types are discarded according to the methodology described in Wang et al. (2013).

The procedures listed above are oriented primarily toward hardware and communications errors and do not address systematic sensitivity issues. To address these, a statistical Gaussian Mixture Model (GMM) algorithm was implemented based on Corey et al. (2011). This algorithm is designed to identify undersensitive and oversensitive detectors and to correct the resulting observations when possible. The procedure runs on a monthly basis and classifies detectors as (1) good, (2) suffering from correctable errors, or (3) suffering from uncorrectable technical issues. Correction factors are produced for detectors classified as type (2). For more information about this algorithm and the specifics of its implementation, see Wang et al. (2013).

Based on the three quality control procedures described above, a health score for each loop detector observation is computed as an indicator of reliability and stored in the loop detector database. For loop detectors reporting a sufficient number of nonmissing observations, corrections are applied to recover the records flagged by the error detection algorithm.
Different corrections are applied depending on the scenario and the availability of adjacent observations, as listed below:

1. Replacement by spatial interpolation,
2. Replacement by temporal interpolation, and
3. GMM sensitivity correction.

A brief discussion of each of these correction approaches follows.

3.3.1.1 Spatial Interpolation

For loop detector records flagged by the error detection algorithm, or simply missing from the data set because of hardware malfunction, records from adjacent detectors are used to replace the missing observations when possible. This is done in one of two ways, the selection of which depends on the availability of nearby detector observations marked as "good."

In scenario 1, interpolation is performed using data from lanes adjacent to that of the missing or erroneous record. This is the preferred approach, as speed, volume, and occupancy are generally highly correlated across adjacent lanes at any given location. However, this is not always possible, because certain error types (e.g., communications failure) often affect all detectors connected to a given cabinet. In that case, multiple detectors at the cabinet of interest will report missing or erroneous records for one or more intervals.

In scenario 2, interpolation is performed using data from detectors positioned upstream and downstream of the missing or erroneous record. This approach is applied when scenario 1 is not possible because of a lack of adjacent-lane records.

3.3.1.2 Temporal Interpolation

Temporal interpolation is used to fill in missing values when only a single consecutive observation is missing. That is, it is applied only when valid records are present both before and after the missing or erroneous observation in the time series. This method is preferable to spatial interpolation but cannot be applied when multiple consecutive observations are marked missing. Note that if a detector has been marked as malfunctioning because of a high number of observations flagged as "bad," then spatial interpolation cannot be performed.
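A minimal sketch of the two imputation steps follows, under the simplifying assumptions that spatial interpolation averages the available "good" neighboring records for the same interval and temporal interpolation averages the two intervals bracketing an isolated gap. The real procedure is in Wang et al. (2013); missing or flagged values are represented here as None.

```python
# Hedged illustration of the two imputation approaches described above,
# not the Wang et al. (2013) implementation.

def spatial_interpolate(neighbor_values):
    """Scenario 1/2 sketch: estimate a missing record from 'good'
    same-interval records in adjacent lanes (or up/downstream
    detectors); returns None if no neighbor is usable."""
    good = [v for v in neighbor_values if v is not None]
    return sum(good) / len(good) if good else None

def temporal_interpolate(series):
    """Fill isolated single gaps in a detector's own time series from
    the records immediately before and after; longer gaps are left
    untouched, as the text requires."""
    out = list(series)
    for i in range(1, len(out) - 1):
        if out[i] is None and out[i-1] is not None and out[i+1] is not None:
            out[i] = (out[i-1] + out[i+1]) / 2
    return out
```

Note that a run of two or more consecutive None values passes through `temporal_interpolate` unchanged, matching the single-gap restriction above.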
Spatial and temporal interpolation are imputation processes for filling in missing values, where data are absent from the data set either because of a hardware malfunction or because they were removed by the error detection algorithms. What is presented here is a very brief summary; refer to Wang et al. (2013) for a more thorough description of the methodology and implementation.

3.3.1.3 GMM Correction

The GMM algorithm models the distribution of occupancy as a mixture of Gaussian distributions. This allows the ratio between normal and biased occupancy to be calculated and used to correct records from oversensitive or undersensitive loops. The GMM algorithm produces a flag assigned to each detector by month, designating the detector as one of the

following: (1) good, (2) suffering from correctable errors, or (3) suffering from uncorrectable technical issues. For detectors classified as (2), a correction factor is estimated based on the ratio between normal and biased occupancy. The correction factor is computed from knowledge of vehicle length distributions and is estimated monthly using intervals during which only a single vehicle passed over the detector (i.e., during low-volume periods). For a thorough description of the GMM procedure, refer to Wang et al. (2013) and Corey et al. (2011).

The GMM algorithm is implemented in a software package written in the SQL, Java, and R programming languages. A graphical user interface (GUI) has been developed to ease execution; see Figure 3.7.

3.3.2 ITS DQC Procedure

While ALPR travel time estimates are generally reliable, some unrealistically high travel times are recorded because vehicles can make incomplete trips through a corridor. Typically, this happens when a vehicle stops along the corridor for a period of time (such as at a local business) and then continues along the route. The ALPR quality control methodology, then, is primarily focused on identifying and eliminating these outlying travel times.

Based on the FHWA's Mobility Monitoring/Urban Congestion Program (Turner et al. 2004), the following quality control criterion is defined for probe data: any two consecutive travel times cannot differ by more than 40%. Another criterion, based on methods proposed by UW researchers, restricts travel times to not more than one standard deviation above or below the moving average of the 10 previous entries. However, these methods were not designed for the sparse data coverage typical of arterial ALPR data, so without a sufficient number of immediately adjacent observations, many outliers pass through undetected.
In response, an additional arterial data quality control methodology was developed that focuses on the overall spread of the data. Based on an examination of the arterial data, the following quality control procedures were developed and applied to the ALPR data:
• Extremely low or high travel times are removed based on visual inspection.
• After all travel times for a section are ranked, any value greater than the 75th percentile plus 1.5 times the interquartile distance, or less than the 25th percentile minus 1.5 times the interquartile distance, is removed. Using quartile values instead of the variance to describe the spread of the data makes this technique more robust.
• As described above, records in which two consecutive travel times change by more than 40% are removed.
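The two rule-based screens above can be sketched as follows. This is an illustrative reading, not the study's implementation: the nearest-rank quartile convention and the sequential handling of the 40% rule are assumptions; the 1.5×IQR and 40% thresholds come from the text.

```python
# Hedged sketch of the IQR fence and the 40% consecutive-change
# screen described above.

def iqr_fence(times, k=1.5):
    """Keep travel times inside [Q1 - k*IQR, Q3 + k*IQR].
    Quartiles by nearest rank (an assumption for this sketch)."""
    s = sorted(times)
    n = len(s)
    q1 = s[int(0.25 * (n - 1))]
    q3 = s[int(0.75 * (n - 1))]
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [t for t in times if lo <= t <= hi]

def drop_large_jumps(times, max_change=0.40):
    """Drop a record whose travel time differs from the last accepted
    record by more than max_change (40% in the text)."""
    kept = []
    for t in times:
        if kept and abs(t - kept[-1]) / kept[-1] > max_change:
            continue
        kept.append(t)
    return kept
```

Because the fence is built from quartiles rather than the variance, a single extreme travel time cannot inflate the acceptance band around itself, which is the robustness property noted above.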

Figure 3.7. GUI for freeway data quality control.

3.4 Speed and Travel Time Calculations

Using the previously identified data sets, speed and travel time must be computed for various segments and routes across multiple facilities and data types. A new approach to calculating travel time from real-time loop data is described in subsection 3.4.1. Calculation of free-flow speed is described in subsection 3.4.2.

3.4.1 A Travel-Based Approach to Calculating Travel Time from Single-Loop Detector Data

For testing the SHRP 2 products, route-level travel time data are needed. The research team developed a new approach to calculate travel time from single-loop data, as described below. In many locations, single-loop detectors are one of the most convenient data sources for travel time calculation. They collect volume and occupancy data that can be converted to an average speed. By dividing the distance between detectors by the average speed, segment travel times can be calculated. From there, the simplest and most common way to calculate a route travel time at a specific time is to calculate all of the segment travel times along the route at the time the route starts and sum them to get the route travel time. This method requires

minimal calculation effort and is often very accurate when the level of congestion remains stable. However, when the level of congestion changes quickly, the predicted segment travel times near the end of the route will be quite inaccurate. The travel-based approach described in this section aims to address this shortcoming.

The first step in calculating a travel-based route travel time begins with the raw data from single-loop detectors. These detectors measure volume and occupancy in each lane; the results can then be converted to speed using the g-factor formula (Equation 3.1):

Speed = flow / (occupancy × g)   (3.1)

The g-factor is a parameter based on the average length of vehicles passing over the detector and generally ranges from 2.0 to 2.5. Before calculating travel times, the average vehicle length for a route should be studied and an appropriate g-factor chosen. Since the travel time calculation relies on spot speeds, a greater density of detectors along a route will yield more accurate travel times. At minimum, the density should be greater than one detector per mile, but a density closer to two per mile is preferable.

Once speeds have been determined for each lane, they can be averaged together at each location. If an HOV lane exists, it should be excluded from this average in order to obtain the travel time for general-purpose vehicles. Quality control procedures can then be applied to the speed data. For this study, the following procedures, adopted from WSDOT travel time calculation methods, were used:
• If occupancy is less than 12%, then the speed is set to 60 mph;
• If occupancy is greater than 95%, then the speed is set to 0 mph;
• If the calculated speed is less than 10 mph, then the speed is set to 10 mph; and
• If the calculated speed is greater than 60 mph, then the speed is set to 60 mph.
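A compact sketch of the speed conversion and clamping rules above, together with the segment travel time formula (Equation 3.2, introduced just below). The units are assumptions for illustration: flow in vehicles per hour, occupancy expressed as a percentage, and g = 2.4, an assumed value inside the 2.0 to 2.5 range given in the text.

```python
# Hedged sketch, not WSDOT code. Occupancy is taken as a percentage
# and g = 2.4 is an assumed value within the 2.0-2.5 range above.

def loop_speed(flow_vph, occupancy_pct, g=2.4):
    """Equation 3.1: estimated spot speed (mph) from a single loop."""
    return flow_vph / (occupancy_pct * g)

def qc_speed(speed, occupancy_pct):
    """WSDOT-style clamping rules from the text."""
    if occupancy_pct < 12:
        return 60.0            # treat low occupancy as free flow
    if occupancy_pct > 95:
        return 0.0             # effectively stopped traffic
    return min(60.0, max(10.0, speed))

def segment_travel_time(mp1, mp2, s1, s2):
    """Equation 3.2: segment travel time (min) between two adjacent
    detectors at mileposts mp1 and mp2 reporting speeds s1 and s2."""
    return 60.0 * (mp2 - mp1) / ((s1 + s2) / 2.0)
```

For example, a flow of 1,800 vph at 15% occupancy with g = 2.4 yields a 50 mph spot speed, which the quality control rules leave unchanged.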
After cleaning the data, the segment travel time between two adjacent detectors can be calculated by taking the distance between the detectors and dividing it by the average of the speeds they record (Equation 3.2). This result will be referred to as the segment travel time. Once these segment travel times have been calculated, they can be summed over large distances to obtain the travel time for entire routes or corridors.

Segment travel time (min) = 60 × (MP2 − MP1) / ((S1 + S2) / 2)   (3.2)

where:
MP = milepost of the detector
S = speed from the detector in mph

As mentioned earlier, the simplest and most common way to calculate a travel time at a specific moment is to calculate all of the segment travel times along the route (using Equations

3.1 and 3.2) at that time and then sum them together to get the route travel time. However, this method often yields travel times that vary significantly from ground truth when a route's congestion is in flux (especially on either end of peak travel periods). To overcome this problem, when calculating travel times from previously collected data (as opposed to real-time results), the segment travel times can be calculated for the times when vehicles actually reach each segment rather than when they begin the route. This is clarified by the example below.

Consider Table 3.10, which lists segment travel times (STTs) for eight segments and how they change over a 25-minute period as congestion increases.

Table 3.10. Segment Travel Time Table for Example Route

Time        STT 1-2  STT 2-3  STT 3-4  STT 4-5  STT 5-6  STT 6-7  STT 7-8  STT 8-9
3:50 p.m.     1.8      2.0      2.2      4.4      4.6      1.8      2.0      4.2
3:55 p.m.     2.0      2.2      2.4      4.6      4.8      2.0      2.2      4.4
4:00 p.m.     2.2      2.4      2.6      4.8      5.0      2.2      2.4      4.6
4:05 p.m.     2.4      1.6      2.8      5.0      4.2      2.4      2.6      4.8
4:10 p.m.     2.6      1.8      2.0      5.2      4.4      2.6      2.8      4.9
4:15 p.m.     1.8      2.0      2.2      4.4      4.6      2.8      3.0      5.2

Using the simple method, the calculated travel time for a vehicle beginning this route at 3:50 p.m. would be the sum of the first row of the table: 23 minutes. Using the travel-based method, however, the travel time for a vehicle starting at 3:50 p.m. would be calculated as follows. Segment 1-2 is completed in 1.8 minutes, which is before 3:55 p.m. Thus, segment 2-3 is assumed to be completed in 2.0 minutes. The elapsed time is still before 3:55 p.m., and segment 3-4 is assumed to be completed in 2.2 minutes, for a running total of 6 minutes. Now the elapsed time is between 3:55 and 4:00 p.m., so segment 4-5 is assumed to be completed in 4.6 minutes. This brings the elapsed time to 10.6 minutes, which is between 4:00 and 4:05 p.m., so segment 5-6 would be completed in 5.0 minutes.
Following this procedure (the path traced through the table), the travel-based route travel time for a trip starting at 3:50 p.m. is calculated as 26 minutes, rather than the 23 minutes produced by the simple method. This travel time result should then be stored with the time at which travel along the route began. Note that both of these methods generate an average expected travel time, so individual drivers will experience at least some variation around this average.

This travel-based method for calculating route travel times responds to the dynamic nature of congestion along a route. Therefore, it is expected to match ground truth travel times more closely during periods when congestion changes quickly. Figure 3.8 summarizes the entire method of calculating travel times, starting with the raw single-loop detector data.
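The walk through Table 3.10 can be sketched as below, under the assumption that each segment's time is read from the 5-minute interval in effect when the vehicle enters that segment. Completing the walk this way sums to 25.5 minutes, which rounds to the roughly 26 minutes reported above.

```python
# Sketch of the travel-based calculation applied to Table 3.10.
# STT[r][j]: travel time (min) of segment j during 5-minute interval r,
# with interval 0 starting at 3:50 p.m.

STT = [
    [1.8, 2.0, 2.2, 4.4, 4.6, 1.8, 2.0, 4.2],  # 3:50 p.m.
    [2.0, 2.2, 2.4, 4.6, 4.8, 2.0, 2.2, 4.4],  # 3:55 p.m.
    [2.2, 2.4, 2.6, 4.8, 5.0, 2.2, 2.4, 4.6],  # 4:00 p.m.
    [2.4, 1.6, 2.8, 5.0, 4.2, 2.4, 2.6, 4.8],  # 4:05 p.m.
    [2.6, 1.8, 2.0, 5.2, 4.4, 2.6, 2.8, 4.9],  # 4:10 p.m.
    [1.8, 2.0, 2.2, 4.4, 4.6, 2.8, 3.0, 5.2],  # 4:15 p.m.
]

def travel_based_time(stt, interval_min=5.0):
    """Sum segment times, picking each from the row in effect at the
    moment the vehicle reaches that segment."""
    elapsed = 0.0
    for seg in range(len(stt[0])):
        row = min(int(elapsed // interval_min), len(stt) - 1)
        elapsed += stt[row][seg]
    return elapsed

simple = sum(STT[0])              # instantaneous method: 23.0 minutes
dynamic = travel_based_time(STT)  # travel-based method: about 25.5 min
```

The gap between the two results (about 2.5 minutes here) is exactly the error the travel-based method is designed to remove when congestion builds during the trip.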

Figure 3.8. Diagram of travel-based route travel time calculation.

3.4.2 Calculation of Free-Flow Speed

The distribution statistics for the TTI depend on measuring travel time relative to an ideal, or free-flow, speed. For urban freeways, the research team uses a constant value of 60 mph for all sections. This is a well-established threshold for measuring congestion on urban freeways. For signalized highways, the situation is more complex because of variation in speed limits and signal-influenced delay, which occurs even at very low volumes. For these sections, researchers applied the 85th percentile speed as the free-flow speed. In all cases, if section speeds are greater than the free-flow speed, then the TTI is set to 1.0; no credit is given for going faster than the free-flow speed.

3.5 Final Data Set for Analysis

As the preceding discussion demonstrates, an array of data sets at various levels of spatial and temporal aggregation has been created. The end result of the processing and fusing is a high-quality preprocessed data set to be used in the analyses. A relatively high level of aggregation is required because reliability is defined over a long period of time, to allow all pertinent factors to exert influence on it. Each observation in the analysis data set represents an individual section for an entire year for each of the daily time slices studied: peak hour, peak period, midday, weekday, and weekend/holiday.
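The free-flow conventions in subsection 3.4.2 amount to a small computation, sketched here with assumed inputs (a section's average speed in mph and a free-flow speed chosen per facility type):

```python
# Hedged sketch of the TTI convention described in subsection 3.4.2:
# TTI = travel time relative to free flow, floored at 1.0 so that no
# credit accrues for traveling faster than the free-flow speed.

FREEWAY_FREE_FLOW_MPH = 60.0  # constant used for urban freeways

def travel_time_index(section_speed_mph, free_flow_mph):
    """TTI from a section's average speed; over a fixed distance,
    travel time is inversely proportional to speed, so
    TTI = free-flow speed / section speed, capped below at 1.0."""
    return max(1.0, free_flow_mph / section_speed_mph)
```

For signalized highways, `free_flow_mph` would instead be the section's 85th percentile speed, per the text.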
Data set characteristics under consideration include the following attributes, which are intended to capture conditions on the study sections for an entire year:
• Reliability metrics
o Mean, standard deviation, median, mode, minimum, and percentiles (10th, 80th, 95th, and 99th) for both travel time and the TTI
o Buffer indices (based on mean and median), planning time index, skew statistic, and misery index
o On-time percentages for thresholds of median plus 10%, median plus 25%, and average speeds of 30 mph, 45 mph, and 50 mph
• Operations characteristics

o Area-wide and section-level service patrol trucks (average number of patrol trucks per day)
o Area-wide and section-level service patrol trucks per mile (average number of patrol trucks per day divided by centerline miles)
o Traffic Incident Management Self-Assessment scores
o Quick clearance law (yes/no)
o Property damage only move-to-shoulder law (yes/no)
o Able to move fatalities without medical examiner (yes/no)
o IRT staff per mile covered
o Number of ramp meters, DMSs, and closed-circuit televisions
• Capacity and volume characteristics
o Start and end times for the peak hour and the peak period
o Calculated and imputed vehicle-miles traveled (VMT)
o Demand-to-capacity and average annual daily traffic (AADT)-to-capacity ratios
o Average of all links on the section
o Highest for all links on the section
o AADT-to-capacity ratios for downstream bottlenecks as segregated by ramp merge area
• Incident characteristics
o Number of incidents (annual)
o Incident rate per 100 million vehicle-miles
o Incident lane-hours lost (annual)
o Incident shoulder-hours lost (annual)
o Mean, standard deviation, and 95th percentile of incident duration
• Work zone characteristics
o Number of work zones (annual)
o Work zone lane-hours lost (annual)
o Work zone shoulder-hours lost (annual)
o Mean, standard deviation, and 95th percentile of work zone duration
• Weather characteristics
o Number of annual hours with precipitation amounts greater than or equal to 0.01 inches, 0.05 inches, 0.10 inches, 0.25 inches, and 0.50 inches
o Number of annual hours with measurable snow
o Number of annual hours with frozen precipitation
o Number of annual hours with fog present

3.6 Data Acquisition and Integration

As described in the previous subsections, several sizable data sets from a variety of sources were archived for this project. To address the challenges of integrating and fusing these diverse data

sets, the STAR Lab DRIVE Net platform is used as a data repository, visualization, and analysis tool. Figure 3.9 shows an interface snapshot of DRIVE Net Version 3.0. DRIVE Net is an online e-science platform for data access, analysis, visualization, and quality control, and it is already home to a great deal of public and private transportation data. In addition to its utility as a data storage and integration tool, DRIVE Net was employed in both analysis and visualization roles at various stages of this project.

DRIVE Net currently houses multiple data sources obtained through various methods of data retrieval, for example, traditional flat file exchange, passive data retrieval, active data retrieval, and direct data archival. A variety of data sources are ingested and archived into the STAR Lab server from WSDOT and third-party data providers through different data acquisition methods, as depicted in Figure 3.10. All of the aforementioned data quality procedures are implemented in the DRIVE Net system, allowing analysts access to a variety of high-quality data sources in an integrated environment. Quality control is performed on data before they are made available on the platform, removing the need for substantial preprocessing work and providing a high level of confidence for researchers and practitioners.

Figure 3.9. DRIVE Net interface with color-coded traffic flow feed from WSDOT.

Figure 3.10. Data acquisition methods for the DRIVE Net system (Wang et al. 2013).

TRB’s second Strategic Highway Research Program (SHRP 2) Reliability Project L38 has released a prepublication, non-edited version of a report that tested SHRP 2's Reliability analytical products at a Washington pilot site. This research project tested and evaluated SHRP 2 Reliability data and analytical products, specifically the products for the L02, L05, L07, L08, and C11 projects.

Other pilots were conducted in Southern California, Minnesota, and Florida.
