Suggested Citation:"Chapter 3 - Methods Evaluation." Transportation Research Board. 2014. Applying GPS Data to Understand Travel Behavior, Volume I: Background, Methods, and Tests. Washington, DC: The National Academies Press. doi: 10.17226/22370.


Chapter 3
Methods Evaluation

Introduction

Chapter 3 summarizes the results from the tests conducted on the methods identified in Chapter 2. This demonstration was conducted through two experiments:

• Experiment A: augmenting person-based GPS HTS data with trip details.
• Experiment B: enriching anonymized smartphone GPS data with socioeconomic and demographic information.

Overview of Experiment A

The goal of Experiment A was to evaluate data fusion methods that can be used in the context of GPS-only household travel surveys. The tests consisted of implementing code for each method, or using implementations from the original authors when available, and using that code to process the raw GPS data into trips. Figure 3-1 shows an overview of the process of turning raw GPS data into processed travel information using the methods tested in Experiment A.

It should be noted that a balanced approach with respect to level of effort was pursued when implementing these methods; in other words, the researchers attempted to perform a comparable amount of calibration and setup across all methods. This ensured a fair evaluation and also benefited methods that were simpler to calibrate and set up; however, it also means that the tests did not necessarily extract the full potential of these methods.

Initial work focused on data preparation and standardization. This work included exporting original GPS data files to comma-separated value (CSV) text files and also locating original raw GPS data files from the ARC and DRCOG surveys. The resulting CSV input file types generated included:

• Raw GPS points – as collected by the field devices;
• Processed GPS points – filtered to exclude data outside the travel date range and noise; and
• Mode segments – one record per unlinked trip (also referred to as an elemental trip) identified in the processed GPS data; these correspond to individual mode segments in a multimodal trip.

Survey data, which were used to calibrate mode and purpose identification models, were converted from their original database structures to a standardized place-based relational schema. In addition to places, this schema included tables for storing household, person, and location data.

Overview of Experiment B

The purpose of Experiment B was to evaluate methods for attaching person- and household-level information to travel patterns observed in GPS-based household survey data or other sources of GPS trace data. As such, the experiment was designed to be as general as possible, with very few assumptions about the data that would be available in the source data set. So, while the experiment included models derived from household travel survey data, these models should be generally applicable to any source of GPS traces.

Figure 3-2 provides an overview of the process developed for Experiment B. The experiment can be broken down into four stages:

• Stage 0: processing of input trip data from Experiment A into person-travel data records.
• Stage 1: development of the primary demographic clusters.
• Stage 2: selection of optimal person-type clusters based on travel attribute similarity.
• Stage 3: development of person-attribute assignment models for a selected set of demographics.

The outputs of Experiment B include open-source computer code that processes the mode segment data from Experiment A into person-travel records and a set of model files that can be applied to the processed travel records using various open-source modeling packages, including WEKA and BIOGEME.
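As a rough illustration of Stage 0 of Experiment B, the sketch below aggregates mode segments into one summary record per person. It is not the project's code: the report does not list the fields of a mode segment or of a person-travel record, so `ModeSegment`, its attributes, and the summary fields are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ModeSegment:
    # Hypothetical fields; the report does not specify the segment schema.
    person_id: str
    mode: str
    duration_min: float
    distance_km: float


def build_person_travel_records(segments):
    """Aggregate mode segments into one summary record per person."""
    totals = defaultdict(lambda: {
        "segments": 0,
        "minutes": 0.0,
        "km": 0.0,
        "minutes_by_mode": defaultdict(float),
    })
    for seg in segments:
        rec = totals[seg.person_id]
        rec["segments"] += 1
        rec["minutes"] += seg.duration_min
        rec["km"] += seg.distance_km
        rec["minutes_by_mode"][seg.mode] += seg.duration_min

    records = {}
    for person_id, rec in totals.items():
        minutes = rec["minutes"]
        records[person_id] = {
            "segments": rec["segments"],
            "total_minutes": minutes,
            "total_km": rec["km"],
            # Share of travel time by mode; guarded against zero total time.
            "mode_share": {
                mode: (t / minutes if minutes > 0 else 0.0)
                for mode, t in rec["minutes_by_mode"].items()
            },
        }
    return records
```

Summary attributes of this kind are one plausible input to a clustering step based on travel attribute similarity; the attributes actually used in the study may differ.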

Figure 3-1. Overall sequence of steps covered in Experiment A. (Figure not reproduced.)

Figure 3-2. Experiment B demographic characterization process. (Figure not reproduced; its labels include Stages 0 through 3, trip records, demographics, land use data, travel pattern data, demographic clusters, the selected demographic cluster, demographic attributes 1 through N, and the PART, nested logit, and various other models. Its legend distinguishes data, dependent variables, independent variables, results, and models.)

The remainder of this chapter is organized as follows. First, it introduces the data sets used to carry out the selected experiments, along with an outline of the implementation and testing approaches used to evaluate the methods. This is followed by discussions of the two experiments, along with the main findings and results. Additional information on the models implemented in Experiment A and Experiment B is included in Appendix D and Appendix E, respectively. Appendix F contains explanations of the various tools used to conduct the experiments, along with script or code instructions and listings, where applicable.

Reference Data and Software Tools

The research team obtained permission from ARC to use the GPS data collected as part of its recently completed household travel survey. Both person and vehicle GPS data were used to test the GPS processing methods. GPS vehicle data were used to test the noise filtering methods because they included values for HDOP and number of satellites, which were not available in the person-based GPS data due to limitations in the wearable GPS logger that was employed. Person-based GPS data were used in all subsequent GPS processing method tests.

Both the vehicle-based and person-based GPS data were extensively reviewed as part of the original HTS effort. The original cleaning process included the review of individual travel days by analysts using custom data processing and visualization tools. Because of this original review, the processed data can be treated as a reliable benchmark against which the tested methods’ outputs can be evaluated. Three types of reference GPS data sets were created: filtered points, trip ends, and mode segments (corresponding to unlinked trips). A subset of the Atlanta diary data was also used to calibrate trip purpose identification methods; the idea here was that these would simulate the data that are typically collected using GPS prompted-recall methods in GPS-only HTSs.

With respect to smartphone data sets collected as part of an HTS, the research team obtained the PaceLogger data set, which covers a subsample of households in Portland surveyed as part of the recent OHAS. This data set was collected using a modified version of the original CycleTracks app and contains data from 308 smartphone users within 256 households. Permission to use and access this data set was obtained from the OHAS subcommittee of the OMSC. In addition to the smartphone GPS points, data from the regular survey were used to test the fit of the models used in the experiments.

The data used to train the demographic characterization models were drawn from the 2008 Chicago HTS, which included 2 travel days for about 40% of the respondents (9,736 total observations). The data used for model estimation were limited to the 2-day sample to reduce the confounding effects of intrapersonal, day-to-day variability on the models. This is discussed further in the Experiment B discussion.

Table 3-1 provides a summary of the reference data used in Experiment A.

Table 3-1. Reference data details and sources used in Experiment A.

Study Name                                   Number of Households   Number of Persons
ARC 2010 Diary                               10,278                 25,810
ARC 2010 Person GPS HTS                      334                    649
ARC 2010 Vehicle GPS HTS                     727                    1,422
OHAS 2009 Portland HTS Diary                 4,799                  11,133
OHAS 2009 Portland HTS Smartphone            256                    307
Chicago 2008 HTS Diary (Total)               10,552                 23,808
Chicago 2008 HTS Diary (2-day w/travel)*     2,395                  5,125

*Excludes anyone with a travel day on Saturday or Sunday, anyone only responding for 1 day, and anyone who did not travel. This sample was used for the analysis, implementation, and testing approach.

To increase the reproducibility of the tests implemented as part of the research project, a decision was made to use, as much as possible, free and open-source software tools for data processing, modeling, and data analysis. Consequently, the tools (and supporting software) selected to implement the algorithms and models in Experiments A and B were:

• R 3.0 (R Core Team 2013) for heuristic methods and for calling fuzzy logic routines in Java;
• BIOGEME 2.2 and BIOSIM (Bierlaire 2003) for multinomial and nested logit choice modeling;
• WEKA 3 data-mining tool set (Hall et al. 2009) for neural networks, classifier trees, and clustering;
• PostgreSQL; and
• C++ for developing Experiment B data processing scripts.

Table 3-2 shows the assignment of the three programming packages to the experiment tasks.

Table 3-2. Programming packages used in the processing of GPS data.

Procedure                                  Tool components marked (of R, WEKA, BIOGEME)
Cleaning raw GPS data                      X
Identifying trips and mode transitions     X
Identifying travel mode                    X X X
Identifying trip purpose                   X X
Inferring demographics                     X X

Scripts and code sets created for many of these experiments are provided in Appendix F, along with some basic instructions for use. The data associated with these experiments cannot be made available to the public given privacy concerns for the original participants in the survey efforts under which the GPS data were collected. This is mostly applicable to the original raw GPS coordinate data that were used as input for the data cleaning and trip identification methods.

Experiment A: Basic GPS Data Processing

This section covers the testing of data processing methods that are used to convert raw GPS data into clean points, trips, and mode segments. The application of these methods constitutes the first step necessary for turning raw trace data into transportation behavior information. The methods presented in this section correspond to the boxes with diagonal hashing, as presented in Figure 3-1.

GPS Data Cleaning Methods

GPS data cleaning or noise filtering methods seek to identify points that do not indicate real participant movement and that can hence be removed from the data without loss of travel information. They are typically necessary to improve visualization of the data and also to improve the accuracy of trip identification methods. Furthermore, by removing points that are not indicative of actual movement, they have the added benefit of making the data more manageable.

Original, raw, vehicle-based GPS point files (each one representing all the points collected for an instrumented vehicle) were used as input into the data cleaning methods. The devices used were continuously powered by an internal rechargeable battery, which was also charged by the vehicle’s cigarette lighter connector whenever the car was driven. The logging frequency of these devices was set to one point per second, and only points with instantaneous speeds above 1.9 mph (3 km/h) were recorded.

The resulting filtered points from the application of each method were compared against the processed and filtered points in the original GPS data deliverable, also referred to as the reference data, which were reviewed by GeoStats analysts. The null hypothesis here was that each point’s final filtered state would be the same in both data sets. Errors were categorized as belonging to one of the following groups:

• Type I error: Point is not blocked in method when it is blocked in the reference data.
• Type II error: Point is blocked in method when it is not blocked in the reference data.

The noise filtering methods were run against 100 randomly selected raw GPS point vehicle-based data files from the ARC data set. These files contained a total of 2,446,984 GPS points. Each point had a flag appended to it that indicated whether it was considered to be noise by the method. These flags were then used to compare the method’s results with the reference, filtered GPS data. Error percentages were calculated by taking the number of errors in each category and dividing by the total number of points. Table 3-3 summarizes the error results of the tested data cleaning methods along with insights obtained while implementing the method.

The first evaluated method (Stopher, Jiang, and Fitzgerald 2005) was originally developed for use in vehicle-based surveys, while the other methods were developed to be applied to person-based GPS data. More specifically, the method by Schüssler and Axhausen (2008) featured additional steps that were added to deal with limitations of the originally used device (i.e., it did not capture instantaneous speeds, HDOP, or number of satellites). The results indicated that simple rule-based methods based on point quality and speed data available from the GPS device were the most effective.

While all three methods had similarly small counts of Type II errors (point was filtered by method, but it was not filtered in the reference data), two of them displayed higher rates of Type I errors (point was not blocked in method, but it was blocked in the reference data).
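The point-by-point comparison against the reference data described in this section can be sketched in a few lines. This is a minimal illustration, not the study's script: the function name and the representation of flags as parallel lists of booleans (True meaning "blocked as noise") are assumptions made for this example.

```python
def noise_filter_error_rates(method_flags, reference_flags):
    """Return (type_i_pct, type_ii_pct) for per-point noise flags.

    Each argument is a sequence of booleans, one per GPS point, where
    True means the point was blocked as noise (assumed representation).
    """
    if len(method_flags) != len(reference_flags):
        raise ValueError("both flag lists must cover the same points")
    total = len(method_flags)
    if total == 0:
        return 0.0, 0.0

    pairs = list(zip(method_flags, reference_flags))
    # Type I: not blocked by the method but blocked in the reference data.
    type_i = sum(1 for m, r in pairs if r and not m)
    # Type II: blocked by the method but not blocked in the reference data.
    type_ii = sum(1 for m, r in pairs if m and not r)

    # Both percentages use the total number of points as the denominator.
    return 100.0 * type_i / total, 100.0 * type_ii / total
```

Because both error types share the same denominator, the two percentages can be compared directly across methods that were run on the same set of points.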

Higher rates of Type I errors are to be expected, since the methods are much more likely to err on the side of caution (and thus fail to block some points) than to block valid points. The Type I error rate for the Stopher, Jiang, and Fitzgerald method was found to be the lowest of the three methods, with a very small number of points being incorrectly flagged as noise.

Table 3-3. Data cleaning methods tested and results from Experiment A.

Source reference: Stopher, Jiang, and Fitzgerald (2005)
Implementation findings: First filtered out all points with fewer than three satellites in view and HDOP equal to or greater than 5. Then removed points that showed no movement (speed equal to zero, less than 15 m of movement, and heading also being zero or unchanged). Point movements were calculated using the great circle distance according to the Vincenty (sphere) method (Vincenty 1975).
Type I error: 0.00%
Type II error: 8.89%

Source reference: Lawson, Chen, and Gong (2010)
Implementation findings: Remove points based on HDOP, number of satellites, zero speed or heading, and presence of jumps. The thresholds proposed in the paper for considering points to be of poor quality were used; these were: HDOP > 5, number of satellites < 3, and speed < 3 m/s (6.7 mph). The paper and its sources did not contain details on how the jump-detection procedure was implemented, and this information could not be obtained. As a result, the stop-flag procedure proposed by Chung and Shalaby (2005), which consisted of discarding points that showed less than 0.00005 decimal degrees of movement, was used.
Type I error: 13.79%
Type II error: 6.90%

Source reference: Schüssler and Axhausen (2008)
Implementation findings: Points are removed if their altitude is not within the study area. They are then smoothed and filtered by speed and acceleration. Since the GPS data used included instantaneous speed data from the actual devices (which are more accurate), the implementation did not calculate speeds as specified in the paper.
Type I error: 14.15%
Type II error: 7.25%

Figure 3-3 illustrates the results of the three tested methods using a sample of the points from one of the processed files, which contained activity in downtown Atlanta. The maps show points that were classified as noise by the various methods, with the shade of gray varying based on each point’s instantaneous speed (converted to miles per hour). The maps indicate that the Stopher, Jiang, and Fitzgerald method only identified a small number of points as noise in this area; the maps also illustrate how the Lawson, Chen, and Gong method tended to filter out points around intersections. These were likely points with speeds below the prescribed 6.7 mph (3 m/s) but above the original data collection’s minimum speed setting of 1.9 mph. Finally, the maps show that the Schüssler and Axhausen method did block some valid traces, but also captured some of the same noise identified by Lawson, Chen, and Gong.

Figure 3-3. Sample of points from downtown Atlanta identified as noise by tested methods. (Figure not reproduced; panels: Stopher, Jiang, and Fitzgerald (2005); Lawson, Chen, and Gong (2010); Schüssler and Axhausen (2008).)

GPS Data Cleaning Findings

Based on the test results, it is suggested that practitioners select devices (or data sources) that can provide instantaneous speed, as well as HDOP and number of satellites, given the importance that these data elements have in the process of filtering noise out of raw GPS trace data. Another finding from this effort was that further analyst review may be necessary after applying automated filtering to raw GPS data to deal with points that should have been identified as noise but were ignored by the methods (Type I error).

It is also worth noting that the Schüssler and Axhausen method arrived at sound results without relying on point quality indicators like HDOP and number of satellites. However, this method was much more complicated to implement and took significantly longer to process than the other methods (it was at least 10 times slower). In the end, the Lawson, Chen, and Gong method clearly performed the best, which shows the importance of having access to the number of satellites, HDOP, and instantaneous speed in the raw GPS data.

Trip Identification Methods

Trip identification methods can take clean GPS points as input and generate a list of trips as output. These methods are not able to detect mode transitions within multimodal trips and are hence more appropriate for use with vehicle-based GPS data. They can also be used to generate trips whose point sequences can be further processed into separate mode segments using mode transition detection methods (reviewed in the next section).

The test consisted of comparing trips identified in the reference data with those generated by the evaluated methods. These methods are typically rule-based, and the criteria for these rules will require a simple calibration effort, which will heavily depend on how the data were collected (i.e., logging rules) and the performance of the GPS device used. As part of this test, the default values found in the original work were used to configure the methods. Input data for this test consisted of noise-filtered points in the original ARC person-based GPS data deliverable. These only included points that were used by delivered trips, which were limited by the original household’s assigned travel date range. The devices used to collect the original GPS data were configured to record a point every 3 s, and only points with instantaneous speeds above 1 mph were recorded.

Success of the method was measured by comparing the detected trips against the trips in the reference data set. The null hypothesis in this test was that the same set of trips was identified in both data sets. Trips were deemed to match if their end locations were approximately the same between the reference and processed data sets, where “approximately the same” means the end times in the two data sets were within 15 min and the start and end locations were within 75 m. Based on this, the following errors were computed:

• Type I error: Trip is not found in method but is found in reference data.
• Type II error: Trip is found in method but is not found in reference data.

The trip identification methods were run against 300 randomly selected processed GPS files from the ARC GPS data set, which included 336 individual linked trips. The percentages for Type I errors were calculated by taking the number of trips that did not match the reference data and dividing that by the total number of generated trips. The percentages for Type II errors were calculated by taking the number of reference trips that did not match any generated trips and dividing that by the total number of trips in the reference data. Table 3-4 presents findings from the process of implementing the tested identification methods along with the rates of the two detected errors.

Both methods ended up generating fewer trips than what was contained in the reference data set, and this was reflected in the higher rates of Type II errors (trip is found in method but not in reference data). The simple approach proposed by Wolf, Guensler, and Bachman showed a slightly higher Type I (trip is found in reference data but not in tested method) error rate than the one proposed by Schüssler and Axhausen. This result was expected given that the method has no mechanism for detecting short stops (i.e., those which last less than the 120-s threshold). However, this simpler approach ended up with a lower Type II error rate, which indicates that it is less likely to erroneously consider short stops as valid trip ends.

The higher Type II error rate found for Schüssler and Axhausen was caused by the fact that the method consistently failed to find trip ends, arriving at a smaller number of trips than the method proposed by Wolf, Guensler, and Bachman. An examination of the characteristics of the trips identified by the two tested methods also revealed that the trips identified by the Schüssler and Axhausen method tended to be longer and to have lower average speeds, which is consistent with failing to identify stops. It is possible that this was a side effect of the method’s original design, which was customized to a specific data collection device and may not be well suited for processing data that were already filtered for noise points.
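The matching rule and the two percentages described for the trip identification test can be illustrated with a short sketch. It is not the code used in the study: the `Trip` fields, the function names, and the use of the haversine formula for distances are assumptions made for illustration, and the 15 min and 75 m tolerances are treated here as inclusive. The two rates are named after their denominators rather than after an error type.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Trip:
    # Hypothetical representation of a trip for this example.
    start: tuple        # (latitude, longitude) in decimal degrees
    end: tuple          # (latitude, longitude) in decimal degrees
    end_time: datetime


def haversine_m(a, b):
    """Great-circle distance in meters on a spherical Earth (assumed model)."""
    radius_m = 6371000.0
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(h))


def trips_match(a, b, max_dist_m=75.0, max_dt=timedelta(minutes=15)):
    """True if end times are within 15 min and both ends are within 75 m."""
    return (abs(a.end_time - b.end_time) <= max_dt
            and haversine_m(a.start, b.start) <= max_dist_m
            and haversine_m(a.end, b.end) <= max_dist_m)


def unmatched_rates(generated, reference):
    """Return (pct of generated trips unmatched, pct of reference trips unmatched)."""
    gen_unmatched = sum(
        1 for g in generated if not any(trips_match(g, r) for r in reference))
    ref_unmatched = sum(
        1 for r in reference if not any(trips_match(g, r) for g in generated))
    gen_rate = 100.0 * gen_unmatched / len(generated) if generated else 0.0
    ref_rate = 100.0 * ref_unmatched / len(reference) if reference else 0.0
    return gen_rate, ref_rate
```

A method that misses stops produces fewer, longer trips, so under this kind of comparison it would be expected to leave a larger share of reference trips unmatched.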

Table 3-4. Trip identification methods tested and results from Experiment A.

| Source References | Implementation Findings | Type I Error | Type II Error |
|---|---|---|---|
| Wolf, Guensler, and Bachman (2001) | This method consists of calculating point dwell time based on subsequent time steps. Trip ends are then identified based on a minimum delay threshold of 120 s. The method was straightforward to implement. | 4.25% | 10.27% |
| Schüssler and Axhausen (2008) | The exact sequence of steps included in the paper was unclear. The test implementation first did a pass over the data to determine the dwell time and activity detection based on point density and time. It also included a short series of points with the activity. Then a second pass was made to determine whether any of the remaining point sequences had a density ratio higher than 2/3. The rest of the data was split into trips. Even though several rules were applied to the data, the bulk of the detection typically occurred based on the first point density rule. The dwell time activity detection rarely happened, possibly due to the high threshold specified in the paper. There were also only a few instances where the second pass for activity detection found any activities. | 3.09% | 35.92% |

Trip Identification Findings

Based on the results of this test, it is suggested that trips found using automated methods be reviewed for potentially missed short stops. The test also made it clear that a very simple approach like the one proposed by Wolf, Guensler, and Bachman (2001) can generate a reasonable first estimate of trip ends. The simplicity of the method also makes it computationally efficient and therefore suitable for real-time processing.

Mode Transition Identification Methods

In instances where multimodal travel was captured using GPS, it is necessary to further parse identified trips into mode segments (also referred to as unlinked or elemental trips). Each of these segments features a consistent travel mode. For example, a typical single-leg transit trip will consist of a sequence of three mode segments: walk → bus → walk. The methods presented in this section take as inputs the points belonging to trips and find mode transitions within them.

The null hypothesis in this test was that the mode transition points identified by the method would match the reference data. The test involved passing the filtered GPS points from the reference data (from the person-based GPS subsample of the ARC HTS data set) to the implemented algorithm and then comparing the resulting mode segments against the unlinked trips in the reference data. Unlinked trips were deemed to match if their end locations in the reference and processed data sets were within 75 m of each other and their end times were within 15 min of each other. The following two errors were detected from this check:

• Type I error: Mode segment end point is not found in method but is found in reference data.
• Type II error: Mode segment end point is found in method but is not found in reference data.

Table 3-5 presents observations based on the exercise of programming the tested mode transition identification methods, along with the two detected errors. The methods were applied to the filtered GPS points from 300 randomly selected person-based GPS files in the reference data (the same set used in the trip identification test) and their corresponding mode segments.

The tests revealed that the first method (Tsui and Shalaby 2006; Schüssler and Axhausen 2008) clearly performed better, with lower Type I (mode transition is not detected by method) and Type II (mode transition end point is found in method but not in reference) error rates. The difference was most pronounced for Type II errors, where the second method's rate was more than three times that of the first (Table 3-5).

Examining the distribution of travel times of the identified mode segments in Figure 3-4 reveals that the first method tended to identify shorter mode segments, which are likely to be short walk segments. These short segments were attached to longer, and likely motorized, mode segments in the second method. This is consistent with the fact that the second method found many fewer (a 10% decrease) mode segments than what was present in the reference data.
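The matching rule used in the mode transition test (end locations within 75 m and end times within 15 min) can be sketched in a few lines. This is a minimal illustration and not the research team's code: the tuple layout, the function names, the use of a haversine distance, and the many-to-many matching (any end point may match any other) are all assumptions made for the example.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    r = 6_371_000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def endpoints_match(ref, est, max_dist_m=75.0, max_dt_s=15 * 60):
    """ref and est are hypothetical (lat, lon, end_time_s) tuples."""
    close_in_time = abs(ref[2] - est[2]) <= max_dt_s
    close_in_space = haversine_m(ref[0], ref[1], est[0], est[1]) <= max_dist_m
    return close_in_time and close_in_space

def transition_errors(reference, estimated):
    """Return (type_i, type_ii) counts.

    Type I: reference end point with no matching end point from the method.
    Type II: method end point with no matching end point in the reference.
    """
    type_i = sum(
        1 for r in reference if not any(endpoints_match(r, e) for e in estimated)
    )
    type_ii = sum(
        1 for e in estimated if not any(endpoints_match(r, e) for r in reference)
    )
    return type_i, type_ii

# Illustrative data only: two reference end points, two end points from a method.
reference = [(33.7490, -84.3880, 1000), (33.7600, -84.3900, 5000)]
estimated = [(33.7491, -84.3880, 1200), (33.8000, -84.4000, 9000)]
counts = transition_errors(reference, estimated)
```

Turning these counts into percentages requires a choice of denominator (for example, the number of reference end points for Type I and the number of method end points for Type II); the text does not state which totals were used.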

Table 3-5. Mode transition identification methods tested and results.

| Source References | Implementation Findings | Type I Error | Type II Error |
|---|---|---|---|
| Tsui and Shalaby (2006) and Schüssler and Axhausen (2008) | The papers were unclear about whether to discard a start-of-walk (SOW) point if another SOW point is detected before an end-of-walk (EOW) or end-of-gap (EOG) point is detected. They were also unclear about whether to discard an SOW point if an EOG point fails to be an EOW point but could be another SOW. The tested implementation keeps the first SOW in both cases. If no SOW points are detected, the entire file is considered non-walk. The initial implementation tended to keep data together in segments, even though long dwell times were present in its points. To account for this, the implementation added a step that ended a mode segment if a dwell time of at least 120 s was found. | 16.58% | 9.59% |
| Oliveira et al. (2011) | This method produces elemental trips (separate trips per mode, or unlinked trips), so it was treated as a mode transition identification method. The implemented logic did not create places as per the paper but mode segments instead, so that the results could be compared with the other tested method. Since places bound mode segments, this was a straightforward change. An error was found in the paper: it multiplies the segment's average speed by 1.96 times the point speed standard deviation, whereas the product of 1.96 and the standard deviation should be added to the segment's average speed. Finally, since the input data used was already filtered of noise, the original paper's noise filtering logic was disabled for this test. | 24.76% | 31.96% |

Figure 3-4. Travel time distribution of identified mode segments. Panels: Tsui and Shalaby (2006) and Schüssler and Axhausen (2008); Oliveira et al. (2011).
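Both Table 3-4 (Wolf, Guensler, and Bachman) and the step added to the first method in Table 3-5 rely on a 120 s dwell-time threshold. The sketch below shows one way such a rule might be coded. It assumes that dwell time is approximated by the gap between successive point timestamps (as when stationary points have been filtered out), and the function name and index-pair output are hypothetical.

```python
def split_on_dwell(timestamps, min_dwell_s=120):
    """Split time-ordered point timestamps (in seconds) into segments.

    A segment ends at point i when the gap to point i + 1 is at least
    min_dwell_s. Returns inclusive (start_index, end_index) pairs.
    """
    if not timestamps:
        return []
    segments = []
    start = 0
    for i in range(len(timestamps) - 1):
        if timestamps[i + 1] - timestamps[i] >= min_dwell_s:
            segments.append((start, i))
            start = i + 1
    # Close the final segment at the last point.
    segments.append((start, len(timestamps) - 1))
    return segments

# Points logged every 3 s, with two gaps longer than 120 s.
example = split_on_dwell([0, 3, 6, 200, 203, 206, 400])
```

In this example the gaps of 194 s after the third and sixth points each end a segment, so three segments are returned.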

On the other hand, the first method also identified a higher number (a 3.8% increase) of mode segments than what was reported in the reference data.

Mode Transition Identification Findings

The method originally proposed by Tsui and Shalaby (2006) and later refined by Schüssler and Axhausen (2008) produced better results than the method proposed by Oliveira et al. (2011), with error rates that were 1.5 to 3 times smaller. Furthermore, the first method was also better at capturing short nonmotorized mode segments, which typically occur before and after motorized travel. It should be noted, however, that the first method ended up identifying more mode segments than what was present in the reference data.

Experiment A: Classifier Data Fusion Methods

This test covered the evaluation of classifier methods in the context of travel mode and trip purpose identification. Three types of methods were evaluated: heuristic, probabilistic, and AI. The input data for the tests consisted of subsets of the unlinked trips present in the reference data set, which were processed using the evaluated methods. Some of the methods required calibration (or machine learning), and in these cases the data set was split into calibration and validation groups. The methods presented in this section correspond to the boxes with horizontal hatching presented in Figure 3-1.

Travel Mode Identification Methods

To evaluate the performance of the travel mode identification methods selected from the literature, a set of unlinked GPS trips from the reference data was obtained. The set was constructed so as to have the same number of records per evaluated travel mode. This set was then further divided into calibration and validation records. The first contained 36 records for each evaluated travel mode, while the second included 18 records per mode. The travel modes used in the tests were walk, bicycle, auto, bus, and heavy rail. Initially, these records had no attributes (other than an identifier), but various attributes were computed from the GPS points that the trips covered, according to the needs of each method.

The null hypothesis for the tests was that the mode selected by the method matched the reference data. Based on this, the following errors were computed:

• Type I error (detected wrongly): Mode classified does not match reference data.
• Type II error (failed to detect): Reference data mode does not match mode classified.

Note that Type I and Type II errors are paired for this hypothesis (picking a wrong answer implies that you failed to pick the right answer), but it is still interesting to treat them separately to see what modes are frequently overused (Type I) and underused (Type II). Table 3-6 identifies the tested methods and presents findings derived from the implementation of the methods.

The validation step's goal was to estimate the reliability of the developed process (or model) and to document its performance. Reliability of the calibrated models was assessed by applying them to the validation portion of the data sets and then comparing predicted purposes to actual respondent choices. Classification errors were tabulated as a function of the actual choices selected by respondents, and the distribution of imputed trip purposes was compared to that of the validation data sets.

A confusion matrix, also referred to as a prediction-success table in travel forecasting, was constructed for each method of application. This is a matrix that shows actual choices as rows and modeled outcomes as columns; correct classifications appear on the matrix's diagonal. Within the context of a confusion matrix, Type I errors are the sum of each column without the diagonal value, and likewise the Type II errors are the sum of each row without the diagonal value. It should be noted that the mode-specific error percentages in these matrices are not supposed to add up to 100%, because they are generated using different totals (i.e., the horizontal or vertical sum of outcomes for a given mode, not the sum of outcomes across all modes). Type I error rates are computed with respect to column totals, while Type II error rates are calculated using row totals.

Table 3-7, Table 3-8, Table 3-9, and Table 3-10 present the confusion matrices for the four tested travel mode identification methods. Table 3-11 and Table 3-12 summarize the Type I and Type II errors for the tested travel mode identification methods. They show the error rates by mode across rows and tested method across columns.

The heuristics-based method worked best with walk and bicycle trips, but performed poorly with bus trips. It was also not effective at differentiating between auto and bus modes, and failed to classify most bus mode segments as such. Another limitation of this approach is that it may result in some mode segments not being assigned a travel mode (11 in the case of this test). When rail lines travel along the same path as roads (for instance, a metro line that runs between the two directions of a highway), this method has difficulty assigning rail because, to qualify for rail, a trip cannot be on the road network. This is the reason why a small buffer of 50 ft was used. The bicycle travel mode was also disqualified if the household reported that it had no bikes, even if the trip matched a biking signature.
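The error definitions used in this section (Type I from column totals, Type II from row totals of a confusion matrix with reference classes as rows) can be sketched as follows. The function name is hypothetical, the treatment of empty rows or columns as 0.0 is an assumption, and the counts are the neural network results reported in Table 3-10.

```python
def error_rates(matrix):
    """Per-class Type I and Type II error fractions for a square matrix.

    matrix[i][j] is the number of records with reference class i that the
    method assigned to class j.
    """
    n = len(matrix)
    type_i, type_ii = [], []
    for k in range(n):
        col_total = sum(matrix[i][k] for i in range(n))
        row_total = sum(matrix[k])
        diag = matrix[k][k]
        # Type I: wrongly assigned to class k, relative to all assigned to k.
        type_i.append((col_total - diag) / col_total if col_total else 0.0)
        # Type II: class k records assigned elsewhere, relative to all class k.
        type_ii.append((row_total - diag) / row_total if row_total else 0.0)
    return type_i, type_ii

# Rows and columns: walk, bicycle, auto, bus, heavy rail (Table 3-10).
neural_net = [
    [17, 0, 1, 0, 0],
    [0, 17, 1, 0, 0],
    [0, 0, 10, 5, 3],
    [0, 2, 1, 14, 1],
    [1, 0, 0, 1, 16],
]
type_i, type_ii = error_rates(neural_net)
accuracy = sum(neural_net[k][k] for k in range(5)) / sum(map(sum, neural_net))
```

Rounded to whole percentages, the auto column gives a Type I rate of 23% and the auto row a Type II rate of 44%, and the overall share of correct classifications is 82%, which agrees with the values printed for Table 3-10.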

Table 3-6. Travel mode identification methods.

| Method Type | Source References | Implementation Findings |
|---|---|---|
| Heuristics | Stopher, Clifford, and Zhang (2007) | This method needed baseline statistics for each mode, so the 180 training trips were used. The implementation used a 50-ft buffer around road, rail, and bus stops, and to count as being on the road or rail networks, over 50% of the points in the trip had to be within that buffer. For bus mode, both start and end of trip had to be within 50 ft of a bus stop, but the path of the trip was not verified. The 95th percentile speeds in the training data were compared against the 85th percentile speeds in the test data. This was done to account for underestimation of higher speeds in the training samples, which was found in early applications of the method. This method can return N/A results for cases where the mode could not be classified. |
| Probabilistic | Oliveira et al. (2006) | First, the model in the original paper had to be changed since no accelerometer-based physical activity data were available in the test data set. The first model specification included alternative specific constants and betas for average speed and standard deviation of acceleration. Although this original model's coefficients could be estimated, the resulting model performed very poorly (adjusted rho-square < 0), so a new specification was created that used dummy variables for low, middle, and high speed and acceleration levels by mode in addition to alternative specific constants. This final specification performed better (adjusted rho-square = 0.537), but because of the small number of observations, the majority of its coefficients did not pass the t-test for significance. The final model specification is listed in Appendix D. |
| Fuzzy logic | Tsui and Shalaby (2006) and Schüssler and Axhausen (2008) | The low category for 95th percentile acceleration (m/s²) was (0, 0, 0.5, 0.6) in the papers, but the data used had negative 95th percentile acceleration values (where the person was mostly decelerating for the whole mode segment), so -9999 was used to cover these cases. The fuzzy logic gives a score of between 0 and 1 to each mode. If there was a tie for the greatest value, one of the tied modes was picked at random. (The papers did not specify a tiebreaker.) The random seed was set to 1 at the beginning of the process so that the same random values were chosen each time. |
| Neural networks | Gonzalez et al. (2008) | Given that the stopped time is defined as "a certain threshold" in the paper, an assumption was made to consider time spent at 5 mph or less to be stopped time. Since each point was 3 s apart, two successive points must be at or below 5 mph to count 3 s toward the stopped time. Furthermore, HDOP and percent cell-ID fixes were not available. (All points were captured by GPS.) The paper did not specify how to determine stops within a trip; since stops are used to track bus stops, a minimum of 20 s of dwell time was used to identify stops. The test used a learning rate of 0.1 and 300 iterations as given in the final results of the paper, rather than computing which thresholds were best as the paper did. The researchers also did not make use of critical points as suggested in the original paper. |

Table 3-7. Heuristic confusion matrix (55/90 = 61% correct, 12% indeterminate).

| Ref\Method | Walk | Bicycle | Auto | Bus | Heavy Rail | N/A | Type II |
|---|---|---|---|---|---|---|---|
| Walk | 17 | 0 | 1 | 0 | 0 | 0 | 1 (6%) |
| Bicycle | 0 | 10 | 3 | 0 | 0 | 5 | 8 (44%) |
| Auto | 0 | 0 | 17 | 0 | 0 | 1 | 1 (6%) |
| Bus | 0 | 0 | 16 | 2 | 0 | 0 | 16 (89%) |
| Heavy rail | 1 | 0 | 3 | 0 | 9 | 5 | 9 (50%) |
| N/A | 0 | 0 | 0 | 0 | 0 | 0 | – |
| Type I | 1 (6%) | 0 (0%) | 23 (58%) | 0 (0%) | 0 (0%) | 11 (100%) | – |
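The fuzzy logic notes in Table 3-6 mention a trapezoidal category written as (0, 0, 0.5, 0.6), an extension to -9999 for negative accelerations, and a seeded random tiebreaker. The following is a minimal sketch of those three ideas only. How exactly -9999 was substituted is not stated in the source, so replacing both lower corners is an assumption, as are the function names and the example scores.

```python
import random

def trapezoid(x, a, b, c, d):
    """Membership of x in a trapezoidal fuzzy set with corners a <= b <= c <= d."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge

def pick_mode(scores, rng):
    """Return the highest-scoring mode; ties are broken at random with rng."""
    best = max(scores.values())
    tied = sorted(mode for mode, score in scores.items() if score == best)
    return rng.choice(tied)

# Assumed reading of the -9999 adjustment: both lower corners are replaced so
# that negative 95th percentile accelerations still count as fully "low".
LOW_ACCEL = (-9999.0, -9999.0, 0.5, 0.6)

rng = random.Random(1)  # fixed seed so repeated runs pick the same winners
winner = pick_mode({"walk": 0.2, "auto": 0.9, "bus": 0.9}, rng)
```

Sorting the tied modes before drawing keeps the result reproducible regardless of dictionary ordering, which is the purpose the fixed seed serves in the tested implementation.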

Table 3-8. Probabilistic confusion matrix (56/90 = 62% correct).

| Ref\Method | Walk | Bicycle | Auto | Bus | Heavy Rail | N/A | Type II |
|---|---|---|---|---|---|---|---|
| Walk | 14 | 4 | 0 | 0 | 0 | – | 4 (22%) |
| Bicycle | 2 | 16 | 0 | 0 | 0 | – | 2 (11%) |
| Auto | 0 | 0 | 9 | 3 | 6 | – | 9 (50%) |
| Bus | 0 | 1 | 7 | 7 | 3 | – | 11 (61%) |
| Heavy rail | 1 | 0 | 5 | 2 | 10 | – | 8 (44%) |
| N/A | – | – | – | – | – | – | – |
| Type I | 3 (18%) | 5 (24%) | 12 (57%) | 5 (42%) | 9 (47%) | – | – |

Table 3-9. Fuzzy logic confusion matrix (58/90 = 64% correct).

| Ref\Method | Walk | Bicycle | Auto | Bus | Heavy Rail | N/A | Type II |
|---|---|---|---|---|---|---|---|
| Walk | 17 | 1 | 0 | 0 | 0 | – | 1 (6%) |
| Bicycle | 1 | 16 | 0 | 1 | 0 | – | 2 (11%) |
| Auto | 0 | 0 | 18 | 0 | 0 | – | 0 (0%) |
| Bus | 0 | 0 | 14 | 4 | 0 | – | 14 (78%) |
| Heavy rail | 1 | 0 | 13 | 1 | 3 | – | 15 (83%) |
| N/A | – | – | – | – | – | – | – |
| Type I | 2 (11%) | 1 (6%) | 27 (60%) | 2 (33%) | 0 (0%) | – | – |

Table 3-10. Neural net confusion matrix (74/90 = 82% correct).

| Ref\Method | Walk | Bicycle | Auto | Bus | Heavy Rail | N/A | Type II |
|---|---|---|---|---|---|---|---|
| Walk | 17 | 0 | 1 | 0 | 0 | – | 1 (6%) |
| Bicycle | 0 | 17 | 1 | 0 | 0 | – | 1 (6%) |
| Auto | 0 | 0 | 10 | 5 | 3 | – | 8 (44%) |
| Bus | 0 | 2 | 1 | 14 | 1 | – | 4 (22%) |
| Heavy rail | 1 | 0 | 0 | 1 | 16 | – | 2 (11%) |
| N/A | – | – | – | – | – | – | – |
| Type I | 1 (6%) | 2 (11%) | 3 (23%) | 6 (30%) | 4 (20%) | – | – |

Table 3-11. Mode identification Type I error rates by travel mode and method.

| Ref\Method | Heuristics | Probabilistic | Fuzzy Logic | Neural Network |
|---|---|---|---|---|
| Walk | 6% | 18% | 11% | 6% |
| Bicycle | 0% | 24% | 6% | 11% |
| Auto | 58% | 57% | 60% | 23% |
| Bus | 0% | 42% | 33% | 30% |
| Heavy rail | 0% | 47% | 0% | 20% |
| N/A | 100% | – | – | – |

Table 3-12. Mode identification Type II error rates by travel mode and method.

| Ref\Method | Heuristics | Probabilistic | Fuzzy Logic | Neural Network |
|---|---|---|---|---|
| Walk | 6% | 22% | 6% | 6% |
| Bicycle | 44% | 11% | 11% | 6% |
| Auto | 6% | 50% | 0% | 44% |
| Bus | 89% | 61% | 78% | 22% |
| Heavy rail | 50% | 44% | 83% | 11% |
| N/A | – | – | – | – |
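The stopped-time attribute described for the neural network method in Table 3-6 (points 3 s apart, with two successive points at or below 5 mph counting 3 s) can be written as a short function. This is a sketch of that counting rule only; the function and parameter names are hypothetical, and speeds are assumed to be already expressed in mph.

```python
def stopped_time_s(speeds_mph, interval_s=3, threshold_mph=5.0):
    """Total stopped time in seconds for a time-ordered list of point speeds.

    Each pair of successive points that are both at or below threshold_mph
    contributes interval_s seconds.
    """
    return sum(
        interval_s
        for prev, cur in zip(speeds_mph, speeds_mph[1:])
        if prev <= threshold_mph and cur <= threshold_mph
    )

# Three qualifying pairs: (0, 2), (2, 4), and (3, 5).
example_stopped = stopped_time_s([0, 2, 4, 10, 3, 5, 20])
```

A single slow point surrounded by faster ones contributes nothing under this rule, which matches the requirement that two successive points be at or below the threshold.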

The final probabilistic model worked well for nonmotorized modes but had a difficult time discerning between motorized modes, particularly bus and auto. This was likely due to the limited attributes used as independent variables (i.e., all based on speed and acceleration).

As with the probabilistic approach, the fuzzy logic method worked best with nonmotorized modes and in making the distinction between them and motorized alternatives. But it also struggled when selecting between motorized alternatives.

The neural network method worked the best across all alternative modes. As with the other methods, its main challenge was differentiating between auto and bus modes, but even in these cases it fared better than all other methods. A key difference here was that the network calibration data included information about time spent at or below 5 mph, which may have helped with differentiating between bus and auto modes. Interestingly, the neural net was the only method that had a nearly even overuse and underuse bias toward bus mode (the other methods heavily under-selected it).

As observed in most of the papers referenced here, detecting walk and bicycle modes was relatively easy for all methods.

The fuzzy logic thresholds were taken as written in the papers, rather than computing them as the paper specified. It is possible that this method would have performed better if the thresholds were calibrated using the 180 training cases that the other methods used.

Mode Identification Findings

Based on these test results, the neural network method should be employed whenever calibration data are available, given its lower error rates. If no calibration data are available, then the next best approach is to use the fuzzy logic method, which performed reasonably well with no calibration. In cases where a calibration data set is not available to train a neural network and another method is selected, it may be necessary to perform additional review of motorized mode selections to properly classify them between competing choices. This additional review may be automated through the use of GIS transit infrastructure data, which helped lower the Type I error rate of the heuristics method for bus and heavy rail modes.

Trip Purpose Identification Methods

The research team tested two modeling techniques for identifying trip purpose: discrete choice modeling, using nested multinomial logit (NMNL) models, and decision trees (Griffin and Huang 2005). Both decision trees and NMNL methods can be calibrated using revealed trip purpose responses from existing HTS data and can then be applied to identify trip purpose for GPS-derived data or GPS-like data (i.e., containing only basic trip attributes). Decision trees have the benefit of graphically organizing the variables that go into trip purpose selection and, in turn, can be used to help direct the development of a model specification for the discrete choice method. For the purpose of this research effort, the team used the WEKA (http://www.cs.waikato.ac.nz/ml/weka/) data mining tool for estimating decision trees using the C4.5 method.

Probabilistic Method for Identifying Trip Purposes

A nested logit model structure was used for this test in order to logically group choices according to aggregate purposes such as at home, at work, nonwork, university or school, airport, and loop trip. At the same time, the participants were classified as belonging to one of the eight life-cycle categories listed in Table 3-13.

The BIOGEME modeling tool was used to calibrate these nested logit models. Choice simulations were generated using the associated BIOSIM simulation tool to perform model validation. Although BIOGEME works in Microsoft Windows, it has known memory management problems on this platform. To avoid these issues, all computations were done using a Linux virtual machine.

Table 3-13. Person life-cycle categories.

| ID | Category | Description |
|---|---|---|
| 1 | FT worker | Person is a full-time worker. |
| 2 | PT worker | Person works only part-time. |
| 3 | University student | Person is 18 years of age or older and a student. |
| 4 | Nonworker | Person is 18 years of age or older and does not work or go to university. |
| 5 | Retiree | Person is retired. |
| 6 | Driving-age school child (16–17 yrs) | Person is between 16 and 17 years of age and goes to high school. |
| 7 | Pre–driving-age school child (6–15 yrs) | Person is younger than 16 years of age and older than 5 years of age. |
| 8 | Preschool child (<6 yrs) | Person is younger than 6 years of age. |
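The categories in Table 3-13 can be assigned by testing them in table order, so that a person who qualifies for more than one category (for example, a university student who also works part-time) receives the lowest-numbered one, as described in this chapter. The sketch below is illustrative only: the `Person` record and its field names are hypothetical, and excluding retired persons from category 4 (so that category 5 remains reachable) is an assumption that the source does not spell out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Person:
    """Hypothetical person record; field names are not from the survey data."""
    age: int
    full_time_worker: bool = False
    part_time_worker: bool = False
    student: bool = False
    retired: bool = False

def life_cycle_category(p: Person) -> Optional[int]:
    """Return the lowest-numbered Table 3-13 category that applies, or None."""
    if p.full_time_worker:
        return 1
    if p.part_time_worker:
        return 2
    if p.age >= 18 and p.student:
        return 3
    # Assumption: retirees are kept out of category 4 so category 5 is usable.
    if p.age >= 18 and not p.retired:
        return 4
    if p.retired:
        return 5
    if 16 <= p.age <= 17 and p.student:
        return 6
    if 6 <= p.age <= 15:
        return 7
    if p.age < 6:
        return 8
    return None  # e.g., a 16- or 17-year-old who is neither working nor in school

student_worker = life_cycle_category(Person(age=20, part_time_worker=True, student=True))
```

Returning `None` for cases the table does not cover makes gaps in the classification visible instead of silently forcing a category.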

The first step was to categorize each household member into one of the eight categories shown in Table 3-13. If household members qualified for more than one category (e.g., university student and part-time worker), they were classified into the lowest-numbered category for which they qualified in Table 3-13.

Once the data were imported, a series of computed variables were added to the place and person records. These variables and their definitions are identified in Table 3-14.

Table 3-14. Explanation of computed variables used in the utility equations.

| Variable | Description |
|---|---|
| nonworker | Person is not a worker. |
| mode | Trip mode |
| mode_aft | Mode of the trip to the next place |
| nonauto | Non-auto trip mode |
| tottr | Total travelers on the trip to current place |
| tottr_aft | Total travelers on the trip to the next place |
| actdur | Activity duration (minutes) |
| nonmand | Trip to a non-mandatory location other than home, usual school location, or usual workplace |
| transfervariable | Variable indicating possibility of a transfer between two non-auto modes |
| adultparty | Party of only adult members |
| childparty | Party of only child members |
| mixedparty | Party of both adult and child members |
| someonedropped | A person was dropped at this destination. |
| someonepicked | A person was picked up at this destination. |
| dropoffvariable | Variable indicating possibility of drop off |
| pickupvariable | Variable indicating possibility of pickup |
| worklocationmatch | Destination location is usual work location, but excluding work from home cases. |
| schoollocationmatch | Destination location is usual school location, but excluding home schooling cases. |
| subtourdummy | Set to one if the given trip is a part of a sub-tour (tour starting and ending at the primary destination of the main tour) |
| simplesubtour | A sub-tour in which only one destination is visited |
| complexsubtour | A sub-tour in which more than one destination is visited |
| hhmem | Number of household members on this trip, excluding the respondent |
| groupgroceryduration | A trip with a household member to a non-mandatory location and taking between 20 and 40 min, indicating possibility of grocery shopping |
| groupeatoutduration | A trip with a household member to a non-mandatory location and taking between 40 and 60 min, indicating possibility of a typical family eat-out trip |
| walkmode | Includes walk and wheelchair |
| bikemode | Includes bike, skates, skateboard, Segway, and scooter |
| grouprecreationduration | A trip with a household member to a non-mandatory location and taking between 110 and 150 min, indicating possibility of a typical family recreational trip |
| groupsocialvisitduration | A trip with a household member to a non-mandatory location and with activity duration greater than 150 min, indicating possibility of a typical family social visit |
| notworklocation | Destination location is not the usual workplace. |

In addition to the variables listed in Table 3-14, additional dummy variables for specific time-of-day periods as well as their interactions were computed. For all trips in which participants counted themselves as another party member, that passenger was considered a non-household member (so the count of people on the trip stayed the same, but the count of household members on the trip was reduced by one). For determining if a place matched the previous destination, the variable "originisdestination" was computed; places were considered the same if they were within 75 m of each other and had the same name.

The two following subsections provide details on some of the challenges encountered when processing the two test data sets selected for this task.

Spatial attributes were attached to the place records using relationships between their destination coordinates and GIS data sets. Table 3-15 identifies these variables, as well as the source spatial data, and also provides a description for how they were set.

It is worth noting that land use data for the Atlanta planning region was sparse in its classification, with only 23 land

use categories available. This low level of resolution made it difficult to differentiate trip purposes for trip ends located at or near multipurpose land uses, such as attempting to differentiate areas in which health care is dispensed from ordinary commercial developments, government buildings, or schools.

Before a model specification was developed, the original list of trip purposes was simplified. This simplification involved collapsing all purposes that took place at home and work to "any other activities at home" and "work/doing my job," respectively. The final list of input purposes contained 21 entries. This was done to consolidate purposes that were identified as either being too similar or very difficult to differentiate based on household, traveler, and trip characteristics. Table 3-16 shows the original ARC trip purposes and those that correspond to them in the processed (simplified) data set.

The basic nesting structure developed for estimating trip purpose for the Atlanta HTS data set is shown in Figure 3-5. This diagram includes some simplifications, such as not listing home or loop purposes and only including one work purpose that can take place at the work location.

Utility equations for the final 21 trip purposes were defined using the computed variables added to the data set and the interactions between them. These interactions included choice and life-cycle–specific coefficients for all purposes (total of 8 × 21 = 168). The utilities included purpose-specific coefficients that captured the impact that certain trip attributes were expected to exert on specific activities (e.g., short activity duration, change in party size, and mixed party for pickup and drop-off events), disaggregate nest-specific coefficients (applied to all purposes under a specific aggregate nest), which were applied to time of day, and person and GIS variables using interactions (e.g., commercial land use for shopping, maintenance, eating out, and discretionary activities). This first model specification contained 295 utility coefficients to be estimated.

During the model's initial successful estimation runs, a large number of the estimated coefficients either failed the null hypothesis t-test or ended up with coefficient estimates for which BIOGEME could not estimate p-values. Furthermore, two purposes (#12: all other activities at school and #24: attend major sporting event) did not have enough observations to calibrate their coefficients. These two purposes only appeared in 49 and 46 places, respectively.

Two actions were taken to deal with these challenges. First, the list of trip purposes was further simplified into 12 choices by combining shopping and maintenance activities [under #15 and #7, respectively; drive through (#7) is considered a maintenance activity for this model, as are vehicle service (#14), household maintenance (#16), health care (#19), and personal business (#20)] and by combining entertainment activities [indoor recreation or outdoor recreation (#23) and attend major sporting event (#24)]. Figure 3-6 shows the final nesting structure. Second, choice and life-cycle coefficients were simplified so that they could be shared across activities on the second nest level. These changes improved the estimation results, and after three rounds in which coefficients were removed, a final model specification was obtained.

The final model specification contained 150 estimated parameters (three of which were nesting coefficients) and had an adjusted R² equal to 0.54. The two strongest positive coefficients in the final model were the ones associated with school and work location matches for school and work purposes (6.99 and 6.35), while the two strongest negative coefficients corresponded to person life-cycle coefficients for retired and nonworker persons and the school purpose (-3.02 and -1.38). Only one of the three nesting coefficients (dis_work) ended up not being significant, with the final estimated value being 1.0, which effectively collapses the nest into two choices at the root level.

Table 3-15. Spatial variables added to the Atlanta Regional Travel Survey data set.

| Variable | GIS Data Sets | Source | Description |
|---|---|---|---|
| nearchurch | Church and places of worship | ESRI (2010) | The destination geocode is within 150 m of a feature in the GIS data sets. |
| nearbigbox | Walmart, Target, and shopping mall locations | ARC (2010) | The destination geocode is within 150 m of a feature in the GIS data sets. |
| lu_commercial | LandPro | ARC (2010) | The destination geocode is contained by or within 25 m of a commercial land use area. |
| nearschool | School locations from all surveyed participants | ARC HTS | The destination geocode is within 150 m of a school location. |
| lu_institutional | LandPro | ARC (2010) | The destination geocode is contained by or within 25 m of an institutional land use area. |
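The point-based proximity variables in Table 3-15 (nearchurch, nearbigbox, and nearschool) all test whether a destination lies within 150 m of any feature in a point data set. A minimal sketch of such a test is shown below. It assumes features are available as latitude/longitude pairs and uses a haversine distance; the function names and coordinates are made up for illustration. The land use variables (lu_commercial and lu_institutional) would need polygon containment tests, which are not covered here.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    r = 6_371_000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def near_feature(dest, features, radius_m=150.0):
    """True if dest (lat, lon) is within radius_m of any (lat, lon) feature."""
    return any(
        haversine_m(dest[0], dest[1], f[0], f[1]) <= radius_m for f in features
    )

# Made-up coordinates: one church about 111 m north, one school about 1.2 km north.
destination = (33.7490, -84.3880)
flags = {
    "nearchurch": near_feature(destination, [(33.7500, -84.3880)]),
    "nearschool": near_feature(destination, [(33.7600, -84.3880)]),
}
```

For large feature sets a spatial index would usually replace the linear scan, but the brute-force check keeps the rule itself easy to read.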

Using BIOSIM, the NMNL purpose model was applied to the validation data set using Monte Carlo simulation; results were output for 10,316 of the 10,512 destinations without a home purpose. The average success rate of three enumeration runs was 60%. When including places with home purposes, which were not modeled, the overall success rate was 77%. A confusion matrix was generated using the mode from 10 enumerations of the validation set and is presented in Table 3-17 (see Table 3-16 for definitions of the trip purpose codes).

As seen in Table 3-17, the purposes that were incorrectly selected most often (Type I error) were #22 (civic or religious activity) with a 75% Type I error rate and #25 (social visit) with a 68% Type I error rate. These same trip purposes were also the reported purposes that most often showed different model results (Type II error), with the first (#22) showing a match in only 15 out of 173 reported instances (91% Type II error), while the second (#25) was correctly identified in 103 out of 518 occurrences (80% Type II error, per the row totals in Table 3-17). Another interesting observation is that the largest absolute cells outside the diagonal correspond to the reported-modeled and modeled-reported purpose pairs for activity #7 (maintenance) and #15 (shopping), which indicates that the model cannot easily differentiate between these two purposes.

Figure 3-7 shows the distribution of actual choices according to the modeled purposes; it illustrates the high degree of uncertainty that the model has for discretionary purposes. Match rates for each choice are shown at the top of each choice's bar.

Table 3-16. Atlanta HTS trip purpose simplification.

| ARC Code | ARC Trip Purpose Description | Model Code | Model Purpose Description |
|---|---|---|---|
| 1 | Working at home (for pay or volunteer) | N/A | |
| 2 | Shopping (online, catalog, or by phone) | N/A | |
| 3 | Any other activities at home | 3 | Any other activities at home |
| 4 | Change travel mode/transfer | 4 | Change travel mode/transfer |
| 5 | Dropped off passenger | 5 | Dropped off passenger |
| 6 | Picked up passenger | 6 | Picked up passenger |
| 7 | Drive through (ATM, bank, fast-food, etc.) | 7 | Drive through (ATM, bank, fast-food, etc.) |
| 8 | Work/doing my job | 8 | Work/doing my job |
| 9 | Other work-related activities at work | 9 | Work/doing my job |
| 10 | Volunteer work/activities | 10 | Volunteer work/activities |
| 11 | Attending class/studying | 11 | Attending class/studying |
| 12 | All other activities at school (eat lunch, recreational, etc.) | 12 | All other activities at school (eat lunch, recreational, etc.) |
| 13 | Work related (meeting, sales call, delivery) | 13 | Work related (meeting, sales call, delivery) |
| 14 | Service private vehicle (getting gas, oil, lube, repairs) | 14 | Service private vehicle (getting gas, oil, lube, repairs) |
| 15 | Grocery/food shopping | 15 | Grocery/food shopping |
| 16 | Other routine shopping (clothing, convenience store, household maintenance) | 16 | Other routine shopping (clothing, convenience store, household maintenance) |
| 17 | Shopping for major purchases or specialty items | 17 | Shopping for major purchases or specialty items |
| 18 | Household errands (bank, dry cleaning, etc.) | 18 | Household errands (bank, dry cleaning, etc.) |
| 19 | Health care (doctor, dentist, etc.) | 19 | Health care (doctor, dentist, etc.) |
| 20 | Personal business (visit government office, attorney, accountant) | 20 | Personal business (visit government office, attorney, accountant) |
| 21 | Eat meal out at restaurant/diner | 21 | Eat meal out at restaurant/diner |
| 22 | Civic or religious activities | 22 | Civic or religious activities |
| 23 | Indoor recreation or outdoor recreation | 23 | Indoor recreation or outdoor recreation |
| 24 | Attend major sporting event | 24 | Attend major sporting event |
| 25 | Social/visit friends/relatives | 25 | Social/visit friends/relatives |
| 96 | Loop trip | 96 | Loop trip |
| 97 | Other, specify | N/A | |
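The NMNL confusion matrix was built from the mode (most frequent outcome) of 10 simulated enumerations per destination. A minimal sketch of that aggregation step is shown below. It does not reproduce BIOSIM; the input layout (one list of simulated purpose codes per enumeration) and the tie rule (lowest purpose code wins) are assumptions made for the example.

```python
from collections import Counter

def modal_purposes(enumerations):
    """Most frequent simulated purpose per place across enumeration runs.

    enumerations is a list of runs; each run lists one purpose code per place,
    in the same place order. Ties go to the lowest code (an assumption).
    """
    result = []
    for codes in zip(*enumerations):
        counts = Counter(codes)
        top = max(counts.values())
        result.append(min(code for code, n in counts.items() if n == top))
    return result

def confusion_counts(reported, modeled):
    """Count (reported, modeled) purpose pairs, i.e., confusion matrix cells."""
    return Counter(zip(reported, modeled))

# Three illustrative runs over three places.
runs = [[7, 15, 8], [15, 15, 8], [7, 21, 8]]
modeled = modal_purposes(runs)
cells = confusion_counts([7, 15, 8], modeled)
```

Each key of `cells` is a (reported, modeled) pair, so diagonal cells are those where both codes agree.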

Figure 3-5. Full nested logit trip purpose network structure for ARC model. Note: agg = aggregate, dis = disaggregate.

Figure 3-6. Final nested logit trip purpose network structure for ARC purpose model. Note: agg = aggregate, dis = disaggregate.

Table 3-17. Confusion matrix with results of the NMNL purpose model.

| Reported / Modeled | 4 | 5 | 6 | 7 | 8 | 11 | 13 | 15 | 21 | 22 | 23 | 25 | Type II Error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 137 | 49 | 40 | 123 | – | 2 | 7 | 34 | 3 | – | 12 | 9 | 67% |
| 5 | 1 | 606 | 17 | 30 | 12 | 10 | 3 | 17 | 4 | 1 | 3 | 9 | 15% |
| 6 | 3 | 11 | 521 | 20 | 13 | 3 | 1 | 32 | 3 | 1 | 9 | 2 | 16% |
| 7 | 46 | 22 | 24 | 923 | 3 | 1 | 88 | 503 | 105 | 16 | 62 | 54 | 50% |
| 8 | 2 | – | 1 | 11 | 1,516 | – | 17 | 7 | 1 | – | 6 | 7 | 3% |
| 11 | 2 | – | 4 | – | 3 | 761 | – | 1 | – | 1 | 2 | 1 | 2% |
| 13 | 31 | 12 | 26 | 186 | 33 | 2 | 161 | 66 | 62 | 4 | 24 | 23 | 74% |
| 15 | 16 | 11 | 5 | 542 | 10 | 1 | 56 | 841 | 119 | 6 | 35 | 26 | 50% |
| 21 | 13 | 5 | 16 | 164 | 2 | 3 | 29 | 289 | 207 | 3 | 38 | 23 | 74% |
| 22 | 5 | 6 | 4 | 40 | 6 | 1 | 22 | 15 | 14 | 15 | 28 | 17 | 91% |
| 23 | 42 | 23 | 20 | 90 | 5 | 6 | 36 | 99 | 70 | 8 | 153 | 45 | 74% |
| 25 | 26 | 34 | 50 | 99 | 3 | 1 | 65 | 50 | 17 | 5 | 65 | 103 | 80% |
| Type I Error | 58% | 22% | 28% | 59% | 6% | 4% | 67% | 57% | 66% | 75% | 65% | 68% | |

Figure 3-7. Actual choices in the validation data set by simulated choice.

The purposes that were correctly identified most often were #8 (work/doing my job) and #11 (attending class/studying), which showed match rates of 97% and 98%, respectively, closely followed by #5 and #6 (dropping off/picking up passengers), with match rates close to 85%.

Decision Tree Method for Identifying Trip Purposes

Trip purpose decision trees were created using WEKA's open-source implementation of the C4.5 algorithm, called J48. In addition to the life-cycle Boolean variables listed in Table 3-13, the computed variables identified in Table 3-14, and the spatial variables in Table 3-15, the arrival hour was added to the list of inputs available to the tree-building algorithm. The tree was built using a confidence factor of 0.25 and at least 25 instances per leaf. The confidence factor determines how closely the tree conforms to the training set, and 0.25 is the default in both C4.5 and J48. The 25-instances-per-leaf setting keeps the tree from being overly specific by forcing every leaf to have at least 25 samples. A training sample size of 11,854 was used. The produced tree was able

to correctly classify 70% of the training data and 68% of the cross-validation data. The resulting decision tree included 369 decision nodes and is included in Appendix D. The most notable difference between the generated tree and the developed NMNL model was the tree’s use of the same variable in paths within the overall decision process, combined with the occurrence of multiple final decision nodes for the same purpose. These two factors make a complex tree like this one difficult to review and present.

The Atlanta HTS decision tree used the 10,512 trip destinations from the households in the GPS subsample and correctly classified 65% of their activity codes. Counting the places with home purposes, which were excluded from the model, as correctly identified resulted in a final 80% match rate. The purposes that were incorrectly selected most often were #16 (other routine shopping – clothing, convenience store, household maintenance) and #21 (eat meal out at restaurant/diner). Thus, when a mistake was made by the classifier, it was often to select one of these two.

Table 3-18 presents the confusion matrix obtained after applying this decision tree to the reported trip purposes. The reported purposes that were most often incorrectly identified by the classifier were #22 (civic/religious activities) and #23 (entertainment); these were incorrectly identified 66 times out of 104 (63%) and 292 times out of 515 (57%), respectively. In other words, these two activities were the hardest to identify correctly and showed wider error ranges by the classifier. Similar to what was revealed in Table 3-17, the largest absolute cells outside the diagonal correspond to the reported-modeled and modeled-reported purpose pairs for activity #7 (maintenance) and #15 (grocery/food shopping); this indicates that the decision tree also cannot easily differentiate between these two purposes.

Figure 3-8 shows the actual trip purpose frequencies according to the selections identified by the decision tree. Match rates are noted as percentages on top of the choice bars.

Trip Purpose Identification Findings

The models developed as part of this research achieved accuracy levels comparable with previous efforts documented by Shen and Stopher (2012) and McGowen and McNally (2006), albeit without the post-validation tour logic added by Shen and Stopher. As expected, mandatory activities (i.e., going to work and school) were easier to identify than discretionary ones. This matches findings reported by Stopher et al. (2012). In fact, accuracies for mandatory purposes were above 95% in both modeling approaches. Escorting activities (i.e., pickup and drop off) were also identifiable to a good degree of accuracy (approximately 85%) using party size and companion information.

Non-mandatory activities proved harder to identify. This can be attributed to the greater variability displayed by the characteristics of places where these activities occur, particularly with respect to time and space. Higher-resolution spatial data could help disambiguate the competing choices in these purposes as well as provide more information on participants (e.g., collection of usual places used to perform discretionary and maintenance activities such as eating out, buying groceries, and banking). Better spatial data would include extensive databases with the locations of places associated with discretionary activities; examples of places such as these are restaurants, gas stations, grocery stores, and government buildings. Unfortunately, the availability of public sources of these types of data varies greatly from region to region, so it
Reported/Modeled        4      5      6      7      8     11     13     15     21     22     23     25  Type II Error
4                     371     32     15     29      0      1      3     26      1      0      0      2            23%
5                      13    609     13     29      4      8      7      4      6      0     10     10            15%
6                      21      8    547     16      5      0      5      3      3      0      5      6            12%
7                      32     20     31  1,112     15      3     82    417     58     13     48     55            41%
8                       6      2      1      8  1,508      0     33      5      2      3      3      7             4%
11                      1      0      2      1      3    759      1      1      0      0      3      4             2%
13                     16      5     28    153     28      2    264     58     39      5     26     32            60%
15                     18      9     12    544      7      4     44    898     88      2     28     20            46%
21                     11      3     15    150      1      1     30    239    257      8     64     13            68%
22                      1      2      0     30      6      1      7     14     18     38     33     26            78%
23                     25      8     10     73      2     10     32     70     59     27    223     73            64%
25                     15     23     39     73      1      6     42     47     24      8     72    201            64%
Type I Error          30%    16%    23%    50%     5%     5%    52%    50%    54%    63%    57%    55%

Table 3-18. Confusion matrix with deterministic results of decision tree purpose model.
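The error rates in the margins of Table 3-18 can be recomputed from the cell counts. The sketch below assumes, as the table layout suggests, that rows are reported purposes and columns are modeled purposes, that the Type II error of a row is the share of that row falling outside the diagonal, and that the Type I error of a column is the share of that column falling outside the diagonal. The function and variable names are illustrative and are not taken from the study.

```python
# Purpose codes and cell counts transcribed from Table 3-18
# (rows = reported purpose, columns = modeled purpose).
PURPOSES = [4, 5, 6, 7, 8, 11, 13, 15, 21, 22, 23, 25]
TABLE_3_18 = [
    [371, 32, 15, 29, 0, 1, 3, 26, 1, 0, 0, 2],
    [13, 609, 13, 29, 4, 8, 7, 4, 6, 0, 10, 10],
    [21, 8, 547, 16, 5, 0, 5, 3, 3, 0, 5, 6],
    [32, 20, 31, 1112, 15, 3, 82, 417, 58, 13, 48, 55],
    [6, 2, 1, 8, 1508, 0, 33, 5, 2, 3, 3, 7],
    [1, 0, 2, 1, 3, 759, 1, 1, 0, 0, 3, 4],
    [16, 5, 28, 153, 28, 2, 264, 58, 39, 5, 26, 32],
    [18, 9, 12, 544, 7, 4, 44, 898, 88, 2, 28, 20],
    [11, 3, 15, 150, 1, 1, 30, 239, 257, 8, 64, 13],
    [1, 2, 0, 30, 6, 1, 7, 14, 18, 38, 33, 26],
    [25, 8, 10, 73, 2, 10, 32, 70, 59, 27, 223, 73],
    [15, 23, 39, 73, 1, 6, 42, 47, 24, 8, 72, 201],
]


def error_rates(matrix):
    """Return (type_ii_by_row, type_i_by_column) as fractions of row/column totals."""
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Type II: reported purpose k was modeled as something else.
    type_ii = [1 - matrix[k][k] / row_totals[k] for k in range(n)]
    # Type I: purpose k was selected by the model but something else was reported.
    type_i = [1 - matrix[k][k] / col_totals[k] for k in range(n)]
    return type_ii, type_i


type_ii, type_i = error_rates(TABLE_3_18)
for code, t2, t1 in zip(PURPOSES, type_ii, type_i):
    print(f"purpose {code:>2}: Type II {t2:.0%}, Type I {t1:.0%}")
```

Under these assumptions the computed margins can be compared directly with the percentages printed in the table, which is a quick way to catch transcription errors in a confusion matrix.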

Figure 3-8. Distribution of actual choices in Atlanta HTS validation data set by tree choice.

may be necessary to investigate the availability of commercial data sources.

With regard to the two tested methods, it can be said that it was much faster to generate working models using decision trees than it was using the choice models. This is due to the complexity involved in specifying large models such as these, as well as the long run times needed to estimate model coefficients. For example, the final model specifications took, on average, 5 hours to conclude in BIOGEME, while decision trees could be built and evaluated in WEKA in less than 5 minutes. It should also be noted that the tasks associated with survey data preparation and preprocessing are not trivial, and adequate time should be factored in to project schedules to accommodate the development of automated procedures as well as multiple revision and correction cycles.

Based on this research, the following set of suggestions was developed. These are relevant for efforts that plan to use modeling to identify trip purposes using attributes that can be automatically derived from passively collected trace data (such as GPS data) as well as household and person characteristics.

• Research the availability of detailed land use and point of interest (POI) data for the study area, and consider looking into commercial options.
• Ensure that the recruitment survey captures enough attributes to successfully classify all household members into a life-cycle category. This will reduce the number of assumptions made while preparing the data set for model estimation and deployment.
• Consider simplifying trip purpose classifications. This helps in the estimation of models and improves their success rates when applied to identify purposes. The goal here would be to define the minimum set of purposes that can be easily explained to survey participants in the calibration subsample and yet still provides enough resolution for travel demand modeling.
• Plan to collect personal locations (e.g., work, school, and volunteer) for each household member as part of the recruitment survey instrument. Build consistency checks into the GPS processing logic to ensure that captured mandatory locations are being matched to GPS destinations; a lack of a match in a typical travel day is unlikely and may be due to a poor geocode. This is especially important for mandatory purposes given that the single most important driving variable for these purposes was found to be a location match with the personal locations. It may be possible to derive work and school locations from data sets such as HAZUS (http://www.fema.gov/hazus).
• Capture frequently visited locations (along with activities conducted at them) as part of the recruitment process (for example, grocery stores, ATMs, and gas stations). The availability of these data will assist in disambiguating choices between competing non-mandatory purposes.

Overall Findings

One of the main findings of Experiment A is that the tested methods for automatically filtering noise out of GPS points, identifying GPS trips, and splitting trips into mode segments generated results that would likely require considerable manual review and cleaning before being deemed usable. This highlights the importance of software tools that can help by increasing the efficiency with which reviews and edits are performed on the processed results. Assuming that these extra

71 Some assumptions regarding the input data were neces- sary for model estimation and should hold for any data set to which the model is applied. These include: • The GPS traces can be uniquely linked to one person. • The linked trace data covers at least one full day of travel. • The workplace and school location of the person can be determined from the data or are available from other sources. It is possible that workplaces and school locations may be available from data sets such as HAZUS (http:// www.fema.gov/hazus). • Land use data are available for the region being modeled; this experiment used data from the Census Transportation Planning Package (CTPP). Any data set meeting these limited assumptions should allow for the models in Experiment B to be applied. If fur- ther assumptions could be made regarding the input data (for example, “all household members are tracked and can be linked”), then the resulting person models could have more explanatory power; however, this was not done as part of this exercise so as to keep the models as general as possible. The overall process flow diagram for Experiment B was shown in Figure 3-2. The procedure starts with a data pro- cessing step (Stage 0 in Figure 3-2), which is used to derive a series of person-level travel and tour characteristics used as input to the later model stages. The demographic char- acterization procedure for the GPS traces then proceeds in three stages (Stages 1–3 in Figure 3-2). First, aggregate-level person-type clusters are developed for later use. These clus- ters serve as the dependent variable for the later stages, so that in the demographic characterization process, the major type of the person is selected first, then more detailed demo- graphics are modeled depending on the type of the person. In the second stage, the travel pattern data and local land use data are used to select one of the major person types. 
Finally, in Stage 3, all of the input data (i.e., the travel pattern data, land use data, and the selected major person type) are used to determine the various socio-demographic details of interest. The outputs of Experiment B include open-source com- puter code, which processes the mode segment data resulting from Experiment A into person-travel records, and a set of model files that can be applied to the processed travel records using various open-source modeling packages, including WEKA and BIOGEME. Finally, a set of findings regarding the use of these procedures, limitations of the procedures, and areas for further development are given. Data Processing The primary output of Experiment A (i.e., the mode segments, which aggregate the GPS traces into trip records) is used as the steps are taken, these methods can be considered ready for implementation, with some of them having already been implemented in large-scale GPS-based HTS projects. Regarding travel mode identification methods, it was found that neural networks should be used if calibration data are avail- able. If no calibration data exist, then the next best approach found was the one using fuzzy logic rules. As with the GPS cleaning and processing methods, mode identification results would benefit from additional consistency and logic checks to avoid using unlikely mode sequences, and also would benefit from analyst review. As for trip purpose identification, the methods evaluated here can be considered the most experimental ones and are likely further from being ready for implementation. However, they did show promising results once the purpose classifica- tion was simplified. Both evaluated methods performed well for mandatory and escorting purposes, but had difficulty dif- ferentiating between discretionary and maintenance activities. 
The biggest obstacles for the implementation of these methods are the availability of (1) detailed data on the households and persons within the households, (2) location information for locations most frequently visited by the household members, and (3) detailed land use and POI data. Experiment B: Demographic Characterization of GPS Traces Introduction Experiment B was performed to evaluate various methods for attaching person- and household-level information to travel patterns observed in GPS-based household survey data and other sources of anonymized GPS trace data. For this rea- son, the experiment was designed to be as general as possible. Consequently, very few assumptions were made about what would be available in the input data sets, and all of the models were estimated using the limited amount of information that can be derived using the methods in Experiment A. The more detailed activity-travel and socio-demographic information available from the HTS data sets was generally ignored or used for validation purposes only. This allows the models developed to be generally applicable to any source of GPS trace data. The methods developed here, then, are appropriate for situations where there is an interest in demographically characterizing mass anonymous GPS traces for further analysis; for example, identifying facility utilization by user classes and developing segmented origin–destination estimates. The methods devel- oped here are not intended to replace the HTS for purposes of travel demand model estimation. Since the process is in fact a simplified, inverted travel demand model, using the model results to estimate or calibrate further models would likely introduce substantial error.

72 Stage 1 analysis shown in Figure 3-2. During the completion of the first stage of Experiment B, several subtasks were per- formed. First, the tour and daily activity pattern variables extracted from the survey data were transformed into a set of primary factors using PCA. This was done to better understand the high levels of interdependence that existed between many of the variables. These factors were then clustered using the K-means clustering algorithm. Finally, the major person demographic types were developed using a partial decision tree classification algorithm (PART), with the travel pattern cluster membership distribution serving as the univariate dependent variable. The classification algorithm has the effect of choosing the major person types with the maximum differences in travel cluster membership distribution (i.e., the person types with the most differences in travel patterns). Using these person clusters helps with the Stage 2 analysis, since it was determined that the person types vary with the travel characteristics. So, when the travel characteristics are used as the independent variables in Stage 2, stronger differences in person clusters (now the depen- dent variable) are likely to be seen. A description of the most useful demographic clustering model found in the Stage 1 analysis is shown in Table 3-22. These clusters will serve as the primary dependent variables for the Stage 2 analysis. A variety of models are used to classify individuals in terms of the five primary person-level attributes, which are person/ worker type derived from the Stage 1 analysis, educational attainment, gender, possession of driver’s license, and age. The individuals observed in the GPS traces are also classi- fied as belonging to a set of household types, which incorpo- rate the household size, number of vehicles, and presence of children dimensions. 
A variety of modeling procedures from machine learning, choice modeling, and so forth were evalu- ated, and the final models are discussed in the following. The final models were selected for performance reasons, but also to give a representation of both joint and non-joint modeling methods, as well as regression-type models versus machine primary input to Experiment B. These data are approximated in the model estimation stage by using the Chicago Household Travel Survey trip record file and stripping out all information not found in the Experiment A results. The main variables in the input trip file are shown in Table 3-19. A sample of individuals who had recorded 2 days of travel where neither travel day was a weekend day was extracted from the full Chicago survey data for use in model estimation. This sample also excluded individuals who recorded no travel since these will not appear in GPS trace data. The data file described in Table 3-19 was then recreated from this sample. The demographic information for the sample was retained for model estimation and validation purposes. As can be seen in Table 3-19, only eight variables, three of which are identifiers, are used in the development of the person-type and person-attribute models. The five trip record variables are converted using the data processing routine to the set of person-level aggregate travel characteristics in Table 3-20. These routines first look for tour patterns in the trip records, based on the home and work anchor points, and repeated stops at the same destination. Then, the characteristics of the tours are determined, including the number and type of stops, the modes used on tour legs, and the length of time spent in activi- ties and traveling. The full set of derived travel characteristics is shown in Table 3-20. Finally, these derived travel characteristics are combined with a set of simplified land use variables derived from the CTTP at the census-tract level. 
These variables describe the built environment and basic land use characteristics such as employment, housing, and population density. The characteristics of the land use variables for the Chicago data set are listed in Table 3-21. These data sets, taken together, form the basis of the model estimation procedure described in the following section.

Model Estimation

The first task after data processing in Experiment B was the development of the primary demographic clusters in the

Variable           Data Type                        Description
HH number          Unique identifier                Combined with person number to form unique ID
Person number      Unique identifier within HH      Combined with HH number, if no household data, set to 1
Activity ID        Unique identifier within person  Activity record number for person
Location type      String                           Required location types: “home, work, school, other”
Location ID        Unique identifier                Unique identifier for physical location
Mode               Integer 1–10                     Walk, bike, drive, pass, transit, paratransit, taxi, school bus, carpool
Duration           Integer                          Trip duration
Activity duration  Integer                          Defined as the time spent at trip end

Note: HH = household.

Table 3-19. Experiment B data processing routine input variables.

Variable Name              Description                                             Avg  Min    Max
total_tours                Number of total tours per day                         2.850    1     13
num_subtours               Number of subtours per day                            0.017    0      2
work_tours                 Number of work tours                                  0.743    0      4
school_tours               Number of school tours                                0.303    0      5
other_tours                Number of other tours                                 1.804    0     13
avg_stops_per_tour         Average number of stops per tour                      2.351    1     12
avg_stops_per_work_tour    Average number of stops per work tour                 1.070    0     10
avg_stops_per_school_tour  Average number of stops per school tour               0.308    0      8
avg_stops_per_other_tour   Average number of stops per other tour                1.689    0     12
avg_tour_ttime             Average travel time per tour                         57.843    0    841
avg_work_tour_ttime        Average travel time per work tour                    32.233    0  1,050
avg_school_tour_ttime      Average travel time per school tour                   6.029    0    423
avg_other_tour_ttime       Average travel time per other tour                   35.903    0  1,160
at_home_duration           Total time spent at home                          1,970.328    0  8,571
num_acts_work              Number of work activities                             0.775    0      6
total_dur_work             Total duration of all work activities               377.092    0  2,878
avg_dur_work               Average duration of work activities                 208.874    0  1,439
num_acts_school            Number of school activities                           0.328    0      6
total_dur_school           Total duration of all school activities             123.748    0  2,878
avg_dur_school             Average duration of school activities                65.177    0  1,439
num_acts_pickdrop          Number of pickup/drop-off activities                  0.234    0     10
total_dur_pickdrop         Total duration of all pickup/drop-off activities      1.418    0     90
avg_dur_pickdrop           Average duration of pickup/drop-off activities        0.587    0     20
num_acts_other             Number of other activities                            3.920    0     32
total_dur_other            Total duration of all other activities              290.168    0  2,878
avg_dur_other              Average duration of other activities                 79.994    0  1,439
auto_total                 Percentage of tours by auto mode                      0.769    0      1
auto_work                  Percentage of work tours by auto mode                 0.326    0      1
auto_school                Percentage of school tours by auto mode               0.095    0      1

Table 3-20. Processed person-level travel characteristics.
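The tour-detection step that produces the characteristics in Table 3-20 can be illustrated with a minimal sketch. It assumes a simplified rule that is not taken from the study's own code: a tour is a run of stops that begins and ends at home, and it is labeled a work tour if it contains a work stop, otherwise a school tour if it contains a school stop, otherwise an other tour. Subtours around the work anchor and the repeated-destination logic are omitted, and all function names are hypothetical.

```python
def split_into_tours(location_types):
    """Split one person's ordered location types into home-based tours."""
    tours, current = [], []
    for loc in location_types:
        if loc == "home":
            if current:           # arriving home closes the open tour
                tours.append(current)
                current = []
        else:
            current.append(loc)   # any non-home stop extends the open tour
    # Stops after the last return home form an incomplete tour and are ignored.
    return tours


def tour_type(tour):
    """Label a tour by its highest-priority stop (assumed order: work, school, other)."""
    if "work" in tour:
        return "work"
    if "school" in tour:
        return "school"
    return "other"


def summarize(location_types):
    """Compute a few of the Table 3-20 style tour counts for one person."""
    tours = split_into_tours(location_types)
    types = [tour_type(t) for t in tours]
    n_stops = sum(len(t) for t in tours)
    return {
        "total_tours": len(tours),
        "work_tours": types.count("work"),
        "school_tours": types.count("school"),
        "other_tours": types.count("other"),
        "avg_stops_per_tour": n_stops / len(tours) if tours else 0.0,
    }


# Illustrative day: home -> work -> other -> home -> other -> home
day = ["home", "work", "other", "home", "other", "home"]
summary = summarize(day)
print(summary)
```

The location type strings match the required values listed in Table 3-19 ("home", "work", "school", "other"); everything else about the rule set is an assumption made for the example.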
Variable              Description                                                 Avg    Min     Max
Transit use           Percentage of residents using transit                     0.122   0.00    0.61
Road density          Miles/sq mile of road                                    17.248   2.12   43.66
Intersection density  Intersections/sq mile                                   161.203   5.36     650
Block size            Average block size (road density/intersection density)    0.108   0.05    0.40
Employment density    Jobs/sq mile                                              4.237   0.01   68.42
Population density    Inhabitants/sq mile                                       8.561   0.03   92.95
Housing density       Housing units/sq mile                                     3.823   0.01   78.95

Table 3-21. Census-tract–level information for the Chicago data set from CTPP.

74 between educational attainment and work status. The educa- tion sub-models similarly conform to expectations. The full results are shown in Appendix E. The ordinal logit model for age categories excludes the retiree, child, and schoolchild person categories from input since these categories also define age categories (65+, >16, and <16, respec- tively). Therefore, the model applies to part-time and full-time workers as well as other persons. The ordinal logit model was selected because there is a natural ordering to the age categories, with the various person-travel characteristics shifting the prob- ability of being older (positive coefficients) or younger. For example, someone who lives in a densely populated, high- employment area and has more work, school, and pickup/ drop-off activities is likely to be younger. This is intuitive as younger individuals tend to work or be in school. Conversely, those living in high housing density areas and high transit use areas are likely to be older. See Appendix E for a complete specification. The gender binary logit model was necessary for the gen- der choice because very little in the daily travel patterns seems to discriminate between males and females, and the decision tree and rule-based classifiers tended to overestimate the pres- ence of females in the sample. Some minor differences exist in that one who has more pickup/drop-off activities, and more numerous—but shorter—work activities is more likely to be female. Meanwhile, males are more likely to have longer work activities, to live in higher population density areas, and to travel further for discretionary tours. However, the differences observed are relatively minor, making this a difficult category to predict. The possession of a driver’s license is another difficult trait to model since the non-possession of a driver’s license in adults is a somewhat rare characteristic, especially as children were excluded from the model. 
Such individuals represented less than 10% of the total sample. The PART decision list, while not providing the overall highest model fit, performs well at identifying the persons with no license, and since these persons are often of interest in travel demand modeling, this learning approaches. An overview of the modeling compo- nents and procedures is shown in Table 3-23. Actual model specifications are included in Appendix E. The first model estimated and applied when performing the demographic characterization process is the person-type choice model, which combines broad person-type categories (i.e., full-time worker, part-time worker, retiree, children, schoolchildren, and others) with basic educational attain- ment, including no high school, high school graduate, and college graduate. This is referred to as the Stage 2 model in Fig- ure 3-2. The model is estimated as a nested logit model jointly with the educational attainment to capture the substantial correlation between work/student status and education level. The person-type choice model was generally consistent with expectations. Higher numbers of work activities and longer duration of work activities are associated with being a worker, individuals with even longer durations being more likely to be employed full time. Those with more school activities are more likely to be either children or schoolchildren and are least likely to be retirees. The models show that land use characteristics are also related to work status, to some extent. Workers and children are more likely to live in high-density employment areas with lower housing density, while part- time workers, retirees, and others have more likelihood to live in denser housing areas, probably because these person types are less likely to live in family units. 
Finally, the inclusive value terms of the joint model demonstrate the strong correlation

Class  Description        % of Sample
1      Part-time workers        10.4%
2      Full-time workers        45.0%
3      Retirees                 13.0%
4      Young children            6.4%
5      School children          11.8%
6      Others                   13.4%

Table 3-22. Optimal person-type clusters.

#  Variable                Values                                                                              Model Procedure
1  Person/worker status    Part-time worker, full-time worker, retiree, child, schoolchild, other person type  Nested logit
2  Educational attainment  No high school, high school, college                                                with above
3  Age                     0–16, 16–25, 25–45, 45–65, 65+                                                      Ordered logit
4  Gender                  Male, female                                                                        Binary logit
5  License                 Yes, no                                                                             PART decision list
6  HH size                 1, 2, 3+                                                                            J48 tree
7  Num vehicles            0, 1, 2+                                                                            with above
8  Has children            No, yes                                                                             with above

Note: HH = household.

Table 3-23. Model components and procedures.
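For the ordered logit listed for age in Table 3-23, category probabilities follow from a single linear score and a set of increasing thresholds. The sketch below uses the standard cumulative-logit form, P(y ≤ j) = 1 / (1 + exp(−(τ_j − s))), so that a larger score s shifts probability toward the older categories, in line with the interpretation of positive coefficients in the text. The threshold and score values are invented for illustration; the estimated specification is the one given in Appendix E.

```python
import math

# Age categories as listed in Table 3-23.
AGE_CATEGORIES = ["0-16", "16-25", "25-45", "45-65", "65+"]


def ordered_logit_probs(score, thresholds):
    """Category probabilities for an ordered logit with increasing thresholds."""
    # Cumulative probabilities P(y <= j) for every category except the last.
    cumulative = [1.0 / (1.0 + math.exp(-(t - score))) for t in thresholds]
    cumulative.append(1.0)  # P(y <= last category) is 1 by definition
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(y = j) = P(y <= j) - P(y <= j-1)
        previous = c
    return probs


# Hypothetical thresholds (one fewer than the number of categories).
THRESHOLDS = [-2.0, -0.5, 1.0, 2.5]

younger = ordered_logit_probs(-1.0, THRESHOLDS)  # low score
older = ordered_logit_probs(1.5, THRESHOLDS)     # high score

for label, p_low, p_high in zip(AGE_CATEGORIES, younger, older):
    print(f"{label:>6}: low score {p_low:.2f}, high score {p_high:.2f}")
```

In a probabilistic application, an age category would then be drawn from these probabilities rather than always taking the largest one.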

75 Model Application The various models estimated for the person and house- hold characteristics have been applied to both the Chicago data set, which was used as the training data, and the Portland household survey data set. The latter was processed using the data processing procedure and combined with the census- tract land use variables. The fit results for each model are dis- cussed in the following paragraphs. To evaluate the performance of the person-type model, the prediction matrices for the Chicago (CMAP) training data set and the unused test data set from the Portland household (HH) survey data were used (see Table 3-24). These results are from a probabilistic application of the model, where the person type is assigned to an individual randomly accord- ing to the modeled probability distribution. The table first shows the confusion matrix, which contains the counts of correctly and incorrectly classified examples. The second half of the table then shows the percentage correctly predicted for each person-type category, for both the null model and the probabilistic selection. The null model results are obtained by assigning each observation to a category with a probability criterion was important in the selection of the final model. The decision rules show that the individuals least likely to have driver’s licenses are those who make relatively few tours (≤3) and have short durations for their discretionary activi- ties. This is intuitive since those without driver’s licenses are more likely to be constrained in their ability to engage in multiple tours and likely have to plan tours to work around mobility constraints. Conversely, individuals with long work durations and, naturally, those observed making trips by auto mode are most likely to have a driver’s license. The rules in the decision list are shown in Appendix E. Finally, a joint household characteristics model was esti- mated based on the individual tour patterns. 
The household-type model was estimated jointly for household size, number of vehicles, and presence of children. A joint model was selected here to maintain consistency between models since the selection of household size, for example, implies that certain values of the presence of children and number of vehicles variables are not available. The household-type joint model was estimated using the J48 decision tree to demonstrate joint modeling with a machine learning approach. The full tree is shown in Appendix E.

CMAP – TRAINING RESULTS (67% split)
Observed Person Type / Simulated Person Type
Observed    1.part    2.full  3.retiree   4.child  5.student   6.other    Total     %  Null model % correct  Model % correct
1               53       155         44        17         19        45      333   10%                    1%              16%
2              152     1,177         87        23         15        73    1,527   45%                   20%              77%
3               43        80        156        39         27       113      458   14%                    2%              34%
4               19        18         37        37         50        55      216    6%                    0%              17%
5               20        14         21        53        205        70      383   11%                    1%              53%
6               48        82        113        47         66       109      465   14%                    2%              23%
Total          335     1,526        458       216        382       465    3,382                         27%              51%
%              10%       45%        14%        6%        11%       14%

TEST RESULTS – PORTLAND HH SURVEY (100%)
Observed Person Type / Simulated Person Type
Observed    1.part    2.full  3.retiree   4.child  5.student   6.other    Total     %  Null model % correct  Model % correct
1              152       464        184        91         83       192    1,166   10%                    1%              13%
2              430     3,267        331       144         72       294    4,538   41%                   17%              72%
3              110       184        413       181         83       312    1,283   12%                    1%              32%
4               45        44         94       100        145       134      562    5%                    0%              18%
5               88        27         50       206        775       268    1,414   13%                    2%              55%
6              214       341        492       285        328       508    2,168   19%                    4%              23%
Total        1,039     4,327      1,564     1,007      1,486     1,708   11,131                         25%              47%
%               9%       39%        14%        9%        14%       15%

Table 3-24. Probabilistic application of the person-type model.

76 constants in the model are fitted to the Chicago data. How- ever, a calibration process could be undertaken when apply- ing the estimated model to other areas where the alternative specific constants for each person type could be adjusted until a known distribution of the person types is matched. In the case of the Portland model, however, this is not particularly necessary since the distributions for Chicago and Portland are fairly similar. The deterministic results shown in Table 3-25 are obtained by assigning the category with the highest probability for each example. The deterministic model results show similar char- acteristics to the probabilistic results, with the training model using the Chicago survey data somewhat outperforming the Portland model, as expected, although the difference in over- all percent correctly predicted is not substantial. While the fit results appear better under the deterministic model applica- tion, as is often the case with this type of selection process, there is a substantial distortion in the simulated person classifica- tion distribution when compared to the observed distribution, which is why the deterministic assignment is not preferred when performing the demographic characterization process. The educational attainment variable is estimated jointly with the person classification in the nested logit model formulation described previously. Once the results in the person classifi- cation process are obtained, the final educational attainment equal to the observed distribution (i.e., if 45% of people are full-time workers, then each observation has a 45% chance of being a full-time worker). The expected percent correct for the null model for each category is then the square of this value. 
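The null-model benchmark described here can be reproduced in a few lines. The sketch takes the observed person-type counts from the Chicago training rows of Table 3-24, squares each observed share to get the expected share of the sample assigned correctly to that class by chance, and sums the squares to get the overall expectation. The function name is illustrative.

```python
def null_model_accuracy(observed_counts):
    """Expected accuracy when classes are drawn at random from the observed shares."""
    total = sum(observed_counts)
    shares = [count / total for count in observed_counts]
    # A class with share s is observed with probability s and drawn with
    # probability s, so it contributes s * s to the expected percent correct.
    per_class = [s * s for s in shares]
    overall = sum(per_class)
    return per_class, overall


# Observed totals for person types 1-6, Chicago training data (Table 3-24).
CHICAGO_OBSERVED = [333, 1527, 458, 216, 383, 465]

per_class, overall = null_model_accuracy(CHICAGO_OBSERVED)
for person_type, expected in enumerate(per_class, start=1):
    print(f"type {person_type}: {expected:.0%}")
print(f"overall: {overall:.0%}")
```

With these counts the full-time worker class contributes about 20% and the overall expectation is about 27%, which agrees with the null-model column of the table.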
The probabilistic assignment lowers the performance of the model somewhat as compared to a deterministic application of the model (i.e., where the highest probability category is always assigned for each observation) but produces more realistic distributions and provides better fit to infrequent classes. For comparison purposes, the deterministic application results are shown in Table 3-25.

The results show that the person-type choice model, which forms the core of the demographic characterization process, is generalizable at least to the Portland data set, where it performs approximately as well as on the training data set (Chicago model). This is significant since the Portland data were not used in model estimation, and it shows that the model is not likely to have been overfitted and is potentially transferable. In each case, the model correctly predicts between 47% and 51% of person types, which for both applications is substantially higher than the null model expectation of 25% and 27%.
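The difference between the two application modes can be sketched as follows. Given a vector of modeled person-type probabilities for one individual, the deterministic rule always picks the most likely type, while the probabilistic rule draws a type at random in proportion to those probabilities. The probability values and the random seed are invented for illustration and do not come from the estimated model.

```python
import random

PERSON_TYPES = ["part", "full", "retiree", "child", "student", "other"]


def assign_deterministic(probs):
    """Index of the category with the highest modeled probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def assign_probabilistic(probs, rng):
    """Index drawn at random according to the modeled probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical modeled probabilities for one individual.
probs = [0.15, 0.40, 0.20, 0.05, 0.05, 0.15]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = [assign_probabilistic(probs, rng) for _ in range(10000)]
shares = [draws.count(i) / len(draws) for i in range(len(probs))]

print("deterministic:", PERSON_TYPES[assign_deterministic(probs)])
for name, share in zip(PERSON_TYPES, shares):
    print(f"probabilistic share of {name}: {share:.2f}")
```

If many individuals have similar probability vectors, the deterministic rule assigns nearly all of them to the single most likely type, whereas the random draws reproduce the modeled shares. This is the kind of distribution distortion the text attributes to deterministic assignment.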
One observation from the test results is that the probabilistic application of the model for Portland does not exactly replicate the observed distribution as in the training results for the Chicago model, since the

CMAP – TRAINING RESULTS (67% split)
Observed Person Type / Simulated Person Type (blank cells in the source are shown as –)
Observed    1.part    2.full  3.retiree   4.child  5.student   6.other    Total     %  Null model % correct  Model % correct
1               15       181         93         2         17        25      333   10%                    0%               5%
2                4     1,333        158         4          1        27    1,527   45%                   45%              87%
3                –        50        401         2          1         4      458   14%                    0%              88%
4                3         6         95        13         59        40      216    6%                    0%               6%
5                –         9         49         2        300        23      383   11%                    0%              78%
6                2        62        271        10         66        54      465   14%                    0%              12%
Total           24     1,641      1,067        33        444       173    3,382                         45%              63%
%               1%       48%        32%        1%        13%        5%

TEST RESULTS – PORTLAND HH SURVEY (100%)
Observed Person Type / Simulated Person Type
Observed    1.part    2.full  3.retiree   4.child  5.student   6.other    Total     %  Null model % correct  Model % correct
1               24       521        442        14         68        97    1,166   10%                    0%               2%
2               24     3,634        756        24         14        86    4,538   41%                   41%              80%
3                3       152      1,090         6          1        31    1,283   12%                    0%              85%
4                1        28        245        34        177        77      562    5%                    0%               6%
5                2        16        135        25      1,167        69    1,414   13%                    0%              83%
6               17       309      1,250        47        333       212    2,168   19%                    0%              10%
Total           71     4,660      3,918       150      1,760       572   11,131                         41%              55%
%               1%       42%        35%        1%        16%        5%

Table 3-25. Deterministic application of the person-type model.

model, which is conditional on the person-type classification, is applied. The fit results are shown in Table 3-26.

The model results for the educational attainment classification show both the Chicago training model and the Portland test model performing very well when compared to the null model results, with a prediction potential of 56% versus 39%. The models both perform very well in identifying individuals without a high school degree and individuals with a college degree, although they have some trouble identifying individuals with only a high school degree.

Next, the results for the age categorization model are shown in Table 3-27 for the Chicago and Portland data sets. The age categorization model classifies the sample data into five broad age categories, which are children (0–16), young adults (17–25), young middle age (26–45), older middle age (46–65), and seniors (66+), using ordinal logit regression. The results show the model performing reasonably well, with an approximately 50% improvement over the null model. Interestingly, the model performs marginally better for the test application using Portland data, although the differences are slight. The results show that the children and middle-age categories are relatively easy to predict, while the young adult category is difficult, which is expected since it is the most infrequently observed category. Additionally, when the classification for age is incorrect, it is generally only off by one category in either direction, with over 75% of predictions for both training and test applications falling within one level of the observed category.

The results of the gender classification model for Chicago and Portland are shown in Table 3-28. The gender classification model was a simple binary logit model. The results show that the model only slightly outperforms the null model in both cases, demonstrating that gender differences are not strongly reflected in differing travel patterns. The model does a better job of predicting classification as female, possibly reflecting, to some extent, travel pattern identifiers that are unique to females.

The last person-level classification model is for the possession of a driver's license, which is modeled using a PART decision rule set. The model is applied to all individuals in the sample over age 16. The results for Chicago and Portland are shown in Table 3-29. The possession of a driver's license is difficult to model since it has an unbalanced distribution, with the vast majority of individuals in the sample possessing a license. The results in the table show the difficulty of predicting which individuals do not have licenses based only on observed travel patterns. It is relatively easy to identify which individuals have licenses, especially if the mode classification is included from Experiment A. However, individuals not having a license will appear similar to individuals who have a license and choose to use public transport, individuals who happened to not travel much on the survey day, and so forth. The model, however, does improve on the null model for both training and test data sets, and is highly sensitive to identifying travelers with licenses.

CMAP – TRAINING RESULTS (67% split)

| Observed \ Simulated education level | No High School | High School | College | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|---|
| No High School | 489.2 | 120.0 | 150.8 | 760.0 | 22% | 5% | 64% |
| High School | 120.0 | 250.9 | 464.1 | 835.0 | 25% | 6% | 30% |
| College | 173.1 | 467.8 | 1,146.1 | 1,787.0 | 53% | 28% | 64% |
| Total | 782.3 | 838.7 | 1,761.0 | 3,382.0 | | 39% | 56% |
| % | 23% | 25% | 52% | | | | |

PORTLAND – TEST RESULTS

| Observed \ Simulated education level | No High School | High School | College | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|---|
| No High School | 1,853 | 314 | 357 | 2,524 | 23% | 5% | 73% |
| High School | 569 | 817 | 1,446 | 2,832 | 25% | 6% | 29% |
| College | 767 | 1,560 | 3,448 | 5,775 | 52% | 27% | 60% |
| Estimated | 3,189 | 2,691 | 5,251 | 11,131 | | 39% | 55% |
| % | 29% | 24% | 47% | | | | |

Table 3-26. Educational attainment model results for training and test data.
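The null-model column in Table 3-26 appears consistent with assigning each person a class at random in proportion to the observed shares, in which case the expected share correct is the sum of the squared shares. The report does not state this formula in this section, so the sketch below is an assumption checked only against the table's own numbers; the names are illustrative.

```python
# Chicago (CMAP) training results from Table 3-26.
# Rows: observed education level; columns: simulated education level
# (No High School, High School, College).
CMAP_EDUCATION = [
    [489.2, 120.0, 150.8],
    [120.0, 250.9, 464.1],
    [173.1, 467.8, 1146.1],
]


def proportional_null_accuracy(matrix):
    """Expected share correct under random assignment by observed shares.

    Assumed form of the null model: sum over classes of (observed share)^2.
    """
    row_totals = [sum(row) for row in matrix]
    total = sum(row_totals)
    return sum((r / total) ** 2 for r in row_totals)


def per_class_accuracy(matrix):
    """Share of each observed class that the model places in the same class."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


null_share = proportional_null_accuracy(CMAP_EDUCATION)
class_shares = per_class_accuracy(CMAP_EDUCATION)
print(f"null model: {null_share:.0%}")
print("per class: ", [f"{s:.0%}" for s in class_shares])
```

Under this assumption the null model comes to about 39%, and the per-class shares round to the 64%, 30%, and 64% shown in the table's last column.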

CMAP – TRAINING RESULTS (67% split)

| Observed \ Simulated age category | 0–16 | 16–25 | 25–45 | 45–65 | 65+ | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|---|---|---|
| 0–16 | 314 | 46 | 88 | 93 | 46 | 587 | 17% | 3% | 53% |
| 16–25 | 80 | 14 | 39 | 45 | 19 | 197 | 6% | 0% | 7% |
| 25–45 | 65 | 48 | 221 | 315 | 136 | 785 | 23% | 5% | 28% |
| 45–65 | 68 | 59 | 315 | 519 | 251 | 1,212 | 36% | 13% | 43% |
| 65+ | 28 | 22 | 128 | 266 | 157 | 601 | 18% | 3% | 26% |
| Total | 555 | 189 | 791 | 1,238 | 609 | 3,382 | | 25% | 36% |
| % | 16% | 6% | 23% | 37% | 18% | | | | |

TEST RESULTS – PORTLAND HH SURVEY (100%)

| Observed \ Simulated age category | 0–16 | 16–25 | 25–45 | 45–65 | 65+ | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|---|---|---|
| 0–16 | 1,338 | 155 | 257 | 249 | 118 | 2,117 | 19% | 4% | 63% |
| 16–25 | 253 | 45 | 128 | 166 | 77 | 669 | 6% | 0% | 7% |
| 25–45 | 233 | 152 | 683 | 948 | 407 | 2,423 | 22% | 5% | 28% |
| 45–65 | 253 | 211 | 1,086 | 1,732 | 810 | 4,092 | 37% | 14% | 42% |
| 65+ | 118 | 73 | 408 | 795 | 436 | 1,830 | 16% | 3% | 24% |
| Total | 2,195 | 636 | 2,562 | 3,890 | 1,848 | 11,131 | | 25% | 38% |
| % | 20% | 6% | 23% | 35% | 16% | | | | |

Table 3-27. Ordered logit age category model results for training and test data.

CMAP – TRAINING RESULTS (67% split)

| Observed \ Simulated gender | Male | Female | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|
| Male | 768 | 807 | 1,575 | 47% | 22% | 49% |
| Female | 807 | 994 | 1,801 | 53% | 28% | 55% |
| Total | 1,575 | 1,801 | 3,376 | | 50% | 52% |
| % | 47% | 53% | | | | |

PORTLAND – TEST RESULTS

| Observed \ Simulated gender | Male | Female | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|
| Male | 2,294 | 3,001 | 5,295 | 48% | 23% | 43% |
| Female | 2,418 | 3,400 | 5,818 | 52% | 27% | 58% |
| Total | 4,712 | 6,401 | 11,113 | | 50% | 51% |
| % | 42% | 58% | | | | |

Table 3-28. Gender classification model results for training and test data.
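Because the age categories are ordered, the statement that misclassifications are usually off by only one category can be checked from Table 3-27 by counting cells on or next to the diagonal. The following is a minimal sketch using the Chicago training counts; the function name and the `tolerance` parameter are illustrative and not taken from the report.

```python
# Chicago (CMAP) training confusion matrix from Table 3-27.
# Rows: observed age category; columns: simulated age category,
# both in the order 0-16, 16-25, 25-45, 45-65, 65+.
CMAP_AGE = [
    [314, 46, 88, 93, 46],
    [80, 14, 39, 45, 19],
    [65, 48, 221, 315, 136],
    [68, 59, 315, 519, 251],
    [28, 22, 128, 266, 157],
]


def share_within(matrix, tolerance):
    """Share of observations predicted within `tolerance` ordered categories."""
    total = sum(sum(row) for row in matrix)
    hits = sum(
        count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if abs(i - j) <= tolerance
    )
    return hits / total


print(f"exact category:      {share_within(CMAP_AGE, 0):.1%}")
print(f"within one category: {share_within(CMAP_AGE, 1):.1%}")
```

For the Chicago counts the exact-match share rounds to the 36% in the table, and the within-one share is above 75%, in line with the text.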

Finally, the household-type joint model shows a similar pattern in performance as the person-level models, with the model performing nearly as well in Portland as it did in Chicago. The fit results can be seen in Table 3-30. Note here that the full misclassification matrix is not shown for the joint model since it is an 18 × 18 matrix of values and is difficult to interpret. Therefore, only the prediction potential versus null model is shown, which is equivalent to the right-hand side of the tables for the personal characteristics. The overall fit, which represents the number of observations with all three variables predicted correctly, is 45% for Chicago against 40% in Portland, both of which are higher than the null model expectations of 35% and 34%, respectively. The difference here is less substantial than in the other models, but this is as expected since fitting three categories simultaneously is a difficult task. Individual attribute models perform somewhat better than the null model expectation for Chicago, although there is no difference for the household size and number of vehicles categories when applied to Portland. However, the identification of the presence of children shows a substantial increase, showing that the presence of children is related to differences in travel patterns, as expected. The remaining sub-models for person attributes perform similarly. Fit results for the remaining models are shown in Appendix E, along with the model estimates.

CMAP – TRAINING RESULTS (67% split)

| Observed \ Simulated | Yes | No | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|
| Yes | 2,497 | 44 | 2,541 | 91% | 83% | 98% |
| No | 151 | 91 | 242 | 9% | 1% | 38% |
| Total | 2,648 | 135 | 2,783 | | 84% | 93% |
| % | 95% | 5% | | | | |

PORTLAND – TEST RESULTS

| Observed \ Simulated | Yes | No | Total | % | Null model % correct | Model % correct |
|---|---|---|---|---|---|---|
| Yes | 8,066 | 210 | 8,276 | 92% | 84% | 97% |
| No | 651 | 87 | 738 | 8% | 1% | 12% |
| Total | 8,717 | 297 | 9,014 | | 85% | 90% |
| % | 97% | 3% | | | | |

Table 3-29. Possession of driver's license classification model results for training and test data.

| | Training – Chicago: Model % | Training – Chicago: Null % | Test – Portland: Model % | Test – Portland: Null % |
|---|---|---|---|---|
| % correct overall | 45% | 35% | 40% | 34% |
| % correct hh size | 56% | 51% | 55% | 56% |
| % correct # vehicle | 73% | 68% | 71% | 72% |
| % correct has children | 74% | 43% | 68% | 58% |

Table 3-30. Training versus test model fit results for joint household-type model.

Findings

The results of the Experiment B demographic characterization process appear promising in that the models generally show substantial improvement over null model expectations, appear to be transferable to some degree, and are able to generate consistent person-characteristic estimates. These results are significant in light of the minimal data used as input to the demographic characterization process, meaning that as more data from detailed land use databases become available and larger and longer-duration GPS trace collection procedures are developed, the models could be improved substantially.

Several key findings were observed in the performance of the demographic characterization experiment that can help improve the application of such a procedure and can help to guide the data collection process used to gather input traces.

• Multiday data collection is preferable to single-day data collection since it helps to average out intrapersonal day-to-day variation, which can be greater than interpersonal variation. This variability tends to confound the demographic characterization procedure (e.g., if a worker is surveyed on a nonworking day, the person will be very hard to identify as a worker). This finding aligns with previous research (Pas and Sundar 1995).
• It is also important to have reasonable estimates of workplace and school locations, either from access to more detailed location databases or from longer-term observation that can identify recurrent travel patterns.
• If it were possible to ensure that all household members were tracked and linked together, much better estimates of certain person and household characteristics could be made. This would especially help since the joint trip-making travel characteristics tended to be significant in early versions of the model effort, which did not conform to the assumptions made in Experiment A.
• The causality between travel patterns and personal characteristics here runs counter to how the modeling is usually done and, as such, appears to be much weaker than in the reverse case. In other words, people travel the way they do because of who they are, but people are generally not who they are because of the way they travel (with some exceptions, such as with possession of a license or vehicle ownership).
• A significant problem is that some person types are virtually indistinguishable based on travel characteristics alone (e.g., a young child's travel pattern, naturally, looks much like the caretaker's pattern). This is especially true for short-term data collection. For example, part-time and full-time workers are rarely distinguishable in short-term collection efforts (most part-time workers work full shifts for fewer days).
• Finally, the joint modeling of attributes is difficult but provides benefits over modeling attributes separately, outside of just consistency, which is also important.
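The driver's license results in Table 3-29 illustrate why overall percent correct can be misleading for an unbalanced attribute. The sketch below computes the share of license holders and of non-holders that the model identifies correctly, using the counts from that table; the dictionary layout and function name are illustrative assumptions, not part of the report.

```python
# Counts from Table 3-29, keyed by observed status.
# Each value is (simulated "Yes", simulated "No").
LICENSE_TABLES = {
    "Chicago (training)": {"yes": (2497, 44), "no": (151, 91)},
    "Portland (test)": {"yes": (8066, 210), "no": (651, 87)},
}


def class_rates(table):
    """Return (overall share correct, share of holders found, share of non-holders found)."""
    yes_as_yes, yes_as_no = table["yes"]
    no_as_yes, no_as_no = table["no"]
    total = yes_as_yes + yes_as_no + no_as_yes + no_as_no
    overall = (yes_as_yes + no_as_no) / total
    holders_found = yes_as_yes / (yes_as_yes + yes_as_no)
    non_holders_found = no_as_no / (no_as_yes + no_as_no)
    return overall, holders_found, non_holders_found


for name, table in LICENSE_TABLES.items():
    overall, holders, non_holders = class_rates(table)
    print(f"{name}: overall {overall:.0%}, "
          f"license holders {holders:.0%}, non-holders {non_holders:.0%}")
```

The overall share stays at or above 90% in both regions, while the share of non-holders identified is far lower and drops further in the Portland test application, as the table's last column shows.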


TRB’s National Cooperative Highway Research Program (NCHRP) Report 775: Applying GPS Data to Understand Travel Behavior, Volume I: Background, Methods, and Tests describes the research process that was used to develop guidelines on the use of multiple sources of Global Positioning System (GPS) data to understand travel behavior and activity. The guidelines, which are included in NCHRP Report 775, Volume II are intended to provide a jump-start for processing GPS data for travel behavior purposes and provide key information elements that practitioners should consider when using GPS data.
