The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

A Guidebook for Using American Community Survey Data for Transportation Planning

... commutes.[102] See Section 9.4 for specific examples that illustrate how census data have been used to do travel demand modeling.

9.2 Benefits and Limitations of ACS for Travel Demand Modeling

In discussions with transportation planners, the following potential benefits of ACS were identified:

• The availability of data on a continuous basis provides opportunities for more frequent updates of travel demand models (in particular, the base-year socioeconomic and demographic data). Again, the sparse sample size and resulting standard errors in the data during any particular year could limit the potential opportunities.
• If PUMS data are made available from ACS on a continuous basis, this has positive implications for tracking regional changes taking place in regions experiencing accelerated growth.
• In selecting demographic variables for use in developing models, usually these variables are restricted to those found in the census databases, so that the models can be applied to the full population using the joint distributions from the census data. Because ACS will enhance trend analysis and ACS data are available continuously, more variables could be forecast into the future, and it is likely that more demographic variables could then be included in travel demand models.
• ACS should not cause problems with model estimation since decennial census data (except for PUMS) are generally not used for parameter estimation of work trip travel models due to the aggregate nature of the data.

The following ACS issues were identified as limitations for travel demand modeling:

• Theoretically, five-year cumulative averages are inconsistent with models that are used to predict at a point in time (such as trip distribution models or mode choice models), and model calibration/validation could be problematic. However, practical ways could be found to overcome this limitation. For example, the validation of work mode choice models occurs at a coarse level of geography, and changes in household characteristics and mode choices over five years would probably not be very significant.
• As discussed above, the cumulative averaging of ACS model inputs is inconsistent with most travel demand models in use. In addition, the larger standard errors associated with ACS parameters, compared to decennial census Long Form parameters, increase the variability and error of models that rely on these data. In most instances, travel demand modelers treat census Long Form data as simple point estimates and do not acknowledge in their model systems that these data are subject to sampling and non-sampling error. In migrating to using ACS data, it will be more difficult to make these simplifying assumptions.
• Origin-destination matrices developed from journey-to-work data can be problematic. Problems include: sparse data from a single year for small geographies, seasonality if data are averaged over a year, and changes in development of new housing and business locations over three to five years if rolling averages are used.

9.3 Travel Demand Modeling Case Studies

The following case studies illustrate how a data user might use ACS data to support travel demand modeling efforts. The case studies provide a step-by-step description of how one might obtain the data, do the computations, and present the results.

[102] North Jersey Transportation Planning Authority, 2003. "Journey-to-Work Data: Census 2000 County-to-County Worker Flow Data for the NJTPA Region."

Travel Demand Modeling Analyses Using ACS Data

For this purpose, assume that you are a transportation analyst working in an MPO. In the first analysis, your manager has asked you to estimate an auto ownership model. In the second analysis, you are asked to validate a trip distribution model. Section 3 of this guidebook has detailed instructions on downloading ACS data.

9.3.1 Analysis 1--Estimation of Auto Ownership Model

This case study illustrates how an auto availability model can be estimated using ACS data. Auto availability models can be estimated using disaggregate household data. The ACS PUMS is a good source of detailed disaggregate household data, with information on household income, size, workers, type, etc. One limitation of the data is that the household residence is not available at a geographic level fine enough to allow the use of accessibility measures and land-use density in the model. The geographic level that is normally reported is the PUMA, but the ACS PUMS that currently are available on the American FactFinder website have state-level data. The model estimated in this case study builds on the work done previously[103] to estimate automobile ownership models for the Bay Area and the San Diego region.

This exercise requires estimating an auto availability model for the state of California using recent ACS PUMS data from years 2000 to 2003.

To present the results of the auto availability model exercise, it is important to show the alternatives modeled, explanatory variables used, parameter estimates, statistical significance of the variables, and model fit. This information is displayed in Table 9.1.

Table 9.1. Magnitudes of variable coefficients.

                                  Alternative
Variable                          0-Vehicle             1-Vehicle             2+ Vehicles
Constant                          -0.75 (t = -37.13)    --                    -2.74 (t = -132.18)
Persons in household              --                    --                    0.34 (t = 54.42)
Workers in household              --                    0.17 (t = 10.23)      0.93 (t = 52.73)
$35,000 ≤ Income < $70,000        --                    1.03 (t = 33.0)       1.61 (t = 49.66)
Income ≥ $70,000                  --                    0.93 (t = 22.48)      2.26 (t = 54.82)
One-family house                  --                    0.66 (t = 27.06)      1.89 (t = 73.20)

Model Statistics
Likelihood with Zero Coefficients             -151,351.42
Likelihood with Constants only                -118,021.73
Final value of Likelihood                      -87,412.02
"Rho-Squared" with respect to Zero                 0.4225
"Rho-Squared" with respect to Constants            0.2594

NOTE: T-statistics appear in parentheses; "--" indicates parameters set to zero in the model specification. Rho-squared is a goodness-of-fit measure for discrete choice models that is analogous to R-squared in regression analysis.

[103] C. Purvis, "Using 1990 Census Public Use Microdata Sample to Estimate Demographic and Automobile Ownership Models," Transportation Research Record 1443, TRB, National Research Council, Washington, D.C., 1994.
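As a quick check on the model statistics in Table 9.1, the sketch below recomputes both rho-squared values from the reported likelihood values, using the usual definition of one minus the ratio of the final log-likelihood to a reference log-likelihood. It also checks that the zero-coefficient value matches what an equal-shares model over three alternatives would give for the 137,766 household observations reported later in this section. The function and variable names are illustrative and are not part of the guidebook.

```python
import math


def rho_squared(ll_final, ll_reference):
    """Rho-squared: 1 - LL(final) / LL(reference)."""
    return 1.0 - ll_final / ll_reference


# Values reported under "Model Statistics" in Table 9.1.
LL_ZERO = -151351.42       # likelihood with zero coefficients
LL_CONSTANTS = -118021.73  # likelihood with constants only
LL_FINAL = -87412.02       # final value of likelihood

rho_zero = rho_squared(LL_FINAL, LL_ZERO)
rho_constants = rho_squared(LL_FINAL, LL_CONSTANTS)

# With all coefficients at zero, each of the three alternatives has
# probability 1/3, so the log-likelihood is N * ln(1/3).
N_HOUSEHOLDS = 137766
ll_equal_shares = N_HOUSEHOLDS * math.log(1.0 / 3.0)

print(f"rho-squared w.r.t. zero:      {rho_zero:.4f}")
print(f"rho-squared w.r.t. constants: {rho_constants:.4f}")
print(f"N * ln(1/3):                  {ll_equal_shares:,.2f}")
```

Both recomputed values agree with the table to four decimal places, which suggests the reported statistics are internally consistent.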

In addition, the following conclusions can be associated with this analysis:

• The alternative-specific constants of the zero and two-plus alternatives are negative, which indicates that, if all else is the same, households are more likely to have one vehicle available. Moreover, a household is more likely to have zero than two or more vehicles.
• As household size increases, a household is more likely to have two or more vehicles available.
• The presence of workers in the household increases the utility of having one or more vehicles available, and the effect is stronger for the two-plus vehicle category.
• Relative to low-income households, medium- and high-income households are more likely to have one or more vehicles available, and the effect is stronger for the two-plus vehicle category.
• Finally, the effect of dwelling type on vehicle availability is that one-family houses are more likely to have one or more vehicles available than zero vehicles, and are more likely to have two-plus vehicles than one vehicle.

Available Data

The auto availability model that is estimated in this case study is for the state of California. Four years of ACS data, 2000 through 2003, are pooled to increase the sample size. Note that the pooling of the household records from these four years does not cause any correlation problems in the estimation since the four samples will not have overlapping housing units.[104]

The ACS PUMS data are composed of a household file and a person file. Even though the vehicle availability model is estimated at the household level, the person file also provides some characteristics that can be used in the model, such as number of workers in the household.
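To illustrate how the Table 9.1 coefficients lead to the conclusions listed above, the following sketch converts them into choice probabilities for a single household. It assumes a multinomial logit form, which the text does not name explicitly, and it assumes each coefficient belongs to the alternative column implied by the discussion. The example household and all function names are hypothetical.

```python
import math

# Coefficients from Table 9.1; parameters set to zero are omitted.
COEFFICIENTS = {
    "0-Vehicle": {"constant": -0.75},
    "1-Vehicle": {
        "workers": 0.17,
        "income_mid": 1.03,    # $35,000 <= income < $70,000
        "income_high": 0.93,   # income >= $70,000
        "one_family_house": 0.66,
    },
    "2+ Vehicles": {
        "constant": -2.74,
        "persons": 0.34,
        "workers": 0.93,
        "income_mid": 1.61,
        "income_high": 2.26,
        "one_family_house": 1.89,
    },
}


def utilities(household):
    """Systematic utility of each alternative for one household."""
    attrs = dict(household, constant=1.0)
    return {
        alt: sum(coef * attrs.get(name, 0.0) for name, coef in coefs.items())
        for alt, coefs in COEFFICIENTS.items()
    }


def choice_probabilities(utils):
    """Multinomial logit probabilities (assumed model form)."""
    max_u = max(utils.values())  # subtract the maximum for numerical stability
    exp_u = {alt: math.exp(u - max_u) for alt, u in utils.items()}
    total = sum(exp_u.values())
    return {alt: e / total for alt, e in exp_u.items()}


# Hypothetical household: 3 persons, 2 workers, medium income, one-family house.
household = {"persons": 3, "workers": 2, "income_mid": 1,
             "income_high": 0, "one_family_house": 1}
probs = choice_probabilities(utilities(household))

# Constants only: every explanatory variable set to zero.
base_probs = choice_probabilities(utilities({}))

for alt, p in probs.items():
    print(f"{alt}: {p:.3f}")
```

With only the constants active, the one-vehicle alternative has the highest probability and zero vehicles ranks above two-plus, matching the first conclusion. For the example household, the larger two-plus coefficients shift most of the probability to that alternative.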
To download the data, the user should follow the steps below:

• Go to the American FactFinder website at html?_lang=en,
• Click on the Datasets tab,
• Select the "2000-2003 American Community Survey" tab,
• Click on the link leading to "Public Use Microdata Sample (PUMS)", and
• Click on a certain year (e.g., 2003) and then download the person records ("P" records) and household records ("H" records) for a selected state.

Analysis Steps

The 2000-2003 household records are pooled into one sample for estimation, removing those records that correspond to vacant units or to group quarters. The following variables from the household file are retained for use in the estimation: number of vehicles in the household (variable is VEH), number of persons in household (variable is NP), household income (variable is HINCP), and type of residence (variable is BLD). In addition, the person files are used to obtain the number of workers in the household (variable is COW).

Two types of adjustment factors are applied to the income variable, as follows:[105]

• For a given ACS PUMS year, the first adjustment factor is a value that is applied to all observations obtained from that year. This factor is included in the PUMS datasets and is called ADJUST. This adjustment is needed because interviews in the ACS were conducted throughout the year. Application of the adjustment factor will convert dollar amounts to July (of the given year) dollars.[106]
• The second adjustment factor is needed because ACS PUMS data from years 2000-2003 are used in this case study. When working with dollar amounts from different years, it is necessary to convert the amounts into dollars from a common year (after applying the adjustment factor described in the previous paragraph). The CPI-U-RS adjustment factors from the Bureau of Labor Statistics are used.[107]

The number of household observations used in the estimation is equal to 137,766. The alternatives are zero, one, and two-plus vehicles. The model estimation exercise consists of iteratively selecting explanatory variables; running the model through model estimation software; examining the magnitudes, signs, and t-statistics of the coefficients and overall model fit; and adjusting the selected variables accordingly.

[104] No housing unit will be sampled more than once in a five-year period.
[105] Correspondence with Nicholas Spanos of the Census Bureau, May 27, 2005.
[106] Note that the value of ADJUST is the same for all sample cases. This is for disclosure avoidance reasons, that is, so that the month of interview cannot be identified by the adjustment factor. The original dollar amounts were adjusted so that one value of ADJUST could be used for all sample cases.

9.3.2 Analysis 2--Validation of a Trip Distribution Model

This exercise requires validating the trip distribution gravity model of a county-level travel demand model system by comparing model results to observed data.

To present the results of this model validation exercise, it is useful to show two types of comparison. The first is a comparison of number/percentage of trips from a given origin to all destinations (e.g., at the district level). The second is a comparison of county-level mean travel time and travel time distribution. Each of these comparisons can assist in adjusting the coefficients of the gravity model if large discrepancies exist between modeled and observed travel times.

For example, Table 9.2 shows the number and percentage of trips from District 1 to all other districts using the 2000 ACS and the gravity model, as well as the difference between the two sources. The table shows that:

• Overall, the number of trips originating from District 1 is undersimulated.
• In terms of distribution of trips, the largest discrepancies occur with the intradistrict flow (to District 1, at -4.5 percentage points) and the flow to District 2 (at 5.4 percentage points).

Figure 9.3 compares the travel time distribution obtained from the gravity model to the ACS reported travel time distribution. The figure shows that the model underpredicts short trips and overpredicts long trips.
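The differences quoted for Table 9.2 can be reproduced directly from the tabulated flows. The short sketch below does so; the flow values are those in the table (the gravity model values are fictitious, as the text notes), while the function and variable names are illustrative only.

```python
# Flows from District 1 to each destination district, as listed in Table 9.2.
acs_flows = {1: 36545, 2: 14945, 3: 2705, 4: 1750, 5: 2070}
model_flows = {1: 30000, 2: 16000, 3: 2000, 4: 1500, 5: 1800}  # fictitious


def percent_shares(flows):
    """Share of each destination in the origin's total flow, in percent."""
    total = sum(flows.values())
    return {dest: 100.0 * flow / total for dest, flow in flows.items()}


acs_pct = percent_shares(acs_flows)
model_pct = percent_shares(model_flows)

# Differences are computed as gravity model minus ACS.
flow_diff = {d: model_flows[d] - acs_flows[d] for d in acs_flows}
pct_diff = {d: model_pct[d] - acs_pct[d] for d in acs_flows}
total_diff = sum(model_flows.values()) - sum(acs_flows.values())

for d in sorted(acs_flows):
    print(f"District {d}: flow difference {flow_diff[d]:+,d}, "
          f"share difference {pct_diff[d]:+.1f} points")
print(f"Total flow difference: {total_diff:+,d}")
```

A negative total difference corresponds to the undersimulation of trips from District 1, and the two largest share differences fall on Districts 1 and 2.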
Available Data

Two data sources are available to do this analysis. The first data source is the trip distribution model outputs in terms of number of trips and travel time skims by origin-destination pair. The second data source is ACS, which provides data on worker flows between every origin-destination pair (assuming a CTPP-like product from ACS is available) and reported travel time data for these origin-destination pairs.

Table 9.2. Comparison of worker trips from a given district to all other districts using ACS and the gravity model.

Origin: District 1
                   ACS                     Gravity Model           Gravity Model - ACS
To District        Flow      Percentage    Flow      Percentage    Flow      Percentage
1                  36,545    63.0          30,000    58.5          -6,545    -4.5
2                  14,945    25.8          16,000    31.2          1,055     5.4
3                  2,705     4.7           2,000     3.9           -705      -0.8
4                  1,750     3.0           1,500     2.9           -250      -0.1
5                  2,070     3.6           1,800     3.5           -270      -0.1
Total              58,015    100           51,300    100           -6,715    -0.1

[107] These factors can be found at the following URL: [For example, to express year 2000 dollars in terms of 2003 dollars, multiply the 2000 dollars by 267.9/250.8 = 1.06818182].
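The two-step income adjustment described for Analysis 1 (the ADJUST factor, then the CPI-U-RS ratio in footnote 107) amounts to two multiplications. The sketch below is a minimal illustration under stated assumptions: ADJUST is passed in as a plain decimal multiplier, the income amount and the ADJUST value are hypothetical placeholders, and only the two index values quoted in the footnote are used.

```python
def to_common_year_dollars(reported_amount, adjust_factor,
                           cpi_source_year, cpi_target_year):
    """Convert a reported dollar amount to dollars of a common target year.

    adjust_factor is assumed to be the PUMS ADJUST value expressed as a
    decimal multiplier; check the PUMS documentation for its stored format.
    """
    # Step 1: express the amount in July dollars of the survey year.
    july_dollars = reported_amount * adjust_factor
    # Step 2: scale by the ratio of index values to reach the target year.
    return july_dollars * (cpi_target_year / cpi_source_year)


# Index values quoted in footnote 107.
CPI_U_RS_2000 = 250.8
CPI_U_RS_2003 = 267.9

# Hypothetical inputs, for illustration only.
hypothetical_income_2000 = 40000.0
hypothetical_adjust = 1.0  # placeholder, not an actual ADJUST value

income_2003_dollars = to_common_year_dollars(
    hypothetical_income_2000, hypothetical_adjust,
    CPI_U_RS_2000, CPI_U_RS_2003,
)
print(f"{income_2003_dollars:,.2f}")
```

Applying the adjustment before assigning households to the income categories of Table 9.1 keeps the category thresholds comparable across the pooled years.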

[Figure 9.3. Travel time distributions from ACS and the gravity model. Vertical axis: trips (in percent). Horizontal axis: travel time (in minutes), in bins from 0-5 through 75-90 and "More". Series: ACS and gravity model.]

Note that in addition to these data sources, one also can use a household travel survey, where respondents' trip origins and destinations can be geocoded and the corresponding travel time skims used to derive an observed travel time distribution. An origin-destination survey, if available, also can be a valuable data source for validating trip interchanges.

For this case study, the available ACS data (obtained from San Francisco County records) were at the tract-to-tract flow level. They were aggregated to the district level producing the flows in Table 9.3. Note that the gravity model numbers presented in this case study are fictitious.

Analysis Steps

The following four steps are involved in conducting this analysis:

1. Select the level of geography at which the validation of flows should be conducted. The selection depends on the model type (e.g., statewide model, countywide model, etc.) and the desired level of accuracy. For this analysis, the validation of flows is conducted at the district-to-district level.
2. Aggregate the flows from the available geographic level detail to the desired geographic level. For example, the ACS flow data are available for this case study at the tract-to-tract level. A correspondence table between tracts and districts is used to derive the district-to-district flows presented in Table 9.3.
3. Since ACS flows correspond to the home-to-work direction only, the model home-based work flows, which combine both the home-to-work and work-to-home directions, should be divided in half to be comparable to the ACS data.
4. Use the tract-to-tract reported travel times to derive a travel time distribution from ACS and compare it to the model distribution.

The following caveats regarding using ACS data for the validation of trip distribution should be noted:

• The ACS travel times are reported travel times; hence they inherently suffer from respondent rounding and inaccuracy.
• Because of confidentiality issues, ACS flow data might be suppressed for origin-destination pairs that do not meet the threshold for data tabulation. This might cause inaccuracies when comparing ACS flows to model flows.
• ACS measures worker flows rather than trips; ACS does not account for absenteeism or for multiple job locations. These factors can cause additional differences between ACS and model results.
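Steps 2 and 3 of the analysis, followed by a comparison of destination shares, could be sketched as follows. The tract identifiers, the tract-to-district correspondence table, and every flow value in this example are hypothetical and chosen only to keep the arithmetic easy to follow; the function names are likewise illustrative.

```python
from collections import defaultdict


def aggregate_flows(tract_flows, tract_to_district):
    """Step 2: sum tract-to-tract flows into district-to-district flows."""
    district_flows = defaultdict(float)
    for (origin, destination), workers in tract_flows.items():
        key = (tract_to_district[origin], tract_to_district[destination])
        district_flows[key] += workers
    return dict(district_flows)


def halve_home_based_work(model_flows):
    """Step 3: halve two-direction model flows to match home-to-work flows."""
    return {pair: flow / 2.0 for pair, flow in model_flows.items()}


def destination_shares(flows, origin):
    """Percentage of the origin's flow going to each destination."""
    from_origin = {d: f for (o, d), f in flows.items() if o == origin}
    total = sum(from_origin.values())
    return {d: 100.0 * f / total for d, f in from_origin.items()}


# Hypothetical correspondence table and tract-level ACS worker flows.
tract_to_district = {"T1": 1, "T2": 1, "T3": 2}
acs_tract_flows = {("T1", "T1"): 120, ("T1", "T2"): 80,
                   ("T2", "T3"): 50, ("T3", "T1"): 30}

# Hypothetical model home-based work flows (both directions combined).
model_hbw_flows = {(1, 1): 360, (1, 2): 140, (2, 1): 70}

acs_district = aggregate_flows(acs_tract_flows, tract_to_district)
model_home_to_work = halve_home_based_work(model_hbw_flows)

acs_shares = destination_shares(acs_district, origin=1)
model_shares = destination_shares(model_home_to_work, origin=1)
share_diff = {d: model_shares[d] - acs_shares[d] for d in acs_shares}

for d in sorted(share_diff):
    print(f"District 1 to {d}: model minus ACS = {share_diff[d]:+.1f} points")
```

Keeping the correspondence table as a separate input makes it straightforward to rerun the comparison at a different level of geography, as step 1 suggests.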