
Suggested Citation:"Chapter 8 - Survey Development and Analysis Using ACS Data." National Academies of Sciences, Engineering, and Medicine. 2007. A Guidebook for Using American Community Survey Data for Transportation Planning. Washington, DC: The National Academies Press. doi: 10.17226/13895.


Chapter 8: Survey Development and Analysis Using ACS Data

This section describes survey development and analysis as common applications of Census Bureau data and shows the potentially important transportation planning uses of ACS. Section 8.1 defines the different components of survey development and analysis, describes how census data can be used for survey development, and presents some specific examples of census data used for this purpose. Section 8.2 describes some benefits and limitations of shifting from census to ACS data related to survey development and analysis. Finally, Section 8.3 provides a case study that shows how to do survey expansion using ACS data and how the results compare to survey expansion using census data.

8.1 Survey Development and Analysis

Census data are used in multiple aspects of household travel survey development efforts, including sample design, survey expansion, and survey validation.

• Sample design is the process of determining the sample size needed to achieve a certain level of confidence, and/or the categories that should be used in a stratified sample design.
• Survey expansion is the process of creating weights for survey responses and applying these weights to expand the survey to the entire population of a given area so as to adjust for sampling and non-sampling errors.
• Survey validation is the process of checking variables that were not controlled in the sample design (or expansion) for biases (e.g., checking whether income groups are well represented in a sample designed by vehicle ownership and household size).[95]

8.1.1 Using Census Data for Survey Development

Census data assist in survey development and analysis in several ways, including the following:

• Census data are used to estimate the incidence of households with certain characteristics within a geographic area, so that the cost of reaching less common groups and completing certain minimum numbers of households within those groups can be assessed.
  Census data also help determine the number of categories needed in a stratified sample design. For example, analysts use the data to determine which auto ownership categorization makes the most sense for sample stratification and modeling.
• Census data are critical for providing control totals needed in survey expansion efforts. Survey expansion based on variables such as population or households is often performed using the decennial census Short Form data. If variables such as auto ownership or income are used in the expansion (as is usually the case), they are obtained from census products based on the Long Form (such as SF3 or CTPP).
• Census data also are used in validating travel surveys by providing information on the distributions of various socioeconomic, demographic, and journey-to-work characteristics that can be used for the validation of an expanded survey.

[95] U.S. DOT and Bureau of Transportation Statistics, 1996, "Implications of Continuous Measurement for the Uses of Census Data in Transportation Planning."

8.1.2 Specific Examples of Use

Almost all household travel surveys have been expanded using census data. Among the sample expansion variables that have been used in some recent surveys are the following:

• The year 2000 home interview survey conducted by the Metropolitan Council, St. Paul-Minneapolis, Minnesota, was expanded by number of households, household size, and vehicle availability.
• The year 2000 home interview survey conducted by the Memphis MPO was expanded by auto availability, income, and household size.
• The year 1990 household travel survey conducted by the Chicago Area Transportation Study was expanded by household size and vehicle availability.
• The year 1990 San Francisco Bay Area household travel survey was expanded by superdistrict of residence, household size, vehicle availability, and tenure.
• The household travel survey conducted by the Central Transportation Planning Staff, Boston, was expanded by vehicles available, household income, and workers in household.
• The household survey conducted by the Denver Regional Council of Governments was expanded using household size and income variables at the county level from the census.

Many household travel survey efforts also have relied on census data to confirm and validate the results of the surveys. One example of household travel survey validation is the validation done at the Metropolitan Council (St. Paul-Minneapolis), where CTPP Part 1 data were used to check the mode split from the home interview survey done concurrently with the census, and CTPP Part 3 data were used to validate the survey home-based-work trip distribution.

8.2 Benefits and Limitations of ACS for Survey Development and Analysis

This section summarizes the perceived benefits and limitations of using ACS data for survey design, expansion, and validation.

Transportation planners who were asked about the potential use of ACS estimates to support travel survey efforts envisioned that, because measurement is done continuously with ACS, it will be easier to conduct surveys and expand them using more recent data at any point in time (e.g., mid-decade), since they can be expanded by the large-area data from ACS. With the decennial census, the data used for sample design and expansion are often either extrapolated or out of date.

Although ACS can potentially provide more current data for survey expansion, the use of multiyear averaged variables for survey expansion could pose problems. It is expected that the multiyear ACS estimates for household income will be particularly difficult to interpret. If household characteristics change over a five-year period, as household income is likely to, then ACS average data might be inconsistent with other demand-side data sources and with household travel surveys conducted during fixed periods of a few weeks. In addition, the higher sampling error associated with ACS estimates will increase the level of uncertainty in the development of household survey expansion targets.

8.3 Case Study

The purpose of this section is to illustrate how ACS data may be used to weight data collected through a typical household travel survey process. The case study is based on an actual recent regional travel survey conducted in 1999 by a national survey firm for the Mid-Ohio Regional Planning Council (MORPC). There are three sections to this case study, as follows:

1. The weighting process used to adjust the 1999 Mid-Ohio Regional Travel Survey data is summarized. A significant drawback of using decennial census data for analyses like the weighting of surveys is the infrequency of the data releases. When the 1999 survey was conducted, the agency needed to initially rely on census data from 1990 for the weighting.
2. The application of new weights based on Census 2000 data is summarized. Once the 2000 data became available, the survey data could be reweighted to better reflect the population at the time of the survey in 1999.
3. Finally, the weights for Franklin County, Ohio, are recalculated using both the Census 2000 data and ACS data.[96]

Assume that you have been asked to develop new weights for the 1999 household travel survey using Census 2000 data and 2000 ACS data and to compare the two sets of weights.

This survey sampled 5,418 households to provide data for the continuing development and refinement of the region's travel demand forecasting model, as well as to provide a better understanding of travel behavior in the Central Ohio region. Resultant data were used to fulfill the model's functions of estimating trip generation and distribution, mode choice, and assignments.

The 1999 Mid-Ohio Household Travel Survey, like many recent household travel surveys, relied on the willingness of area residents to record their daily travel for a specific 24-hour period.
Households were recruited into the study by telephone, then were mailed personalized materials to aid in recording travel details, and finally were recontacted by telephone for retrieval of the travel data. The survey was conducted from February through June 1999 in seven Central Ohio counties: Delaware, Fairfield, Franklin, Licking, Madison, Pickaway, and Union. During this time, 7,333 households agreed to participate in the study and 5,418 actually provided travel details for all household members.[97]

Although the household travel survey sample was considered to be a fairly good representation of households from the seven-county region, weights were developed and applied to adjust the data for unequal response rates for households of different types across the study area and to account for coverage bias resulting from the use of a telephone survey method.

Key demographic variables used to weight the household survey data to better reflect the full population of households in the survey area included household size, vehicle ownership, telephone ownership, and county populations.

Table 8.1 shows weighting factors based on Census 2000 data. The weights shown are factors that analysts would apply to the households with the different characteristics in the table in order to have the household survey sample better reflect the population of interest. For example, the weights in Table 8.1 indicate that the household travel survey sample included slightly more one-person, zero-vehicle households than a fully representative sample of the study area would include. In frequencies and summaries of the household travel survey, these households would be weighted by a factor of 0.95. Conversely, three-person and four-or-more-person households with no vehicles were underrepresented in the household travel survey. Therefore, in summaries and frequency tabulations, these households would be multiplied by factors of more than 2.5. Table 8.2 shows the same types of weighting factors when ACS estimates are used to represent the target population.

| HH Vehicles | HH Size: 1 Person | 2 Persons | 3 Persons | 4+ Persons |
|-------------|-------------------|-----------|-----------|------------|
| 0 Vehicles  | 0.95000 | 1.55263 | 2.72727 | 2.57143 |
| 1 Vehicle   | 1.00000 | 0.94720 | 1.36842 | 1.57333 |
| 2 Vehicles  | 1.47143 | 1.01101 | 0.90146 | 0.79736 |
| 3 Vehicles  | 0.94118 | 0.93750 | 0.85714 | 1.11570 |
| 4+ Vehicles | 0.85714 | 0.75000 | 0.70588 | 0.95652 |

Table 8.1. Census 2000 weighting factor to adjust for probability of selection, Franklin County.

| HH Vehicles | HH Size: 1 Person | 2 Persons | 3 Persons | 4+ Persons |
|-------------|-------------------|-----------|-----------|------------|
| 0 Vehicles  | 0.91111 | 1.36842 | 2.27273 | 1.78571 |
| 1 Vehicle   | 1.04805 | 0.89441 | 1.32632 | 1.52000 |
| 2 Vehicles  | 1.40000 | 1.07390 | 0.85036 | 0.75551 |
| 3 Vehicles  | 0.70588 | 1.06250 | 0.87143 | 1.14876 |
| 4+ Vehicles | 0.57143 | 0.67857 | 0.76471 | 1.00000 |

Table 8.2. ACS weighting factor to adjust for probability of selection, Franklin County.

The remainder of this section is organized as follows. First, a background of the original weighting process is provided, including a brief description of the different weighting elements. Second, the steps and computations needed to reapply the weighting using Census 2000 data are described. Finally, the reweighting is performed using 2000 ACS data, and the results are compared to the Census 2000 weighting process.

[96] Of the seven counties in the MORPC study, Franklin County was the only county to be surveyed as part of the ACS pilot program.
[97] The study included an additional oversample of households in Licking County, for use in building a model specific to that county. This case study focuses only on the households comprising the main sample, collected for MORPC.

8.3.1 Original Weighting Process[98]

The Mid-Ohio Survey employed a probability sample selection process to select households for inclusion in the study. This means that the relative probability (or chance) that any particular household in the universe would be sampled is known. The actual sampling process employed in the study was a "stratified sample" in which households were randomly selected at the county level. The number of households sampled within a particular county was based on the proportion of households within that county compared to the total number of households in the seven-county region (based on 1990 Census data). Once the proportion of households within each county was determined, households within each county were randomly sampled from a universe of all telephone exchanges in the study area.

Upon completion of data collection efforts, the final distribution of households was not proportionate to that of the survey universe. This is common in survey research and can be attributed to two main causes: coverage bias and unequal response rates. Coverage bias refers to the fact that, because this was a telephone survey, the households randomly sampled for inclusion in the study were limited to those with telephones. Secondly, not all types of households that were contacted participated in equal proportions. The survey requirements biased the sample toward smaller households and those in specific counties. The exact reasons why there are variations in response rates are not always clear, but respondent burden and interest in transportation issues do vary based on factors such as household size and household location.

For these reasons, the final distribution of households in the survey dataset did not match that of the 1990 Census. Thus, a weighting factor was developed to adjust the data, thereby minimizing these potential sources of bias in the data and subsequent analysis. The weighting factors for the 1999 Mid-Ohio Household Travel Survey data were developed through a four-step process. Each step produced an adjustment factor, and the final weight represents the product of those four factors. These steps adjust the data for the following:

• Probability of selection,
• Episodic telephone ownership,
• County weight, and
• Normalization of weights.

[98] This section draws largely from the Mid-Ohio Household Travel Survey Final Report written by NuStats. The numbers were adjusted to reflect the final distribution of households only in the original seven-county sample and excluding the Licking County oversample.

Probability of Selection

The first step in the weighting process was to account for differential probabilities of selection in the sample generation stage.
The natural or proportionate distribution of households by household size and number of vehicles based on 1990 decennial census estimates was determined, and this distribution was compared to the actual distribution of households that completed the survey.

The weighting factor that was calculated to bring the final distribution of surveyed households in line with the actual distribution of households as expressed in the 1990 census, thereby adjusting for probability of selection, is shown in Table 8.3. In that table, a value of 1.0 means that the sampled elements accurately reflected the population at large; a value less than 1.0 means that the population subgroup was overrepresented in the survey data, and a value greater than 1.0 means that it was underrepresented. As shown in Table 8.3, the survey included fewer zero-vehicle households as compared to the census estimates, but proportionally more large households and households with three or more vehicles.

| HH Vehicles | HH Size: 1 Person | 2 Persons | 3 Persons | 4+ Persons |
|-------------|-------------------|-----------|-----------|------------|
| 0 Vehicles  | 1.2671 | 1.6872 | 3.9310 | combined |
| 1 Vehicle   | 0.9357 | 1.0250 | 1.4085 | 2.0036 |
| 2 Vehicles  | 0.9845 | 0.9263 | 1.1295 | 0.9415 |
| 3+ Vehicles | 0.4383 | 0.6682 | 0.8472 | 0.9429 |

Table 8.3. Weighting factor to adjust for probability of selection (factor 1).
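The factor for each cell is the ratio of that cell's share of households in the control data to its share of completed surveys. The sketch below illustrates the calculation in Python. The chapter does not report the underlying household counts, so the counts and the function name `selection_factors` are hypothetical and serve only to show the arithmetic.

```python
def selection_factors(sample_counts, control_counts):
    """Return (control share / sample share) for every cell.

    Both arguments map a (household size, vehicles) cell to a household count.
    """
    sample_total = sum(sample_counts.values())
    control_total = sum(control_counts.values())
    factors = {}
    for cell, n_sample in sample_counts.items():
        sample_share = n_sample / sample_total
        control_share = control_counts[cell] / control_total
        factors[cell] = control_share / sample_share
    return factors


# Hypothetical counts, for illustration only (not from the Mid-Ohio survey).
sample = {
    ("1 person", "0 vehicles"): 40,
    ("1 person", "1 vehicle"): 160,
    ("2 persons", "1 vehicle"): 100,
    ("2 persons", "2 vehicles"): 200,
}
control = {
    ("1 person", "0 vehicles"): 12_000,
    ("1 person", "1 vehicle"): 30_000,
    ("2 persons", "1 vehicle"): 20_000,
    ("2 persons", "2 vehicles"): 38_000,
}

factors = selection_factors(sample, control)
for cell, factor in factors.items():
    print(cell, round(factor, 4))
```

A cell with no completed surveys would cause a division by zero in this sketch, so sparse cells need to be merged with a neighboring cell before the factors are computed, which appears to be what the "combined" entry in Table 8.3 reflects.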

Episodic Telephone Ownership

Except in large urban areas with mature, multimodal transportation infrastructures, most zero-vehicle households tend to be associated with lower-income households. These same households often have difficulties consistently paying their telephone bills and will experience episodic telephone service (service is discontinued due to nonpayment, the household pays the outstanding bill and has service reconnected, only to have trouble paying the bill again a few months later). Thus, it is very difficult to achieve a representative random sample of lower-income/zero-vehicle households in a telephone survey.

To account for the coverage bias introduced by excluding non-telephone households from the telephone sample, the survey team developed an adjustment technique that used a series of questions to separate respondents into two groups: those with steady or continuous telephone service and those with episodic service. The characteristics of those households reporting episodic telephone ownership were used as proxies to represent other non-telephone-owning households within the region. Fifty of the surveyed households reported being without a telephone for two weeks or longer. These households were used to represent other non-telephone households in the region.

To determine the weighting factor required in adjusting for episodic telephone ownership, the data were compared to non-telephone ownership as reported by the Census Bureau. In 1990, four percent of households in the seven-county study area were identified as non-telephone households. Since the Census Bureau defines non-telephone households simply as those not having service on the census date (regardless of reason), this census proportion includes both households with episodic ownership and those that never had telephone service.

In reality, only about half of the non-telephone households documented as such in the decennial census are episodic.
This rate is determined based on a general pattern observed in anecdotal evidence collected through in-person interviews and postcard follow-up surveys conducted with non-telephone households on other studies. Based on the survey team's experience, the distribution of non-telephone households was adjusted so that the proportion of surveyed episodic households could be compared with the census estimates. This allowed for the calculation of the second weight adjustment factor, as shown in Table 8.4.

County Weight

In addition to ensuring that the survey data are weighted to represent households on the key characteristics of household size and vehicles, as well as minimizing the coverage bias introduced through using a telephone survey method, there also was the issue of geographic coverage to consider. The sample was drawn proportionately from within each county. However, differing response rates and data collection goals resulted in a disproportionate distribution of households at the conclusion of the study. Table 8.5 compares the distribution of the survey responses by county with the 1990 Census household counts.

Normalization of Weights

If only Factors 1, 2, and 3 were used to create the final weights for the dataset, the weighted data would represent 5,512 households rather than the 5,418 households actually contained in the dataset. To account for this and still maintain the relative contribution to the dataset of each household after weighting, all households were given a Factor 4 value of 0.9829. The final weight then was the product of each of the four factors for each household.

| Is Phone Service Episodic? | Survey Respondents | Survey Proportion | Census % | Census Adjusted for Episodicity | Factor 2 |
|----------------------------|--------------------|-------------------|----------|---------------------------------|----------|
| No    | 5,368 | 0.991 | 0.9600 | 0.980 | 0.989 |
| Yes   | 50    | 0.009 | 0.0400 | 0.020 | 2.222 |
| Total | 5,418 | 1.000 | 1.000  | 1.000 |       |

Table 8.4. Episodic telephone ownership factor (factor 2).
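The arithmetic behind Factor 2, Factor 4, and the final weight can be sketched in a few lines of Python. The inputs come from Tables 8.3, 8.4, and 8.5 and from the normalization paragraph above. The rounding of the survey proportions to three decimals is an inference from the published values in Table 8.4 rather than a documented step, and the variable names are illustrative.

```python
# Survey counts from Table 8.4 and the 1990 Census non-telephone share.
respondents = {"steady": 5368, "episodic": 50}
census_nonphone_share = 0.04
episodic_fraction = 0.5  # survey team's working assumption: about half are episodic

total = sum(respondents.values())

# Census target shares after adjusting for episodicity.
target = {"episodic": census_nonphone_share * episodic_fraction}
target["steady"] = 1.0 - target["episodic"]

# Survey proportions, rounded to three decimals as they appear in Table 8.4
# (assumed here so that the published factors are reproduced).
survey_share = {k: round(n / total, 3) for k, n in respondents.items()}

# Factor 2: census target share divided by survey share.
factor2 = {k: target[k] / survey_share[k] for k in respondents}

# Factor 4: scale the weighted total (5,512) back to the number of
# households actually in the dataset (5,418).
factor4 = 5418 / 5512

# Final weight for one example household: one person, one vehicle
# (Table 8.3), steady telephone service, Franklin County (Table 8.5).
factor1 = 0.9357
factor3 = 1.25118
final_weight = factor1 * factor2["steady"] * factor3 * factor4

print({k: round(v, 3) for k, v in factor2.items()})
print(round(factor4, 4))
print(round(final_weight, 3))
```

Because Factor 4 is the same for every household, it changes the weighted total without changing any household's weight relative to the others.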

8.3.2 Impact of Census 2000 on Original Weights

The Mid-Ohio Regional Household Travel Survey was conducted in 1999, with the development of the weighting factors (as described above) shortly thereafter. Since Census 2000 data were not available, the 1990 Census counts and estimates were used to develop the sampling goals and data weights. Thus, any analyses done with the initial household travel survey data would reflect the application of 1999 travel patterns to the 1990 population. Once the year 2000 Long Form census data became available in 2002, the survey data could be reweighted and improved by applying the same weighting process with the updated data.

Probability of Selection

Table 8.6 shows the new weighting factors for the adjustment related to the probability of selection. The proportion of zero-vehicle households from the 2000 decennial census estimates was lower than that of the 1990 decennial census, while Census 2000 estimates showed significantly more vehicles per household in most categories. This suggests that the original weights (created by applying the 1990 Census estimates, which were the only source available at the time of the 1999 survey) overstated the proportions of zero-vehicle households in the region. One advantage of the continuous design of ACS is that analysts will have access to updated population parameters on a more regular basis.

Episodic Telephone Ownership

Census 2000 estimated that two percent of regional households were non-telephone households, as compared to four percent in the 1990 census. As shown in Table 8.7, after accounting for episodic telephone service, there is no longer any need for a weighting factor.
| County    | Survey % | Census % | Weight  |
|-----------|----------|----------|---------|
| Franklin  | 63.7%    | 79.7%    | 1.25118 |
| Licking   | 32.2%    | 10.2%    | 0.31677 |
| Delaware  | 1.4%     | 5.6%     | 4.00000 |
| Union     | 0.7%     | 1.7%     | 2.42857 |
| Pickaway  | 0.6%     | 0.5%     | 0.83333 |
| Fairfield | 1.0%     | 1.9%     | 1.90000 |
| Madison   | 0.4%     | 0.4%     | 1.00000 |
| Total     | 100%     | 100%     |         |

Table 8.5. County weights based on 1990 census (factor 3).

| HH Vehicles | HH Size: 1 Person | 2 Persons | 3 Persons | 4+ Persons |
|-------------|-------------------|-----------|-----------|------------|
| 0 Vehicles  | 1.04813 | 1.54261 | 2.91486 | combined |
| 1 Vehicle   | 1.02104 | 0.96947 | 1.41404 | 1.75862 |
| 2 Vehicles  | 1.29074 | 0.97264 | 1.04311 | 0.90120 |
| 3+ Vehicles | 0.78530 | 0.77689 | 0.78967 | 0.91137 |

Table 8.6. Weighting factor to adjust for probability of selection with year 2000 census data (factor 1).

County Weight

Table 8.8 shows the distribution of the survey respondents by county of residence as compared to Census 2000 counts and the resulting weight factor to adjust for geographic representation. The 1990 Census proportions also are included in this table. When originally weighted using the 1990 Census data, the resulting weights adjusted the survey households to reduce the proportion of households from Licking County and, at the same time, increase the representation of Franklin, Delaware, and Union County households.

The new weighting factor, based on Census 2000, still adjusts for an overrepresentation of Licking County households and underrepresentation elsewhere. However, the population growth seems less in Franklin and Licking Counties and more in the surrounding counties. Fairfield County in particular grew from 1.9 percent of the population distribution in 1990 to 7.3 percent in 2000. Again, having more frequent updates in terms of population growth from ACS will greatly aid in these types of data adjustments.

Normalization of Weights

Again, because a weight created only on Factors 1, 2, and 3 would result in the 5,418 households representing 5,364 households when weighted, all households were given a Factor 4 value of 0.9829 to normalize the data.

Not surprisingly, at each phase of the recalculation, one can see evidence that the household survey data were much more in line with the Census 2000 estimates than the 1990 Census estimates. This comparison supports the supposition that using ACS estimates for a time period corresponding to the household travel surveys will benefit the survey process.

8.3.3 Weighting Using ACS Data

Many recent household travel surveys were planned to roughly correspond to decennial census years to allow for the application of relevant weights. As in the Mid-Ohio case, this has meant that planners have had to rely on preliminary weights based on older census data until the newer, more relevant data became available. ACS will provide planners with the ability to develop survey sample weights that better correspond to the survey data collection period.
With annual estimates available less than a year after the reference period, planners will be able to apply accurate sample weights much more quickly than previously.

| Is Phone Service Episodic? | Survey Respondents | Survey Proportion | Census % | Census Adjusted for Episodicity | Factor 2 |
|----------------------------|--------------------|-------------------|----------|---------------------------------|----------|
| No    | 5,368 | 0.991 | 0.9800 | 0.990 | 1 |
| Yes   | 50    | 0.009 | 0.0200 | 0.010 | 1 |
| Total | 5,418 | 1.000 | 1.000  | 1.000 |   |

Table 8.7. Episodic telephone ownership factor (factor 2).

| County    | Survey % | 1990 Census % | 2000 Census % | Old Weight | New Weight |
|-----------|----------|---------------|---------------|------------|------------|
| Franklin  | 63.7%    | 79.7%         | 70.2%         | 1.25118    | 1.10204    |
| Licking   | 32.2%    | 10.2%         | 8.9%          | 0.31677    | 0.27640    |
| Delaware  | 1.4%     | 5.6%          | 6.4%          | 4.00000    | 4.57143    |
| Union     | 0.7%     | 1.7%          | 2.3%          | 2.42857    | 3.28571    |
| Pickaway  | 0.6%     | 0.5%          | 2.8%          | 0.83333    | 4.66667    |
| Fairfield | 1.0%     | 1.9%          | 7.3%          | 1.90000    | 7.30000    |
| Madison   | 0.4%     | 0.4%          | 2.2%          | 1.00000    | 5.50000    |
| Total     | 100%     | 100%          | 100%          |            |            |

Table 8.8. County weights based on Census 2000 (factor 3).
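The county factors in Table 8.8 are each county's census share divided by its survey share. The short Python sketch below reproduces the "New Weight" column from the percentages printed in the table; the variable names are illustrative.

```python
# Percent of survey households and of Census 2000 households by county (Table 8.8).
survey_pct = {
    "Franklin": 63.7, "Licking": 32.2, "Delaware": 1.4, "Union": 0.7,
    "Pickaway": 0.6, "Fairfield": 1.0, "Madison": 0.4,
}
census_2000_pct = {
    "Franklin": 70.2, "Licking": 8.9, "Delaware": 6.4, "Union": 2.3,
    "Pickaway": 2.8, "Fairfield": 7.3, "Madison": 2.2,
}

# Factor 3: census share divided by survey share.
county_factor = {
    county: census_2000_pct[county] / survey_pct[county] for county in survey_pct
}

for county, factor in county_factor.items():
    print(f"{county:<10} {factor:.5f}")
```

Substituting the 1990 Census percentages for `census_2000_pct` gives the "Old Weight" column in the same way. Because the factors are computed from percentages rounded to one decimal, small counties such as Madison and Pickaway are sensitive to that rounding.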

As for other analyses, survey analysts need to understand the ACS data availability constraints. For larger census areas of more than 65,000 population, annual estimates will be available. If a survey study area is composed of a group of counties that all have large populations, then annual ACS estimates can be used in geographic-based weighting. If one or more of the geographic areas is smaller than 65,000 people, but all are larger than 20,000, then three-year estimates will be available for use in weighting. If the survey area includes geographic areas that are smaller than that, five-year estimates would be required.

To maintain consistency in the estimates, it will usually be best to use common types of estimates (e.g., all three-year averages or all annual estimates), but for many larger metropolitan areas, some outlying counties will require the five-year average. Analysts will need to weigh the benefit of fully consistent estimates (from using the five-year estimates throughout the study area) against obtaining more accuracy and timeliness for the core counties (by using one- or three-year estimates where possible and five-year averages where necessary).

One of the seven counties in the Columbus region, Franklin County, was included as part of the ACS pilot. We can review the weighting process using ACS estimates and focusing on Franklin County with the objective of understanding how Franklin County weights developed using Census 2000 might differ from those developed using ACS pilot data. Of the 5,418 households surveyed, 3,451 were from Franklin County.

As for the weighting based on Census 2000, there was no need to adjust for telephone ownership. In addition, since the focus of the ACS analysis was on only one county (Franklin County), the data do not need to be adjusted for geographic representation. Thus, the Franklin County weights focused on the probability of selection, that is, the distribution of households by size and vehicle ownership.
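The data availability rules described at the start of this section amount to a lookup on area population. The Python sketch below is one way to express them. The function names and the area populations are hypothetical, and treating an area of exactly 65,000 or 20,000 people as meeting the threshold is an assumption, since the text only says "more than" and "larger than".

```python
# Estimate types ordered from shortest to longest averaging period.
ESTIMATE_TYPES = ["1-year", "3-year", "5-year"]


def acs_estimate_type(population):
    """Shortest-period ACS estimate available for an area of this population."""
    if population >= 65_000:
        return "1-year"
    if population >= 20_000:
        return "3-year"
    return "5-year"


def common_estimate_type(populations):
    """Shortest-period estimate type that is available for every area."""
    needed = [acs_estimate_type(p) for p in populations.values()]
    return max(needed, key=ESTIMATE_TYPES.index)


# Hypothetical study area, for illustration only.
areas = {
    "Core County": 1_000_000,
    "Suburban County": 45_000,
    "Rural County": 15_000,
}

per_area = {name: acs_estimate_type(pop) for name, pop in areas.items()}
consistent = common_estimate_type(areas)

print(per_area)    # most timely estimate for each area
print(consistent)  # single estimate type usable across the whole study area
```

The two results correspond to the tradeoff described above: `per_area` favors timeliness for the core counties, while `consistent` favors a single estimate type for all strata.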
Tables 8.1 and 8.2 show the Franklin County weights using Census 2000 and the ACS as the control totals. The survey data were closer to Census 2000 in terms of the smaller households (or those with fewer vehicles). However, for the larger households, and those with more vehicles, the survey data were more in line with the ACS data.

These analyses have ignored two complications of using ACS estimates in analyses. First, the example did not need to consider ACS multiyear averaging because the specific geography studied would have single-year estimates. As discussed above, if survey analysts need to consider geographic areas for which single-year estimates are not available (household travel survey strata could include separate small counties, county subdivisions, or census places), then it will be best to use the multiyear estimates for developing estimates for all the survey strata, regardless of whether single-year estimates are available. Prior to developing a survey stratification scheme and weighting plan, it will make sense for survey analysts to consider which ACS reporting category the geographic areas within the survey region fall into, and then to define the geographic strata based on this information.

Second, the reported analyses did not include any mention of confidence intervals or statistical uncertainty in the ACS estimates. These analyses (and virtually any other analyses that have been previously performed using decennial census Long Form estimates) can be accomplished using the ACS estimates, without consideration of the uncertainty. The ACS estimates, although less precise than decennial census estimates, will still almost always represent the best available estimates of the population characteristics under study, so for analyses that require a single target estimate, such as household survey weighting, the analyst will need to rely on the reported estimate.
The analyst could calculate or obtain margins of error for the ACS estimates used as weighting targets, but because household survey response biases tend toward certain directions (underrepresentation of larger households and zero-vehicle households), many of the resulting weights would be set at the extreme ends of the confidence intervals. Although not all the midpoint estimates will be as close to the actual (but unknown) characteristic count or average as the ends of the 90 percent confidence intervals, on average, the midpoint estimates are a better estimate of the actual characteristic.

In cases where ACS estimates will be used without the formal calculation of margins of error, it will be important that the analyst validate, to the extent possible, the reasonableness of the ACS estimates that are being used. This can be accomplished by comparing the ACS estimates

• To independent data sources (also referred to as "administrative records" by the Census Bureau);
• For specific geographic areas with those for nearby areas, larger areas for which the areas of interest are a component, and smaller geographic areas that comprise the areas of interest; and
• For specific time periods with previous (and perhaps subsequent) time periods and multiyear estimates that include the time period of interest.

These validation efforts will help identify potential issues with the specific ACS estimates that would be used to inform the survey stratification and weighting processes. Based on these evaluations, analysts may choose to use different multiyear average estimates or to define geographic strata differently.
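For an analyst who does want to see how sensitive a weight is to sampling error in an ACS target, one rough check is to recompute the weight at both ends of the 90 percent confidence interval. The Python sketch below does this for a single cell. All input values are hypothetical, the function names are illustrative, and the sketch treats the control total and the sample share as fixed, which is a simplifying assumption. ACS margins of error are published at the 90 percent confidence level, so dividing by 1.645 gives an approximate standard error.

```python
Z_90 = 1.645  # z-value for the 90 percent confidence level


def standard_error(moe, z=Z_90):
    """Approximate standard error implied by a 90 percent margin of error."""
    return moe / z


def weight_bounds(estimate, moe, control_total, sample_share):
    """Weight at the lower bound, midpoint, and upper bound of the target estimate."""
    targets = (estimate - moe, estimate, estimate + moe)
    return tuple(t / control_total / sample_share for t in targets)


# Hypothetical inputs, for illustration only: an ACS estimate of 12,000
# households in one size-by-vehicle cell with a margin of error of 1,500,
# out of 400,000 households in the area, where the cell makes up 3.2 percent
# of completed surveys.
low, mid, high = weight_bounds(
    estimate=12_000, moe=1_500, control_total=400_000, sample_share=0.032
)
se = standard_error(1_500)

print(round(low, 4), round(mid, 4), round(high, 4))
print(round(se, 1))
```

A wide gap between the low and high weights for a cell is one signal that the cell could be merged with a neighbor or that a multiyear estimate with a smaller margin of error may be a better target.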
