

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.



Below are the first 10 and last 10 pages of uncorrected machine-read text (when available) of this chapter, followed by the top 30 algorithmically extracted key phrases from the chapter as a whole.
Intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text on the opening pages of each chapter. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Do not use for reproduction, copying, pasting, or reading; exclusively for search engines.

OCR for page 133
Survey Development and Analysis Using ACS Data

8.3 Case Study

The purpose of this section is to illustrate how ACS data may be used to weight data collected through a typical household travel survey process. The case study is based on an actual regional travel survey conducted in 1999 by a national survey firm for the Mid-Ohio Regional Planning Commission (MORPC). There are three sections to this case study, as follows:

1. The weighting process used to adjust the 1999 Mid-Ohio Regional Travel Survey data is summarized. A significant drawback of using decennial census data for analyses such as survey weighting is the infrequency of the data releases. When the 1999 survey was conducted, the agency initially had to rely on 1990 census data for the weighting.
2. The application of new weights based on Census 2000 data is summarized. Once the 2000 data became available, the survey data could be reweighted to better reflect the population at the time of the survey in 1999.
3. Finally, the weights for Franklin County, Ohio, are recalculated using both Census 2000 data and ACS data.[96]

Assume that you have been asked to develop new weights for the 1999 household travel survey using Census 2000 data and 2000 ACS data and to compare the two sets of weights.

This survey sampled 5,418 households to provide data for the continuing development and refinement of the region's travel demand forecasting model, as well as to provide a better understanding of travel behavior in the Central Ohio region. The resulting data were used to support the model's functions of estimating trip generation and distribution, mode choice, and assignments.

The 1999 Mid-Ohio Household Travel Survey, like many recent household travel surveys, relied on the willingness of area residents to record their daily travel for a specific 24-hour period.
Households were recruited into the study by telephone, were then mailed personalized materials to aid in recording travel details, and finally were recontacted by telephone for retrieval of the travel data. The survey was conducted from February through June 1999 in seven Central Ohio counties: Delaware, Fairfield, Franklin, Licking, Madison, Pickaway, and Union. During this time, 7,333 households agreed to participate in the study and 5,418 actually provided travel details for all household members.[97]

Although the household travel survey sample was considered a fairly good representation of households in the seven-county region, weights were developed and applied to adjust the data for unequal response rates among households of different types across the study area and to account for coverage bias resulting from the use of a telephone survey method.

Key demographic variables used to weight the household survey data to better reflect the full population of households in the survey area included household size, vehicle ownership, telephone ownership, and county population.

Table 8.1 shows weighting factors based on Census 2000 data. The weights are factors that analysts would apply to households with the characteristics shown in the table so that the household survey sample better reflects the population of interest. For example, the weights in Table 8.1 indicate that the household travel survey sample included slightly more one-person, zero-vehicle households than a fully representative sample of the study area would include. In frequencies and summaries of the household travel survey, these households would be weighted by a factor of 0.95. Conversely, three-person and four-or-more-person households with no vehicles were underrepresented in the household travel survey. Therefore, in summaries and frequency tabulations, these households would be multiplied by factors of more than 2.5. Table 8.2 shows the same types of weighting factors when ACS estimates are used to represent the target population.

[96] Of the seven counties in the MORPC study, Franklin County was the only county to be surveyed as part of the ACS pilot program.
[97] The study included an additional oversample of households in Licking County, for use in building a model specific to that county. This case study focuses only on the households comprising the main sample, collected for MORPC.

A Guidebook for Using American Community Survey Data for Transportation Planning

Table 8.1. Census 2000 weighting factor to adjust for probability of selection, Franklin County.

                              HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles      0.95000      1.55263      2.72727       2.57143
1 Vehicle       1.00000      0.94720      1.36842       1.57333
2 Vehicles      1.47143      1.01101      0.90146       0.79736
3 Vehicles      0.94118      0.93750      0.85714       1.11570
4+ Vehicles     0.85714      0.75000      0.70588       0.95652

Table 8.2. ACS weighting factor to adjust for probability of selection, Franklin County.

                              HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles      0.91111      1.36842      2.27273       1.78571
1 Vehicle       1.04805      0.89441      1.32632       1.52000
2 Vehicles      1.40000      1.07390      0.85036       0.75551
3 Vehicles      0.70588      1.06250      0.87143       1.14876
4+ Vehicles     0.57143      0.67857      0.76471       1.00000

The remainder of this section is organized as follows. First, background on the original weighting process is provided, including a brief description of the different weighting elements. Second, the steps and computations needed to reapply the weighting using Census 2000 data are described. Finally, the reweighting is performed using 2000 ACS data, and the results are compared to the Census 2000 weighting process.

8.3.1 Original Weighting Process[98]

The Mid-Ohio Survey employed a probability sample selection process to select households for inclusion in the study.
This means that the relative probability (or chance) that any particular household in the universe would be sampled is known. The sampling process used in the study was a "stratified sample" in which households were randomly selected at the county level. The number of households sampled within a particular county was based on the proportion of households within that county compared to the total number of households in the seven-county region (based on 1990 Census data). Once the proportion of households within each county was determined, households within each county were randomly sampled from a universe of all telephone exchanges in the study area.

[98] This section draws largely from the Mid-Ohio Household Travel Survey Final Report written by NuStats. The numbers were adjusted to reflect the final distribution of households only in the original seven-county sample, excluding the Licking County oversample.

Upon completion of data collection, the final distribution of households was not proportionate to that of the survey universe. This is common in survey research and can be attributed to two main causes: coverage bias and unequal response rates. Coverage bias refers to the fact that, because this was a telephone survey, the households randomly sampled for inclusion in the study were limited to those with telephones. Unequal response rates arise because not all types of households that were contacted participated in equal proportions. The survey requirements biased the sample toward smaller households and those in specific counties. The exact reasons for variations in response rates are not always clear, but respondent burden and interest in transportation issues do vary with factors such as household size and household location.

For these reasons, the final distribution of households in the survey dataset did not match that of the 1990 Census. Thus, a weighting factor was developed to adjust the data, thereby minimizing these potential sources of bias in the data and subsequent analysis. The weighting factors for the 1999 Mid-Ohio Household Travel Survey data were developed through a four-step process. Each step produced an adjustment factor, and the final weight is the product of those four factors. These steps adjust the data for the following:

- Probability of selection,
- Episodic telephone ownership,
- County weight, and
- Normalization of weights.

Probability of Selection

The first step in the weighting process was to account for differential probabilities of selection in the sample generation stage.
The natural, or proportionate, distribution of households by household size and number of vehicles was determined from 1990 decennial census estimates, and this distribution was compared to the actual distribution of households that completed the survey.

Table 8.3 shows the weighting factor that was calculated to bring the final distribution of surveyed households in line with the distribution of households reported in the 1990 census, thereby adjusting for probability of selection. In that table, a value of 1.0 means that the sampled elements accurately reflected the population at large; a value less than 1.0 means that the population subgroup was overrepresented in the survey data, and a value greater than 1.0 means that it was underrepresented. As shown in Table 8.3, the survey included proportionally fewer zero-vehicle households than the census estimates, but proportionally more large households and households with three or more vehicles.

Table 8.3. Weighting factor to adjust for probability of selection (factor 1).

                              HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles       1.2671       1.6872       3.9310      combined
1 Vehicle        0.9357       1.0250       1.4085        2.0036
2 Vehicles       0.9845       0.9263       1.1295        0.9415
3+ Vehicles      0.4383       0.6682       0.8472        0.9429
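As a rough illustration of the comparison described above, the sketch below computes each cell's factor as its share of census households divided by its share of surveyed households. The household counts and the function name `selection_factors` are hypothetical, because the report gives only the resulting factors, and the share-ratio formula is an assumption that is consistent with how values above and below 1.0 are interpreted in Table 8.3.

```python
def selection_factors(census_counts, survey_counts):
    """Return census share divided by survey share for each household cell."""
    census_total = sum(census_counts.values())
    survey_total = sum(survey_counts.values())
    factors = {}
    for cell, census_n in census_counts.items():
        census_share = census_n / census_total
        survey_share = survey_counts[cell] / survey_total
        factors[cell] = census_share / survey_share
    return factors


# Hypothetical household counts, NOT taken from the Mid-Ohio survey.
census_counts = {("1 person", "0 vehicles"): 200, ("1 person", "1 vehicle"): 800}
survey_counts = {("1 person", "0 vehicles"): 10, ("1 person", "1 vehicle"): 90}

factors = selection_factors(census_counts, survey_counts)
for cell, factor in factors.items():
    if factor > 1:
        status = "underrepresented in the survey"
    elif factor < 1:
        status = "overrepresented in the survey"
    else:
        status = "matches the census share"
    print(cell, round(factor, 4), status)
```

In this made-up case the zero-vehicle cell holds 20 percent of census households but only 10 percent of surveyed households, so its records would be weighted up.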

Episodic Telephone Ownership

Except in large urban areas with mature, multimodal transportation infrastructures, most zero-vehicle households tend to be lower-income households. These same households often have difficulty paying their telephone bills consistently and experience episodic telephone service (service is discontinued due to nonpayment, the household pays the outstanding bill and has service reconnected, only to have trouble paying the bill again a few months later). Thus, it is very difficult to achieve a representative random sample of lower-income/zero-vehicle households in a telephone survey.

To account for the coverage bias introduced by excluding non-telephone households from the telephone sample, the survey team developed an adjustment technique that used a series of questions to separate respondents into two groups: those with steady or continuous telephone service and those with episodic service. The characteristics of households reporting episodic telephone ownership were used as proxies for other non-telephone households within the region. Fifty of the surveyed households reported being without a telephone for two weeks or longer, and these households were used in this way.

To determine the weighting factor required to adjust for episodic telephone ownership, the data were compared to non-telephone ownership as reported by the Census Bureau. In 1990, four percent of households in the seven-county study area were identified as non-telephone households. Because the Census Bureau defines non-telephone households simply as those not having service on the census date (regardless of reason), this census proportion includes both households with episodic ownership and those that never had telephone service.
In reality, only about half of the non-telephone households documented as such in the decennial census are episodic. This rate is based on a general pattern observed in anecdotal evidence collected through in-person interviews and postcard follow-up surveys conducted with non-telephone households in other studies. Based on the survey team's experience, the distribution of non-telephone households was adjusted so that the proportion of surveyed episodic households could be compared with the census estimates. This allowed for the calculation of the second weight adjustment factor, as shown in Table 8.4.

County Weight

In addition to ensuring that the survey data are weighted to represent households on the key characteristics of household size and vehicles, and to minimizing the coverage bias introduced by the telephone survey method, there also was the issue of geographic coverage to consider. The sample was drawn proportionately from within each county. However, differing response rates and data collection goals resulted in a disproportionate distribution of households at the conclusion of the study. Table 8.5 compares the distribution of the survey responses by county with the 1990 Census household counts.

Normalization of Weights

If only Factors 1, 2, and 3 were used to create the final weights for the dataset, the weighted data would represent 5,512 households rather than the 5,418 households actually contained in the dataset. To account for this and still maintain the relative contribution of each household to the dataset after weighting, all households were given a Factor 4 value of 0.9829 (5,418 divided by 5,512). The final weight then was the product of the four factors for each household.

Table 8.4. Episodic telephone ownership factor (factor 2).

Is Phone Service    Survey         Survey                     Census Adjusted
Episodic?           Respondents    Proportion    Census %     for Episodicity    Factor 2
No                  5,368          0.991         0.9600       0.980              0.989
Yes                 50             0.009         0.0400       0.020              2.222
Total               5,418          1.000         1.000        1.000
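The arithmetic behind Table 8.4 and the normalization step can be sketched as follows. All input values come from the text and tables of this section; the variable names are illustrative, and rounding the survey proportions to three decimals before dividing is an assumption made here because it reproduces the published Factor 2 values.

```python
# Values from Table 8.4 and the surrounding text.
survey_respondents = {"steady": 5368, "episodic": 50}
census_non_phone_share = 0.04  # 1990 census, seven-county study area
episodic_fraction = 0.5        # survey team's judgment: about half are episodic

# Census target shares after adjusting for episodicity.
episodic_target = census_non_phone_share * episodic_fraction
targets = {"steady": 1.0 - episodic_target, "episodic": episodic_target}

total = sum(survey_respondents.values())
factor2 = {}
for group, count in survey_respondents.items():
    # Assumption: proportions are rounded to three decimals before dividing.
    survey_share = round(count / total, 3)
    factor2[group] = round(targets[group] / survey_share, 3)

# Factor 4 rescales the weighted total (5,512) back to the sample size (5,418).
factor4 = round(total / 5512, 4)

# Final weight for one illustrative household: one person, zero vehicles
# (Factor 1 from Table 8.3), steady phone service, Franklin County
# (Factor 3 from Table 8.5).
final_weight = 1.2671 * factor2["steady"] * 1.25118 * factor4

print(factor2, factor4, round(final_weight, 4))
```

Using the unrounded survey proportions instead would give a somewhat smaller factor for episodic households, so the rounding choice matters for the small group.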

Table 8.5. County weights based on 1990 census (factor 3).

County       Survey %    Census %    Weight
Franklin     63.7%       79.7%       1.25118
Licking      32.2%       10.2%       0.31677
Delaware     1.4%        5.6%        4.00000
Union        0.7%        1.7%        2.42857
Pickaway     0.6%        0.5%        0.83333
Fairfield    1.0%        1.9%        1.90000
Madison      0.4%        0.4%        1.00000
Total        100%        100%

8.3.2 Impact of Census 2000 on Original Weights

The Mid-Ohio Regional Household Travel Survey was conducted in 1999, and the weighting factors described above were developed shortly thereafter. Because Census 2000 data were not yet available, the 1990 Census counts and estimates were used to develop the sampling goals and data weights. Thus, any analyses done with the initial household travel survey data would reflect the application of 1999 travel patterns to the 1990 population. Once the Census 2000 Long Form data became available in 2002, the survey data could be reweighted and improved by applying the same weighting process with the updated data.

Probability of Selection

Table 8.6 shows the new weighting factors for the adjustment related to the probability of selection. The proportion of zero-vehicle households in the 2000 decennial census estimates was lower than that in the 1990 decennial census, while Census 2000 estimates showed significantly more vehicles per household in most categories. This suggests that the original weights (created by applying the 1990 Census estimates, which were the only source available at the time of the 1999 survey) overstated the proportion of zero-vehicle households in the region. One advantage of the continuous design of ACS is that analysts will have access to updated population parameters on a more regular basis.

Episodic Telephone Ownership

Census 2000 estimated that two percent of regional households were non-telephone households, as compared to four percent in the 1990 census.
As shown in Table 8.7, after accounting for episodic telephone service, a weighting factor is no longer needed.

Table 8.6. Weighting factor to adjust for probability of selection with year 2000 census data (factor 1).

                              HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles      1.04813      1.54261      2.91486      combined
1 Vehicle       1.02104      0.96947      1.41404       1.75862
2 Vehicles      1.29074      0.97264      1.04311       0.90120
3+ Vehicles     0.78530      0.77689      0.78967       0.91137

Table 8.7. Episodic telephone ownership factor (factor 2).

Is Phone Service    Survey         Survey                     Census Adjusted
Episodic?           Respondents    Proportion    Census %     for Episodicity    Factor 2
No                  5,368          0.991         0.9800       0.990              1
Yes                 50             0.009         0.0200       0.010              1
Total               5,418          1.000         1.000        1.000

County Weight

Table 8.8 shows the distribution of the survey respondents by county of residence as compared to Census 2000 counts and the resulting weight factor to adjust for geographic representation. The 1990 Census proportions also are included in this table. When originally weighted using the 1990 Census data, the resulting weights adjusted the survey households to reduce the proportion of households from Licking County and, at the same time, increase the representation of Franklin, Delaware, and Union County households.

Table 8.8. County weights based on Census 2000 (factor 3).

                         1990        2000
County       Survey %    Census %    Census %    Old Weight    New Weight
Franklin     63.7%       79.7%       70.2%       1.25118       1.10204
Licking      32.2%       10.2%       8.9%        0.31677       0.27640
Delaware     1.4%        5.6%        6.4%        4.00000       4.57143
Union        0.7%        1.7%        2.3%        2.42857       3.28571
Pickaway     0.6%        0.5%        2.8%        0.83333       4.66667
Fairfield    1.0%        1.9%        7.3%        1.90000       7.30000
Madison      0.4%        0.4%        2.2%        1.00000       5.50000
Total        100%        100%        100%

The new weighting factor, based on Census 2000, still adjusts for an overrepresentation of Licking County households and underrepresentation elsewhere. However, population growth appears to have been lower in Franklin and Licking Counties and higher in the surrounding counties. Fairfield County in particular grew from 1.9 percent of the population distribution in 1990 to 7.3 percent in 2000. Again, more frequent updates on population growth from ACS will greatly aid these types of data adjustments.

Normalization of Weights

Again, because a weight based only on Factors 1, 2, and 3 would result in the 5,418 households representing 5,364 households when weighted, all households were given a Factor 4 value of 0.9829 to normalize the data.
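Returning to the county weights, the old and new values in Table 8.8 are each county's census share divided by its survey share. A minimal sketch, using the percentages from the table (the function name `county_weights` is illustrative, not from the report):

```python
# Percent distributions from Table 8.8.
survey_pct = {"Franklin": 63.7, "Licking": 32.2, "Delaware": 1.4, "Union": 0.7,
              "Pickaway": 0.6, "Fairfield": 1.0, "Madison": 0.4}
census_1990_pct = {"Franklin": 79.7, "Licking": 10.2, "Delaware": 5.6, "Union": 1.7,
                   "Pickaway": 0.5, "Fairfield": 1.9, "Madison": 0.4}
census_2000_pct = {"Franklin": 70.2, "Licking": 8.9, "Delaware": 6.4, "Union": 2.3,
                   "Pickaway": 2.8, "Fairfield": 7.3, "Madison": 2.2}


def county_weights(census_pct, sample_pct):
    """Return census share divided by survey share, rounded to five decimals."""
    return {county: round(census_pct[county] / sample_pct[county], 5)
            for county in sample_pct}


old_weights = county_weights(census_1990_pct, survey_pct)
new_weights = county_weights(census_2000_pct, survey_pct)

for county in survey_pct:
    print(f"{county:<10}{old_weights[county]:>10.5f}{new_weights[county]:>10.5f}")
```

Because the survey shares of the small counties are below one percent, small changes in their census shares produce large changes in their weights.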
Not surprisingly, at each phase of the recalculation, one can see evidence that the household survey data were much more in line with the Census 2000 estimates than with the 1990 Census estimates. This comparison supports the supposition that using ACS estimates for a time period corresponding to the household travel survey will benefit the survey process.

8.3.3 Weighting Using ACS Data

Many recent household travel surveys were planned to roughly correspond to decennial census years to allow for the application of relevant weights. As in the Mid-Ohio case, this has meant that planners have had to rely on preliminary weights based on older census data until the newer, more relevant data became available. ACS will give planners the ability to develop survey sample weights that better correspond to the survey data collection period. With annual estimates available less than a year after the reference period, planners will be able to apply accurate sample weights much more quickly than before.

As with other analyses, survey analysts need to understand the ACS data availability constraints. For larger census areas of more than 65,000 population, annual estimates will be available. If a survey study area is composed of a group of counties that all have large populations, then annual ACS estimates can be used in geographic-based weighting. If one or more of the geographic areas is smaller than 65,000 people, but all are larger than 20,000, then three-year estimates will be available for use in weighting. If the survey area includes geographic areas that are smaller than that, five-year estimates would be required.

To maintain consistency in the estimates, it will usually be best to use common types of estimates (e.g., all three-year averages or all annual estimates), but for many larger metropolitan areas, some outlying counties will require the five-year average. Analysts will need to weigh the benefit of fully consistent estimates (from using the five-year estimates throughout the study area) against obtaining more accuracy and timeliness for the core counties (by using one- or three-year estimates where possible and five-year averages where necessary).

One of the seven counties in the Columbus region, Franklin County, was included as part of the ACS pilot. We can review the weighting process using ACS estimates, focusing on Franklin County, with the objective of understanding how Franklin County weights developed using Census 2000 might differ from those developed using ACS pilot data. Of the 5,418 households surveyed, 3,451 were from Franklin County.

As with the weighting based on Census 2000, there was no need to adjust for telephone ownership. In addition, because the ACS analysis covered only one county (Franklin County), the data do not need to be adjusted for geographic representation.
Thus, the Franklin County weights focused on the probability of selection, that is, the distribution of households by size and vehicle ownership. Tables 8.1 and 8.2 show the Franklin County weights using Census 2000 and the ACS as the control totals. The survey data were closer to Census 2000 for the smaller households (or those with fewer vehicles). However, for the larger households and those with more vehicles, the survey data were more in line with the ACS data.

These analyses have ignored two complications of using ACS estimates. First, the example did not need to consider ACS multiyear averaging because the specific geography studied would have single-year estimates. As discussed above, if survey analysts need to consider geographic areas for which single-year estimates are not available (household travel survey strata could include separate small counties, county subdivisions, or census places), then it will be best to use the multiyear estimates for all the survey strata, regardless of whether single-year estimates are available for some of them. Before developing a survey stratification scheme and weighting plan, survey analysts should consider which ACS reporting category each geographic area within the survey region falls into, and then define the geographic strata based on this information.

Second, the reported analyses did not include any mention of confidence intervals or statistical uncertainty in the ACS estimates. These analyses (and virtually any other analyses that have previously been performed using decennial census Long Form estimates) can be accomplished using the ACS estimates, without consideration of the uncertainty.
The ACS estimates, although less precise than decennial census estimates, will still almost always represent the best available estimates of the population characteristics under study, so for analyses that require a single target estimate, such as household survey weighting, the analyst will need to rely on the reported estimate.

The analyst could calculate or obtain margins of error for the ACS estimates used as weighting targets, but because household survey response biases tend toward certain directions (underrepresentation of larger households and zero-vehicle households), many of the resulting weights would be set at the extreme ends of the confidence intervals. Although some midpoint estimates will be farther from the actual (but unknown) characteristic count or average than the ends of their 90 percent confidence intervals, on average the midpoint estimates are better estimates of the actual characteristic.

In cases where ACS estimates will be used without the formal calculation of margins of error, it will be important that the analyst validate, to the extent possible, the reasonableness of the ACS estimates that are being used. This can be accomplished by comparing the ACS estimates

- To independent data sources (also referred to as "administrative records" by the Census Bureau);
- For specific geographic areas with those for nearby areas, larger areas of which the areas of interest are a component, and smaller geographic areas that make up the areas of interest; and
- For specific time periods with previous (and perhaps subsequent) time periods and multiyear estimates that include the time period of interest.

These validation efforts will help identify potential issues with the specific ACS estimates that would be used to inform the survey stratification and weighting processes. Based on these evaluations, analysts may choose to use different multiyear average estimates or to define geographic strata differently.
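When deciding which multiyear estimates to use or how to define strata, the population thresholds described in Section 8.3.3 can be applied mechanically. The sketch below is illustrative only: the county names and populations are hypothetical, the function names are invented for this example, and the treatment of areas at exactly 65,000 or 20,000 people is an assumption, because the text describes only areas above or below those values.

```python
# Rank of each estimate type; a higher rank means a longer averaging period.
ESTIMATE_RANK = {"1-year": 1, "3-year": 3, "5-year": 5}


def acs_estimate_type(population):
    """Return the shortest-period ACS estimate described for an area this size."""
    if population > 65_000:   # assumption: exactly 65,000 is treated as smaller
        return "1-year"
    if population > 20_000:   # assumption: exactly 20,000 is treated as smaller
        return "3-year"
    return "5-year"


def common_estimate_type(populations):
    """Return the one estimate type that is available for every area."""
    types = [acs_estimate_type(pop) for pop in populations.values()]
    return max(types, key=ESTIMATE_RANK.get)


# Hypothetical populations, for illustration only.
populations = {"Core County": 1_000_000, "Suburban County": 40_000, "Rural County": 15_000}

for name, pop in populations.items():
    print(name, acs_estimate_type(pop))
print("Consistent choice for all strata:", common_estimate_type(populations))
```

If the small county were merged into a larger stratum, the consistent choice could move to a shorter averaging period, which is the kind of trade-off between consistency and timeliness discussed above.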