Survey Development and Analysis Using ACS Data 133
8.3 Case Study
The purpose of this section is to illustrate how ACS data may be used to weight data collected
through a typical household travel survey process. The case study is based on an actual recent
regional travel survey conducted in 1999 by a national survey firm for the Mid-Ohio Regional
Planning Council (MORPC).
There are three sections to this case study, as follows:
1. The weighting process used to adjust the 1999 Mid-Ohio Regional Travel Survey data is
summarized. A significant drawback of using decennial census data for analyses like the
weighting of surveys is the infrequency of the data releases. When the 1999 survey was conducted,
the agency initially had to rely on census data from 1990 for the weighting.
2. The application of new weights based on Census 2000 data is summarized. Once the 2000
data became available, the survey data could be reweighted to better reflect the population at
the time of the survey in 1999.
3. Finally, the weights for Franklin County, Ohio, are recalculated using both the Census 2000
data and ACS data. [96]
Assume that you have been asked to develop new weights for the 1999 household travel survey
using Census 2000 data and 2000 ACS data and to compare the two sets of weights.
This survey sampled 5,418 households to provide data for the continuing development and
refinement of the region's travel demand forecasting model, as well as to provide a better
understanding of travel behavior in the Central Ohio region. The resulting data were used to
fulfill the model's functions of estimating trip generation and distribution, mode choice, and
assignments.
The 1999 Mid-Ohio Household Travel Survey, like many recent household travel surveys,
relied on the willingness of area residents to record their daily travel for a specific 24-hour period.
Households were recruited into the study by telephone, then were mailed personalized materials
to aid in recording travel details, and finally were recontacted by telephone for retrieval of the
travel data.
The survey was conducted from February through June 1999 in seven Central Ohio counties:
Delaware, Fairfield, Franklin, Licking, Madison, Pickaway, and Union. During this time, 7,333
households agreed to participate in the study and 5,418 actually provided travel details for all
household members. [97] Although the household travel survey sample was considered to be a fairly
good representation of households from the seven-county region, weights were developed and
applied to adjust the data for unequal response rates for households of different types across the
study area and to account for coverage bias resulting from the use of a telephone survey method.
Key demographic variables used to weight the household survey data to better reflect the full
population of households in the survey area included household size, vehicle ownership,
telephone ownership, and county populations.
Table 8.1 shows weighting factors based on Census 2000 data. The weights shown are factors
that analysts would apply to the households with the different characteristics in the table in order
to have the household survey sample better reflect the population of interest. For example, the
weights in Table 8.1 indicate that the household travel survey sample included slightly more
[96] Of the seven counties in the MORPC study, Franklin County was the only county to be surveyed
as part of the ACS pilot program.
[97] The study included an additional oversample of households in Licking County, for use in
building a model specific to that county. This case study focuses only on the households
comprising the main sample, collected for MORPC.
134 A Guidebook for Using American Community Survey Data for Transportation Planning
Table 8.1. Census 2000 weighting factor to adjust for probability of selection, Franklin County.
                           HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles     0.95000     1.55263      2.72727      2.57143
1 Vehicle      1.00000     0.94720      1.36842      1.57333
2 Vehicles     1.47143     1.01101      0.90146      0.79736
3 Vehicles     0.94118     0.93750      0.85714      1.11570
4+ Vehicles    0.85714     0.75000      0.70588      0.95652
Table 8.2. ACS weighting factor to adjust for probability of selection, Franklin County.
                           HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles     0.91111     1.36842      2.27273      1.78571
1 Vehicle      1.04805     0.89441      1.32632      1.52000
2 Vehicles     1.40000     1.07390      0.85036      0.75551
3 Vehicles     0.70588     1.06250      0.87143      1.14876
4+ Vehicles    0.57143     0.67857      0.76471      1.00000
one-person, zero-vehicle households than a fully representative sample of the study area would
include. In frequencies and summaries of the household travel survey, these households would
be weighted by a factor of 0.95. Conversely, three-person and four-or-more-person households
with no vehicles were underrepresented in the household travel survey. Therefore, in summaries
and frequency tabulations, these households would be multiplied by factors of more than 2.5.
Table 8.2 shows the same types of weighting factors when ACS estimates are used to represent
the target population.
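As a minimal sketch of how such factors enter a weighted tabulation, the Python fragment below applies the zero-vehicle row of Table 8.1 to a handful of made-up survey records. The record list and the function name `weighted_count` are illustrative assumptions and are not part of the MORPC data set.

```python
# Census 2000-based factors from Table 8.1 (Franklin County), keyed by
# (vehicles, household size). Only the zero-vehicle row is reproduced here.
TABLE_8_1_ZERO_VEHICLE = {
    (0, 1): 0.95000,
    (0, 2): 1.55263,
    (0, 3): 2.72727,
    (0, 4): 2.57143,  # 4 stands for the "4+ Persons" column
}


def weighted_count(households, factors):
    """Sum the weighting factors of the surveyed households in each cell."""
    totals = {}
    for vehicles, size in households:
        cell = (vehicles, min(size, 4))  # fold sizes above 4 into "4+"
        totals[cell] = totals.get(cell, 0.0) + factors[cell]
    return totals


# Hypothetical survey records: (vehicles, household size).
sample = [(0, 1), (0, 1), (0, 3), (0, 5)]
counts = weighted_count(sample, TABLE_8_1_ZERO_VEHICLE)
```

In this sketch, two one-person, zero-vehicle households count as slightly fewer than two households after weighting, while each larger zero-vehicle household counts as more than two and a half.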
The remainder of this section is organized as follows. First, a background of the original
weighting process is provided, including a brief description of the different weighting elements.
Second, the steps and computations needed to reapply the weighting using Census 2000 data are
described. Finally, the reweighting is performed using 2000 ACS data, and the results are
compared to the Census 2000 weighting process.
8.3.1 Original Weighting Process [98]
The Mid-Ohio Survey employed a probability sample selection process to select households for
inclusion in the study. This means that the relative probability (or chance) that any particular
household in the universe would be sampled is known. The actual sampling process employed in
the study was a "stratified sample" in which households were randomly selected at the county level.
The number of households sampled within a particular county was based on the proportion of
households within that county compared to the total number of households in the seven-county
[98] This section draws largely from the Mid-Ohio Household Travel Survey Final Report written by
NuStats. The numbers were adjusted to reflect the final distribution of households only in the
original seven-county sample and excluding the Licking County over-sample.
region (based on 1990 Census data). Once the proportion of households within each county was
determined, households within each county were randomly sampled from a universe of all
telephone exchanges in the study area.
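The proportional allocation described above can be sketched as follows. This is an illustration under stated assumptions, not the survey firm's procedure: the county shares are the 1990 Census percentages reported in Table 8.5, while the target of 1,000 households and the function name `allocate_sample` are hypothetical.

```python
def allocate_sample(shares, total):
    """Split `total` completed interviews across strata in proportion to `shares`."""
    share_sum = sum(shares.values())
    raw = {name: total * share / share_sum for name, share in shares.items()}
    allocation = {name: int(value) for name, value in raw.items()}
    # Hand out interviews lost to truncation, largest fractional remainder first.
    leftover = total - sum(allocation.values())
    by_remainder = sorted(raw, key=lambda name: raw[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# 1990 Census household shares (percent), as reported in Table 8.5.
census_1990_pct = {
    "Franklin": 79.7, "Licking": 10.2, "Delaware": 5.6, "Union": 1.7,
    "Pickaway": 0.5, "Fairfield": 1.9, "Madison": 0.4,
}

# Hypothetical target of 1,000 completed households (not the MORPC target).
allocation = allocate_sample(census_1990_pct, 1000)
```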
Upon completion of data collection efforts, the final distribution of households was not
proportionate to that of the survey universe. This is common in survey research and can be
attributed to two main causes: coverage bias and unequal response rates. Coverage bias arises
because, in a telephone survey, the households randomly sampled for inclusion in the study are
limited to those with telephones.
Unequal response rates arise because not all types of households that were contacted
participated in equal proportions. The survey requirements biased the sample toward smaller
households and those in specific counties. The exact reasons for variations in response rates
are not always clear, but respondent burden and interest in transportation issues do vary with
factors such as household size and household location.
For these reasons, the final distribution of households in the survey dataset did not
match that of the 1990 Census. Thus, a weighting factor was developed to adjust the data, thereby
minimizing these potential sources of bias in the data and subsequent analysis. The weighting
factors for the 1999 Mid-Ohio Household Travel Survey data were developed through a four-step
process. Each step produced an adjustment factor, and the final weight is the
product of those four factors. These steps adjust the data for the following:
· Probability of selection,
· Episodic telephone ownership,
· County weight, and
· Normalization of weights.
Probability of Selection. The first step in the weighting process was to account for differential
probabilities of selection in the sample generation stage. The natural or proportionate
distribution of households by household size and number of vehicles based on 1990 decennial
census estimates was determined, and this distribution was compared to the actual distribution
of households that completed the survey.
The weighting factor that was calculated to bring the final distribution of surveyed households
in line with the actual distribution of households as expressed in the 1990 census,
thereby adjusting for probability of selection, is shown in Table 8.3. In that table, a value of
1.0 means that the sampled elements accurately reflected the population at large; a value
less than 1.0 means that the population subgroup was overrepresented in the survey data, and
a value greater than 1.0 means that it was underrepresented. As
shown in Table 8.3, the survey included fewer zero-vehicle households as compared to the
census estimates, but proportionally more large households and households with three or
more vehicles.
Table 8.3. Weighting factor to adjust for probability of selection (factor 1).
                           HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles     1.2671      1.6872       3.9310       combined
1 Vehicle      0.9357      1.0250       1.4085       2.0036
2 Vehicles     0.9845      0.9263       1.1295       0.9415
3+ Vehicles    0.4383      0.6682       0.8472       0.9429
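The comparison behind a factor of this kind can be sketched in a few lines of Python: each cell's factor is its share of the population divided by its share of the completed sample. The cell counts below are invented for illustration (the underlying MORPC and census counts are not reproduced in this section), and `selection_factors` is a hypothetical helper name.

```python
def selection_factors(sample_counts, population_counts):
    """Return population share divided by sample share for every cell."""
    n_sample = sum(sample_counts.values())
    n_population = sum(population_counts.values())
    factors = {}
    for cell, count in sample_counts.items():
        sample_share = count / n_sample
        population_share = population_counts[cell] / n_population
        factors[cell] = population_share / sample_share
    return factors


# Hypothetical counts for four (vehicles, size) cells.
sample_counts = {
    ("0 vehicles", "1 person"): 50,
    ("0 vehicles", "2 persons"): 150,
    ("1 vehicle", "1 person"): 300,
    ("1 vehicle", "2 persons"): 500,
}
population_counts = {
    ("0 vehicles", "1 person"): 8_000,
    ("0 vehicles", "2 persons"): 15_000,
    ("1 vehicle", "1 person"): 28_000,
    ("1 vehicle", "2 persons"): 49_000,
}

factors = selection_factors(sample_counts, population_counts)
# A factor above 1.0 marks an underrepresented cell; below 1.0, an
# overrepresented one. The weighted sample keeps its original size.
weighted_total = sum(sample_counts[c] * factors[c] for c in sample_counts)
```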
Episodic Telephone Ownership. Except in large urban areas with mature, multimodal
transportation infrastructures, most zero-vehicle households tend to be associated with
lower-income households. These same households often have difficulties consistently paying their
telephone bills and will experience episodic telephone service (service is discontinued due to
nonpayment, the household pays the outstanding bill and has service reconnected, only to have
troubles paying the bill again a few months later). Thus, it is very difficult to achieve a
representative random sample of lower-income/zero-vehicle households in a telephone survey.
To account for the coverage bias introduced by excluding non-telephone households from the
telephone sample, the survey team developed an adjustment technique that used a series of
questions to separate respondents into two groups: those with steady or continuous telephone
service and those with episodic service. The characteristics of those households reporting episodic
telephone ownership were used as proxies to represent other non-telephone-owning households
within the region. Fifty of the surveyed households reported being without a telephone for two
weeks or longer. These households were used to represent other non-telephone households in
the region.
To determine the weighting factor required in adjusting for episodic telephone ownership, the
data were compared to non-telephone ownership as reported by the Census Bureau. In 1990,
four percent of households in the seven-county study area were identified as non-telephone
households. Since the Census Bureau defines non-telephone households simply as those not
having service on the census date (regardless of reason), this census proportion includes both
households with episodic ownership as well as those who never had telephone service.
In reality, only about half of the non-telephone households documented as such in the
decennial census are episodic. This rate was determined from a general pattern observed in anecdotal
evidence collected through in-person interviews and postcard follow-up surveys conducted with
non-telephone households on other studies. Based on the survey team's experience, the
distribution of non-telephone households was adjusted (halving the census non-telephone
proportion from 0.04 to 0.02) so that the proportion of surveyed episodic
households could be compared with the census estimates. This allowed for the calculation of the
second weight adjustment factor, as shown in Table 8.4.
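The arithmetic behind this second factor can be sketched as below, using the rounded proportions reported in Table 8.4. The variable names are illustrative assumptions; the one-half episodic share is the survey team's judgment as described above, not a census figure.

```python
# Rounded survey proportions from Table 8.4 (5,368 and 50 of 5,418 households).
survey_share = {"steady": 0.991, "episodic": 0.009}

census_non_telephone = 0.04  # 1990 Census, seven-county study area
episodic_fraction = 0.5      # survey team's judgment: about half are episodic

# Census target after adjusting for episodicity.
target_episodic = census_non_telephone * episodic_fraction
target_share = {"steady": 1 - target_episodic, "episodic": target_episodic}

# Factor 2 is the adjusted census share divided by the survey share.
factor2 = {
    group: round(target_share[group] / survey_share[group], 3)
    for group in survey_share
}
# factor2 -> {"steady": 0.989, "episodic": 2.222}
```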
County Weight. In addition to ensuring that the survey data are weighted to represent
households on the key characteristics of household size and vehicles, as well as minimizing the
coverage bias introduced through using a telephone survey method, there also was the issue of
geographic coverage to consider. The sample was drawn proportionately from within each
county. However, differing response rates and data collection goals resulted in a
disproportionate distribution of households at the conclusion of the study. Table 8.5 compares the
distribution of the survey responses by county with the 1990 Census household counts.
Normalization of Weights. If only Factors 1, 2, and 3 were used to create the final weights for
the dataset, the weighted data would represent 5,512 households rather than the 5,418 households
actually contained in the dataset. To account for this and still maintain the relative contribution
to the dataset of each household after weighting, all households were given a Factor 4 value of
0.9829. The final weight then was the product of each of the four factors for each household.
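A short sketch of the normalization step and the final product follows. The household used in the example is hypothetical; its three factors are simply read from Tables 8.3, 8.4, and 8.5 for a one-vehicle, two-person Franklin County household with steady telephone service.

```python
def normalization_factor(n_households, weighted_total):
    """Scale weights so that they sum to the actual number of households."""
    return n_households / weighted_total


def final_weight(factor1, factor2, factor3, factor4):
    """The final household weight is the product of the four factors."""
    return factor1 * factor2 * factor3 * factor4


# 5,418 surveyed households would represent 5,512 after Factors 1 to 3.
factor4 = normalization_factor(5418, 5512)  # about 0.9829

# Hypothetical household: 1 vehicle and 2 persons (Table 8.3), steady phone
# service (Table 8.4), Franklin County (Table 8.5).
weight = final_weight(1.0250, 0.989, 1.25118, factor4)
```

Because Factor 4 is the same for every household, it changes the weighted total but not the relative contribution of any household.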
Table 8.4. Episodic telephone ownership factor (factor 2).
Is Phone Service    Survey         Survey                     Census Adjusted
Episodic?           Respondents    Proportion    Census %     for Episodicity    Factor 2
No                  5,368          0.991         0.9600       0.980              0.989
Yes                 50             0.009         0.0400       0.020              2.222
Total               5,418          1.000         1.000        1.000
Table 8.5. County weights based on 1990 census (factor 3).
County       Survey %    Census %    Weight
Franklin     63.7%       79.7%       1.25118
Licking      32.2%       10.2%       0.31677
Delaware     1.4%        5.6%        4.00000
Union        0.7%        1.7%        2.42857
Pickaway     0.6%        0.5%        0.83333
Fairfield    1.0%        1.9%        1.90000
Madison      0.4%        0.4%        1.00000
Total        100%        100%
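The county weights in Table 8.5 can be reproduced from its two percentage columns, since each weight is the census share divided by the survey share. The sketch below assumes the rounded percentages shown in the table; the dictionary names are illustrative.

```python
# Percent of households by county, from Table 8.5.
survey_pct = {
    "Franklin": 63.7, "Licking": 32.2, "Delaware": 1.4, "Union": 0.7,
    "Pickaway": 0.6, "Fairfield": 1.0, "Madison": 0.4,
}
census_1990_pct = {
    "Franklin": 79.7, "Licking": 10.2, "Delaware": 5.6, "Union": 1.7,
    "Pickaway": 0.5, "Fairfield": 1.9, "Madison": 0.4,
}

# Factor 3: census share divided by survey share, rounded as in the table.
county_weight = {
    county: round(census_1990_pct[county] / survey_pct[county], 5)
    for county in survey_pct
}
```

The same division applied to the Census 2000 column of Table 8.8 yields the new weights reported there.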
8.3.2 Impact of Census 2000 on Original Weights
The Mid-Ohio Regional Household Travel Survey was conducted in 1999, with the development
of the weighting factors (as described above) shortly thereafter. Since Census 2000 data
were not available, the 1990 Census counts and estimates were used to develop the sampling
goals and data weights. Thus, any analyses done with the initial household travel survey data
would reflect the application of 1999 travel patterns to the 1990 population. Once the year 2000
Long Form census data became available in 2002, the survey data could be reweighted and
improved by applying the same weighting process with the updated data.
Probability of Selection. Table 8.6 shows the new weighting factors for the adjustment
related to the probability of selection.
The proportion of zero-vehicle households from the 2000 decennial census estimates was
lower than that of the 1990 decennial census, while Census 2000 estimates showed significantly
more vehicles per household in most categories. This suggests that the original weights (created
by applying the 1990 Census estimates, which were the only source available at the time of the
1999 survey) overstated the proportions of zero-vehicle households in the region. One
advantage of the continuous design of ACS is that analysts will have access to updated population
parameters on a more regular basis.
Episodic Telephone Ownership. Census 2000 estimated that two percent of regional
households were non-telephone households, as compared to four percent in the 1990 census. As shown
in Table 8.7, after accounting for episodic telephone service, the adjusted census proportions
(0.990 and 0.010) nearly match the survey proportions (0.991 and 0.009), so a weighting
factor is no longer needed.
County Weight. Table 8.8 shows the distribution of the survey respondents by county of
residence as compared to Census 2000 counts and the resulting weight factor to adjust for
Table 8.6. Weighting factor to adjust for probability of selection with year 2000 census data (factor 1).
                           HH Size
HH Vehicles    1 Person    2 Persons    3 Persons    4+ Persons
0 Vehicles     1.04813     1.54261      2.91486      combined
1 Vehicle      1.02104     0.96947      1.41404      1.75862
2 Vehicles     1.29074     0.97264      1.04311      0.90120
3+ Vehicles    0.78530     0.77689      0.78967      0.91137
Table 8.7. Episodic telephone ownership factor (factor 2).
Is Phone Service    Survey         Survey                     Census Adjusted
Episodic?           Respondents    Proportion    Census %     for Episodicity    Factor 2
No                  5,368          0.991         0.9800       0.990              1
Yes                 50             0.009         0.0200       0.010              1
Total               5,418          1.000         1.000        1.000
Table 8.8. County weights based on Census 2000 (factor 3).
                         1990        2000
County       Survey %    Census %    Census %    Old Weight    New Weight
Franklin     63.7%       79.7%       70.2%       1.25118       1.10204
Licking      32.2%       10.2%       8.9%        0.31677       0.27640
Delaware     1.4%        5.6%        6.4%        4.00000       4.57143
Union        0.7%        1.7%        2.3%        2.42857       3.28571
Pickaway     0.6%        0.5%        2.8%        0.83333       4.66667
Fairfield    1.0%        1.9%        7.3%        1.90000       7.30000
Madison      0.4%        0.4%        2.2%        1.00000       5.50000
Total        100%        100%        100%
geographic representation. The 1990 Census proportions also are included in this table. When
originally weighted using the 1990 Census data, the resulting weights adjusted the survey
households to reduce the proportion of households from Licking County and, at the same time,
increase the representation of Franklin, Delaware, and Union County households.
The new weighting factor, based on Census 2000, still adjusts for an over-representation of
Licking County households and under-representation elsewhere. However, Franklin and Licking
Counties account for smaller shares of the region's households in 2000 than in 1990, while the
surrounding counties account for larger shares, which suggests that growth was concentrated in
the surrounding counties. Fairfield County in particular grew from 1.9 percent of the regional
distribution in 1990 to 7.3 percent in 2000. Again, having more frequent updates in terms of
population growth from ACS will greatly aid in these types of data adjustments.
Normalization of Weights. Again, because a weight created only from Factors 1, 2, and 3
would result in the 5,418 households representing 5,364 households when weighted, all
households were given a Factor 4 value of about 1.0101 (5,418 divided by 5,364) to normalize the data.
Not surprisingly, at each phase of the recalculation, one can see evidence that the household
survey data were much more in line with the Census 2000 estimates than the 1990 Census
estimates. This comparison supports the supposition that using ACS estimates for a time period
corresponding to the household travel surveys will benefit the survey process.
8.3.3 Weighting Using ACS Data
Many recent household travel surveys were planned to roughly correspond to decennial census
years to allow for the application of relevant weights. As in the Mid-Ohio case, this has meant
that planners have had to rely on preliminary weights based on older census data until the newer,
more relevant, data became available. ACS will provide planners with the ability to develop
survey sample weights that better correspond to the survey data collection period. With annual
estimates available less than a year after the reference period, planners will be able to apply
accurate sample weights much more quickly than previously.
As with other analyses, survey analysts need to understand the ACS data availability constraints.
For larger census areas of more than 65,000 population, annual estimates will be available. If a
survey study area is composed of a group of counties that all have large populations, then annual
ACS estimates can be used in geographic-based weighting. If one or more of the geographic areas
is smaller than 65,000 people, but all are larger than 20,000, then three-year estimates will be
available for use in weighting. If the survey area includes geographic areas that are smaller than
that, five-year estimates would be required.
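These availability rules can be sketched as a small lookup. The thresholds are the ones stated above; treating an area at exactly 65,000 or 20,000 as qualifying for the shorter period is an assumption, and the function names and example populations are hypothetical.

```python
PERIOD_ORDER = ["1-year", "3-year", "5-year"]


def acs_estimate_type(population):
    """Shortest ACS averaging period available for one geographic area."""
    if population >= 65_000:
        return "1-year"
    if population >= 20_000:
        return "3-year"
    return "5-year"


def common_estimate_type(populations):
    """Shortest averaging period that is available for every area listed."""
    needed = [acs_estimate_type(p) for p in populations]
    return max(needed, key=PERIOD_ORDER.index)


# Hypothetical study area: one large core county and two smaller counties.
study_area = {"Core": 1_000_000, "Suburban": 45_000, "Rural": 14_000}
per_county = {name: acs_estimate_type(pop) for name, pop in study_area.items()}
region_wide = common_estimate_type(study_area.values())
```

Comparing `per_county` with `region_wide` makes the trade-off concrete: the smallest area determines the averaging period if a single consistent estimate type is wanted across the study area.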
To maintain consistency in the estimates, it will usually be best to use common types of
estimates (e.g., all three-year averages or all annual estimates), but for many larger metropolitan
areas, some outlying counties will require the five-year average. Analysts will need to weigh the
benefit of fully consistent estimates (from using the five-year estimates throughout the study
area) against obtaining more accuracy and timeliness for the core counties (by using one- or
three-year estimates where possible and five-year averages where necessary).
One of the seven counties in the Columbus region, Franklin County, was included as part of
the ACS pilot. We can review the weighting process using ACS estimates and focusing on
Franklin County with the objective of understanding how Franklin County weights developed
using Census 2000 might differ from those developed using ACS pilot data. Of the 5,418
households surveyed, 3,451 were from Franklin County.
As with the weighting based on Census 2000, there was no need to adjust for telephone
ownership. In addition, since the focus of the ACS analysis was on only one county (Franklin County),
the data do not need to be adjusted for geographic representation. Thus, the Franklin County
weights focused on the probability of selection, that is, the distribution of households by size and
vehicle ownership. Tables 8.1 and 8.2 show the Franklin County weights using Census 2000 and the
ACS as the control totals. The survey data were closer to Census 2000 in terms of the smaller
households (or those with fewer vehicles). However, for the larger households, and those with
more vehicles, the survey data were more in line with the ACS data.
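One way to make this comparison explicit is to check, cell by cell, which control source produces a weight closer to 1.0, since a weight of 1.0 means the survey share already matches the control share. The sketch below uses the values from Tables 8.1 and 8.2; the closeness rule and the function name `closer_source` are illustrative choices, not the Guidebook's method.

```python
SIZES = ["1", "2", "3", "4+"]  # household size columns

CENSUS_2000 = {  # Table 8.1, rows keyed by number of vehicles
    "0": [0.95000, 1.55263, 2.72727, 2.57143],
    "1": [1.00000, 0.94720, 1.36842, 1.57333],
    "2": [1.47143, 1.01101, 0.90146, 0.79736],
    "3": [0.94118, 0.93750, 0.85714, 1.11570],
    "4+": [0.85714, 0.75000, 0.70588, 0.95652],
}
ACS = {  # Table 8.2
    "0": [0.91111, 1.36842, 2.27273, 1.78571],
    "1": [1.04805, 0.89441, 1.32632, 1.52000],
    "2": [1.40000, 1.07390, 0.85036, 0.75551],
    "3": [0.70588, 1.06250, 0.87143, 1.14876],
    "4+": [0.57143, 0.67857, 0.76471, 1.00000],
}


def closer_source(census, acs):
    """For each cell, name the source whose weight is nearer to 1.0."""
    result = {}
    for vehicles in census:
        for size, c, a in zip(SIZES, census[vehicles], acs[vehicles]):
            gap_census, gap_acs = abs(c - 1.0), abs(a - 1.0)
            if gap_census < gap_acs:
                result[(vehicles, size)] = "Census 2000"
            elif gap_acs < gap_census:
                result[(vehicles, size)] = "ACS"
            else:
                result[(vehicles, size)] = "tie"
    return result


closest = closer_source(CENSUS_2000, ACS)
```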
These analyses have ignored two complications of using ACS estimates in analyses. First, the
example did not need to consider ACS multiyear averaging because the specific geography
studied would have single-year estimates. As discussed above, if survey analysts need to consider
geographic areas for which single-year estimates are not available (household travel survey strata
could include separate small counties, county subdivisions, or census places), then it will be best
to use the multiyear estimates for developing estimates for all the survey strata, regardless of
whether single-year estimates are available. Prior to developing a survey stratification scheme
and weighting plan, it will make sense for survey analysts to consider which ACS reporting
category the geographic areas within the survey region fall into, and then to define the geographic
strata based on this information.
Second, the reported analyses did not include any mention of confidence intervals or
statistical uncertainty in the ACS estimates. These analyses (and virtually any other analyses that have
been previously performed using decennial census Long Form estimates) can be accomplished
using the ACS estimates, without consideration of the uncertainty. The ACS estimates, although
less precise than decennial census estimates, will still almost always represent the best available
estimates of the population characteristics under study, so for analyses that require a single
target estimate, such as household survey weighting, the analyst will need to rely on the reported
estimate.
The analyst could calculate or obtain margins of error for the ACS estimates used as weighting
targets, but because household survey response biases tend toward certain directions
(underrepresentation of larger households and zero-vehicle households), many of the resulting weights
would be set at the extreme ends of the confidence intervals. Although not all the midpoint
estimates will be as close to the actual (but unknown) characteristic count or average as the ends
of the 90 percent confidence intervals, on average, the midpoint estimates are a better estimate
of the actual characteristic.
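For analysts who do want to see how a published margin of error would translate into a range of weights, a rough sketch follows. It assumes the margin of error is stated at the 90 percent confidence level (so the standard error is the margin divided by 1.645) and, as a simplification, treats the population total as fixed. Apart from the 3,451 Franklin County households, every number is hypothetical.

```python
Z_90 = 1.645  # multiplier for a 90 percent confidence level


def confidence_interval(estimate, margin_of_error):
    """Lower and upper bounds of the interval around an estimate."""
    return estimate - margin_of_error, estimate + margin_of_error


def standard_error(margin_of_error):
    """Standard error implied by a 90 percent margin of error."""
    return margin_of_error / Z_90


def weight_range(estimate, margin_of_error, population_total,
                 sample_count, sample_total):
    """Weights implied by the low, midpoint, and high population counts."""
    low, high = confidence_interval(estimate, margin_of_error)
    sample_share = sample_count / sample_total
    return tuple(
        (value / population_total) / sample_share
        for value in (low, estimate, high)
    )


# Hypothetical cell: 12,000 households (+/- 1,500) out of 400,000, matched by
# 80 of the 3,451 surveyed Franklin County households.
low_w, mid_w, high_w = weight_range(12_000, 1_500, 400_000, 80, 3_451)
```

The midpoint weight is the one used for weighting in this section; the low and high values only indicate how sensitive that weight is to sampling error in the control total.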
In cases where ACS estimates will be used without the formal calculation of margins of error,
it will be important that the analyst validate, to the extent possible, the reasonableness of the ACS
estimates that are being used. This can be accomplished by comparing the ACS estimates
· To independent data sources (also referred to as "administrative records" by the Census
Bureau);
· For specific geographic areas with those for nearby areas, larger areas for which the areas of
interest are a component, and smaller geographic areas that comprise the areas of interest; and
· For specific time periods with previous (and perhaps subsequent) time periods and multiyear
estimates that include the time period of interest.
These validation efforts will help identify potential issues with the specific ACS estimates that
would be used to inform the survey stratification and weighting processes. Based on these
evaluations, analysts may choose to use different multiyear average estimates or to define geographic
strata differently.
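A simple reasonableness screen of the first kind (comparison with an independent source) might look like the sketch below. The 10 percent tolerance, the area names, and the household counts are all hypothetical; an analyst would choose a tolerance suited to the size of the areas and the quality of the reference data.

```python
def relative_difference(acs_value, reference_value):
    """Signed difference of the ACS estimate relative to the reference."""
    return (acs_value - reference_value) / reference_value


def flag_for_review(acs, reference, tolerance=0.10):
    """Return areas whose ACS estimate departs from the reference by more
    than `tolerance`, with the relative difference for each."""
    flagged = {}
    for area, value in acs.items():
        diff = relative_difference(value, reference[area])
        if abs(diff) > tolerance:
            flagged[area] = diff
    return flagged


# Hypothetical household counts from ACS and from an independent source.
acs_households = {"County A": 52_000, "County B": 18_500}
reference_households = {"County A": 50_000, "County B": 15_000}

flagged = flag_for_review(acs_households, reference_households)
```

The same screen can be pointed at neighboring areas or at earlier time periods by swapping in a different reference dictionary.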