48 A Guidebook for Using American Community Survey Data for Transportation Planning
greater than 61 percent, then these modes would be collapsed to fewer categories according to
predefined collapsed table definitions. If the median of covariances for the collapsed table still
exceeds 61 percent, the table will be suppressed for County X.
4.4 Understanding, Working with, and Reporting Sample Data
4.4.1 ACS Sample Size
The ACS questionnaire is sent to 250,000 housing units every month, or equivalently to 3 million
housing units annually, drawn from all counties in the U.S. To allow data users to better analyze
smaller areas, the Census Bureau applies differential sampling rates based on the area type. The 2005
sampling rates are shown in Table 4.9.
In contrast, the decennial census Long Form was sent to about one of every six addresses.
Since both the Long Form and ACS data represent samples of the overall population, they
include some imprecision, or margin of error, in their estimates.
4.4.2 Sampling Error
Sampling error is the term given to the error associated with deriving an estimate from a sample
rather than an entire population. ACS data are estimates of actual numbers or percentages
in the population, but because the data are not collected from the whole population, random
sampling error will be present. The larger the sample size, the smaller the sampling error will
be, but the specific amount of error in an estimate could be known only if information
from the true population were available.
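As a rough illustration of how sampling error shrinks as the sample grows, the following Python
sketch uses the textbook standard error of a proportion under simple random sampling. This is an
assumption made for illustration only: the ACS uses differential sampling rates and weighting, so
its published standard errors are not computed this way. The function name srs_standard_error and
the 10 percent share are hypothetical.

```python
import math


def srs_standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling.

    Assumption: simple random sampling with no finite population correction.
    This is NOT the ACS variance estimation procedure.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical true share of 10 percent, evaluated at growing sample sizes.
for n in (100, 400, 1600):
    print(n, round(srs_standard_error(0.10, n), 4))
```

Under this simplified model, quadrupling the sample size halves the standard error.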
Sampling error is most commonly estimated through the calculation of the standard error
associated with the estimate. Standard error is a measure of the deviation of a sample estimate
from the average of all possible similar samples. It is an indication of the precision with which a
sample estimate approximates the population value. Formulas for calculating standard errors
associated with sample estimates are straightforward, but since the Census Bureau will calculate
and report the standard errors, the reader is referred to any standard statistics textbook for more
details on these calculations.

Table 4.9. ACS sampling rates, 2005.

Area Type Sampling Rate Category                                        2005 Final Sampling Rate
Blocks in smallest sampling entities
  (estimated occupied housing units in block < 200)                               10.0%
Blocks in smaller sampling entities
  (estimated occupied housing units in block ≥ 200 and < 800)                      6.9%
Blocks in small sampling entities
  (estimated occupied housing units in block ≥ 800 and ≤ 1200)                     3.6%
Blocks in large tracts
  (estimated occupied housing units in block > 1200 and
  estimated occupied housing units in tract > 2000)
    Mailable addresses ≥ 75% and predicted levels of completed
    interviews prior to subsampling > 60%                                          1.6%
    Mailable addresses < 75% and/or predicted levels of completed
    interviews prior to subsampling ≤ 60%                                          1.7%
All other blocks
  (estimated occupied housing units in block > 1200 and
  estimated occupied housing units in tract ≤ 2000)
    Mailable addresses ≥ 75% and predicted levels of completed
    interviews prior to subsampling > 60%                                          2.1%
    Mailable addresses < 75% and/or predicted levels of completed
    interviews prior to subsampling ≤ 60%                                          2.3%

Source: United States Census Bureau, Design and Methodology: American Community Survey, Technical
Paper 67 (May 2006), U.S. Government Printing Office, Washington, D.C.

The sampling error of an estimate is usually summarized as a combination of a confidence
level and a confidence interval. The confidence level is the percentage of times that drawing a
sample of a particular size from a certain population would result in the actual (but
unknown) parameter of interest falling within the confidence interval computed from that sample.
For instance, a surveyor might report that, based on survey results, sample size, and variance
levels, the percentage of households with zero vehicles in a certain population of households is 10
percent plus or minus 3 percentage points at the 95 percent confidence level. This means that if we
performed the survey 100 times with the same sample size, about 95 of the resulting intervals (each
survey's estimate plus or minus 3 percentage points) would include the true percentage of
zero-vehicle households. For this example, the confidence level is 95 percent, the confidence
interval runs from 7 percent to 13 percent (a width of 6 percentage points), and the
margin of error is ±3 percentage points.
It is common for analysts to establish a confidence level for reporting and then to calculate the
margin of error for the survey-derived estimates associated with that confidence level. The
confidence levels selected are generally related to how much uncertainty researchers are able to
accept in particular estimates. Medical and scientific researchers sometimes will specify 99 percent
confidence levels or higher. Political polls usually seem to report margins of error assuming
confidence levels of 95 percent or 90 percent. For a particular sample population and sample size,
as confidence levels are increased, the corresponding margins of error around the sample estimates
widen.
Suppose a sample parameter is measured from a large sample to have a mean value of X and,
based on the variation in the sample, the standard error is computed to be Y. The confidence
intervals for different confidence levels are shown in Table 4.10.
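The entries in Table 4.10 can be sketched in Python as follows. The multipliers are those listed
in the table; the function name confidence_interval and the sample values X = 10.0 and Y = 1.5 are
hypothetical and chosen only for illustration.

```python
# Multipliers for each confidence level, as listed in Table 4.10.
Z_BY_LEVEL = {80: 1.28, 90: 1.65, 95: 1.96, 99: 2.58}


def confidence_interval(x: float, y: float, level: int) -> tuple:
    """Return (low, high) for an estimate x with standard error y."""
    if level not in Z_BY_LEVEL:
        raise ValueError(f"unsupported confidence level: {level}")
    z = Z_BY_LEVEL[level]
    return (x - z * y, x + z * y)


# Hypothetical values: X = 10.0 and Y = 1.5.
for level in sorted(Z_BY_LEVEL):
    low, high = confidence_interval(10.0, 1.5, level)
    print(f"{level} percent: {low:.2f} to {high:.2f}")
```

The printed intervals widen as the confidence level rises, which is the pattern described above.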
Both the decennial census Long Form and ACS are sample datasets, so sampling error will be
present in estimates from either source. Despite this fact, one almost never sees precision levels
reported for census Long Form estimates. Analysts generally report census Long Form estimates
as single numbers. The Census Bureau does make the precision levels available to users, but most
data users choose not to work with them. Not incorporating the uncertainty levels into analyses
simplifies analyses, some of which are already fairly complicated. However, in practical
application, this also has the effect that many users of the analyses do not understand the nature
of these data. A common misconception of many consumers and users of these data is that they are
census data and therefore are actually based on a 100 percent sample of the population (like the
decennial census Short Form data).
Table 4.10. Confidence intervals for a large sample parameter with a mean value X and a
standard error Y.

                        Confidence Interval
Confidence Level    Low             High
80 percent          X - 1.28 * Y    X + 1.28 * Y
90 percent          X - 1.65 * Y    X + 1.65 * Y
95 percent          X - 1.96 * Y    X + 1.96 * Y
99 percent          X - 2.58 * Y    X + 2.58 * Y

Because the ACS sample sizes are smaller than those of the Long Form, the sampling errors will
be more significant for ACS, and the misconception that the estimates are completely precise is
more likely to lead to erroneous conclusions. For this reason, the Census Bureau is making a
concerted effort to stress that ACS estimates are just that, statistical estimates, and not counts.
The Census Bureau calculates the standard errors for all estimates reported in ACS data
products using procedures that account for the sample design and estimation methods. These
procedures are described in the Census Bureau's Accuracy of the Data reports, which are updated
annually (available at www.census.gov/acs/www/UseData/Accuracy/Accuracy1.htm).
All ACS estimates are reported with margins of error or confidence intervals corresponding to
the 90 percent confidence level. Using the reported estimates and upper and lower bounds, data
users are able to incorporate ACS's sampling error into their analyses and data presentations.
Example Calculations for Incorporating Sampling Error into ACS Analyses. To help analysts
use and interpret the margin of error provided with the ACS estimates, the Census Bureau
provides formulas and some example calculations to guide data users in the Accuracy of the Data
reports. Four example calculations from this source are presented and annotated below:
1. Calculation of the standard error of an ACS estimate,
2. Calculation of the standard error of the sum (or difference) of ACS estimates,
3. Calculation of the standard error of the ratio of two ACS estimates, and
4. Calculation of the standard error of the proportion that an ACS subtotal estimate makes up
of an ACS total estimate.
Although these examples are for a generic analysis for a wider audience, the same procedures
will be used by transportation planners in their most common analyses, as is demonstrated by
the case study sections that follow in this guidebook.
Example Calculation 1 Determine the standard error of a reported ACS estimate.
Problem The ACS estimates the number of males in the United States that have never married
to be 33,290,195. The reported lower bound of the estimate is 33,166,192, and the reported upper
bound is 33,414,198. What is the standard error of the estimate of the number of males who have
never married?
Relevant Equations.
Standard error = 90 percent confidence margin of error / 1.65
Margin of error = max(upper bound - estimate, estimate - lower bound)
Note: Many, but not all, ACS intervals are symmetrical around the reported estimate, so
choosing the maximum interval is the conservative approach to establishing the margin of error.
Calculations
Margin of error = max((33,414,198 - 33,290,195), (33,290,195 - 33,166,192)) = 124,003
Standard error = 124,003 / 1.65 = 75,153
Discussion The standard error calculation, in and of itself, may not be particularly edifying,
but it is a first step that allows users to perform other calculations, like those shown below. Also,
by knowing the standard error, analysts can establish upper and lower bound estimates for other
confidence levels. For instance, the 95 percent margin of error is 1.96 * 75,153 = 147,300.
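Example Calculation 1 can be reproduced with a short Python sketch. The 1.65 multiplier and the
input figures come from the text above; the function name se_from_bounds is hypothetical. The
95 percent margin of error is computed from the rounded standard error, as in the text.

```python
Z90 = 1.65  # multiplier the text uses for the 90 percent confidence level


def se_from_bounds(estimate: float, lower: float, upper: float, z: float = Z90) -> float:
    """Standard error implied by a reported estimate and its 90 percent bounds.

    Uses the larger of the two half-widths, the conservative choice noted above.
    """
    margin_of_error = max(upper - estimate, estimate - lower)
    return margin_of_error / z


se_males = se_from_bounds(33_290_195, 33_166_192, 33_414_198)
print(round(se_males))                # 75153
print(round(1.96 * round(se_males)))  # 147300, the 95 percent margin of error
```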
Example Calculation 2 Determine the Standard Error of a Sum of Reported ACS Estimates.
Problem As noted in the previous example calculation, the number of males that have never
been married is estimated to be 33,290,195, with upper and lower bounds of 33,414,198 and
33,166,192. The estimate of the number of females that have never married is 29,204,857 with a

reported lower bound of 29,090,048, and a reported upper bound of 29,319,666. What is the
estimated number of all people who have never married?
Relevant Equations.
Standard error (SE) of a sum:
$$\mathrm{SE}(\hat{X} + \hat{Y}) = \sqrt{[\mathrm{SE}(\hat{X})]^2 + [\mathrm{SE}(\hat{Y})]^2}$$
Notes: The Census Bureau states that this method will underestimate the standard error if the
items in a sum are highly positively correlated, and will overestimate the standard error if the
items in the sum are highly negatively correlated. This equation also is valid for the standard
error of the difference of ACS reported estimates:
$\mathrm{SE}(\hat{X} - \hat{Y}) = \mathrm{SE}(\hat{X} + \hat{Y})$.
Calculations The point estimate of the number of people who have never married is
33,290,195 + 29,204,857 = 62,495,052.
From the previous example, the standard error of the estimate for males is 75,153. Application
of the same equation for females yields a standard error of 69,581. Therefore, the standard
error of the sum is
$$\mathrm{SE}(62{,}495{,}052) = \sqrt{(75{,}153)^2 + (69{,}581)^2} = 102{,}418$$
Once the standard error of the sum has been calculated, analysts can calculate and report
associated confidence intervals. The 90 percent confidence interval for the total number of people
who have never married (based on the equation in the first example) is
(62,495,052 - 1.65(102,418)) to (62,495,052 + 1.65(102,418)), or
62,326,062 to 62,664,042 people.
Discussion The summation of estimates propagates the sampling error inherent in the individual
addend estimates, so the importance of evaluating and reporting the uncertainty in estimates
derived in this manner is increased.
Many census data users, including transportation planners, will frequently need to combine
individual census estimates in this way to address their specific analysis needs. The detailed
delineations in several of the transportation-related ACS tabulations will frequently require
analysts to sum individual estimates. For instance, ACS tabulations of commuting time of day break
the day into very detailed day parts. To analyze longer periods, such as peak periods as opposed
to peak hours, analysts will need to sum the time period components.
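Example Calculation 2 can be sketched in Python as follows. The standard errors 75,153 and
69,581 are the rounded values from the text; the function name se_of_sum is hypothetical. As the
Census Bureau's note cautions, the formula treats the two estimates as uncorrelated.

```python
import math

Z90 = 1.65  # multiplier the text uses for the 90 percent confidence level


def se_of_sum(se_x: float, se_y: float) -> float:
    """Standard error of a sum (or difference) of two estimates.

    Assumes the estimates are uncorrelated, as in the formula above.
    """
    return math.hypot(se_x, se_y)  # sqrt(se_x**2 + se_y**2)


total = 33_290_195 + 29_204_857     # point estimate of the sum
se_total = se_of_sum(75_153, 69_581)
low = total - Z90 * se_total
high = total + Z90 * se_total

print(total)                   # 62495052
print(round(se_total))         # 102418
print(round(low), round(high)) # 62326062 62664042
```

The same function applies when summing detailed time-of-day categories into a peak period: add
the estimates, then combine their standard errors pairwise or extend the sum of squares.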
Example Calculation 3 Determine the standard error of a ratio of reported ACS estimates.
Problem Suppose the statistic of interest is the ratio of the number of women who have never
married to the number of men who have never married. What is the ratio and the standard error
of the ratio of females who have never married to males who have never married?
Relevant Equations.
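Standard error of a ratio:
$$\mathrm{SE}\left(\frac{\hat{X}}{\hat{Y}}\right) = \frac{1}{\hat{Y}} \sqrt{[\mathrm{SE}(\hat{X})]^2 + \frac{\hat{X}^2}{\hat{Y}^2}\,[\mathrm{SE}(\hat{Y})]^2}$$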
Note: This approximation is valid for ratios of two estimates where the numerator is not a
subset of the denominator.

Calculations The equation inputs are calculated as shown above.
$$\mathrm{SE}\left(\frac{29{,}204{,}857}{33{,}290{,}195}\right) = \frac{1}{33{,}290{,}195} \sqrt{(69{,}581)^2 + \frac{(29{,}204{,}857)^2}{(33{,}290{,}195)^2}\,(75{,}153)^2} = 0.29 \text{ percent.}$$
The ratio of the two estimates is (29,204,857/33,290,195) = 87.73 percent, and the lower and
upper bounds for the 90 percent confidence level are
87.73% ± 1.65 * 0.29% = 87.25% to 88.21%
Discussion This example demonstrates a technique for evaluating how the sampling errors
affect the calculation of ratios between two parallel estimates. A transportation-based example
of this type of comparison would be if an analyst wanted to make a statement such as, "there are
X times more two-vehicle households than zero-vehicle households in geographic area Y."
These comparisons are not usually that useful for single-variable tables, but are very common
and useful when analyzing cross-tabulations, where an analyst might want to say something like,
"workers in zero-vehicle households are X times more likely to commute by transit than workers
in two-vehicle households."
The more common comparison between a subtotal estimate and its corresponding total estimate
(e.g., "X percent of the households have zero vehicles") is covered in the next example calculation.
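Example Calculation 3 can be sketched in Python as follows. The inputs are the estimates and
rounded standard errors from the text; the function name se_of_ratio is hypothetical. The formula
is the approximation given above for ratios where the numerator is not a subset of the denominator.

```python
import math


def se_of_ratio(x: float, se_x: float, y: float, se_y: float) -> float:
    """Approximate standard error of the ratio x / y of two estimates.

    Intended for ratios where x is not a subset of y.
    """
    if y == 0:
        raise ValueError("denominator estimate must be nonzero")
    ratio = x / y
    return math.sqrt(se_x ** 2 + ratio ** 2 * se_y ** 2) / abs(y)


females, se_females = 29_204_857, 69_581
males, se_males = 33_290_195, 75_153

ratio = females / males
se_ratio = se_of_ratio(females, se_females, males, se_males)

print(f"{ratio:.2%}")     # 87.73%
print(f"{se_ratio:.2%}")  # 0.29%
```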
Example Calculation 4 Determine the standard error of a percentage.
Problem: Now, suppose the statistic of interest is the percentage of females who have never
married in relation to the total number of people who have never been married. What is the
percentage of people who have never married that are women, and what is the standard error of the
percentage?
Relevant Equations.
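Standard error of a proportion $\hat{p} = \hat{X}/\hat{Y}$:
$$\mathrm{SE}(\hat{p}) = \frac{1}{\hat{Y}} \sqrt{[\mathrm{SE}(\hat{X})]^2 - \frac{\hat{X}^2}{\hat{Y}^2}\,[\mathrm{SE}(\hat{Y})]^2}$$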
Note: This approximation is valid for proportions of two estimates where the numerator (X)
is a subset of the denominator (Y).
Calculations The point estimate for the proportion of the total that are female is
(29,204,857 / 62,495,052) * 100% = 46.73%.
From the previous calculations, we know the standard error of the number of females who have
never married is 69,581. The standard error for all people who have never married is 102,418.
The standard error of the proportion is
$$\mathrm{SE}\left(\frac{29{,}204{,}857}{62{,}495{,}052}\right) = \frac{1}{62{,}495{,}052} \sqrt{(69{,}581)^2 - \frac{(29{,}204{,}857)^2}{(62{,}495{,}052)^2}\,(102{,}418)^2} = 0.08 \text{ percent.}$$
The proportion is 46.73 percent, and the lower and upper bounds for the 90 percent confidence
level are as follows:
46.73% ± 1.65 * 0.08% = 46.60% to 46.86%
Discussion Determining the percentage that an ACS estimate makes up of an ACS estimated
total will be a very common procedure for transportation planners and other census data users.
For example, to calculate mode shares for different commuting modes, analysts will apply this
procedure.
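Example Calculation 4 can be sketched in Python as follows. The inputs are the estimates and
rounded standard errors from the text; the function name se_of_proportion is hypothetical. Raising
an error when the quantity under the square root is negative is a choice made for this sketch, not
a procedure described in the text.

```python
import math


def se_of_proportion(x: float, se_x: float, y: float, se_y: float) -> float:
    """Approximate standard error of the proportion x / y, where x is a subset of y."""
    if y == 0:
        raise ValueError("denominator estimate must be nonzero")
    p = x / y
    radicand = se_x ** 2 - p ** 2 * se_y ** 2
    if radicand < 0:
        # Sketch-only handling: the formula cannot be evaluated in this case.
        raise ValueError("quantity under the square root is negative")
    return math.sqrt(radicand) / abs(y)


females, se_females = 29_204_857, 69_581
total, se_total = 62_495_052, 102_418

share = females / total
se_share = se_of_proportion(females, se_females, total, se_total)

print(f"{share:.2%}")     # 46.73%
print(f"{se_share:.2%}")  # 0.08%
```

For a commuting mode share, x would be the estimated number of workers using the mode and y the
estimated total number of workers, each with its own standard error.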