The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
OCR for page 53
Using ACS Data

4.4.3 Confidence Intervals

As discussed earlier, the smaller ACS sample size, compared to the Long Form sample size, increases the sampling error and consequently the standard errors of the estimates. The Census Bureau recognizes that the ACS estimates are less reliable than the corresponding Long Form estimates, and it therefore releases 90 percent confidence intervals along with the ACS estimates. Previously, confidence intervals were not released with Long Form estimates. Instead, the dataset documentation included a description of the methodology and tables of parameters that users could employ in calculating these intervals.

Data users should learn how to use and interpret these confidence intervals. The various case studies presented in this guidebook illustrate how the confidence intervals affect the conclusions drawn from the analysis. For example, by examining the standard errors (computed from the estimate and the confidence interval) of estimates for two time periods or two populations, one can determine whether there is any real (or statistically significant) change in the value of the corresponding characteristic or whether the change can be attributed to random error.

4.5 Comparison of ACS Estimates to Census

It is likely that many, or most, new ACS data users will begin their analyses of ACS by comparing ACS estimates to Census 2000 Long Form estimates. There are many methodological differences between the census Long Form and ACS, including differences in

- Sample sizes,
- Data collection procedures,
- Field staff training and capabilities,
- Wording of some questions,
- Reference periods,
- Editing procedures,
- Weighting, and
- Rounding.

Therefore, it would not be surprising to see differences in the estimates that do not reflect actual differences supported by conventional wisdom or other data sources.
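The standard-error comparison described in Section 4.4.3 can be sketched in a few lines. This is a minimal illustration, not a Census Bureau procedure: it assumes the published interval is a symmetric 90 percent interval (critical value 1.645), that the two estimates are independent, and it uses hypothetical interval bounds. The function names are invented for this example.

```python
import math

Z_90 = 1.645  # critical value for a two-sided 90 percent confidence interval


def standard_error(lower, upper, z=Z_90):
    """Recover the standard error from the bounds of a symmetric interval."""
    margin_of_error = (upper - lower) / 2.0
    return margin_of_error / z


def compare_estimates(est_a, se_a, est_b, se_b, z=Z_90):
    """Return (z_score, significant) for the difference of two independent estimates."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z_score = (est_a - est_b) / se_diff
    return z_score, abs(z_score) > z


# Hypothetical mean travel times (minutes) with hypothetical 90 percent bounds.
se_2001 = standard_error(27.8, 29.8)  # estimate 28.8
se_2005 = standard_error(28.7, 30.5)  # estimate 29.6
z_score, significant = compare_estimates(29.6, se_2005, 28.8, se_2001)
print(f"z = {z_score:.2f}, significant at 90 percent: {significant}")
```

With these made-up bounds the 0.8-minute difference does not clear the 90 percent threshold, which is the kind of result the text warns about: an apparent change that may be attributable to random error.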
Certainly, some of these differences are due to actual improvements in methods, as discussed in Section 4.1. In addition, over time, differences between ACS estimates and Census 2000 estimates will become historical footnotes, as only ACS will be carried into the future. Nevertheless, these facts may not help an analyst very much as she or he tries to understand how important population variables are actually changing within a region.

When unexpected differences between ACS and previous Census 2000 estimates are found, analysts may benefit by going through the following checklist:

- Examine the ACS margins of error and standard errors, as discussed in the previous section. An ACS point estimate may look odd compared to the Census 2000 estimate, but the differences may be statistically indistinguishable due to the limited sample sizes and the variability in the data. Remember that the Census 2000 Long Form data represent sample data as well. The Census 2000 documentation provides the means to estimate standard errors for those estimates.
- Investigate ACS and Census 2000 data quality measures, such as item imputation rates, related to the specific conflicting results in question. Imputation rate tables are available with the other base tables on the American FactFinder website.
- Compare the Census 2000 and ACS questionnaires for the item(s) in question, and judge whether differences in the surveys would naturally lead to differences in the estimates. The ACS residency definition and reference period definition are likely to be the cause of many measured differences between the datasets.

A Guidebook for Using American Community Survey Data for Transportation Planning

- Determine whether the curious finding is consistent with what other comparisons between the datasets have noted. Many comparison studies are discussed below.
- Decide whether applying benchmarking analysis could address the identified issues. A technique for performing this analysis is summarized below.
- To the extent possible, identify and utilize validation datasets and administrative records to determine whether the new ACS estimates are reasonable.
- Develop caveat language for reports and presentations to explain the dataset differences and the effects of these differences on your analyses.

The following subsections provide guidance on implementing these strategies. We begin by exploring the research that has been conducted on differences between Census 2000 and ACS to help analysts better understand where structural differences between the datasets can be expected.

4.5.1 Census Bureau Comparison Reports

Four of the reports in the Census Bureau ACS evaluation series compare the results of the C2SS and the decennial census for

- General demographic and housing characteristics (Report 4);
- Economic characteristics (Report 5);
- Social characteristics (Report 9); and
- Physical and financial housing characteristics (Report 10). [30]

Each of the reports concludes that at the national level, the C2SS estimates were similar to those produced from the Census 2000 sample. In addition, the researchers compared county-level estimates for counties corresponding to 18 of the ACS test sites. Few county-level estimate differences were found to be substantive. Even when differences were deemed to be statistically significant (which was common due to the large sample sizes), the report authors note, "data users would in general come to similar conclusions, implement similar programs, and allocate funds in a similar way regardless of which dataset they used."
Where differences were found, the researchers considered potential methodological reasons, and recommended actions for the ACS design. Among the reasons identified for differences were the following:

- Sample coverage differences between the C2SS and decennial census Long Form;
- Differences in the reference periods (Census 2000 focused on a single point of time in April 2000; C2SS referred to "last week" and covered all of year 2000);
- Questionnaire presentation differences, including question wording and response categories;
- Different proxy rules and survey follow-up mechanisms;
- Different weighting and estimation procedures;
- Better internal checks and verification procedures in C2SS than in Census 2000; and
- Interviewers who were more experienced and better trained for C2SS than the enumerators for Census 2000.

A fifth report in the Census series, Report 8: Comparison of the American Community Survey Three-Year Averages and the Census Sample for a Sample of Counties and Tracts, compares estimates from the Census 2000 Long Form to the same ACS estimates (1999-2001) at the county and census tract level for the 36 ACS test sites. [31]

[30] See www.census.gov/acs/www/AdvMeth/Reports.htm.
[31] U.S. Census Bureau, Meeting 21st Century Demographic Data Needs--Implementing the American Community Survey: Report 8: Comparison of the American Community Survey Three-Year Averages and the Census Sample for a Sample of Counties and Tracts (June 2004).

For this analysis, the Census Bureau selected a manageable number of variables for analysis. Four types of estimates were evaluated.

- Demographic estimates included Age, Race, Gender, Hispanic origin, Relationship, Tenure, and Housing occupancy.
- Social estimates included School enrollment, Educational attainment, Marital status, Disability status, Grandparents as caregivers, Veteran status, Nativity and place of birth, Region of birth of foreign born, Language spoken at home, and Ancestry.
- Economic estimates included Employment status, Commuting to work, Occupation, Industry, Class of worker, Income, and Poverty status.
- Housing estimates included Units in structure, Year structure built, Rooms, Year householder moved into unit, Vehicles available, House heating fuel, Occupants per room, Value, Mortgage status and selected monthly owner costs, Selected monthly owner costs as a percentage of household income, Gross rent, and Gross rent as a percentage of household income.

For the county-level comparison, the majority of ACS estimates were in agreement with the Census 2000 estimates. Some of the statistically significant county-level differences were found to be small enough that they would not impact the use of the data. In addition, many of the differences could be attributed to differences in the questionnaires, procedures, or both.

Unfortunately, because of the small sample size in the ACS, meaningful comparisons at the census tract level were difficult to perform. Although the general patterns for the tract data tended to mirror the county patterns, the high levels of variance for the tracts tended to reduce the number of detectable differences.

The report uses the Z-score to determine whether differences are due to sampling variability or are probably due to issues other than sampling variability.

The report finds that most of the variables show small differences between the ACS and Census 2000. At the county level, a large number of counties showed statistically significant differences in disability status, Hispanic origin, and employment status. Some other authors [32, 33] (Stern, 2003; Salvo et al., 2004) have noted that the Census 2000 disability rates may have been inflated, partly because of misinterpretation of the Census 2000 survey question. Stern notes that "differences in disability are traced to computer interviewing in the ACS (a clear improvement over the Census 2000 and ACS mail questionnaire). Differences in race responses are partly traced to the use of permanent field staff where the response "some other race" is not a response category in most other surveys and a much smaller number of these responses are observed in ACS than in Census 2000." [34]

Differences also were seen in labor force participation, mean travel time (Census 2000 estimates are consistently higher), vehicles available in households (Census 2000 estimates were significantly higher in six counties for households with no vehicles, and ACS estimates were significantly higher in five counties for households with three or more vehicles), and means of transportation to work. The carpool to work category (in mode to work) recorded the highest difference, with ACS numbers consistently lower in 9 of 36 counties.

Tables 4.11 through 4.14 summarize the county-level differences reported for the sample census variables. Note that, of the 36 counties analyzed, small is defined as fewer than 4 counties with significant differences; moderate is defined as between 4 and 8 counties; large is defined as 9 or more counties.

Table 4.11. Number of counties with statistically significant differences between ACS and Census 2000 demographic estimates.

Estimate Category      ACS vs. Census 2000 Difference [35]
Sex                    Small
Age                    Moderate
Race                   Large
Hispanic Origin        Large
Relationship           Large
Tenure                 Moderate
Household by Type      Large
Housing Occupancy      Large

Source: United States Census Bureau, 2004.

[32] S.M. Stern (2003). Counting People with Disabilities: How Survey Methodology Influences Estimates in Census 2000 and the Census 2000 Supplementary Survey. Report submitted to the U.S. Census Bureau. Washington, D.C.
[33] Joseph Salvo, Peter Lobo, and Timothy Calabrese. Small Area Data Quality: A Comparison of Estimates 2000 Census and the 1999-2001 ACS, Bronx, New York Test Site, 2004.
[34] U.S. Census Bureau, Meeting 21st Century Demographic Data Needs--Implementing the American Community Survey: Report 8: Comparison of the American Community Survey Three-Year Averages and the Census Sample for a Sample of Counties and Tracts (June 2004), p. xvii.
[35] Of the 36 counties analyzed, small is defined as fewer than four counties with significant differences; moderate is defined as between four and eight counties; large is defined as nine or more counties.

Table 4.12. Number of counties with statistically significant differences between ACS and Census 2000 social estimates.

Estimate Category                                ACS vs. Census 2000 Difference
School Enrollment                                Moderate
Educational Attainment                           Moderate
Marital Status                                   Moderate
Grandparents as Caregivers and Veteran Status    Small
Disability                                       Large
Nativity and Place of Birth                      Moderate
Region of Birth of Foreign Born                  Small
Language Spoken at Home                          Large
Ancestry                                         Large

Source: United States Census Bureau, 2004.

Table 4.13. Number of counties with statistically significant differences between ACS and Census 2000 economic estimates.

Estimate Category      ACS vs. Census 2000 Difference
Employment Status      Large
Commuting to Work      Moderate
Occupation             Small
Industry               Small
Class of Worker        Moderate
Household Income       Moderate
Income by Type         Large
Family Income          Small
Poverty Status         Small

Source: United States Census Bureau, 2004.

Table 4.14. Number of counties with statistically significant differences between ACS and Census 2000 housing estimates.

Estimate Category                                             ACS vs. Census 2000 Difference
Units in Structure                                            Large
Year Structure Built                                          Large
Number of Rooms                                               Large
Year Householder Moved into Unit                              Small
Number of Vehicles                                            Moderate
House Heating Fuel                                            Moderate
Selected Housing Characteristics                              Large
Occupants per Room                                            Large
Housing Value                                                 Moderate
Mortgage Status and Selected Owner Costs                      Small
Selected Monthly Costs as a Percentage of Household Income    Moderate
Gross Rent                                                    Moderate
Gross Rent as a Percentage of Household Income                Large

Source: United States Census Bureau, 2004.
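The small, moderate, and large labels used in Tables 4.11 through 4.14 follow the thresholds stated in the text (fewer than 4, between 4 and 8, and 9 or more of the 36 counties). A minimal sketch of that rule, with a function name invented for this illustration:

```python
def difference_category(n_counties, total_counties=36):
    """Label a variable by how many counties show a significant difference.

    Thresholds follow the definition in the text: fewer than 4 is small,
    4 through 8 is moderate, and 9 or more is large.
    """
    if not 0 <= n_counties <= total_counties:
        raise ValueError("count must be between 0 and the number of counties")
    if n_counties < 4:
        return "Small"
    if n_counties <= 8:
        return "Moderate"
    return "Large"


# The carpool category, with ACS lower in 9 of 36 counties, falls in "Large".
print(difference_category(9))
```

The tables report only the resulting labels, so the underlying county counts for most variables are not recoverable from this chapter.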

4.5.2 Local Area Experts Comparison Reports

In addition to the reports prepared by staff, the Census Bureau contracted with four local experts to provide site-specific analysis of these data. With their local knowledge of the counties, they provided a comprehensive interpretation of the data from a user perspective.

Bronx County, New York

Bronx County data were assessed in the report summarized at www.census.gov/acs/www/AdvMeth/acs_census/lreports/SalvoLoboCalabrese.pdf and written by Joseph Salvo, Peter Lobo, and Timothy Calabrese in March 2004.

Because the three-year aggregate ACS sample size for the Bronx (10.2 percent of total housing units) was very small at the census tract level, this report examined data at a neighborhood level. [36] The 355 tracts in the Bronx were aggregated to 88 neighborhoods. The report finds that the mail return rates between the census and ACS are only modestly correlated (0.42). The average Census 2000 return rate was 53 percent. During the period 1999-2001, the ACS had an average return rate of 36 percent, decreasing from 38 percent in 1999 to 34 percent in 2001. The ACS also has a response rate that varies by geographic area. However, the allocation levels were lower in the ACS than in Census 2000, both for housing and population items.

The ACS produced higher percentages for people in the labor force than Census 2000. Carpool rates in ACS were about 2 percentage points lower than in Census 2000. Table 4.15 shows some of the variables for which statistically significant and meaningful [37] differences were found between the ACS and Census 2000.
The authors expressed concern regarding the adequacy of five-year accumulated data at the census tract level, as follows:

    Another concern, again related to the heavy dependence in the ACS on non-response follow-up, is that five years of data may not be enough to generate reliable estimates at the census tract level if mail return rates do not improve. This study provides a good illustration of what limits a 9 versus 15 percent sample placed on our ability to derive reliable estimates, namely the use of 88 neighborhood tract aggregates in lieu of estimates for the actual 355 census tracts.

Table 4.15. Variables with statistically significant differences in Bronx County: ACS versus Census 2000.

Variable                                   ACS             Census 2000
Population Aged 21-64 with Disability      19.0%           31.8%
Commute via Carpool                        7.0%            9.3%
Commute via Public Transportation          57.0%           53.9%
Mean Travel Time to Work                   40.4 minutes    43.1 minutes
Civilian Employment                        50.3%           45.7%
Median Household Income                    $26,185         $27,611
Mean Earnings                              $41,552         $44,116
Poverty Status of Individuals              56.8%           58.8%
Vehicles Available in Household = 1        30.1%           28.8%

Source: Salvo, Lobo, and Calabrese, 2004.

[36] ACS sample rates were 15 percent for Pima, Hampden, Douglas, and Multnomah Counties. For Broward, Bronx, San Francisco, Lake, and Franklin Counties, the ACS three-year aggregate sampling rate was closer to 10 percent.
[37] "Meaningful" differences are defined by the authors as statistically significant differences of 2 percent or more between ACS results and Census 2000 results.

Multnomah County, Oregon

Multnomah County ACS test site data were assessed in the report found at www.census.gov/acs/www/AdvMeth/acs_census/lreports/hough_swanson.pdf and written by George Hough and David Swanson in March 2004.

Examining the self-response rates for Multnomah County, the authors state that if the only data for the survey were to come from self-response, ACS would have significant problems in areas where there is a concentration of minority populations. The most important issue underlying all of their concerns is funding the ACS effort continuously. "Sufficient funding for implementing the 2010 ACS plan must be ensured for a longer time horizon than the annual federal budget process now allocates."

However, the ACS allocation rates were lower than those of Census 2000 for population and housing items. ACS provided better data than Census 2000 for sample unit non-response rates, occupied sample unit non-response rates, and housing unit sample completeness ratios, with no significant difference observed for the household population sample completeness ratios. Census 2000 results were better than ACS when examining vacant housing unit non-response rates.

The Census 2000 sample uses population, housing unit, and household controls, while the ACS weights housing units and population only. Using the specific housing unit and population weights to estimate households results in a difference between the number of householders and corresponding households of about 5,000.

Another contribution of this report is an alternate analysis of differences by using a method called the "Loss Function." The Loss Function summarizes the information in the absolute numeric and absolute percent differences by combining them in a weighted fashion. Using the Loss Function, the authors identified some concerns with the measurement of race variables in ACS.
The authors suggest that the Census Bureau release estimates for aggregated racial groupings as opposed to the detailed race groups that currently are provided. Significant differences also were observed for the Hispanic population.

San Francisco and Tulare Counties, California

Data from San Francisco and Tulare Counties were assessed in the report provided at www.census.gov/acs/www/AdvMeth/acs_census/lreports/gage.pdf and written by Linda Gage.

This report compared ACS and Census 2000 data for San Francisco and Tulare Counties. The report notes striking differences in data collection on race, disability status, vacancy status, number of rooms in structure, and grandparents as caregivers. However, 80 percent of the total variables were comparable. There were significant differences in the percentage of foreign-born, educational attainment, and language spoken at home; the author states that the rates of allocation in Census 2000 are the reason for the differences. Response rates were significantly improved under the ACS for the most difficult items, such as income. The report's findings on non-response are consistent with the Census Bureau's quality measures report. Census 2000 data show higher percentages of workers commuting by carpool, longer commute times, and a higher percentage of households without vehicles than ACS. ACS shows a higher percentage of households with one vehicle.

The author provides strategies for analyzing and using census data. ACS prospects and predicaments also are delineated in the report as follows:

The amount of data available to make your own assessment of the comparability, quality, usefulness, and potential benefits of ACS is initially overwhelming. The data, quality measures, and geography make analysis a challenge.
Statistical measures like the differences, standard errors, Z-scores and P-values can help quickly identify significant differences but some statistically significant differences may not be meaningful differences in the world of the data user.

In general, the ACS appears to be measuring the same things in much the same ways as the census and getting similar results. There is still much to learn about data comparability, reasons for differences and whether "different" is better, worse or just different. There are differences between the census and ACS, some statistically significant differences. These may ultimately be welcome differences if ACS data are consistent, more current, and of higher quality than data from the Census 2000 Long Form sample.

A few suggestions as you proceed to use the ACS data:

- Do not try to analyze all the data all at once even if you use all the items or must supply them to others.
- Concentrate on the data items that you already use in your work frequently. Compare those items with the census data.
- Do not assume the census picture is more accurate. Check the quality measures.
- Compare ACS and census data to administrative records that you may have available.
- Consider whether the data make sense.
- Learn to use and provide standard errors supplied with ACS data.
- Communicate your findings with the Census Bureau and others evaluating the ACS data. This will improve the survey as it matures.

ACS has been designed to collect and provide more complete and current demographic, social, economic, and housing information between censuses and to replace the Census 2010 Long Form. The success of this endeavor depends upon continuous and adequate funding, sufficient sample sizes, and a current and accurate MAF. Shortfalls in any of these areas could reduce data quality. The decennial census is subject to the same perils.

As the ACS continues to evolve and improve, a few of the identified challenges include:

- Resident populations in facilities such as prisons and dormitories (group quarters),
- Improving the Census Bureau's population estimates used as the population controls for the ACS, and
- Assisting data users to use a series of averaged data and data for small jurisdictions and seasonal areas.
Vilas and Oneida Counties, Wisconsin and Flathead and Lake Counties, Montana

Data from these sample Wisconsin and Montana counties were assessed in the report provided at www.census.gov/acs/www/AdvMeth/acs_census/lreports/vossetal.pdf and written by Paul Van Auken, Roger Hammer, Paul Voss, and Daniel Veroff in March 2004.

This report assesses ACS attributes and quality measures at county and tract levels for counties with seasonal population. Based on seasonality in these counties, the authors anticipate ACS values to be higher for older population, median age, occupied housing units, median income, and housing values, and lower for unemployment and average household size. Because rural census tracts are so large in geographic extent and encompass governmental units, the authors would like to have data at the minor civil division level, in addition to census tracts.

Because the Census Bureau expects ACS to achieve a (five-year) sample that is 75 percent of the census Long Form, and because the housing unit response is roughly 75 percent of those originally in the sample, the ACS "interviewed" sample size would be 56 percent (0.75 × 0.75 = 0.56) of the "100 percent response" census Long Form. The authors expect this to be exacerbated in rural areas. All four counties studied exhibited a sizeable difference in economic and housing attributes for over 20 percent of items. ACS was successful in capturing some of the seasonal variations. Plotting the annual estimates of ACS at the county level, the authors found that ACS would be unable to provide reliable annual estimates for smaller areas like Vilas and Oneida Counties, particularly if they are not oversampled. The authors also plotted the ratio of ACS and Census 2000 standard errors at the census tract level and found a substantial number of cases where the ratio is more than 1.3, the level predicted by the Census Bureau.
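The sample-size arithmetic above can be connected to the standard error ratio with a short worked example. This is a sketch under an assumption not stated in the report: that standard errors scale with the inverse square root of the sample size, as under simple random sampling. The variable names are illustrative only.

```python
import math

# Figures taken from the text.
target_sample_ratio = 0.75  # five-year ACS sample relative to the Long Form
response_ratio = 0.75       # share of sampled housing units that respond

# Effective "interviewed" sample relative to a fully responding Long Form.
effective_ratio = target_sample_ratio * response_ratio

# Assumption: the standard error is proportional to 1 / sqrt(sample size).
se_ratio = 1.0 / math.sqrt(effective_ratio)

print(f"Effective sample ratio: {effective_ratio:.4f}")
print(f"Implied ACS / Long Form standard error ratio: {se_ratio:.2f}")
```

Under that assumption an effective sample of about 56 percent implies standard errors roughly one-third larger than the Long Form, which is in line with the ratio of about 1.3 cited in the text. Actual ACS standard errors depend on the survey's weighting and design, so this is only a rough check.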

The authors did not draw any conclusions on the comparison of the ACS and Census 2000 data, citing the following four reasons:

1. Lack of data at the minor civil division level for comparison: The authors believe that data at the minor civil division level are critical to providing meaningful data for governmental units in rural areas.
2. Access to uncontrolled estimates from ACS: The authors want to review the ACS numbers, properly weighted, but without the final control to the population and housing estimates, to examine what the ACS implies in terms of numbers of people/housing units in addition to their characteristics.
3. Because of a sampling error, ACS samples for some of the counties are substantially smaller than Census 2000 samples, thus yielding estimates with higher standard errors and more uncertainty.
4. One of the goals of the ACS is for standard errors in ACS not to exceed Census 2000 standard errors by more than 33 percent at all levels of census geography. At the tract level, attribute standard errors for the ACS appear to exceed those obtained in the Long Form by more than 33 percent.

4.5.3 ACS Transportation-Related Research

One objective was to compare the ACS data to the decennial census data so as to be able, to the extent possible, to draw conclusions about the differences between the data sources, the relative accuracy of the data sources, and the adequacy of the ACS data. The differences between the ACS and CTPP estimates can be attributed to several factors, including differences in sampling rates, survey methodology, wording of the questions, timeframe of data collection, control totals, and rounding. It is important to understand both the magnitude and the significance of these differences, and how they would impact transportation planning applications.
We evaluated the general quality and validity of three-year accumulations (1999-2001) of ACS transportation-related data based on residence, workplace, and flow for nine test counties by comparing them to Census 2000 data that correspond to CTPP Part 1, Part 2, and Part 3 data. The ACS and census data tables were provided to the project team by FHWA, which had received them for evaluation from the Census Bureau. Appendix I summarizes the analyses that were conducted. Some conclusions of these analyses were as follows:

- In general, the CTPP and ACS datasets appear to show the same patterns for the transportation-related tables. Only a small number of tracts and TAZs in the test counties for which data were available had significant variances between the two datasets.
- When we correlated the differences that were found with other tract and TAZ variables, we detected some systematic biases in the residence-based estimates, most notably for the following variables:
  - Disability status;
  - Disability status by mode to work;
  - Tenure (specifically, the owned-with-mortgage category);
  - Number of workers in the household by vehicles available by household income;
  - Poverty status (specifically the category for incomes between 100 percent and less than 150 percent of poverty); and
  - Telephone availability.
- Although the analyses of workplace-based estimates were more limited by the available comparison data, we did not identify any systematic biases.
- Effective comparisons of worker flow data were not possible.

4.5.4 Questionnaire Considerations

There are many questionnaire and data collection differences between ACS and the census Long Form data that affect the comparability of individual estimates (see Section 2 and Appendix A), but two differences are likely to affect many of the estimates, including transportation-related characteristics. The ACS residency definition and reference period definition will have an important effect on many comparisons of ACS and Census 2000.

Residency Definition

The ACS uses different residence rules than have been employed in past decennial censuses. Although the decennial census uses the usual residence concept, the ACS uses the current residence concept along with the Two-Month Rule.

The current residence concept suits the ACS, because the ACS continuously collects information from monthly samples throughout the year. The current residence concept recognizes that people can live in more than one place over the course of a year, and that population estimates for some areas may be noticeably affected by these people. Seasonal areas can experience important increases in their population over the year, increases that are not measured when only usual residents are recognized.

While the use of the current residence concept gives a more accurate picture of an area's population, it does present some challenges (for example, in integrating ACS data with intercensal population estimates, which employ the decennial census usual residence definition). [38]

Reference Period

Since ACS data are collected continuously, the annual ACS estimates represent cumulative data over the 12-month interview cycle, and thus average annual conditions. In contrast, decennial census data represent point-in-time conditions. The implication of the different reference dates is that ACS data will more accurately capture average conditions in seasonal areas.
Decennial census data will only reveal characteristics of those areas on a single day, which might be quite different from conditions at other times of the year.

Using average annual data in models or analyses that are developed based on point-in-time data might be inconsistent, and this presents challenges to the analyst. For example, using average annual data or multiyear moving average data to calibrate/validate a travel demand model (e.g., trip distribution models, mode choice models) that predicts at a single point in time is theoretically inconsistent. However, this might not be a major issue if changes in household characteristics or mode choices are not significant over the period when the data are collected.

It is important to note that even though the rolling reference period procedures are used in ACS, the ultimate population control for any given year is the July 1 estimate. The implications of this control for seasonal analysis are discussed in Section 4.6.

4.5.5 Bridging between Year 2000 Census Data and ACS

Much of the discussion in previous sections has focused on identifying why seemingly surprising differences might occur between Census 2000 Long Form estimates and ACS estimates from roughly the same time. This section describes how the analyst might apply corrective factors to allow for better comparisons.

Suppose the data shown in Table 4.16 on the mean travel time in a given area are available. The analyst wants to determine the change in the mean travel time from 2000 to 2005. Decennial census data are available in year 2000 but not in any of the following years; ACS data are not available for this area in year 2000 but are available afterwards.

[38] Amy Symens Smith, "The American Community Survey and Intercensal Population Estimates: Where Are the Crossroads?" 1998. See www.census.gov/population/www/documentation/twps0031/twps0031.html.

Table 4.16. Example of ACS and census data comparison.

Year    ACS Mean Travel Time    Census Mean Travel Time
2000    NA                      30
2001    28.8                    NA
2002    29.0                    NA
2003    29.2                    NA
2004    29.4                    NA
2005    29.6                    NA

The 2001 to 2005 ACS travel time series shows an increasing trend in travel time over the years. However, the Census 2000 mean travel time estimate is larger than the ACS estimates in each of the years from 2001 to 2005. If analysts compare the raw Census 2000 estimate to the 2005 ACS estimate, they might erroneously conclude that congestion has decreased and lowered journey-to-work travel times by 0.4 minutes.

However, this conclusion is probably inaccurate because the two estimates are drawn from two different surveys. When one corrects for the inherent differences between the surveys through an analytical comparison of Census 2000 and C2SS data, a more reasonable conclusion can be drawn. As discussed below, the Census 2000 estimate can be converted to a 2000 ACS-like estimate by multiplying it by a factor of 0.9552, resulting in a 2000 estimate of 28.7 minutes (30 x 0.9552). Given this estimate, one could conclude that the travel time increased between years 2000 and 2005 by 0.9 minutes.

The process of reconciling the estimates from one survey to the estimates from another survey is called benchmarking. It is typically done when two surveys with different precision levels and collection frequencies are available for providing estimates of a given population's characteristics. The survey that is normally used as the benchmark is the one whose estimates are more reliable. Different methods exist for benchmarking, such as constrained estimation,39 prediction models,40 and imputation of adjusted responses.41

Since future data releases will only be from the ACS, Census 2000 data can be reconciled to produce year 2000 ACS-like data that could then be more consistently compared to future releases of ACS data.
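The Table 4.16 comparison can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the travel times and the 0.9552 factor come from the text, while the variable and function names are hypothetical and chosen for this example.

```python
# Mean travel times in minutes, taken from Table 4.16.
acs_mean_travel_time = {2001: 28.8, 2002: 29.0, 2003: 29.2, 2004: 29.4, 2005: 29.6}
census_2000_mean_travel_time = 30.0

# Pooled-sample bridging factor for mean travel time quoted in the text.
TRAVEL_TIME_FACTOR = 0.9552


def to_acs_like(census_estimate, factor):
    """Scale a Census 2000 estimate into an ACS-like estimate (hypothetical helper)."""
    return census_estimate * factor


# Naive comparison: raw Census 2000 value against the 2005 ACS value.
raw_change = acs_mean_travel_time[2005] - census_2000_mean_travel_time

# Bridged comparison: convert the census value first, then compare.
acs_like_2000 = to_acs_like(census_2000_mean_travel_time, TRAVEL_TIME_FACTOR)
bridged_change = acs_mean_travel_time[2005] - acs_like_2000

print(f"Raw change 2000-2005:     {raw_change:+.1f} minutes")
print(f"ACS-like 2000 estimate:   {acs_like_2000:.1f} minutes")
print(f"Bridged change 2000-2005: {bridged_change:+.1f} minutes")
```

The raw comparison suggests a 0.4-minute decrease, whereas the bridged comparison shows an increase of roughly 0.9 minutes, which matches the figures in the text.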
One method that has been used to bridge this gap is regression analysis. Year 2000 ACS data are available from C2SS for 216 counties with population above 250,000. The C2SS data for these counties can be used together with decennial census data for the same counties to analyze the differences between the two data sources. For this guidebook, the following variables were analyzed: mode to work, travel time to work, vehicle availability, and income.

39 This method can consist, for example, of adjusting the weights used to obtain the ACS estimates so that the ACS weighted annual average for selected characteristics would be equal to that of the census.

40 In this method, a model (e.g., a regression) is developed using census estimates as the dependent variables and ACS estimates as predictors (or independent variables). The fitted equation can then be used to calibrate the ACS estimates to the census estimates by doing empirical Bayes' smoothing. This only applies, however, to the ACS variables included in the model.

41 This method consists of estimating "what proportion of ACS respondents must have given the wrong answers to produce the observed differences, and then imputing the necessary proportion of different answers to bring agreement [with the census]". It requires the estimation of a measurement error model based on the differences between ACS and the census.

For each variable of interest, we regressed the 2000 ACS estimate as the dependent variable against the 2000 decennial census estimate as the independent variable. The slope of this regression (with no intercept) provides a factor by which the census estimate can be multiplied to obtain an ACS-like estimate. We did this analysis using all 216 county observations, as well as separately by metropolitan statistical area/consolidated metropolitan statistical area (MSA/CMSA) size to account for any biases that might be a function of area size.

Table 4.17 shows the factors obtained for means of transportation to work. The following categories are used: percent that drove alone, percent that carpooled, percent that used public transportation, and percent that walked.

Therefore, based on this regression analysis, we could conclude that the Census 2000 drove-alone mode share would be more consistent with ACS estimates if the census estimate were multiplied by 1.0099. Note that some of the factors for the walk and transit modes are fairly large, indicating that there were significant differences between the raw Census 2000 and ACS-like C2SS estimates.

Table 4.18 shows the factors obtained for vehicle availability, using the categories percent of zero-vehicle households and average auto ownership (average vehicles per household).

Tables 4.19 and 4.20 show the factors obtained for travel time to work. Categories used are mean travel time (minutes), percent with short commutes (less than 20 minutes), and percent with long commutes (greater than 20 minutes).

Table 4.21 shows the factors obtained for median household income.
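The no-intercept regression described above has a simple closed form: for census values x and ACS values y, the least-squares slope is sum(x*y) / sum(x*x). The sketch below assumes an ordinary unweighted fit, which the text does not specify, and uses made-up county values purely for illustration; the function name and data are hypothetical and are not the guidebook's dataset.

```python
def no_intercept_slope(census_values, acs_values):
    """Least-squares slope b for the model acs = b * census (no intercept).

    Minimizing sum((y - b * x) ** 2) over b gives b = sum(x * y) / sum(x * x).
    """
    if len(census_values) != len(acs_values):
        raise ValueError("inputs must have the same length")
    sum_xy = sum(x * y for x, y in zip(census_values, acs_values))
    sum_xx = sum(x * x for x in census_values)
    if sum_xx == 0:
        raise ValueError("census values must not all be zero")
    return sum_xy / sum_xx


# Hypothetical county-level mean travel times in minutes (NOT guidebook data).
census_2000 = [22.0, 25.5, 28.0, 31.0, 34.5]
c2ss_2000 = [21.1, 24.3, 26.9, 29.5, 33.0]

# The slope is the bridging factor applied to census estimates.
factor = no_intercept_slope(census_2000, c2ss_2000)
acs_like = [round(x * factor, 1) for x in census_2000]

print(f"bridging factor: {factor:.4f}")
print("ACS-like estimates:", acs_like)
```

Running the same function separately on subsets of counties grouped by MSA/CMSA size would give size-specific factors in the manner the text describes.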
Of course, these benchmarking factors are crude measures of the differences between Census 2000 and ACS, but analyses like these could help analysts understand and report trend data that rely on the different datasets.

Table 4.17. Means of transportation to work.

MSA/CMSA MSA/CMSA: MSA/CMSA Mode Pooled Sample 5 Million
Drove-Alone              1.0099
Carpool                  0.9249
Public Transportation    0.9701  1.0043  1.0861
Walked                   0.7579  0.8496  0.9265

Table 4.18. Vehicle availability.

MSA/CMSA MSA/CMSA: MSA/CMSA Vehicles Available Pooled Sample 5 Million
Zero                     0.9164  0.8887  0.9617
Average Number           1.0182

Table 4.19. Mean travel time to work.

Travel Time         Pooled Sample
Mean travel time    0.9552