ponent of the ACS—for example, the possible deletion of institutions from the ACS universe—that would be cost-beneficial for users and stakeholders.

4-D DATA PREPARATION

This section briefly describes the key procedures used to prepare the ACS data products, including confidentiality protection measures (4-D.1), the collapsing of tables because of large sampling errors (4-D.2), inflation adjustments of income and housing value and costs (4-D.3), tabulation specifications with respect to the population universe and geographic areas for which various estimates are provided (4-D.4), and data quality review (4-D.5). Recommendations for research and development on these topics appear in the applicable subsections.

4-D.1 Confidentiality Protection

4-D.1.a Confidentiality Protection Procedures

The Census Bureau uses three primary methods of disclosure avoidance to minimize the risk that someone could identify an individual respondent in the ACS data products: data swapping, categorizing variables, and top-coding. The first two methods are used for tabulations; all three are used for the ACS public use microdata sample (PUMS) files. The PUMS files further protect confidentiality by deleting names and addresses from the individual records, by limiting geographic identification to large areas of about 100,000 people, called public use microdata areas (PUMAs), and by perturbing the ages of people in households with 10 or more members. In addition, the subsampling used to generate the PUMS files affords protection: even someone who knows that a particular person was in the full ACS sample cannot know whether that person is in the PUMS subsample.
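The PUMS-specific protections listed above can be illustrated with a minimal Python sketch. It is not the Census Bureau's procedure: the function name `make_pums_records`, the record fields, the tract-to-PUMA lookup `puma_of`, the subsampling rate, and the size of the age perturbation are all hypothetical and chosen only for illustration.

```python
import random

def make_pums_records(persons, puma_of, subsample_rate=0.4, max_age_shift=2, seed=0):
    """Illustrative PUMS-style protections (all parameters are assumptions)."""
    rng = random.Random(seed)

    # Group person records by household so that whole households are kept or dropped.
    by_hh = {}
    for p in persons:
        by_hh.setdefault(p["hh_id"], []).append(p)

    out = []
    serial = 0
    for members in by_hh.values():
        # Subsampling: a household in the full sample may or may not reach the file.
        if rng.random() >= subsample_rate:
            continue
        serial += 1
        large = len(members) >= 10  # households with 10 or more members
        for p in members:
            age = p["age"]
            if large:
                # Perturb ages in large households (shift size is hypothetical).
                age = max(0, age + rng.randint(-max_age_shift, max_age_shift))
            # Names and addresses are dropped; geography is reduced to the PUMA.
            out.append({"serial": serial, "puma": puma_of[p["tract"]], "age": age})
    return out

persons = [
    {"hh_id": "h1", "name": "A. Smith", "address": "1 Main St", "tract": "T1", "age": 34},
    {"hh_id": "h1", "name": "B. Smith", "address": "1 Main St", "tract": "T1", "age": 5},
]
puma_of = {"T1": "P0100"}
records = make_pums_records(persons, puma_of, subsample_rate=1.0)
```

With `subsample_rate=1.0` every household is retained, which makes the small example deterministic; any lower rate leaves an outside observer unsure whether a known respondent appears in the output at all.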

Data swapping is applied when a household has rare characteristics (such as being the only minority household in a block group). In such instances, the entire household may be swapped with a demographically similar household in a different geographic region. Only a small percentage of households are ever swapped, and the swapped households are never identified. The purpose of swapping is to ensure that users cannot identify a household with certainty. All data products are created from the ACS records after swapping.
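The idea of swapping can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the Census Bureau's actual rules: the function `swap_rare_households`, the field names, the definition of "rare" (the only household with a given `race` value in its block group), and the matching criterion (same household `size`) are all hypothetical.

```python
import random
from collections import Counter

def swap_rare_households(households, match_keys=("size",), rare_key="race", seed=0):
    """Exchange the geography of each locally unique household with a similar one elsewhere."""
    rng = random.Random(seed)
    # A count of 1 means the household is the only one of its kind in its block group.
    counts = Counter((h["block_group"], h[rare_key]) for h in households)
    out = [dict(h) for h in households]  # work on copies; the input is left untouched
    swapped = set()
    for i, h in enumerate(out):
        if i in swapped or counts[(h["block_group"], h[rare_key])] != 1:
            continue
        # Candidate partners: not yet swapped, in another block group, same matching keys.
        candidates = [
            j for j, p in enumerate(out)
            if j != i and j not in swapped
            and p["block_group"] != h["block_group"]
            and all(p[k] == h[k] for k in match_keys)
        ]
        if not candidates:
            continue  # no similar household available; leave the record as is
        j = rng.choice(candidates)
        out[i]["block_group"], out[j]["block_group"] = (
            out[j]["block_group"], out[i]["block_group"])
        swapped.update((i, j))
    # No flag is written to the records, so swapped households cannot be told apart.
    return out

households = [
    {"hh_id": 1, "block_group": "A", "race": "X", "size": 3},
    {"hh_id": 2, "block_group": "A", "race": "Y", "size": 2},
    {"hh_id": 3, "block_group": "A", "race": "Y", "size": 4},
    {"hh_id": 4, "block_group": "B", "race": "Y", "size": 3},
    {"hh_id": 5, "block_group": "B", "race": "Y", "size": 2},
]
result = swap_rare_households(households)
```

In this example household 1 is the only household with race "X" in block group A, and household 4 is the only household of the same size in another block group, so the two exchange locations. The number of households in each block group is unchanged, and all tabulations would then be produced from `result` rather than from the original records.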

Categorizing variables refers to collapsing categories of a variable within a table, or on the PUMS records, to avoid small cell sizes. For example, a table may combine some race categories, such as races other than white and black, into a single category, or a table may combine income



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.