rion when making the case for government-supported data collection, particularly in flagship surveys where competition for space is intense. The strength of correlative or causative connections and the perceived importance of the hypothesized outcome are key criteria for setting data collection priorities. For example, if trust in others in a neighborhood is strongly associated with crime rates and only weakly associated with, say, mental health, trust may be more useful to measure in a crime and victimization survey than in a health survey. However, if mental health is considered a larger social issue than crime, its greater importance could offset its weaker linkage to trust when deciding where to focus data collection resources.

The Potential of Alternative Data Sources

The U.S. Office of Management and Budget and the agencies responsible for the federal statistical system determine standards, guidelines, and appropriate content for surveys on an ongoing basis. In addition to such benefits as larger sample sizes, higher standards for methodological transparency and documentation, better archiving and access, and a greater likelihood of being repeated over time, government surveys typically enjoy higher response rates than private-sector surveys. For some measures, this is critical; information about people’s volunteering activities is an example. Abraham et al. (2009) showed that high response rates are important for measuring volunteerism because people who engage in these activities are also most likely to participate in surveys such as the American Time Use Survey (ATUS); selection bias (associated with nonresponse, in this case) would therefore be exacerbated in a survey with a low response rate. This finding suggests an area of comparative advantage for the CPS Volunteer Supplement.
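The arithmetic behind this kind of nonresponse bias can be illustrated with a minimal sketch. Every number below is hypothetical and is not taken from Abraham et al. (2009) or from any survey; the function name `respondent_share` is likewise invented for illustration. The sketch assumes that volunteers respond at a higher rate than non-volunteers and that the relative gap between the two groups is wider in the low-response survey. Under those assumptions, the share of volunteers among respondents overstates the true share, and the overstatement grows as response falls.

```python
def respondent_share(true_share, resp_volunteer, resp_other):
    """Share of volunteers among respondents when response rates differ by group."""
    responding_volunteers = true_share * resp_volunteer
    responding_others = (1 - true_share) * resp_other
    return responding_volunteers / (responding_volunteers + responding_others)


# Hypothetical population volunteering rate (an assumption, not an estimate).
TRUE_SHARE = 0.25

# Hypothetical response rates for (volunteers, non-volunteers).
# High-response survey: the two groups respond at similar rates.
high_response = respondent_share(TRUE_SHARE, 0.90, 0.80)
# Low-response survey: non-volunteers drop out disproportionately.
low_response = respondent_share(TRUE_SHARE, 0.60, 0.30)

print(f"true volunteering share:       {TRUE_SHARE:.3f}")
print(f"high-response survey estimate: {high_response:.3f}")
print(f"low-response survey estimate:  {low_response:.3f}")
```

In this illustration both surveys overstate volunteering, but the low-response survey does so by a much larger margin. A low response rate alone does not guarantee this result; it follows from the assumed widening of the gap between the two groups' response rates.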

Administrative data sources—both government and nongovernment—are becoming prominent in the alternative data landscape. Sometimes these data, produced as a by-product from program or other (nonstatistical) needs, can be linked with survey and other data to allow richer analyses than would be possible with survey data alone.2 The optimal data strategy for one data set or survey therefore cannot be sensibly designed without consideration of other elements of the data infrastructure. The ability to link government data sources means that covariate information may not be limited to the fields on the primary survey vehicle. Tax data, Social Security records, and information on program participation are all


2 For example, Chetty et al. (2013) combined administrative tax data from the Internal Revenue Service and local area variables to analyze patterns of intergenerational occupational and earnings mobility.

The National Academies of Sciences, Engineering, and Medicine
500 Fifth St. N.W. | Washington, D.C. 20001

Copyright © National Academy of Sciences. All rights reserved.