1 The Promise of Integrated Data
Pages 9-24

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 9...
... , satellite and sensor data, private-sector data such as electronic health records and credit card transaction data, and massive amounts of data available on the internet. There is increasing interest in using non-survey data sources together with probability surveys to improve official statistics and create new data resources for social and economic research.
From page 10...
... Each panel was charged with convening a 1.5-day workshop on particular aspects of a vision for a new data infrastructure and writing a consensus panel report on those aspects. The first panel's workshop, The Scope, Components, and Characteristics of a 21st Century Data Infrastructure, was held on December 9 and 16, 2021. This workshop explored recent data infrastructure initiatives in the federal government; presented examples of using private-sector data for statistical purposes; and discussed legal, privacy, and access issues in using alternative data sources for official statistics.
From page 11...
... These reports will help inform a vision for a new data infrastructure and will not include recommendations. The three reports will follow institutional guidelines and be subject to the National Academies review procedures prior to release.
From page 12...
... . This report examines current practice and potential for using data originating from administrative records, private-sector organizations, sensors and satellites, and other sources to enhance the timeliness, detail, and accuracy of information currently collected through surveys.
From page 13...
... Section 1.1 describes the potential of combined data sources to improve evidence-based policymaking and gives an example in which using multiple data sources to investigate childhood lead exposure resulted in new information that was used to change policy. Section 1.2 discusses frameworks for evaluating the quality of statistics calculated from single and multiple data sources.
From page 14...
... But childhood lead poisoning can be prevented. Large declines "in blood lead levels occurred from the 1970s to the 1990s following the elimination of lead in motor-vehicle gasoline, the ban on lead paint for residential use, removal of lead from solder in food cans, bans on the use of lead pipes and plumbing fixtures and other limitations on the uses of lead" (President's Task Force on Environmental Health Risks and Safety Risks to Children, 2016, p.
From page 15...
... While there were indications that HUD-assisted housing units had lower levels of lead hazards, no single dataset included both designations of HUD-assisted housing and information on children's blood lead levels, which would enable evaluating associations between children's health and living in HUD-assisted housing. The NHANES data contained blood lead levels and other health information about respondents, but no information on whether respondents lived in assisted housing.
From page 16...
... Statistical Policy Directive No. 1 states: "It is the responsibility of Federal statistical agencies and recognized statistical units to produce and disseminate relevant and timely information; conduct credible, accurate, and objective statistical activities; and protect the trust of information providers by ensuring confidentiality and exclusive statistical use of their responses…" (OMB, 2014, p.
From page 17...
... . More recent statements on data quality have kept the same seven basic dimensions of quality but have added guidelines for assessing the quality of integrated data sources (Federal Committee on Statistical Methodology, 2018, 2020; Statistics Canada, 2019, 2022; Eurostat, 2021; see also the review of international quality standards in Czajka & Stange, 2018)
From page 18...
... 18 ENHANCING SURVEY PROGRAMS BY USING MULTIPLE DATA SOURCES FIGURE 1-1  Dimensions of data quality. SOURCE: Federal Committee on Statistical Methodology (2020, p.
From page 19...
... When multiple data sources are combined, quality assessments must consider the quality of each source as well as the quality of the combined data. Paths for using multiple data sources, and possible implications for data quality, include:
• Using administrative records directly to give a picture of the population found in the administrative records system (see Chapter 4)
From page 20...
... Switching to administrative records or combined data sources may affect the time series for these indicators, and these potential effects need to be thoroughly investigated. The use of multiple data sources can help improve the quality of data collected in surveys, even if the data are not combined.
From page 21...
... -- Statistical agencies and researchers in the areas of income and health statistics have done extensive work on methods for linking survey and administrative records datasets. The panel decided to devote a workshop session to recent data-linkage projects involving income and health data that illustrate the current "state of the art" and show the potential for data linkage in other subject areas.
From page 22...
... The public virtual workshop on Implications of Using Multiple Data Sources for Major Survey Programs was held on May 16 and 18, 2022. The five sessions of the workshop were organized according to decisions outlined above, with an overview session followed by the use cases and a final session on data equity: 1.
From page 23...
... It also outlines some of the methods that can be used to combine data from multiple sources, such as linking data records, combining statistics from multiple sources, and using statistical models to predict values for missing data and to merge information from separate data sources. Chapter 3 introduces the key theme of data equity.
From page 24...
... Chapter 5 emphasizes the use of administrative data to study properties of income measurement, while Chapter 6 focuses on the ability to add data about health outcomes and expenditures to the records of survey participants. Chapter 7 discusses challenges in measuring crime as the Uniform Crime Reporting Program, which collects data on criminal offenses from law enforcement agencies, has migrated from a system that measured only counts of offenses to a system that records detailed information about the victims, offenders, and characteristics of incidents -- but with fewer law enforcement agencies providing data to the federal government.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.