3 Meeting the Challenges
Pages 42-58

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 42...
... These strategies are very important for protecting respondent confidentiality in survey data under all circumstances, and especially when there is a high risk of identification due to the existence of precise geospatial attributes. At their heart, many of these strategies protect confidentiality by restricting access to the data, either by limiting access to those data users who explicitly guarantee not to reveal respondent identities or attributes or by requiring that data users work in a restricted environment so they cannot remove information that might reveal identities or attributes.
From page 43...
... For instance, the disclosure limitation program project, "Human Subject Protection and Disclosure Risk Analysis," at the Inter-university Consortium for Political and Social Research (ICPSR) has resources available for teaching about its findings and the best practices it has developed (see http://www.icpsr.umich.edu/HSP [April 2006]
From page 44...
... This tier of access is typically reserved for data files with little risk of disclosure and harm, such as those that include very small subsamples of a larger sample, that represent a tiny fraction of the population in a geographic area, that contain little or no geographic information, or that do not include any sensitive information. We are unaware of any cases for which this form of public access is allowed to files that combine social data with highly specific locational data, such as addresses or exact latitude and longitude.
From page 45...
... , the current policy is to introduce error into the data, destroy the original data, and release only the data that have been transformed. Data users consider it a burden to obtain limited licensing agreements, but both data stewards and users generally perceive them as successful because they combine an obligation for education and certification with relatively flexible access for datasets that present little risk of disclosure or harm.
From page 46...
... Although some researchers and universities are wary of these agreements, in recent years they have been seen as successful by most data users. Data distributors, however, continue to be fearful that their rules about data access are not being followed sufficiently closely or that sensitive data are under inadequate control.
From page 47...
... In addition to the cost passed on to users, the data stewards who maintain data enclaves bear considerable cost and space requirements. In sum, data enclaves are effective but inefficient and inequitable.
From page 48...
... For tabular data, as well as some microdata, one data limitation approach is cell suppression. The data steward essentially blanks out cells with small counts in tabular data or blanks out the values of identifiers or sensitive attributes in microdata.
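The cell suppression idea in this excerpt can be sketched in a few lines. This is a minimal illustration, not the procedure any particular data steward uses: the function name `suppress_small_cells`, the threshold of 5, and the example table are all assumptions made for the example.

```python
def suppress_small_cells(table, threshold=5):
    """Return a copy of a nested-dict table with small counts blanked out.

    `threshold` is a hypothetical cutoff: any count below it becomes None.
    """
    return {
        row: {
            col: (count if count >= threshold else None)
            for col, count in cols.items()
        }
        for row, cols in table.items()
    }


# Hypothetical counts of respondents by area and age group.
counts = {
    "tract_A": {"age_18_34": 12, "age_35_64": 3},
    "tract_B": {"age_18_34": 40, "age_35_64": 27},
}

suppressed = suppress_small_cells(counts, threshold=5)
print(suppressed)
```

If row or column totals are published alongside the table, a blanked cell can often be recovered by subtraction, so in practice additional (complementary) cells usually have to be suppressed as well. The sketch above does not attempt that.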
From page 49...
... For spatial data, stewards can aggregate spatial identifiers or attribute values or both, but the aggregation of spatial identifiers is especially important. Aggregating spatial attributes puts more than one respondent into a single spatial location, which may be a point (latitude-longitude)
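One simple way to aggregate point locations is to snap each latitude-longitude pair to a coarse grid cell and release only the cell and its respondent count. The sketch below assumes a square grid in degrees; the cell size, the function names, and the sample coordinates are illustrative choices, not anything specified in the text.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.1):
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def aggregate_points(points, cell_deg=0.1):
    """Count how many respondents fall into each grid cell."""
    return Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)


# Hypothetical respondent locations (latitude, longitude).
points = [(42.28, -83.74), (42.21, -83.79), (42.35, -83.74)]

cell_counts = aggregate_points(points, cell_deg=0.1)
print(cell_counts)
```

A larger `cell_deg` places more respondents in each cell, which lowers the chance that a cell identifies one person but also discards more spatial detail. Cells that still hold very few respondents would need further treatment.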
From page 50...
... Data Alteration. Spatial attributes are useful in linked social-spatial data because they precisely record where an aspect of a respondent's life takes place. Sometimes these spatial data are collected at the moment that the original social survey data are collected.
From page 51...
... Swapping spatial identifiers thus is better suited for limiting disclosures of respondents' attributes than their identities. Swapping may not reduce -- and probably increases -- the risk of mistaken attribute disclosures from incorrect identifications.
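As a rough illustration of swapping, the sketch below exchanges the spatial identifier between randomly chosen pairs of records while leaving every other attribute in place. The field names (`tract`, `income`), the swap fraction, and the fixed seed are assumptions for the example; production swapping schemes typically restrict swaps to records that match on selected characteristics.

```python
import random


def swap_locations(records, key="tract", fraction=0.5, seed=0):
    """Return copies of the records with `key` swapped between random pairs.

    `fraction` is the approximate share of records that take part in a swap.
    """
    rng = random.Random(seed)
    swapped = [dict(r) for r in records]  # leave the input untouched
    n_pairs = int(len(swapped) * fraction) // 2
    chosen = rng.sample(range(len(swapped)), 2 * n_pairs)
    for i, j in zip(chosen[::2], chosen[1::2]):
        swapped[i][key], swapped[j][key] = swapped[j][key], swapped[i][key]
    return swapped


# Hypothetical records: one spatial identifier and one sensitive attribute.
records = [
    {"id": 1, "tract": "A", "income": 31000},
    {"id": 2, "tract": "B", "income": 58000},
    {"id": 3, "tract": "C", "income": 44000},
    {"id": 4, "tract": "D", "income": 72000},
]

swapped = swap_locations(records, fraction=1.0)
print(swapped)
```

The set of released locations is unchanged, so area-level counts are preserved, but a record's attributes may now sit at another respondent's location. That is why a user who "identifies" someone by location may attach the wrong attributes to them.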
From page 52...
... Interpolated Geocoding. Interpolated geocoding estimates the precise location of an address along a street segment, typically defined between street intersections, on a proportional basis. This approach relies on the use of a geographic base file (GBF)
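The proportional placement described here amounts to linear interpolation along the segment. The sketch below assumes a straight segment with a known address range and endpoint coordinates; the function name and arguments are hypothetical, and real geocoders also handle odd and even street sides, offsets from the centerline, and curved segments.

```python
def interpolate_address(house_number, from_addr, to_addr, start, end):
    """Estimate (lat, lon) for a house number along a straight street segment.

    from_addr/to_addr: address numbers at the two ends of the segment.
    start/end: (lat, lon) coordinates of those two ends.
    """
    if to_addr == from_addr:
        return start
    # Fraction of the way along the segment, clamped to [0, 1].
    t = (house_number - from_addr) / (to_addr - from_addr)
    t = min(max(t, 0.0), 1.0)
    lat = start[0] + t * (end[0] - start[0])
    lon = start[1] + t * (end[1] - start[1])
    return (lat, lon)


# Hypothetical segment covering addresses 100 to 200.
location = interpolate_address(150, 100, 200, (0.0, 0.0), (0.0, 1.0))
print(location)
```

Because the estimate depends only on the address range, two houses with adjacent numbers receive adjacent points whether or not the actual buildings are evenly spaced.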
From page 53...
... This approach may reduce disclosure risks. Parcel Geocoding. Parcel geocoding makes use of new cadastral information systems that have been implemented in many communities.
From page 54...
... , which allows different data stewards to compute the exact values of sums without sharing their values. One variant, used at the National Center for Education Statistics, provides public data on a diskette or CD-ROM that is encoded to allow users to construct special tabulations while preventing them from seeing the individual-level data or calculating totals when there are fewer than 30 respondents in a cell.
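The excerpt does not say which protocol is meant, but one common way to compute an exact sum without any steward revealing its own value is additive secret sharing. The sketch below simulates that technique in a single process under stated assumptions: the modulus, the function names, and the example values are all illustrative, and inputs are assumed to be non-negative integers smaller than the modulus.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical modulus, chosen larger than any real total


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n random-looking shares that sum to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_values, modulus=MODULUS):
    """Simulate the protocol: each party only ever sees shares, never values."""
    n = len(private_values)
    # Party i splits its value and sends share j to party j.
    all_shares = [make_shares(v, n, modulus) for v in private_values]
    # Party j adds up the shares it received and publishes that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # The published partial sums add up to the true total.
    return sum(partial_sums) % modulus


# Hypothetical counts held by three separate data stewards.
total = secure_sum([120, 45, 300])
print(total)
```

Each individual share is uniformly random, so a party that sees only the shares sent to it learns nothing about another party's value. The total itself is still released, so small totals carry the same disclosure concerns as any other small cell.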
From page 55...
... However, the concept underpinning these techniques -- to allow users to perform computations with the data without actually seeing the data -- may point to solutions for sharing social and spatial data. Data Simulation. Data providers may also release synthetic (i.e., simulated)
From page 56...
... To reduce dependency on data generation models, Little (1993) suggests a variant of the fully synthetic data approach called partially synthetic data.
From page 57...
... Partially synthetic datasets present greater disclosure risks than fully synthetic ones: the originally sampled units remain in the released files, albeit with some values changed, leaving values that analysts can use for record linkages. Currently, for either fully or partially synthetic data, there are no semiautomatic data synthesizers.
From page 58...
... This approach introduces error into matches obtained by linking the partially synthetic records to records in other datasets. Alternatively, simulating selected attributes reduces attribute risks without disturbing the identifiers: this enables linking, but it does not prevent identity disclosures.
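The second option in this excerpt, simulating selected attributes while leaving identifiers untouched, can be illustrated with a deliberately simple model. The sketch below replaces one sensitive attribute with draws from a normal distribution fitted to the observed values; the field names, the choice of a normal model, and the fixed seed are assumptions for the example, and real partially synthetic data would use a richer model conditioned on other variables, usually with several released copies.

```python
import random
import statistics


def synthesize_attribute(records, attribute, seed=0):
    """Return copies of the records with `attribute` replaced by simulated values.

    All other fields, including identifiers and locations, are left unchanged.
    """
    rng = random.Random(seed)
    values = [r[attribute] for r in records]
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [{**r, attribute: rng.gauss(mu, sigma)} for r in records]


# Hypothetical records with a spatial identifier and a sensitive attribute.
survey = [
    {"id": 1, "tract": "A", "income": 31000.0},
    {"id": 2, "tract": "B", "income": 58000.0},
    {"id": 3, "tract": "C", "income": 44000.0},
]

synthetic = synthesize_attribute(survey, "income")
print(synthetic)
```

The released rows still carry the original `id` and `tract`, so they can be linked to other files, which is exactly why this variant limits attribute disclosure but not identity disclosure.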


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.