the lead author of the revised version of Working Paper 22—provided an assessment of the current status of research and development on methods to provide access and limit disclosure risk in the federal government. These emerging models include research data centers (RDCs), which permit onsite use of confidential files in a closely delimited area with specialized equipment and extreme security; systems of remote access over secure electronic lines to dedicated computers; fellowships and postdoctoral programs, in which researchers can be treated as agency employees, permitting a less restrictive form of access; and use of confidential data offsite but under highly restricted conditions, as spelled out in a legally binding agreement, such as a license. The emerging role of public query systems for accessing tabular data and microdata was also discussed.

Bournazian began by stating his overall assessment that the data aggregation approach selected by NSF is compatible with both user needs and future growth in accessing data. However, he cautioned that any disclosure limitation approach adopted today must be designed with public database query systems in mind, and that NSF may need to develop a restricted access model to complement the application of the data aggregation approach for tabular data.

The issues surrounding the application of different disclosure limitation methods are appropriately considered at the design stage, rather than at the back end of the system, and the approach ultimately chosen may have ramifications for future data release strategies, he observed. To the extent that some users are not satisfied with the access afforded under the scheme selected to protect the data, NSF may need to offer restricted access to the microdata.

Risk Assessment

The selection of the appropriate disclosure limitation methods, Bournazian suggested, should be based on the results of a formal risk assessment.1 The information that could be disclosed by the table design using prior releases is the appropriate basis for the assessment. It is important for NSF to look at the risk of reidentification of individuals in small cell counts by matching files to the tabular data. If there are only a few identified disclo-

1 In this report, risk is defined as the likelihood that a disclosure will occur, and a risk assessment is the calculation of the probability that an identity could be associated with a data item.
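To make the idea of small-cell risk concrete, the following is a minimal sketch, not a method described in the workshop. It assumes a hypothetical threshold of 3 for a "small" cell and uses a deliberately naive risk measure, 1 divided by the cell count, as a stand-in for the probability that an identity could be associated with a data item. The function name `assess_cells`, the threshold, and the table values are all invented for illustration.

```python
from typing import Dict, Tuple

Cell = Tuple[str, str]  # (row category, column category); hypothetical layout


def assess_cells(counts: Dict[Cell, int], threshold: int = 3) -> Dict[Cell, float]:
    """Flag nonempty cells with fewer than `threshold` respondents.

    The returned value for each flagged cell is a naive risk estimate,
    1 / count: the chance of picking the right person if an intruder
    knows only that the target falls in that cell. The threshold and
    the risk measure are assumptions made for this sketch.
    """
    risky: Dict[Cell, float] = {}
    for cell, n in counts.items():
        if 0 < n < threshold:
            risky[cell] = 1.0 / n
    return risky


# Made-up cell counts, for illustration only.
table = {
    ("Physics", "Region A"): 42,
    ("Physics", "Region B"): 2,
    ("Chemistry", "Region A"): 1,
    ("Chemistry", "Region B"): 17,
}

flagged = assess_cells(table)
for cell, risk in flagged.items():
    print(cell, risk)
```

A fuller assessment of the kind described here would also account for information available from prior releases and from external files that could be matched to the table, which this sketch ignores.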


