FIGURE 3-1 Balancing data utility and disclosure risk.

SOURCE: Presentation of Stephen Cohen, SED Workshop, May 27, 2009. Adapted from Duncan, Keller-McNulty, and Stokes (2001).

a pledge of confidentiality, but it is even more complicated in the case of SED because of the nature of the data that are collected.

The survey collects a mixture of information on the education, socioeconomic characteristics, and plans of doctorate recipients. Some data are more sensitive than others. For example, some of the information collected would generally be considered public knowledge, such as the type and field of the doctoral recipient’s degree. This information is widely disclosed in university graduation programs, announcements, and publications. The information can also be readily gleaned from various databases of dissertations—such as ProQuest and—which identify the authors of dissertations, their institutions, and other possibly identifying information, such as the date of the dissertation.

Some of the information collected would be considered by many people to be very private and personal and thus sensitive. Such information includes birth year, citizenship status at graduation, country of birth and citizenship, disability status, graduate and undergraduate educational debt, postgraduation plans (e.g., work, postdoctorate or other study, training), sources of financial support during graduate school, and salary of next position. Because the individual responses are protected by NSF and the contracted collection organization, the National Opinion Research Center (NORC), the data are published only in tabular form at high levels of aggregation.

There would be little concern over the protection of sensitive informa-

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.