procedures; they are fielded at the same time and they use the same reference period.1 They have been designed to provide coverage of the same target population: noninstitutionalized individuals residing in the United States, under 75 years of age, with a bachelor’s or higher degree, and educated or working in science and engineering (S&E) and related fields and occupations. Scientists and engineers are those who hold a bachelor’s or higher degree in an S&E or S&E-related field or who have a bachelor’s or higher degree in a non-S&E field but have an S&E or S&E-related occupation. Special emphasis in the surveys is given to relatively rare populations, such as doctorates, recent graduates, minorities, and people with disabilities.
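The target population definition above can be read as a simple eligibility test. The following is a minimal sketch of that reading, not NSF's actual procedure; the `Person` class and all of its field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # All field names are hypothetical; they mirror the criteria in the text.
    age: int
    resides_in_us: bool
    institutionalized: bool
    has_bachelors_or_higher: bool
    degree_in_se_or_related: bool      # holds such a degree in an S&E or S&E-related field
    occupation_in_se_or_related: bool  # works in an S&E or S&E-related occupation


def in_target_population(p: Person) -> bool:
    """Apply the criteria as stated: noninstitutionalized, U.S. resident,
    under 75, bachelor's or higher, and educated OR working in S&E."""
    if p.institutionalized or not p.resides_in_us or p.age >= 75:
        return False
    if not p.has_bachelors_or_higher:
        return False
    return p.degree_in_se_or_related or p.occupation_in_se_or_related


# A non-S&E graduate working in an S&E occupation still qualifies.
historian_turned_programmer = Person(40, True, False, True, False, True)
# An S&E graduate aged 75 falls outside the age limit.
retired_engineer = Person(75, True, False, True, True, False)
```

Under this reading, the degree-field and occupation conditions are alternatives, so either one is enough once the residency, age, and degree-level conditions are met.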

All cases that qualify as scientists and engineers according to the SESTAT target population definition are integrated into a comprehensive database, the SESTAT integrated file, of all college-educated scientists and engineers in the United States. Because a person may be eligible for inclusion in more than one of the surveys, the National Science Foundation (NSF) uses a statistical integration process to ensure that each person is counted only once.2 The integrated file is used to produce national estimates of the number and characteristics of scientists and engineers in the United States.

The SESTAT surveys are unique in the federal system in that they compile detailed occupational, educational, and demographic data in one database. The complete educational histories that are collected for each person allow for a detailed examination of the relationship between education and career outcomes.

The SESTAT surveys are conducted every 2 to 3 years and are designed primarily to provide cross-sectional time-series data. However, the surveys gained an important new analytical dimension when SESTAT individual data were assembled into longitudinal files covering the period from 1993 to 1999. The history of the SESTAT Program and the interrelationships among the component surveys are shown in Figure 2-1.


For further information on SESTAT, see; for NSCG, see; for NSRCG, see; and for SDR, see [accessed April 2008].


The statistical integration process uses a unique linkage rule. Each survey is weighted according to the frame developed for that survey, and a series of overlap variables is calculated to identify cases that are eligible for more than one survey. To remove these multiple selection opportunities, each case in the SESTAT target population is uniquely linked to one and only one component survey, and that individual is included in the SESTAT integrated file only when he or she is selected for that linked survey.
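The linkage rule described above can be illustrated with a short sketch. It is an assumption-laden illustration, not the NSF implementation: the `Case` class, the `LINK_PRIORITY` order, and the sample weights are all hypothetical. In particular, the text says only that each case is linked to exactly one survey; it does not say which survey takes precedence.

```python
from dataclasses import dataclass

# Hypothetical priority order, for illustration only: the text does not
# state which survey a multiply-eligible case is linked to.
LINK_PRIORITY = ("SDR", "NSRCG", "NSCG")


@dataclass(frozen=True)
class Case:
    person_id: str
    eligible_for: frozenset  # overlap variables: surveys that could have selected this person
    selected_in: str         # survey that actually sampled this case
    weight: float            # weight from that survey's own frame (made-up values below)


def linked_survey(case: Case) -> str:
    """Return the single survey this case is linked to (first match in LINK_PRIORITY)."""
    for survey in LINK_PRIORITY:
        if survey in case.eligible_for:
            return survey
    raise ValueError(f"{case.person_id} is not eligible for any component survey")


def integrate(cases):
    """Keep a case only when it was selected by its linked survey."""
    return [c for c in cases if c.selected_in == linked_survey(c)]


cases = [
    Case("A", frozenset({"SDR", "NSCG"}), "NSCG", 120.0),  # linked to SDR, sampled by NSCG
    Case("B", frozenset({"SDR", "NSCG"}), "SDR", 15.0),    # linked to SDR, sampled by SDR
    Case("C", frozenset({"NSCG"}), "NSCG", 200.0),         # eligible for one survey only
]
integrated = integrate(cases)
kept_ids = [c.person_id for c in integrated]
```

In this sketch, case A is excluded because it was sampled through a survey other than the one it is linked to, which removes its second selection opportunity. Cases B and C are retained with the weights from their own survey frames.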

