APPENDIX C
1991 SURVEY METHODOLOGY
Sample Design
The sampling frame for the Survey of Doctorate Recipients (SDR), comprising the Survey of Humanities Doctorates and the Survey of Doctoral Scientists and Engineers, is compiled from the Doctorate Records File (DRF), an ongoing census of all research doctorates earned in the United States since 1920. For the 1991 Survey of Humanities Doctorates, the sampling frame was selected from the DRF to include individuals who--
- had earned a doctoral degree from a U.S. college or university in a humanities field;
- were U.S. citizens or, if non-U.S. citizens, indicated they had plans to remain in the United States after degree award; and
- were under 76 years of age.
To develop the frame, graduates who had earned their degrees since the 1989 survey and met the conditions listed above were added to the frame, and those who were carried over from 1989 but had attained the age of 76 (or died) were deleted. A sample of the incoming graduates was selected and added to the panel sample to form the total sample.
However, after the 1991 sample had been selected, budget constraints made it necessary to reduce its size by about 50 percent (the cost savings were redirected toward obtaining a higher response rate); the humanities sample was reduced from an initial size of 17,716 to 8,894.[8]
The basic sample design for the 1991 SDR was a stratified random sample with the goal of 70 as the minimum number of cases selected in each sampling cell. This minimum helped ensure that there were sufficient cases to publish estimates of small subgroups. The variables used for stratification were 11 selected fields of degree, 2 genders, and 2 cohort groupings (year of degree), resulting in 44 sampling cells.[9] The sampling rates in each cell were the product of the initial sampling rate (prior to reduction) and the subsampling rate (applied to achieve the reduction). The population of 105,715 was sampled at an overall rate of 8.2 percent.

[8] Because a higher response rate was achieved in 1991, the effective sample size was reduced by only 23 percent.

[9] The initial 1991 sampling frame was stratified into 879 cells according to a different set of variables. The sample reduction goals included restratifying the sample into fewer sampling cells that reflected current analytic needs.
Data Collection
The goal of the 1991 data collection plan was to maximize the response rate using the most cost-effective measures. These measures related to the two primary causes of nonresponse in the SDR: (1) failure to locate sample members, and (2) failure to gain cooperation from those who were located. Because the SDR is longitudinal--and people change residences and jobs--contact is lost with a certain proportion of sample cases between survey years. At the start of the 1991 survey, this proportion was estimated to be about 5 percent of the sample. However, with assistance from alumni offices and private address vendors, this percentage was reduced to about 2.5 percent prior to the first mailing.
Data collection consisted of two phases: a self-administered mail survey, followed by computer-assisted telephone interviewing (CATI) among a sample of the nonrespondents to the mail survey. The mail survey consisted of three mailings of the survey questionnaire, with a reminder postcard between mailings 1 and 2. The first mailing was sent in October 1991, and the other two in December 1991 and January 1992. In order to encourage participation, all survey materials were personalized with the respondent's name and address. In addition, the survey questionnaires were reformatted in a more “respondent friendly” design than that of earlier years. The mail survey achieved a response rate of about 63 percent.
Phase 2--telephone interviewing--was conducted with about 60 percent of the nonrespondents to the mail survey. This activity was subcontracted by the National Research Council to Mathematica Policy Research (MPR) in Princeton, New Jersey. Of the nonrespondents, MPR located telephone numbers for about 90 percent and completed interviews with 71 percent. CATI was conducted between March and July 1992.
Data Preparation
As completed mail questionnaires were received, they were logged into a receipt control system that kept track of the status of all cases. Coders then carried out a variety of checks and prepared the documents for data entry. Specifically, they resolved incomplete or contradictory answers, imputed missing answers if logically appropriate, reviewed “other, specify” responses for possible backcoding to a listed response, and assigned numeric codes to open-ended questions (about employer name, for example). A coding supervisor validated the coders' work.
Once cases were coded, they were sent to data entry. The data entry program ensured that only values within allowable ranges were entered and that built-in consistency checks were not violated. For example, a case in which a respondent reported unemployment but later listed an employer's name was flagged for review.
The same consistency and range checks, together with the editing and coding rules, were applied to the CATI data. (Because CATI data are keyed directly to disk during the interview, the data entry step is eliminated.) CATI data were then recoded to match the structure and format of the mail data, and the two files were combined. Further computer checks were performed to test for inconsistent values, corrections were made, and the process was repeated until no inconsistencies remained.
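The kind of range and consistency checking described above can be sketched as follows. This is a minimal illustration, not the actual survey software; the field names and codes are hypothetical.

```python
# Sketch of range and consistency checks of the kind described in the text.
# Field names and employment codes are illustrative assumptions.

ALLOWED_EMPLOYMENT_CODES = {1, 2, 3}  # e.g., 1 = employed, 2 = unemployed, 3 = retired

def check_case(case):
    """Return a list of problems found in one survey case (dict of responses)."""
    problems = []
    status = case.get("employment_status")
    # Range check: only values within the allowable set may be entered.
    if status not in ALLOWED_EMPLOYMENT_CODES:
        problems.append("employment_status out of range")
    # Consistency check: a respondent reporting unemployment should not
    # also list an employer's name (the example flagged in the text).
    if status == 2 and case.get("employer_name"):
        problems.append("unemployed but employer name given")
    return problems

flagged = check_case({"employment_status": 2, "employer_name": "State University"})
clean = check_case({"employment_status": 1, "employer_name": "State University"})
```

In the production system such flags would route a case back for review rather than simply reporting them.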
Weighting and Estimation
The general purpose of weighting survey data is to compensate for unequal probabilities of selection to the sample and to adjust for the effects of nonresponse (see the section in this appendix on Reliability of the 1991 Survey Estimates for a discussion of nonresponse). Weights are often calculated in two stages. In the first stage, unadjusted weights are calculated as the inverse of the probability of selection, taking into account all stages of the sampling selection process. In the second stage, these weights are adjusted to compensate for nonresponse; such nonresponse adjustments are typically carried out separately within multiple weighting cells.
The first step in constructing a basic weight for the 1991 SDR sample cases involved developing a design weight that reflected the selection probabilities for each case. Because the 1991 initial sample was reduced through subsampling, cases selected for the 1991 initial sample were each assigned a 1991 initial design weight (DWGT) based on their probability of selection to the sample. The 1991 initial design weight does not adjust for nonresponse. This weight was then multiplied by the inverse of the case's probability of selection to the 1991 reduced sample; the latter probability took into account the subsampling done to reduce the 1991 initial sample. More formally, the basic weight (BSCWGT) for the ith case is defined as
BSCWGTi = DWGTi * (1/Pi),
in which Pi represents the probability of selection for the 1991 reduced sample. BSCWGT is the basic weight for the mail respondents.
For the mail “nonrespondent” cases, a further subsampling step was done to determine the cases to be followed up by CATI. The subsampling was done in 11 groups of cases. The selection of the nonrespondent subsample was done independently of the 1991 SDR design. Therefore, the basic weight (BSCWGTC) for the ith CATI case can be defined as
BSCWGTCi = BSCWGTi * (1/P'i),
where P'i represents the probability of selection for the CATI subsample.
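The two weight definitions above can be sketched in code. The probabilities used here are hypothetical; in the actual survey they come from the sample design and the CATI subsampling plan.

```python
# Minimal sketch of the basic-weight computations described in the text.
# Probabilities are illustrative assumptions, not actual design values.

def basic_weight(design_weight, p_reduced):
    """BSCWGT_i = DWGT_i * (1 / P_i), where P_i is the probability of
    selection to the 1991 reduced sample."""
    return design_weight * (1.0 / p_reduced)

def cati_basic_weight(bscwgt, p_cati):
    """BSCWGTC_i = BSCWGT_i * (1 / P'_i), where P'_i is the probability of
    selection to the CATI nonrespondent subsample."""
    return bscwgt * (1.0 / p_cati)

# A case with initial design weight 10 kept with probability 0.5 in the reduction:
bscwgt = basic_weight(10.0, 0.5)          # 20.0
# If that case was a mail nonrespondent subsampled for CATI at probability 0.6:
bscwgtc = cati_basic_weight(bscwgt, 0.6)
```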
The next stage was to adjust the 1991 basic weight for nonresponse. Nonresponse adjustment cells were created using poststratification. Within each nonresponse adjustment cell, a weighted response rate, which took into account both mail and CATI nonresponse, was calculated. The nonresponse adjustment factor for each cell is the inverse of this weighted response rate. The initial set of nonresponse adjustment factors was examined and, under certain conditions, some of the cells were collapsed. Let ƒ be the final adjustment factor for a given cell. Then the final weights for the mail and CATI respondents are given by
FINWGTM = BSCWGT * (ƒ)
and
FINWGTC = BSCWGTC * (ƒ),
respectively.
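The nonresponse adjustment within a single cell can be sketched as follows, using hypothetical basic weights. Note that multiplying respondent weights by ƒ restores the cell's total weight.

```python
# Sketch of the cell-level nonresponse adjustment described in the text.
# The weights are hypothetical illustrations.

def adjustment_factor(respondent_weights, all_weights):
    """f = 1 / (weighted response rate) for one adjustment cell."""
    weighted_response_rate = sum(respondent_weights) / sum(all_weights)
    return 1.0 / weighted_response_rate

# A cell with four sampled cases (basic weights), of which three responded:
all_w = [20.0, 20.0, 30.0, 30.0]
resp_w = [20.0, 30.0, 30.0]
f = adjustment_factor(resp_w, all_w)      # 100 / 80 = 1.25
final_weights = [w * f for w in resp_w]   # FINWGT = BSCWGT * f
```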
Because the weights that resulted from this computation process were not always integer weights, respondents in each cell were assigned a weight that was equal to either the integral part of the cell's final weight or the integral part plus one. Allocation of integer weights within a cell was made at random so as to represent the cell population. Estimates in this report were developed by summing the final integer weights of the respondents selected for each analysis.
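This randomized rounding can be sketched as below: each case receives the integral part of its weight, plus one with probability equal to the fractional part, so cell totals are preserved in expectation. The weights here are hypothetical.

```python
# Sketch of randomized integer rounding of weights, as described in the text.
import math
import random

def integerize(weight, rng=random):
    """Return floor(weight) or floor(weight) + 1, rounding up with
    probability equal to the fractional part of the weight."""
    base = math.floor(weight)
    frac = weight - base
    return base + (1 if rng.random() < frac else 0)

rng = random.Random(0)
weights = [12.3] * 10000               # hypothetical cell weights
total = sum(integerize(w, rng) for w in weights)
# total is close to 123,000, the (non-integer) sum of the original weights
```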
Reliability of the 1991 Survey Estimates
Because the estimates shown in this report are based on a sample, they may vary from those that would have been obtained if all members of the target population had been surveyed (using the same questionnaire and data collection methods). Two types of error are possible when population estimates are derived from measures of a sample: nonsampling error and sampling error. By looking at these errors, it is possible to estimate the accuracy and precision of the survey results. Potential sources of nonsampling error in the 1991 SDR are discussed below, followed by a discussion of sampling error--how it is estimated and how it can be used in interpreting the survey results.
Nonsampling Error
Nonsampling errors in surveys can arise at many points in the survey process; they take different forms:
- Coverage errors can occur when some members of the target population are not identified and therefore do not have a chance to be selected to the sample.
- Nonresponse errors can occur when some or all of the survey data are not collected in a survey year.
- Response errors can occur either when the wrong individual completes the survey or when the correct individual cannot accurately recall the events being questioned. Response errors can also arise from deliberate misreporting or poor question wording that leaves room for inconsistent interpretation by respondents.
- Processing errors can occur at the point of data editing, coding, or key entry.
Little information exists on the magnitude of nonsampling error in the SDR. Coverage errors are likely to be minimal, because the Doctorate Records File (the sampling frame for the SDR) is considered a complete census.[10] However, response errors may have occurred during the CATI phase, when respondents were asked in March to recall their work activities the previous September, a full 6 months earlier--although this type of error has never been studied. Likewise, no information exists on the consistency of coding and editing over time or within a survey year.
However, the largest potential source of nonsampling error--nonresponse--can be examined by looking at the overall response rate as well as at response rates by subgroups. Nonresponse bias is defined as “the bias or systematic distortion in survey estimates occurring because of the inability to obtain a usable response from some members of the sample.”[11] Nonresponse bias is concerned with the “representativeness” of the respondents, that is, with how the respondents' characteristics compare with those of the population from which they were chosen. If the respondents do not accurately represent the population, the resulting population estimates will be inaccurate.
Table C-1 shows the overall response rate and response rates by subgroups (both weighted and unweighted).[12] The overall weighted response rate was 87.6 percent, a rate sufficiently high for confidence that the effects of nonresponse bias are minimal, at least on estimates of the total population. By field of degree, weighted response rates ranged from 85.1 percent (doctorates in “other” modern languages and literature) to 91.9 percent (doctorates in music). These differences are not extreme, and they suggest that estimates by field are not likely to be biased by nonresponse. Likewise, subgroups defined by cohort and sex are probably not affected by nonresponse bias, as evidenced by the high observed response rates (ranging from 86.7 to 92.1 percent) and the small range in response rates among these subgroups.
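The weighted and unweighted response rates discussed here (usable responses over in-scope sample cases) can be sketched as follows, with hypothetical weights.

```python
# Sketch of weighted vs. unweighted response-rate calculations.
# The cases and weights are hypothetical illustrations.

def response_rates(cases):
    """cases: list of (weight, responded) tuples for in-scope sample cases.
    Returns (unweighted_rate, weighted_rate)."""
    unweighted = sum(1 for _, responded in cases if responded) / len(cases)
    weighted = (sum(w for w, responded in cases if responded)
                / sum(w for w, _ in cases))
    return unweighted, weighted

# Two lightly weighted respondents plus one heavily weighted respondent
# and one heavily weighted nonrespondent:
cases = [(10.0, True), (10.0, True), (40.0, True), (40.0, False)]
unweighted, weighted = response_rates(cases)  # 0.75 unweighted, 0.60 weighted
```

The gap between the two rates in this toy example is why the weighted rate is the better indicator of potential nonresponse bias.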
[10] See P. Ries and D. H. Thurgood, Summary Report 1992: Doctorate Recipients from United States Universities (Washington, D.C.: National Academy Press, 1993), p. v.

[11] Judith T. Lessler and William D. Kalsbeek, Nonsampling Error in Surveys (New York: Wiley, 1992), p. 118.

[12] Response rates were calculated by dividing the number of usable responses by the number of in-scope sample cases. Weighted response rates take into account the unequal probabilities of selection to the sample and show what the response rate might have been if everyone in the population had been surveyed. Weighted response rates indicate the potential for nonresponse bias in the survey estimates, and unweighted response rates indicate how successful the data collection protocol was in getting responses.
Sampling Error
Sampling error is the variation that occurs by chance because a sample, rather than the entire population, is surveyed. The particular sample that was used to estimate the 1991 population of humanities doctorates in the United States was one of a large number of samples that could have been selected using the same sample design and size. Estimates based on each of these samples would have differed.
Standard errors indicate the magnitude of the sampling error that occurs by chance because a sample rather than the entire population was surveyed. Standard errors are used in conjunction with a survey estimate to construct confidence intervals--bounds set around the survey estimate in which, with some prescribed probability, the average estimate from all possible samples would lie. For example, approximately 95 percent of the intervals from 1.96 standard errors below the estimate to 1.96 standard errors above the estimate would include the average result of all possible samples.[13] With a single survey estimate, the 95 percent confidence limit implies that if the same sample design were used over and over again, with confidence intervals determined each time from each sample, 95 percent of the time the confidence interval would enclose the true population value.
The number of survey estimates in the SDR for which standard errors might have been estimated was extremely large because of the number of variables measured, the number of subpopulations, and the values--totals, percentages, and medians--that were estimated. The direct calculation of standard error estimates from the raw data for each estimate was not feasible within time and cost limitations. Instead, a method was used for generalizing standard error values from a subset of survey estimates that characterize the population, allowing application to a wide variety of survey estimates.
This method computes the variances associated with selected variables and uses these estimates to develop values of a and b parameters (regression coefficients) for use in generalized variance functions that estimate the standard errors associated with a broader range of totals and percentages.[14] The base a and b parameters are shown in Table C-2. These parameters were used to generate the tables of approximate standard errors shown on pp. 63-65. The use of these tables is described below, together with an alternative method for approximating the standard errors more directly.
Standard Errors of Estimated Totals
Table C-3 and Table C-4 show approximate standard errors for the humanities doctoral population overall, for field groupings used in the report (e.g., American history, philosophy), and for gender by field. The standard errors shown in the tables were calculated using the appropriate values of a and b, along with the formula for standard errors of totals:

SE(x) = √(ax² + bx),     (1)

where x is the total. Resulting values were rounded to the nearest multiple of 10. The illustration below shows how to use the tables to determine the standard errors of estimates shown in the report.

[13] Approximately 90 percent of the intervals from 1.64 standard errors above and below the estimate would include the average result of all possible samples; or, if more precision is required, approximately 99 percent of the intervals from 2.58 standard errors above and below the estimate would include the average result of all possible samples.

[14] Consideration of the complex sample design and estimation procedure of the 1991 SDR suggested that a balanced replication procedure (with 16 replicates) be used for calculating the a and b parameters.
Illustration. If the number of speech/theater Ph.D.s employed in academic institutions was reported at 3,200 and one wishes to determine the approximate standard error, one can use the values shown in Table C-3 for estimated numbers of 2,500 and 5,000 in the total (“All Fields”) column (230 and 320, respectively) and, through linear interpolation, calculate 255 as the approximate standard error of the estimate of 3,200 as follows:

230 + [(3,200 - 2,500) / (5,000 - 2,500)] × (320 - 230) = 230 + (0.28)(90) ≈ 255
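The interpolation between the two tabled values can be sketched as:

```python
# Linear interpolation between the Table C-3 values cited in the text:
# SE(2,500) = 230 and SE(5,000) = 320, evaluated at x = 3,200.

def interpolate(x, x0, y0, x1, y1):
    """Linearly interpolate between (x0, y0) and (x1, y1) at x."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

se = interpolate(3200, 2500, 230, 5000, 320)  # 230 + 0.28 * 90 = 255.2
```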
On the other hand, using the values of a and b for speech/theater Ph.D.s from Table C-2 and formula (1), one can also calculate the approximate standard error more directly:

SE(3,200) = √[(-0.000199)(3,200)² + (21.6904)(3,200)] ≈ 259
To develop a 95 percent confidence interval around this estimate of 3,200, one would add and subtract from the estimate the standard error multiplied by 1.96. This means that the average estimate from all possible samples would be expected 95 times out of 100 to fall within the range of
3,200 ± (1.96 × 259) = 2,692 to 3,708
This range of 2,692 to 3,708 represents the 95 percent confidence interval for the estimated number of 3,200.
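The direct calculation and the confidence interval can be reproduced with a short script, using the speech/theater parameters from Table C-2 (a = -0.000199, b = 21.6904):

```python
# Formula (1): SE(x) = sqrt(a*x^2 + b*x), applied to the illustration above.
import math

def se_total(x, a, b):
    """Approximate standard error of an estimated total x."""
    return math.sqrt(a * x**2 + b * x)

x = 3200
se = se_total(x, -0.000199, 21.6904)  # about 259.6
lower = x - 1.96 * se                 # about 2,691
upper = x + 1.96 * se                 # about 3,709
```

Using the rounded standard error of 259, as the text does, gives the 2,692 to 3,708 interval shown above.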
Standard Errors of Estimated Percentages
Percentages are another type of estimate given throughout the report. The standard error of a percentage may be approximated using the formula

SE(p) = √[(b/y) × p(100 - p)],     (2)

where x is the numerator of the percentage, y is the denominator of the percentage, p is the percentage (0 < p < 100, with p = 100x/y), and b is the b parameter from Table C-2. Tables of standard errors of estimated percentages were derived using this formula and are shown in Tables C-5 and C-6. These tables display each of the broad fields covered in the report and the female subpopulation within each field to illustrate the differences for subpopulations. Formula (2) may be used to calculate the standard errors of percentages not shown in the tables.
Illustration. Suppose the total number of women doctorates in the U.S. labor force was reported as 29,100 and the number of women employed part-time was reported at 4,800. The proportion of women employed part-time would be approximately 16.5 percent. Table C-6 shows the approximate standard error of a 15 percent characteristic on a base of 25,000 to be 1.0. Alternatively, using the appropriate value of b from Table C-2 (18.9936 for all women) and formula (2), the standard error of p is determined as follows:

SE(16.5) = √[(18.9936 / 29,100) × (16.5)(100 - 16.5)] ≈ 0.949
To develop a 95 percent confidence interval around this estimate of 16.5 percent, one would add and subtract from the estimate the standard error multiplied by 1.96. That is, the average estimate from all possible samples would be expected 95 times out of 100 to fall within the range
16.5 ± (1.96 × .949) = 14.6 to 18.4
The range of 14.6 to 18.4 represents the 95 percent confidence interval for the estimated percent of 16.5.
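Formula (2) and this confidence interval can be checked with a few lines of code, using the illustration's values (b = 18.9936 for all women from Table C-2, y = 29,100, p = 16.5):

```python
# Formula (2): SE(p) = sqrt((b / y) * p * (100 - p)), with p in percent.
import math

def se_percent(p, y, b):
    """Approximate standard error of an estimated percentage p on base y."""
    return math.sqrt((b / y) * p * (100 - p))

se = se_percent(16.5, 29100, 18.9936)           # about 0.95
interval = (16.5 - 1.96 * se, 16.5 + 1.96 * se) # roughly 14.6 to 18.4
```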
Limitations of the Standard Error Estimates
As mentioned, the standard error estimates provided in this report were derived from generalized functions based upon a limited set of characteristics (or survey estimates). While this method provides a good approximation of the standard errors associated with most survey results, it may overstate the error associated with estimates drawn from strata with high sampling fractions. However, the only way to avoid this overstatement is to calculate the standard errors directly from the raw data, forgoing the practical, more widely applicable generalized method.
TABLE C-1 Response Rates by Summary Strata (Field, Cohort, and Gender), 1991
TABLE C-2 Listing of a and b Parameters (Select Groups in Humanities Fields), 1991
| Field of Doctorate | Parameter | Total | Women | Whites | Asians | Blacks | Native Americans | Minority Combined | Hispanic | Foreign |
|---|---|---|---|---|---|---|---|---|---|---|
| Total, All Fields | a | -0.000199 | -0.000519 | -0.000199 | 0.009131 | -0.003005 | 0.020942 | -0.000833 | -0.001162 | -0.004549 |
| | b | 21.6904 | 18.9936 | 21.8340 | 17.7057 | 22.3356 | 12.1452 | 18.8179 | 12.8045 | 17.7517 |
| American History | a | -0.003591 | -0.021731 | -0.002012 | 0.009131* | 0.140529 | 0.020942* | 0.119762 | 0.042981 | -0.004549* |
| | b | 26.1841 | 29.6630 | 20.7618 | 17.7057* | 1.2347 | 12.1452* | 2.0103 | 4.2848 | 17.7517* |
| “Other History” | a | -0.000199* | -0.000044 | -0.000199* | 0.012206 | -0.022305 | 0.020942* | -0.015350 | -0.056858 | -0.004549* |
| | b | 21.6904* | 9.9658 | 21.8340* | 4.0585 | 10.2099 | 12.1452* | 13.3264 | 32.9098 | 17.7517* |
| Art History | a | -0.004109 | -0.005417 | -0.004086 | 0.009131* | -0.003005* | 0.020942* | -0.000833* | -0.001162* | -0.004549* |
| | b | 12.9442 | 10.0504 | 12.5698 | 17.7057* | 22.3356* | 12.1452* | 18.8179* | 12.8045* | 17.7517* |
| Music | a | -0.001105 | -0.007003 | -0.001055 | 0.009131* | 0.022766 | 0.020942* | -0.000942 | -0.001162* | -0.004549* |
| | b | 10.7681 | 16.5488 | 10.6411 | 17.7057* | 22.2600 | 12.1452* | 19.6350 | 12.8045* | 17.7517* |
| Speech/Theater | a | -0.000199* | -0.004946 | -0.000199* | 0.009131* | -0.039070 | 0.020942* | 0.110617 | -0.001162* | -0.004549* |
| | b | 21.6904* | 7.0627 | 21.8340* | 17.7057* | 14.0045 | 12.1452* | 3.3756 | 12.8045* | 17.7517* |
| Philosophy | a | -0.000986 | -0.004594 | -0.000947 | 0.572047 | -0.003005* | 0.730917 | -0.005948 | 0.169559 | -0.016053 |
| | b | 13.6434 | 6.1502 | 13.8210 | 0.6616 | 22.3356* | 0.7462 | 15.3065 | 1.1329 | 9.9934 |
| Engl/Amer Lang/Lit | a | -0.001281 | -0.002500 | -0.001473 | -0.005534 | -0.016353 | 0.020942* | -0.016952 | 0.054909 | -0.022902 |
| | b | 36.4284 | 31.2660 | 40.0157 | 6.2152 | 19.4521 | 12.1452* | 17.9575 | 7.0489 | 11.8290 |
| Classical Lang/Lit | a | -0.003111 | -0.008243 | -0.003423 | 0.009131* | -0.003005* | 0.020942* | -0.000833 | -0.001162* | -0.004549* |
| | b | 6.8651 | 5.1673 | 6.9775 | 17.7057* | 22.3356* | 12.1452* | 18.8179 | 12.8045* | 17.7517* |
| Modern Lang/Lit | a | -0.000809 | -0.001167 | -0.000791 | 0.102755 | -0.014019 | 0.020942* | 0.032888 | -0.002798 | -0.008958 |
| | b | 15.0078 | 13.1262 | 13.1886 | 28.0732 | 24.0491 | 12.1452* | 16.7345 | 10.9683 | 19.5433 |
| “Other Humanities” | a | -0.002498 | -0.000519* | -0.002667 | 0.009131* | -0.003005* | 0.020942* | -0.026912 | -0.001162* | -0.004549* |
| | b | 16.1731 | 18.9936* | 16.1484 | 17.7057* | 22.3356* | 12.1452* | 20.4191 | 12.8045* | 17.7517* |

*Direct estimates are not available; data shown are considered useful approximations.
TABLE C-3 Approximate Standard Errors of Estimated Numbers of Humanities Doctorates, by Field, 1991
TABLE C-4 Approximate Standard Errors of Estimated Numbers of Female Humanities Doctorates, by Field, 1991
TABLE C-5 Approximate Standard Errors of Estimated Percents of Humanities Doctorates, 1991
| Base of Percentage | 1 or 99 | 2 or 98 | 5 or 95 | 10 or 90 | 15 or 85 | 25 or 75 | 50 |
|---|---|---|---|---|---|---|---|
| 50 | 6.6 | 9.2 | 14.4 | 19.8 | 23.5 | 28.5 | 32.9 |
| 100 | 4.6 | 6.5 | 10.2 | 14.0 | 16.6 | 20.2 | 23.3 |
| 200 | 3.3 | 4.6 | 7.2 | 9.9 | 11.8 | 14.3 | 16.5 |
| 500 | 2.1 | 2.9 | 4.5 | 6.2 | 7.4 | 9.0 | 10.4 |
| 700 | 1.8 | 2.5 | 3.8 | 5.3 | 6.3 | 7.6 | 8.8 |
| 1,000 | 1.5 | 2.1 | 3.2 | 4.4 | 5.3 | 6.4 | 7.4 |
| 2,500 | 0.9 | 1.3 | 2.0 | 2.8 | 3.3 | 4.0 | 4.7 |
| 5,000 | 0.7 | 0.9 | 1.4 | 2.0 | 2.4 | 2.9 | 3.3 |
| 10,000 | 0.5 | 0.7 | 1.0 | 1.4 | 1.7 | 2.0 | 2.3 |
| 25,000 | 0.3 | 0.4 | 0.6 | 0.9 | 1.1 | 1.3 | 1.5 |
| 50,000 | 0.2 | 0.3 | 0.5 | 0.6 | 0.7 | 0.9 | 1.0 |
| 75,000 | 0.2 | 0.2 | 0.4 | 0.5 | 0.6 | 0.7 | 0.9 |
| 100,000 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 |
TABLE C-6 Approximate Standard Errors of Estimated Percents of Female Humanities Doctorates, 1991

| Base of Percentage | 1 or 99 | 2 or 98 | 5 or 95 | 10 or 90 | 15 or 85 | 25 or 75 | 50 |
|---|---|---|---|---|---|---|---|
| 50 | 6.1 | 8.6 | 13.4 | 18.5 | 22.0 | 26.7 | 30.8 |
| 100 | 4.3 | 6.1 | 9.5 | 13.1 | 15.6 | 18.9 | 21.8 |
| 200 | 3.1 | 4.3 | 6.7 | 9.2 | 11.0 | 13.3 | 15.4 |
| 500 | 1.9 | 2.7 | 4.2 | 5.8 | 7.0 | 8.4 | 9.7 |
| 700 | 1.6 | 2.3 | 3.6 | 4.9 | 5.9 | 7.1 | 8.2 |
| 1,000 | 1.4 | 1.9 | 3.0 | 4.1 | 4.9 | 6.0 | 6.9 |
| 2,500 | 0.9 | 1.2 | 1.9 | 2.6 | 3.1 | 3.8 | 4.4 |
| 5,000 | 0.6 | 0.9 | 1.3 | 1.8 | 2.2 | 2.7 | 3.1 |
| 10,000 | 0.4 | 0.6 | 0.9 | 1.3 | 1.6 | 1.9 | 2.2 |
| 25,000 | 0.3 | 0.4 | 0.6 | 0.8 | 1.0 | 1.2 | 1.4 |