Weighting and Estimation

The next phase of the survey process involved weighting the survey data to compensate for unequal probabilities of selection to the sample and to adjust for the effects of unit nonresponse. The first step was the construction of sample weights, which were calculated as the inverse of the probability of selection, taking into account all stages of the sample selection process over time. Sample weights varied within cells because different sampling rates were used depending on the year of selection and the stratification in effect at that time.
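The sample weight calculation can be illustrated with a minimal sketch. It assumes that the overall probability of selection is the product of the selection probabilities at each stage; the text says only that all stages were taken into account, so this is an assumption. The function name and the rates in the example are hypothetical.

```python
from math import prod

def sample_weight(stage_probabilities):
    """Inverse of the overall selection probability.

    Assumption: the overall probability is the product of the
    per-stage selection probabilities.
    """
    overall = prod(stage_probabilities)
    if not 0 < overall <= 1:
        raise ValueError("overall selection probability must be in (0, 1]")
    return 1.0 / overall

# Hypothetical case: sampled at a rate of 1 in 4 in the year of selection,
# then retained at a rate of 1 in 2 under a later stratification.
weight = sample_weight([0.25, 0.5])
```

Under these assumed rates, two cases in the same cell that were selected in different years would carry different weights, which is the within-cell variation described above.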

The second step was to construct a combined weight, which took into account the subsampling of nonrespondents at the computer-assisted telephone interviewing (CATI) phase. All respondents received a combined weight. For mail respondents it was equal to the sample weight; for CATI respondents it was a combination of their sample weight and their CATI subsample weight.
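A sketch of the combined weight follows. The text does not say how the two weights were combined for CATI respondents; the sketch assumes the usual approach of multiplying the sample weight by the inverse of the CATI subsampling rate. The function name, the mode labels, and the rate are hypothetical.

```python
def combined_weight(sample_weight, mode, cati_subsample_rate=None):
    """Combined weight for a respondent.

    mode is "mail" or "cati" (hypothetical labels).
    Assumption: for CATI respondents the combination is the product of
    the sample weight and the inverse of the CATI subsampling rate.
    """
    if mode == "mail":
        # Mail respondents were not subsampled, so the weight is unchanged.
        return sample_weight
    if mode == "cati":
        if cati_subsample_rate is None or not 0 < cati_subsample_rate <= 1:
            raise ValueError("CATI respondents need a subsampling rate in (0, 1]")
        return sample_weight * (1.0 / cati_subsample_rate)
    raise ValueError(f"unknown response mode: {mode!r}")

mail_weight = combined_weight(8.0, "mail")
cati_weight = combined_weight(8.0, "cati", cati_subsample_rate=0.5)
```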

The third step was to adjust the combined weights for unit nonresponse. (Unit nonresponse occurs when the sample member refuses to participate or cannot be located.) Nonresponse adjustment cells were created using poststratification. Within each nonresponse adjustment cell, a weighted response rate was calculated, taking into account both mail and CATI nonresponse. The nonresponse adjustment factor was the inverse of this weighted response rate. The initial set of nonresponse adjustment factors was examined, and some cells were collapsed where use of the adjustment factor would have created excessive variance.
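The cell-level adjustment can be sketched as follows. The sketch assumes that the weighted response rate within a cell is the sum of respondents' combined weights divided by the sum of the sample weights of all cases in the cell, mirroring the overall weighted response rate defined under Response Rates. The dictionary keys and the data are hypothetical, and cell collapsing is not shown because the conditions for it are not stated here.

```python
from collections import defaultdict

def nonresponse_adjustment_factors(cases):
    """Return {cell: factor}, where factor = 1 / weighted response rate.

    Each case is a dict with hypothetical keys: cell, sample_weight,
    combined_weight (read only for respondents), and responded.
    """
    sample_total = defaultdict(float)
    respondent_total = defaultdict(float)
    for case in cases:
        sample_total[case["cell"]] += case["sample_weight"]
        if case["responded"]:
            respondent_total[case["cell"]] += case["combined_weight"]

    factors = {}
    for cell, total in sample_total.items():
        if respondent_total[cell] == 0:
            raise ValueError(f"cell {cell!r} has no respondents; collapse it first")
        # Inverse of (respondent combined weights / all sample weights).
        factors[cell] = total / respondent_total[cell]
    return factors

def make_case(cell, sample_weight, combined_weight=None):
    """Build a case; a missing combined weight marks a nonrespondent."""
    return {"cell": cell, "sample_weight": sample_weight,
            "combined_weight": combined_weight,
            "responded": combined_weight is not None}

# Made-up data: two cells with different weighted response rates.
cases = (
    [make_case("A", 10.0, 10.0)] * 2 + [make_case("A", 10.0)] * 2
    + [make_case("B", 5.0, 10.0)] * 2 + [make_case("B", 5.0)] * 3
)
factors = nonresponse_adjustment_factors(cases)
```

In this made-up data, cell A has a weighted response rate of 0.5 and cell B a rate of 0.8, so respondents in cell A receive the larger adjustment.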

The final weights for respondents were calculated by multiplying their respective combined weights by the nonresponse adjustment factor. Estimates in this report were developed by summing the final weights of the respondents selected for each analysis.
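The final weighting and estimation steps can be sketched in a few lines. The record layout, the field values, and the adjustment factors below are hypothetical and serve only to show the arithmetic.

```python
def apply_final_weights(respondents, factors):
    """Attach final_weight = combined weight x cell adjustment factor."""
    return [
        {**r, "final_weight": r["combined_weight"] * factors[r["cell"]]}
        for r in respondents
    ]

def estimate_total(respondents, include):
    """Estimated population count: sum of final weights of selected respondents."""
    return sum(r["final_weight"] for r in respondents if include(r))

# Hypothetical respondents and nonresponse adjustment factors by cell.
respondents = [
    {"cell": "A", "combined_weight": 10.0, "field": "physics"},
    {"cell": "A", "combined_weight": 10.0, "field": "chemistry"},
    {"cell": "B", "combined_weight": 10.0, "field": "physics"},
]
factors = {"A": 2.0, "B": 1.25}

weighted = apply_final_weights(respondents, factors)
physics_total = estimate_total(weighted, lambda r: r["field"] == "physics")
```

Each analysis supplies its own selection rule; the estimate is always the sum of final weights over the respondents that the rule selects.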

Response Rates

The unweighted response rate, calculated as total returns divided by total sample, was 76 percent. The weighted response rate takes into account the different probabilities of selection into the sample and into the CATI subsample. It is calculated as the sum of the combined weights of the returns divided by the sum of the sample weights of all sample cases. The weighted response rate was 85 percent. The unweighted response rate measures how well the data collection methodology worked in obtaining responses, while the weighted response rate indicates the potential for nonresponse bias and as such is a somewhat better indicator of data quality.
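The two response rates can be compared with a short sketch. The dictionary keys and the four cases are hypothetical; they are not survey data and do not reproduce the 76 and 85 percent figures.

```python
def response_rates(cases):
    """Return (unweighted, weighted) response rates.

    unweighted = returns / sample cases
    weighted   = sum of combined weights of returns
                 / sum of sample weights of all sample cases
    """
    returns = [c for c in cases if c["responded"]]
    unweighted = len(returns) / len(cases)
    weighted = (
        sum(c["combined_weight"] for c in returns)
        / sum(c["sample_weight"] for c in cases)
    )
    return unweighted, weighted

# Hypothetical sample of four cases, each with a sample weight of 10.
cases = [
    # Mail respondent: combined weight equals the sample weight.
    {"sample_weight": 10.0, "combined_weight": 10.0, "responded": True},
    # CATI respondent subsampled at an assumed rate of 1 in 2.
    {"sample_weight": 10.0, "combined_weight": 20.0, "responded": True},
    # Two cases with no return.
    {"sample_weight": 10.0, "combined_weight": None, "responded": False},
    {"sample_weight": 10.0, "combined_weight": None, "responded": False},
]
unweighted, weighted = response_rates(cases)
```

In this made-up sample the unweighted rate is 0.5 and the weighted rate is 0.75. The CATI respondent's larger combined weight lets that case stand in for nonrespondents who were not subsampled, which shows how the two rates can differ.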

Reliability

The statistics in this report are subject to both sampling and nonsampling error. For a detailed discussion of both sources of error in the SDR, see the methodological report referenced in footnote 1 of this appendix. That report provides tables that allow the reader to approximate the standard error associated with various estimates from the survey.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.


