residents. Otherwise, the shortcomings described in earlier sections related to the GQ frame could result in scenarios in which data are imputed into facilities that no longer exist. The panel thinks that improvements to the GQ sampling frame are essential to ensure the success of the imputation approach.
The success of the item imputation plans also depends on the quality of the donors. Some of the data for the donor cases are themselves imputed because of item nonresponse, which amounts to “double imputation.” The item imputation rates in the GQ data are higher than in the household data and are particularly high for the income questions (see Table 6-11). Item imputation rates also vary by state (see Table 6-12). To the panel’s knowledge, the effects of the double imputation on the data have not yet been evaluated.
Panel Observations on the Imputation Plans
The Census Bureau’s plans to impute nonsample GQ person records are in line with the panel’s view that GQ estimates can be produced based on alternatives to a design-based weighting approach. The proposed method allows for the creation of a microdata file with all characteristics included that could also serve as the basis for a Public Use Microdata Sample (PUMS) file and would be valuable to data users. By contrast, small-area estimation would involve constructing separate estimates for group quarters, which would then be combined with the household estimates to obtain total population estimates. Moreover, person-level imputation would not need to be performed for the GQ types that are moved to the housing unit sample (see Recommendation 4-7), which also has the advantage of reducing the volume of records imputed.
We discuss below some refinements to the Census Bureau plans presented to the panel. We also make recommendations for additional research that could inform the direction of this work in the future.
Several alternative methods for identifying donors could be explored and evaluated. One concern is that donors are pulled from multiple group quarters in order to impute for a single recipient GQ. This does not reflect the natural intraclass correlation that occurs within a GQ facility, but it could nevertheless produce unbiased estimates of descriptive statistics. The variance of the imputation procedure could, in fact, be lower this way. If more complex statistics were of interest, such as those involving the relationships of variables among persons in the same group quarters, then the imputation method could produce biased estimates. Another issue is that the imputation model assumes that all GQ cases in a given cell have the same mean or are, in some sense, exchangeable. This assumption may not account for other important covariates.
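The point about intraclass correlation can be illustrated with a small simulation. The sketch below is not the Census Bureau's procedure: it assumes a single imputation cell, a simple random hot deck that draws donors with replacement from all sampled facilities in the cell, and a made-up continuous variable with a facility-level effect. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def simulate_facilities(n_facilities, n_residents, facility_sd, person_sd, rng):
    """Return a list of facilities, each a list of person-level values.

    Each facility gets one shared random effect, which induces
    intraclass correlation among its residents (illustrative model).
    """
    facilities = []
    for _ in range(n_facilities):
        effect = rng.gauss(0.0, facility_sd)
        facilities.append(
            [effect + rng.gauss(0.0, person_sd) for _ in range(n_residents)]
        )
    return facilities


def pooled_hot_deck(donor_facilities, n_recipients, n_residents, rng):
    """Impute every recipient facility from donors pooled across the cell."""
    pool = [value for facility in donor_facilities for value in facility]
    return [
        [rng.choice(pool) for _ in range(n_residents)]
        for _ in range(n_recipients)
    ]


def anova_icc(facilities):
    """One-way ANOVA estimate of the ICC, assuming equal facility sizes."""
    n = len(facilities)
    k = len(facilities[0])
    grand_mean = statistics.fmean([v for f in facilities for v in f])
    means = [statistics.fmean(f) for f in facilities]
    # Between-facility and within-facility mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for f, m in zip(facilities, means) for v in f
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical cell: 200 sampled (donor) facilities, 200 nonsample facilities.
donors = simulate_facilities(200, 20, facility_sd=1.0, person_sd=1.0, rng=rng)
truth = simulate_facilities(200, 20, facility_sd=1.0, person_sd=1.0, rng=rng)
imputed = pooled_hot_deck(donors, n_recipients=200, n_residents=20, rng=rng)

donor_mean = statistics.fmean([v for f in donors for v in f])
imputed_mean = statistics.fmean([v for f in imputed for v in f])
icc_truth = anova_icc(truth)
icc_imputed = anova_icc(imputed)

print(f"donor mean   = {donor_mean:.3f}")
print(f"imputed mean = {imputed_mean:.3f}")
print(f"ICC, true nonsample facilities = {icc_truth:.3f}")
print(f"ICC, imputed facilities        = {icc_imputed:.3f}")
```

Under these assumptions, the imputed records should reproduce the donor mean closely, while the estimated intraclass correlation of the imputed facilities should fall near zero rather than near the simulated value of facility_sd² / (facility_sd² + person_sd²). This is consistent with the concern that descriptive statistics may be unaffected while within-facility relationships are not preserved.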
In the case of the donor selection procedure that prioritizes donor pools based on geographic proximity, it is not clear that the sequence of combinations