Appendix
An Example of Combining Information
The Panel on Estimates of Poverty for Small Geographic Areas is assisting the Census Bureau in developing model-based small-area estimates of the number of children living in poverty. These estimates are needed as input to formulas that allocate substantial funds to counties and school districts to address the needs of disadvantaged children under Title I of the Elementary and Secondary Education Act. Prior to this recent work, Title I allocations had used the most recent census long-form (sample-based) counts, which could be as much as 12 years out of date. Model-based estimates at the county level (for 1993 and every two years into the future) and at the school district level (for 1995 and every two years into the future) are now being used in place of the census long-form estimates. (Contemporaneous direct estimates cannot be supported with current survey or administrative data.)
The county-level model (used for both the 1993 and 1995 estimates) is an excellent example of how current best practice permits one to combine data from various sources. These model-based estimates make use of a county-level regression model, which uses as the dependent variable a logarithmic transformation of the current number of children in poverty, measured by a 3-year average (to reduce variance) from the Current Population Survey (CPS).1 This regression model makes use of (logarithmic transformations of) the following covariates for a given county: the number of child exemptions reported by families in poverty on tax returns, the number of people receiving food stamps, the estimated population under age 18, the number of child exemptions on tax returns, and the number of poor school-age children in the county from the previous census. For counties with CPS sample households and with poor children in the sample, a linear combination (formally, empirical Bayes shrinkage) of the direct estimate from the CPS and the model prediction from the regression model is computed; otherwise, the model prediction alone is used. After being transformed back to the original scale (with an adjustment for transformation bias), the final county-level estimates of the number of poor school-age children are then ratio-adjusted so that within each state the county-level estimates sum to a separately modeled state-level estimate.
1 Because the CPS does not have samples in all counties, the regression model was fit using only about 1,300 counties.
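The county-level steps described above (shrinkage, back-transformation, ratio adjustment) can be sketched roughly as follows. This is a minimal illustration and not the Census Bureau's implementation: the weight formula is the standard empirical Bayes form (model error variance divided by the sum of model error variance and sampling variance), which the text does not spell out; all function names and input values are hypothetical; and the adjustment for transformation bias is omitted.

```python
import math

def shrinkage_weight(model_var, sampling_var):
    """Weight given to the direct (CPS) estimate. ASSUMPTION: standard
    empirical Bayes form, model_var / (model_var + sampling_var)."""
    return model_var / (model_var + sampling_var)

def combine_log_estimates(direct_log, model_log, model_var, sampling_var):
    """Combine direct and model estimates on the log scale.
    Counties without a usable direct estimate get the model prediction alone."""
    if direct_log is None:
        return model_log
    w = shrinkage_weight(model_var, sampling_var)
    return w * direct_log + (1.0 - w) * model_log

def ratio_adjust(county_counts, state_total):
    """Scale county estimates so that they sum to the state-level estimate."""
    factor = state_total / sum(county_counts)
    return [count * factor for count in county_counts]

# Hypothetical inputs for two counties in one state (illustrative values only).
counties = [
    {"direct_log": math.log(1200.0), "model_log": math.log(1000.0), "sampling_var": 0.30},
    {"direct_log": None, "model_log": math.log(500.0), "sampling_var": None},
]
MODEL_VAR = 0.10      # hypothetical model error variance
STATE_TOTAL = 1800.0  # hypothetical separately modeled state estimate

# Back-transform with a plain exponential; the bias adjustment is left out here.
combined = [
    math.exp(combine_log_estimates(c["direct_log"], c["model_log"],
                                   MODEL_VAR, c["sampling_var"]))
    for c in counties
]
adjusted = ratio_adjust(combined, STATE_TOTAL)
```

Under this form of the weight, a small model error variance relative to the sampling variance pushes the combined estimate toward the regression prediction.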
The state-level model was developed in a similar manner to the county-level model. The state-level regression model uses as the dependent variable the estimated proportion of poor school-age children as measured by the CPS (using only a single year, given the larger sample size at the state level). The covariates used in this regression model are the proportion of child exemptions reported by families in poverty on tax returns; essentially the proportion of people receiving food stamps; the proportion of persons under 65 years of age who did not file a tax return; and the residual from the analogous census regression of the proportion of poor school-age children from the most recent census on the other three covariates, measured contemporaneously with that census. As in the county-level model, a linear combination (again based on empirical Bayes methods) of the direct CPS estimate and the model prediction is used (though in practice, the estimated model error variance has been so low that the regression prediction has usually received the full weight).
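The fourth covariate, the census residual, is the least obvious part of the state-level model, so a small sketch of how such a covariate could be constructed may help. Everything here is an assumption made for illustration: the data are randomly generated rather than real, the variable names are hypothetical, the regression is ordinary least squares with an intercept, and NumPy is used for the fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 51  # hypothetical: 50 states plus the District of Columbia

# Hypothetical census-year data: an intercept column plus the three covariates
# (tax-poor exemptions, food stamps, nonfilers), all expressed as proportions.
X_census = np.column_stack([
    np.ones(n_states),
    rng.uniform(0.05, 0.30, size=(n_states, 3)),
])
y_census = rng.uniform(0.08, 0.30, size=n_states)  # census poverty proportion

# Regress the census poverty proportion on the census-year covariates.
beta_census, *_ = np.linalg.lstsq(X_census, y_census, rcond=None)

# The residual is what the three covariates fail to explain in the census year.
census_residual = y_census - X_census @ beta_census

# Hypothetical current-year covariates; the census residual enters the
# current-year regression as an additional column.
X_current_three = rng.uniform(0.05, 0.30, size=(n_states, 3))
X_current = np.column_stack([np.ones(n_states), X_current_three, census_residual])
```

The residual carries forward state-specific poverty levels that the administrative covariates did not capture at the time of the census.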
For income year 1995, the requirement was to provide poverty estimates at the level of school districts. At this low level of geographic aggregation, the above approach based on regression modeling cannot be used, since corresponding data, especially for the covariates, do not now exist on a uniform basis. Therefore, the Census Bureau adopted a simple shares approach, distributing 1995 county-level estimates of the number of poor school-age children to school districts according to the school-district-to-county poverty shares, measured using the 1990 census long form (ignoring some minor complexities).
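The shares approach amounts to simple proportional allocation, as the following minimal sketch shows. The function name, district labels, and counts are hypothetical, and the "minor complexities" mentioned above (for example, districts that cross county lines) are ignored.

```python
def allocate_by_shares(county_estimate, district_census_counts):
    """Distribute a county-level estimate of poor school-age children to
    school districts in proportion to each district's census long-form count."""
    county_census_total = sum(district_census_counts.values())
    if county_census_total == 0:
        raise ValueError("census counts sum to zero; shares are undefined")
    return {
        district: county_estimate * count / county_census_total
        for district, count in district_census_counts.items()
    }

# Hypothetical county with two districts: census counts of 300 and 100 poor
# school-age children, and a current county estimate of 500.
district_estimates = allocate_by_shares(500.0, {"District A": 300, "District B": 100})
# District A receives 500 * 300/400 = 375.0; District B receives 125.0.
```

Because the shares are fixed at their census values, any shift in the distribution of poverty among districts within a county since the census is not reflected in the result.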
In the future, the ACS is expected to play an important role in the estimation of the number of school-age children in poverty at the school district level, either by direct estimation based on aggregation of data over several years or by combination, in one of several ways, with other data series that are or might become available at the school district level (e.g., data on food stamp participation, data on school lunch participants, and poverty rates estimated from tax filers). It is quite likely that even with the large sample size of the ACS, small-area estimation techniques will be required to combine information over time and geography to develop high-quality estimates. Issues of comparability of the decennial census, the CPS, and the ACS will need to be addressed, as will any changes in tax or welfare programs that affect data comparability over either time or geography.