3
Model-Based Estimates of Poor School-Age Children

Past reliance on the most recent decennial census to allocate federal funds to counties and other small areas primarily reflects the absence of alternative data sources with comparable or superior reliability. As discussed in Section 2, the CPS can provide reasonably reliable annual estimates at the national level of such population characteristics as the number and percent of poor children, but it cannot produce estimates for counties. Nonetheless, the CPS data may serve as the basis for creating usable estimates through the application of statistical estimation techniques to develop "model-based" or "indirect" estimates. Indirect estimators use data from other areas, time periods, or data sources to "borrow strength" and improve precision. In contrast, direct estimators use only the data from one source for the area and time period in question. A model-based approach is useful when there is no single source of information that can provide direct estimates, but relationships among several variables across various data sources can be used to provide estimates with acceptable precision.

The Census Bureau has been mindful of the need for updated small-area estimates. Even as Congress charged the Census Bureau to develop postcensal estimates of poor school-age children for counties and school districts, the Census Bureau was organizing a program to study methods for producing postcensal income and poverty estimates for states and counties during the 1990s. The Census Bureau launched this program in late 1992 with financial support from a consortium of five federal agencies. The program faces a challenging task. In particular, there is no single administrative or survey data source that provides all of the information required to develop reliable estimates of the number of school-age children in poverty by county, including income information that is detailed and precise at the county level.




Small-Area Estimates of School-Age Children in Poverty: Interim Report I: Evaluation of 1993 County Estimates for Title I Allocations

The Census Bureau's research suggests that, for state and county estimates of poor school-age children, the best procedure to follow is a model-based approach. Previously, the Census Bureau used this strategy to develop estimates of median family income for states (Fay et al., 1993) and, in part, to develop population estimates for states and counties (see Spencer and Lee, 1980).[1]

This section briefly describes the model-based approach as applied by the Census Bureau to estimate the number of school-age children in poverty by county. It also describes the Census Bureau's model for states, which estimates the state-level proportions of school-age children in poverty. (Appendix C provides a more detailed, technical explanation of both models; see also Coder et al., 1996; Fay, 1996.) Finally, the section describes how the estimated numbers of poor school-age children for the counties in each state are adjusted to agree with the corresponding estimates from the state-level model. This procedure requires that the estimated proportions from the state-level model be converted to estimated numbers of poor school-age children in each state.

Both the county-level and state-level models were estimated for school-age children in 1994 who were poor in 1993 for purposes of the Title I allocations and for school-age children in 1990 who were poor in 1989 for purposes of model validation by comparison with estimates from the 1990 census. This validation exercise is discussed in Section 4, which provides the panel's assessment of the Census Bureau's methods.

[1] The model-based population estimates (total noninstitutionalized population for counties and noninstitutionalized population under age 65 for states) were produced as components of one of the methods that the Census Bureau used to estimate state and county population totals for use in allocating funds under the general revenue sharing program. The final estimates were developed by averaging the results of three separate methods.

COUNTY-LEVEL MODEL

Development of the Census Bureau's county-level model for estimates of the number of poor school-age children involved several steps: determining what administrative and other data sources are available for all counties that can be used in a prediction equation; specifying and estimating an equation that relates the predictor variables to a dependent variable from 3 years of the March CPS; using the estimates from the equation, together with direct estimates for counties for which they are available, to develop estimates for all counties; and, finally, adjusting the county estimates for consistency with estimates from a separate state-level model. The state-level model and the final adjustment of the county estimates are discussed following the description of the county-level model.

Administrative Data Used

The first step in developing model-based estimates of school-age children in poverty by county is to bring together administrative data that are related to poverty and that are available for all counties on a consistent basis (i.e., that are obtained using the same definitions and procedures). The Census Bureau examined a variety of administrative records and selected two sources as most nearly meeting these criteria: counts of the number of people receiving food stamps in each county from the U.S. Department of Agriculture (USDA) and state food stamp agencies;[2] and county estimates from Internal Revenue Service (IRS) data of the number of child exemptions (assumed to be under age 21) in families who reported income below the poverty threshold on their federal income tax return. Neither of these two data sources gives the number of school-age children in poverty as measured by the March CPS or by the census, but this is not a problem for model-based estimation: it is necessary only that the variables chosen to be used in the model can provide good predictions of that number.

For the Food Stamp Program, the total number of recipients is available annually for states and counties, and eligibility requirements are generally uniform across all states (with some exceptions for Alaska and Hawaii). Two key eligibility requirements are that households must have gross income before deductions that is below 130 percent of the applicable poverty guideline and net countable income that is below 100 percent of the applicable guideline.[3] (The gross and net income limits for eligibility and the ceilings on allowable deductions are higher in Alaska and Hawaii than in the other states due to their higher cost of living.) Although the program is generally administered uniformly across all states, participation rates (the proportion of eligible households that apply for and receive benefits) are not the same. Also, the information obtained for each county is not always the same: in most counties, the counts of food stamp recipients pertain to July; for some counties, they are an average of the monthly counts for the year.

Information from federal income tax returns can be used to construct family units and to compare the income of such units with the applicable poverty threshold. Individual tax returns are assigned to counties on the basis of their address information. There are three major advantages of data from tax returns: (1) coverage of a very large proportion of the population, (2) coverage of a very large proportion of the income received by families, and (3) data availability on an annual basis.

The number of child exemptions reported on tax returns for families with incomes below the poverty threshold, like the number of food stamp recipients, is an imperfect measure of poverty for school-age children. Not all people file tax returns, especially those with very low incomes or income mostly from nontaxable sources. In addition, "income" as defined on tax returns does not include all the sources of income that are used in the official measure of poverty, and tax filing units are not totally consistent with the Census Bureau's definition of families. Moreover, the address on a tax return does not always correspond to a filer's residential address. Nonetheless, tax information, like counts of food stamp recipients, is a useful variable to develop predictions of poverty for school-age children.

[2] USDA counts of food stamp recipients were not complete for all counties; the Census Bureau contacted individual state agencies to obtain missing information.

[3] The poverty guidelines used for determining program eligibility are derived by smoothing the official poverty thresholds for families of different sizes (see Fisher, 1992).

Model Specification

The second step in developing a model-based estimate of the number of school-age children in poverty by county is to specify and estimate a formula, or prediction equation, that relates the administrative data and other "predictor" variables to the dependent or "outcome" variable, which is an estimate of the number of school-age children in poverty from the March CPS. The CPS estimate is chosen as the object of prediction because the CPS provides the largest and most up-to-date data set that is available with which to estimate poverty among school-age children.

A key decision in the specification of the county-level model was to use the CPS estimate of the number of school-age children in poverty as the dependent variable.[4] Another choice would have been to model the proportion of school-age children who were poor and then convert the estimated proportions to estimated numbers of poor school-age children. (This approach was in fact adopted for the state-level model; see below.) The Census Bureau decided against the approach of estimating proportions and converting them to numbers because of a concern that the county population estimates of school-age children that would form the basis for converting the estimated proportions to numbers were of uncertain quality. Hence, it would be difficult to construct estimates of the precision of the estimated numbers of poor school-age children, which play the most important role in the Title I allocation formula.

Another key decision was to estimate the number of school-age children in each county who were poor at a particular time (e.g., the number in 1994 who were poor in 1993) and not to estimate the change in the number since the 1990 census. This decision reflects the Census Bureau's conclusion that the available administrative data were likely measured more consistently across areas at a given time than they would be over time, given changes in tax and transfer program rules. Both of these decisions are discussed further in Section 4, which presents the panel's assessment of the Census Bureau's methods.

[4] As noted in Section 1, for this application school-age children are defined as related children aged 5–17.

The county-level model includes five predictor variables. In addition to the two variables described above, the number of food stamp recipients and the estimated number of child exemptions reported by families in poverty on tax returns, the variables are: the total number of child exemptions on tax returns, the total population under age 21 from the Census Bureau's postcensal population estimates program,[5] and the number of school-age children in poverty from the most recent census.

In the county-level model, the CPS estimate is a 3-year centered average, and all the variables are measured on a logarithmic scale. A reason to use logarithms was the wide variation in the CPS estimate and the values of the predictor variables among counties: transforming the variables to logarithms made their distributions more symmetric and the relationships between some of them and the dependent variable more linear. A reason for the decision to combine 3 years of CPS data for county estimation and thereby improve precision was the small CPS sample sizes in individual counties. (For the 1993 county-level model, the CPS estimate is an average of data from the March 1993, 1994, and 1995 CPS, representing measured poverty in 1992, 1993, and 1994.)

Given that only a subset of counties is represented in the March CPS sample, the relationships between the predictor variables and the dependent variable in the model are estimated solely on this subset of counties. This subset includes proportionately more large counties and proportionately fewer small counties than the distribution of all counties.[6] By calculating the relationships among the predictor variables and the CPS estimates of school-age children in poverty for the subset of counties that have households in the March CPS sample with poor school-age children, it is possible to obtain a good estimate of an equation for predicting the number of poor school-age children in a county, even though the CPS estimate for any specific county has a measurable level of uncertainty that is large for many small counties.[7] The prediction equation can then be used to predict the number of school-age children in poverty from the food stamp, IRS, population estimates, and census predictor variables for each county, whether or not the county is in the March CPS sample.

For counties that have households with poor school-age children in the March CPS sample, a weighted average of the model prediction and the estimate based on data from the sample households (the direct estimate) is used to produce an estimate for that county. The weights that are given to the model prediction and the direct estimate depend on their relative precision. (See Appendix C for how these weights are derived.) For a county with very few sample households in the CPS and hence a high level of sampling variability in the direct estimate, most of the weight will be given to the model prediction and little to the direct estimate. For a county with a large number of sample households in the CPS, more weight will be given to the direct estimate and less to the model prediction.[8] In either case, assuming that the weights have been well estimated, the combined estimate of the number of school-age children in poverty will be at least as accurate as the better of the separate predictions (from the model or the CPS). For counties that lack households with poor school-age children in the CPS sample, the prediction from the model is the estimate. In both cases, after first transforming the logarithmic values back to numbers, the county estimates are adjusted for consistency with the estimates from the state-level model, as described below.

[5] The population estimates for people under age 21 are the estimated resident population under age 21 derived from demographic analysis minus the estimated population in institutions and military barracks for that age group; see Appendix D. The estimates pertain to July 1 following the income year (e.g., July 1994 for the 1993 model). Including the estimated population under age 21 and the estimated number of total child exemptions on income tax returns as variables in the model is intended to provide a measure of the number of people not covered on tax returns, most of whom are at the low end of the income distribution.

[6] Because values of 0 cannot be transformed into logarithms, a number of counties whose sampled households contain no poor school-age children are excluded from the estimation; see Section 4 for discussion.

[7] The regression coefficients on the predictor variables that express the relationships with the dependent variable in the county-level model are estimated using weighted least squares. The weights used are the reciprocal of the sum of the estimated sampling variance of the logarithm of the number of poor school-age children in a given county plus the estimated variance of model error, assumed to be constant across counties; see Appendix C.

[8] The variation in the difference between the model prediction and the actual number of school-age children in poverty is assumed to be the same, on a proportional basis, for all counties with households in the March CPS sample. This difference is termed model error: as used in statistics, "error" is the inevitable discrepancy between the truth and an estimate due to variability in measurements and the fact that modeled relationships are not precise.

STATE-LEVEL MODEL

The Census Bureau's state-level model for estimates of poverty among school-age children is similar in general approach to the county-level model. However, it differs in a number of respects:

The state-level model uses as the dependent variable the proportion of school-age children in poverty: that is, the dependent variable is a poverty ratio rather than the number of poor school-age children, as in the county-level model.[9] The numerator for the ratio is the CPS estimate of poor school-age children in a state (i.e., the estimate of the number of poor related children aged 5–17); the denominator is the CPS estimate of the total number of noninstitutionalized children aged 5–17 in the state.[10]

[9] The predicted variable is termed a ratio because the denominator is not exactly the same as that for the official published poverty rates.

[10] A different denominator (noninstitutionalized school-age children rather than the slightly smaller universe of related school-age children) is used for consistency with the denominator that is used to convert the estimated poverty ratios to estimated numbers of poor school-age children.

The state-level model uses four predictor variables for each state: the estimated percentage of child exemptions in families who reported incomes below the poverty threshold on their federal income tax return; the estimated percentage of people under age 65 who did not file an income tax return;[11] the percentage of the population that received food stamps; and the residuals from a regression of poverty rates for school-age children from the prior decennial census on the other three independent variables (see Appendix C).

The dependent variable in the state-level model is derived from 1 year of CPS data (the March 1994 CPS for the 1993 model), rather than a 3-year centered average as in the county-level model. This decision assumes that the sample size for states permits estimating the model with reasonable accuracy and, implicitly, that it is preferable when possible to have estimates that pertain directly to the income year.

Since all the variables in the state-level model are proportions rather than numbers, they need not be transformed to a logarithmic scale as is done with the numbers in the county-level model. Such a transformation is not needed because the distributions of the estimated proportions for the predictor variables are more symmetric and have a more linear relationship with the dependent variable than is the case for the distributions of the estimated numbers.

All states have sample households in the CPS; however, the variability associated with estimates from the CPS is large for some states. As is done in the county-level model, the state-level model weights the direct estimate for a state and the estimate from the model according to their relative precision to produce estimates of the proportion of poor school-age children in each state.
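The weighting of a direct estimate and a model prediction by relative precision, which both models use, can be illustrated with a minimal sketch. It assumes the standard inverse-variance form, in which the weight on the direct estimate is the model error variance divided by the sum of the model error variance and the sampling variance; the weights the Census Bureau actually used are derived in Appendix C and may differ in detail. The function name `combine` and all numbers are hypothetical and are not taken from the report.

```python
def combine(direct, sampling_var, model_pred, model_var):
    """Precision-weighted average of a direct estimate and a model prediction.

    Assumed form (not confirmed by the report): the direct estimate gets
    weight model_var / (model_var + sampling_var), so the noisier the
    direct estimate, the more weight shifts to the model prediction.
    """
    if sampling_var < 0 or model_var < 0 or sampling_var + model_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    w_direct = model_var / (model_var + sampling_var)
    estimate = w_direct * direct + (1.0 - w_direct) * model_pred
    return estimate, w_direct


# Hypothetical state with a small CPS sample: large sampling variance.
small_estimate, small_weight = combine(
    direct=0.20, sampling_var=0.0009, model_pred=0.16, model_var=0.0001
)

# Hypothetical state with a large CPS sample: small sampling variance.
large_estimate, large_weight = combine(
    direct=0.20, sampling_var=0.000025, model_pred=0.16, model_var=0.0001
)

print(f"small sample: weight on direct {small_weight:.2f}, estimate {small_estimate:.3f}")
print(f"large sample: weight on direct {large_weight:.2f}, estimate {large_estimate:.3f}")
```

With the same direct estimate and model prediction, the small-sample case stays close to the model prediction and the large-sample case moves toward the direct estimate. In the county-level model the same kind of averaging would apply to logarithms of numbers rather than to proportions.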
To produce estimates of the number of poor school-age children in each state, the estimates of the proportion poor from the model are multiplied by estimates of the total number of noninstitutionalized school-age children. For the 1993 model, these estimates are derived from the Census Bureau's program of demographic population estimates (see Section 4 and Appendix D). Finally, the state estimates of the number of poor school-age children are adjusted to total the CPS national estimate of school-age children in poverty. The national estimate pertains to related children aged 5–17 so that, at this final stage, the state estimates are consistent with the county estimates in that both sets represent estimates of the numbers of related children aged 5–17 in poverty.

[11] This percentage is obtained by subtracting the estimated number of exemptions for people under age 65 on income tax returns from the estimated total population under age 65 derived from demographic analysis; see Appendix D.

ADJUSTMENT OF COUNTY ESTIMATES TO STATE CONTROLS

The county-level model described above produces an initial set of estimates of the number of poor school-age children in each county in the United States.

The final estimates for counties are produced by "benchmarking" the initial county estimates to the final adjusted state estimates: for each state, the estimate for every county in that state is multiplied by a constant factor to make the sum of the resulting county estimates equal the state estimate. For example, if the estimated state total is 5 percent higher than the sum of the county estimates for that state, the estimate for each county in that state is multiplied by 1.05. If the estimated state total is 5 percent lower than the sum of the county estimates for that state, the estimate for each county in that state is multiplied by 0.95. The rationale for this last step is that the state estimates are more reliable because they are based on more data (larger samples) than are available for most counties. For example, if the county-level model tends to underpredict for counties in a particular state, the state as a whole is not affected by that error because its total is determined by the state-level model.

The county-level model predicts the number of school-age children in poverty. Estimates of county poverty rates for school-age children, which play an important but secondary role in the Title I allocation formula, are obtained by dividing the estimated number of school-age children in poverty from the county-level model by an updated estimate of the county noninstitutionalized population aged 5–17, adjusted to represent related school-age children. These estimates are produced from the Census Bureau's population estimates program (see Appendix D).

The county and state estimation procedures described in this section are based on the CPS. Therefore, the county estimates represent estimates of poverty for school-age children as measured by the March CPS, not as measured by the decennial census. The issues raised by this shift in the underlying source of data for the estimates are considered in Section 4.
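The benchmarking step described in this section amounts to a single ratio adjustment per state, which the following minimal sketch illustrates. The function name `benchmark`, the county labels, and all numbers are hypothetical and chosen only to reproduce the 5 percent example in the text; they are not estimates from the report.

```python
def benchmark(county_estimates, state_estimate):
    """Scale every county estimate in one state by the same constant factor
    so that the county estimates sum to the state estimate."""
    county_sum = sum(county_estimates.values())
    if county_sum <= 0:
        raise ValueError("county estimates must sum to a positive number")
    factor = state_estimate / county_sum
    adjusted = {name: value * factor for name, value in county_estimates.items()}
    return adjusted, factor


# Hypothetical initial county estimates of poor school-age children in one state.
initial_counties = {"County A": 1200.0, "County B": 300.0, "County C": 500.0}

# Hypothetical state estimate, 5 percent above the county sum of 2,000.
state_estimate = 2100.0

final_counties, factor = benchmark(initial_counties, state_estimate)

print(f"adjustment factor: {factor:.2f}")
for name, value in final_counties.items():
    print(f"{name}: {value:.1f}")
```

Because every county in the state is multiplied by the same factor, each county's share of the state total is unchanged; only the overall level moves to match the state estimate.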