

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

OCR for page 83
Small-Area Estimates of School-Age Children in Poverty

5 Future Research and Development

There are several reasons why it is important for the Census Bureau to continue an active program of research and development on methods of estimating poverty for school-age children at the county and school district levels. For counties, although there is clear evidence that the county model is performing well, the county (and state) model evaluations have identified a number of issues that warrant investigation as a priority in the short term to determine how to further improve the estimation procedures. Also, with a model-based approach, it is important to examine carefully the continued applicability of a model each time it is used and to modify it appropriately when necessary. In addition, research is needed to take account of likely future developments in the availability and characteristics of data sources that have implications for the modeling effort and to work on longer term modeling issues. Continued work to improve the county model is important not only for county estimates, but also to improve school district estimates that are developed by using the basic synthetic shares estimation procedure.

For school districts, the important short-term priority is to investigate ways to improve the synthetic shares method for developing updated estimates of total and poor school-age children. Also, it is not too soon to begin research on ways to take advantage of likely future developments in available data that could make it possible to develop an estimation method that (unlike the shares method) captures changes in school-age poverty among districts within counties as well as changes between counties.

The chapter begins by reviewing the schedule for the Census Bureau to provide updated small-area estimates of poor school-age children. It then considers short-term and longer term research priorities for county and school district estimates. It concludes by noting the requirements for an ongoing program of small-area income and poverty estimates, particularly for thorough evaluation and full documentation of models and results.

SCHEDULE CONSIDERATIONS

Over the next 5 years, there are three legislatively mandated deadlines for the Census Bureau to deliver updated school district estimates of poor school-age children to the Department of Education for use in Title I allocations:

• October 2000: estimates for 1997 (or later) for use for allocations for the 2001-2002 and 2002-2003 school years;
• October 2002: estimates for 1999 (or later) for use for allocations for the 2003-2004 and 2004-2005 school years; and
• October 2004: estimates for 2001 (or later) for use for allocations for the 2005-2006 and 2006-2007 school years.

In each case, three estimates are needed: the numbers of total and poor school-age children and the total population. Although the legislation does not require county estimates, they will be needed as long as the method for producing school district estimates includes an adjustment or control to county estimates. There is also interest in state and county estimates of poor children for other important public policy uses, such as evaluating the effects of changes in welfare programs.
Priorities for short-term and longer term research should consider the important changes that are likely to occur in the availability of data for modeling over the next 5 years and beyond, which include:

• current and future changes to welfare programs and tax systems that may affect the comparability or applicability of Food Stamp Program and Internal Revenue Service (IRS) data for use in small-area estimation models;
• the income and poverty estimates for small areas that will be available from the 2000 decennial census long-form sample of about 17 million households (likely to be available in 2002 for counties but not until later for school districts); and
• the planned introduction of the American Community Survey (ACS) as a large-scale, continuing sample survey of U.S. households, conducted primarily by mail, that will provide estimates similar to those provided by the decennial census long-form sample, including income and poverty estimates for small areas.

The ACS is currently being tested in 4 sites; under current plans, it will be implemented in 31 sites in 1999-2001 for comparison with the 2000 census. For each year from 2000 to 2002, the ACS will sample about 70,000 households nationwide. Beginning in 2003, the ACS will sample 250,000 households each month throughout the decade, for an annual sample size of about 3 million households. The current plan is that the ACS (as well as the 2000 census long-form sample) will oversample small jurisdictions. Unlike the 1990 census, the oversampling in the 2000 census and the ACS will include small school districts (see Alexander, 1998).

SHORT-TERM PRIORITIES

County Estimates

The panel identified seven types of research that should be pursued as a priority to determine if the current estimation procedure for counties can be improved:

• modeling of CPS county sampling variances;
• estimation of model error and sampling error variance in the state model;
• methods to incorporate state effects in the county model;
• discrete variable models that include counties in the CPS sample that have no sampled households with poor school-age children;
• ways to reduce the time lag of the estimates;
• evaluation of food stamp and other input data; and
• large category differences and residual patterns for the state and county models.

This research, much of which the Census Bureau has planned, should be conducted and the results fully evaluated well before the next delivery of updated county estimates of poor school-age children, scheduled for October 2000.

Modeling of CPS County Sampling Variances

The residual variance for the county model comprises two components: the model error variance and the sampling variance of the dependent variable. These two components need to be reasonably well estimated for the application of the model (e.g., to determine the relative weights of the regression estimate and the direct estimate in the shrinkage procedure).
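To make the role of the two variance components concrete, the following is a minimal sketch of a generic shrinkage weight, assuming the standard small-area form in which the weight on the direct estimate is the model error variance divided by the sum of the model error variance and the sampling variance. The function name and the example values are hypothetical, and the sketch assumes that both estimates are expressed on the same scale; it is not the Census Bureau's production procedure.

```python
def shrinkage_estimate(direct, regression, model_var, sampling_var):
    """Combine a direct estimate with a regression prediction.

    Assumption (not from the report): the weight on the direct estimate is
    model_var / (model_var + sampling_var), so a direct estimate with a
    large sampling variance is pulled toward the regression prediction.
    """
    if model_var < 0 or sampling_var < 0:
        raise ValueError("variances must be non-negative")
    total = model_var + sampling_var
    if total == 0:
        raise ValueError("at least one variance must be positive")
    weight_direct = model_var / total
    estimate = weight_direct * direct + (1.0 - weight_direct) * regression
    return estimate, weight_direct


# Hypothetical values: the direct estimate is noisy relative to the model.
estimate, weight = shrinkage_estimate(
    direct=10.0, regression=8.0, model_var=1.0, sampling_var=3.0
)
```

Under this form, misestimating either component changes the weights directly, which is why both need to be reasonably well estimated.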
The current approach for estimating these components is to assume that the model error variance from the 1989 regression equation with the dependent variable formed from 1990 census data is the same as the model error variance when the dependent variable is formed from the 3 years of CPS data that are used for the county model equation for the target year. The total sampling variance is then obtained simultaneously with the regression parameter estimates through use of maximum likelihood estimation. As part of this procedure, the sampling variance for a particular county is assumed to be inversely proportional to the CPS sample size in that county.

There is ample evidence that the function that is now used to distribute the total sampling variance to counties is incorrect (see Chapter 2). Experimentation with other functions, which has already begun at the Census Bureau (specifically, investigating a function in which the sampling variance is inversely proportional to the square root of the CPS sample size in a county), should be pursued to eliminate or reduce the problem of variance heterogeneity with respect to both the CPS sample size and poverty rate that is evident in the county model regression output. Research on this topic should include an assessment of the effects of alternative variance functions on the county estimates.

In addition, the Census Bureau should pursue an alternative approach, which is to estimate the CPS sampling variances for counties with adequate sample size on the basis of direct calculations of these variances that take account of the clustered sample design within these counties, and then use a generalized variance function for modeling the sampling variances for all counties with CPS sampled households. With this approach, the model error variance is calculated by subtracting the total sampling variance from the total squared error. This approach thus avoids the questionable assumption that the model error variances for the 1989 census equation and the CPS equation for the target year are equal. Census Bureau staff have begun work on fitting a generalized variance function to the CPS sampling variances. This work should continue and should include an early assessment of the effects on the county estimates to determine if the benefits justify continued refinement of the variance modeling.

Model Error and Sampling Error Variance in the State Model

In the state model, the model error variance is obtained from a maximum likelihood procedure that estimates the coefficients of the predictor variables and the model error variance, given estimates of the sampling error variances of the direct state estimates. For most years for which the state model has been estimated, this procedure estimates the model error variance as zero, which results in zero weight being given to the direct CPS estimates. In effect, the model is assumed to be without error, which is not credible.
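The subtraction described above for the alternative county approach, and the way an estimate of model error variance can collapse to zero, can be illustrated with a simple moment calculation. This is only a sketch under stated assumptions: it uses per-area averages, ignores degrees-of-freedom corrections, and floors the result at zero; the function name and the numbers are hypothetical, and it is a simpler stand-in for the maximum likelihood procedure, not a description of it.

```python
def model_error_variance(residuals, sampling_vars):
    """Moment estimate of model error variance.

    Assumption: the mean squared regression residual estimates the sum of
    model error variance and average sampling variance, so subtracting the
    average sampling variance leaves the model error variance. The result
    is floored at zero because a variance cannot be negative.
    """
    if not residuals or len(residuals) != len(sampling_vars):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(residuals)
    mean_squared_error = sum(r * r for r in residuals) / n
    mean_sampling_var = sum(sampling_vars) / n
    return max(0.0, mean_squared_error - mean_sampling_var)


residuals = [0.5, -0.5, 1.0, -1.0]       # hypothetical regression residuals
sampling_vars = [0.2, 0.2, 0.4, 0.4]     # hypothetical sampling variances

plausible = model_error_variance(residuals, sampling_vars)

# If the supplied sampling variances are too large, the difference is
# negative and the floored estimate is exactly zero.
inflated = model_error_variance(residuals, [3 * v for v in sampling_vars])
```

With the inflated inputs the estimate is zero, which under a shrinkage weight of the usual form would give no weight to the direct estimates.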
A likely explanation is that the Census Bureau's estimates of sampling error variance for the direct state estimates are overestimates, which results in a value of zero for the model error variance when those sampling variances are used in the maximum likelihood procedure. The Census Bureau should investigate its procedures for estimating sampling error variance. Without waiting for the results of that work, it should also examine the effects of a simple correction, such as putting a small weight on the direct estimates in weighting the estimates from the CPS equation for a target year.

State Effects

The magnitude of the state raking factors that are used to adjust the county estimates warrants further investigation. The Census Bureau should estimate the variance in the state raking factors for 1993 and 1995 to determine if their variability is consistent with sampling variation. If it is not, then research should be conducted to find an explanation for the variation. One part of this research could be to examine the effect of using 3 years rather than 1 year of CPS data in the state model, as is done in the county model.
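As an illustration of what a state raking factor does, the sketch below assumes that the factor for a state is the ratio of the state model estimate to the sum of the county model estimates within that state, and that each county estimate is multiplied by the factor. The function name, county labels, and numbers are hypothetical.

```python
def rake_to_state(county_estimates, state_estimate):
    """Scale county estimates so that they sum to the state estimate.

    Assumption: the raking factor is state_estimate divided by the sum of
    the county estimates for that state.
    """
    total = sum(county_estimates.values())
    if total <= 0:
        raise ValueError("county estimates must sum to a positive number")
    factor = state_estimate / total
    raked = {county: value * factor for county, value in county_estimates.items()}
    return factor, raked


# Hypothetical state with two counties whose model estimates sum to 400,
# controlled to a state model estimate of 440.
factor, raked = rake_to_state({"County A": 100.0, "County B": 300.0}, 440.0)
```

A factor far from 1 for a given state means that the county model and the state model disagree about that state's total, which is the kind of variation whose size the text proposes to compare with sampling variation.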

More generally, work should be conducted to determine if there are idiosyncratic state effects that should be captured in the county model. The Census Bureau did some preliminary research on adding fixed state effects to alternative formulations of the county model (see National Research Council, 1998:App. A). While the addition of fixed state effects reduced some nonrandom residual patterns in the regression output, a fixed state effects model did not perform better than other models in comparison with the 1990 census estimates (see National Research Council, 1998:App. C and D). Some preliminary work with a random state effects model with two components of variance, one for state and one for county within state (see Fuller and Goyeneche, 1998), suggested that state effects may be present and that further research on a random state effects model should be conducted.

Discrete Variable Models that Use Counties with No Sampled Poor School-Age Children

When using a logarithmic transformation of the number of poor school-age children as the dependent variable in the county regression model, all counties in the CPS sample for which none of the sampled households have poor school-age children (262 of 1,247 counties for the 1995 model) have to be removed from the regression analysis. The dropped counties are generally smaller counties with small CPS sample sizes.

While the dropped counties would have little influence in any regression equation due to their small size, the exclusion of 21 percent of the counties in the CPS sample is a cause for concern. Moreover, the internal and external evaluations of the county model suggest that although the current approach provides reasonably good estimates for small counties for 1989, 1993, and 1995, they could be improved. For example, there is a slight tendency in the county model equation to overpredict poverty in small counties (see Chapter 2).
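The reason that counties with no sampled poor school-age children drop out of the log-scale regression, and the reason a count model does not have the same difficulty, can be sketched in a few lines. The helper names are hypothetical, and the Poisson log-likelihood is shown only as one example of a discrete variable model; the expected count passed to it is an illustrative number, not a fitted value.

```python
import math


def log_dependent_variable(poor_children):
    """Return log(count), or None when the count is zero (log is undefined)."""
    if poor_children < 0:
        raise ValueError("count must be non-negative")
    return math.log(poor_children) if poor_children > 0 else None


def poisson_log_likelihood(count, expected):
    """Log-probability of an observed count under a Poisson mean.

    Unlike the log transformation, this is well defined when count == 0.
    """
    if count < 0 or expected <= 0:
        raise ValueError("count must be >= 0 and expected must be > 0")
    return count * math.log(expected) - expected - math.lgamma(count + 1)


# Share of CPS sample counties dropped from the 1995 log-scale regression.
dropped_share = 262 / 1247

zero_on_log_scale = log_dependent_variable(0)        # no usable value
zero_under_poisson = poisson_log_likelihood(0, 2.5)  # a finite contribution
```

The dropped share works out to about 21 percent, matching the figure in the text.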
It is important to investigate the development of discrete variable regression models, such as Poisson regression or other forms of generalized linear models, that permit the inclusion of data for those counties that have no sampled families with children in poverty. There are two factors that complicate the development of discrete variable models in this context: the lack of fully developed hierarchical models and related shrinkage procedures and the lack of methods for optimal incorporation of CPS sampling variances. However, Markov Chain Monte Carlo implementation of hierarchical models can be used to address the first issue and, with additional research and development, can also probably address the second issue.

Ways to Reduce the Time Lag of the Estimates

The Title I fund allocations for the 1999-2000 school year will be based on estimates of school-age children in 1996 who were in poor families in 1995, and these estimates will also be used for the 2000-2001 school year allocations. It is important to explore the extent to which this time lag can be reduced for the county estimates, which will correspondingly reduce the time lag for the school district estimates.¹ The Census Bureau began some exploratory work on this topic in June 1997 but had to put it aside. Now that the county estimation procedure has been developed and put on a production basis, it is important to resume this work.

One of the causes of the lag is the availability of food stamp data, which must be obtained from individual states in some instances and which are not available until almost 2 years after the year to which they refer. It might be possible to overcome this problem, without seriously harming the performance of the county model, by using food stamp data for the year prior to the estimation year. Another possibility is to control the estimates from the county model to the state model estimates for the latest of the 3 years of CPS data used in the county model, instead of to the middle year. These ideas and others need to be evaluated to determine if the lag between the time period of the estimates and the year of allocation of funds can be reduced.

Evaluation of Food Stamp and Other Input Data

Regular evaluation of the continued suitability of food stamp and other data for input to the state and county models is important for the Census Bureau's small-area estimation program. Changes in welfare programs and the accompanying data systems (especially those resulting from the 1996 Personal Responsibility and Work Opportunity Reconciliation Act) will almost certainly affect the comparability of food stamp data over geographic areas. For example, legal immigrants, many of whom are no longer eligible for benefits, are very unevenly distributed geographically.
Comparability is an important assumption in both the county and state regression models, and, therefore, the way in which food stamp data are used as a predictor variable in the models may need to be modified. Changes in the tax system could also affect the usefulness of IRS data for small-area poverty estimation. More generally, it is important to continually evaluate the input data to the state and county models to assess errors or inconsistencies in them and to develop methods to account for those errors in the modeling process.

Large Category Differences and Residual Patterns for the State and County Models

The internal and external evaluations (see Chapter 2) demonstrated that the state and county models are generally well behaved with respect to the estimates for various categories of states and counties. However, it is important to investigate further the residual patterns and category differences to determine if the regression models could be improved either through a modification of the model form or through the addition of predictor variables.²

¹ It would also be desirable to reduce the time lag in the school district boundary survey so that the allocations are made to current school districts. However, that survey is conducted every 2 years, and it may not be possible to carry it out more frequently or to complete it more quickly.

As an example of a pattern that is worth further investigation, when compared with CPS aggregate estimates, the county model exhibited a tendency in 1989, 1993, and 1995 to underpredict the number of poor school-age children in counties with large percentages of Hispanics. Also, from examination of the standardized residuals, the state model exhibited a tendency to underpredict the proportion of poor school-age children in some states in the West Region. More generally, as a model is estimated for additional years, it is important to look for consistent patterns of residuals and category differences to understand their causes and to take corrective action when necessary. While it may be necessary to tolerate overprediction or underprediction for a particular type of area in any one year, a consistent pattern of overprediction or underprediction needs to be addressed.

In the evaluation of residuals and category differences, particular attention should be paid to states and counties that have experienced large demographic or socioeconomic changes that may correlate with changes in numbers of poor school-age children. For example, the federal tax return data that are used to estimate internal migration for the demographic population estimates might be used to classify states and counties into categories by migration rates and the performance of the models compared for these categories. Also, the performance of the models might be compared for categories of counties classified by overall population change since the 1990 census.
In turn, adding predictor variables to the models from the decennial census and the demographic estimates program, possibly including interaction terms, may prove a fruitful way to address persistent patterns of overprediction or underprediction for these and other categories of states and counties.

² The evaluations conducted to date of the county estimates include examination of the residual patterns from the regression model, comparisons of the model estimates for 1989 with 1990 census estimates, and comparisons of the model estimates for 1989, 1993, and 1995 with aggregate CPS estimates. Another evaluation that could help determine what portion of the errors in the county estimates is due to problems with the model (rather than measurement differences and sampling variability) is to fit the model to 1990 census data (prior to shrinkage and raking to the state model) and to compare the estimates to 1990 census values for aggregates of counties. This evaluation is similar to the county model-CPS aggregate comparisons, but it has the advantage that the sampling error in the census is much less than in the CPS. The county model estimates are not shrunk for this evaluation because the resulting estimates would have considerable weight on the census direct estimates and so be less informative about possible problems with the regression model.
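The comparison of model performance across categories of areas, as discussed in this section, can be sketched as follows. The sketch assumes that a category difference is the percentage difference between the sum of the model estimates and the sum of a comparison estimate (for example, an aggregate direct estimate) over all areas in the category. The category labels and values are hypothetical.

```python
def category_differences(records):
    """Percentage difference of summed model vs. summed comparison estimates.

    records: iterable of (category, model_estimate, comparison_estimate).
    Assumption: a positive value indicates overprediction for the category
    and a negative value indicates underprediction.
    """
    sums = {}
    for category, model, comparison in records:
        model_sum, comparison_sum = sums.get(category, (0.0, 0.0))
        sums[category] = (model_sum + model, comparison_sum + comparison)
    return {
        category: 100.0 * (model_sum - comparison_sum) / comparison_sum
        for category, (model_sum, comparison_sum) in sums.items()
        if comparison_sum > 0
    }


# Hypothetical counties classified by migration rate.
records = [
    ("high migration", 90.0, 100.0),
    ("high migration", 110.0, 100.0),
    ("low migration", 50.0, 40.0),
]
differences = category_differences(records)
```

Repeating such a calculation for each estimation year would show whether overprediction or underprediction for a category is persistent or confined to a single year.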

School District Estimates

There cannot be marked improvements in the school district estimates without a substantial effort to improve the data sources for districts and to develop models to use them. Nonetheless, work should go forward to further evaluate the current estimation method and to seek to effect modest improvements in it. Three important areas for research are: investigation of methods to reduce the variance of the 1990 census estimates of poor school-age children; use of school enrollment data to improve estimates of the total number of school-age children; and investigation of the possible use of National School Lunch Program data to improve estimates of poor school-age children.

Reducing the Variance of the 1990 Census Estimates of Poor School-Age Children

Because so many school districts are so small in size, the 1990 census estimates of poor school-age children, which derive from the long-form sample, are subject to high sampling variability. In addition to affecting the quality of the 1995 estimates that were developed by the Census Bureau's synthetic method, the sampling variability in the 1990 census estimates affects the 1980-1990 evaluations. The evaluation measures reported in Chapter 3 overstate the degree of error in the synthetic estimates because of this sampling variability. The Bureau should conduct research to determine the extent of this overstatement for school districts of different sizes and compute adjusted evaluation measures in which the effect of this sampling variability is removed. A simple approach would be to use the mean square error as an evaluation measure. This measure may then be readily adjusted by subtracting out the sampling variance of the census estimates, thereby producing a more valid measure of the quality of the synthetic estimates.
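The adjustment just described can be sketched as follows, assuming that the census estimates are unbiased and that their sampling errors are independent of the errors in the synthetic estimates, so that the expected squared difference equals the mean square error of the synthetic estimates plus the census sampling variance. The function name and the numbers are hypothetical.

```python
def adjusted_mean_square_error(synthetic, census, census_sampling_vars):
    """Return (raw, adjusted) mean square error of synthetic estimates.

    raw:      mean squared difference between synthetic and census estimates.
    adjusted: raw minus the mean sampling variance of the census estimates.
    The adjusted value can be negative in a small set of districts; it is
    left unfloored so that this is visible to the analyst.
    """
    n = len(synthetic)
    if n == 0 or len(census) != n or len(census_sampling_vars) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    raw = sum((s - c) ** 2 for s, c in zip(synthetic, census)) / n
    adjusted = raw - sum(census_sampling_vars) / n
    return raw, adjusted


# Hypothetical estimates for three districts.
raw, adjusted = adjusted_mean_square_error(
    synthetic=[10.0, 20.0, 30.0],
    census=[12.0, 17.0, 31.0],
    census_sampling_vars=[1.0, 2.0, 3.0],
)
```

Computing the two measures separately for size classes of districts would show how much of the apparent error in each class is attributable to sampling variability in the census standard.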
The 1990 census school district estimates of poor school-age children that were used in the 1995 estimates and as the standard of comparison in the 1980-1990 evaluations were developed by ratio adjustment. This procedure, which applies the long-form-sample-based estimates of the school-age poverty rate to the complete-count estimates of total school-age children, reduces the variance of the 1990 census estimates to a modest extent. Other ways to further reduce the variance should be investigated.

One approach is to incorporate other characteristics from the census short form that are known to be related to poverty in estimating school district numbers of poor school-age children from the 1990 census. For example, such characteristics as race and ethnicity, home tenure (owner, renter), family type, and residence (e.g., central city) could be used for this purpose. A very simple form of this type of estimation procedure would be a stratified ratio adjustment with strata defined using short-form information.

Another approach is to smooth the 1990 census school district estimates with the 1990 census county estimates. By carefully constructing smoothed school-district estimates as combinations of school-district and county-level estimates, it might be possible to produce school-district estimates with lower mean square errors than the direct 1990 census estimates. It would be desirable to make use of knowledge about model error and sampling variances at the school-district level, if available, to tailor the degree of smoothing for each school district. If successful, smoothing procedures might substantially improve the estimation of census school-age poverty rates in small school districts. They would add some bias because county poverty rates differ from poverty rates for school districts contained within them, but they could potentially substantially reduce variance, thereby improving mean square error.

The development of a smoothing approach should include a thorough evaluation. As part of that evaluation, it would be useful to compare 1990 census estimates of poor school-age children for school districts with three sets of estimates that differ in the calculation of 1980 census within-county shares that are applied to the 1989 county model estimates: unsmoothed 1980 census within-county shares (as in synthetic method (1); see Chapter 3); smoothed 1980 census within-county shares; and 1980 census within-county shares that use the 1980 census county school-age poverty rates for all school districts within each county. The third method represents a complete smoothing of the school district poverty rates within counties.

If one or both methods for reducing the variance of the 1990 census school district estimates of poor school-age children (smoothing and using other characteristics in the estimation) are successful, then the revised 1990 census estimates should be employed with the synthetic shares approach if it is used again in the future. The revised estimates should also be used as the standard of evaluation for assessing the synthetic shares estimates of poor school-age children in 1989.
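The trade-off between added bias and reduced variance in smoothing a district rate toward its county rate can be made concrete with a simple composite estimator. This is a sketch under strong, stated assumptions: the county rate is treated as having negligible sampling variance, the district direct estimate is treated as unbiased, and the bias of the county rate for the district is treated as known (in practice it is not). All names and numbers are hypothetical.

```python
def smoothed_rate(district_rate, county_rate, weight_district):
    """Weighted combination of a district rate and its county rate."""
    if not 0.0 <= weight_district <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight_district * district_rate + (1.0 - weight_district) * county_rate


def smoothed_mse(weight_district, district_sampling_var, county_bias):
    """Mean square error of the composite under the assumptions above.

    Variance term: weight^2 * sampling variance of the district estimate.
    Bias term:     ((1 - weight) * county_bias)^2.
    """
    variance = weight_district ** 2 * district_sampling_var
    bias = (1.0 - weight_district) * county_bias
    return variance + bias ** 2


def best_weight(district_sampling_var, county_bias):
    """Weight on the district estimate that minimizes smoothed_mse."""
    bias_squared = county_bias ** 2
    total = district_sampling_var + bias_squared
    if total == 0:
        raise ValueError("variance and bias cannot both be zero")
    return bias_squared / total


# Hypothetical small district: noisy direct rate, modest county bias.
weight = best_weight(district_sampling_var=0.004, county_bias=0.02)
mse_smoothed = smoothed_mse(weight, 0.004, 0.02)
mse_direct = smoothed_mse(1.0, 0.004, 0.02)   # no smoothing
mse_county = smoothed_mse(0.0, 0.004, 0.02)   # complete smoothing
```

In this illustration the smoothed estimate has a lower mean square error than either the direct estimate or the county rate used alone, which is the potential gain the text describes; a district with a smaller sampling variance would receive a larger weight on its own estimate.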
Use of School Enrollment Data to Improve Estimates of the Total Number of School-Age Children

The method for estimating total school-age children is similar to that for estimating poor school-age children, namely, to apply the 1990 census school district shares within each county to updated county estimates. The method is more robust for total school-age children (and total population) than for poor school-age children because the numbers being estimated are larger and because the 1990 census shares for total school-age children (and total population) are based on complete-count data that are not subject to sampling error. But the synthetic shares method still does not capture within-county changes in school district populations that have occurred since the 1990 census.

Public school enrollment data are collected annually by the National Center for Education Statistics (NCES) for school districts. Research should be conducted to determine if these data could be used to update the within-county school district shares of total school-age children. Research could begin by examining reported school enrollment in the 1980 and 1990 censuses for school districts to determine if the within-county enrollment shares in 1990, or, alternatively, the changes in enrollment from 1980 to 1990, produce estimates of total school-age children that are more accurate for 1990 than the 1980 census-based shares. (Work is under way along these lines at the Census Bureau.) Research would also be needed to evaluate the quality of the NCES enrollment data and to determine if such factors as changes in public versus private school enrollment present a problem for estimation.

If it is determined that the use of enrollment data would improve school district estimates of total school-age children, it will be necessary to modify the estimation procedure for poor school-age children so that the estimates of both groups (total and poor) are consistent. One way to achieve consistency would be to apply 1990 census school-age poverty rates for districts to the updated estimates of within-county shares of total school-age children that are developed from enrollment data.

Possible Use of School Lunch Data to Improve Estimates of Poor School-Age Children

There are many reasons that school lunch data are not necessarily a good proxy for school-age poverty (see Chapter 3). Moreover, at present, there is no complete, accurate source of school lunch data by school district that is readily available to the Census Bureau. Nonetheless, participation in the National School Lunch Program is an indicator of low income, and it seems worthwhile to pursue for other states the research that the panel undertook for New York. The Census Bureau may be able to work through its state data centers for selected states to obtain school lunch data by district for 1989-1990 to evaluate whether within-county school lunch participation shares in 1989-1990 produce estimates of poor school-age children in 1989 that are more accurate than those produced from the 1980 census-based shares. Another approach to evaluate is whether a combination of school lunch data and census data would be preferable to using either data source alone.
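The within-county shares calculation that underlies these comparisons can be sketched as follows. The shares could be formed from census counts, enrollment counts, or school lunch participation counts; the same arithmetic applies in each case. The function names, district labels, and numbers are hypothetical, and the sketch omits the handling of districts that cross county lines.

```python
def within_county_shares(district_counts):
    """Each district's share of the county total for the chosen data source."""
    total = sum(district_counts.values())
    if total <= 0:
        raise ValueError("district counts must sum to a positive number")
    return {district: count / total for district, count in district_counts.items()}


def synthetic_estimates(shares, county_estimate):
    """Apply within-county shares to an updated county estimate."""
    return {district: share * county_estimate for district, share in shares.items()}


# Hypothetical county with two districts.
census_shares = within_county_shares({"District 1": 60, "District 2": 40})
lunch_shares = within_county_shares({"District 1": 45, "District 2": 55})

# The same updated county estimate distributed with each set of shares.
from_census = synthetic_estimates(census_shares, 150.0)
from_lunch = synthetic_estimates(lunch_shares, 150.0)
```

Because every set of shares sums to 1 within the county, the district estimates always add to the county estimate; the choice of data source affects only how the county total is distributed among districts.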
The research should also look at the effects of using school lunch data, solely or in combination with census data, to estimate school-age poverty rates because of the role that rates play in concentration grants. If the results of this research are promising, it would be necessary for the NCES to improve the reporting of participation in the National School Lunch Program that it collects in the Common Core of Data.

LONGER TERM PRIORITIES

State and County Models

In the longer term, research should proceed on multivariate approaches to state and county estimation that take advantage of the multiple data sources that are likely to become available in the next decade. These sources are the March CPS, the 2000 decennial census, and the monthly ACS.

Multivariate State and County Models

Use of multiple data sources (from separate surveys or multiple years of the same survey) in a system of equations can be advantageous for small-area modeling. For the state model, the Census Bureau has initiated work on a multivariate approach to incorporating the data from several years of the CPS, instead of just one year, into the regression equation (see Otto and Bell, 1997).

For the county model, the Census Bureau developed, as an alternative to the separate use of CPS and census county regression equations (with the census equation being used only to estimate the model error variance for the CPS model), a bivariate county regression model, in which the two dependent variables are the CPS and the previous census estimates of poor school-age children. This formulation has some very real advantages (see National Research Council, 1998:App. C). First, the internal evaluation of the regression output for the bivariate models for 1993 indicated that they are as good as or possibly better than their single-equation analogues. In addition, tests of the constancy of the parameter that distinguished between the single-equation and bivariate formulation clearly showed the benefit of the bivariate approach. Unfortunately, lack of administrative records data for 1979 prevented the Census Bureau from conducting an external evaluation of the bivariate models in comparison with the 1990 census. Therefore, given the novelty and relative lack of evaluation of these models, the panel did not recommend using them for the production of 1993 or 1995 county estimates of poor school-age children. However, research into this approach should continue, including an external evaluation as soon as that is feasible using the 2000 census data.
Similarly, integrating multiple years of the March Income Supplement of the CPS into the county estimation procedure by means of a multivariate model, as opposed to the current procedure of averaging the data for 3 years, may be advantageous. A multivariate model, with estimates from more than one CPS year and the census as dependent variables in a linear system of equations, might provide an effective way of using more of the available information. In the future this model could also incorporate data from the ACS, possibly by adding equations for the estimates from that survey. More broadly, a wide variety of approaches that combine information over time and over geographic areas should be considered, as such a combination might prove effective at modeling poverty for small areas. Because poverty very likely has commonalities over time and across areas that are similar in economic conditions, efforts to exploit this structure could prove advantageous and should be examined.

American Community Survey

The American Community Survey, when it is fully operational, will be an important component of any approach to providing updated estimates of poor school-age children for small areas.³ For states and counties, it is possible that several months (or years) of data from the ACS might be used to provide direct estimates of poor school-age children. Alternatively, ACS data could be used indirectly as a dependent variable in a model-based approach for states and counties, parallel to the manner in which CPS data are currently used. However, given that each year of the CPS and the 2000 census will also provide information on poverty,⁴ it will be important to find ways to use all three sources of information together, for multiple time periods (for the CPS and ACS), to produce the best state and county estimates. Furthermore, given that all three data sources will have their own measurement biases⁵ and that they are available for different time periods (the decennial census year, multiple years of the CPS March Income Supplement, and many months of the ACS), it is unlikely that simply pooling estimates from the three data sources can be justified. Some adjustment or modeling procedure will be needed. Such a procedure will have to take account of available information about the variances and biases of the estimates from each data source.

Continued research and development on measurement error and time-series models will be needed to develop effective multivariate models for small-area poverty estimates that use multiple data sources for multiple time periods.⁶ A specific research issue is to determine how best to use the 2000 census information, which will have lower sampling variance but possibly substantial measurement bias, and which may be biased if the economic conditions during the census reference period differ markedly from the period for which estimates are needed.
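To make concrete what it means for a combining procedure to take account of both variances and biases, the following is a minimal sketch of a textbook bias-adjusted, inverse-variance weighted average. It is not the panel's or the Census Bureau's procedure. The class `SourceEstimate`, the function `combine`, and all numbers are hypothetical, and the sketch assumes that the sources are statistically independent and that each source's bias is known, which in practice it would not be.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SourceEstimate:
    """One data source's estimate for a single area (hypothetical structure)."""
    name: str
    estimate: float      # estimated number of poor school-age children
    variance: float      # sampling variance of the estimate (must be > 0)
    bias: float = 0.0    # assumed measurement bias relative to the target concept


def combine(sources: List[SourceEstimate]) -> Tuple[float, float]:
    """Return (combined estimate, variance of the combined estimate).

    Each estimate is first corrected for its assumed bias, then the corrected
    estimates are averaged with weights proportional to 1 / variance.
    Assumes the sources are independent.
    """
    if not sources:
        raise ValueError("at least one source is required")
    if any(s.variance <= 0 for s in sources):
        raise ValueError("variances must be positive")
    weights = [1.0 / s.variance for s in sources]
    total_weight = sum(weights)
    combined = sum(w * (s.estimate - s.bias) for w, s in zip(weights, sources)) / total_weight
    return combined, 1.0 / total_weight


# Illustrative, made-up inputs for one county.
example = [
    SourceEstimate("census", estimate=100.0, variance=4.0),
    SourceEstimate("cps", estimate=110.0, variance=1.0, bias=5.0),
]
combined_estimate, combined_variance = combine(example)
print(combined_estimate, combined_variance)
```

A simple pooled average would ignore both the `variance` and `bias` fields. The sketch shows why information about each is needed before estimates from the census, the CPS, and the ACS can be put on a common footing.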
In order to learn as much as possible about the measurement differences between the census, the March CPS, and the ACS, the Census Bureau should plan now for an exact match of the 2000 census with both the March 2000 CPS and the national ACS sample of about 70,000 households that will be in the field in that year. These two matches would provide a wealth of information about the three different income measurement systems. They would also provide key inputs to the development of a CPS-census measurement error model, which could help resolve some remaining issues about the state and county models. For example, some of the category differences observed in the 1990 census comparisons for the county model could be due to differences between the CPS and census measurements of poverty. A CPS-census measurement error model could also provide information from which to determine how to use data from the 2000 census with the current CPS-based estimation procedure to minimize discontinuities in the Title I fund allocations that may occur when the data from the 2000 census are incorporated into the models.

³ The ACS, together with the 2000 census Master Address File, may also provide the means to improve small-area estimates of total population and population by age.
⁴ If the ACS is implemented as planned, it is likely that the 2010 census and subsequent censuses will not include a long form and, hence, will not provide income and poverty information.
⁵ The data collection methods for the census, CPS, and ACS differ in many respects, including the length of the questionnaire, the primary data collection technique (face-to-face interviews or mail questionnaires), the definitions of variables, the reference period for income measurement, and editing and imputation methods. Any of these differences can lead to different measurement biases.
⁶ Measurement error models, by attempting to model effects over time and across states resulting from changes in program administration, could also be used to adjust administrative data that are used as predictor variables in estimation models for differences due to time or state effects.

School Districts

The planned implementation of the ACS and the availability of 2000 census data hold out the prospect for markedly improved estimates of poor school-age children for school districts, as well as for states and counties. However, the availability of 2000 census and ACS data alone will not likely be sufficient to provide estimates of acceptable quality for school districts that reflect within-county as well as between-county changes in school-age poverty for districts.⁷ It is likely that modeling will be necessary, and modeling, in turn, will require sources of data to serve as predictor variables. With the Master Address File that will be completed for the 2000 census, it should be possible to geocode most federal tax return data to the school district level. In fact, if a high proportion of tax return addresses can be geocoded in the near future, even before the census itself is completed, that information could be used to improve the current synthetic shares estimation method.⁸ It may also be possible to undertake a federal-state cooperative effort to provide food stamp data that are geocoded to school districts.

A substantial research and development effort will be needed for improved school district estimates of poor school-age children, and work on it should begin now. The panel will comment further on the long-term prospects for improvement in its final report, due at the end of 1999.
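The synthetic shares method referred to above can be illustrated with a minimal sketch. It assumes that the method allocates an updated county estimate of poor school-age children to the districts in that county in proportion to each district's share of the county's poor school-age children in the previous census. The function name `synthetic_shares`, the district labels, and the counts are hypothetical.

```python
from typing import Dict


def synthetic_shares(county_estimate: float,
                     census_district_poor: Dict[str, float]) -> Dict[str, float]:
    """Allocate an updated county estimate to districts using census shares.

    county_estimate: updated (model-based) number of poor school-age children
        in the county.
    census_district_poor: previous-census counts of poor school-age children
        for each district (or district part) within the county.
    """
    census_total = sum(census_district_poor.values())
    if census_total <= 0:
        raise ValueError("census counts must sum to a positive number")
    # Each district keeps its census share; only the county total is updated.
    return {
        district: county_estimate * count / census_total
        for district, count in census_district_poor.items()
    }


# Illustrative, made-up numbers for a county with two districts.
updated = synthetic_shares(1200.0, {"District A": 300.0, "District B": 100.0})
print(updated)
```

Because the shares are frozen at their census values, a sketch like this reproduces between-county change (through the updated county total) but cannot reflect any shift in poverty among districts within the county. That limitation is the reason additional data sources geocoded to school districts, such as tax returns or food stamp records, are of interest.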
⁷ For many school districts, data from the ACS will have to be pooled across several years to produce direct estimates of adequate precision. Because the ACS will not be fully phased in until 2003, the first 5-year pooled estimates, for example, will not be available until 2008. Moreover, such estimates will still be subject to high sampling variability for many districts, similar to the census.
⁸ However, complete geocoding of these data would likely require that all tax returns be filed with the address of the tax filer's residence rather than the address of the tax preparer.

DOCUMENTATION AND EVALUATION

The development of small-area estimates of income and poverty is a major effort that includes data acquisition and review, database development, geographic mapping and geocoding of data, methodological research, model development and testing, and documentation and evaluation of procedures and outputs. Since the production of small-area poverty estimates supports a range of important public policies for federal, state, and local governments, including the allocation of funds, it is essential that the Census Bureau have adequate staff and other resources for all components of the estimation program, including evaluation and documentation. It is the responsibility of any agency that produces model-based estimates to conduct a thorough assessment of them, including internal and external evaluations of alternative model formulations.

An integral part of the evaluation effort is the preparation of detailed documentation of the modeling procedures and evaluation results. No small-area estimates should be published without full documentation. Such documentation is needed for analysts both inside and outside the Census Bureau to judge the quality of the estimates and to identify areas for research and development to improve the estimates in future years.