Copyright © 2009. National Academy of Sciences. All rights reserved.
Future Research and Development
For several reasons, it is important for the Census Bureau to
continue an active program of research and development for methods of estimating
poverty for school-age children at the county and school district levels. For
counties, although there is clear evidence that the county model is performing
well, the county (and state) model evaluations have identified a number of issues
that warrant investigation as a priority in the short term to determine how to
further improve the estimation procedures. Also, with a model-based approach,
it is important to examine carefully the continued applicability of a model each
time it is used and to modify it appropriately when necessary. In addition,
research is needed to take account of likely future developments in the availabil-
ity and characteristics of data sources that have implications for the modeling
effort and to work on longer term modeling issues. Continued work to improve
the county model is important not only for county estimates, but also to improve
school district estimates that are developed by using the basic synthetic shares
estimation procedure.
For school districts, the important short-term priority is to investigate ways
to improve the synthetic shares method for developing updated estimates of total
and poor school-age children. Also, it is not too soon to begin research on ways
to take advantage of likely future developments in available data that could make
it possible to develop an estimation method that (unlike the shares method)
captures changes in school-age poverty among districts within counties as well as
changes between counties.
The chapter begins by reviewing the schedule for the Census Bureau to
provide updated small-area estimates of poor school-age children. It then considers
short-term and longer term research priorities for county and school district
estimates. It concludes by noting the requirements for an ongoing program of
small-area income and poverty estimates, particularly for thorough evaluation
and full documentation of models and results.
SMALL-AREA ESTIMATES OF SCHOOL-AGE CHILDREN IN POVERTY
SCHEDULE CONSIDERATIONS
Over the next 5 years, there are three legislatively mandated deadlines for
the Census Bureau to deliver updated school district estimates of poor school-age
children to the Department of Education for use in Title I allocations:
· October 2000: estimates for 1997 (or later) for use for allocations for the
2001-2002 and 2002-2003 school years
· October 2002: estimates for 1999 (or later) for use for allocations for the
2003-2004 and 2004-2005 school years
· October 2004: estimates for 2001 (or later) for use for allocations for the
2005-2006 and 2006-2007 school years
In each case, three estimates are needed: the number of school-age children, the
number of poor school-age children, and the total population. Although the legislation does not require
county estimates, they will be needed as long as the method for producing school
district estimates includes an adjustment or control to county estimates. There is
also interest in state and county estimates of poor children for other important
public policy uses, such as evaluating the effects of changes in welfare programs.
Priorities for short-term and longer term research should consider the impor-
tant changes that are likely to occur in the availability of data for modeling over
the next 5 years and beyond, which include:
· current and future changes to welfare programs and tax systems that may
affect the comparability or applicability of Food Stamp Program and Internal
Revenue Service (IRS) data for use in small-area estimation models;
· the income and poverty estimates for small areas that will be available
from the 2000 decennial census long-form sample of about 17 million households
(likely to be available in 2002 for counties but not until later for school districts);
and
· the planned introduction of the American Community Survey (ACS) as a
large-scale, continuing sample survey of U.S. households, conducted primarily
by mail, that will provide estimates similar to those provided by the decennial
census long-form sample, including income and poverty estimates for small
areas. The ACS is currently being tested in 4 sites; under current plans, it will be
implemented in 31 sites in 1999-2001 for comparison with the 2000 census. For
each year from 2000 to 2002, the ACS will sample about 70,000 households
nationwide. Beginning in 2003, the ACS will sample 250,000 households each
month throughout the decade, for an annual sample size of about 3 million house-
holds. The current plan is that the ACS (as well as the 2000 census long-form
sample) will oversample small jurisdictions. Unlike the 1990 census, the over-
sampling in the 2000 census and the ACS will include small school districts (see
Alexander, 1998).
SHORT-TERM PRIORITIES
County Estimates
The panel identified seven types of research that should be pursued as a
priority to determine if the current estimation procedure for counties can be
improved: modeling of CPS county sampling variances; estimation of model
error and sampling error variance in the state model; methods to incorporate state
effects in the county model; discrete variable models that include counties in the
CPS sample that have no sampled households with poor school-age children;
ways to reduce the time lag of the estimates; evaluation of food stamp and other
input data; and large category differences and residual patterns for the state and
county models. This research, much of which the Census Bureau has planned,
should be conducted and the results fully evaluated well before the next delivery
of updated county estimates of poor school-age children, scheduled for October
2000.
Modeling of CPS County Sampling Variances
The residual variance for
the county model comprises two components: the model error variance and the
sampling variance of the dependent variable. These two components need to be
reasonably well estimated for the application of the model (e.g., to determine the
relative weights of the regression estimate and the direct estimate in the shrinkage
procedure). The current approach for estimating these components is to assume
that the model error variance from the 1989 regression equation with the depen-
dent variable formed from 1990 census data is the same as the model error
variance when the dependent variable is formed from the 3 years of CPS data that
are used for the county model equation for the target year. The total sampling
variance is then obtained simultaneously with the regression parameter estimates
through use of maximum likelihood estimation. As part of this procedure, the
sampling variance for a particular county is assumed to be inversely proportional
to the CPS sample size in that county.
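The role of the two variance components in the shrinkage procedure can be illustrated with a minimal sketch. It assumes the standard composite form in which the weight on the direct estimate equals the model error variance divided by the sum of the model error variance and the sampling variance. The function names, the constant k, and all numbers are hypothetical and are not taken from the Census Bureau's production procedure.

```python
def sampling_variance(k, n):
    """Sampling variance assumed inversely proportional to the CPS sample size n."""
    return k / n


def shrinkage_estimate(direct, regression, model_var, samp_var):
    """Weighted average of the direct and regression estimates.

    The weight on the direct estimate is model_var / (model_var + samp_var),
    so counties with noisy direct estimates lean on the regression prediction.
    """
    w = model_var / (model_var + samp_var)
    return w * direct + (1.0 - w) * regression, w


# Hypothetical values on the log scale (illustrative only).
model_var = 0.04
k = 2.0
for n in (10, 50, 400):
    est, w = shrinkage_estimate(direct=6.0, regression=5.5,
                                model_var=model_var,
                                samp_var=sampling_variance(k, n))
    print(f"n={n:4d}  weight on direct={w:.3f}  shrunk estimate={est:.3f}")
```

Under these assumptions, a misstated sampling variance shifts the weight between the two estimates, which is why both components need to be reasonably well estimated.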
There is ample evidence that the function that is now used to distribute the
total sampling variance to counties is incorrect (see Chapter 2). Experimentation
with other functions, which has already begun at the Census Bureau (specifically,
investigating a function in which the sampling variance is inversely proportional
to the square root of the CPS sample size in a county), should be pursued to
eliminate or reduce the problem of variance heterogeneity with respect to both
the CPS sample size and poverty rate that is evident in the county model regres-
sion output. Research on this topic should include an assessment of the effects of
alternative variance functions on the county estimates.
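To see how the choice of variance function changes the picture, the sketch below splits a fixed total sampling variance across counties in proportion to either 1/n or 1/sqrt(n). Holding the total fixed at 1.0, the allocation rule, and the sample sizes are all assumptions made for illustration; this is not the Census Bureau's estimation code.

```python
def allocate_variance(total_var, sample_sizes, power):
    """Split total_var across counties in proportion to n ** (-power).

    power=1.0 gives variance proportional to 1/n;
    power=0.5 gives variance proportional to 1/sqrt(n).
    """
    raw = [n ** (-power) for n in sample_sizes]
    scale = total_var / sum(raw)
    return [scale * r for r in raw]


# Hypothetical CPS sample sizes for four counties.
sizes = [4, 16, 64, 256]
inv_n = allocate_variance(1.0, sizes, power=1.0)
inv_sqrt_n = allocate_variance(1.0, sizes, power=0.5)

for n, a, b in zip(sizes, inv_n, inv_sqrt_n):
    print(f"n={n:3d}  var ~ 1/n: {a:.4f}   var ~ 1/sqrt(n): {b:.4f}")
```

The square-root function assigns relatively less variance to the smallest counties and more to the largest, so the shrinkage weights, and hence the county estimates, would change with it.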
In addition, the Census Bureau should pursue an alternative approach, which
is to estimate the CPS sampling variances for counties with adequate sample size
on the basis of direct calculations of these variances that take account of the
clustered sample design within these counties, and then use a generalized vari-
ance function for modeling the sampling variances for all counties with CPS
sampled households. With this approach, the model error variance is calculated
by subtracting the total sampling variance from the total squared error. This
approach thus avoids the questionable assumption that the model error variances
for the 1989 census equation and the CPS equation for the target year are equal.
Census Bureau staff have begun work on fitting a generalized variance function
to the CPS sampling variances. This work should continue and should include an
early assessment of the effects on the county estimates to determine if the ben-
efits justify continued refinement of the variance modeling.
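The subtraction step in this alternative approach can be sketched as a simple moment calculation. The degrees-of-freedom divisor and the truncation at zero are assumptions added here for illustration, as are the residuals and sampling variances; the text specifies only that the total sampling variance is subtracted from the total squared error.

```python
def model_error_variance(residuals, sampling_vars, n_params):
    """Moment estimate of the model error variance.

    Subtracts the total sampling variance from the total squared error,
    then averages over the residual degrees of freedom (an assumption)
    and truncates at zero so the result is a valid variance.
    """
    total_sq_error = sum(r * r for r in residuals)
    total_samp_var = sum(sampling_vars)
    dof = len(residuals) - n_params
    return max(0.0, (total_sq_error - total_samp_var) / dof)


# Hypothetical regression residuals and directly estimated (or GVF-based)
# sampling variances for five counties.
residuals = [0.5, -0.3, 0.4, -0.6, 0.2]
sampling_vars = [0.10, 0.05, 0.08, 0.12, 0.05]
print(model_error_variance(residuals, sampling_vars, n_params=1))
```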
Model Error and Sampling Error Variance in the State Model
In the state model, the model error variance is obtained from a maximum likelihood
procedure that estimates the coefficients of the predictor variables and the model error
variance, given estimates of the sampling error variances of the direct state esti-
mates. For most years for which the state model has been estimated, this proce-
dure estimates the model error variance as zero, which results in zero weight
being given to the direct CPS estimates. In effect, the model is assumed to be
without error, which is not credible. A likely explanation is that the Census
Bureau's estimates of sampling error variance for the direct state estimates are
overestimates, which drives the maximum likelihood estimate of the model error
variance to zero.
The Census Bureau should investigate its procedures for estimating sampling
error variance. And without waiting for the results of that work, it should also
examine the effects of a simple correction, such as putting a small weight on the
direct estimates in weighting the estimates from the CPS equation for a target
year.
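A small numerical sketch shows how overstated sampling variances can push the maximum likelihood estimate of the model error variance to zero. It assumes an intercept-only model with known, equal sampling variances and a grid search over candidate values; the data and names are hypothetical, and the actual state model includes predictor variables.

```python
import math


def profile_loglik(sigma2, y, d):
    """Log-likelihood of y_i ~ N(mu, sigma2 + d_i), with mu set to its GLS value."""
    w = [1.0 / (sigma2 + di) for di in d]
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return -0.5 * sum(math.log(sigma2 + di) + wi * (yi - mu) ** 2
                      for wi, yi, di in zip(w, y, d))


def ml_model_variance(y, d, grid):
    """Grid-search maximum likelihood estimate of the model error variance."""
    return max(grid, key=lambda s2: profile_loglik(s2, y, d))


# Hypothetical direct state poverty rates.
y = [0.10, 0.14, 0.08, 0.16, 0.12, 0.18, 0.06]
grid = [i * 0.0001 for i in range(101)]  # candidate values from 0 to 0.01

d_plausible = [0.0004] * len(y)  # sampling variances of a plausible size
d_inflated = [0.004] * len(y)    # the same variances overstated tenfold

for label, d in (("plausible", d_plausible), ("inflated", d_inflated)):
    s2 = ml_model_variance(y, d, grid)
    weight_on_direct = s2 / (s2 + d[0])
    print(f"{label}: model error variance={s2:.4f}, weight on direct={weight_on_direct:.2f}")
```

In this sketch the inflated sampling variances alone account for more than the observed spread of the direct estimates, so the likelihood is maximized at zero and the direct estimates receive no weight.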
State Effects
The magnitude of the state raking factors that are used to
adjust the county estimates warrants further investigation. The Census Bureau
should estimate the variance in the state raking factors for 1993 and 1995 to
determine if their variability is consistent with sampling variation. If it is not,
then research should be conducted to find an explanation for the variation. One
part of this research could be to examine the effect of using 3 years rather than 1
year of CPS data in the state model, as is done in the county model.
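As a rough illustration of what a raking factor is, the sketch below takes each state's factor to be the state model estimate divided by the sum of that state's county model estimates, then scales the county estimates by it. This definition is an assumption consistent with controlling county estimates to state estimates; the state names, county names, and numbers are invented.

```python
import statistics


def raking_factors(state_estimates, county_estimates):
    """Ratio of each state's model estimate to the sum of its county estimates."""
    return {state: state_estimates[state] / sum(counties.values())
            for state, counties in county_estimates.items()}


def rake(county_estimates, factors):
    """Scale every county estimate by its state's raking factor."""
    return {state: {county: value * factors[state]
                    for county, value in counties.items()}
            for state, counties in county_estimates.items()}


# Hypothetical numbers of poor school-age children.
county_estimates = {
    "A": {"a1": 1000.0, "a2": 3000.0},
    "B": {"b1": 500.0, "b2": 1500.0},
}
state_estimates = {"A": 4400.0, "B": 1900.0}

factors = raking_factors(state_estimates, county_estimates)
raked = rake(county_estimates, factors)
print(factors)
print("variance of raking factors:", statistics.pvariance(factors.values()))
```

The spread of the factors across states is the quantity the text suggests comparing with what sampling variation alone would produce.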
More generally, work should be conducted to determine if there are idiosyn-
cratic state effects that should be captured in the county model. The Census
Bureau did some preliminary research on adding fixed state effects to alternative
formulations of the county model (see National Research Council, 1998:App. A).
While the addition of fixed state effects reduced some nonrandom residual pat-
terns in the regression output, a fixed state effects model did not perform better
than other models in comparison with the 1990 census estimates (see National
Research Council, 1998:App. C and D). Some preliminary work with a random
state effects model with two components of variance, one for state and one for
county within state (see Fuller and Goyeneche, 1998), suggested that state effects
may be present and that further research on a random state effects model should
be conducted.
Discrete Variable Models that Use Counties with No Sampled Poor School-Age Children
When using a logarithmic transformation of the number of poor
school-age children as the dependent variable in the county regression model, all
counties in the CPS sample for which none of the sampled households have poor
school-age children (262 of 1,247 counties for the 1995 model) have to be re-
moved from the regression analysis. The dropped counties are generally smaller
counties with small CPS sample sizes.
While the dropped counties would have little influence in any regression
equation due to their small size, the exclusion of 21 percent of the counties in the
CPS sample is a cause for concern. Moreover, the internal and external evalua-
tions of the county model suggest that although the current approach provides
reasonably good estimates for small counties for 1989, 1993, and 1995, they
could be improved. For example, there is a slight tendency in the county model
equation to overpredict poverty in small counties (see Chapter 2). It is important
to investigate the development of discrete variable regression models, such as
Poisson regression or other forms of generalized linear models, that permit the
inclusion of data for those counties that have no sampled families with children in
poverty.
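The difference between the two model types can be sketched with a toy example. A log transformation cannot use a county whose sampled count is zero, whereas a Poisson regression can. The fit below is a bare Newton-Raphson routine for one predictor; it ignores sampling weights, exposure offsets, and sampling variances, and the data and names are hypothetical.

```python
import math


def fit_poisson(x, y, iterations=50, tol=1e-10):
    """Newton-Raphson fit of log E[y] = b0 + b1 * x for count data y."""
    b0 = math.log(sum(y) / len(y))  # start from the overall mean count
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Entries of the 2 x 2 information matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Hypothetical county data: x is a predictor, y is the sampled count of poor children.
x = [0, 0, 1, 1, 2, 2, 3, 3]
y = [0, 1, 1, 2, 3, 4, 6, 9]

# Counties with a zero count must be dropped before taking logarithms.
usable_for_log_model = [(xi, yi) for xi, yi in zip(x, y) if yi > 0]
b0, b1 = fit_poisson(x, y)
print(len(usable_for_log_model), "of", len(y), "counties usable after a log transform")
print(f"Poisson fit uses all {len(y)} counties: b0={b0:.3f}, b1={b1:.3f}")
```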
There are two factors that complicate the development of discrete variable
models in this context: the lack of fully developed hierarchical models and
related shrinkage procedures and the lack of methods for optimal incorporation of
CPS sampling variances. However, Markov Chain Monte Carlo implementation
of hierarchical models can be used to address the first issue, and, with additional
research and development, can also probably address the second issue.
Ways to Reduce the Time Lag of the Estimates
The Title I fund allocations
for the 1999-2000 school year will be based on estimates of school-age
children in 1996 who were in poor families in 1995, and these estimates will also
be used for the 2000-2001 school year allocations. It is important to explore the
extent to which this time lag can be reduced for the county estimates, which will
correspondingly reduce the time lag for the school district estimates.1 The Census
Bureau began some exploratory work on this topic in June 1997 but had to put
it aside. Now that the county estimation procedure has been developed and put
on a production basis, it is important to resume this work.
One of the causes of the lag is the availability of food stamp data, which must
be obtained from individual states in some instances and which are not available
until almost 2 years after the year to which they refer. It might be possible to
overcome this problem, without seriously harming the performance of the county
model, by using food stamp data for the year prior to the estimation year. An-
other possibility is to control the estimates from the county model to the state
model estimates for the latest of the 3 years of CPS data used in the county model,
instead of to the middle year. These ideas and others need to be evaluated to
determine if the lag between the time period of the estimates and the year of
allocation of funds can be reduced.
Evaluation of Food Stamp and Other Input Data
Regular evaluation of
the continued suitability of food stamp and other data for input to the state and
county models is important for the Census Bureau's small-area estimation pro-
gram. Changes in welfare programs and the accompanying data systems (espe-
cially those resulting from the 1996 Personal Responsibility and Work Opportu-
nity Reconciliation Act) will almost certainly affect the comparability of food
stamp data over geographic areas. For example, legal immigrants, many of
whom are no longer eligible for benefits, are very unevenly distributed geo-
graphically. Comparability is an important assumption in both the county and
state regression models, and, therefore, the way in which food stamp data are
used as a predictor variable in the models may need to be modified. Changes in
the tax system could also affect the usefulness of IRS data for small-area poverty
estimation. More generally, it is important to continually evaluate the input data
to the state and county models to assess errors or inconsistencies in them and to
develop methods to account for those errors in the modeling process.
Large Category Differences and Residual Patterns for the State and County Models
The internal and external evaluations (see Chapter 2) demonstrated that
the state and county models are generally well behaved with respect to the esti-
mates for various categories of states and counties. However, it is important to
investigate further the residual patterns and category differences to determine if
the regression models could be improved either through a modification of the
model form or through the addition of predictor variables.2
1It would also be desirable to reduce the time lag in the school district boundary survey so that the
allocations are made to current school districts. However, that survey is conducted every 2 years,
and it may not be possible to carry it out more frequently or to complete it more quickly.
As an example of a pattern that is worth further investigation, when com-
pared with CPS aggregate estimates, the county model exhibited a tendency in
1989, 1993, and 1995 to underpredict the number of poor school-age children in
counties with large percentages of Hispanics. Also, from examination of the
standardized residuals, the state model exhibited a tendency to underpredict the
proportion of poor school-age children in some states in the West Region.
More generally, as a model is estimated for additional years, it is important
to look for consistent patterns of residuals and category differences to understand
their causes and to take corrective action when necessary. While it may be
necessary to tolerate overprediction or underprediction for a particular type of
area in any one year, a consistent pattern of overprediction or underprediction
needs to be addressed.
In the evaluation of residuals and category differences, particular attention
should be paid to states and counties that have experienced large demographic or
socioeconomic changes that may correlate with changes in numbers of poor
school-age children. For example, the federal tax return data that are used to
estimate internal migration for the demographic population estimates might be
used to classify states and counties into categories by migration rates and the
performance of the models compared for these categories. Also, the performance
of the models might be compared for categories of counties classified by overall
population change since the 1990 census. In turn, adding predictor variables to
the models from the decennial census and the demographic estimates program,
possibly including interaction terms, may prove a fruitful way to address persis-
tent patterns of overprediction or underprediction for these and other categories
of states and counties.
2The evaluations conducted to date of the county estimates include examination of the residual
patterns from the regression model, comparisons of the model estimates for 1989 with 1990 census
estimates, and comparisons of the model estimates for 1989, 1993, and 1995 with aggregate CPS
estimates. Another evaluation that could help determine what portion of the errors in the county
estimates is due to problems with the model, rather than measurement differences and sampling
variability, is to fit the model to 1990 census data (prior to shrinkage and raking to the state model)
and to compare the estimates to 1990 census values for aggregates of counties. This evaluation is
similar to the county model-CPS aggregate comparisons, but it has the advantage that the sampling
error in the census is much less than in the CPS. The county model estimates are not shrunk for this
evaluation because the resulting estimates would have considerable weight on the census direct
estimates and so be less informative about possible problems with the regression model.
School District Estimates
There cannot be marked improvements in the school district estimates with-
out a substantial effort to improve the data sources for districts and to develop
models to use them. Nonetheless, work should go forward to further evaluate the
current estimation method and to seek to effect modest improvements in it. Three
important areas for research are: investigation of methods to reduce the variance
of the 1990 census estimates of poor school-age children; use of school enroll-
ment data to improve estimates of the total number of school-age children; and
investigation of the possible use of National School Lunch Program data to
improve estimates of poor school-age children.
Reducing the Variance of the 1990 Census Estimates of Poor School-Age Children
Because so many school districts are so small in size, the 1990 census
estimates of poor school-age children, which derive from the long-form sample,
are subject to high sampling variability. In addition to affecting the quality of the
1995 estimates that were developed by the Census Bureau's synthetic method,
the sampling variability in the 1990 census estimates affects the 1980-1990 evaluations.
The evaluation measures reported in Chapter 3 overstate the degree of
error in the synthetic estimates because of this sampling variability. The Bureau
should conduct research to determine the extent of this overstatement for school
districts of different sizes and compute adjusted evaluation measures in which the
effect of this sampling variability is removed. A simple approach would be to use
the mean square error as an evaluation measure. This measure may then be
readily adjusted by subtracting out the sampling variance of the census estimates,
thereby producing a more valid measure of the quality of the synthetic estimates.
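The adjustment described above can be sketched in a few lines. It relies on the assumption that the sampling error in the census estimates is independent of the error in the synthetic estimates, so that the expected squared difference between the two is the mean square error of the synthetic estimates plus the census sampling variance. The function name and all numbers are hypothetical.

```python
def adjusted_mse(synthetic, census, census_sampling_vars):
    """Return (raw MSE, MSE adjusted for census sampling variance).

    raw      = mean squared difference between synthetic and census estimates
    adjusted = raw minus the average sampling variance of the census estimates
    """
    m = len(synthetic)
    raw = sum((s - c) ** 2 for s, c in zip(synthetic, census)) / m
    avg_sampling_var = sum(census_sampling_vars) / m
    return raw, raw - avg_sampling_var


# Hypothetical numbers of poor school-age children in three districts.
synthetic = [120.0, 45.0, 300.0]
census = [100.0, 60.0, 310.0]
census_sampling_vars = [150.0, 100.0, 60.0]

raw, adjusted = adjusted_mse(synthetic, census, census_sampling_vars)
print(f"raw MSE={raw:.1f}  adjusted MSE={adjusted:.1f}")
```

Computing the measure separately for size classes of districts would show where the overstatement is largest, since sampling variances are highest for the smallest districts.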
The 1990 census school district estimates of poor school-age children that
were used in the 1995 estimates and as the standard of comparison in the 1980-
1990 evaluations were developed by ratio adjustment. This procedure, which
applies the long-form-sample-based estimates of the school-age poverty rate to
the complete-count estimates of total school-age children, reduces the variance of
the 1990 census estimates to a modest extent. Other ways to further reduce the
variance should be investigated.
One approach is to incorporate other characteristics from the census short
form that are known to be related to poverty in estimating school district numbers
of poor school-age children from the 1990 census. For example, such character-
istics as race and ethnicity, home tenure (owner, renter), family type, and resi-
dence (e.g., central city) could be used for this purpose. A very simple form of
this type of estimation procedure would be a stratified ratio adjustment with strata
defined using short-form information.
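The ratio adjustment and its stratified variant can be sketched as follows. The simple version applies the long-form sample poverty rate to the complete-count number of school-age children; the stratified version does the same within strata defined from short-form information and sums the results. The strata (renters and owners) and all numbers are invented for illustration.

```python
def ratio_adjusted_poor(sample_poor, sample_children, full_count_children):
    """Apply the long-form sample poverty rate to the complete-count total."""
    return full_count_children * sample_poor / sample_children


def stratified_ratio_adjusted_poor(strata):
    """Sum of ratio adjustments made separately within short-form strata.

    strata: iterable of (sample_poor, sample_children, full_count_children),
    where the first two are long-form estimates for the stratum.
    Strata with no sampled children are skipped.
    """
    return sum(ratio_adjusted_poor(p, n, big_n)
               for p, n, big_n in strata if n > 0)


# Hypothetical district: long-form estimates of 60 poor among 400 children,
# against a complete count of 420 children.
simple = ratio_adjusted_poor(60.0, 400.0, 420.0)

# The same district split by home tenure (renters, owners).
stratified = stratified_ratio_adjusted_poor([
    (40.0, 150.0, 160.0),
    (20.0, 250.0, 260.0),
])
print(f"simple={simple:.2f}  stratified={stratified:.2f}")
```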
Another approach is to smooth the 1990 census school district estimates with
the 1990 census county estimates. By carefully constructing smoothed school-
district estimates as combinations of school-district and county-level estimates, it
might be possible to produce school-district estimates with lower mean square
errors than the direct 1990 census estimates. It would be desirable to make use of
knowledge about model error and sampling variances at the school-district level,
if available, to tailor the degree of smoothing for each school district. If successful,
smoothing procedures might substantially improve the estimation of census
school-age poverty rates in small school districts. They would add some bias
because county poverty rates differ from poverty rates for school districts con-
tained within them, but they could potentially substantially reduce variance,
thereby improving mean square error.
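The trade-off between added bias and reduced variance can be made concrete with a minimal composite estimator. The sketch assumes that true district poverty rates vary around the county rate with a known variance (between_var) and that the county rate itself is measured without error; both are simplifying assumptions, and the names and numbers are hypothetical.

```python
def smoothing_weight(district_samp_var, between_var):
    """Weight on the district's own estimate; the remainder goes to the county rate."""
    return between_var / (between_var + district_samp_var)


def smoothed_rate(district_rate, county_rate, weight):
    """Composite of the district and county poverty rates."""
    return weight * district_rate + (1.0 - weight) * county_rate


def expected_mse(weight, district_samp_var, between_var):
    """Variance term plus expected squared bias of the composite estimate."""
    return weight ** 2 * district_samp_var + (1.0 - weight) ** 2 * between_var


# Hypothetical small district with a noisy census poverty rate.
district_samp_var = 0.004  # sampling variance of the district rate
between_var = 0.001        # assumed spread of true district rates within the county

w = smoothing_weight(district_samp_var, between_var)
rate = smoothed_rate(district_rate=0.30, county_rate=0.15, weight=w)
mse = expected_mse(w, district_samp_var, between_var)
print(f"weight={w:.2f}  smoothed rate={rate:.3f}")
print(f"direct MSE={district_samp_var:.4f}  smoothed MSE={mse:.4f}")
```

A weight of zero corresponds to complete smoothing, in which every district in a county is assigned the county rate.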
The development of a smoothing approach should include a thorough evalu-
ation. As part of that evaluation, it would be useful to compare 1990 census
estimates of poor school-age children for school districts with three sets of esti-
mates that differ in the calculation of 1980 census within-county shares that are
applied to the 1989 county model estimates: unsmoothed 1980 census within-county
shares (as in synthetic method (1), see Chapter 3); smoothed 1980 census
within-county shares; and 1980 census within-county shares that use the 1980
census county school-age poverty rates for all school districts within each county.
The third method represents a complete smoothing of the school district poverty
rates within counties.
If one or both methods for reducing the variance of the 1990 census school
district estimates of poor school-age children (smoothing and using other charac-
teristics in the estimation) are successful, then the revised 1990 census estimates
should be employed with the synthetic shares approach if it is used again in the
future. The revised estimates should also be used as the standard of evaluation
for assessing the synthetic shares estimates of poor school-age children in 1989.
Use of School Enrollment Data to Improve Estimates of the Total Number of School-Age Children
The method for estimating total school-age children
is similar to that for estimating poor school-age children, namely, to apply the
1990 census school district shares within each county to updated county esti-
mates. The method is more robust for total school-age children (and total popu-
lation) than for poor school-age children because the numbers being estimated
are larger and because the 1990 census shares for total school-age children (and
total population) are based on complete-count data that are not subject to sam-
pling error. But the synthetic shares method still does not capture within-county
changes in school district populations that have occurred since the 1990 census.
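The shares calculation for a single county can be sketched as below. District names and counts are hypothetical. Because the shares are fixed at their 1990 census values, every district in the county grows or shrinks by the same percentage, which is the limitation noted above.

```python
def synthetic_shares_estimates(census_district_counts, updated_county_total):
    """Allocate an updated county total to districts using census shares."""
    census_county_total = sum(census_district_counts.values())
    return {district: updated_county_total * count / census_county_total
            for district, count in census_district_counts.items()}


# Hypothetical 1990 census counts of school-age children in one county.
census_1990 = {"D1": 6000, "D2": 3000, "D3": 1000}
updated = synthetic_shares_estimates(census_1990, updated_county_total=11000)

for district, value in updated.items():
    growth = value / census_1990[district] - 1.0
    print(f"{district}: {value:.0f} children ({growth:+.0%} since the census)")
```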
Public school enrollment data are collected annually by the National Center
for Education Statistics (NCES) for school districts. Research should be con-
ducted to determine if these data could be used to update the within-county
school district shares of total school-age children. Research could begin by
examining reported school enrollment in the 1980 and 1990 censuses for school
districts to determine if the within-county enrollment shares in 1990, or, alterna-
tively, the changes in enrollment from 1980 to 1990, produce estimates of total
school-age children that are more accurate for 1990 than the 1980 census-based
shares. (Work is under way along these lines at the Census Bureau.) Research
would also be needed to evaluate the quality of the NCES enrollment data and to
determine if such factors as changes in public versus private school enrollment
present a problem for estimation.
If it is determined that the use of enrollment data would improve school
district estimates of total school-age children, it will be necessary to modify the
estimation procedure for poor school-age children so that the estimates of both
groups (total and poor) are consistent. One way to achieve consistency would be
to apply 1990 census school-age poverty rates for districts to the updated esti-
mates of within-county shares of total school-age children that are developed
from enrollment data.
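One way this consistency step might look is sketched below. Enrollment shares allocate the updated county total of school-age children to districts, the 1990 census district poverty rates are applied to those totals, and the results are scaled to the county estimate of poor school-age children. The final control to the county estimate is an assumption carried over from the shares approach, and all names and numbers are hypothetical.

```python
def update_district_estimates(enrollment, census_rates, county_children, county_poor):
    """Return (total children, poor children) by district for one county."""
    total_enrollment = sum(enrollment.values())
    # Step 1: within-county enrollment shares allocate the county total.
    children = {d: county_children * e / total_enrollment
                for d, e in enrollment.items()}
    # Step 2: apply census district poverty rates to the updated totals.
    preliminary_poor = {d: census_rates[d] * children[d] for d in children}
    # Step 3: control the district numbers to the county estimate of poor children.
    control = county_poor / sum(preliminary_poor.values())
    poor = {d: control * p for d, p in preliminary_poor.items()}
    return children, poor


# Hypothetical inputs for a county with three districts.
enrollment = {"D1": 5000, "D2": 4000, "D3": 1000}
census_rates = {"D1": 0.10, "D2": 0.25, "D3": 0.20}

children, poor = update_district_estimates(
    enrollment, census_rates, county_children=11000.0, county_poor=2057.0)
for d in enrollment:
    print(f"{d}: children={children[d]:.0f}  poor={poor[d]:.0f}")
```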
Possible Use of School Lunch Data to Improve Estimates of Poor School-Age Children
There are many reasons that school lunch data are not necessarily
a good proxy for school-age poverty (see Chapter 3). Moreover, at present, there
is no complete, accurate source of school lunch data by school district that is
readily available to the Census Bureau. Nonetheless, participation in the Na-
tional School Lunch Program is an indicator of low income, and it seems worth-
while to pursue for other states the research that the panel undertook for New
York.
The Census Bureau may be able to work through its state data centers for
selected states to obtain school lunch data by district for 1989-1990 to evaluate
whether within-county school lunch participation shares in 1989-1990 produce
estimates of poor school-age children in 1989 that are more accurate than those
produced from the 1980 census-based shares. Another approach to evaluate is
whether a combination of school lunch data and census data would be preferable
to using either data source alone. The research should also look at the effects of
using school lunch data, solely or in combination with census data, to estimate
school-age poverty rates because of the role that rates play in concentration
grants. If the results of this research are promising, it would be necessary for the
NCES to improve the reporting of participation in the National School Lunch
Program that it collects in the Common Core of Data.
LONGER TERM PRIORITIES
State and County Models
In the longer term, research should proceed on multivariate approaches to
state and county estimation that take advantage of the multiple data sources that
are likely to become available in the next decade. These sources are the March
CPS, the 2000 decennial census, and the monthly ACS.
Multivariate State and County Models
Use of multiple data sources (from separate surveys or multiple years of the
same survey) in a system of equations can be advantageous for small-area mod-
eling. For the state model, the Census Bureau has initiated work on a multivariate
approach to incorporating the data from several years of the CPS, instead of just
one year, into the regression equation (see Otto and Bell, 1997).
For the county model, the Census Bureau developed, as an alternative to the
separate use of CPS and census county regression equations (with the census
equation being used only to estimate the model error variance for the CPS model),
a bivariate county regression model, in which the two dependent variables are the
CPS and the previous census estimates of poor school-age children. This formu-
lation has some very real advantages (see National Research Council, 1998:App.
C). First, the internal evaluation of the regression output for the bivariate models
for 1993 indicated that they are as good as or possibly better than their single-
equation analogues. In addition, tests of the constancy of the parameter that
distinguished between the single-equation and bivariate formulation clearly
showed the benefit of the bivariate approach. Unfortunately, lack of administra-
tive records data for 1979 prevented the Census Bureau from conducting an
external evaluation of the bivariate models in comparison with the 1990 census.
Therefore, given the novelty and relative lack of evaluation of these models, the
panel did not recommend using them for the production of 1993 or 1995 county
estimates of poor school-age children. However, research into this approach
should continue, including an external evaluation as soon as that is feasible using
the 2000 census data.
Similarly, integrating multiple years of the March Income Supplement of the
CPS into the county estimation procedure by means of a multivariate model, as
opposed to the current procedure of averaging the data for 3 years, may be
advantageous. A multivariate model, with estimates from more than one CPS
year and the census as dependent variables in a linear system of equations, might
provide an effective way of using more of the available information. In the future
this model could also incorporate data from the ACS, possibly by adding equa-
tions for the estimates from that survey.
More broadly, a wide variety of approaches that combine information over
time and over geographic areas should be considered, because such a combination
might prove effective at modeling poverty for small areas. Because poverty very
likely has commonalities over time and across areas that are similar in economic
conditions, efforts to exploit this structure could prove advantageous and should
be examined.
American Community Survey
The American Community Survey, when it is fully operational, will be an
important component of any approach to providing updated estimates of poor
school-age children for small areas.3 For states and counties, it is possible that
several months (or years) of data from the ACS might be used to provide direct
estimates of poor school-age children. Alternatively, ACS data could be used
indirectly as a dependent variable in a model-based approach for states and
counties, parallel to the manner in which CPS data are currently used.
However, given that each year of the CPS and the 2000 census will also
provide information on poverty,4 it will be important to find ways to use all three
sources of information together, for multiple time periods (for the CPS and ACS),
to produce the best state and county estimates. Furthermore, given that all three
data sources will have their own measurement biases5 and that they are available
for different time periods (the decennial census year, multiple years of the CPS
March Income Supplement, and many months of the ACS), it is unlikely that
simply pooling estimates from the three data sources can be justified. Some
adjustment or modeling procedure will be needed. Such a procedure will have to
take account of available information about the variances and biases of the esti-
mates from each data source.
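One simple way to picture such a procedure is a precision-weighted combination of bias-adjusted estimates. The sketch below is illustrative only: the class and function names, the additive bias terms, and all numeric values are hypothetical, and it assumes the sources are independent and their biases are known, which would not hold exactly in practice.

```python
from dataclasses import dataclass


@dataclass
class SourceEstimate:
    name: str
    estimate: float    # poverty rate estimated from this source
    variance: float    # sampling variance of that estimate
    bias: float = 0.0  # assumed additive measurement bias (hypothetical)


def combine(sources):
    """Inverse-variance weighted mean of bias-adjusted estimates.

    Returns (combined_estimate, combined_variance). The variance formula
    assumes independent sources and biases known without error.
    """
    if not sources:
        raise ValueError("at least one source is required")
    if any(s.variance <= 0 for s in sources):
        raise ValueError("variances must be positive")
    weights = [1.0 / s.variance for s in sources]
    total = sum(weights)
    combined = sum(
        w * (s.estimate - s.bias) for w, s in zip(weights, sources)
    ) / total
    return combined, 1.0 / total


# Hypothetical values for one county, chosen only to show the mechanics.
sources = [
    SourceEstimate("census", 0.20, 0.0001, bias=0.01),
    SourceEstimate("CPS", 0.18, 0.0004),
    SourceEstimate("ACS", 0.19, 0.0002),
]
combined, combined_variance = combine(sources)
```

The source with the smallest variance dominates the result, which is why an uncorrected bias in a low-variance source such as the census could matter more than sampling error in the others.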
Continued research and development on measurement error and time-series
models will be needed to develop effective multivariate models for small-area
poverty estimates that use multiple data sources for multiple time periods.6 A
specific research issue is to determine how best to use the 2000 census informa-
tion, which will have lower sampling variance but possibly substantial measure-
ment bias and which may be biased if the economic conditions during the census
reference period differ markedly from the period for which estimates are needed.
In order to learn as much as possible about the measurement differences
between the census, the March CPS, and the ACS, the Census Bureau should plan
now for an exact match of the 2000 census with both the March 2000 CPS and the
national ACS sample of about 70,000 households that will be in the field in that
year. These two matches would provide a wealth of information about the three
different income measurement systems. They would also provide key inputs to
the development of a CPS-census measurement error model, which could help
resolve some remaining issues about the state and county models. For example,
some of the category differences observed in the 1990 census comparisons for the
county model could be due to differences between the CPS and census measurements
of poverty. A CPS-census measurement error model could also provide
information from which to determine how to use data from the 2000 census with
the current CPS-based estimation procedure to minimize discontinuities in the
Title I fund allocations that may occur when the data from the 2000 census are
incorporated into the models.
3. The ACS, together with the 2000 census Master Address File, may also provide the means to
improve small-area estimates of total population and population by age.
4. If the ACS is implemented as planned, it is likely that the 2010 census and subsequent censuses
will not include a long form and, hence, will not provide income and poverty information.
5. The data collection methods for the census, CPS, and ACS differ in many respects, including the
length of the questionnaire, the primary data collection technique (face-to-face interviews or mail
questionnaires), the definitions of variables, the reference period for income measurement, and
editing and imputation methods. Any of these differences can lead to different measurement biases.
6. Measurement error models, by attempting to model effects over time and across states resulting
from changes in program administration, could also be used to adjust administrative data that are
used as predictor variables in estimation models for differences due to time or state effects.
School Districts
The planned implementation of the ACS and the availability of 2000 census
data hold out the prospect for markedly improved estimates of poor school-age
children for school districts, as well as for states and counties. However, the
availability of 2000 census and ACS data alone will not likely be sufficient to
provide estimates of acceptable quality for school districts that reflect within-
county as well as between-county changes in school-age poverty for districts.7 It
is likely that modeling will be necessary, and modeling, in turn, will require
sources of data to serve as predictor variables. With the Master Address File that
will be completed for the 2000 census, it should be possible to geocode most
federal tax return data to the school district level. In fact, if a high proportion of
tax return addresses can be geocoded in the near future, even before the census
itself is completed, that information could be used to improve the current syn-
thetic shares estimation method.8 It may also be possible to undertake a federal-
state cooperative effort to provide food stamp data that are geocoded to school
districts.
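The basic idea of a shares-based allocation can be sketched as follows: each district receives a portion of the updated county estimate in proportion to its share of the county's poor school-age children in a reference data source. The function name and the example counts are hypothetical, and the sketch leaves out details such as districts that cross county lines. Geocoded tax return or food stamp data, if available, could in principle supply more current share counts in place of census counts.

```python
def synthetic_shares(county_estimate, reference_district_poor):
    """Allocate an updated county estimate to districts by reference shares.

    county_estimate: updated estimate of poor school-age children in the county.
    reference_district_poor: dict mapping district id to its count of poor
        school-age children in the reference source (for example, a census).
    """
    total = sum(reference_district_poor.values())
    if total <= 0:
        raise ValueError("reference county total must be positive")
    return {
        district: county_estimate * count / total
        for district, count in reference_district_poor.items()
    }


# Hypothetical county with three districts.
reference_counts = {"A": 300, "B": 100, "C": 100}
district_estimates = synthetic_shares(600, reference_counts)
```

Because the shares are fixed at their reference-period values, this approach carries county-level change down to districts but cannot reflect shifts in poverty among districts within a county.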
A substantial research and development effort will be needed to improve
school district estimates of poor school-age children, and that work should begin
now. The panel will comment further on the long-term prospects for improvement
in its final report, due at the end of 1999.
7. For many school districts, data from the ACS will have to be pooled across several years to
produce direct estimates of adequate precision. Because the ACS will not be fully phased in until
2003, the first 5-year pooled estimates, for example, will not be available until 2008. Moreover, such
estimates will still be subject to high sampling variability for many districts, similar to the census.
8. However, to obtain complete geocoding of these data would likely require that all tax returns be
filed by the address of the residence of the tax filer rather than the address of the tax preparer.
DOCUMENTATION AND EVALUATION
The development of small-area estimates of income and poverty is a major
effort that includes data acquisition and review, database development,
geographic mapping and geocoding of data, methodological research, model
development and testing, and documentation and evaluation of procedures and
outputs. Since the production of small-area poverty estimates supports a range of
important public policies for federal, state, and local governments, including the
allocation of funds, it is essential that the Census Bureau have adequate staff
and other resources for all components of the estimation program, including
evaluation and documentation. It is the responsibility of any agency that pro-
duces model-based estimates to conduct a thorough assessment of them, includ-
ing internal and external evaluations of alternative model formulations.
An integral part of the evaluation effort is the preparation of detailed docu-
mentation of the modeling procedures and evaluation results. No small-area
estimates should be published without full documentation. Such documentation
is needed for analysts both inside and outside the Census Bureau to judge the
quality of the estimates and to identify areas for research and development to
improve the estimates in future years.