Recommendations for Producers and Users
Small-area income and poverty estimates are increasingly in demand for important public policy purposes, such as allocation of funds to states and localities. No estimates can satisfy all requirements perfectly or be without error, but for fund allocation and related program uses, it is critical that they meet the highest possible standards with regard to their development and use.
In this concluding chapter we recommend practices that we believe should be followed in the production of small-area estimates, documentation that users of estimates should expect from producers, studies that users should undertake of the effects of estimates on programs, and the need for policy makers to consider carefully the strengths and weaknesses of alternative sources of estimates in selecting which ones to use for fund allocation and other program purposes. Policy makers also need to consider the design of formula provisions as they interact with the properties of estimates.
Our recommendations apply specifically to the model-dependent estimates produced from the Census Bureau's Small Area Income and Poverty Estimates (SAIPE) Program. It seems likely to us that the regularly updated SAIPE estimates will become more widely used for fund allocation, not only as new programs for allocating funds to subnational areas are introduced, but also as existing programs are modified to use the SAIPE estimates in place of outdated census estimates. However, many of our recommendations apply equally to other sources of small-area estimates, including direct estimates produced from surveys or administrative records.
At present, for counties and smaller areas, model-based estimates of income and poverty are generally the only possible source of estimates that are more up-to-date than those from the decennial census. Even when it is possible to produce direct survey estimates by averaging over several years or months, as is sometimes done for states from the Current Population Survey and as is planned for states and smaller areas from the American Community Survey, model-based estimates should be considered by users and may be preferred.
PRODUCTION OF ESTIMATES
The production of model-based estimates (such as those provided by SAIPE) that use multiple data sources and sophisticated statistical techniques is a major effort that includes many operations. These operations include data acquisition and review, database development, geographic mapping and coding of data, methodological research, model development and testing, production of estimates (together with estimates of their error properties), and thorough evaluation and documentation of procedures and outputs. For the estimates to be of the highest quality possible for such important uses as fund allocation, it is essential that the producing agency have adequate staff and other resources for all components of the estimation program.
Below we identify practices that we believe are critically important to follow for each of the major components of a small-area estimation program. In addition, the producing agency should maintain regular contact with key users, so that the estimation program is producing those estimates that are most needed and appropriate within the constraints of available resources.
As a matter of routine practice, each time a new round of estimates is prepared, the producing agency should check the input data for errors (e.g., check to see that state food stamp reports look reasonable compared with the previous year's reports and do not have transcription or other errors). Such checking should also include the procedures used to geocode or otherwise assign data to areas for which estimates are to be produced.
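The year-over-year check described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any agency's actual procedure: the function name `flag_suspicious_changes`, the 25 percent threshold, and the state labels and counts are all hypothetical.

```python
def flag_suspicious_changes(previous, current, max_rel_change=0.25):
    """Flag areas whose reported count is missing in one year or changes
    by more than max_rel_change relative to the previous year.

    previous, current: dicts mapping an area name to a reported count.
    The threshold is an illustrative assumption, not an official rule.
    """
    flagged = {}
    for area in sorted(set(previous) | set(current)):
        prev = previous.get(area)
        curr = current.get(area)
        if prev is None or curr is None:
            flagged[area] = "missing in one year"
        elif prev == 0:
            # A relative change is undefined when the previous count is zero.
            if curr != 0:
                flagged[area] = "previous count was zero"
        else:
            rel = (curr - prev) / prev
            if abs(rel) > max_rel_change:
                flagged[area] = f"relative change {rel:+.0%}"
    return flagged


# Hypothetical food stamp recipient counts for two consecutive years.
previous = {"State A": 120_000, "State B": 45_000, "State C": 80_000}
current = {"State A": 123_500, "State B": 4_600, "State C": 81_000}

flags = flag_suspicious_changes(previous, current)
for area, reason in flags.items():
    print(area, "->", reason)
```

In this made-up example only State B is flagged: its count falls by roughly 90 percent, a pattern more consistent with a dropped digit than with a real change. A flag is a prompt for manual review, not evidence of an error.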
A producing agency should regularly review each data source to determine its continued suitability for use in estimation model(s). Such reviews should address the comparability of the data over time and across areas. If changes, such as program changes for administrative records, are determined to affect either temporal or spatial comparability, it will be necessary to carry out research and development to determine if alternative model formulations can still make use of the data or if the particular data source needs to be dropped.
A producing agency should regularly search for possible new data sources and consider pilot efforts as appropriate to establish the value of a new source. The search for new data sources is particularly important because some sources currently being used may change in ways that adversely affect their usefulness for estimation. A producing agency should also identify changes that might be made to existing survey and administrative records sources to enhance the usefulness of the data for modeling, while not adding undue burden for the source agency. Because it may not be easy to gain agreement to make changes to ongoing administrative records systems and surveys, the producer agency should seek the cooperation of users to understand and support the need for change.
Another dimension of data to pursue is timeliness. A producing agency should make efforts to reduce the lag in availability of key data sources so that the lag in releasing estimates can be reduced. Strategies for more timely estimates could include changes to modeling procedures, as well as working with data originators to reduce the time between collection and delivery of data to the producing agency.
Finally, every producing agency should regularly document its use of data sources in estimation models and, to the extent possible, make available assessments of the effects of each source on the production of estimates. It is particularly important to document the effects on estimates when there is a change in data sources (for example, if the existing Current Population Survey-based SAIPE models are turned into American Community Survey-based models).
Methodological Research: Model Development and Testing
It is important for a producing agency to have resources to carry out research on methods that may improve the estimates in terms of their variability, bias, and timeliness (see Chapter 3). Such research should include provision for early testing of promising ideas in models for which the estimates can be evaluated in comparison with estimates from existing production models. A new model can be crude for this purpose; the intent is to learn early on if improvements from a new model appear substantial enough to warrant work toward full-scale development. Methodological research and model testing should always be accompanied by documentation and archiving to maintain a record of ideas that were tried but did not work out, ideas that appear promising but need considerably more work, and ideas that appear to be prime candidates for development in the short term.
It is the responsibility of an agency that produces model-dependent estimates to conduct a thorough assessment of them. Every time that a set of production estimates is produced, evaluations should be carried out before the estimates are released. Such evaluations should include checking of input data and software program code to make sure that all specifications were correctly implemented. Such checking is especially important whenever there are changes in the data (which will likely happen each year for which estimates are produced) or the software (which may happen less frequently).
Regular evaluations should include internal evaluations of the model outputs each time that estimates are produced, for example, by examining patterns of residuals and other features of regression models (see Chapter 3). Over time, the internal evaluations should focus on identifying consistent biases that may appear for multiple estimation years, and research and development should be directed to understanding and reducing those biases to the extent possible. It is expected that random variation will produce anomalies in estimates in any given year; however, persistent patterns need to be investigated and addressed through such means as trying out alternative model specifications. One-time anomalies, which might be due to a problem with the input data rather than random variation, should also be investigated.
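One simple way to separate persistent patterns from single-year anomalies is to group regression residuals by a category of areas and check whether the mean residual keeps the same sign in every estimation year. The sketch below assumes this crude sign-consistency screen; it is not a formal statistical test and is not presented as the procedure used for any production model. The function name, the grouping by county size, and all residual values are hypothetical.

```python
from statistics import mean


def persistent_bias_groups(residuals_by_year, min_years=3):
    """Return groups whose mean residual has the same sign in every year.

    residuals_by_year: {year_label: {group_name: [residual, ...]}}
    A group is reported only if it appears in at least min_years years.
    The returned value is the average of the group's yearly mean residuals.
    """
    yearly_means = {}
    for groups in residuals_by_year.values():
        for group, values in groups.items():
            yearly_means.setdefault(group, []).append(mean(values))

    flagged = {}
    for group, means in yearly_means.items():
        same_sign = all(m > 0 for m in means) or all(m < 0 for m in means)
        if len(means) >= min_years and same_sign:
            flagged[group] = mean(means)
    return flagged


# Hypothetical residuals (model prediction minus comparison value).
residuals = {
    "year 1": {"large counties": [0.02, -0.01, 0.01],
               "small counties": [0.05, 0.03, 0.04]},
    "year 2": {"large counties": [-0.02, 0.01, -0.01],
               "small counties": [0.04, 0.02, 0.06]},
    "year 3": {"large counties": [0.01, 0.00, -0.02],
               "small counties": [0.03, 0.05, 0.01]},
}

flagged = persistent_bias_groups(residuals)
for group, avg in flagged.items():
    print(f"{group}: average residual {avg:+.3f} in every year examined")
```

With these made-up numbers, the residuals for large counties change sign from year to year, as random variation would suggest, while those for small counties are positive in all three years and would warrant a closer look at the model specification.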
Regular evaluations should include external evaluations to the extent possible, by comparing the production estimates with estimates from other sources (see Chapter 3). Whenever a production model is being revised in its specifications or sources of data, there should be the fullest external evaluation possible, including comparisons with alternative model formulations. In this instance there should also be an internal evaluation of alternative models.
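An external comparison of alternative models can be summarized with a single accuracy measure computed against a benchmark source. The sketch below assumes the mean absolute relative difference as that measure and a benchmark of independent estimates for the same areas; both choices, along with the function name and all figures, are illustrative assumptions rather than the evaluation actually performed for any model.

```python
def mean_abs_rel_diff(estimates, benchmark):
    """Mean absolute relative difference between model estimates and a
    benchmark, over areas present in both with a nonzero benchmark value."""
    areas = [a for a in benchmark if a in estimates and benchmark[a] != 0]
    if not areas:
        raise ValueError("no comparable areas")
    total = sum(abs(estimates[a] - benchmark[a]) / abs(benchmark[a])
                for a in areas)
    return total / len(areas)


# Hypothetical counts of poor children in three areas.
benchmark = {"Area A": 1000, "Area B": 2000, "Area C": 500}
production = {"Area A": 1100, "Area B": 1900, "Area C": 550}
candidate = {"Area A": 1050, "Area B": 2100, "Area C": 450}

production_error = mean_abs_rel_diff(production, benchmark)
candidate_error = mean_abs_rel_diff(candidate, benchmark)
print(f"production model: {production_error:.3f}")
print(f"candidate model:  {candidate_error:.3f}")
```

A single summary number hides where the differences occur, so a fuller evaluation would also break the comparison down by categories of areas, such as population size or region.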
Documentation of Procedures and Evaluations
An integral part of the evaluation effort outlined above is the preparation of detailed documentation, which should cover both the evaluation results and the modeling procedures in sufficient detail to permit replication of the estimates. No small-area estimates should be published without full documentation. Such documentation is needed for analysts both inside and outside the producing agency to judge the quality of the estimates and to identify areas for research and development to improve the estimates in future years.
The producing agency should make arrangements for researchers outside the agency to have access to the input data and models, taking care to address confidentiality concerns. Such access is important to permit independent replication and evaluation.
USE OF ESTIMATES
Users of small-area income and poverty estimates need to ensure that the estimates provided by the producing agency are used effectively and appropriately. Thus, an agency such as the U.S. Department of Education should have an active program to understand estimates and assess their effect on such uses as Title I allocations to school districts.
A user agency should convey its expectations that the producing agency will provide complete, understandable, and timely documentation of the methods for developing estimates and evaluation results to accompany each new release of estimates. The user agency should carefully review the documentation so that it fully understands the properties of the estimates.
A user agency should also regularly undertake studies of the effects of the estimates on the fund allocations (or other program uses) that depend on them. Studies of fund allocation effects will require maintaining a database of each year's allocations and having the capability to analyze allocation patterns in relation to program provisions and the type and quality of estimates, including the capability to simulate alternative provisions and estimates. Such studies should help inform policy makers about the operation of formulas and how changes in formulas or the estimates used could achieve the program's goals more effectively. Such studies could also help identify priority areas for improvements in estimates to provide to producer agencies.
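The capability to simulate alternative estimates can be illustrated with a deliberately simple formula. The sketch below assumes a fixed budget divided among areas in proportion to their estimated number of eligible persons; real allocation formulas typically add provisions such as thresholds and hold-harmless rules, and this is not the formula of any actual program. The function names, budget, and counts are hypothetical.

```python
def allocate(budget, eligible_counts):
    """Divide a fixed budget among areas in proportion to estimated counts."""
    total = sum(eligible_counts.values())
    if total <= 0:
        raise ValueError("total eligible count must be positive")
    return {area: budget * count / total
            for area, count in eligible_counts.items()}


def allocation_shift(budget, baseline_counts, alternative_counts):
    """Change in each area's allocation when the alternative estimates
    replace the baseline estimates (same areas assumed in both)."""
    base = allocate(budget, baseline_counts)
    alt = allocate(budget, alternative_counts)
    return {area: alt[area] - base[area] for area in base}


# Hypothetical estimates of eligible children from two sources.
baseline = {"District A": 500, "District B": 300, "District C": 200}
alternative = {"District A": 500, "District B": 250, "District C": 250}

shift = allocation_shift(1_000_000, baseline, alternative)
for area, change in shift.items():
    print(f"{area}: {change:+,.0f}")
```

Because the budget is fixed, the gains and losses sum to zero: an estimate that raises one area's share necessarily lowers the amounts received by others, which is one reason errors in estimates matter for every area in a program.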
For federal funding programs in which states suballocate federal amounts to localities, the responsible federal agency should not only study the effects of estimates on the initial funding amounts determined by the agency, but also review the methods and data used by states for suballocation. At a minimum, the responsible federal agency should regularly collect data on state suballocation amounts, procedures, and sources of estimates. In addition, to the extent possible, the agency should conduct evaluation studies of the effects of state procedures and data on the resulting allocations. (Studies could perhaps subsample states for this purpose.) Such studies may be helpful to the responsible federal agency in developing guidance for use of estimates by states.
Finally, a user agency may find it useful periodically to commission in-depth reviews of the estimates that are used for its programs and possible alternatives to them by individuals or groups not affiliated with the producer or user agency. Such reviews should be carried out not only when the program estimates are dependent on a model, but also when they are obtained directly from a survey or administrative records. A full-scale review should include the strengths and weaknesses of alternative sources of estimates in terms of program requirements for the income or poverty definition, level of geographic and population detail, timeliness, and accuracy (including both bias and variability across areas and over time).
DECIDING TO USE ESTIMATES FOR PROGRAMS
If producing agencies follow good practice in developing, evaluating, and documenting estimates, and user agencies are vigilant in seeking to understand estimates and assess their effects on fund allocations and other program uses, then policy makers will have information with which to periodically reassess the laws and regulations that cover use of estimates for program purposes. As we discuss in Chapter 6 for fund allocation formulas, it is critical that policy makers be aware of the unintended consequences that errors in estimates can have on allocations. It is also important that information about the effects of alternative formula provisions and the kinds and quality of estimates be considered in decisions about how to construct or modify formulas and which estimates to use in them. Because it may be difficult to take account of such information in the heat of debate on particular legislation, it is important for policy makers to commission periodic assessments or take other steps to identify key issues and develop detailed alternatives for consideration in the early stages of crafting new or modified program legislation. Such assessments can also contribute to regular reviews by policy makers of the provisions of existing allocation formulas that use small-area estimates.