4 Funding Formulas

Many funding formulas currently use census long-form data as input. However, this information is often several years out of date by the time it is used, which is a problem since the funding formulas serve programs that are intended to address current problems. The ACS data will therefore have an advantage over long-form data for this application, in that they will be produced on a more timely basis. As mentioned above, the ACS initially plans to use moving averages to provide estimates for very small areas. As input into funding formulas for these small areas, one could use the ACS equally weighted moving average estimate, an asymmetric moving average that gives more weight to the current time period (such weighted averages may not greatly increase the variance and will provide information that has less time bias), or the (direct) estimate based only on the current year's information.1 The choice involves a tradeoff between variance and bias, and their effects on the funds that areas would and should receive. How should this choice be evaluated?
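The three candidate inputs described above can be compared side by side in a minimal sketch. The income figures, the asymmetric weights (1, 2, 3), and the function names are illustrative assumptions, not values or methods from the ACS; the variance factor further assumes independent annual estimates with equal sampling variance.

```python
from typing import Sequence


def weighted_moving_average(estimates: Sequence[float],
                            weights: Sequence[float]) -> float:
    """Weighted average of annual direct estimates (oldest first)."""
    if len(estimates) != len(weights):
        raise ValueError("estimates and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * x for w, x in zip(weights, estimates)) / total


def variance_factor(weights: Sequence[float]) -> float:
    """Variance of the weighted average relative to a single-year estimate.

    Assumption for illustration: annual estimates are independent and
    share the same sampling variance.
    """
    total = sum(weights)
    return sum(w * w for w in weights) / (total * total)


# Hypothetical per capita income estimates for one small area, oldest first.
annual = [16000.0, 17000.0, 18500.0]

equal_weights = [1, 1, 1]    # equally weighted moving average
recent_weights = [1, 2, 3]   # hypothetical asymmetric weights favoring the current year

direct = annual[-1]          # direct estimate: current year only
equal_ma = weighted_moving_average(annual, equal_weights)
asymmetric_ma = weighted_moving_average(annual, recent_weights)

print(f"direct estimate:        {direct:10.2f}  (variance factor 1.00)")
print(f"equal-weight average:   {equal_ma:10.2f}  "
      f"(variance factor {variance_factor(equal_weights):.2f})")
print(f"asymmetric average:     {asymmetric_ma:10.2f}  "
      f"(variance factor {variance_factor(recent_weights):.2f})")
```

With these made-up rising incomes, the equally weighted average lags the current year the most and the asymmetric average falls in between, while under the stated independence assumption its variance factor (about 0.39) is only slightly larger than that of the equally weighted average (about 0.33).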
It is easy to see that the use of moving averages could have an important effect on allocations. Consider a fund allocation formula that has an eligibility threshold: for example, an area receives benefits only when estimated per capita income falls below $17,000 per year. Also consider an area that typically
1 Use of a more elaborate time-series model approach, as discussed in Chapter 2, could reduce the problem discussed here. However, that would be relatively complicated to apply for all ACS responses. Furthermore, the general issues discussed in this chapter would still be relevant.
has an (estimated) per capita income above the threshold (and hence is often not eligible) but whose estimate varies widely from year to year (due to either real change or the variance in the estimate). Using moving averages to produce this area's estimated per capita income would reduce the number of highs and lows, which would, in turn, reduce the number of years in which estimated per capita income falls below the threshold. Alternatively, smoothing could also result in more money going to an area. For example, smoothed estimates for an area with an average value below the threshold but with substantial variability from year to year will keep that area eligible for this hypothesized program for more years.

Fund allocation formulas often have hold-harmless provisions, another common feature that can cause an area to receive substantially different allocations depending on whether its estimate has a larger or smaller variance. Hold-harmless provisions guarantee an area a high percentage (often 80% or 90%) of the funds it received in the previous time period, regardless of its inputs for the current time period. These provisions are meant to protect areas against large decreases in funding from one year to the next, so that they can undertake multiyear fiscal obligations. Assume now that an estimate for an area, to be input into a formula with a hold-harmless provision, has a large variance. Assume also that larger estimates are associated with larger allotments. Because the estimate for the area has a large variance, at some point the estimate will be relatively high due to random variation, and the area will receive a much larger allocation than it would have without the random variation. Then, as a result of the hold-harmless provision, that area could continue to receive “undeserved” higher benefits for several years.
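Both mechanisms can be traced with a small deterministic sketch. Only the $17,000 threshold and the 80% guarantee come from the text; the income series, the formula amounts, and the function names are hypothetical, chosen so that the area's average income sits above the threshold and one year's formula amount spikes.

```python
from typing import List, Optional, Sequence

THRESHOLD = 17000.0  # eligibility threshold used in the example in the text


def trailing_average(values: Sequence[float], window: int) -> List[float]:
    """Equally weighted trailing averages, one per year that has a full window."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def eligible_years(estimates: Sequence[float],
                   threshold: float = THRESHOLD) -> int:
    """Number of years in which the estimate falls below the threshold."""
    return sum(1 for x in estimates if x < threshold)


def apply_hold_harmless(formula_amounts: Sequence[float],
                        rate: float = 0.8) -> List[float]:
    """Each year's allocation is at least `rate` times last year's allocation."""
    allocations: List[float] = []
    previous: Optional[float] = None
    for amount in formula_amounts:
        floor = 0.0 if previous is None else rate * previous
        allocation = max(amount, floor)
        allocations.append(allocation)
        previous = allocation
    return allocations


# Hypothetical annual estimates: mean above the threshold, wide swings.
income = [18500.0, 16500.0, 19000.0, 16800.0,
          18200.0, 16900.0, 19000.0, 16500.0]
smoothed = trailing_average(income, window=3)
direct_same_years = income[2:]  # the years for which a 3-year average exists

print("eligible years, direct estimates:", eligible_years(direct_same_years))
print("eligible years, 3-year averages: ", eligible_years(smoothed))

# Hypothetical formula amounts: the second year is high purely by chance.
formula_amounts = [100.0, 160.0, 100.0, 100.0]
allocations = apply_hold_harmless(formula_amounts, rate=0.8)
print("allocations with 80% hold-harmless:", allocations)
```

In this made-up series the direct estimates dip below the threshold in three of the six comparable years, while the 3-year averages never do. In the second example, the one-year spike keeps the allocation above the formula amount of 100 in both following years (roughly 128 and then 102.4).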
The overall question to address is how to evaluate the performance of direct versus moving average estimates as inputs into fund allocation formulas. A first step would be to understand how variance affects allocations.

RESEARCH DIRECTIONS

Charles Alexander described the Census Bureau's current plans. For areas of more than 65,000 population, the input into fund allocation formulas from the ACS will be the direct estimate. For areas with population between 30,000 and 65,000, the input will be the average of the most recent and previous years' data. There are further thresholds for the use of moving averages of 3 and 4 years. For the smallest areas of less than 15,000 population, the input will be the average of the direct estimates for the most recent 5 years. These cutoffs are designed for typical uses of census data (point-in-time observations) and are based on a roughly equal coefficient of variation criterion for sampling error. However, it is important to distinguish between uses of estimates that require an assessment of the situation at a given time and uses of
the estimates for forecasting, since forecasting would favor shorter moving averages.

Alan Zaslavsky's presentation, based on joint work with Allen Schirm, focused on the relationship among data, estimation methods, and funding formulas. His goal was to show that funding formulas with such features as hold-harmless provisions and eligibility thresholds, when combined with estimates (especially nonlinear estimates) that have substantial bias and variance and with dynamics in the quantity of interest, often have unintended consequences. Funding formulas often need three inputs: the number of people who are categorically eligible, the rate of incidence, and the total population of interest. These quantities are estimated in a variety of ways, sometimes directly and sometimes indirectly. The indirect estimates can include averaging over time, but more involved methods include small-area estimation models that use regression and empirical or hierarchical Bayes methods to combine data over time and space. Much of the following is illustrated by two main examples: the allocations to states under the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) and Title I allocations to counties and school districts (discussed in the Appendix).

Typical data sources for inputs into fund allocation formulas are the most recent decennial census (generally the long form, which has nonnegligible sampling error for small areas); household surveys, which have smaller sample sizes but good measurement properties; and administrative records, for which the content is often not what one desires, the definitions are often inconsistent, and access and use can be complicated. The ACS as a data source has many advantages (and some disadvantages) over these existing data sources. It is more current than, say, the last census (use of the census for input to an allocation formula implies a great deal of stability in what is being measured).
The ACS has a larger sample size than all current household surveys, and its content is better tailored to meet current data needs than that of administrative records. Of course, it is expected that the output from household surveys will have considerably less measurement error than the ACS. The use of data from the ACS will likely have the following implications: more frequent recalculation of formula inputs due to the timely availability of current data, the availability of useful direct estimates for more variables and smaller areas, and improvement of the quality of indirect estimates due to currency and uniformity.2 The direct estimates from the ACS will have reduced variance relative to current surveys, but an increase in variance relative to the census long form. Most important, there will be a reduction in bias due to the increased timeliness of the information. Also, changing from a census-based or CPS-based measurement process to an ACS-based process will very likely introduce considerable measurement error.

2 It is possible that the production of ACS estimates could generate interest in reviewing the allocation formulas themselves.

It is important to understand that the estimation procedures and the formula are not separable. For example, if one computes this year's inputs through averaging the data for the previous 3 years, that is equivalent to simply requiring as the input the average of the previous 3 years. In one case, it is a choice for an estimate, and in the other it is what the formula requests.3 (Of course, the best estimate of the moving average of the previous 3 years' true means, which a fund allocation program might ask for as an input, might not be equal to the simple 3-year moving average of the individual annual estimates.)

The presentation focused on the impact of the variance of estimates used as input for fund allocation formulas, with the understanding that bias also has an extremely important and possibly more widely appreciated impact. If an estimation procedure is linear and the allocation formula is linear, variance in the inputs will not affect the long-range allocations in expectation and is therefore unlikely to matter very much. However, in the nonlinear case, the variance of an estimate for an area can have a large effect on how much money the area receives. This poses a problem, since an area has no control over how much variance its estimates have.

A simulation study was described in which four factors were varied: (1) the sampling standard error; (2) the estimation method (single year, 3-year moving average with equal weights ending with the current year, or 3-year exponentially weighted moving average also ending with the current year); (3) the formula (use of hold-harmless or threshold provision, or both); and (4) population trends (constant, trending upward, trending downward). The assumption was also made that the general allocation formula, possibly aside from these two provisions, is proportional.
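The contrast between the linear and nonlinear cases can be written out in a short worked derivation. The notation is an assumption introduced here for illustration and does not come from the presentation: x is the true value for an area, \hat{x} its unbiased estimate with sampling standard deviation \sigma, c a proportionality constant, and A a fixed amount paid when the estimate falls below a threshold T. The closed form for the threshold case additionally assumes normally distributed sampling error.

```latex
% Assumed setup: \hat{x} = x + \varepsilon, with E[\varepsilon] = 0 and Var(\varepsilon) = \sigma^2.

% Linear (proportional) formula: the expected allocation does not depend on \sigma.
\[
a_{\mathrm{lin}}(\hat{x}) = c\,\hat{x}
\quad\Longrightarrow\quad
\mathbb{E}\bigl[a_{\mathrm{lin}}(\hat{x})\bigr] = c\,x
\quad\text{for every } \sigma .
\]

% Threshold formula (nonlinear), assuming \varepsilon \sim N(0, \sigma^2):
\[
a_{\mathrm{thr}}(\hat{x}) = A\,\mathbf{1}\{\hat{x} < T\}
\quad\Longrightarrow\quad
\mathbb{E}\bigl[a_{\mathrm{thr}}(\hat{x})\bigr]
  = A\,\Pr(\hat{x} < T)
  = A\,\Phi\!\left(\frac{T - x}{\sigma}\right).
\]
```

Under these assumptions, an area whose true value lies above the threshold (x > T) would receive nothing with a zero-variance estimate, yet its expected allocation rises toward A/2 as \sigma grows; an area with x < T sees its expected allocation fall from A toward A/2. The expected allocation therefore depends on \sigma, a quantity the area does not control.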
The simulations were over a 4-year period and were repeated (with a few exceptions) 10,000 times. The simulations showed that with an eligibility threshold and no trend, sampling variability serves to smooth out the effects of the threshold, which may not be a bad thing from a public policy point of view. However, the degree of smoothing depends on the sampling variance, which is related to the size of the area for many household surveys, with the result that an area's allocation depends on the sampling variance of its inputs, which is not reasonable.

3 There is a distinction between providing different estimates that measure conceptually distinct items and providing different estimates for the same conceptual item as a result of using different loss functions. The former is easily explained, while it is difficult to explain the latter. It would be useful to better communicate to users the costs of using the wrong loss function. More generally, point estimates are not the best way to communicate information about the estimates of a quantity. It might be more useful to have some idea of the distribution of estimates for that quantity, which would address more questions relevant to fund allocation.

With an 80 percent hold-harmless provision and no trend, the larger the sampling variance, the larger the allocation bias,4 given the asymmetrical nature of the effect of the hold-harmless provision. If one uses moving averages, the allocation bias is reduced. Specifically, in this study, a bias of 20 percent is reduced to one of 3 percent using a 3-year moving average. If rapid responsiveness to fluctuations is not that important, this smoothing helps to reduce the allocation bias that arises from having a hold-harmless provision in the allocation formula. Combining a hold-harmless provision with a threshold is even more worrisome, since the threshold results in larger jumps in allocation from year to year.

With a downward trend, 3-year accumulations cause a bias, since one is using data for years that do not reflect the latest changes. With a hold-harmless provision, coupled with a downward trend, the effect of the provision is more extreme, since the smoothing of the trend and the hold-harmless provision are both (occasionally) keeping the allocations too high. With an upward trend, the effects from the provision are less worrisome.

The remaining results from the simulation were relatively intuitive, once the patterns were examined, with three main points: (1) the data sources, the estimation procedures, and the allocation formulas must be considered as an integrated whole; (2) when a change is made to one piece, the others must be kept in mind, going back to consider the original intentions of the program, if necessary; and (3) linear procedures have more predictable consequences, and methods that produce smoother estimates also have more predictable consequences. New data sources such as the ACS require reevaluation of funding formulas in light of the original intentions of the program, not simply replication of previously used procedures.

The floor discussion raised a number of additional issues.
The importance of the level of geography at which estimates were needed in these programs was stressed. In response to the comment that use of the ACS might result in an increase in the variance of estimates in comparison with use of the census long form, the value of using the ACS in combination with, say, the CPS, to reduce sampling variance (and also measurement bias) was stressed. This approach, which would provide high-quality, timely small-area estimates, might provide an incentive to consider distributing funds using federally controlled allocations to smaller areas. Also, it was noted that once the hold-harmless allocations are computed, the total amount allocated would be affected, necessitating a recalculation of shares.

4 With respect to these simulations, allocation bias is defined as the average difference, over replications, between the allocation received using estimates with some assumed variance and the allocation received using estimates with zero variance.
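The definition of allocation bias in footnote 4 can be made concrete with a small Monte Carlo sketch. This is not the simulation presented at the workshop: the true value of 100, the standard errors of 10 and 30, the normal sampling error, the 2,000 replications, and the function name are all assumptions chosen for illustration. The sketch covers only the no-trend case with an 80 percent hold-harmless provision and a proportional formula, and it ignores the rescaling of shares to a fixed total budget.

```python
import random


def simulate_allocation_bias(se: float, window: int, hold_rate: float = 0.8,
                             years: int = 4, reps: int = 2000,
                             true_value: float = 100.0, seed: int = 0) -> float:
    """Average relative difference between allocations made from noisy
    estimates and allocations made from zero-variance estimates.

    Assumptions (illustrative only): no trend, normal sampling error,
    a proportional formula with constant 1, and a trailing equal-weight
    moving average of `window` years ending with the current year.
    """
    rng = random.Random(seed)
    total_diff = 0.0
    for _ in range(reps):
        # Extra leading draws so the first allocation year has a full window.
        draws = [rng.gauss(true_value, se) for _ in range(years + window - 1)]
        previous = None
        for t in range(years):
            estimate = sum(draws[t:t + window]) / window
            formula_amount = max(estimate, 0.0)  # allocations cannot be negative
            if previous is None:
                allocation = formula_amount
            else:
                allocation = max(formula_amount, hold_rate * previous)
            previous = allocation
            # With zero variance and no trend, the allocation is true_value
            # every year, so the hold-harmless floor never binds.
            total_diff += allocation - true_value
    return total_diff / (reps * years) / true_value


bias_direct_low = simulate_allocation_bias(se=10.0, window=1)
bias_direct_high = simulate_allocation_bias(se=30.0, window=1)
bias_ma_high = simulate_allocation_bias(se=30.0, window=3)

print(f"direct estimate, SE 10:        {bias_direct_low:+.3f}")
print(f"direct estimate, SE 30:        {bias_direct_high:+.3f}")
print(f"3-year moving average, SE 30:  {bias_ma_high:+.3f}")
```

Under these assumptions the sketch should show the same qualitative pattern reported for the simulation study: the bias grows with the sampling standard error and shrinks when a 3-year moving average replaces the single-year estimate. The magnitudes depend entirely on the assumed parameters and should not be compared with the 20 percent and 3 percent figures quoted above.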
The problem of thresholds and hold-harmless clauses in fund allocation formulas was also addressed 25 years ago by the Federal Committee on Statistical Methodology (Office of Federal Statistical Policy and Standards, 1978) in its first report, although the number of fund allocation programs was much smaller then. Considering the expansion in programs with allocation formulas, it is a good time to reconsider these issues. The formulas could be reviewed, along with the estimation methods, the data sources, and the statistical problems that might be involved. The three different elements in the allocation process (the formula, the data source, and the estimation procedure) are in the domains of many different individuals and groups, and they rarely interact.

TerriAnn Lowenthal commented on fund allocation formulas and the ACS from a congressional perspective. The concern is whether the designers of these formulas realize the implications of their use, especially the unexpected consequences of hold-harmless provisions and eligibility thresholds. She described how formulas come about in the legislative process. She suggested that this discussion needs to be communicated to the people who are developing these formulas. There needs to be more interchange so that the developers understand and avoid these unintended consequences. She added that the project on small-area estimates of poverty is an illustration of how to bring together the interested parties on the intent of legislation. One needs to keep in mind that it is important to find the right language to communicate with members of Congress.

FINAL POINTS

The effects of the variance of estimates used as inputs to fund allocation formulas with features such as hold-harmless provisions and eligibility thresholds are complicated and can include unintended consequences. This needs to be more widely understood.

The impact of the bias of estimates used as inputs to fund allocation formulas also needs to be examined, especially how one might trade off bias and variance in comparing competing estimators for purposes of equitable fund allocation.