Appendix C: Model-Based Estimates for School Districts and School Attendance Areas
For all school districts in the Census Bureau’s Topologically Integrated Geographic Encoding and Referencing (TIGER) database, the Small Area Income and Poverty Estimates Program (SAIPE), operated by the Census Bureau, has been releasing annual estimates of the number of related^{1} children aged 5-17 living in families with income below the poverty level since 1999.^{2} Title I of the No Child Left Behind Act of 2001 directed the U.S. Department of Education to allocate $14 billion to school districts based on SAIPE results.^{3}
The SAIPE model estimates are produced for a given year with about a 1-year time lag. For example, 2008 estimates were released in December 2009; they incorporated administrative records information for 2007. This schedule is only a few months later than the release of direct American Community Survey (ACS) estimates. The SAIPE model-based estimates have the advantage of reducing mean squared error compared with direct estimates for small geographic areas; however, their accuracy depends on the validity of the underlying model and may vary for different kinds of areas. SAIPE estimates are not available for census tracts or block groups, and they pertain to the official statistical poverty level and not the 130 percent and 185 percent ratios of income to the poverty level that determine
____________
^{1} Related children are people who are aged 5-17 and related by birth, marriage, or adoption to the householder of the housing unit in which they reside; foster children, other unrelated individuals, and residents of group quarters are not considered related children.
^{2} Estimates were also released in 1995 and 1997.
^{3} The development of SAIPE is described in National Research Council (2000a,b).
eligibility for free and reduced-price school meals, respectively. The panel partnered with the Census Bureau to develop model-based estimates for the percentages of public school children who are eligible for free and reduced-price meals. The model was developed in a short period of time and with limited resources, and should be viewed as a proof of concept. The work on developing and evaluating the model led to the identification of research topics that could be used to improve the model in the future should resources become available.
HIGHLIGHTS OF THE SAIPE ESTIMATION PROCESS^{4}
The SAIPE estimation process entails several steps. First, state-level poverty estimates are developed for ages 0-4, 5-17, 18-64, and 65 and older. There are two equations for ages 5-17: one for all children and one for related children. These estimates are based on a weighted average of a direct ACS estimate and a prediction from a regression model. The dependent variable in the model is the ACS 1-year direct estimate.^{5} Independent variables include the poverty rate from the 2000 census, the tax return poverty rate, the tax return nonfiler rate, a Supplemental Nutrition Assistance Program (SNAP, formerly the Food Stamp Program) participation ratio, and the Supplemental Security Income (SSI) receipt rate. The regression-based and ACS-based estimates are combined, each being weighted according to the associated uncertainty, with the more uncertain estimate having less weight. The poverty ratios obtained are multiplied by population estimates to provide counts of the number of people in poverty, which are controlled to sum to the official national total from the ACS.
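As a rough illustration of the two ideas in this step, the sketch below combines a direct and a regression estimate so that the more uncertain one gets less weight, and then scales counts to a control total. The inverse-variance form of the weights, the function names `combine_estimates` and `control_to_total`, and the numbers are assumptions made for illustration; the documentation summarized here does not give the exact weighting formula.

```python
def combine_estimates(direct, direct_var, model, model_var):
    """Weighted average in which the more uncertain estimate gets less weight."""
    # Assumed inverse-variance weighting: the weight on the direct estimate
    # grows as the model estimate becomes noisier, and vice versa.
    w_direct = model_var / (direct_var + model_var)
    return w_direct * direct + (1.0 - w_direct) * model


def control_to_total(counts, total):
    """Scale counts proportionally so that they sum to a fixed control total."""
    factor = total / sum(counts)
    return [c * factor for c in counts]


# Hypothetical state poverty ratios: direct 0.20 (variance 0.0009),
# regression prediction 0.16 (variance 0.0001).
ratio = combine_estimates(0.20, 0.0009, 0.16, 0.0001)

# Hypothetical state counts controlled to a national total of 80.
controlled = control_to_total([10.0, 30.0], 80.0)
```

Because the direct estimate is nine times as variable as the prediction in this made-up case, the combined ratio sits much closer to the regression value.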
Second, county-level estimates are developed. Like the state estimates, the county estimates are based on a weighted average of direct ACS estimates and regression predictions. The dependent variable in each regression model is the log of the number of people in a particular age category in that county as measured by the ACS. Predictor variables (appropriately transformed) include the number of child exemptions claimed on tax returns of people in poverty, the number of child exemptions on tax returns, the number of SNAP benefit recipients, the resident population, and the estimated number of people in the age category in poverty according to the 2000 census. Weighting of the ACS and model estimates is based on the uncertainty associated with each estimate. For counties for which there are no ACS sample observations in the age category, the weight on the model's prediction is 1. County estimates are adjusted so they sum to the state total from the previous step.
^{4} This section comes from documentation on the Census Bureau's website, with some minor editing. See http://www.census.gov/did/www/saipe/methods/schools/data/20062008.html.
^{5} ACS direct estimates are estimates produced for a population group, time frame, and geographic area based only on ACS data and the ACS methods documented by the Census Bureau.
State- and county-level estimates are provided along with estimates of their uncertainty, measured as a margin of error. The margin of error is the half-width of a 90 percent confidence interval for an estimate and is equal to 1.645 times the standard error. The standard errors represent "uncertainty" arising from two major sources: ACS sampling variation and lack of fit of the regression model to what the ACS measures. In general, the former error is larger than the latter.
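The relationship between the margin of error and the standard error stated above can be written out directly. This is a minimal sketch; the function names and the example values are hypothetical.

```python
Z_90 = 1.645  # multiplier for a 90 percent confidence interval, as stated in the text


def margin_of_error(standard_error):
    """Half-width of the 90 percent confidence interval."""
    return Z_90 * standard_error


def confidence_interval(estimate, standard_error):
    """Lower and upper bounds of the 90 percent confidence interval."""
    moe = margin_of_error(standard_error)
    return estimate - moe, estimate + moe


def standard_error_from_moe(moe):
    """Recover the standard error from a published margin of error."""
    return moe / Z_90


# Hypothetical estimate of 10.0 with a standard error of 2.0.
low, high = confidence_interval(10.0, 2.0)
```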
Finally, school district-level estimates are developed using a "shares methodology," a way of creating estimates for subjurisdictions from estimates for the jurisdiction. Counties are divided into school districts, parts of school districts (for districts that cross county lines), and possibly residual pieces not in any school district. The division may be done separately by grade and type of school. For the 2008 SAIPE estimates, the child poverty shares for each subcounty portion of a school district were constructed by combining the shares from two data sources: 2000 decennial census direct estimate poverty shares and child tax poverty shares. Not all tax returns can be exactly located at the subcounty level, so in areas with less reliable subcounty tax data, the SAIPE estimate relies more heavily on the decennial census share. The precise method used for combining these two shares is termed the minimum change method (Maples and Bell, 2007). For each school district and school district piece, estimates are derived for the total population, children aged 5-17, and related children aged 5-17 in families in poverty. Margins of error are not currently provided for school district-level estimates, although the Census Bureau continues to conduct research on the estimation of standard errors for school district-level estimates.
The 2008 school district estimates are based on the 2008 county estimates and tabulations of poverty from the 2000 census and income tax data for tax year 2007 from the Internal Revenue Service (IRS), using school district boundaries corresponding to school year 2007-2008. By construction, the SAIPE school district estimates are arithmetically consistent with the SAIPE county and state estimates.
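The arithmetic consistency noted above follows from how a shares approach allocates a county estimate. The sketch below takes the shares as given and does not attempt the minimum change method that produces them; the function name, piece labels, and numbers are hypothetical.

```python
def allocate_by_shares(county_estimate, shares):
    """Split a county estimate across district pieces in proportion to their shares."""
    total_share = sum(shares.values())
    if total_share <= 0:
        raise ValueError("shares must have a positive sum")
    # Normalizing by the total share guarantees the pieces add back to the county figure.
    return {piece: county_estimate * s / total_share for piece, s in shares.items()}


# Hypothetical county of 1,200 poor children with two district pieces and a residual area.
pieces = allocate_by_shares(
    1200.0, {"district_A": 0.55, "district_B": 0.35, "residual": 0.10}
)
```

Summing the values of `pieces` returns the county estimate, which is the sense in which district estimates built this way agree with the county totals.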
MODEL SPECIFICATION FOR SCHOOL MEALS PROGRAMS^{6}
Census Bureau staff noted the following challenges in adapting the SAIPE-like model to produce estimates of eligibility for the school meals programs.
· To follow the log-level structure of the SAIPE model would require an estimate of the universe. In the case of the school meals programs, the universe contains children aged 0-19 attending public school. The only source for public school attendance consistent with Census Bureau population and survey estimates is the ACS. This source would inject additional sampling error into the estimates and suggests the possible desirability of modeling public school enrollment.
· County-level modeling of the log of survey-weighted counts causes removal of counties with zero estimates. In the 2009 ACS, about 4 percent of 3,143 counties had zero estimates of eligibility for free meals, and 21 percent had zero estimates for reduced-price meals. This demonstrates two points: (1) deleting these observations to take logs appeared to be more severe than including them in a continuous distribution rate model, and (2) work done by Elizabeth Huang and Jerry Maples of the Census Bureau indicates potential serious bias for successive difference variance estimates of log quantities with small sample sizes.
· SAIPE is designed for Title I allocations, which are made under a "fixed-pie" funding program; that is, the total funding for Title I is fixed so that an increase in the amount allocated to one jurisdiction entails a decrease in the amounts allocated to one or more other jurisdictions. Therefore, national consistency among the level estimates and top-down controls are important. The school meals programs are fully funded, and the target estimates are eligibility rates.
· To produce accurate (unbiased) estimates of the parameters, the Census Bureau decided to estimate the parameters at the county level, where samples with zero eligible children are less prevalent. However, since a lagged ACS survey variable was also included, the assumption of constant parameters across all sizes of districts may be untenable.
· To allow for variable parameters, separate parameter estimates were produced for each of three partitions (0-20,000 residential population, 20,000-65,000, and 65,000+). All parameters (regression coefficients and model error variance) may differ from one partition to another.^{7}
^{6} This is an edited version of documentation provided to the panel on May 12, 2011, by the U.S. Census Bureau.
For the school district-level model, the Census Bureau chose a Fay-Herriot structure similar to SAIPE production, but on an unlogged rate scale^{8} rather than log-levels. Parameters were estimated independently for both the free and reduced-price eligibility rates at the county level and then applied to school district-level auxiliary data. No raking to higher levels was performed.
County-Level Model
The empirical Bayes model of eligibility rates reflects the general shrinkage form suggested by Fay and Herriot (1979). The model is

$$y_i = Y_i + e_i, \quad \text{where } e_i \sim \text{ind. } N(0, v_i) \tag{1}$$

and

$$Y_i = x_i' \beta_k + u_i, \quad \text{where } u_i \sim \text{i.i.d. } N(0, \sigma_k^2) \tag{2}$$

where for a given year and county i,
· $y_i$ = ACS direct survey estimate of free (or reduced-price) eligibility rate;
^{7} The SAIPE county model estimates one set of parameters across all counties. For the school meals programs, the Census Bureau addressed the issues of size variation by using the size partitions associated with 1-year, 3-year, and 5-year ACS estimates. These models may or may not adequately represent school districts within a county that may be very small and have very different urban/rural or other important properties. Census Bureau analysts stated that they do not have solid evidence as to whether the quality of the estimates can be extrapolated to very small areas. They did perform residual analysis, whereby it does not appear visually that excessive outliers are present at smaller sizes, but do not have any statistical testing to report. Appropriate partitioning and evaluation for very small areas is an ongoing field of research at the Bureau. The models for the school meals programs could similarly benefit from additional research.
^{8} The analysis conducted made it clear that a log transformation was not a good approach. However, no extensive specification search was performed for other transformations, and no testing for linearity of the chosen specification was conducted because of time and resource constraints. This could be a topic for further research. However, the range of estimates did not appear to be that extreme. There were outliers at 0 and 100, but excluding these, the 10th and 90th percentiles for the 2009 ACS dependent variables at the county level were 14-57 percent for free eligibility rates and 1-21 percent for reduced-price eligibility rates. Census Bureau analysts believed that one of the data characteristics driving poor fit for the reduced-price eligibility model was the limited range of the dependent variable.
· $Y_i$ = true population value of free (or reduced-price) eligibility rate;
· $e_i = y_i - Y_i$ = sampling error in $y_i$ as an estimate of $Y_i$;
· $x_i$ = vector of regression variables (see below);
· $\beta_k$ = vector of regression parameters for partition k (population size), k = {k1, k2, k3};
  k1 = counties with population less than 20,000;
  k2 = counties with population greater than or equal to 20,000 but less than 65,000; and
  k3 = counties with population greater than or equal to 65,000;
· $u_i$ = random model error (county random effect);
· $v_i$ = a generalized variance function (GVF) representation of the ACS sampling variance (the GVF is described below); and
· $\sigma_k^2$ = the model variance associated with partition k.
The independent variables that constitute the vector $x_i$ in the free eligibility model and reduced-price eligibility model are as follows:
· Free eligibility model
  - Tax income/poverty ratio: the ratio of the number of child exemptions in households with income less than or equal to 130 percent of the poverty level to the total number of child exemptions in the county
  - Child tax coverage ratio: the number of child exemptions on tax returns in the county divided by the total household population with age less than or equal to 19
  - Four-year average ACS rate: the average of the free eligibility rates for the other 4 years of the ACS^{9}
· Reduced-price eligibility model
  - Tax income/poverty ratio: the ratio of the number of child exemptions in households with income greater than 130 percent of the poverty level but less than or equal to 185 percent of the poverty level to the total number of child exemptions in the county
  - Four-year average ACS rate: the average of the reduced-price eligibility rates for the other 4 years of the ACS
Estimation of the parameters proceeds on the assumption that ACS sampling variances are known, using the GVF estimate, $\hat{v}_i$, described below, and iterating the weighted least squares regression equations to the maximum likelihood estimate of the model variance $\sigma_k^2$ for each partition k.
^{9} For example, in the model for 2008, this predictor is the average of the estimates for 2005, 2006, 2007, and 2009.
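One way to picture the estimation described above (weighted least squares iterated to a maximum likelihood estimate of the model variance, with sampling variances treated as known) is the sketch below. It uses Fisher scoring for the variance update and truncates at zero; both choices, the function name `fit_fay_herriot`, and the starting value are assumptions for illustration, not the Census Bureau's production code. In the partitioned setup it would be run once per population-size partition.

```python
import numpy as np


def fit_fay_herriot(y, X, v, n_iter=100, tol=1e-10):
    """Fit y_i = x_i' beta + u_i + e_i with Var(u_i) = sigma2 and Var(e_i) = v_i known.

    Alternates a weighted least squares step for beta with a Fisher scoring
    step for sigma2 until the variance estimate stops changing.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    v = np.asarray(v, dtype=float)
    # Crude starting value: total variance of y minus the average sampling variance.
    sigma2 = max(np.var(y) - v.mean(), 0.0)
    for _ in range(n_iter):
        w = 1.0 / (sigma2 + v)                  # WLS weights
        XtW = X.T * w                           # X' W without forming the diagonal matrix
        beta = np.linalg.solve(XtW @ X, XtW @ y)
        resid = y - X @ beta
        score = 0.5 * np.sum(w**2 * resid**2 - w)   # d loglik / d sigma2
        info = 0.5 * np.sum(w**2)                   # expected information
        new_sigma2 = max(sigma2 + score / info, 0.0)
        converged = abs(new_sigma2 - sigma2) < tol
        sigma2 = new_sigma2
        if converged:
            break
    # Final WLS pass at the converged variance, also returning Var(beta_hat).
    w = 1.0 / (sigma2 + v)
    XtW = X.T * w
    cov_beta = np.linalg.inv(XtW @ X)
    beta = cov_beta @ (XtW @ y)
    return beta, sigma2, cov_beta
```

The returned `cov_beta` is the usual weighted least squares covariance of the coefficients, which is the quantity needed later for prediction error variances.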
The GVF model used is as follows:

$$v_i = h(m_i)\, p_i (1 - p_i)$$

where
· $m_i$ = number of household respondents in ACS sample;
· $p_i$ = free (or reduced-price) eligibility rate; and
· $h(m_i) = k\, m_i^{d}\, e^{\epsilon_i}$.

Or transforming for estimation:

$$\log\left[\left(\frac{\hat{v}_i}{\hat{p}_i (1 - \hat{p}_i)}\right)^{1/2}\right] = \alpha_1 + \alpha_2 \log(m_i) + \epsilon_i$$

where
· $\hat{v}_i$ = direct successive difference estimate, and
· $\hat{p}_i$ = county-level fitted-value estimate of the free (or reduced-price) eligibility rate.
The parameters $\alpha_1$ and $\alpha_2$ were estimated with simple linear regression. The estimated value of $\alpha_2$ varies from -0.44 to -0.45 for all years, implying an exponent on $m_i$ of nearly negative 1.
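The GVF fit can be sketched as an ordinary least squares regression on the log scale. The sketch assumes the left-hand side is the log of the square root of the direct variance estimate divided by p(1 - p), a reading that is consistent with the remark that a slope near -0.45 implies an exponent on the sample size of nearly -1 in the variance. The function names and the toy data are hypothetical.

```python
import numpy as np


def fit_gvf(v_hat, p_hat, m):
    """Regress log sqrt(v_hat / (p_hat (1 - p_hat))) on log(m); return (a1, a2)."""
    v_hat = np.asarray(v_hat, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    m = np.asarray(m, dtype=float)
    z = 0.5 * np.log(v_hat / (p_hat * (1.0 - p_hat)))
    a2, a1 = np.polyfit(np.log(m), z, 1)  # polyfit returns slope first, then intercept
    return a1, a2


def gvf_variance(a1, a2, p, m):
    """Smoothed sampling variance implied by the fitted GVF."""
    # Squaring the fitted square-root-scale value doubles the exponent on m.
    return np.exp(2.0 * (a1 + a2 * np.log(m))) * p * (1.0 - p)


# Toy data in which the variance is exactly binomial, p(1 - p) / m,
# so the fitted slope should be -0.5 and the intercept 0.
m = np.array([50.0, 100.0, 200.0, 400.0])
p = np.array([0.2, 0.3, 0.4, 0.25])
a1, a2 = fit_gvf(p * (1.0 - p) / m, p, m)
```

Areas with a fitted rate of exactly 0 or 1 would make the left-hand side undefined and would need separate handling, which this sketch omits.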
School District-Level Estimates
For school district j in county i, there are two estimates for $Y_j$: the ACS direct estimate and a predicted value derived by plugging school district-level independent variables into a model with estimated parameters from the county-level model. Values for the school district tax variables (tax income/poverty ratio free, tax income/poverty ratio reduced-price, and child tax coverage level) are calculated using minimum-change synthetic estimates.^{10} Then, shrinkage estimates (empirical best predictions) for school districts (i.e., predictions of $Y_j$ for school district j) and the corresponding prediction error variances are computed by plugging the parameter estimates into the following standard formulas (Bell, 1999):
^{10} The tax variables are prepared by tallying all tax returns that have been coded to a specific district within a county and adding in a "synthetic" estimate for those tax returns that have been coded to the county but not to a specific district. The method used is described in Maples and Bell (2007).
$$\hat{Y}_j = (1 - w_j)\, y_j + w_j\, x_j' \hat{\beta}_k \tag{3}$$

where

$$w_j = \frac{v_j}{v_j + \sigma_k^2}$$

and

$$\operatorname{Var}(Y_j - \hat{Y}_j) = w_j \sigma_k^2 + w_j^2\, x_j' \operatorname{Var}(\hat{\beta}_k)\, x_j \tag{4}$$
The parameters $\hat{\beta}_k$ and variance $\sigma_k^2$ are estimates from the county model. The parameter $v_j$ is the GVF estimate^{11} for the variance of the direct ACS estimate for the district.
The standard error estimator in equation (4) does not account for estimation error in $\sigma_k^2$; an asymptotic correction for this error was found to be small in the past. Similarly, the estimator does not account for the varying quality of the synthetic estimates of the independent variables across school districts. Hence, $\sigma_k^2$ may be underestimated, leading to reported standard errors that are too low. Future research may be needed to address this issue.
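The shrinkage estimate and its prediction error variance can be sketched as follows, using a weight equal to the sampling variance divided by the sum of the sampling and model variances, so that a noisier direct estimate pulls the result toward the regression prediction. The function name and the input values are hypothetical, and the sketch shares the limitations just noted: it treats the model variance and the district-level predictors as known.

```python
import numpy as np


def shrinkage_estimate(y_j, x_j, v_j, beta_hat, sigma2, cov_beta):
    """Return the shrinkage estimate for a district and its prediction error variance.

    y_j      : ACS direct estimate for the district
    x_j      : district-level predictor vector
    v_j      : (GVF) sampling variance of y_j
    beta_hat : coefficient estimates from the county model
    sigma2   : model variance from the county model
    cov_beta : estimated covariance matrix of beta_hat
    """
    x_j = np.asarray(x_j, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    cov_beta = np.asarray(cov_beta, dtype=float)
    w_j = v_j / (v_j + sigma2)          # weight on the regression prediction
    prediction = x_j @ beta_hat
    estimate = (1.0 - w_j) * y_j + w_j * prediction
    pred_var = w_j * sigma2 + w_j**2 * (x_j @ cov_beta @ x_j)
    return estimate, pred_var


# Hypothetical district: direct rate 0.40, equal sampling and model variances,
# and no coefficient uncertainty, so the estimate is halfway to the prediction.
est, pred_var = shrinkage_estimate(
    0.40, [1.0, 0.5], 0.01, [0.1, 0.4], 0.01, np.zeros((2, 2))
)
```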
Results and Evaluation
Regression results for 2009, including estimated coefficients and summary statistics, are shown in Table C-1. Figure C-1 displays the median free and reduced-price eligibility rates estimated by the model over time. The median free eligibility rate showed a slight upturn in 2009, while the reduced-price eligibility rate was relatively flat. Figure C-2 shows the average across districts of 5-year ACS eligibility rates for free and reduced-price meals by size of school district. Figure C-3 shows the medians (across districts) of the relative standard errors for percentages eligible for free meals estimated by the model and from the 5-year and 1-year ACS by size of school district. Figure C-4 shows the same thing for percentage eligible for reduced-price meals. Figures C-5 and C-6, respectively, show the medians of the root mean squared difference (RMSD)^{12} (a measure of variation over time) for free-eligible and reduced-price-eligible percentages estimated by the model and from the 1-year ACS by size of district.
^{11} GVF is used to estimate the direct variance of the ACS estimates to reduce the volatility in this district-level shrinkage estimate.
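^{12} For a given single-year estimate, $\hat{Y}_{ti}$ for year t and area i, the RMSD is defined as

$$\mathrm{RMSD}_i = \left\{ \sum_{t=1}^{T} (1/T) \left( \hat{Y}_{ti} - \bar{Y}_i \right)^2 \right\}^{1/2}, \quad \text{where } \bar{Y}_i = \sum_{t=1}^{T} \hat{Y}_{ti} / T.$$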
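The RMSD definition translates directly into a few lines of code. This is a minimal sketch with hypothetical input series; the function name is an assumption.

```python
import math


def rmsd(estimates):
    """Root mean squared difference of single-year estimates around their own mean."""
    T = len(estimates)
    mean = sum(estimates) / T
    return math.sqrt(sum((y - mean) ** 2 for y in estimates) / T)


# A hypothetical 5-year series for one district that moves only slightly.
stable = rmsd([10.0, 12.0, 14.0, 12.0, 12.0])

# A series that alternates between 0 and 100 percent, as can happen with
# very small samples, reaches an RMSD of 50.
volatile = rmsd([0.0, 100.0, 0.0, 100.0])
```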
TABLE C-1 Regression Results for 2009
                                  Free Eligibility             Reduced-Price Eligibility
Resident Population Partitions    < 20k    20-65k   65k+       < 20k    20-65k   65k+
Coefficient Estimates, Z < 1.645, Z < 1.00
  Tax ratio                       0.74     0.75     0.39       0.51     0.50     0.33
  Child filing ratio              0.18     0.12     0.07
  Lagged ACS 4-year               0.28     0.31     0.65       0.01     0.07     0.33
No. of Counties                   1,321    1,024    792        1,321    1,024    792
Average Dependent Variable        33.3     34.2     29.5       12.0     10.6     8.7
Model Error Variance              0.9      0.3      0.0        0.7      0.2      0.0
R^2                               0.356    0.576    0.873      0.014    0.048    0.363
Standardized Residual
  Mean                            0.00     0.01     0.02       0.00     0.00     0.00
  Median                          0.08     0.01     0.05       0.26     0.15     0.08
Raw Residual
  Mean                            0.08     0.16     0.23       0.00     0.00     0.00
  Median                          1.16     0.12     0.19       2.80     0.96     0.20
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-1 not reproduced. Series: free, reduced price; y-axis: eligibility rate; x-axis: year, 2005-2009.]
FIGURE C-1 Median free and reduced-price eligibility rates estimated by the models over time.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-2 not reproduced. Series: free, reduced price; y-axis: eligibility rate; x-axis: size of school district.]
FIGURE C-2 Average 5-year ACS eligibility rates for free and reduced-price meals by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-3 not reproduced. Series: model results, ACS 5-year, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-3 Median of relative standard errors for percentages eligible for free meals estimated by the model and from the 5-year and 1-year ACS by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-4 not reproduced. Series: model results, ACS 5-year, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-4 Median of relative standard errors for percentages eligible for reduced-price meals estimated by the model and from the 5-year and 1-year ACS by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-5 not reproduced. Series: model results, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-5 Median of root mean squared differences (RMSDs) for free-eligible percentages estimated by the model and from the 1-year ACS by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-6 not reproduced. Series: model results, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-6 Median of root mean squared differences (RMSDs) for reduced-price-eligible percentages estimated by the model and from the 1-year ACS by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
Table C-2 shows the distribution of estimates, relative standard errors for 2009, and the RMSDs for free and for reduced-price eligibility rates estimated by the model and from the 1-year ACS.
Additional Analysis and Diagnostics
Figures C-7 and C-8 display data for free eligibility percentages, while Figures C-9 and C-10 display data for reduced-price eligibility percentages.^{13} Figures C-7 and C-9 display relative standard errors for model-based estimates and 1-year and 5-year ACS estimates. Figures C-8 and C-10 display the medians of the RMSDs for the model-based and 1-year ACS estimates.
School Attendance Area Estimates
The methodology for school attendance areas is the same as that for school districts:
· The parameters $\hat{\beta}_k$ and variance $\sigma_k^2$ are estimates from the county model.
· The prediction for a school attendance area is the empirical Bayes shrinkage estimate using:
  - the fitted value $x_s' \hat{\beta}_k$, where $x_s$ is the vector of independent variables for school attendance area s computed using the synthetic estimation method described for school districts;
  - $y_s$, the ACS direct estimate for school attendance areas;
  - $v_s$, the variance of $y_s$, calculated using the same GVF as described for the county and district methodology; and
  - the shrinkage estimation methodology described for school districts.
· The school attendance areas are overlapping with respect to both geography and grade ranges,^{14} so it was impractical to construct a primitive and rake to school district estimates.
The Census Bureau provided the following observations about the
choice of prediction methods for school districts and school attendance
areas for this study, relative to the shares methodology used for current
SAIPE school district production:
^{13} Figures in this section cover only those districts with combined free and reduced-price eligibility rates over 70 percent, as measured by 5-year average empirical Bayes rate modeled estimates.
^{14} For example, in many places there are elementary, middle, and secondary schools that serve the same geographic area.
TABLE C-2 Distribution of Estimates, Relative SEs (2009) and 5-Year RMSDs for Free and Reduced-Price Eligibility Rates Estimated by the Model and from the 1-Year ACS
Variable                N        Min.   1st Pctl.  5th Pctl.  25th Pctl.  50th Pctl.  75th Pctl.  95th Pctl.  99th Pctl.  Max.
Free
  Model Est., 2009      13,753   0.2    2.3        5.8        16.2        26.5        37.7        54.4        65.9        95.6
  Model Rel. SE, 2009   13,753   0.02   0.1        0.1        0.2         0.3         0.5         1.2         3.2         33.5
  Model RMSD, 05-09     13,753   0.03   0.5        0.9        1.8         2.8         4.1         6.6         9.5         24.6
  ACS Est., 2009        13,347   0      0          0          9.5         25.3        44.0        75.7        100         100
  ACS Rel. SE, 2009     13,753   0.02   0.1        0.2        0.4         0.6         1.0         2.5         6.2         123.4
  ACS RMSD, 05-09       13,687   0      0          2.3        6.7         11.5        18.6        33.3        43.3        50
Reduced Price
  Model Est., 2009      13,753   0.4    1.9        3.4        6.7         9.5         12.9        19.3        25.1        55.1
  Model Rel. SE, 2009   13,753   0      0.2        0.3        0.4         0.6         0.9         1.6         2.3         5.1
  Model RMSD, 05-09     13,753   0.1    0.4        0.7        1.8         2.7         3.9         6.2         8.5         26.7
  ACS Est., 2009        13,347   0      0          0          0           6.0         13.8        36.5        68.3        100
  ACS Rel. SE, 2009     13,347   0.1    0.2        0.3        0.6         1.1         1.6         2.7         3.8         14.4
  ACS RMSD, 05-09       13,687   0      0          0          3.7         6.8         11.9        24.9        40.9        50
Free + Reduced Price
  Model Est., 2009      13,753   1.4    5.4        10.9       25.2        37.6        49.9        67.0        79.1        99.5
NOTE: ACS = American Community Survey; RMSD = root mean squared difference; SE = standard error.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-7 not reproduced. Series: model results, ACS 5-year, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-7 Median of relative standard errors for model-based and 1-year and 5-year ACS-based free eligibility percentages by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-8 not reproduced. Series: model results, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-8 Median of root mean squared differences (RMSDs) for model-based and 1-year ACS-based free eligibility percentages by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-9 not reproduced. Series: model results, ACS 5-year, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-9 Median of relative standard errors for model-based and 1-year and 5-year ACS-based reduced-price eligibility percentages by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
[Figure C-10 not reproduced. Series: model results, ACS 1-year; y-axis label: Percentage Eligible; x-axis: size of school district.]
FIGURE C-10 Median of root mean squared differences (RMSDs) for model-based and 1-year ACS-based reduced-price eligibility percentages by size of school district.
SOURCE: Provided to the panel May 12, 2011, by the U.S. Census Bureau.
· The Census Bureau could not use the SAIPE relative error methodology to evaluate the estimation error of the eligibility rates for the school meals programs because it requires an independent source of poverty estimates.
· The SAIPE model uses shares from the 2000 decennial census long form as an independent variable. These shares are now 10 years old. The Census Bureau has not evaluated the use of shares from the 5-year ACS but suspects that they are less reliable. The models for the school meals programs do not use the decennial census data as an independent variable.
· The SAIPE shares methodology for the 2008 estimates did not use the direct ACS current-year estimate, so there would be a potential loss of information relative to the school meals model.
· The shares methodology is a two-step process, adding estimation error at each step.
PANEL'S SUGGESTIONS FOR MODELING ELIGIBILITY PERCENTAGES FOR THE SCHOOL MEALS PROGRAMS
As noted previously, the models for the school meals programs were developed quickly as a proof of the concept that using SAIPE-like small area models for the school meals programs might provide accurate and timely estimates of eligibility. The panel considers that the work done to date demonstrates the feasibility of such an approach. While the model-based eligibility estimates for the school meals programs are timely, they did not prove to be as accurate as the 5-year ACS direct estimates. Accordingly, the panel believes that this promising approach would benefit from further research, particularly if the ACS Eligibility Option (AEO) is adopted.
Among general topics that might warrant research are (1) variations in the synthetic method used to determine school district or school attendance area estimates, (2) consideration of transformations of the variables entering model equations to improve modeling of county data, and (3) variations on the use of partitioning of county data to improve performance at the school district and school attendance area levels. The following are the panel's specific suggestions concerning approaches for improving the models:
· While the school meals programs are not "fixed-pie" fund allocation programs, controlling estimates to higher levels of geography should give the estimates greater precision and lower bias, while also improving face validity.
· Joint modeling of free and reduced-price percentages might improve the estimates. Because the two percentages are correlated (in both cases), joint modeling should improve efficiency.
· More generally, cross-sectional and time-series models using several years of ACS data could be specified and estimated to improve efficiency. See, for example, Datta, Lahiri, and Lu (1999). This approach would be preferable to using the average of four 1-year estimates as a predictor variable.
· While assuming that estimated eligibility percentages follow normal distributions may be reasonable in some instances, it is not a good assumption for small samples (as for the school attendance areas in a small or medium-sized district) or for small percentages (such as reduced-price percentages) with skewed distributions or many estimates of 0. Better approaches include transformation of the percentage, assuming a discrete distribution, using a mixed distribution, or using a linking distribution defined in [0,1], such as the logistic or beta.
· Variance estimation might be improved. For variances of direct estimates, the approach to GVF modeling should be compared to approaches in the literature. For estimating model variances, generalized maximum likelihood estimation methods have been developed that are consistent and strictly positive (in contrast to variance components methods). Another possibility is to use hierarchical Bayes or some simple approximations, such as the adjustment for density maximization method described in Morris and Tang (2011).
· Exchangeability assumptions on regression coefficients and model variances could be relaxed by introducing heterogeneity using different regression coefficients and model variances for different groups based, for example, on administrative estimates of the percentage of students eligible for free or reduced-price meals, as well as the size of the resident population.