5 Small-Area Estimation
Pages 63-79



From page 63...
... In addition, the distinction between measurement error and definitional vagueness, an issue related to data quality, was raised during the discussion period; a summary of that discussion is included as a short section at the end of this chapter.

OVERVIEW

Julie Gershunskaya of the Bureau of Labor Statistics presented a survey of current methods for small-area estimation that have been found useful in various federal statistical applications.
From page 64...
... Gershunskaya said at the outset that the best strategy to avoid reliance on small-area estimation is to provide for sufficiently reliable direct estimates for the domains of interest at the sample design stage. However, it is typical for surveys carried out by federal statistical agencies to have insufficient sample sizes to support estimates for the small domains requested by the user communities.
From page 65...
... To address this, there are various alternative direct estimators that may outperform the Horvitz-Thompson estimator, especially when auxiliary data are available.

Ratio Estimators

To discuss these estimators, additional notation is needed.
From page 66...
... If there is a substantial correlation between R&D expenditure and payroll size, the ratio estimator may provide a marked improvement over the Horvitz-Thompson estimator. A particular case of the ratio estimator arises when $x_j$ equals 1 if unit j is in the dth domain and 0 otherwise; this is referred to as the post-stratified estimator.
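To make the two estimators concrete, the following is a minimal sketch in Python; the function names, argument names, and use of NumPy are illustrative assumptions, not part of the presentation.

import numpy as np

def ratio_estimator(y, x, w, X_total):
    """Ratio estimator of a domain total.
    y: responses (e.g., R&D expenditure) for the sampled units
    x: auxiliary values (e.g., payroll) for the same units
    w: survey weights
    X_total: known population total of the auxiliary variable
    """
    ratio = np.sum(w * y) / np.sum(w * x)   # weighted sample ratio of y to x
    return X_total * ratio                  # scale up the known auxiliary total

# Post-stratified special case: x_j = 1 if unit j is in domain d and 0 otherwise,
# so X_total becomes the known domain population count N_d.
def post_stratified_estimator(y, in_domain, w, N_d):
    d = np.asarray(in_domain, dtype=bool)
    return N_d * np.sum(w[d] * y[d]) / np.sum(w[d])

As the text notes, the payoff over the Horvitz-Thompson estimator comes from a strong correlation between the auxiliary variable and the response.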
From page 67...
... $= X_d \hat{B} + \sum_{j \in s_d} w_j \left( y_j - x_j \hat{B} \right)$, which shows that the survey regression estimator is the sum of the fitted values from a regression model, based on the predictors for the domain of interest, plus a bias correction obtained by weighting the regression residuals using data only from that domain.
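A corresponding sketch of this survey regression (GREG-type) estimator, under the same illustrative assumptions (hypothetical function and argument names), might look as follows.

import numpy as np

def survey_regression_estimator(y, X, w, in_domain, X_d_totals):
    """Survey regression estimator of a domain total.
    y: responses for all sampled units
    X: auxiliary design matrix for all sampled units (n x p)
    w: survey weights
    in_domain: boolean mask marking sampled units in domain d
    X_d_totals: known population totals of the auxiliary variables in domain d (length p)
    """
    # Weighted least-squares coefficients B-hat estimated from the full sample
    WX = X * w[:, None]
    B_hat = np.linalg.solve(X.T @ WX, X.T @ (w * y))
    # Fitted part evaluated at the known auxiliary totals for the domain ...
    fitted = X_d_totals @ B_hat
    # ... plus the weighted residual (bias) correction over the domain's sampled units
    d = np.asarray(in_domain, dtype=bool)
    resid_corr = np.sum(w[d] * (y[d] - X[d] @ B_hat))
    return fitted + resid_corr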
From page 68...
...

Synthetic Estimator

To describe synthetic estimation, Gershunskaya began with the usual direct estimate of the domain mean from a simple random sample, namely $\bar{y}_d = \frac{1}{n_d} \sum_{j \in s_d} y_j$. Unfortunately, this estimator can be unreliable if the sample size in the domain is small, so one would want to use data from the other domains to improve its reliability.
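One common form of synthetic estimation borrows the estimate of a larger group that the small domain belongs to, on the assumption that the domain behaves like its group; the sketch below is illustrative only, since the excerpt does not spell out the exact synthetic estimator used.

import numpy as np

def direct_mean(y_d):
    """Direct estimate: mean of the n_d sampled responses in domain d."""
    return np.mean(y_d)

def synthetic_mean(y_all, group_of_unit, group_of_domain_d):
    """Simple synthetic estimate for domain d: the mean of the larger group
    that domain d belongs to, which can be estimated reliably because it
    pools data across many domains."""
    in_group = group_of_unit == group_of_domain_d
    return np.mean(y_all[in_group])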
From page 69...
... This estimator can be depicted as: survey regression equals model plus bias correction.
From page 70...
... SPREE assumes that initial estimates of the individual cell totals, $C_{im}$, are available from a previous census or from administrative data, though these may be substantially biased. The approach also assumes that the sample from a current survey is large enough that one can obtain direct sample-based estimates of the marginal totals, denoted $\hat{Y}_i$ and $\hat{Y}_m$.
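SPREE is typically carried out by iterative proportional fitting (raking) of the census cell totals to the current survey-based margins, which preserves the interaction structure of the old table while matching the new margins. The following minimal sketch makes that explicit; the array names, tolerance, and iteration cap are illustrative assumptions.

import numpy as np

def spree_raking(C, row_totals, col_totals, tol=1e-8, max_iter=1000):
    """Adjust census cell totals C (rows i, columns m) by iterative
    proportional fitting so that the adjusted table reproduces the
    survey-based marginal estimates row_totals (the Y_i-hat) and
    col_totals (the Y_m-hat). The two sets of margins are assumed to be
    consistent (same grand total)."""
    T = np.asarray(C, dtype=float).copy()
    for _ in range(max_iter):
        # Scale rows to match the estimated row margins
        T *= (row_totals / T.sum(axis=1))[:, None]
        # Scale columns to match the estimated column margins
        T *= (col_totals / T.sum(axis=0))[None, :]
        if np.allclose(T.sum(axis=1), row_totals, rtol=tol):
            break
    return T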
From page 71...
...

Methods Based on Explicit Models

In contrast to these approaches that are based on implicit models, the final general category of estimators described by Gershunskaya covers methods based on explicit models. Explicitly stated modeling assumptions allow for the application of standard statistical methods for model selection, model evaluation, the estimation of model parameters, and the
From page 72...
... The authors used the following set of auxiliary variables: county-level per capita income, the value of owner-occupied housing, and the average adjusted gross income per exemption. Fay-Herriot models are often represented using two-level model assumptions: the sampling model and the linking model.
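For reference, the two levels of the Fay-Herriot model are commonly written in the following generic form (the notation here is standard, but not necessarily that used in the presentation):

Sampling model: $\hat{\theta}_d = \theta_d + e_d$, with $e_d \sim N(0, D_d)$ and the sampling variances $D_d$ treated as known.
Linking model: $\theta_d = x_d^{\top} \beta + v_d$, with $v_d \sim N(0, A)$.

Here $\hat{\theta}_d$ is the direct estimate for domain d, $x_d$ collects the domain-level auxiliary variables, and $A$ is the variance of the area-level random effects.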
From page 73...
... The above composite form shows that the direct estimates are shrunk toward the synthetic part: the smaller A is (i.e., the better the linking model explains the underlying relationship), the more weight goes to the synthetic (i.e., model-based)
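This shrinkage corresponds to the usual composite form of the Fay-Herriot predictor and can be computed directly once the model variance A and the regression coefficients have been estimated. A minimal Python sketch, with illustrative names:

import numpy as np

def fay_herriot_composite(theta_direct, X, beta_hat, A, D):
    """Composite (shrinkage) form of the Fay-Herriot predictor.
    theta_direct: direct estimates, one per domain
    X: domain-level auxiliary matrix (n_domains x p)
    beta_hat: estimated coefficients of the linking model
    A: estimated variance of the area-level random effects
    D: known (or smoothed) sampling variances of the direct estimates
    """
    synthetic = X @ beta_hat        # model-based (synthetic) part
    gamma = A / (A + D)             # weight on the direct estimate
    # Smaller A (a better linking model) means smaller gamma, so more
    # weight goes to the synthetic part, as described in the text.
    return gamma * theta_direct + (1.0 - gamma) * synthetic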
From page 74...
... The assumption is that, ignoring an error term, the state-level R&D funds in industry type i are proportional to the state's total payroll, which can be expressed as $\theta_{im} = X_{im} B_i + \nu_{im}$ (linking model)
From page 75...
... Both the Fay-Herriot and the Battese, Fuller, and Harter models are examples of linear mixed models, Gershunskaya noted.

Smoothing Over Time

None of the models presented so far examined the potential gains from using the dependent variable collected in previous time periods.
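For reference, the unit-level model of Battese, Fuller, and Harter mentioned just above is commonly written in the generic nested-error form

$y_{jd} = x_{jd}^{\top} \beta + v_d + e_{jd}$, with $v_d \sim N(0, \sigma_v^2)$ and $e_{jd} \sim N(0, \sigma_e^2)$,

where j indexes sampled units within domain d; the area-level random effect $v_d$ and the unit-level error $e_{jd}$ are what make it, like the Fay-Herriot model, a linear mixed model. This notation is standard and not necessarily that of the presentation.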
From page 76...
... , a modified form of the previous alternative direct sample estimator can be defined as
From page 77...
... • Using a statistical model supports a systematic approach for a given problem: (a) the need for explicitly stated assumptions, (b)
From page 78...
... She said that her presentation was already detailed and that some complicated issues therefore could not be included. Eric Slud added that the sample survey context made parts of this particular application more difficult than in the literature Horowitz was referring to.
From page 79...
...

MEASUREMENT ERROR OR DEFINITIONAL VAGUENESS

As noted above, an issue that arose during the workshop concerned the need to better understand sources of error underlying NCSES survey and census responses. Survey data are subject to sampling error and nonsampling error, with nonsampling error often decomposed into nonresponse error and measurement error.

