Weighting and Estimation
As is the case with the sample design, the current weighting and estimation procedures are not optimized for small-area estimates. Coverage adjustment for the group quarters (GQ) population is applied at the state level by GQ type categories. Under the current estimation procedures, only the total population (households and group quarters combined) is controlled at the county level. When small geographic areas with GQ populations have no group quarters represented in the sample, the state-level adjustments disproportionately increase the weights of group quarters in other areas. As a result, for some small-area data users, the 5-year estimates may not reflect local reality.
The 1-year and 3-year data releases from the American Community Survey (ACS) have been subject to dissemination restrictions based on the reliability of the estimates. The Census Bureau has decided not to apply such restrictions to the 5-year estimates, primarily because doing so would preclude the use of census tract and block group data for aggregation to larger user-defined geographic areas. Block group data will not be available through the American FactFinder interactive application on the Census Bureau’s website; they will be accessible only through the download center, which is targeted at more advanced users. As with all ACS data, the estimates will be accompanied by margins of error.
To produce the GQ estimates, the data are weighted in three steps. The first step applies a trimmed base weight that reflects the initial sampling probability and the within-GQ subsampling probability. The second step is a noninterview adjustment across group quarters, defined within state, by county and by major GQ type group. If the sample is small or if the adjustment is large, the cells are collapsed to state by major GQ type group. The third step applies a coverage adjustment, controlling the GQ data at the state level by major GQ type group, using the GQ population estimates from the Population Estimates Program (PEP).
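The three weighting steps can be sketched in code. This is a minimal illustration under stated assumptions, not the Census Bureau's production procedure: the record fields, the trimming cap, and the collapsing thresholds (`trim_cap`, `min_respondents`, `max_factor`) are hypothetical, and the sketch assumes that when any county cell fails a threshold, all cells for that state and GQ type group are collapsed together.

```python
from collections import defaultdict

def weight_gq_sample(records, controls, trim_cap=500.0,
                     min_respondents=2, max_factor=2.0):
    """Illustrative three-step GQ weighting (hypothetical field names).

    records:  list of dicts with keys state, county, gq_type,
              p_select, p_subsample, responded (modified in place).
    controls: {(state, gq_type): population control total}.
    Returns the responding records with a 'final_w' weight.
    """
    def fine(r):
        return (r["state"], r["county"], r["gq_type"])

    def coarse(r):
        return (r["state"], r["gq_type"])

    # Step 1: trimmed base weight = inverse of the overall selection probability.
    for r in records:
        r["base_w"] = min(1.0 / (r["p_select"] * r["p_subsample"]), trim_cap)

    def sums(key):
        total, resp, n = defaultdict(float), defaultdict(float), defaultdict(int)
        for r in records:
            k = key(r)
            total[k] += r["base_w"]
            if r["responded"]:
                resp[k] += r["base_w"]
                n[k] += 1
        return total, resp, n

    # Step 2: noninterview adjustment within state x county x GQ type,
    # collapsing to state x GQ type when a cell is small or its factor is large.
    f_tot, f_resp, f_n = sums(fine)
    c_tot, c_resp, _ = sums(coarse)
    collapse = set()
    for k, total in f_tot.items():
        if (f_n[k] == 0 or f_n[k] < min_respondents
                or total / f_resp[k] > max_factor):
            collapse.add((k[0], k[2]))

    respondents = [r for r in records if r["responded"]]
    for r in respondents:
        if coarse(r) in collapse:
            factor = c_tot[coarse(r)] / c_resp[coarse(r)]
        else:
            factor = f_tot[fine(r)] / f_resp[fine(r)]
        r["ni_w"] = r["base_w"] * factor

    # Step 3: coverage adjustment to state-level controls by GQ type group.
    est = defaultdict(float)
    for r in respondents:
        est[coarse(r)] += r["ni_w"]
    for r in respondents:
        k = coarse(r)
        r["final_w"] = r["ni_w"] * controls[k] / est[k]
    return respondents
```

Because the final ratio adjustment is computed only at the state by GQ type level, the weighted totals match the controls for the state but nothing constrains the totals for any individual county.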
PEP CONTROLS AND ALTERNATIVES
To estimate subcounty GQ populations, the Census Bureau starts with GQ population counts by facility type for each subcounty area from the previous decennial census and updates them with a time series of individual GQ records from the group quarters report (GQR). The GQR is an annual count of group quarters populations prepared by Federal-State Cooperative for Population Estimates program units (Census Bureau, 2008b). A time series of the GQ population is derived in two steps. First, facility-level GQ populations from the GQR are summed to the subcounty level by facility type for each estimate date in the time series. Second, a year-to-year change is calculated from the aggregated GQR population time series. For some GQ types, the population estimates may be out of date, since they are essentially the decennial census counts held constant.
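The two-step derivation can be illustrated with a short sketch. The data layout and the function name are hypothetical, and the sketch assumes that each year-to-year change is added to the census base; the report does not spell out how the change is applied. Cells with no GQR records in two consecutive years simply carry the previous value forward, which mirrors the case of census counts being held constant.

```python
from collections import defaultdict

def update_gq_estimates(census_base, gqr_records, years):
    """Illustrative GQ population update (hypothetical data layout).

    census_base: {(area, gq_type): count} at the census date.
    gqr_records: iterable of (year, area, gq_type, facility_population).
    years:       ordered list of estimate years; years[0] is the census year.
    Returns {(area, gq_type): {year: estimate}}.
    """
    # Step 1: sum facility-level GQR populations to area x facility type x year.
    agg = defaultdict(lambda: defaultdict(float))
    for year, area, gq_type, pop in gqr_records:
        agg[(area, gq_type)][year] += pop

    # Step 2: apply the year-to-year change in the aggregated series
    # to the census base (assumed additive here).
    estimates = {}
    for key, base in census_base.items():
        cell = agg.get(key, {})
        current = float(base)
        series = {years[0]: current}
        for prev, year in zip(years, years[1:]):
            if prev in cell and year in cell:
                current += cell[year] - cell[prev]
            # Otherwise no update is available and the value is held constant.
            series[year] = current
        estimates[key] = series
    return estimates
```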
As the decade progresses, the census counts become increasingly outdated, and the updates, such as the GQRs collected from the states, cannot always be counted on, which affects the quality of the population estimates. Following the release of counts from the decennial census, the Census Bureau typically conducts a formal evaluation of errors (bias and precision) in its population estimates for various levels of geography. These tests generally treat the census counts as the gold standard against which the population estimates are evaluated. The Census Bureau recently proposed a project to evaluate the 2010 round of population estimates against the 2010 census counts and awarded eight contracts to external researchers to evaluate alternative population estimation methodologies.
The purpose of this program will be to evaluate the current method by comparing the population estimates of the total resident population and the household population at the national, state, and county levels with the census counts. The plan is to examine the national, state, and county population estimates by selected characteristics (e.g., age, sex, race, Hispanic origin). At the subcounty level the plan will be to evaluate both the subcounty population totals and the housing unit estimates. Population estimates developed using the housing unit method at the national, state, county, and subcounty levels will also be evaluated. The population estimates produced using a housing unit method will be for the total resident population and the household population; they will not include any demographic characteristics data, nor will they provide information about the GQ population.
Despite uncertainty surrounding the quality of the GQ estimates prepared by the PEP, the proposed evaluation research, as planned, will focus only on the estimated total population (household and GQ populations combined) and the estimated household population, each compared with the 2010 census counts. The Census Bureau plans to consider the GQ estimates separately at a later time, but this could be a missed opportunity to better understand the challenges surrounding the GQ population estimates in relation to the total population estimates and to inform the deliberations about the role of the GQ populations in the ACS. The panel thinks that an evaluation of the GQ estimates should be conducted along with the evaluation of other aspects of the population estimates program.
Recommendation 4-1: The Census Bureau should consider amending its current plan for evaluating the 2010 population estimates against the 2010 census counts to include an evaluation of the 2010 estimates of the GQ population at all levels of geography for which such estimates are prepared. This research should identify estimated bias and imprecision by GQ type. The evaluation of the 2010 population estimates should also be viewed as an opportunity to foster a closer collaboration between the Population Estimates Program and the ACS office to ensure that the estimates meet the needs of all users.
As discussed, the population controls for GQ estimates in the ACS products are currently applied at the state level. This choice also needs to be considered in the context of the controls' effect on mean squared error, given that inaccurate population controls are more likely to introduce error than to reduce it. While there are arguments for considering county or even subcounty controls, this may be unrealistic, because GQ types are often collapsed due to small sample size or large adjustments. Alternatives would be to control for demographics and drop controls for GQ type or to limit the use of controls to those GQ types for which the population controls are most reliable.
Recommendation 4-2: Depending on the outcome of the evaluation discussed in Recommendation 4-1, the Census Bureau should also evaluate the relative advantages and disadvantages of developing control totals by demographic characteristics, possibly in addition to the control totals by GQ type.
It is likely that the population controls for some GQ types are inadequate, but alternatives exist and should be considered. If the updates received from outside sources about some GQ types are of adequate quality, it may be possible to use these population estimates instead. For example, the Defense Manpower Data Center or Bureau of Prisons records may supply better data than the current approach of updating the census counts for military and correctional facilities. Group quarters also maintain basic administrative records about their residents. If these facility-level records include enough information to produce population counts by demographic cross-classification, they could also be used as controls.
Recommendation 4-3: The Census Bureau should evaluate the possibility of using as controls population estimates from some of the outside sources that are currently used to provide updates for the sampling frame (also see Recommendations 2-1 and 2-2).
As discussed in Chapter 2, state and other local resources are underutilized as sources of data. Considering the limitations and costs of the current procedures, it may be worth exploring the possibility of obtaining state-generated estimates of GQ populations, as recommended in Chapter 2.
Recommendation 4-4: Whenever possible, the Census Bureau should work with existing state-level partners to explore the use of state-generated estimates of GQ populations.
ESTIMATES OF THE GQ POPULATION IN SMALL AREAS
The decennial census, because of its role in providing complete counts of the population down to the census block level, mostly succeeds in completely enumerating the GQ population everywhere and is able to support counts by GQ type for all entities in the census geographic hierarchy. In contrast, the state-based sample design of the ACS is not an efficient vehicle for providing small-area estimates of the GQ population.
The ACS substate samples are highly variable, particularly by GQ type, and there are large fluctuations over time in the characteristics associated with residence in group quarters. In some cases, this variation results in counties with known GQ facilities within their administrative boundaries having no group quarters represented in the sample. At lower geographic levels this is a common occurrence, with approximately half of the census tracts that have group quarters according to the sampling frame ending up with none selected in the sample after four years (Asiala, 2010).
For example, the ACS data for Elmore County, Alabama, seem to suggest that the poverty rate in the county dropped from 14 to 10.4 percent between 2006 and 2007. However, a closer examination of the role of the group quarters in the sample reveals that the apparent change is largely explained by the fact that in 2006 the ACS estimate of the GQ population for the county was 1,976, and 90 percent of the GQ residents were in poverty. In 2007 no group quarters were included in the sample, so the 10.4 percent poverty rate for that year is essentially the household poverty rate, which is not very different from the 11.8 percent household poverty rate in 2006 (Asiala, 2010).
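The arithmetic behind this example can be checked with a short calculation. The sketch below treats the county poverty rate as a population-weighted average of the household and GQ rates and backs out the household population implied by the rounded figures quoted above. That implied population (roughly 68,000) is a derived quantity for illustration only, not a published estimate, and the function names are hypothetical.

```python
def overall_poverty_rate(hh_pop, hh_rate, gq_pop, gq_rate):
    """Population-weighted poverty rate for households plus group quarters."""
    poor = hh_pop * hh_rate + gq_pop * gq_rate
    return poor / (hh_pop + gq_pop)

def implied_household_pop(total_rate, hh_rate, gq_pop, gq_rate):
    """Household population for which the weighted rate equals total_rate."""
    return gq_pop * (gq_rate - total_rate) / (total_rate - hh_rate)

# Rounded 2006 figures quoted in the text (Asiala, 2010).
GQ_POP_2006, GQ_RATE_2006 = 1976, 0.90
HH_RATE_2006, TOTAL_RATE_2006 = 0.118, 0.14

# Derived, not published: household population consistent with those figures.
hh_pop_2006 = implied_household_pop(
    TOTAL_RATE_2006, HH_RATE_2006, GQ_POP_2006, GQ_RATE_2006)

# Share of the population living in group quarters under that assumption.
gq_share = GQ_POP_2006 / (hh_pop_2006 + GQ_POP_2006)

# Recombining the two groups reproduces the 14 percent overall rate.
rate_check = overall_poverty_rate(
    hh_pop_2006, HH_RATE_2006, GQ_POP_2006, GQ_RATE_2006)

# Percentage points that the GQ population adds to the household rate.
gq_effect_points = 100 * (TOTAL_RATE_2006 - HH_RATE_2006)
```

Under these rounded inputs, GQ residents make up less than 3 percent of the implied county population yet account for about 2.2 percentage points of the 2006 poverty rate, which is why their absence from the 2007 sample moves the published rate so much.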
Among the difficulties facing the Census Bureau in this regard is the goal of identifying and reporting, on the basis of a sample survey, a sparse and irregularly distributed GQ population for small geographic units. This is a fundamental tension arising from conflicting goals, and it leads to sample-based estimates that have, for the statistician, very large standard errors and, for the unsophisticated data user, numbers that simply make no sense. By definition, a survey sample will not include all households or all GQ facilities. When the only GQ facility in a small area is not selected for interviews, the sample-based estimate of the GQ population for that small area will be zero.
Acknowledging that the Census Bureau does not want to consider release restrictions for the 5-year estimates, the panel thinks that it is important to ensure that the published data resonate with reality from the perspective of small geographic areas. One statistical solution to consider is the use of some type of small-area estimate. There are a variety of estimators in this class, ranging from simple to complex. Which type would be both feasible and an improvement over the current method is a subject for study. The Census Bureau for many years has employed a variation of this approach as part of its Small Area Income and Poverty Estimates (SAIPE) program. It produces annual small-area income and poverty estimates for school districts, counties, and states using a model-based approach that relies on combining survey data with population estimates and administrative records (National Research Council, 2000). If a small-area estimate were used for the total GQ population, for example for a county, an additional dilemma would arise. A decision would have to be made about whether acceptable small-area estimates could be made for the GQ totals in demographic groups in the small area. If this is not possible, it may be reasonable to simply report a small-area estimate for the total GQ population without breakdown by characteristics. Breakdowns by characteristics for that area would refer only to the household population.
An option would be to use a composite of a small-area model estimate and direct estimate. If the geographic entity has group quarters but the sample has none, then the direct estimate would receive a weight of zero. Otherwise, a combination estimate could be used that accounts for the variance of each estimate.
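One common way to form such a composite is inverse-variance weighting, sketched below. This is an illustration under stated assumptions rather than a method the Census Bureau has adopted: the function name and arguments are hypothetical, the model variance is treated as a stand-in for the model estimate's mean squared error, and any covariance between the two estimates is ignored.

```python
def composite_estimate(direct, var_direct, model, var_model, gq_in_sample=True):
    """Illustrative composite of a direct and a model-based GQ estimate.

    Returns (estimate, weight_on_direct). When the area has group quarters
    on the frame but none in the sample, the direct estimate gets weight zero.
    """
    if not gq_in_sample:
        return float(model), 0.0
    total_var = var_direct + var_model
    if total_var <= 0:
        raise ValueError("at least one variance must be positive")
    # Inverse-variance weighting: the noisier estimate receives less weight.
    w_direct = var_model / total_var
    estimate = w_direct * direct + (1.0 - w_direct) * model
    return estimate, w_direct
```

For instance, a direct estimate of 1,000 with variance 400 combined with a model estimate of 1,200 with variance 100 puts a weight of 0.2 on the direct estimate, so the composite sits much closer to the more precise model value.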
Sources of GQ data that could be used in the model include, but are not likely to be limited to, counts of residents and group quarters for small areas as shown on the frame, the previous census counts of GQ population by small area, data provided by state or local agencies regarding GQ populations, or possibly the PEP subcounty estimates of the GQ population. Another option would be to investigate the use of administrative records maintained by GQ facilities for this purpose, even if these records are found not to be comprehensive enough to replace interviews with residents.
The best estimate to use may depend on how old the latest census counts are at any particular point. The census counts could be used in the years immediately following the decennial census, but a few years later the PEP numbers or information obtained from administrative records may be more reliable.
An additional issue to consider is how the unreliability of the GQ sampling frame may affect synthetic small-area estimates. A similar effort, the Local Area Unemployment Statistics (LAUS) program of the Bureau of Labor Statistics, uses state-level estimates from the Current Population Survey (CPS) as input to create model-based state-level estimates. It has found that the direct CPS estimates of unemployment for lower levels of geography are not reliable enough to publish (Pfeffermann and Tiller, 2006).
Recommendation 4-5: The Census Bureau should evaluate methods for producing estimates for counties in which group quarters are known to exist based on the frame but are not included in the sample. The simplest method may be to use the county GQ count from the decennial census. A slightly more complex method would be to use a synthetic estimator, or another straightforward small-area estimator. The evaluation would ideally be completed and changes would be implemented before the 2011 ACS data products are released.
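As an illustration of the synthetic option, the sketch below allocates a state-level GQ estimate to a county in proportion to the county's share of frame residents, separately by GQ type. This share-of-frame allocation is only one simple form a synthetic estimator could take; the function name and inputs are hypothetical, and the result inherits any errors in the sampling frame.

```python
def synthetic_county_gq(state_estimates, frame_state, frame_county):
    """Illustrative synthetic estimate of a county's total GQ population.

    state_estimates: {gq_type: state-level survey estimate of GQ population}.
    frame_state:     {gq_type: frame count of GQ residents in the state}.
    frame_county:    {gq_type: frame count of GQ residents in the county}.
    """
    total = 0.0
    for gq_type, county_count in frame_county.items():
        state_count = frame_state.get(gq_type, 0)
        if state_count > 0:
            # County share of the state frame, applied to the state estimate.
            share = county_count / state_count
            total += state_estimates.get(gq_type, 0.0) * share
    return total
```

A county with GQ facilities on the frame therefore receives a nonzero estimate even when none of its facilities was selected for the sample.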
PUBLICATION OF DATA PRODUCTS
Given that small-area estimates based on the 5-year data are expected to be unreliable in some areas in which GQ residents represent a large proportion of the population, it will be important to flag data products that are affected by the presence of group quarters for a particular geography. This should be considered in addition to publishing the margins of error. One approach would be to flag tables applicable to areas in which there are group quarters in the administrative area but not in the sample.
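The flagging approach described above amounts to a simple comparison of the frame with the sample. The sketch below is a minimal illustration with hypothetical inputs (counts of GQ facilities per area on the frame and in the sample); an operational rule would likely need additional criteria.

```python
def flag_areas_missing_gq(frame_gq_count, sample_gq_count):
    """Return the areas with group quarters on the frame but none in the sample.

    frame_gq_count:  {area_id: number of GQ facilities on the sampling frame}.
    sample_gq_count: {area_id: number of GQ facilities in the sample}.
    """
    return {
        area
        for area, n_frame in frame_gq_count.items()
        if n_frame > 0 and sample_gq_count.get(area, 0) == 0
    }
```

Tables for the flagged areas could then carry a note alongside the published margins of error.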
Recommendation 4-6: In addition to continuing to publish margins of error to accompany the estimates, the Census Bureau should also develop a system for flagging estimates that are adversely affected by the presence of group quarters in the area. This procedure should continue until alternative methodologies are developed to reduce the variance in GQ estimates.