To kick off the workshop, the steering committee chose to profile two high-visibility policy areas—ones involving decisions and decision makers at all levels of government and actors in the public, private, and nonprofit sectors. These chosen topic areas were the rapidly changing health care environment and development and planning of critical transportation infrastructure. As a single session in a short-duration workshop, the presentations in this session only scratch the surface of the applications of the American Community Survey (ACS) data in these topic areas; in particular, the workshop treatment necessarily understates the great attention that has been given to the utility of ACS products in the transportation arena, in which the survey’s information on journeys between work and home is vital to infrastructure planning.1 But the presentations in the session combine some specific applications of the data with some more general “framework” discussions outlining the analytic process in which the data may be brought to bear to solve important problems.
1See, for instance, a National Cooperative Highway Research Program (2011) report examining the technical issues in producing ACS-based products for detailed transportation analysis that still conform to data disclosure rules. The National Cooperative Highway Research Program (2007) previously issued a technical manual for transportation planners—prior to the release of multiyear data products but anticipating their use and appropriateness to replace the census long-form sample. The most recent revision of Commuting in America (Pisarski, 2006) relied principally on the 2000 census long-form data but also included some tabulations from the first waves of large-scale ACS deployment.
The workshop session included five presentations, three in the health care planning arena and two involving transportation. Section 2–A describes the work of a policy resource center established to craft analyses and data products and provide technical assistance—using the ACS as a primary source—for health care decision makers, while Section 2–B examines the uses of the ACS in the public health department of the nation’s largest city, including linkages to the city’s own survey and data resources. Section 2–C steps back and describes the framework through which data-driven analysis can influence the siting of specific health care facilities or modification of services. With Section 2–D, the chapter switches to the transportation area, beginning with an overview of the ways in which the ACS is used by metropolitan planning organizations to model future transportation trends and infrastructure needs. Section 2–E closes by describing specific legal and regulatory requirements under which the ACS is used to document transportation agencies’ compliance with social equity laws. (This specific example foreshadows some applications described more fully in Chapter 7.) The session included brief time for questions, the answers to some of which (clarifying an individual speaker’s point) are folded into the earlier sections; discussion of broader questions asked of multiple speakers is summarized in Section 2–F.
Kathleen Thiede Call (School of Public Health, University of Minnesota) described the functions of the State Health Access Data Assistance Center (SHADAC), a health policy resource center for which she serves as an investigator. Funded primarily by the Robert Wood Johnson Foundation and housed at the University of Minnesota, SHADAC began operations in 2000 with the goal of making health-related data more accessible to state policy makers. (Additional detail on SHADAC’s early history is summarized by State Health Access Data Assistance Center, 2007.) To this end, SHADAC supplies technical assistance to state government agencies to either analyze existing data resources or, in some cases, to collect their own data.
SHADAC projects typically involve assessing health care coverage—both access to health care services and health insurance coverage. Call suggested that states need good data on health care coverage because policy decisions concerning health care have become major (if not dominant) in state-level budgeting. Consequently, the requirements for health insurance coverage data and estimates are considerable:
- Estimates need to be valid and consistent, and need to facilitate comparisons across states;
- Estimates need to support analysis of trends and patterns over time, in part to be able to judge the effectiveness of new policies;
- Estimates need to support disaggregation into fine subpopulations—demographic splits by race, ethnicity, age, and poverty status, along with geographic splits by county (or at least some substate areas); and
- Access to microdata, in order to achieve this fine-grained analysis, is critical.
Call suggested that state policy makers are most interested in data on the characteristics of the uninsured—what they look like demographically and where they may be concentrated geographically. Uninsured children are of particular interest: how many children in each county are eligible for Children’s Health Insurance Program (CHIP) or State Children’s Health Insurance Program (SCHIP) assistance? And, though questions of eligibility for Medicaid have been of interest for years, interest has certainly been heightened among states looking at the effects of the federal Patient Protection and Affordable Care Act (PPACA).2 Call noted that these kinds of analyses have been done for years and that the states have relied heavily (or exclusively) on the federal government as a source of information. However, budget pressures and constraints are particularly acute at the state level, increasing the demand for reliable information about uninsurance and public program eligibility (and the effects of policy changes on that eligibility).
For their analyses, Call noted that SHADAC can draw from a variety of federal survey data sources, including three specialized health surveys: the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS; both conducted by the National Center for Health Statistics3 [NCHS]) and the Household Component of the Medical Expenditure Panel Survey (sponsored by NCHS and the Agency for Healthcare Research and Quality). However, the requirements listed above are such that the principal sources for analysis are the ACS and the Current Population Survey (CPS), both surveys conducted by the Census Bureau (with the CPS sponsored jointly by the Census Bureau and the Bureau of Labor Statistics).
2The Patient Protection and Affordable Care Act, P.L. 111-148, was signed into law on March 23, 2010. At the time of the workshop, key provisions of the law were under review by the U.S. Supreme Court. On June 28, 2012, the Court ruled in National Federation of Independent Business v. Sebelius that the core mandates under the act were constitutional as a valid exercise of the power of Congress to impose taxes.
3The Census Bureau is the data collection agent for the National Health Interview Survey (as well as the ACS), though the survey is sponsored and organized by the National Center for Health Statistics.
Prior to 2008, SHADAC relied extensively on the CPS—and the CPS retains some solid advantages. Chief among these is the consistency of the CPS data: the CPS’s longer-term inclusion of relevant questions permits trends to be analyzed back to 1986, and its data releases are generally very timely. CPS control variables are such that the data are also amenable to limited disaggregation to substate levels. But there are also major drawbacks, chief among these the relatively small sample size (and corresponding sample sizes for substate populations). The CPS questions of key pertinence to SHADAC are asked in the Annual Social and Economic Supplement (ASEC) portion of the CPS, which can suffer from nonresponse; Call suggested that ASEC responses have to be imputed in their entirety for roughly 10 percent of respondents each year. A thornier problem with the CPS is that the form of the key questions is intended to yield calendar-year estimates of health insurance coverage, but not necessarily contemporaneous estimates. A facsimile of the questions on the 2011 ASEC4 indicates that “these next questions are about health insurance coverage during the calendar year 2010. The questions apply to ALL persons of ALL ages.” The lead question is then: “At any time in 2010, (was/were) (you/anyone in this household) covered by a health insurance plan provided through (their/your) current or former employer or union?” Hence, the question is not quite as precise as a measure of current coverage and does not capture lapses in coverage.
What changed in 2008 was the addition of a health insurance question to the ACS, and that has had tremendous benefit for analyzing health coverage. A major benefit of the ACS question is asking about coverage at the time of the survey, in contrast to a calendar year reference period and long look-back requirement for coverage in the CPS questions. The ACS version of the question—“Is this person CURRENTLY covered by any of the following types of health insurance or health coverage plans?”—emphasizes current coverage; it permits yes/no answers to seven types of insurance coverage, plus a write-in category.5 However, there is a new interpretation challenge presented by ACS estimates—explaining, for instance, what an estimate of current health insurance coverage means in an average computed over 1, 3, or 5 years of data. Call suggested another challenge inherent in the ACS data, stemming from its development as a general survey and not as a dedicated barometer of health and health insurance trends. Specifically, the unit embodied in each ACS questionnaire—a household, or a “census family” unit—is not necessarily the same thing as a health insurance unit. By its nature, the ACS does not probe to identify relationships within the household/family that would allow access to an individual’s health plan (a policy holder and their dependents), and so that relationship cannot be directly recovered.
4See http://www.census.gov/apsd/techdoc/cps/cpsmar11.pdf [July 2012], pp. D-77–D-78. The questions are asked through computer-assisted interviewing, hence the syntax choices in the phrasing of the question; the question shown on the CPS interviewer’s screen reflects previously collected information.
5This question is numbered Person Question 16 in the 2012 version of the questionnaire.
But, Call argued, the drawbacks of the ACS for examining health coverage are outweighed by the most profound benefit of the ACS relative to the CPS: its larger sample size, roughly 15 times that of the CPS in a given year, and its representativeness for smaller geographic and demographic units within states. Combined with the full range of covariate information available in the ACS, the larger sample size of the ACS has enabled analysis at finer, substate levels that was previously out of bounds. Call presented several examples of data-based maps using the ACS to inform health policy questions, but one that particularly drew a point of comparison with the CPS is shown in Figure 2-1. Asking the general question “where should we allocate funds for community clinics, serving the uninsured?,” Call showed the type of analysis that SHADAC conducted for officials in the state of West Virginia, showing the percent of persons in the vulnerable population of interest: low income (defined as being at or under 200 percent of the federal poverty level), nonelderly, and uninsured. Trying to answer the question using 2011 CPS data (which covers calendar year 2010), Call said that the best estimates they could generate were for three regions of the state, based on core-based statistical areas—one of which (the Huntington-Ashland metropolitan area) includes areas of Kentucky and Ohio and so is not specific to West Virginia. Call said that SHADAC would not feel confident even giving the CPS-based analysis over to the state officials. By comparison, the ACS data for 2010 permit good estimates for 12 regions—the Public Use Microdata Areas (PUMAs) within the state, a considerably fuller picture of need within the state.
Figure 2-1 Estimated percent of uninsured persons, age 0–64 and at or below 200 percent of the poverty level, in West Virginia, derived from the American Community Survey and the Current Population Survey
SOURCE: Workshop presentation by Kathleen Thiede Call, based on data from 2010 American Community Survey and 2011 Current Population Survey (which covers calendar year 2010).
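The West Virginia comparison rests on a simple quantity: the weighted share of uninsured persons within a screened subpopulation, tabulated by area. The following is a minimal sketch of that calculation, not SHADAC's actual procedure. The record layout (`area`, `age`, `poverty_ratio`, `uninsured`, `weight`) is a hypothetical stand-in for fields a user would derive from ACS microdata, and the sketch ignores the standard errors that a real analysis would report alongside each estimate.

```python
from collections import defaultdict


def uninsured_share_by_area(records, max_age=64, max_poverty_ratio=200):
    """Weighted percent uninsured among nonelderly, low-income persons, by area.

    Each record is a dict with hypothetical keys:
      area          -- area label (for example, a PUMA code)
      age           -- age in years
      poverty_ratio -- income as a percent of the poverty level
      uninsured     -- True if the person reports no current coverage
      weight        -- person weight from the survey file
    """
    total = defaultdict(float)
    uninsured = defaultdict(float)
    for rec in records:
        # Screen to the population of interest: age 0-64, at or below the cut.
        if rec["age"] <= max_age and rec["poverty_ratio"] <= max_poverty_ratio:
            total[rec["area"]] += rec["weight"]
            if rec["uninsured"]:
                uninsured[rec["area"]] += rec["weight"]
    return {area: 100.0 * uninsured[area] / total[area]
            for area in total if total[area] > 0}
```

Passing `max_poverty_ratio=138` would give the other poverty cut mentioned in the discussion of policy-relevant thresholds.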
Call presented and discussed two further examples of ACS-based analysis, one illustrating even finer detail and one intended to capture more macro-level trends. In the first, 3-year estimates from the ACS were used to profile children not included in Colorado’s CHIP program at the county level. The same map was generated at the county level for children not covered under Medicaid, and Call said that the work helped propel policy debates about expanding insurance coverage for children. Call also presented a basic state-level map, shaded to indicate the percentage of persons who would be eligible for Medicaid under the PPACA.
With respect to access to the data—both by SHADAC and its clients—Call said that she had encountered a range of responses among state-level users concerning the Census Bureau’s American FactFinder interface. Some have loved the interface and appreciate that one does not have to be a computer programmer to use it; others find it overly cumbersome and not user friendly. The default tabulations available in American FactFinder reflect federal poverty thresholds but not (directly) the federal poverty guidelines promulgated by the U.S. Department of Health and Human Services.6 The FactFinder tabulations also do not provide direct results at the poverty cuts of particular policy interest—e.g., at or below 138 percent or 200 percent of the federal poverty threshold.7
7As of September 2012, the American FactFinder interface was modified to directly add a 138-percent-of-poverty-threshold tabulation.
In part to compensate for deficiencies in American FactFinder, SHADAC established its own online data center 2 years ago. The data center site—accessible at http://www.shadac.org/datacenter—acts as a table and chart generator using both ACS and CPS data, enhanced to describe health insurance units (as distinct from family/household units) and to automate data cuts based on federal poverty guidelines. As with the Census Bureau’s American FactFinder interface, SHADAC believes it important for its data center to present both estimates and standard errors overtly, so that users can assess differences across population and demographic groups of interest.
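To illustrate why presenting standard errors alongside estimates matters, the sketch below shows one common way a user might judge whether two published estimates differ. It assumes the two estimates are independent and that any margin of error being converted was published at the 90 percent confidence level (the convention for ACS tabulations); the function names are illustrative and are not part of any SHADAC or Census Bureau tool.

```python
import math

Z_90 = 1.645  # critical value for a 90 percent confidence level


def moe_to_se(moe, z=Z_90):
    """Convert a margin of error published at the given level to a standard error."""
    return moe / z


def difference_z(est_a, se_a, est_b, se_b):
    """z statistic for the difference of two estimates, assuming independence."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (est_a - est_b) / se_diff


def differs_at_90(est_a, se_a, est_b, se_b):
    """True if the two estimates differ at the 90 percent confidence level."""
    return abs(difference_z(est_a, se_a, est_b, se_b)) > Z_90
```

For example, uninsurance rates of 20 and 12 percent with standard errors of 3 and 4 points give a z statistic of 1.6, just short of the 90 percent cutoff, so the apparent 8-point gap would not be treated as a reliable difference.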
Call closed by noting her concerns about the possibility of reductions in effective ACS sample size and generalizability that she said would occur if the survey were made voluntary; she said that such an outcome could greatly impair the representativeness of the data and the states’ ability to benchmark and to look directly at some subpopulations of interest within state boundaries. And—in terms of a “wish list”—she argued that the great value of ACS data is its timeliness, yet availability of coverage estimates for half-years or even quarters would be ideal for time-sensitive policy debates. Given SHADAC’s health coverage and access focus, Call’s “wish list” included addition of two questions: a self-reported indication of general health status and a question on access to health care services. During the discussion period at the end of the session, Call was asked what specific form of question on access might be most useful; she replied that some typical ones from other standalone surveys include questions of the rough forms “Do you have a usual source that you go to for care?” or “At any point in the last year, have you gone without health care because you couldn’t afford it?” Questions of this type are part of the NHIS and BRFSS, and could be the model for a more general question on the ACS. On the health status question, even something as basic as “Would you say your health is generally excellent, very good, good, fair, or poor?”—combined with other covariates available in the ACS—could spur important and interesting research.
Established in its present form in 2002 through the merger of the existing Departments of Health (itself dating back to 1805) and Mental Hygiene, the New York City Department of Health and Mental Hygiene (DOHMH) is the chief public health agency for the nation’s largest city. James Stark, an epidemiologist from the DOHMH Bureau of Epidemiology Services, described DOHMH’s use of ACS data in a presentation developed in collaboration with methodology unit director Kevin Konty.
Stark commented that all seven of the “content divisions” within DOHMH8 use ACS-based analysis in some form, directly or indirectly. The Epidemiology Services bureau is the principal support arm for this analysis, generating population descriptions or deriving population estimates and profiles that are used throughout the department. Besides the Epidemiology Division itself, Stark noted that his comments were based on the ACS uses by the Disease Control and Emergency Response divisions of DOHMH, as well as the office of the Commissioner of Health.
8The Epidemiology Division, headed by a deputy commissioner, is one of these divisions; Stark’s Bureau of Epidemiology Services is housed within that division.
Similar to Call’s remarks, Stark said that the most common demand for ACS data is to construct basic demographic profiles. Much more than basic age-and-sex characteristics, Stark said that DOHMH bureaus had needed (and requested) data on very precise groups drawing from many ACS variables, including:
- Household composition for public employees (distinct from private-sector employees);
- Recent immigrants, including language spoken; and
- Enrollment in private school (for comparison with public school data).
One perennial demand—discussed in more detail in the context of implementing the Voting Rights Act in Section 7–A—is to understand the primary languages spoken by New York City residents, broken down by neighborhood. Stark displayed a map derived from 2000 census long-form-sample data showing the range of non-English languages spoken in the city, plotted so that individual dots represent about 150 households that use a particular language. This analysis demonstrates, for example, the wide range of language diversity in the borough of Brooklyn, and the dominance of Spanish as the primary non-English language spoken in Manhattan and the Bronx. At the request of the commissioner’s office, and the communications officer in particular, the Epidemiology bureau has replicated this analysis using the most recent 5-year small-area estimates, with the intent of continuing to update the map over time. Stark said that this will help DOHMH produce and provide health-related material for city residents that is both neighborhood- and language-specific.
Given New York City’s large population, counts and analysis by neighborhoods or other small areas within the city are of particular interest to DOHMH. However, their analyses also require work with the ACS at higher levels of geography: counties (the boroughs of the city), the counties surrounding New York City, and the nation as a whole. Again, given New York City’s size, Stark noted that DOHMH is frequently called upon to put analyses within the city in context, through contrast with the rest of the nation; the ACS has proved particularly useful in this regard.
Though ACS-based estimates are interesting in their own right, one primary use of the ACS by DOHMH is to generate denominators, to compute rates based on the department’s own health surveys. Chief among these is the New York City Community Health Survey (CHS), a telephone-only cross-sectional survey of approximately 10,000 adults within the city that is administered by the DOHMH Bureau of Epidemiology Services each year. In its administration and content, the CHS is patterned after the Behavioral Risk Factor Surveillance System (BRFSS) conducted by the U.S. Centers for Disease Control and Prevention (the annual sample size for which is roughly 350,000 adults).9 The epidemiology bureau uses the ACS to generate control totals for post-stratification weighting to produce CHS estimates; this includes splitting the city by borough and each combination of education status, marital status, and household size. Since 2009, the CHS has included a sample from a list of households having only cell phones, not landline phones; Stark indicated that his bureau has used ACS data—and resulting glimpses in change over time—to try to ensure that this cell-phone-only sample is working properly.
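Post-stratification of the kind described here can be sketched in a few lines: each respondent in a cell receives a weight equal to the cell's control total divided by the number of respondents in that cell, so that weighted counts match the control totals. This is a simplified illustration only. The cell definitions, field names, and numbers are hypothetical, and the actual CHS weighting may involve additional steps (such as base weights or raking) that are not described in the presentation.

```python
from collections import Counter


def poststratify(respondents, control_totals):
    """Return one weight per respondent so weighted cell counts match the controls.

    respondents    -- list of dicts, each with a "cell" key (for example, a tuple
                      of borough and education status); hypothetical layout
    control_totals -- dict mapping each cell to its population total, such as
                      one derived from ACS estimates
    """
    cell_counts = Counter(r["cell"] for r in respondents)
    return [control_totals[r["cell"]] / cell_counts[r["cell"]]
            for r in respondents]


def weighted_rate(respondents, weights, key):
    """Weighted proportion of respondents for whom respondents[i][key] is true."""
    numerator = sum(w for r, w in zip(respondents, weights) if r[key])
    return numerator / sum(weights)
```

A cell with few respondents but a large control total receives large weights, which is why the choice of cells (here, borough crossed with education, marital status, and household size) trades off bias reduction against variance.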
Besides the CHS, Stark said that his bureau also helps other parts of DOHMH link to, and use ACS data as a supplement to, a variety of other data collections run by the city. These other data collections include both interview-based surveys and registry/records data, such as special surveys of physical activity and use of public transit options; a specialized analogue of the BRFSS aimed at youth rather than adults (age 18 and older); and a periodic New York version of the National Health and Nutrition Examination Survey (NHANES; the federal version is maintained by the National Center for Health Statistics).
In working with public health surveillance data, Stark noted that DOHMH fashions its approach after the Public Health Disparities Geocoding Project headed by Nancy Krieger of the Harvard School of Public Health. Krieger’s project was discussed and summarized at an earlier National Research Council workshop; see National Research Council (2009: § 2–A.1). Recognizing that socioeconomic status can be an important predictor of disease—linked to neighborhood contextual effects that could be associated with disease—the work estimates area-based poverty measures for geographic pockets throughout the city, using the percentage of residents who live below the federal poverty line. For some of their work, DOHMH uses United Hospital Fund (UHF) areas to approximate neighborhoods; these groupings combine multiple ZIP Code tabulation areas to create (for New York City) about 40 districts that are finer than whole boroughs but larger than individual ZIP Codes. DOHMH anticipates updating these analyses using ACS data, with interest in comparison with similarly defined areas/districts throughout the nation; technically, one question with which they are grappling is the appropriate variety of ACS estimates to use (3- or 5-year estimates).
9Additional information about the CHS is available from the DOHMH website at http://www.nyc.gov/html/doh/html/survey/survey.shtml [July 2012], while the federal BRFSS is described fully in links from http://www.cdc.gov/brfss/index.htm [July 2012].
Stark also outlined an epidemiological study of Legionnaires’ disease, a form of severe pneumonia that is believed to be transmitted through contaminated water, in which ACS data on occupation proved very useful. DOHMH’s Bureau of Communicable Diseases sought to use its surveillance data on reported cases to examine the hypothesis that occupations potentially associated with contaminated water (e.g., plumbing or air cooling system repair) may result in an elevated risk of Legionnaires’ disease. Public health officials followed up on every reported case in the city within a 10-year period, classifying respondents by the occupational categories defined in the ACS. These results were then used to calculate relative risks of incidence of Legionnaires’ disease, suggesting, for instance, that categories like cleaning and janitorial and machinery have higher rates of incidence than others (e.g., office or health care).10
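The role of the ACS in this kind of study is to supply the denominators: case counts by occupation come from surveillance, and the number of workers in each occupation comes from ACS estimates. A minimal sketch of the relative risk arithmetic follows. The occupation labels and any numbers used with it are purely illustrative and are not DOHMH results, and a real analysis would also account for the sampling error in the ACS denominators.

```python
def relative_risks(cases, workers, reference):
    """Relative risk of disease by occupation, compared with a reference group.

    cases     -- dict mapping occupation to the number of reported cases
    workers   -- dict mapping occupation to the estimated number of workers
                 (the denominator, such as an ACS-based estimate)
    reference -- occupation whose incidence rate serves as the baseline
    """
    rates = {occ: cases[occ] / workers[occ] for occ in cases}
    baseline = rates[reference]
    return {occ: rate / baseline for occ, rate in rates.items()}
```

With hypothetical counts of 6 cases among 20,000 janitorial workers and 10 cases among 100,000 office workers, the janitorial group's relative risk against the office group is about 3, even though it has fewer cases in absolute terms.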
DOHMH couples its direct estimation of disease prevalence—based on surveillance of reported cases—with model-based estimates of disease risk. The CDC’s NHANES includes numerous laboratory samples for specific pathogens, and the array of population covariates available in the ACS permits models to be constructed using the ACS variables as predictors. Stark briefly discussed work done by a DOHMH colleague, constructing a prediction model for hepatitis C in New York State (and City), down to the county level using the ACS Public Use Microdata Sample (PUMS) data.
Consistent with the themes discussed in Chapter 3, Stark said that DOHMH has also found opportunities for use of the ACS in preparing for and responding to emergencies. They use ACS data to study the spatial distribution of particularly vulnerable populations, for purposes of updating preparedness plans. In particular, DOHMH staff from the Bureau of Emergency Management used the functional disability questions from the ACS (as well as other variables such as poor English literacy) to construct an index of risk. These scores were then weighted based on ACS PUMS data to estimate the population within different degree-of-vulnerability groups within the city. The result of this work is identification of areas of potential need for additional services during an emergency, for not just New York City but the 30-county metropolitan area.
Speaking from decades of experience as an independent consultant, instructor, and author—as he put it, mostly in the area of health care and always with a heavy infusion of census data—Rick Thomas (Center for Population Studies, University of Mississippi and Health and Performance Resources, Memphis, Tennessee) framed his presentation as an outline of general approaches to using the ACS in the health care area.
10Per Stark’s request in his presentation materials not to directly cite the draft results of DOHMH work, this summary is deliberately vague on the details of this example and some of Stark’s other specific examples.
Thomas suggested that the major opportunities, and needs, for use of ACS data in health care planning may fall into seven general (and mainly sequential) categories:
- Community profiles: basic documentation of the potential service population in question;
- Health status assessment: a more detailed examination (than the community profile) of the population’s health characteristics;
- Health services demand estimates: an estimate of the need for health services in the population, based on the levels of risk and demand found in the health status assessment;
- Determination of need: the practical, technical estimate of potential demand for new services, such as would be needed in a certificate of need application for a new facility;
- Site selection: the actual physical siting of a new facility, service location, or deployment of personnel;
- Business development: work to grow the client base for services, given the business-like nature of modern health care; and
- Other health planning uses: a catch-all category for various other applications.
Actors of various sorts are involved in these types of analyses: individual institutions or businesses (e.g., hospital or medical service groups), state agencies, local governments, and (though not the focus of this workshop) federal agencies.
As a consultant, Thomas said that he looks to three categories when he accesses and uses ACS data. First are the basic demographic and socioeconomic characteristic data that are crucial to building the early profiles and generating the snapshot of the population of interest; in this regard, ACS data are a natural starting point for many analyses. Second are project-related data—more refined queries depending on the application, such as whether the topic is providing services for children, or seniors, or a specific health problem (e.g., chronic disease or reproductive health), and pulling the ACS information that is most relevant to the specific subpopulation. Finally, as noted earlier by Call and Stark, there are a few questions on the ACS that speak directly to health issues—for instance, disability and health insurance coverage, as well as insights that can be drawn related to fertility. These health-specific data in the ACS are few in number but can be very helpful, in Thomas’s experience.
This type of analysis is demanding in that it requires data on past, present, and future timeframes. Census and ACS data can provide the past and present views, and serve as the platform for projecting forward into the future. In health care planning, there is usually a desire to profile the population and project needs 5–10 years into the future. Specific to the ACS, Thomas said that he typically uses the 5-year average data products—particularly if it is a smaller target population that is of interest (e.g., a single county) or if it is desirable to drill down to the census tract level. He took care to note that he uses the 5-year averages “with verification”—because the geographic area or demographic group may have been undergoing dramatic changes that might be masked in the wide-window 5-year data, he will look at alternative data sources to triangulate in those cases and to ensure that the 5-year-average glimpses are appropriate.
With that as prelude, Thomas walked through the analysis steps that he would take if presented with a solid query from a client—the (fictional) ABC Health Network, which wants to add a new clinic in an area that it perceives is currently underserved. Though he cast this example as one of physically siting a new facility, he noted that the same process could be used to study the feasibility of adding some new service to existing facilities, of modifying services, or even of eliminating redundant sites or services.
Reiterating and walking through the basic steps, Thomas suggested that he would proceed in the following way:
- Create a demographic profile: This would include basic characteristics on the size of the population and how they are distributed (whether the population is concentrated or dispersed). The basic demographic information from the ACS would be most useful here: distributions by age, sex, race, and Hispanic origin.
- Create a socioeconomic profile: Specific ACS variables of interest here would be household structure, marital status, educational attainment, employment status, type of occupation, and income level. Consistent with other speakers, Thomas noted that the proportion of the population of interest living in poverty is of critical importance for much of the health care consulting that he does. In some situations, Thomas said that he might incorporate other variables, to get a sense of transportation access, geographic mobility, or primary language spoken at home.
- Screen population for eligibility: Federally qualified health centers have to demonstrate certain criteria within their target population in order to qualify for funding from the U.S. Department of Health and Human Services, and similar criteria may apply within states. These kinds of criteria may include proportions of minority population, low-income population, unemployment, and educational attainment—all of which may be derived from the ACS. For many health programs, the number and proportion of single-person households would also be a factor of interest, as would fertility measures.
- Develop a health profile for the population: Assuming that the population meets the eligibility criteria, the next step would be to develop a health profile for the area. This, in turn, focuses first on estimating morbidity—the prevalence of disease in the population—and projecting the possible demand for health services. The problem with both of these concepts is that there are no direct data on them for the whole, general population; hospital admissions and other data give some insight to reported cases, but do not speak directly to the prevalence of underlying disease or risk factors that have not (yet) resulted in treatment. So, in both cases, the quantities of interest must be modeled. Thomas suggested the following basic approach:
- Obtain age/sex distributions (i.e., age in 5-year intervals) from the ACS;
- Determine which of four (geographic) regions covers the population of interest;
- Apply calculated rates of either morbidity or health service utilization, by age, sex, and region, from data resources (derived from actual incidents) maintained by the National Center for Health Statistics (NCHS) to the target area’s population;11 and
- Compute the number of disease cases or utilization estimates, for relevant health conditions (in the case of morbidity) or for a variety of services (in the case of estimating potential service demand; for instance, estimating number of general physician visits, number of potential well-child exams, or number of surgical follow-up episodes).
Although this is a modeling approach—and one predicated on a very crude geographic measure, the census regions used by NCHS—Thomas reported that he is constantly surprised at how accurate the model can be in estimating the number of cases, an accuracy he described as “scary.” For this type of consulting, he said that the model can consider hundreds of different conditions, but that he focuses on the most frequent (say, the top 50) and those that are most relevant for the program he is working with.
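The rate-application approach outlined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population counts, the rates per 1,000 persons, the condition names, and the function name estimate_cases are all hypothetical, and only two age bands are shown. It is not a reconstruction of the model Thomas uses.

```python
# Hypothetical ACS-style population counts for one target area, keyed by
# (age band, sex). Only two of the 5-year age bands are shown, and every
# number here is invented for illustration.
population = {
    ("0-4", "F"): 1000,
    ("0-4", "M"): 1000,
    ("65-69", "F"): 500,
    ("65-69", "M"): 500,
}

# Hypothetical rates per 1,000 persons by census region, condition, and
# (age band, sex). Real rates would be derived from NCHS data resources.
rates_per_1000 = {
    "South": {
        "asthma": {("0-4", "F"): 50, ("0-4", "M"): 70,
                   ("65-69", "F"): 40, ("65-69", "M"): 40},
        "diabetes": {("0-4", "F"): 0, ("0-4", "M"): 0,
                     ("65-69", "F"): 200, ("65-69", "M"): 220},
    },
}


def estimate_cases(population, rates_per_1000, region):
    """Apply region-specific age/sex rates to an area's age/sex counts."""
    estimates = {}
    for condition, cell_rates in rates_per_1000[region].items():
        # Sum expected cases over every age/sex cell in the target area.
        estimates[condition] = sum(
            count * cell_rates.get(cell, 0) / 1000
            for cell, count in population.items()
        )
    return estimates


if __name__ == "__main__":
    result = estimate_cases(population, rates_per_1000, "South")
    for condition, cases in result.items():
        print(f"{condition}: about {cases:.0f} expected cases")
```

The same loop would serve for service utilization (for example, physician visits per 1,000 persons) by swapping in utilization rates for morbidity rates.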
From his time using the data, Thomas concludes that the ACS data are very useful to health care planning analyses. Indeed, in many cases, they are essential; there are not very many alternatives for some of the data items, and the ACS is able to provide information at an adequate level of detail. In the health care arena, Thomas said, people are used to working with health outcome data that are at least 2–5 years old, so the relative timeliness of the ACS data is a considerable benefit. It is also helpful that, for a lot of purposes, the ACS has moved easily into the position of being considered the “standard”—and so is acceptable to and trusted by the federal grant-making agencies such as the Health Resources and Services Administration or other Health and Human Services agencies.
11The National Health Interview Survey, administered by NCHS, is designed to be representative at the national level and for four broad census regions: West, Midwest, South, and Northeast. Other NCHS data such as the National Ambulatory Medical Care Survey, the new National Hospital Care Survey, and its predecessor National Hospital Discharge Survey are coded to the same breakdown of four census regions; these surveys would be used to estimate utilization levels, as discussed below.
In terms of accessibility and usability of the data, Thomas said that the data are easily accessible “now that I understand what I am doing, but it didn’t start out that way.” But once one gets used to the routines of deriving data from the Census Bureau’s American FactFinder interface or other files, it is fairly straightforward. Particularly at the tract level, there are sometimes uncomfortably large margins of error to deal with, but Thomas sees those as “something that we have to live with for now.”
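To make the point about tract-level margins of error concrete, the sketch below converts a published ACS margin of error (stated at the 90 percent confidence level) into a coefficient of variation, and shows how combining several tracts tightens the relative error. The tract figures and function names are hypothetical, and the root-sum-of-squares rule is the usual approximation for summed estimates, which ignores any covariance between them.

```python
import math

Z_90 = 1.645  # ACS margins of error are published at the 90 percent level


def coefficient_of_variation(estimate, moe):
    """Relative standard error implied by an estimate and its 90% MOE."""
    standard_error = moe / Z_90
    return standard_error / estimate


def combine_estimates(pairs):
    """Sum estimates; approximate the MOE of the sum by root-sum-of-squares."""
    total = sum(estimate for estimate, _ in pairs)
    combined_moe = math.sqrt(sum(moe ** 2 for _, moe in pairs))
    return total, combined_moe


if __name__ == "__main__":
    # Hypothetical (estimate, MOE) pairs for three neighboring tracts.
    tracts = [(300, 120), (400, 160), (500, 150)]
    for estimate, moe in tracts:
        print(f"tract CV: {coefficient_of_variation(estimate, moe):.1%}")
    total, moe = combine_estimates(tracts)
    print(f"combined CV: {coefficient_of_variation(total, moe):.1%}")
```

An analyst facing an uncomfortably large tract-level margin might use this kind of check to decide whether grouping adjacent tracts is worth the loss of geographic detail.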
In terms of suggested improvements to the ACS, Thomas echoed the main ones suggested by Call—the “obvious,” ideal suggestions to increase the sample size and speed up the turnaround time (i.e., making the estimates even more timely). He also suggested that this point in the ACS’s history may be an opportune one to reassess the ACS questions, based on user input—adding some and deleting others. On the much broader horizon—looking 20–30 years ahead and envisioning a continuous and ongoing need for more and better health data—Thomas suggested that there needs to be continued assessment of the need for data for a variety of uses, and health care planning is a particularly important one. As much money and resources as the health care system consumes and creates, the data needed for effective planning are not always there. The PPACA’s legal reporting requirements for not-for-profit hospitals are one example: Thomas suggested that on the order of 3,000 hospitals are now going to have to do community health need assessments every 3 years, under the terms of the new law, and most of those facilities had not been in the routine habit of doing those assessments in the past. Similarly, there is money in the PPACA for expanding the federally qualified health clinics, each of which is going to need data and analysis of the type outlined in this presentation—a very big pool of applications to keep tabs on. Thomas further suggested that health care providers are increasingly going to have to focus on marketing their services—putting still further demands on the data stream.
As a final note, Thomas suggested that an important path for the future—using the ACS and other sources—will be meeting the need to understand the health care consumer. There are already many health insurance companies and—under the PPACA’s health care exchanges—there will now be tens of millions of individuals potentially choosing a health care insurance carrier. Accordingly, the insurance providers are going to be increasingly interested in profiling their potential customers, and developing strategies to identify and engage them.
Switching from applications in health care to uses of ACS data in the similarly high-stakes area of planning for transportation services, Beth Jarosz (senior demographer) described uses of the ACS in transportation modeling by the San Diego Association of Governments (SANDAG). Composed of the 18 incorporated cities and towns in San Diego County as well as the San Diego County government, SANDAG is the metropolitan planning organization and council of governments for the San Diego, California, region. In addition to the regular members of SANDAG’s board, advisory members from other entities in and around San Diego also participate in SANDAG activities; these include neighboring governments (Imperial County, California, and the Mexican government), transportation authorities (Caltrans, the San Diego Metropolitan Transit System, and the North County Transit District), and the U.S. Department of Defense (given the large presence of Naval Base San Diego and other military facilities in the region). In addition to serving as the San Diego region’s designated census data center, SANDAG focuses on issues of major regional impact such as air quality/environmental planning, housing development—and transportation planning.
Jarosz described region-level transportation planning as a multistep process, beginning with identifying stakeholders and assessing their needs and goals. Her particular focus, transportation modeling, is the basis of the next few steps: developing a set of alternatives from available data and information, using them to predict outcomes, and testing and evaluating them in order to select a final alternative. The analytic work in transportation modeling helps with the next immediate step—budgeting for the work—and then implementing. Planning agencies then have the responsibility to evaluate whether the chosen alternative has done what it was intended and expected to, and amend the plans as necessary. In theory, she noted, the final step is achievement of a finished, “perfect” transportation system; in practice, anyone who has ever been stuck in traffic knows that the perfect transportation system is a constantly moving target. Consequently, planning agencies like SANDAG repeat this process every several years (often a 4-year cycle), trying to map out transportation infrastructure needs 20–40 years into the future.
Focusing on the transportation modeling steps in this general process, Jarosz presented a simplified outline of the transportation modeling process, with specific reference to the points where ACS data and products enter the mix; this general structure is shown in Figure 2-2. Similar to the general health care planning process described by Thomas in Section 2–C, the first step is preparation of demographic and economic profiles for the area and populations of interest, and then trying to forecast future trends in the population. In transportation, there is interest in the population, housing, and jobs in the region in the next few decades, and the ACS is an important source of those data. Particular variables of interest include household structure (headship), group quarters characteristics, school enrollment, and housing structure type preferences (single versus multifamily or mobile home). In a region like San Diego, with its military presence, information and forecasts on the active-duty military personnel (and their dependents) are particularly important. These demographic and economic characteristics—the data that enter the forecast models—are obtainable from the ACS summary tables or from analysis of the ACS PUMS files.
SOURCE: Adapted from workshop presentation by Beth Jarosz.
These underlying demographic and economic forecasts of the basic composition of the population are further scrutinized in what Jarosz described as subregional or “neighborhood”-level forecasts—drilling into finer geographic detail than the region as a whole, and attempting to predict how the population and characteristics will be distributed spatially over time.
The next step is application of a synthetic population model—the basic idea of which is to simulate actions by all actors in the transportation network and model how they will travel through the system. This work is done in tandem with a separate activity-based transportation model (also shown in Figure 2-2) that makes forecasts to simulate individual trips within the system (e.g., when commuters leave for work in the morning, whether they make stops en route to work or home, and when they arrive at work or home).12 For the synthetic population model, ACS variables such as automobile ownership, family structure (presence or absence of children, which would affect the odds of one or more trips to school in a day), work status (and number of workers in the household), and income (as a correlate of the mode of transportation a person/household might select) can play important roles. The synthetic population model typically draws from analysis of ACS PUMS data, while the activity-based transportation model draws from special tabulations of data that the Census Bureau distributes as part of the Census Transportation Planning Package (CTPP).13 CTPP tabulations are coded to the special level of geography typically used by transportation planners: Traffic Analysis Zones (TAZs), collections of census blocks that are defined by the Census Bureau in partnership with local transportation officials and that may be finer grained than census tracts or block groups.
Completing the explanation of the basic outline shown in Figure 2-2, the activity-based model and population synthesizer are part of a feedback loop to the subregional, “neighborhood” forecast. This is because transportation activities can affect the geographic distribution of the future population: Increased traffic congestion in one place might make a certain neighborhood less attractive for future development or, conversely, it might flag an area where the transportation infrastructure must be built up (making the neighborhood more attractive in the long term).
Jarosz stressed that the ACS is essential to transportation planning for the simple reason that it is the only source of small-area trip data, through the detailed CTPP custom tabulation. Through the CTPP, the ACS is the only systematic source of data on flows—giving an indication of where commutes begin and where they end, so that planners can predict how commuters travel between the two. The question of when respondents leave their homes in the morning is sometimes challenged by critics as invasive; Jarosz observed that collection of this information raises privacy concerns, but that both the Census Bureau and the downstream data users are deeply cognizant of those concerns. Like other ACS products, the custom CTPP tabulation is subject to review by the Bureau’s Disclosure Review Board and complies with the privacy protection requirements in Title 13 of the U.S. Code; it is also subject to statistical techniques to curb the disclosure of personally identifiable information.
12In the case study/agenda book for the workshop, Guy Rousseau describes the synthetic population model and activity-based model used by the Atlanta Regional Commission in more detail, speaking particularly about the challenges of converting their existing models from census long-form sample inputs to ACS inputs.
13As Jarosz noted in her talk, the CTPP data tabulation is funded by pooled funds provided by state transportation departments and metropolitan planning organizations around the country; it was originally compiled from the long-form sample, and is now being converted to the ACS. Though CTPP is often used as shorthand specifically for the data product, the program itself includes training, technical assistance, and research for the transportation community. At the time of the workshop, CTPP tabulations based on 3-year ACS data were available, with 5-year data scheduled for release later in summer 2012.
Because the CTPP, and with it the ACS, is the lifeblood of transportation planning and modeling, Jarosz said that the transportation planning community is particularly sensitive to the potential drop in response that might accompany a voluntary ACS. From the travel modeling standpoint, the calculus is stark: Smaller sample sizes (as could occur under a voluntary ACS) would make the data unreliable, particularly at the relatively fine TAZ level of geographic aggregation. Smaller sample size would necessitate more data suppression (to protect privacy) and produce unreliable results. It is as simple as “nothing in, nothing out”—even if there existed a perfect model for predicting travel flows and modes, using unreliable data in a perfect model would still produce undesirable results. She said that these models, and these data, are being used to plan billions of dollars in transportation infrastructure nationwide, which argues for obtaining the best data available to spend those funds wisely. (In the discussion following the presentations, Jarosz noted her strong agreement with a comment that a voluntary ACS might compromise the representativeness of the ACS sample, and said that this would be at least as harmful as a straight reduction in sample size that might result from a switch to voluntary methods.)
Through other work for SANDAG, Jarosz said that data users have come to expect the level of small-area detail that the ACS has been able to provide. The basic socioeconomic variables on the ACS—income and poverty, race and Hispanic origin, and age—are strengths of the data; the ACS questions that permit estimation of disability status have also been valuable for regional planning purposes. Foreshadowing a theme that would be addressed in more detail in the next presentation, Jarosz said that data users have come to expect quality small-area data on questions like the primary language spoken at home; for instance, she has been asked by transit authorities to detail the languages spoken within a half-mile radius of a particular transit stop because they need to produce documents and signage for people who might be affected by a service change. Potentially smaller sample sizes would make it harder to identify very tiny language “clusters” and address community needs. High-quality analysis and planning depends on high-quality data as the input, and that would argue for (if anything) an expansion of the sample rather than a contraction.
In her presentation, Jarosz briefly mentioned the important use of ACS data in establishing compliance with federal, state, and local law and guidelines on environmental justice, social equity, and public access to services. This theme was carried forward by Vincent Sanders, lead transportation systems planner for the Metropolitan Transit Authority of Harris County, Texas (hereafter, METRO).
Title VI of the Civil Rights Act of 1964 (codified as 42 USC § 2000d):
No person in the United States shall, on the ground of race, color, or national origin, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity receiving Federal financial assistance.
Executive Order 12898 (59 FR 7629; February 16, 1994):
To the greatest extent practicable and permitted by law, … each Federal agency shall make achieving environmental justice part of its mission by identifying and addressing, as appropriate, disproportionately high and adverse human health or environmental effects of its programs, policies, and activities on minority populations and low-income populations in the United States [and its territories and possessions].
As stated in Box 2-1, the enactment of Title VI of the Civil Rights Act of 1964 codifies the principle that government programs must provide equivalent benefits to all segments of the population; Executive Order 12898 in 1994 created similar language for environmental justice, dictating that federal agencies must avoid creating disparate negative impacts (in terms of health or environmental effects) among low-income or minority communities. As part of the enforcement of these provisions, federal grant-making agencies may impose reporting requirements on their state and local agency recipients, documenting their compliance with antidiscrimination rules; Jarosz and Sanders noted in their presentations that the ACS is gaining increased use in providing quantitative evidence of compliance. The role of the ACS in the allocation of billions of dollars of federal funds is well known, but Jarosz and Sanders suggested that the ACS is also an important part of evaluating how those allocated funds are spent.
In his presentation, Sanders described METRO’s work in adapting ACS data for demonstrating compliance with foreign language assistance requirements. In 2007, the Federal Transit Administration (FTA)—an important funding source for transit agencies like METRO—slightly adjusted its requirements for documenting Title VI compliance. Specifically, as summarized in Box 2-2, the new guidance to FTA fund recipients called for creation and maintenance of a language implementation plan for the relevant service population’s limited English proficiency (LEP) constituency. The language specifically called for these plans to include a strong quantitative assessment of the proportion of LEP persons likely to be served by the recipient program and estimation of the frequency with which LEP persons “come into contact with the program” in any form (e.g., signage, printed materials like timetables, promotional materials).
Obtaining this information is difficult because METRO is a complex intermodal transportation system serving a growing and diverse population. One of the largest transit agencies in the country, METRO covers almost 1,300 square miles in its service area with bus (about 1,250 buses on 130 routes), light rail (a 7.5-mile network), and high occupancy vehicle/carpool service.14 In 2011, METRO logged almost 77 million boardings on its bus and rail components—alone, a massive number of points of contact with its service population—and its light rail boarding ratio (per mile of track) is second only to Boston in the nation. From a regional planning perspective, Sanders said that METRO is looking at expanding its commuter rail service and investigating possibilities of high-speed capacity in the region—all of which, he agreed with Jarosz, will use the ACS as the basis for modeling and planning.
Sanders commented that the revised requirements came into sharp focus for METRO in 2009, when schedules were such that the agency was subject not only to a triennial review (including the revised Title VI compliance provisions), but also a general FTA audit. In 2009, prior to the rollout of the full set of ACS products, METRO made the natural first step of turning to the most comprehensive extant data concerning the LEP population, which was the 2000 census long-form sample data. But the combined triennial review and audit processes had the salutary benefit of providing the agency with immediate feedback for revision: FTA appreciated the analysis using 2000 census data but noted that those data were effectively obsolete for a growing and changing region like Houston. As context in his presentation, Sanders displayed a tabulation from the first set of 5-year ACS numbers for Harris County, showing the top five languages (other than English) spoken in the county; Spanish dominates the other non-English languages, but the data suggest that the county has become home to significant linguistically isolated Asian communities (Vietnamese, Chinese, and Tagalog included in the top five).15
Hence, Sanders and METRO began casting for alternatives—the first of which was to follow up on a suggestion in FTA guidance to obtain data on LEP services from area school districts (in this case, prior to the rollout of the full set of ACS products in 2010). With some scrambling, Houston METRO was able to obtain information on the concentration of LEP students from the Houston Independent School District (the region’s largest) and five smaller school districts serving part of the METRO service population. The school district data permitted Houston to create a map of its transit hubs and network structure, overlaid on a choropleth map of LEP student concentration, as one indicator of the LEP population. Sanders complemented this map with a similar analysis of (self-reported) LEP households by census block group, derived from the 2009 version of Nielsen Claritas’ Census Data Update package—an analysis that was complicated by tight budget resources and the pressing deadline for submissions.
14As implied by the agency’s full name, METRO’s service population is principally in Harris County; however, Sanders noted that the agency’s service population spreads over slightly into neighboring Fort Bend, Montgomery, and Waller Counties.
15The fifth language group in Harris County, roughly as large as the Tagalog-speaking population, is French.
A May 2007 Federal Transit Administration (FTA) Circular for recipients of FTA financial assistance (FTA C 4702.1A) updated previous guidelines issued in 1988. The new circular reinforced a triennial Title VI compliance reporting process for most FTA recipients (except for some metropolitan planning organizations, which are on a 4-year reporting process; p. II-2). Specifically, FTA recipients are asked to document their progress on “tak[ing] responsible steps to ensure meaningful access to the benefits, services, information, and other important portions of their programs and activities for individuals who are Limited English Proficient (LEP)” (p. IV-1). The suggested vehicle for meeting the “meaningful access” requirement is the development and implementation of a formal language implementation plan.
The core elements of such a language implementation plan are specified in regulations promulgated by the U.S. Department of Transportation (USDOT) and applicable to all USDOT programs, including FTA (beginning at 70 FR 74087). Section V of the regulation summarizes the basic factors that must be assessed in order for FTA recipient agencies to “determine the extent of [their] obligation to provide LEP services” (70 FR 74091; list reformatted for emphasis):
Recipients are required to take reasonable steps to ensure meaningful access to their programs and activities by LEP persons. While designed to be a flexible and fact-dependent standard, the starting point is an individualized assessment that balances the following four factors:
(1) The number or proportion of LEP persons eligible to be served or likely to be encountered by a program, activity, or service of the recipient or grantee;
(2) the frequency with which LEP individuals come in contact with the program;
(3) the nature and importance of the program, activity, or service provided by the recipient to people’s lives; and
(4) the resources available to the recipient and costs.
As indicated above, the intent of this policy guidance is to suggest a balance that ensures meaningful access by LEP persons to critical services while not imposing undue burdens on small businesses, small local governments, or small nonprofit organizations.
(Section VII of the regulation goes on to specify elements of the actual implementation plan, e.g., staff training and providing notice of LEP access.)
A separate “handbook for public transportation providers” on implementing the USDOT policies, issued by the FTA Office of Civil Rights (April 2007), describes the “four-factor framework” in greater detail, with specific reference to 2000 census (long-form sample) and ACS tables. The handbook also walks through the steps for downloading language use tables from the then-current version of the Census Bureau’s American FactFinder site.
Facing another triennial review and update of METRO’s plans in 2012, Sanders said that the ACS “is coming to our rescue.” He noted that METRO is now trying to consistently use census tracts as the level of aggregation rather than block groups, but that this level of resolution has been proving useful. In terms of understanding the LEP population, the most recent ACS data suggest that 12.69 percent of the METRO service population is LEP and the data have lent themselves to more refined mapping and planning; METRO has learned that half of that LEP population (in areas with an LEP percentage higher than a certain threshold) is located within a fairly tight one-quarter-mile catchment area of route service on the METRO network. Working with the ACS data and keeping with the more general Title VI mandate, METRO has begun overlaying its transportation network map layer on ACS-derived maps of racial composition and poverty levels to document how and whether the current network is satisfying user needs.
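The kind of tract-level LEP tabulation described here can be sketched as follows. Everything in the example is an assumption made for illustration: the tract records, the 10 percent threshold (the presentation did not state the value METRO uses), and the near_route flag, which stands in for a quarter-mile catchment test that would in practice come from a GIS overlay rather than a per-tract yes/no value. The denominator is the population 5 years and over, the universe for the ACS language questions.

```python
LEP_THRESHOLD = 0.10  # hypothetical cutoff; not METRO's actual value

# Hypothetical tract records. "near_route" is an assumed, precomputed flag
# for whether the tract falls within the quarter-mile route catchment.
tracts = [
    {"tract": "A", "pop_5_plus": 4000, "lep": 800, "near_route": True},
    {"tract": "B", "pop_5_plus": 5000, "lep": 250, "near_route": True},
    {"tract": "C", "pop_5_plus": 4000, "lep": 800, "near_route": False},
    {"tract": "D", "pop_5_plus": 8000, "lep": 400, "near_route": False},
]


def lep_share(records):
    """LEP persons as a share of the population 5 years and over."""
    return sum(r["lep"] for r in records) / sum(r["pop_5_plus"] for r in records)


def high_lep_tracts(records, threshold):
    """Tracts whose own LEP share exceeds the threshold."""
    return [r for r in records if r["lep"] / r["pop_5_plus"] > threshold]


def share_near_route(records):
    """Share of the LEP population living in tracts flagged as near a route."""
    lep_total = sum(r["lep"] for r in records)
    lep_near = sum(r["lep"] for r in records if r["near_route"])
    return lep_near / lep_total


if __name__ == "__main__":
    flagged = high_lep_tracts(tracts, LEP_THRESHOLD)
    print(f"service-area LEP share: {lep_share(tracts):.2%}")
    print("high-LEP tracts:", [r["tract"] for r in flagged])
    print(f"high-LEP population near a route: {share_near_route(flagged):.0%}")
```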
In the discussion following the presentations, Sanders was asked whether METRO planned to make further use of the school district data on LEP students. Sanders answered that METRO was generally pleased with the coverage of the data it assembled from school districts; he has done some direct comparison of the school district data with ACS tabulations and noted that the agency’s early work with the school district numbers had yielded surprises and directions for future work. However, METRO is sufficiently comfortable with and confident about the ACS as a consistent source of data for areas and groups within its whole service population that it does not plan to repeat the school district collection.
Sanders closed by noting that the Houston region is continuously changing; he said that the area is becoming known for its sociodemographic diversity, and METRO is one of several Houston planning agencies that feel a strong need for the ACS to continually update their plans and projections.
In the questions and discussion following the presentations, several speakers were asked to clarify whether they calibrate or check the ACS data on basic demographic groups (race and ethnicity) against the decennial census. Sanders answered that he and METRO use the 2010 decennial census counts for some of their count tabulations, but also use the ACS 2006–2010 estimates with some comparison. Jarosz clarified that she and SANDAG tend to use the ACS data as a source for rates or percentages rather than raw counts; for counts, they use the decennial census totals as well as their in-house population estimates program. Thomas agreed, using the ACS for rates rather than counts.
Lester Tsosie (Navajo Nation and a workshop presenter; Section 5–D) asked a similar question, whether the presenters had developed any order of precedence for the decennial census and the 1-, 3-, and 5-year ACS products. Jarosz answered that it depends on the specific question at hand. For benchmarking population counts or counts for age or sex groups, the natural preference is for the decennial census counts. And, while some region-level questions might be answerable with 1- or 3-year ACS numbers, much of her work involved analysis at the census tract level, giving the 5-year products particular prominence. Sanders agreed, reiterating their use of the decennial census for counts but the ACS for rates (e.g., above or below threshold percentages of LEP population for Title VI compliance documentation).
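The division of labor the presenters described (decennial census for counts, ACS for rates) amounts to simple arithmetic, sketched below. The figures, the threshold, and the function names are hypothetical, and the sketch glosses over the fact that the census count and the ACS rate may refer to slightly different universes (for example, total population versus population 5 years and over).

```python
def acs_rate(numerator_estimate, universe_estimate):
    """Rate computed entirely from ACS estimates (e.g., LEP / population)."""
    if universe_estimate <= 0:
        raise ValueError("universe estimate must be positive")
    return numerator_estimate / universe_estimate


def benchmarked_count(census_count, rate):
    """Apply an ACS-derived rate to a decennial census count."""
    return census_count * rate


if __name__ == "__main__":
    # Hypothetical tract: ACS 5-year estimates of 450 LEP persons out of
    # 3,600 residents, and a decennial census count of 4,000 residents.
    rate = acs_rate(450, 3600)
    count = benchmarked_count(4000, rate)
    threshold = 0.10  # hypothetical compliance threshold
    print(f"ACS LEP rate: {rate:.1%}")
    print(f"benchmarked LEP count: {count:.0f}")
    print("above threshold" if rate > threshold else "at or below threshold")
```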
Andrew Beveridge (Queens College and Social Explorer, Inc., and a workshop presenter; Section 7–C) followed this general thread of calibration between the ACS and the decennial census, asking the Census Bureau staff in the room about the shock in the 1- and 3-year estimates caused by the changing population base used to weight the user survey. To construct ACS estimates, the survey data must be weighted based on population controls that—until a new decennial census takes place—are based on the Census Bureau’s population estimates program, essentially updating population counts every year between the censuses. For large-population areas covered by the 1- and 3-year ACS data, this meant that ACS products released in 2010 were controlled to 2009-vintage population estimates; the products released the following year could make use of the completed 2010 census counts. In cases where the 2010 census counts were discrepant from the existing population estimates—including, as Beveridge noted, large cities like New York and Detroit—these differences and their effects on the ACS estimates made the ACS products less useful for tracking change over time. Beveridge asked why the Bureau did not go back and revise the earlier estimates, or otherwise work to bridge this shock in the estimates; James Treat (ACS Office Division Chief, Census Bureau) answered that this is certainly an issue the Bureau is aware of, but is basically a resource constraint. Generating the current estimates is intensive enough, with activities and requirements stacked on each other, that the resources do not exist to revisit earlier products.
Scott Boggess (U.S. Census Bureau) challenged the presenters with a question that would recur throughout the workshop. Though several of the speakers had commended the timeliness (and relative timeliness to other information) of ACS releases, he noted that the Census Bureau gets continued feedback saying that it would be nice if the releases came out even faster; Call had included more timely releases, even on a half-year basis, in her “wish list.” The question was whether there are specific examples where a difference of weeks or months in release would really affect the usefulness of the data—whether there are specific instances where ACS estimates being available in June would be much better, in some sense, than estimates coming out in September. Jarosz said that the call for increased timeliness reflects the comments and feedback that people like the presenters in this session hear from their downstream clients and users; she said that in the age of information technology, elected officials, the public, and the media all expect “instant gratification” and the most current information. That said, she empathized with the Bureau’s concerns about the production cycle as well; indeed, she said that she regularly spends time teaching people about how long it takes to process census and ACS data. She noted her own personal comfort with the current release time, echoing Thomas’s comment during his presentation that analysts in health care are used to data being of considerably older vintage (4–5 years old). Stark agreed, noting that some of the other data used by DOHMH use ZIP Code as their sole geographic key—as a result, there are some analyses they do where DOHMH has to wait (sometimes many months) for results to be released at the ZIP Code tabulation area (ZCTA) level.16 When asked to work on a new public health initiative, DOHMH has to account for the possibility that they might need to switch to a different level of geography—where a quicker analysis might or might not be as good as one using the ZCTA-keyed data.
16As indicators of postal delivery areas, ZIP Codes do not necessarily form a coherent geographic layer; they may overlap or may technically relate to a specific geographic point rather than an area, as in a ZIP Code for a place without home mail delivery (all P.O. Box delivery). Hence, the Census Bureau periodically constructs ZIP Code tabulation areas to approximate the delivery areas of 5-digit ZIP Codes.