In earlier workshop presentations, speakers from two nonprofit organizations spoke about one important role they play with respect to American Community Survey (ACS) data: serving as an “interpreter” of sorts for the complex data, both for the general public and for media outlets. The media are important consumers and users of ACS data (and of products derived from those data) because they communicate key findings and trends from the data; many more people are exposed to the data through the window of the media than will ever access and work with the data themselves. The media are thus a vital front line in communicating the survey’s results and a diverse data user constituency in their own right—ranging from the small media outlets described by other speakers as lacking any capacity for original data analysis to outlets capable of working a new ACS release into a complete package of stories on changes in American life.
The workshop presentations in the block dedicated to media perspectives covered a considerable range of views. The workshop steering committee asked Haya El Nasser of USA TODAY to speak generally about the way news stories are carved from new data releases and about the general challenges of writing and communicating about the ACS (Section 4–A). From this general discussion, the session pivoted to a profile of a specific, intensive, data-driven exploration and how it came to be—a multipart profile of the impact of immigration (and illegal immigration) on the California economy developed by Ronald Campbell of The Orange County Register (Section 4–B). The final speaker—graphics editor Ford Fessenden of The New York Times (Section 4–C)—took El Nasser’s comments a bit further by discussing the unique opportunities and challenges ACS data create for the graphical presentation
of data, typically through the Times’ online platform.¹ The discussion period following the speakers’ presentations (Section 4–D) revisited the thorny problem of the presentation (or lack thereof) of standard errors associated with ACS estimates.
Though she began her remarks by calling herself “probably the least technically savvy” of the speakers, Haya El Nasser took care to note that her newspaper—USA TODAY—has a longstanding commitment to working with and presenting census and ACS data, and that she personally has been writing on related issues for almost 16 years. Noting the comments made in the previous session, she commented that she does have Excel and enjoys working with extracts of the data herself; more generally, she said that USA TODAY is fortunate to have the tools and the capacity to do extensive work with the raw data. In that regard, she credited USA TODAY database editor Paul Overberg as their “secret weapon” in working with full data releases and getting things into the shape to form finished stories, and many of her workshop comments described the interactive process between her and Overberg in mining the stories from ACS data.
Walking through the process that unfolds with a “data dump”—what reporters like her tend to call a new release of ACS data (or a new set of results from the decennial census)—El Nasser began by noting that some directions for data-driven stories come from basic hunches and instincts. Having covered stories in this area full-time for a number of years, she said that she talks to a lot of people about demographic trends; she reads a lot of material and travels to work on stories. From all of that—as well as her experiences within her own community—she makes a habit of thinking about these “hunches of changes,” whether about changes in household size, moving patterns (families moving, or not moving, in greater numbers), or shifts in demographic structure brought on by economic conditions (e.g., older children moving back to the parental home). El Nasser said that some of these hunches make their way into written notes (or just Post-It® notes) while she keeps others in her head, but that she compiles them for discussion with Overberg when a new census or ACS release is about to come out. In addition to being another sounding board on the possible veracity of a hunch, Overberg—as the person responsible for wrangling the raw data—also serves as a “reality check” on what
¹In addition to the three journalistic perspectives presented at the workshop, database reporter Phillip Reese from The Sacramento Bee contributed to the workshop’s case study/agenda book a short description of stories that he and his newspaper have derived specifically from the ACS. That summary, in turn, is a concise recap of a presentation on ACS use by the media that he delivered at the Population Association of America annual meeting in San Francisco in May 2012, shortly before the workshop.
types of analysis can or cannot be done on deadline for a compelling “first-day story” to accompany the new data release. Through these conversations, they narrow their focus down to a few areas or hypotheses where they might have anecdotal evidence but also feel that the data analysis might corroborate what they see on the ground.
With those topics (and seeds for possible stories) in mind, they then wait for the numbers to go up, and start to work through possible stories while they have access to the data under an embargo period. After Overberg does initial work with the large data files and distills them into more workable form, El Nasser said that she starts looking at the newly created spreadsheets, playing around with the rankings and rates of change. Quite deliberately, they put a time limit—typically a couple of hours—on this initial exploring and eyeballing, and then talk through what they see. In addition to more refined analysis, this initial exploration leads to the next step, which is tapping the very broad pool of sources—demographers and other social scientists—who can weigh in on what they are noticing in the data. El Nasser candidly conceded that this early discussion and corroboration of data trends with outside sources might be a technical violation of the ACS embargo policy lamented by Terpstra (Section 3–B), but it is essential to getting any compelling story done on time (and having confidence in the trends they observe).
Sometimes, she said, there emerges a good, strong theme for the “first-day” story based on the new data. At other times, it takes more time than the unforgiving release deadline permits to develop such a strong theme. In those latter cases, the first-day story becomes what they colloquially call “the stew”—highlighting a variety of changes in different variables, to give readers some sense of the extent of demographic changes occurring in the country. Not meant as a pejorative, “the stew” actually serves to highlight the strength of a data source like the ACS, when it can be used to describe trends across many variables and for all varieties of geographies (region, key states, large metropolitan areas, and so forth).
Whether a single-theme story or “the stew,” work on the story develops along two principal fronts (though certainly in collaboration with each other). Through her own looks at the data and discussions with sources, El Nasser works on the reporting of the story—working through the challenge of “turning the numbers into words” and making them comprehensible to the broader public. Meanwhile, Overberg and other USA TODAY staffers work on the maps and graphics to accompany the story. In modern journalism, the focus is not just on how to render data views from the ACS or other sources in print, but also on how they may be rendered on the newspaper’s website. Naturally, she said, this raises issues like making sure that the online product is in sync with the printed version, and—with the necessary design and computation—the process can become extremely elaborate for a 1-day data release (but, she conceded, also a lot of fun).
Beyond the first-day story, they have the luxury of going back to the data and mining them for additional details and stories in the following weeks or months. Detailed drill-downs into the data can be done while reporters travel to develop “focus stories”—fuller reporting packages to illustrate what the numbers are showing. She said that first-day stories tend to emphasize hot-button topics of known reader interest—among them trends in commuting and in housing costs (e.g., the share of household income that goes toward housing costs). With specific reference to housing, new data releases also commonly have a separate housing story developed by a reporter in USA TODAY’s financial section, typically with insight and analysis work by the Joint Center for Housing Studies at Harvard University. As an example of a longer-term focus story, El Nasser said that they are currently working with ACS data for a story package on trends in mobility in the country; the release of data on mobility trends as detected in the Current Population Survey (CPS) is a primary spark for the story but—as noted by earlier speakers—the much smaller CPS sample size limits the degree to which it can be disaggregated by geography or demographic group. Hence, they are working with the ACS data to get at the richer detail, to round out the story. Likewise, shortly before the workshop, USA TODAY ran a story that dipped back into ACS data on housing tenure to discuss the degree to which the United States is becoming a “renter’s nation” (an increasing share of renters versus owners).
Turning from the general process to specific aspects of USA TODAY’s use of ACS data, El Nasser made the point that the overriding value of the ACS for news editors—and the reason why she and USA TODAY particularly like it—is the freshness and timeliness of the data. Put most simply, “new data make news” in a way that older data simply cannot. She recounted particular memories from the 1990s, going to editors with stories in 1996 and 1997 making good use of the data from the decennial census long form—good stories, of which she and fellow reporters were proud—and having them rejected out of hand, with dismissive looks essentially saying “Are you crazy?” To an editor—focused on the news of the moment—a story that looks like it is based on 6- or 7-year-old data is practically dead on arrival, and the counterargument that the long-form data are really the only thing out there does not gain much traction. To be sure, she said, USA TODAY ran some stories of that variety—and still does—“but nobody got too excited about them.” Further, they do occasionally get buy-in from editors to pursue more archival or historical stories—for instance, for a longer piece accompanying results from the 2010 census that looked at major national changes relative to the 1990 and 2000 censuses²—so the rejection of seemingly old data is not absolute. Still, newer data get the bulk of editorial
²El Nasser recalled that the story just mentioned, which accompanied the 2010 census results, also made use of the most current ACS data, on items such as marital status and age at first marriage, for comparison of trends with the previous census long-form samples.
attention and having good data to talk about how the country is changing every year, with fresh numbers every year, is “quite amazing for us”; it is that freshness and newness of results that makes the ACS an irresistible source for the media.
Regarding the geographic levels that USA TODAY tends to use, El Nasser said that they value the ACS for its ability to provide national-level “bird’s-eye views” of trends as well as profiles by state, county, and metropolitan area. Their use of geographies finer grained than county or metropolitan area is relatively rare. They have not really pursued tract-level analysis—and have not really needed to—but that option is certainly open, for instance, to focus a story on the impact of the foreclosure crisis on a particular city neighborhood. She did recall one instance in which Overberg used data from the Public Use Microdata Sample (PUMS) in support of a story on the diversity of the 2010 incoming kindergarten class and what it suggested about the nation’s demographic future. Overberg has also used 5-year ACS data at the Public Use Microdata Area (PUMA) level for a story on children in poverty; the text looked at poverty rates by different age groups and studied income ratios at the county level, but the full package included glimpses at the PUMA level.
El Nasser made the first mention at the workshop of a peculiar identity crisis faced by the ACS, a topic that would recur in subsequent discussion. When estimates from the “2010 ACS” rolled out in 2011, on the heels of more detailed releases from the 2010 census, El Nasser and others began working on a first-day story covering six or so trends that were perhaps linked to the national recession’s impact on individual Americans’ lives—delayed marriages, fewer divorces, more crowded households, changing housing vacancy rates, and so forth. But the initial response from editors was skepticism and confusion: why should we look at and use these numbers instead of the 2010 census? And so, El Nasser said, they had to go through a mini-seminar of sorts to clarify the differences and distinctions between the census and the ACS. She commented that she had been asked to talk about communicating the uncertainty inherent in ACS estimates to readers—and she would—but that the editors’ reaction spoke to a potentially more fundamental concern: continued uncertainty and misunderstanding about exactly what the ACS is. One massive consolation is that the Census Bureau has an excellent reputation as a trustworthy source of information; as El Nasser put it, her editors—who act as a filter by taking on the role of the reader—“couldn’t care less if it is the ACS or CPS or whatever … the bottom line for them is that it comes from the Census Bureau.”
This combination of trust in the Census Bureau and confusion over what exactly the ACS is explains why media stories commonly use the generic references “census data” or “data from the Census Bureau.” El Nasser said that reporters certainly know the precise source of the data in their stories but find it difficult to win these battles; proper attribution of the ACS as a source invites (in editors’ eyes) an explanation of what the ACS is and how it differs from the census, and that is a tough sell when space is at a premium. What
she and Overberg have done is to try to take care to always insert “American Community Survey” in the source line for graphics for the stories, so that people who are interested in following up on specifics have the clue that the actual source is the ACS. Outside of the first-day story crunch, the less time-sensitive focus stories are more likely to include ACS attribution in the text—particularly if they combine data from multiple surveys or censuses.
El Nasser was candid in saying that similar confusion or misunderstanding is the explanation for why “we don’t talk about margins of error in stories”—outside the specific application of political polling, where the concept of a margin of error has come into public parlance, the concept of standard errors and uncertainty is sufficiently opaque to editors (and general readers) that the general strategy is to avoid it in story text. She was quick to point out that, in reporting the story, they are cognizant of the standard errors that are now prominently presented in ACS tabulations; in fact, she commended the Bureau for doing a good job in presenting the standard errors and thus making it fairly evident to users whether some basic comparisons are statistically significant. She relayed a comment and point of criticism from Overberg that the Bureau’s online access tools are not friendly for handling some kinds of basic manipulations—for instance, combining detailed age groups into broader categories and recomputing the standard errors. To that end, adding a calculator widget (of the sort being developed by Plyer [Section 3–C]) or an improved facility for custom-building tables would be greatly beneficial.
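Overberg’s criticism has a concrete basis: when a user collapses published ACS categories (such as detailed age groups) into broader ones, the margin of error for the combined estimate must be recomputed by hand. As a rough illustration—not a tool either speaker described, and with hypothetical figures—the Census Bureau’s published approximation for the margin of error of a sum can be sketched in a few lines:

```python
import math

def combine_acs_estimates(estimates, moes):
    """Collapse several ACS sub-estimates (e.g., detailed age groups)
    into one broader estimate and approximate its margin of error.

    Follows the Census Bureau's documented approximation: the MOE of a
    sum is the square root of the sum of the squared component MOEs.
    """
    combined = sum(estimates)
    combined_moe = math.sqrt(sum(m ** 2 for m in moes))
    return combined, combined_moe

# Hypothetical example: collapsing three detailed age groups, with
# estimates and (90 percent) MOEs as they might appear in an ACS table.
est, moe = combine_acs_estimates([1200, 950, 800], [150, 130, 120])
print(est, round(moe, 1))
```

The approximation ignores covariance among the component estimates, which is one reason a sanctioned calculator widget or custom-table facility of the kind mentioned here would be preferable to hand computation.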
El Nasser concluded by reiterating that the bottom line for a news organization, driven by timely information, is that it will use the data that provide the most current insights at the geographic levels of greatest pertinence. The decennial census has long been a trusted source for this information, but the ACS has become indispensable for its topical and geographic coverage. Noting the recent legislative developments, she commented that she and USA TODAY certainly hope the ACS remains viable; her own sense is that ACS funding is likely to weather the immediate budgetary storm but that some period of costly and time-consuming work to square the ACS with new expectations is likely to follow.
While El Nasser described the general process of generating news stories from ACS data—and, in particular, stories on a tight first-day-of-release deadline—Ronald Campbell of The Orange County Register described his own personal experience with a longer-term data-based investigative report. His series of stories on the growth of the immigrant labor force in California since 1990 and its impact on the state’s economy was published in the Register in
September 2010³ (and would ultimately be recognized in April 2011 with a “Best in Business” award from the Society of American Business Editors and Writers) and synthesized data from previous decennial censuses as well as the then-available estimates from the ACS. He said that the experience could be described as a case study of data-driven reporting or—at times, and less politely—as a “project from hell”; though generally favorable about the data and their utility, he would end his remarks with an ominous prediction about ACS use in the media, setting a theme for later discussion at the workshop. (In addition to the condensed version of the narrative in his presentation, Campbell contributed a fuller written account for the case study/agenda book; this summary draws detail from the written case study as appropriate.)
Campbell said that he has long been involved in computer-assisted journalism—the field’s term for using technology and databases to tell stories—and hit on the idea of using PUMS data from recent censuses and the ACS to assess prevailing conventional wisdom about immigrant labor in California. Immigration is an impossible topic for reporters in California to avoid; as he began work on the series, he said that he knew that one bottom-line statistic—that immigrants constitute roughly one-third of the labor force in California—is an “extraordinary number.” He added that California politicians typically like to speak of the state’s economy on a global scale—if it were treated as a country in its own right, California’s economy would be the eighth or ninth largest in the world. So the role of immigrant labor in an economy of that size was an interesting hook for a story to pitch to Campbell’s editors. A clinching detail was the knowledge that the only developed economy outside the Middle East with a larger proportion of immigrants in its workforce is the vastly smaller economy of Luxembourg.
This project was a difficult one to take on, for a variety of reasons. The general topic was “wildly controversial” and had been heated for many years, particularly following debates over the state’s Proposition 187 in 1994;⁴ common assertions in those debates included claims that immigrant workers in California were predominantly illegal/undocumented, poorly educated, and poorly paid drains on state and local resources. As a piece of journalism looking at trends over several decades, the story also had to be compelling both as a history lesson and as fresh, topical news. With such a contentious topic, Campbell knew that
³An overview “Series at a Glance” page at http://www.ocregister.com/articles/choices-265585-immigrants-driven.html includes links to all the stories and sidebars from the series. Supporting spreadsheets are also hyperlinked in the online story footnotes.
⁴Passed with a 59 percent vote in a 1994 statewide referendum, Proposition 187 (originally introduced in the California legislature as the “Save Our State” initiative) would have limited public education, health care, and other social services to persons verified to be U.S. citizens or legal immigrants. Challenged on constitutional grounds, the proposition was essentially stayed from implementation by federal court injunction shortly after its statewide passage; in 1997, it was held to be unconstitutional in the U.S. District Court for the Central District of California and, in 1999, the state of California withdrew its appeals.
his work and analysis of the relevant data had to be “bulletproof”; as he put it, he and his editors “knew that, no matter how careful we were, readers would give us hell.” Accordingly, he established two basic parameters as he set about assembling the story: first, to set a “high threshold of proof” up front—using 95 percent confidence intervals and describing results in text only if they carried a margin of error of plus-or-minus 5 percent—and, second, to get permission from his editors to post all of his numbers and spreadsheets online. Through this posting, Campbell said that he wanted readers to see not only the numbers he worked with but also those—clearly branded in red shading—that failed his “threshold of proof” test. Echoing El Nasser’s comments about the inability to describe statistical concepts when space is at a premium, he also lobbied his editors to devote space in the story layouts to a “nerd box”—a text box describing the underlying methodology.
Campbell said that it did not take long to encounter difficulty in stitching together a story from a variety of census and ACS data sets of differing vintages—conceptual definitions (such as occupation categories) change over time, and geographic boundaries are in similar flux. More fundamentally—and counter to previous speakers’ highly favorable comments about the ACS sample size—the annual ACS sample size (as reflected in 1-year PUMS files) is so small relative to decennial census long-form samples that many comparisons one might want to make are simply not viable. To make more comparisons work, he turned to the most recent 3-year PUMS file (the 5-year numbers remained more than a year away from release) and specifically to the version of the file distributed by the IPUMS project (described below in Section 6–B). He acknowledged that direct comparison of a 3-year interval estimate to a point-in-time estimate like those from the 1990 or 2000 census long-form samples runs counter to the Census Bureau’s usual guidance, but he said that he felt that he had little choice. Other workarounds that he implemented to facilitate his analysis included working almost exclusively with ACS percentage estimates rather than counts and focusing on larger geographies. To illustrate the point, he described his first attempt to use the 2006–2008 ACS PUMS file to derive the number of immigrant workers in each of the state’s 233 PUMAs; the screenshot of the table he displayed was filled with red shading—his indicator for estimates with relative standard errors greater than 5 percent—because all 233 estimates failed the test. Switching from counts to percentages of foreign-born workers, Campbell was able to generate usable results for all but 17 PUMAs.
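Campbell did not describe his screening procedure in implementation detail, but a reliability test of the general kind he outlines can be sketched as follows. The conversion factor reflects standard Census Bureau guidance (published ACS margins of error correspond to a 90 percent confidence level); the threshold and the sample values are hypothetical, not his data:

```python
def passes_threshold(estimate, moe90, max_rel_moe=0.05):
    """Screen an ACS estimate against a relative margin-of-error
    threshold at 95 percent confidence, in the spirit of Campbell's
    "threshold of proof."

    Published ACS MOEs are at the 90 percent confidence level; rescale
    to 95 percent by the ratio of normal critical values, 1.960/1.645.
    """
    if estimate == 0:
        return False
    moe95 = moe90 * (1.960 / 1.645)
    return (moe95 / abs(estimate)) <= max_rel_moe

# Illustrative only: a PUMA-level count with a wide MOE fails the
# screen, while the corresponding share (a proportion) may pass it.
print(passes_threshold(4800, 900))    # count estimate: fails
print(passes_threshold(0.41, 0.015))  # share estimate: passes
```

This mirrors the pattern Campbell reported: raw counts for small geographies routinely fail such a screen, while percentages over the same geographies often survive it.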
For purposes of mapping data, Campbell said that he used the “consistent PUMAs” developed by the IPUMS project, which are designed to be consistently identifiable units (geographic areas) over the 1980–2000 censuses as well as the ACS samples from 2005 forward. These consistent PUMAs tend to be larger in size than the most recent 2010-ACS-vintage PUMAs, which further helps with the margin-of-error problems.
For this investigative series, the key issue was not just profiling the immigrant population but seeing how immigrants are distributed across the workforce—what kinds of jobs do they hold, and what proportion of various professions or industries is held by immigrants? Campbell said that this examination proved difficult. The long-form samples from the 1970–2000 censuses were able to support conclusions about the immigrant population within “hundreds” of professions, and comparing across censuses suggested those occupations where the immigrant share had increased dramatically. But the picture became murkier with the ACS data. By his reliability standards, Campbell said that he could obtain good results from the 1-year 2008 ACS numbers for only 44 of the roughly 334 job categories coded in the ACS data (looking at the state as a whole, with a pool of some 6 million foreign-born workers). Results improved when he turned to the 3-year, 2006–2008 ACS data; the 3-year numbers supported reliable estimates of the foreign-born share of 108 job categories, which he subsequently reduced to a set of 90 for comparison with the earlier censuses. In general—and, again, not challenging previous speakers’ comments about using the ACS relative to the CPS, but summarizing his own application—Campbell said that he found it frustrating that “I deliberately had to be less specific in what I was aiming for” (in terms of the number of years of pooled data and in geography) to get reliable results.
In the end, his series on immigration in California documented and highlighted some trends that ran well counter to popular “received wisdom.” As he summarized in his written account for the case study:
Drawing on four decades of census data, the series showed that immigrants were responsible for most of the growth of California’s labor force since 1990, that hundreds of thousands of immigrants worked in high-wage jobs as doctors, scientists or engineers, and that they had little or no apparent effect on the income of well-educated [U.S.-born workers].
On the last point, Campbell raised the example of two job categories that had experienced major influxes of foreign-born workers between 1970 and 2008: registered nurses (RNs) and automobile mechanics. The foreign-born shares of these two job categories were roughly comparable to each other in 1970 and again in 2008 (as reflected in the 2006–2008 ACS)—about 37 percent of RNs and 45 percent of mechanics in the later data. True, inflation-adjusted wages in the two job categories diverged greatly over the years—mechanics made slightly more than RNs in 1970, while the mean wage of RNs in 2008 was more than twice that of mechanics. But Campbell argued that the key difference between the categories is not immigrant or foreign-born status but the degree of educational attainment that has come to be expected in the two jobs, with RNs shifting from typically having 2 years of college (or less) in 1970 to at least a bachelor’s degree in the 2008 data.
Campbell wrapped up his comments (and his written case study) with what he called “issues for ink-stained wretches”—guidance for reporters thinking about undertaking similar analysis with ACS data. He reiterated the importance
of setting (and maintaining) appropriate thresholds for confidence intervals and margins of error, and of documenting the chosen methodology before starting analysis, revising it as need be during analysis, and (ideally) publishing it with the finished story. However, he cautioned his peers that “ACS promises more than it delivers”—the great promise of the ACS for reporters was that tract-level data would be highly reliable, and so provide reporters with neighborhood-by-neighborhood comparisons and “thousands of different stories.” Yet—large though the ACS sample may be—he said that the standard errors on such fine-grained estimates simply do not support that level of scrutiny. He cited the example of his home census tract in Orange County: “even a casual inspection of tract-level data” from the 5-year ACS yields some estimates that he “would not hesitate to use [in] the paper” (a high school educational attainment rate of 97.4 percent, with a 1.4 percent margin of error), but also others that are too far out of whack to print (a median household income of about $119,900, plus or minus 16.6 percent). In Campbell’s opinion, the large margins of error make ACS tract-level data a less reliable source for reporters than previous decades’ census long-form samples.
Commenting that he, El Nasser, and Fessenden constitute a somewhat unrepresentative sample of reporters in terms of sensitivity to data analysis issues, Campbell agreed with a basic point made by Terpstra and Plyer (Sections 3–B and 3–C): most reporters do not (or cannot) perform their own analysis of the numbers. Though data-driven (or computer-assisted) reporting is making some inroads, it is still a specialty—and a general fear of numbers still pervades many newsrooms. However, where Terpstra and Plyer emphasized the corresponding need for “interpreters” to make ACS data and findings more usable by the media, Campbell feared that some reporters will continue to try to use the data while remaining reluctant to really probe the estimates’ standard errors and understand the “potential pitfalls.” Hence, he ended with a cautionary prediction: that sometime in the next several years, a news organization will have to retract a major story—based on ACS data—“because a reporter failed to understand the margin of error that is built into that data.”
Taking the podium immediately after Campbell’s prediction, Ford Fessenden (graphics editor, The New York Times) agreed with the premise that “there will be a huge problem at some point” when failure to understand ACS margins of error will force the retraction of a big story. But he stressed a more upbeat take on using the ACS data, particularly for small areas. He described his bottom line position as being that “what is most important about the ACS is exactly what is most problematic about the ACS”—its capacity
for allowing users to drill down to finer geographies and smaller groups. As Campbell said, the estimates can become much wobblier at those drilled-down levels but “there are massive amounts of truth there.” He suggested that there is certainly a limit to the degree to which ACS data can make its way into print—no newspaper would want to print endless tables, and the final stories can only contain so many numbers before they lose their narrative. But while the Times might not “want all this stuff in the paper, we [still] want it all”; they want access to the data for custom analyses, they want that access to provide readers with the capacity to directly interact with the numbers, and they conclude that “pushing the envelope” by drilling down to detail in ACS data “is a good thing.”
Fessenden continued that journalists too frequently write about trends at macro levels but are really guessing about the real trends. As an example, a news story in New York might assert that the complexion of the borough of Queens is changing because of an increasing number of immigrants (in contrast to other boroughs of the city)—an interesting story, and one that could be made very compelling with some anecdotal descriptions. But that macro-level story masks even more interesting changes going on within individual neighborhoods in Queens—concentrations of different Asian groups in Flushing and other neighborhoods, for instance. It is the small areas that really tell the story, Fessenden said, and it would be “a terrible waste” to either ignore the finest-grained data or to be “fearful about presenting” them. He commented that the ACS has vast potential for reporting—not only for spawning stories about what new releases of the data “say” (relative to prior releases) but as an invaluable source for background reporting in other stories.
Fessenden said that the thing that needs to be understood in order to explain the interaction between the media and the ACS (or other data sources) is very basic: print journalism is fundamentally about narrative writing—telling the story. Data might suggest a direction for a narrative, provide pointers to groups of people who may be interesting to profile in a story, or help corroborate individuals’ experiences in the service of a story—but to newspapers and their editors, the basic truth is that data in their own right are “boring” relative to the narrative. This explains, in part, the aversion to numbers and methodological discussion described by Campbell and El Nasser. Computer-assisted or data-driven reporters have made some inroads in infusing data into the newspapers; Fessenden noted that he was such a computer-assisted reporter before becoming a graphics editor, and that reporters like Campbell, El Nasser, and himself serve as “Billy Beanes⁵ of the newspaper business” in trying to weave stories from data. The unwillingness to distract from the narrative carries over to the presentation of uncertainty inherent in ACS estimates.
⁵Billy Beane, general manager of Major League Baseball’s Oakland Athletics since 1998, is known for basing player assessments on statistics and data analysis (rather than the instincts of scouts), and was the subject of Michael Lewis’ 2003 book Moneyball.
During his presentation, Fessenden directly echoed Campbell’s language—by and large, “we [at The New York Times] don’t and we won’t communicate that uncertainty” by directly presenting the standard errors, and the phrase “margin of error” “will not appear except in the graphics about polling,” where it has become accepted. Instead, Fessenden said, what is incumbent on him and his peers is to study the margins of error and their implications—“to try to understand it ourselves and then tell people something we believe is true.”
The other main point that Fessenden wanted to make about data-driven reporting and the increasing role of the ACS concerned “correctible errors.” An important role played by data users in news organizations is trying to catch errors (or validate anecdotal findings) before they make their way into print. As an example, he cited a recent instance at the Times of a reporter being tasked to do a story on a new, hot topic in the city: commuting by bicycle. The reporter started with a glance at some data sources, including ACS tabulations, but—“with no understanding of [margins of error] or outliers or anything like that”—the results were perplexing; estimates “were all over the place.” The reporter then went out and conducted interviews, returning a written story that “beautifully” recounted what the reporter had observed. The lead of the story suggested that many people commute by bicycle, but the story went on to quote “a bunch of people who had never seen anybody on a bicycle”—and wound up suggesting that these interviews confirmed what the data had said. Fortunately, Fessenden said, the story was stopped from publication by someone in the approval cycle who recognized that making use of the newspaper’s “ability to go a little deeper” into the data could improve the story. In this case, a correctible error was caught before it reached publication, and the story—and its supporting data analysis—could be reworked to give a clearer impression of what is actually occurring with bicycle commuting.
As he displayed and spoke about some graphics projects that he has worked on for the Times, he noted that the newspaper is not always overt in presenting the standard errors associated with the estimates. One such project allows users to click and construct households of various configurations (e.g., male/female unmarried partners with one child under 18 and one child over 18); the count and prevalence of such households based on 5-year (2005–2009) ACS data are displayed, along with a quick race/ethnicity group and household income graph comparing those households to the overall population.6 He conceded that the calculator does not attempt to show the specific-household counts as an interval estimate, and so it might give some “sense of false precision.” But he said that they assume that people know that as the household is made more and more complex, the numbers get smaller (and more uncertain). Understanding that the calculator could drill down to counts “so low that even we get kind of freaked out” about the corresponding level of uncertainty, the page handles the problem by suppressing results for groups that fall below some threshold level (around 50 people in the 5-year sample or a weighted count of less than 1,000). For such rare groups (e.g., single male primary householder with two foster children), the calculator says only that there are “fewer than 1,000” such households and that “there are too few households of this type to chart.”
6This family project is located at http://www.nytimes.com/interactive/2011/06/19/nyregion/how-many-households-are-like-yours.html; the credit line notes that the calculator uses the IPUMS version of the file (see Section 6–B) and that it makes derivations from the data by Social Explorer, Inc. (see Section 7–C). A third graph pulls data from the 1900–2000 decennial censuses for the user-specified household type (when that detail can be extracted) and graphs the count over time.
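The suppression rule described above amounts to a simple pair of threshold tests. The thresholds (roughly 50 unweighted cases in the 5-year sample, or a weighted count under 1,000) are as reported in the presentation; the function name and exact return strings here are purely illustrative:

```python
def describe_household_count(sample_n: int, weighted_count: float) -> str:
    """Apply a suppression rule of the kind Fessenden describes: when too
    few sample cases stand behind an estimate, report only a floor."""
    MIN_SAMPLE = 50        # ~50 unweighted cases in the 5-year sample
    MIN_WEIGHTED = 1_000   # or a weighted count below 1,000

    if sample_n < MIN_SAMPLE or weighted_count < MIN_WEIGHTED:
        return "fewer than 1,000 households (too few to chart)"
    return f"about {round(weighted_count):,} households"
```

Suppressing on either threshold guards against both failure modes: a tiny unweighted sample with a large weight, and a genuinely rare group.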
As another example—emphasizing the richness and geographic detail in the ACS data—Fessenden turned to a series of maps also generated from the relationship variable in the ACS, showing the prevalence of some nontraditional household types (e.g., “missing generation” households including a married couple and a grandchild, married couples with two adult children living at home).7 Some of the maps show the nation as a whole at the metropolitan-area level (with balance-of-state remainders); others map New York City at the census tract level. Fessenden said that this set of maps was interesting to readers—giving a picture of household structure in the nation that cannot be obtained from any other source—but was particularly interesting because it spurred The New York Times staff to probe for explanations for some of the trends evident in the maps. The “missing generation” map—married couples with a grandchild—was an interesting one to parse, showing high levels of these types of households in the South and in rural areas; working with sociologist and demographer sources, the staff suggested that this might be “a map of drug abuse and rural poverty.” Pushing down to the tract level within New York City, the maps are undoubtedly busier yet still show some strong patterns; the tract level is useful as being specific enough to see interesting things yet not so specific that “[a reader] says, ‘Whoa! You are wrong; that is not right in that place.’” Among the interesting patterns evident in the tract-level data are strong concentrations of unmarried partners in some parts of Queens (like Long Island City) and increasingly upscale parts of Brooklyn; likewise, he noted that the map of single mothers with children under 18 overlaps considerably with pockets of poverty and minority-population concentration.
ACS data were also used to generate a text-based map by metropolitan areas that ran in The New York Times in January 2012. For each metropolitan area, Fessenden and his colleagues derived the 99th quantile of household income—that is, the income level necessary to qualify as “the top 1 percent”—and printed the area name and estimates arrayed by their rough geographic location.8 Fessenden recalled that this project was frightening; rushing on a deadline, they encountered some unusual outliers. Consulting with their in-house experts as well as some outside consultants, they eventually reassured themselves by looking at other quantiles (e.g., 95th rather than 99th) and found the results to be relatively stable. One or two anomalies remained—“you don’t see them there” on the map—because they omitted those areas from the final map rather than commit a “correctible error” themselves.
Summing up his basic approach to dealing with the uncertainty inherent in ACS estimates—as he put it more colorfully in his slides, “how we learned to stop worrying about sampling error and embrace the ACS”—Fessenden said the strategy is to try to emphasize trends. They try to keep the level of geographic resolution reasonable—small-area units in which differences in standard errors are not incredibly volatile. He also suggested that the very practice of mapping the data is an informal mechanism for borrowing strength from nearby areas; visually, the reader can reconcile rates in one area with those in the immediate vicinity and so have a reasonable intuitive feel for larger trends. He cited a personal favorite example that ran in January 2011—a “mash-up of immigration [foreign-born householders] and race” mapped by census tracts in New York City that would have been impossible to do without the 5-year (2005–2009) ACS data.9 This effort was an unusual one for the Times—it had not generated a similar map in the past from census long-form data—and proved “enlightening in a way that maps of simple ethnicity or simple foreign-born [could] not.” Call-outs on the map flag the tracts with the largest concentrations of specific foreign-born immigrant groups—for instance, the largest single concentration of Dominican immigrants in a tract in Washington Heights, Manhattan, and the single most diverse tract in the city (at least 100 people from nine major foreign-born groups) in Queens Village, Queens. He acknowledged that the map is, visually, very busy, but said that the Times staff concluded that “we were on pretty good ground” presenting the figures in this way. He further acknowledged suggestions from Joseph Salvo of the New York City Department of City Planning (see Section 7–A) that they might consider using a more aggregate level of geography to overcome some of the volatility. But, for now, they enjoy the insights obtained by pushing the data as far as possible.
As an example of detailed analysis contributing to background reporting (and more generalized statements in narrative stories), Fessenden pointed to a package of graphs and a data-animation video that uses ACS data to describe the changing demographics along the course of the New York City Marathon.10 It draws particular contrasts between 2009 ACS and 1980 census long-form data (the premise of the story being that the route has not changed much since it was first adjusted to trek through all five boroughs in 1976, but the demographics have). The difference between 1980 and 2009 median household income is plotted along the marathon’s course, with high peaks in areas where median income has grown the most (e.g., Long Island City) and brief dips below ground level in areas that have become worse off (e.g., Mott Haven, in the Bronx). Similar above- and below-ground graphs along the marathon course are used to examine changes in the foreign-born population and particular racial groups. Fessenden briefly mentioned that he also made use of ACS data in background pieces and maps for a series of stories in 2011 commemorating the bicentennial of Manhattan’s street grid layout—again reconciling current ACS numbers with historical census data and emphasizing the massive demographic changes in parts of the city.
7The map series can be viewed at http://www.nytimes.com/interactive/2011/06/17/nyregion/maps-of-family-types.html?ref=nyregion. The maps draw from the IPUMS version of the ACS data file.
8The original text-based map does not appear to be archived on The New York Times website, but a variant—which constructs a map after the user types an income level to compare with the rest of the nation—is located at http://www.nytimes.com/interactive/2012/01/15/business/one-percentmap.html?ref=business.
9The referenced page is http://www.nytimes.com/interactive/2011/01/23/nyregion/20110123-nyc-ethnic-neighborhoods-map.html. A series of submaps, under the main tract map showing the ACS data, compares change in size for various specific immigrant groups (e.g., East Asians, Caribbean) based on the 2000 census long-form sample compared with the 2005–2009 ACS estimates.
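The stability check described for the top-1-percent map—comparing the 99th percentile of household income against a lower quantile before trusting it—can be sketched as follows. The `max_ratio` cutoff here is an arbitrary illustrative choice, not the Times’ actual rule:

```python
from statistics import quantiles

def top_income_threshold(incomes, max_ratio=3.0):
    """Estimate the 'top 1 percent' income cutoff for one metro area,
    flagging it as a possible outlier when the 99th percentile diverges
    implausibly from the 95th—a rough stand-in for the cross-quantile
    sanity check the Times staff describes."""
    cuts = quantiles(incomes, n=100)   # 99 cut points; cuts[98] ~ 99th pctile
    p99, p95 = cuts[98], cuts[94]
    suspicious = p95 > 0 and p99 / p95 > max_ratio
    return p99, suspicious
```

Flagged areas would then be inspected (or, as in the published map, omitted) rather than printed as-is.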
Fessenden closed by briefly displaying another tract-level map using 5-year (2005–2009) ACS data to describe housing stability in the city; tracts are shaded based on the median year in which householders moved into the area (emphasizing in-moves since 2000 and collapsing all arrivals before 1990 into a single shading category). In this case, a reporter was dispatched to a particular tract in East Elmhurst, Queens, because that tract showed up as the single most stable tract in the city—an average tenure of greater than 30 years. Fessenden rued that this was a case “when the desk gets a little out ahead of you” because the map suggests the tract might be a bit of an outlier; neighboring tracts seem to have considerably more recent housing arrivals, on average. But the reporting did bear out remarkable stability—“and that is our other check,” he said by way of conclusion. Data like the ACS can support the reporting, and vice versa; Fessenden said that he and The New York Times value the ACS because “there is so much more good in there than there is danger or negativity that we want it in the paper.”
Much of the closing discussion session for the first day of the workshop involved El Nasser and Fessenden being asked for additional comment on the presentation of margins of error.11 Terri Ann Lowenthal (see Section 7–D) asked the presenters to comment on Campbell’s assertion that the ACS promises more than it delivers with respect to the reliability of tract-level data—whether the journalists believe that the increased timeliness of ACS data (not being up to 10 years old, like the previous long-form samples) offsets the increased margin of error. She followed up by asking whether the journalists thought that there was anything the Census Bureau could do to help address the point suggested by all three speakers—that reporters cannot or will not present (and so might not fully understand) margins of error—and what they would do if ACS data were not available at all.
10The marathon graphic is located at http://www.nytimes.com/interactive/2011/11/05/nyregion/the-evolving-neighborhoods-along-the-marathon.html.
11Campbell had to leave the workshop immediately after his presentation in order to be able to participate in another conference.
El Nasser replied first, agreeing with Lowenthal’s premises; she and her colleagues understand the problems and complications associated with ACS data, but concluded that Fessenden put it best with his basic conclusion that “the good outweighs the bad.” She said that she understood Campbell’s concerns and criticisms—cutting the data by small area and by specific job types “slices it down to such a small level” that the analysis is decidedly complicated. But she said that for her and USA TODAY, the great value of the ACS is that “it is news”—providing a wealth of data on such a regular schedule. She conceded that USA TODAY might not push down to the neighborhood level as much as The New York Times or a local newspaper like The Orange County Register, but the ACS’s capacity for analysis by metropolitan area and county is “quite amazing.” She said that she does not necessarily agree with the argument that the ACS promised more than it delivers because “I think we always knew what the problems would be; my expectations were not that much greater.” Clearly one could wish for more and more precision, but the current ACS seems to be a reasonable compromise.
Fessenden agreed with El Nasser’s points, particularly the assertion that the ACS “is news.” More than that, “it is the news business; it is what we like.” Recalling the earlier question from a Census Bureau staffer on the practical difference that might come from ACS releases in June rather than September (Section 2–F), Fessenden said that the answer to that question is “an easy one for me”—“the fact that we have this data in much-more-like-real time is really important, and the only thing I would ask for is more.” He said that the availability of the ACS has helped create a culture at The New York Times that data can be used to say things about what is going on in the city, and what is going on in policy, in ways that were simply unknown before. Just the fact that the ACS is releasing data on a regular schedule ensures that there will be new stories using the ACS all the time—the newer and fresher the data, the better. That said, he noted that he and his colleagues learn to live with a necessary problem: They very much like pushing the data into very small-area analysis but understand that this necessitates working with the 5-year numbers. The problem is that combining so many years means missing the ability to answer some questions (or possibly getting some misleading readings) because the newest 5-year numbers meld pre- and post-recession numbers.
Returning to the question on margins of error, James Treat (ACS Office division chief, Census Bureau) commented that the products from the 2000 census long-form sample did not publish margins of error with their estimates, yet the Census Bureau decided to make the margins of error much more “in your face” in ACS products; they are included in every table. He wondered whether margins of error were an issue in working with the 2000 data or whether error has become an issue for media users simply because they are so much more prominent in the ACS products. Fessenden answered that he thought that the different style of presentation in the ACS products is a big part of the concern; he reiterated that he and his colleagues have to be mindful of the margins of error in deciding what to print and how to format finished products. Patricia Becker (APB Associates) added, and clarified, that there were margins of error for long-form-sample estimates—the numbers of cases for geographic areas were “buried in the back” of the products so that the standard errors could be calculated, with a lot of digging. She commented that “there were errors in the long-form data, particularly for small population groups or small geographic areas, and they are every bit as bad as [for] the areas in the ACS. Most reporters didn’t know about them; demographers did.” She suggested for the record that accounting for the margins of error on the long-form estimates in Campbell’s analysis might make the comparison much more favorable to the ACS.
Campbell Gibson (U.S. Census Bureau, retired) commented that it is good that the ACS makes the margins of error prominent in its products, but suggested that two points were missing from the discussion about comparing the ACS to past decennial censuses. One is that—as he understood it—the original sample size for the ACS was supposed to be about 3 percent of households per year, in which case a 5-year collection would more closely approximate the census long-form sample. But, for budgetary reasons, the ACS sample is about half that level. The other point (and, partially, a modest defense for suppressing the standard errors on the long-form products) is that the long-form samples yielded estimates only once every 10 years—a time span so long that variables like the percentage of foreign-born in an area could have changed so markedly that the question of whether the difference was significant from census to census was “kind of passé.” But the annual availability of estimates from the ACS naturally creates the temptation to compare data from one year to the next and—particularly for small areas—changes in some variables might be much more subtle.
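The year-to-year comparison problem Gibson raises has a standard remedy in the Census Bureau’s ACS guidance: published margins of error are 90 percent margins, so each can be converted back to a standard error by dividing by 1.645, and the difference between two estimates tested against the combined error. A minimal sketch:

```python
import math

def change_is_significant(est1: float, moe1: float,
                          est2: float, moe2: float) -> bool:
    """Test whether two ACS estimates differ significantly at the 90
    percent confidence level.  Published ACS margins of error are 90
    percent margins, so each standard error is MOE / 1.645."""
    se1, se2 = moe1 / 1.645, moe2 / 1.645
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return abs(z) > 1.645
```

For example, a small-area estimate of 1,200 ± 100 against the prior year’s 1,150 ± 100 fails this test—the apparent change of 50 is well within sampling noise—which is exactly the subtle-change trap Gibson describes.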
Andrew Beveridge (Social Explorer, Inc.; see Section 7–C) said that he wanted to add to this discussion his view on the ACS products’ presentation of margins of error in cases where there are very small numbers or proportions for a particular group or area. In those instances, the Census Bureau’s approach has been to publish margins of error that allow for negative counts or proportions. He suggested that the general ACS data products might be improved if the Bureau implemented the methodology it used in generating a special tabulation of languages spoken in households to determine requirements for alternate-language voting materials under the Voting Rights Act (see Chapter 7). In that work, the Bureau produced Bayesian estimates that directly borrow strength from estimates in nearby areas and tracts; the alternative of presenting hundreds or thousands of estimates akin to “0 plus-or-minus 133” seems absurd. Roderick Little (associate director for research and methodology, Census Bureau) closed the day’s discussion by agreeing with Beveridge that products making fuller use of modeling are a good direction—and added that, as the son of a newspaper editor, he found the session on media perspectives fascinating and that it is the start of a conversation about the distinction between simply “producing facts” and “providing useful evidence” for people.
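As an illustration only—the Bureau’s Voting Rights Act tabulation uses a far more elaborate model—the basic idea of borrowing strength can be conveyed by a precision-weighted compromise between a noisy tract estimate and its neighbors, together with truncating count intervals at zero. All names and numbers here are hypothetical:

```python
def shrink_toward_neighbors(est: float, se: float,
                            neighbor_mean: float, tau: float) -> float:
    """Precision-weighted average of a noisy tract estimate and the mean
    of surrounding tracts (tau is an assumed between-tract spread).  The
    noisier the tract estimate, the more it is pulled toward its
    neighbors—the essence of 'borrowing strength.'"""
    w = (1 / se ** 2) / (1 / se ** 2 + 1 / tau ** 2)
    return w * est + (1 - w) * neighbor_mean

def truncated_interval(est: float, moe: float) -> tuple[float, float]:
    """A count cannot be negative, so clip the lower bound at zero rather
    than publishing something like '0 plus-or-minus 133'."""
    return max(est - moe, 0.0), est + moe
```

Under these assumptions, a tract reported as 0 ± 133 whose neighbors average about 40 would be pulled nearly all the way to the neighborhood value, since its own estimate carries almost no precision.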