Part I
Workshop Summary



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




Improving Business Statistics Through Interagency Data Sharing: Summary of a Workshop


1 Introduction

BACKGROUND

U.S. business data are used broadly, providing the building blocks for key national—as well as regional and local—statistics measuring aggregate income and output, employment, investment, prices, and productivity. Beyond aggregate statistics, individual- and firm-level data are used for a wide range of microanalyses by academic researchers and by policy makers. In the United States, data collection and production efforts are conducted by a decentralized system of statistical agencies.1 This apparatus yields an extensive array of data that, particularly when made available in the form of microdata, provides an unparalleled resource for policy analysis, for research on social issues, and for the production of economic statistics. However, the decentralized nature of the statistical system also creates challenges: to efficient data collection, to containment of respondent burden, and to maintaining consistency of terms and units of measurement. These challenges make effective data sharing among the statistical agencies paramount.

1 See Norwood (1995) for an account of this historical development.

During the workshop's introductory session, Steven Landefeld—director of the Bureau of Economic Analysis (BEA), the workshop's sponsoring agency—provided an overview of the goals motivating the event. He reflected on issues that arise in a decentralized statistical system, noting that its data products excel in detail, timeliness, and relevance but often fall short on consistency. These inconsistencies create problems for BEA in producing the national income and product accounts, which must draw from numerous data sources and make adjustments for differences in collection timing, as well as in concepts and definitions. The quality of data produced by the statistical agencies, in turn, affects the work of users, including other agencies, such as those responsible for budget projections and planning, the allocation of funds, and state and local decision making.

Landefeld pointed out that, while data sharing has already improved and facilitated the work of BEA, current arrangements are limited in key ways. For example, the codes and regulations of the Internal Revenue Service (IRS) allow the Census Bureau to share data on large multiestablishment businesses but, for reasons discussed below, not on smaller and single-establishment businesses, which account for about 40 percent of receipts nationwide. As a result, critical data omissions persist as BEA and the Bureau of Labor Statistics (BLS) go about the business of producing statistical information on various dimensions of the U.S. economy.

The stated purpose of the workshop, described in this summary report, was to present ideas for easing the constraints that limit the ability of statistical agencies to efficiently share administrative and statistical data on businesses. In order to produce the highest quality data sets and statistics at the lowest possible cost—and with minimal respondent burden—statistical agencies must be able to access the best information available, system-wide. With this as the backdrop, BEA asked the Committee on National Statistics of the National Academies to convene a workshop to discuss interagency business data sharing. The workshop was held October 21, 2005.
Recent legislation, most notably the 2002 Confidential Information Protection and Statistical Efficiency Act (CIPSEA), has served to revive debate on data-sharing, access, and confidentiality issues. Although U.S. statistical agencies have a long history of data sharing and of efforts to improve those arrangements, CIPSEA has created new opportunities to expand interagency sharing of business data among BEA, the Census Bureau, and BLS.2 The CIPSEA legislation embodies two core goals: to establish uniform cross-agency confidentiality protections and to promote efficiency in the production of the nation's statistics by authorizing limited sharing of business data for statistical purposes.

2 Chapter 5 provides a detailed description of the history of data sharing and data-sharing legislation. In addition, Appendix B provides brief summaries of relevant data-sharing legislation.

The objectives behind the data-sharing component of the legislation are threefold. First, it was hoped that permitting the three agencies to share information would improve the comparability and accuracy of federal statistics by allowing more timely updating of sample frames, development of consistent classifications of establishments and industries, and exploitation of administrative data. Second, more integrated use of data should reduce the paperwork burden for surveyed businesses. Finally, through these mechanisms, it was hoped that the sharing of data would lead to improved understanding of the U.S. economy, especially for key industry and regional statistics. One example of CIPSEA's potential is reflected in the recently authorized ability of the Census Bureau and BEA to link survey data to produce new statistics on domestic and international U.S. research and development activity.

As Katherine Wallman, the chief statistician of the Office of Management and Budget (OMB), pointed out, input from the Census Bureau, BEA, and BLS is essential if CIPSEA implementation and guidance are to successfully build on experiences from earlier data sharing to make future arrangements more effective. Wallman noted that at least some additional access to tax information will be needed in order to realize the full benefit of the umbrella legislation for data sharing among the three statistical agencies.

Tax data have always been an essential, but highly restricted, source of information for measuring aspects of the economy in general and for construction of the national income and product accounts in particular. Since long before CIPSEA, BEA has been able to utilize, in a limited manner, valuable business tax and revenue data from the IRS.
For example, provisions in Section 6103 of the IRS code authorize BEA to access corporate income tax return information so that published IRS corporate profits data can be converted to accounting concepts appropriate for use in measuring gross domestic product. For its regional economic accounts program, BEA has been authorized under other provisions of Section 6103 to review individual tax return records in order to produce tabulations of nonfarm proprietors' income (which are reviewed by the IRS to ensure taxpayer confidentiality); these estimates are used, in turn, to distribute BEA's national totals by state and county.3

3 For more examples of data sharing, see Chapter 5.

The above uses notwithstanding, BEA access to federal tax information is still extremely limited relative to that afforded to the Census Bureau. The current tax code allows the IRS to supply enough information (e.g., names and addresses) from businesses' tax returns so that the Census Bureau can construct its business register and sampling frames; however, "commingled" Census Bureau-IRS data cannot be shared with either BEA or BLS. Because of the lack of specific legislative authority, Title 13 and IRS code (Title 26) guidelines vary with regard to whether the data collected by the Census Bureau directly from taxpayers (using the IRS-based sampling frame) are fully under Census Bureau authority or whether the IRS should maintain some control. Tax data issues received considerable attention at the workshop and are reported on more fully below.

Before moving on to describe the workshop proceedings, it is useful to clarify a few terms that are used throughout this summary:4

Data sharing is the exchange of information collected from businesses and individuals, or reported to the IRS, in identifiable form for statistical purposes.

Business data include operating, financial, and related information about businesses, tax-exempt organizations, and government entities (CIPSEA).

Identifiable form means information that permits the identity of the respondent to whom the information applies to be reasonably inferred by either direct or indirect means.

Statistical purposes involve the description, estimation, or analysis of the characteristics of groups, without identifying the individuals or organizations that comprise such groups. The designation also includes methods and procedures related to the collection, compilation, processing, or analysis of data about these groups and the development of related measurement methods, models, statistical classifications, or sampling frames.

Box 1-1 lists acronyms and abbreviations related to interagency business data sharing.

WORKSHOP CONTENT

The workshop focused on the benefits of data sharing to two groups of stakeholders: the statistical agencies themselves and downstream data users.
4 These definitions are expanded upon in Chapter 5.

BOX 1-1
Abbreviations and Acronyms Related to Interagency Business Data Sharing

BEA  Bureau of Economic Analysis
BLS  Bureau of Labor Statistics
BR  Business Register
CBO  Congressional Budget Office
CBP  County Business Patterns
CES  Current Employment Statistics
CIPSEA  Confidential Information Protection and Statistical Efficiency Act (2002)
COS  Company Organization Survey
CPS  Current Population Survey
EIN  Employer Identification Number
FASB  Financial Accounting Standards Board
FRB  Federal Reserve Board
FTI  Federal Tax Information
GAAP  Generally Accepted Accounting Principles
GAO  Government Accountability Office
GDI  Gross Domestic Income
GDP  Gross Domestic Product
IRC  Internal Revenue Code
IRS  Internal Revenue Service
JCT  Joint Committee on Taxation
MWR  Multiple Worksite Report
NAICS  North American Industry Classification System
NABE  National Association for Business Economics
OMB  Office of Management and Budget
PEO  Professional Employer Organization
QCEW  Quarterly Census of Employment and Wages
RDC  Research Data Center
SBO  Survey of Business Owners
SIC  Standard Industrial Classification
SIRD  Survey of Industrial Research and Development
SOI  Statistics of Income
SSA  Social Security Administration
SSN  Social Security Number
SSS  Special Sworn Status
TIN  Taxpayer Identification Number
USDA-NASS  U.S. Department of Agriculture-National Agricultural Statistics Service

Presenters were asked to highlight untapped opportunities for productive data sharing that cannot yet be exploited because of regulatory or legislative constraints. The most prominently discussed example was that of tax data needed to reconcile the two primary business lists used by the statistical agencies. Both BLS and the Census Bureau compile business establishment lists—the Business Establishment List and the Business Register, respectively—mainly from administrative data, but also with supplemental survey data. Each covers about 8 million business establishments with employees, and they are used for similar purposes: to create sampling frames for a wide variety of surveys by the Census Bureau and by other statistical agencies, to benchmark survey data, to publish employment and wage data, and to generate aggregates used by other agencies, most notably many of the inputs to the national income and product accounts (Box 1-2).

In addition to leading to discrepancies of coverage, the redundancy of effort creates inefficiencies in maintaining up-to-date frames and samples, and it may contribute to difficulties in achieving adequate response rates to various surveys. Combining information from both sources could generate a more accurate, consistent business list that, for BEA, could improve its estimates in a number of areas (e.g., trade in services, corporate profits, and industry employment and wages by location). Inconsistencies, particularly in the assignment of establishments to industries and in the range of entities covered, carry direct implications for the reliability of key statistics—from gross domestic product, to employment, to productivity and industrial production—derived from business lists or subsequent survey-based data. Streamlining the business registers and survey programs is also likely to reduce respondent burden for businesses.

BOX 1-2
Why Should BLS and the Census Bureau Work Toward a Reconciled Sampling Frame?

The argument for the business list case goes beyond simply reducing redundancy and, possibly, administrative expenses. As noted by several of the workshop's presenters, some widely used macro statistics are derived from combinations of Census Bureau and BLS data. For example, productivity is calculated as the ratio of a Census Bureau figure (output) to a BLS figure (labor input). If output and input measures were estimated from the same survey, then the presence or absence of any particular firm from the sample would likely have a very small effect on the ratio, because both the numerator and the denominator would change in the same direction. With separate samples, however, even a relatively common occurrence, such as a discrepancy in the industry code assigned to a firm, could have visible effects on measured productivity growth in two industries, because the change in the firm's output is attributed to one industry while the change in its inputs is attributed to another. There are important research and policy incentives for moving toward the use of a common sampling frame.

In its presentation, BEA cited these and other improvements that could be achieved through more extensive integration of data across statistical agencies. However, it is important to note that there are also advantages to having separate lists, as separation allows each agency to tailor the characteristics of its list to the specific purpose it serves. Jim Spletzer and Paul Hanczaryk (presenting for BLS and the Census Bureau, respectively) noted that interagency data sharing is an obvious and low-cost prerequisite for improving the business lists. While list comparison work is well under way, the idea of a business list reconciliation project is still very much at the discussion stage. The legalities and procedures necessary to begin this kind of work (specifically, the restrictions resulting from the federal tax information that the Census Bureau receives from the IRS) are not trivial. If sharing among the three CIPSEA-designated agencies is to be fully exploited, either IRS regulations or code (or both) must be changed. It was known at the time of CIPSEA's passage that, in order for the Census Business Register to be shared, companion legislation would be needed to modify Section 6103 of the IRS tax code or to change the interpretation of that code. The Joint Committee on Taxation has not yet taken action to address this specific data-sharing need.
The 2004 Statistical Programs of the United States Government report (Office of Management and Budget, 2004) indicated that the proposal for companion legislation, which would make complementary changes to the provisions in the "statistical use" section of the IRS code, was endorsed by the Treasury Department and submitted to Congress; however, it expired with the 107th Congress.

During the workshop's morning session, representatives from BLS, the Census Bureau, the IRS, and BEA addressed current data-sharing arrangements and the role that data sharing plays in producing federal statistics. The Census Bureau and BLS provided information on their ongoing business list comparison project, which is intended to comprehensively document the comparability of the lists. Mark Mazur and Nick Greenia of the IRS Research, Analysis, and Statistics Division provided an overview of current data-sharing arrangements and of the interpretation of relevant regulations and legislation. They expressed a clear understanding of the importance of data sharing for improving business lists and indicated a willingness to work carefully and incrementally toward this goal within the legal guidelines. They further suggested that, if companion legislation to CIPSEA is to have a real chance of moving forward, the expansion of tax data sharing should be narrow in scope and clearly tied to purpose. For example, for the purpose of reconciling business lists, perhaps only a few basic variables—e.g., name, address (or geocode), employer identification number, employment, payroll, and industry—would need to be shared.

During the afternoon session, which highlighted the perspectives of downstream data users, presenters from the Federal Reserve Board (FRB), the Congressional Budget Office, and academia discussed the benefits of business data sharing as it relates to productivity and real output measurement, informing monetary policy, estimating business profits, and budget forecasting. Federal Reserve policy models incorporate productivity statistics that are derived from industry-level output and employment data. Dennis Fixler of BEA and Carol Corrado of FRB described how the maintenance of two establishment lists creates time-consuming complications. For example, in calculating industry productivity statistics, inconsistencies arise because source data are drawn using different methods and from nonidentical sets of business entities: output figures (the numerator) originate from Census Bureau data, while input figures (the denominator) are derived from BLS data.

Several participants touched on the distinction between informative data discrepancies and those leading to statistical inconsistencies that are costly in terms of user communities' time and resources. For example, although household- and payroll-based estimates of employment differ significantly at times, the two sources can illuminate slightly different aspects of the labor market picture. And, although BEA would certainly like to minimize the statistical discrepancy between income- and expenditure-side measures of gross domestic product (particularly for a few problematic industries), employer and household surveys each generate valuable information, both independently and in combination.
In contrast, reconciliation of the two business lists involves mainly definitional and classification issues, which, workshop presenters seemed to agree, should be made as consistent as possible. Carol Corrado, chief of the Industrial Output Section in the FRB Division of Research and Statistics, noted that discrepancies between Census Bureau and BLS employment-by-industry figures lessen her confidence in BEA's industry accounts and, in turn, in the accuracy of productivity change measures. She added that the sectors of the economy experiencing large changes in productivity are likely also those associated with problematic data inconsistencies. Corrado argued further that the statistical agencies could use resources more efficiently if they did not have to maintain two business lists. Steering committee member John Haltiwanger noted that, given the different uses of the business lists, it would not make sense to choose one over the other; instead, the weaknesses and strengths of each should be recognized and exploited. Corrado expressed the hope that, in the very near future, a system would be in place that is capable of reconciling differences in employment by industry.

Dale Jorgenson of Harvard University touched on similar themes in his presentation, observing that policy makers are hamstrung by inconsistencies in the data system that arise from the absence of statutory authority to share data among agencies. He expressed the view that resolving these uncertainties is essential and suggested beginning at the most fundamental level. One goal of Jorgenson's work on a "new architecture" for the national income and product accounts is to have common registers of firms, establishments, families, and individuals and to collect data in a way that is internally consistent at the micro level (Jorgenson, Landefeld, and Nordhaus, 2006).

Several presenters suggested that an effective approach to advancing the dialogue between policy makers and statistical agencies is to begin by recognizing the potential not just for increased efficiency and more accurate information for policy, but also for reduced respondent burden. The case for change must rest on compelling, data-driven examples for which the payoff is clear across data-sharing agencies and between respondents and the agencies. In that spirit, workshop participants cited numerous examples in which more effective sharing would improve data and for which the associated benefits more than warrant action to build on the successful data sharing authorized by CIPSEA.

Finally, the confidentiality side of CIPSEA was not neglected. Participants from the agencies stated the need to continue to take this responsibility very seriously, both as a matter of principle and as a means to buttress public confidence in the agencies.
As outlined in the summary that follows, many presenters argued that the uniform confidentiality provisions created under CIPSEA provide sufficient coverage to expand data-sharing arrangements while still ensuring that the privacy and confidentiality of records will be maintained. At the time of the workshop, however, OMB had not yet issued guidelines for the confidentiality provisions of CIPSEA, and agencies continued to interpret the requirements differently. The point was made that, given the confidentiality requirements enacted through CIPSEA, the agencies are now in a better position than ever before to protect data collected for statistical purposes under a pledge of confidentiality.

Landefeld noted the importance of continuing to do a good job of protecting the confidentiality of data, while suggesting incremental changes in data-sharing arrangements, including:

- streamlining administrative procedures under CIPSEA;
- expediting access to research data centers (keeping statistical uses as the top priority); and
- modifying IRS procedures to promote effective use of administrative data for statistical purposes, either through legislation or through changes in the regulations.

Landefeld expressed the hope that participants would emerge from the workshop with a renewed sense of the importance of moving forward to responsibly expand interagency data-sharing arrangements.