Improving Business Statistics Through Interagency Data Sharing: Summary of a Workshop

2
The Benefits of Data Sharing to the Statistical Agencies

Representatives from the Bureau of Labor Statistics (BLS), the Census Bureau, the Internal Revenue Service (IRS), and the Bureau of Economic Analysis (BEA) gave presentations on the benefits of data sharing to statistical agencies and on the prospects of enhancing current arrangements. Throughout the session, discussion about the data underlying the BLS and Census Bureau business lists was prominent. BLS and the Census Bureau are currently jointly engaged in a business list comparison project; Jim Spletzer and Paul Hanczaryk presented preliminary results and recommended next steps. Mark Mazur and Nick Greenia provided an explanation of current data-sharing arrangements and legal constraints from the IRS perspective. They also commented on the value of tax information to the statistical agencies and discussed the viability of various strategies for dealing with current data-sharing restrictions. Dennis Fixler delivered the morning’s final presentation, an overview of the current role and future potential of data sharing to serve BEA’s national income and product accounts work.

BUSINESS LIST COMPARISON AND RECONCILIATION

Both BLS and the Census Bureau compile business establishment lists, the Business Establishment List (BEL) and the Business Register (BR), respectively, mainly from administrative records, but also supplemented with survey data. Each “register” covers about 8 million business employer establishments, and they are used for similar purposes: to create sampling frames, to benchmark survey data, to publish employment and wage data, and to provide aggregate measures to other agencies. Products of the BEL and the BR include the Quarterly Census of Employment and Wages (QCEW) and the County Business Patterns (CBP), respectively.

Within each agency’s business list programs, there are numerous tasks for which data sharing could be helpful. For example, both the Census Bureau and BLS require establishment-level data for multiunit firms. The Census Bureau requests that these firms break out employment and payroll numbers by establishment in its Company Organization Survey (COS); for BLS, the Unemployment Insurance program’s Multiple Worksite Report (MWR) is used. Because the timeliness and comprehensiveness of the COS and the MWR are not the same, combining results could enhance the measurement of employment, payroll, and establishment birth and death trends for multiunit firms.

Spletzer described the collaborative project ongoing at BLS and the Census Bureau to compare, improve, and (perhaps eventually) reconcile the two lists. The goals of the comparison project are twofold: to understand the differences in the lists and to identify the strengths and weaknesses of each. Contributors to the project at BLS and the Census Bureau are ultimately motivated by the prospect of identifying opportunities to improve the value of the lists in the context of the uses that they serve. Improving the comparability and accuracy of the Census Bureau and BLS business lists not only would provide benefits to the statistical agencies and to downstream users, but also could reduce reporting burden on the business community and possibly reduce costs to the agencies in the long run.

Preliminary comparisons have found that heterogeneity between lists increases at finer levels of industry and geography detail; thus Spletzer and Hanczaryk suggest the need for greater sharing of micro-level data.
Additionally, they noted the work the agencies plan to do to resolve the different methods used to determine single versus multiestablishment status.

One of the purposes of the Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA) is “to improve the comparability and accuracy of Federal economic statistics by allowing the Bureau of the Census, the Bureau of Economic Analysis, and the Bureau of Labor Statistics to update sample frames, develop consistent classifications of establishments and companies into industries, improve coverage, and reconcile significant differences in data produced by the three agencies” (Public Law 107–347, Subtitle B: Statistical Efficiency, Sec. 521, Findings and Purposes).

The main hurdle to business list coordination work involves the statutory restrictions on federal tax information. The bulk of the data underlying the Census Bureau register originates from IRS tax records and is shared under Title 26; however, Title 26 does not authorize BLS or BEA to access these records or, for that matter, the Census Bureau data commingled with them. Therefore, change in the tax code is required before microdata from IRS sources can be shared for programmatic purposes.

To inform the business list comparison project, BLS and Census Bureau analysts first evaluated and compared the published aggregate statistics. In order to make a comparison, adjustments were made to take into account differences in the scope and coverage of the BR and the QCEW, most notably to account for the fact that the former includes large segments of the self-employed business population that the latter does not.[1] BLS private-sector data needed filtering to remove certain industries that the Census Bureau does not cover, such as crop and animal production, rail transportation, postal services, and private households. Several other industries, such as employment in government hospitals and employment in government liquor stores, had to be added.

One of the most striking discoveries of the business list comparison project relates to the aggregate employment numbers. As shown in Table 2-1, the overall employment count in 2001 is 5.5 percent higher for the CBP than it is for the QCEW. These findings were reported in Foster et al. (2005), which also concluded that industry and geographic coverage matter a good deal and that the heterogeneity of results increases at finer levels of industry and geography.

TABLE 2-1 Comparison of Published Statistics, 2001 Data

                      County Business    Quarterly Census of
                      Patterns           Employment and Wages    % Difference
Establishments        7,095,302          7,213,611               -1.7
Employment            115,061,184        108,916,710              5.5
Payroll (millions)    3,989,086          3,972,605                0.4

NOTE: Figures are adjusted for differences in industry coverage.
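The "% Difference" column in Table 2-1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the source does not state which formula was used, and the midpoint base (the difference divided by the average of the two totals) is an assumption chosen because it reproduces all three rounded figures in the table. The function and variable names are hypothetical.

```python
# Published 2001 totals from Table 2-1, stored as (CBP, QCEW).
TABLE_2_1 = {
    "Establishments": (7_095_302, 7_213_611),
    "Employment": (115_061_184, 108_916_710),
    "Payroll (millions)": (3_989_086, 3_972_605),
}


def midpoint_percent_difference(cbp, qcew):
    """Return (CBP - QCEW) as a percentage of the average of the two totals.

    Assumption: the midpoint base is a guess that happens to match the
    rounded table values; the source does not document its formula.
    """
    return 100.0 * (cbp - qcew) / ((cbp + qcew) / 2.0)


def table_2_1_differences():
    """Compute the percent difference for every row, rounded to one decimal."""
    return {
        name: round(midpoint_percent_difference(cbp, qcew), 1)
        for name, (cbp, qcew) in TABLE_2_1.items()
    }


if __name__ == "__main__":
    for name, diff in table_2_1_differences().items():
        print(f"{name}: {diff:+.1f}%")
```

Under this assumption the employment gap rounds to 5.5 percent. Dividing the same gap by the CBP total alone gives about 5.3 percent, and dividing by the QCEW total alone gives about 5.6 percent, so the choice of base matters when such figures are quoted.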
One purpose in cataloging the differences in published statistics was to help guide the micro-level analysis, which, given the 8 million establishment records, needed to focus first on the industries and states showing the greatest discrepancies. Different methods were used to compare data from single-establishment units and multiestablishment units. Single-establishment businesses were examined on an exact match basis; that is, for a given establishment, the lists had to show the same number of employees to count as a match. For multiestablishment businesses, the analysts used a “near match” band of 5 percent (± 2.5 percent). The percentage of cases in which the BLS and Census Bureau lists disagreed on single versus multiunit status was not available at the time of the October presentation.

[1] A full explanation of the methods used in this comparison project can be found in Foster et al. (2005).

In Table 2-2, the first row of figures shows the difference in aggregate employment for single-establishment businesses. The employment numbers are reasonably similar, but analysis of the microdata reveals important differences. About 30 percent of the employment and payroll estimates for the matched single-establishment businesses do not match exactly. Considering that the majority of Employer Identification Numbers (EINs) are for single-establishment businesses, these differences, both in number of establishments and the employment counts, are noteworthy and should be further explored. In the case of multiestablishment businesses, using the near-match concept, the employment and payroll estimates match only 39 and 51 percent of the time, respectively. Again, this shows that significant micro-level heterogeneity underlies the comparatively similar macro-level statistics.

The comparison project will document and further explore these similarities and micro-level disparities. In addition, the project will examine inconsistencies between the Census Bureau and BLS classifications of single versus multiestablishment businesses. As Table 2-2 indicates, there are approximately 309,000 cases (found by summing 197,000 and 112,000, the bottom two rows of column two) in which the Census Bureau and BLS disagree over single versus multiunit status, and these businesses account for about 21-22 million employees.
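The exact-match and near-match rules described above can be sketched in a few lines. This is a minimal illustration, not the agencies' procedure: the function names are hypothetical, the sample counts are invented rather than drawn from any microdata, and measuring the ± 2.5 percent band against the average of the two reported values is an assumption, since the source does not say which base was used.

```python
def is_match(bls_value, census_value, multiestablishment, tolerance=0.025):
    """Return True when two reported values count as a match.

    Single-establishment EINs require an exact match. Multiestablishment
    EINs use a near match within +/- 2.5 percent.

    Assumption: the band is measured relative to the average of the two
    values; the source does not specify the base.
    """
    if not multiestablishment:
        return bls_value == census_value
    base = (bls_value + census_value) / 2.0
    if base == 0:
        # Both values are zero, or they cancel; fall back to exact equality.
        return bls_value == census_value
    return abs(bls_value - census_value) / base <= tolerance


def match_rate(pairs, multiestablishment):
    """Return the percentage of (bls, census) pairs that match."""
    if not pairs:
        raise ValueError("no pairs to compare")
    hits = sum(is_match(b, c, multiestablishment) for b, c in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical employment counts for illustration only.
single_pairs = [(12, 12), (40, 41), (7, 7), (3, 5)]
multi_pairs = [(1000, 1020), (250, 300), (5000, 4900), (80, 80)]

if __name__ == "__main__":
    print(match_rate(single_pairs, multiestablishment=False))
    print(match_rate(multi_pairs, multiestablishment=True))
```

The same pair of counts can fail the exact test and pass the near test, which is one reason match rates for the two groups in Table 2-2 are not directly comparable.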
TABLE 2-2 Business List Comparison, 2001 Microdata for 47 States

                                                                                  Based on Microdata Comparison
                                        EIN          BLS          Census Bureau   Match on      Match on
                                        (millions)   Employment   Employment      Payroll       Employment
                                                     (millions)   (millions)
Matched single-establishment EINs       4.1          35           38              71% (exact)   69% (exact)
Matched multiestablishment EINs         0.112        49           48              51% (near)a   39% (near)a
Matched EINs, BLS multiestablishment,
  Census Bureau single-establishment    0.197        15           15
Matched EINs, BLS single-establishment,
  Census Bureau multiestablishment      0.112        6            7

NOTES: EIN = employer identification number, BLS = Bureau of Labor Statistics.
a Near match within ± 2.5%.
SOURCE: Workshop presentations by Jim Spletzer (BLS) and Paul Hanczaryk (Census Bureau).

The project will examine a number of other topics that factor into the inconsistencies found in the two business lists: the role of nonemployers, the data quality for professional employer organizations and help-supply services, overlap and duplication in the COS and the MWR, and the role of firm identifiers. Ascertaining the sources of the nonmatched data will take time, as nonmatches are complicated by technical issues of scope and coverage and the cooperation of the states. Only 47 states authorized BLS to share their data for this project, and the relationship between the states that opted out and the nonmatches is still being explored.

Hanczaryk acknowledged that sharing between BLS and the Census Bureau would likely lead to improvements in both lists. BLS industry coding, physical location addresses, multiunit data from the MWR, and employment data for single units are recognized as being very thorough, and this detail would benefit the Census Bureau. The Census Bureau is particularly interested in the data for multiunit companies within states, as well as in BLS data for the client businesses of professional employer organizations (PEOs). PEOs (or employee leasing firms) typically supply human resource management services (e.g., payroll accounting or benefits administration) to their clients. The Census Bureau’s tax record-based data do not accurately indicate the geographic location and industry of leased employees working at client sites; rather, they indicate the industry and location of the PEO itself.

BLS will benefit from an evaluation of firm information that is collected as part of the Census Bureau’s COS. Access to the Census Bureau data could potentially add consistency to BLS industry codes, giving the agency the ability to analyze microdata on nonemployer businesses (18.6 million on which the Census Bureau publishes data).

Hanczaryk provided an overview of the current limited data sharing between the agencies. The Census Bureau provides BLS with approximately 1.2 million EINs every quarter, which BLS then matches to its files to provide industry codes and physical location addresses. From this process, for 2004, 3.4 million BLS industry codes were returned to the Census Bureau. Sharing these EINs and codes reduced costs and respondent burden and provided greater uniformity of the two agencies’ economic data and, in the process, produced evidence that this type of data sharing works.

The comparison project currently under way will provide some indication to the agencies of what areas will provide the biggest payoff from more extensive sharing. Since, under the current agreement, the data can be shared for research purposes only, and not to update either of the registers, an important aspect of the project is to guide programmatic opportunities. The comparison would provide key input for any future reconciliation of data, should that become an option. For example, the COS and the MWR are now overlapping mail-out surveys to multiunit companies. By combining these two surveys, the agencies could reduce response burden on businesses, one of the twin goals outlined in CIPSEA.
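The quarterly exchange described above, in which the Census Bureau sends a list of EINs and BLS returns industry codes and physical addresses for the ones it can find, amounts to a keyed lookup. The sketch below illustrates the idea only: the record layout, field names, EINs, and codes are all hypothetical, and the source does not describe how either agency actually implements the match.

```python
# Hypothetical BLS-side file keyed by EIN; all values are invented.
bls_file = {
    "12-3456789": {"industry_code": "541511", "address": "100 Main St, Springfield"},
    "98-7654321": {"industry_code": "236220", "address": "200 Oak Ave, Riverton"},
}

# Hypothetical list of EINs sent by the requesting agency.
census_request = ["12-3456789", "55-5555555", "98-7654321"]


def match_eins(requested_eins, reference_file):
    """Split requested EINs into those found in the reference file and those not.

    Returns (matched, unmatched), where matched maps each found EIN to its
    industry code and address, and unmatched lists EINs with no record.
    """
    matched = {}
    unmatched = []
    for ein in requested_eins:
        record = reference_file.get(ein)
        if record is None:
            unmatched.append(ein)
        else:
            matched[ein] = record
    return matched, unmatched


if __name__ == "__main__":
    matched, unmatched = match_eins(census_request, bls_file)
    print(f"{len(matched)} EINs matched, {len(unmatched)} unmatched")
```

Keeping the unmatched EINs as a separate list, rather than dropping them, mirrors the comparison project's interest in understanding nonmatches as well as matches.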
In order to move forward on business list improvement and data sharing, the agencies must overcome important analytical and legal hurdles. First, the agencies need to better understand the magnitude of the differences and the reasons for them. However, comparing multiunit companies is complicated by the fact that there are no numerical identifiers that provide a one-to-one comparison of establishments. Second, without companion legislation, BLS is not authorized, under Title 26, to receive the Census Bureau microdata that include federal tax information. Finally, BLS, which has an explicit relationship with state unemployment insurance programs, would like to increase consistency in survey processes and economic data development, but this goal is hampered by the fact that the states cannot access key Census Bureau microdata. Hanczaryk concluded that the potential of data sharing to improve business lists and other programs in the U.S. statistical system that would benefit users should provide BLS, the Census Bureau, and BEA with ample motivation to move forward.

AN OVERVIEW OF TAX DATA AND IRS DATA-SHARING ARRANGEMENTS

The availability of federal tax information, especially confidential microdata, was a recurring theme of the workshop. The IRS houses data on tax returns from a variety of entities, including individuals, estates and gifts, tax-exempt organizations, and businesses and corporations (tracked by EIN, not establishment). The IRS presenters, Mark Mazur and Nick Greenia from the Office of Research, Analysis, and Statistics, identified constraints to using data processed from these returns, most notably the need for an authorized purpose as defined by the Internal Revenue Code (IRC). Their paper (see Chapter 6) discusses the legislative history and lays out limitations, lessons learned, possible means to expand data sharing under current constraints, and steps that might be taken to change the legislation. Their presentation touched on three specific topics related to expanded access: the need for statutory change, regulatory change, and policy agreements; the importance of linking any expanded access to a specific research or statistical purpose; and the need to make the benefits to the Treasury Department and other agencies clear to policy makers.

There are three major authorized uses of federal tax information: tax administration (which is the concern that overrides all others from the IRS perspective), tax policy analysis, and research and statistical uses. Tax data are subject to strong presumptions of confidentiality, and Section 6103(j) of the statute authorizing access for outside statistical use permits recipients to access only the minimum amount of data necessary to accomplish a stated purpose. The penalties for unauthorized disclosure or inspection of tax data are strong and clearly defined in the statute.
In addition, data recipients are subject to regular safeguard reviews of physical and computer security, need and use, and other factors in order to ensure that they are in compliance with IRS requirements to protect taxpayer confidentiality.

An increasingly difficult problem in releasing data is to ensure that they remain anonymous. Ever-advancing technology (improving data linkage programs, faster and cheaper computer processing, and increasing amounts of administrative data available on the Internet) has made it easier to match an individual or company to a specific record. For example, if a record includes an industry code, address, and revenue figure that could identify a particular business, anonymity would be breached and the law broken. In turn, protecting the confidentiality of tax data becomes increasingly difficult. The IRS considers all tax data sensitive, meaning that no distinction is made between information that is publicly available elsewhere, such as an address found in a phone book, and information that may not be, such as a firm’s income and profit entries. In addition, no statute of limitations exists for tax data, which means that they must be protected in perpetuity.

Various levels of security have been implemented to protect data from unauthorized use. The IRS maintains an audit trail for those who have access to the data, conducts background checks on users, and requires users of tax data to have a computer system that is separated from other types of data uses. These safeguards and constraints increase the cost of providing data and ultimately limit access.

Both IRS presenters emphasized the point that voluntary compliance is a cornerstone for accomplishing the central IRS mission of administering tax collection, and it is dependent on the protection, both real and perceived, of taxpayer confidentiality. Due to the sensitive nature of the information reported by taxpayers, ensuring and demonstrating the protection of confidentiality in the IRS process are vital aspects of promoting compliance. An ongoing concern at the IRS is that expansion of data access could increase the risk of a confidentiality breach and, along with it, a public perception that tax information is shared carelessly throughout the federal government. This in turn could weaken the voluntary self-assessment system. Mazur noted that a 1-percentage point change in the overall voluntary compliance rate translates into tens of billions of dollars in tax revenues collected. Thus, there are two important goals in managing tax data: protecting them to maintain as high a compliance rate as possible, and exploiting them effectively and efficiently for other authorized purposes, including statistical uses.

The characteristics of tax data are unique.
The size of the population covered (over 20 million organizations and over 125 million individual taxpayers) and the scope of return data (covering information on everything from mortgage interest deductions to corporate net profits) create a complex respondent universe and a wealth of detailed data. Because of disincentives dissuading false or late reporting, nonresponse is thought to be low relative to most survey alternatives. However, the IRS captures neither all data for all types of returns nor data on taxpayers who fail to file; as such, there are well-known and systematic inaccuracies in the data reported to tax authorities. Nonetheless, given that the data are used for tax administration purposes, including enforcement and internal research and analysis at the IRS, there is reason to believe that many components of the data are accurate (again, relative to survey-sourced data).

The federal executive and legislative branches conduct tax policy analysis with the data, while four agencies covered in Section 6103(j) of the IRC, namely BEA, the Census Bureau, the Department of Agriculture’s National Agricultural Statistics Service, and the Congressional Budget Office (CBO), can use the data for statistical purposes, although the extent of access for specific purposes varies widely by agency. Data in nonidentifiable form, including public use files, have been used more broadly by decision makers in the federal government, businesses, policy think tanks, academic researchers, and state governments.

Statistical agencies are charged with using existing data systems, such as administrative records, to the maximum extent possible to reduce costs and counter concerns about respondent burden. The IRS, by contrast, must protect federal tax information by providing it only for authorized purposes and to the minimum extent required for each purpose. These two directives create tensions, some of which have been partially relieved by agreements that stipulate clearly delineated uses of tax data and the conditions under which they may be used. Generally speaking, however, the IRS does not view either burden or cost reduction alone as reason enough to grant access.

Recognizing the great demand for access to federal tax information, Greenia and Mazur offered guidance about what kinds of sharing arrangements were most feasible. As outlined in the Greenia-Mazur paper (see Chapter 6), three methods are available to expand access to data: statutory change, regulatory change, and policy agreement. Greenia clarified the differences between statutory and regulatory change: the former requires the passage of a law through Congress and the signature of the president, while the latter requires Treasury Department approval. In order to add new statistical users, or to expand the access of those currently authorized to access federal tax information, Section 6103(j) must be amended. This statute stipulates who may use the data and for what purpose, as well as what data fields may be accessed, for example, corporate tax items.
The presenters suggested that limiting the specific data items to those actually needed for a specific purpose (e.g., basic information needed to construct a business list sampling frame), clearly tying requests to intended use, and specifying them in an amended statute might improve the chances of passing a proposed legislative change. The idea is to conform to the “minimum need” requirement by bounding the item content in the statute. Data sharing among authorized recipients is enabled elsewhere in statute, by Section 6103(p)(2) and the associated regulation (B).

On the regulatory side, the Treasury Department requires a compelling, data-driven business case in order to grant access to additional data items. Regulatory changes have been used in the past to both add and remove data field access. A change in regulation can supplement statutes and adjust for changes in user needs. For example, if the Census Bureau needed to access additional corporate tax data fields to accomplish a mandate under Title 13, Chapter 5, a change in regulation would be required to enable such an expansion in access. For purposes of business list comparison and reconciliation projects, Greenia discussed freezing item content in the statute to basic sampling frame information, again to be responsive to the minimum need requirement. He suggested, further, that limiting requests in this way would ease potential concerns by legislators and their staffs that additional items might be added in the future with only the approval of Treasury’s assistant secretary of tax policy.

The third mechanism for expanding access is a policy agreement, which is intended to supplement the statutes and regulations. The Census Bureau-IRS Criteria Agreement, which emerged in response to concerns over access to tax data facilitated by the development of the Census Bureau Research Data Centers (RDCs), is an example. The crux of this agreement is that the work must have a predominant Title 13, Chapter 5, purpose, which essentially means it must improve Census Bureau programs. The Census Bureau is responsible for evaluating proposals based on scientific merit and predominant purpose, and the IRS documents the request and either authorizes or denies it after determining whether it complies with IRS regulations. External researchers apply for Special Sworn Status (SSS), go through an FBI clearance, swear an oath to abide by Title 13, and then, after clearing these hurdles, are certified to access Title 13 and Title 26 data in the same manner as Census Bureau employees.

The Joint Committee on Taxation publishes an annual report that lists the volume of tax data records disclosed by the IRS, classified by statute and including statistical purposes. Disclosures under Section 6103(n), such as contract work with BEA or the Federal Reserve Board, are not included in this report. This report shows that the number of disclosures for federal statistical uses, most of which involve demographic data requested by the Census Bureau, is second only to those needed for state tax administration.
Greenia noted that policy-oriented research and statistical analyses are important considerations for tax data administration. When asked about the effect that outside researcher access has on public perceptions of confidentiality and privacy, Mazur suggested that it is likely minimal, as long as the number of analysts is small, access is at arm’s length, and researchers are subject to the same enforcement rules as others with data access.

During the open discussion, steering committee member John Haltiwanger inquired about possible models for new data sharing. One suggestion was to allow a third party, such as the Office of Tax Analysis in the Treasury Department, to access the data specifically for tax policy use and, second, to generate simulated synthetic data as public use data. Under this scenario, researchers from the Office of Tax Analysis would access data as SSS agents and follow the same rules as other external researchers working under the Census Bureau-IRS Criteria Agreement. The presenters recognized the potential importance of synthetic data as a tool for expanding access to microdata, citing several projects already under way at the Census Bureau. While the IRS is supportive of the Census Bureau work developing synthetic data, the quality and utility of the data sets are still largely unknown. Mazur noted that, while further development of synthetic data is necessary, past efforts have not generated data sets known to provide inference-valid results, especially for complex modeling applications.

In the context of moving forward on data sharing, the presenters offered a list of lessons learned (see Chapter 6 for a full discussion of these lessons). First, strong leadership and support from the highest levels of government are needed. The Office of Management and Budget (OMB), the Council of Economic Advisers, and congressional staff are good places to enlist support for a companion bill to CIPSEA. The effort needs active congressional support, both from staff of tax-writing committees and from members of the House and the Senate. It also requires clear communication to policy makers and the public of the potential benefits from interagency data sharing, such as increased efficiency, more accurate statistics, and reduced respondent burden. Second, Mazur suggested that the myth of a zero-sum game, in which expanding access in one area requires reduced access elsewhere, must be addressed. In addition, he argued, discrete and incremental steps may be better than bold leaps as statutory or regulatory changes are pursued. Finally, there should be some interagency coordination of confidentiality protection procedures, and the benefits of proposed changes to the Treasury Department need to be clear. (In 2002, staffers from the Ways and Means Committee and the Senate Finance Committee asked what was in the companion bill for Treasury.) By leading the Treasury Department effort to advocate for CIPSEA in Congress, the IRS demonstrated that it can play a major role in the development and passage of data-sharing legislation.
The more agencies that are behind the legislation, and the stronger the argument for widely distributed benefits, the more likely it is to receive congressional support.

DATA SHARING AND BEA PRODUCTION OF ECONOMIC STATISTICS

The extent to which agencies are able to share data for statistical purposes carries direct ramifications for national income accounting. As background for their presentation, Dennis Fixler and Steven Landefeld of BEA contributed a paper on this topic, which appears as Chapter 7 of this volume. In his presentation, Fixler specifically outlined how disparate sources of data can lead to inconsistencies in the construction of economic statistics. A prominent example is the fact that gross domestic income (GDI), gross domestic product (GDP), and state personal income have all, at times, displayed different rates of growth, differences that can present problems for policy makers. For example, GDP estimates are used for OMB and CBO budget forecasts, and the Federal Reserve incorporates the output growth information in monetary policy decisions. Chapter 7 provides details and further examples of why understanding these statistical discrepancies matters.

GDI and GDP are the two aggregate measures of domestic output. The product or expenditure side (GDP) is calculated primarily from the Census Bureau data, while the income side (GDI) is calculated mostly from BLS data. Conceptually, the two series should be equal, but because the data come from different sources, typically there is a statistical discrepancy. Historically, more weight has been given to GDP; however, as the statistical discrepancy increased during the latter half of the 1990s, analysts (e.g., those studying productivity trends or forecasting tax revenues) began paying more attention to the income side.

Wage and salary growth for the period 1995-2001, as measured using the Census Bureau data, has been greater than that shown by BLS data. Fixler and Landefeld suggest that one possible source of the GDP-GDI discrepancy may be tied to the way that stock options, bonuses, and fringe benefits are recorded in the BLS and Census Bureau payroll figures. Table 7-8 (see Chapter 7) breaks down differences in payroll growth to an annual basis, revealing that the Census Bureau figures are frequently but not always greater than those of BLS. To gain a clearer picture of measured wage trend differences, it is helpful to explore the data at the industry level. For example, between 1998 and 2002, the Census Bureau data show a faster growth rate than BLS data for the information sector; the opposite is true for construction.
Calculating real value-added growth in a few selected sectors using Census Bureau rather than BLS data shows that the absolute differences can be substantial. The growth rate for computer and electronic products in 2002 illustrates this difference, as the Census Bureau measure of current-dollar value-added is roughly double the BLS count. The higher Census Bureau number supports an altered view of that sector, and of trends in manufacturing generally, suggesting a different recovery story for the period. More complete data sharing among the agencies would allow researchers to investigate these data discrepancies in a systematic manner.

Data sources used by BEA to compute gross output per worker, an indicator of productivity, also show substantial differences. Two examples, shown in Table 2-3, are oil and gas extraction and petroleum and coal products, in which the differences between the CBP and QCEW measures of gross output per worker are 13.9 and 12.8 percent, respectively. These differences are linked to those discussed in the presentation by Spletzer and Hanczaryk on the BLS and Census Bureau business

list comparisons, and they show exactly why BEA has an interest in this project.

TABLE 2-3 Gross Output per Worker (in dollars)

2002                                                                  2002 Output per Employee
NAICS Code  Selected Industries                                       Census Bureau        BLS   Percent Difference
211         Oil and gas extraction                                          991,595    853,547   –13.9
324         Petroleum and coal products                                   2,062,617  1,798,598   –12.8
486         Pipeline transportation                                         761,076    660,673   –13.2
515-517     Broadcasting and telecommunications                             296,694    342,739    15.5
52-535      Finance, insurance, real estate, rental, and leasing            392,955    434,753    10.6

NAICS: North American Industry Classification System.

Another area for further exploration relates to Title 26 data. BEA, the National Science Foundation, and the Census Bureau worked together on a project begun in 2003 looking at research and development expenditures. The project was designed to use Title 13 and BEA data, avoiding federal tax information. As Table 7-10 reveals (see Chapter 7), in some cases the expenditures of U.S. parent companies, a subset of all U.S. firms, exceeded all U.S. expenditures. This inconsistency between expenditure measures could potentially be solved by using Title 26 data.

Estimation of state income taxes provides another example of how multiple sources of data can lead to inconsistencies in accounting. Chapter 7 includes estimates of the extent to which the difference between the BLS and Census Bureau payroll figures can affect projected state and local income taxes.

The benefits of data sharing can be viewed from either a system-wide perspective or, more narrowly, from the perspective of specific agencies. The system-wide benefits include improved sampling frames, more consistent industry and region classifications, and an increased capacity to resolve anomalies in responses, all without increased respondent burden.
Finally, there are important analytical policy questions that can be addressed at the micro level through data matching. Prime examples are policies involving foreign direct investment and offshoring. From the national accounts perspective, Fixler argued that data sharing offers a number of benefits. Data sharing would aid in resolving the statistical discrepancies underlying source data, such as those underlying

the income and product sides of the national accounts and the payroll and employment estimates. Preliminary employment figures, indicators, and extrapolators, which are used in BEA’s early estimates, as well as its models and projections, may be improved through data sharing and fuller access to data that capture accounting and other business changes. The extent to which sharing would help overcome problems of data disruptions, such as those caused by natural disasters like Hurricane Katrina, is unclear, but allowing agencies to compare notes could help fill the gaps.

During open discussion, Carol Corrado noted that some discrepancies are informative. She pointed out that analysts at the Federal Reserve Board look at differences across data series to piece together analytical insights. Identifying sources of discrepancies requires access to data to determine if there is a difference in reporting between the Census Bureau and the BLS forms, a difference in the interpretation of language, or something else. Fixler noted that, under current constraints, this capability is limited, as restrictions on federal tax information typically do not allow for adequate analysis below the aggregate levels. Fixler stressed that data sharing cannot solve all discrepancies, but it will allow analysts to better understand the source of the differences and provide policy makers with a clearer picture of what is happening in the economy.
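As a small worked check on the magnitudes discussed above, the percent differences reported in Table 2-3 can be reproduced from the two output-per-employee columns. The sketch below is illustrative only: the text does not state how the percent difference is defined, so the formula (BLS value minus Census Bureau value, divided by the Census Bureau value) is an assumption inferred from the published figures, and the names `TABLE_2_3`, `percent_difference`, and `table_rows` are hypothetical.

```python
# Output per employee (2002 dollars), copied from Table 2-3.
# Row layout: (NAICS code as printed, industry, Census Bureau value, BLS value)
TABLE_2_3 = [
    ("211", "Oil and gas extraction", 991_595, 853_547),
    ("324", "Petroleum and coal products", 2_062_617, 1_798_598),
    ("486", "Pipeline transportation", 761_076, 660_673),
    ("515-517", "Broadcasting and telecommunications", 296_694, 342_739),
    ("52-535", "Finance, insurance, real estate, rental, and leasing", 392_955, 434_753),
]


def percent_difference(census: float, bls: float) -> float:
    """Percent difference of the BLS value relative to the Census Bureau value.

    ASSUMPTION: this convention is inferred from the table, not stated in the text.
    """
    return (bls - census) / census * 100.0


def table_rows():
    """Return (NAICS code, industry, percent difference rounded to one decimal)."""
    return [
        (code, name, round(percent_difference(census, bls), 1))
        for code, name, census, bls in TABLE_2_3
    ]


if __name__ == "__main__":
    for code, name, pct in table_rows():
        print(f"{code:>8}  {name:<55} {pct:+.1f}")
```

Under this assumed convention, a negative value means the BLS-based measure is lower than the Census Bureau-based measure for that industry, and a positive value means it is higher.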