Statistical Data on Organizations
The idea of individual rights is well established in all societies and cultures of the world. Rights associated with collectives are, however, another matter.
Paul Reynolds, 1993
The uses of statistics on businesses and other organizations are not as widely understood as are the uses of statistics on persons and households based on the population censuses, the Current Population Survey (source of monthly estimates of unemployment), the vital statistics compiled by the National Center for Health Statistics, and other household surveys and administrative record sources. Yet, economic statistics are critical to understanding the health of the economy and the direction in which the economy is moving. The U.S. national economic accounts, the retail and wholesale price indexes, the data on foreign trade and financial flows—all are based primarily on data collected from organizations. The generally favorable reception to the 1991 Boskin initiative to improve the quality of economic statistics (see Council of Economic Advisers, 1991) is evidence of the importance that policymakers and other users attribute to these kinds of economic data.
TYPES OF ORGANIZATIONS
Statistics on organizations (considered legal persons) cover all data subjects (units of analysis) other than natural persons or groups of natural persons, such as families or households. Occasionally, data for persons and organizations may be part of the same data set. For example, some surveys link information about the business activities of sole proprietorships with demographic information about the personal characteristics of the proprietors. Other surveys link information about such organizations as hospitals or schools with data on persons served by or working in those organizations.
There are many kinds of organizations, and the differences among them often determine the level of confidentiality accorded to their data. In the commercial sector there are three legal forms of ownership: sole proprietorship, partnership, and corporation. Nonprofit corporations, which are exempt from income taxation, are a special group. Among the for-profit corporations, those whose shares are publicly traded are subject to special reporting requirements, and the contents of their reports (e.g., to the Securities and Exchange Commission) are generally available to the public. A subset of for-profit corporations, especially utility companies, are granted exclusive rights to particular markets; in return, their financial and other data may be subject to even greater public scrutiny.
Many companies consist of several individual establishments, at different physical locations. Although data for the company as a whole may be readily available to anyone, the same is not necessarily true for employment, payroll, production, and other data for each establishment controlled by the company.
In the public sector, the general expectation is that most information about the activities of federal, state, and local agencies and units of government will be available to all. Public access to such data is facilitated by freedom of information and sunshine laws at the federal level and in many states.
DIFFERENCES BETWEEN DATA ON PERSONS AND DATA ON ORGANIZATIONS
As noted in earlier chapters, information about organizations is not protected by the Privacy Act of 1974 (P.L. 93–579). If a statistical agency that has identifiable information about organizations is not governed by agency-specific confidentiality legislation, it must rely primarily on exemption 4 of the Freedom of Information Act (P.L. 89–487), which covers "trade secrets and commercial or financial information obtained from a person [interpreted to include legal persons, such as business organizations] and privileged or confidential," to deny public requests for access to identifiable records for organizations. The Freedom of Information Act, however, does not provide authority to deny requests from other parts of the government.
Most of the major federal statistical agencies have some form of legislation that allows them to protect the confidentiality of data on organizations that they collect and process. One exception is the Bureau of Labor Statistics, which, as explained in Chapter 5, has relied on a combination of regulations and lower court decisions to protect data that it obtains from businesses, either directly or through state employment security agencies.
For most of the economic censuses and surveys conducted by the Census Bureau, the confidentiality provisions in Section 9 of Title 13 of the U.S. Code apply. Those provisions protect the confidentiality of respondents' file copies of census and survey report forms, as well as the originals submitted to the Census Bureau. They do not apply to reports collected from state and local governments, because those reports, by their nature, contain only data available, at least in theory, to anyone. They also do not apply to the data compiled from official import and export documents in the Census Bureau's foreign trade statistics program. Section 301(a) of Title 13 makes the export data confidential unless the secretary of commerce determines that it is in the national interest to disclose them. The import data are not covered by Title 13 because they are collected by the Customs Service and only compiled by the Census Bureau. As a matter of policy, the import and export data are published in extensive detail by commodity and other variables, but without explicit identification of importers and exporters. The majority of data cells in the most detailed tabulations are based on fewer than five transactions.
Some additional aspects of confidentiality legislation and its effects on the ability of federal statistical agencies to protect data on organizations and to share or release their data for statistical and research purposes are discussed in connection with four case studies presented below.
In addition to the differences in confidentiality protection based on the type of organization involved and the statutory framework of the federal agency that collects the data, federal statistics on organizations differ from those on persons in several significant ways:
Response to statistical surveys of organizations is more likely to be required by law. Mandatory response requirements do not guarantee complete response to such surveys because prosecution of survey nonrespondents is not a high priority for federal law enforcement units. They do have the effect, however, that many surveys of organizations do not require a true informed consent process, as detailed in Chapter 3.
There are likely to be greater incentives for other federal agencies and individuals to desire access, for nonstatistical purposes, to individually identifiable statistical information about commercial organizations. As discussed below, agencies with regulatory and compliance roles might consider a survey of businesses a good source of leads to companies that are or may have been in violation of laws and regulations. Industrial spies might derive substantial material benefits by obtaining access to trade secrets and other information critical to a company's competitive position.
For aggregate variables like production, payroll, and expenditures, the population distributions for organizations tend to be much more skewed than they are for most variables associated with persons. Especially at the subnational level, data for one or two large establishments can dominate a data cell. As a consequence, there have been virtually no releases of public-use microdata sets containing business data, and a considerable amount of masking is required even in the public release of aggregate statistics. The size of a firm also affects the dynamics of the data collection process. Small businesses are less likely to be users of the data collected from them, have less influence over the content of economic data collection programs, and are likely to be disproportionately burdened by paperwork requirements.
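The masking of aggregate statistics mentioned above can be illustrated with a minimal sketch. It assumes two checks that are commonly described in the disclosure-limitation literature, a minimum number of contributors per cell and an (n, k) dominance rule; the text does not say which rules any particular agency applies. The function name, the thresholds, and the payroll figures are all hypothetical.

```python
def needs_suppression(contributions, min_contributors=3, n=2, k=0.8):
    """Return True if a cell should be withheld from a published table.

    contributions: one non-negative value per establishment in the cell.
    The thresholds (3 contributors, top 2 holding over 80 percent) are
    illustrative assumptions, not any agency's actual parameters.
    """
    # Threshold rule: too few contributors to hide any one of them.
    if len(contributions) < min_contributors:
        return True
    total = sum(contributions)
    if total <= 0:
        return False  # nothing can dominate an all-zero cell
    # Dominance rule: the n largest contributors account for more than
    # a share k of the cell total.
    top_n = sum(sorted(contributions, reverse=True)[:n])
    return top_n / total > k


# Hypothetical county-level payroll cells (thousands of dollars).
cells = {
    "County A": [9000, 300, 200, 100, 50],  # one very large establishment
    "County B": [500, 450, 400, 380, 350],  # evenly spread
    "County C": [1200, 800],                # only two establishments
}

for name, values in cells.items():
    status = "suppress" if needs_suppression(values) else "publish"
    print(name, sum(values), status)
```

The sketch covers only primary suppression. In practice, additional cells usually have to be withheld as well, so that a suppressed value cannot be recovered by subtraction from row and column totals.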
CONTENT OF CHAPTER
Most of the panel members have had limited experience with data on organizations, and our time and resources did not permit a systematic review of all the confidentiality issues associated with such data. We therefore chose case studies as the vehicle for identifying and exploring selected issues that have come to our attention. The remainder of the chapter consists of four case studies. For each case study, we provide the relevant background information, discuss the main issues, and present our findings and recommendations.
THE ENERGY INFORMATION ADMINISTRATION VS. THE DEPARTMENT OF JUSTICE: A FURTHER EROSION OF FUNCTIONAL SEPARATION
Early in 1990, the Antitrust Division of the Department of Justice began an investigation into the sharp increases in the prices of home heating oil and other fuels that had occurred as a result of unusually cold weather in December 1989. Later in the year, the division also started an investigation of the increases in gasoline prices that followed the Iraqi invasion of Kuwait in August.
To aid its investigations, the Antitrust Division requested the Energy Information Administration (EIA), Department of Energy, to provide specified data on prices, production, and other variables, some in aggregate form and some for individually named companies or establishments. The Energy Information Administration was willing to provide the aggregate data, but it resisted the request for individual company records on the grounds that release would be contrary to its Policy on Disclosure of Individually Identifiable Energy Information in the Possession of the Energy Information Administration, which had been published in the Federal Register (45(177):59812–59816). The Antitrust Division's request did not meet any of the conditions, set forth in Section E of that policy, under which EIA would disclose individually identifiable information to other federal agencies. Further, EIA pointed out that it had notified survey respondents about its confidentiality policies and that its failure to abide by them could hurt its ability to collect complete, accurate, and timely data in the future.
Although other federal agencies requesting data from EIA had accepted its 1980 policy on disclosure, the Antitrust Division did not. After prolonged negotiations, the matter was referred to the Office of Legal Counsel in the Department of Justice, which ruled on March 20, 1991, that the Federal Energy Administration Act of 1974 (P.L. 93–275), one of the statutes on which the EIA policy on disclosure was based, required that EIA produce the information requested by the Antitrust Division. Subsequent to this ruling, the Energy Department brought the issue to the attention of Counsel to the President, C. Boyden Gray. Gray, although sympathetic to a prospective solution to EIA's problem, did not feel he could recommend to the President that he direct the Justice Department to withdraw its request in this instance. In the ensuing months, there were protracted negotiations on the precise nature of the data to be released. Eventually, the Justice Department closed the two oil-pricing investigations after receiving some aggregate data, but no company-specific proprietary data, from EIA. The Justice Department, however, has indicated that it believes it is legally entitled to such data and will seek to obtain them if appropriate in future investigations (General Accounting Office, 1993).
With a few exceptions, the provisions of EIA's 1980 policy on disclosure had permitted disclosures of individually identifiable data to federal agencies outside the Energy Department only for statistical purposes. The March 1991 Justice Department ruling meant that this portion of the EIA policy on disclosure could no longer be sustained; identifiable records would have to be released to any agency that insisted on having them.
In August 1991, EIA sent a letter to all of the roughly 25,000 respondents to the surveys affected to notify them of the forced change in its policy. As of January 1992, EIA had received approximately 50 letters and 200 telephone calls with complaints or inquiries. By the end of 1992 there had been no observable effects on the completeness and timeliness of responses to the surveys. However, the real test will come if individually identifiable data are turned over to the Justice Department and the disclosures are publicized in trade journals or other media.
Is EIA a Statistical Agency?
The Committee on National Statistics (National Research Council, 1992b:2) has proposed the following definition of a federal statistical agency: "a unit of the federal government whose principal function is the compilation and analysis of data and the dissemination of information for statistical purposes." The committee also states that a federal statistical agency, to be effective, must protect the confidentiality of individual responses and must not disclose identifiable information for administrative, regulatory, or enforcement purposes.
The principal mission of EIA is statistical, and thus the agency may be said to meet the basic definition proposed by the Committee on National Statistics, but there are significant exceptions. The Department of Energy Organization Act (P.L. 95–91) requires EIA to gather data in support of regulatory and program needs of the Department of Energy. For those surveys that collect data already available to the public and for those that collect data for regulatory purposes, completed report forms are available for inspection by the public, and respondents are informed of this fact.
Clearly, EIA lacks the kind of confidentiality policy that the Committee on National Statistics considers essential for an effective federal statistical agency. Individually identifiable information can be disclosed, and is sometimes required to be disclosed, for nonstatistical purposes. The agency's 1980 policy on disclosure allows disclosures for nonstatistical purposes to other units of the Department of Energy, the Congress, the General Accounting Office, and, when the information does not come under the exemptions in the Freedom of Information Act, to the public. Thus, the 1991 ruling by the Justice Department's Office of Legal Counsel was not a first breach in an otherwise airtight policy of functional separation of data. Rather, it might be looked at as another hole in an already leaky dike.
Officials in EIA are quite concerned about their inability to guarantee full confidentiality of information that they collect or acquire by other means for statistical purposes. They have tried more than once to obtain legislation to remedy the situation, but they have yet to succeed. In the meantime, because it cannot guarantee confidentiality, EIA has been unable to get access to other statistical agencies' lists of companies and establishments for use in its own surveys, and it must develop and maintain completely independent lists, at considerable cost.
Did EIA Mislead Respondents?
Most of EIA's survey respondents are companies or establishments, and their response to most of its surveys is mandatory. Following final adoption of the 1980 policy on disclosure, the agency developed a standard notification statement for its mandatory company and establishment surveys, which included information about the conditions under which identifiable information might be released to the public in response to Freedom of Information Act requests, to courts and congressional bodies, and to other federal agencies. The provisions of the policy governing release to other federal agencies were not described in detail in the notification statement; respondents wanting full information had to obtain and study the complex 1980 policy statement in the Federal Register.
One can assume that many respondents did not bother to obtain the policy statement and therefore were not familiar with the conditions under which their individual data might or might not be released to other federal agencies. It cannot be said that respondents in this category were seriously misled; they were merely not fully informed. However, those who were familiar with the policy were confronted, when they received the August 1991 letter from EIA, with a sudden and retroactive change. They had clearly been misled by a statement representing a policy that EIA had believed it could follow but which turned out not to be fully supported by the relevant statutes.
Is the EIA's Problem Unique?
In a sense, every federal statistical agency operates under a different statutory and regulatory framework. However, as discussed in Chapter 5, the agencies may be classified roughly as haves and have-nots with respect to legislation that both allows and requires them to apply the principle of functional separation to all data in their possession. The Energy Information Administration is clearly one of the have-nots and is in an especially awkward position because its statutorily defined mission includes some collection of data for nonstatistical purposes. The Bureau of Labor Statistics (BLS) is also a have-not agency in terms of legal protection of confidentiality, but its mission is more clearly statistical and, as explained in Chapter 5, it has so far been able to maintain de facto functional separation.
The Intermodal Surface Transportation Efficiency Act (P.L. 102–240), passed late in 1991, included provisions to establish a separate Bureau of Transportation Statistics in the Department of Transportation. This legislation has weak confidentiality provisions, clearly putting the new agency in the have-not category with respect to adequate statutory protection of the confidentiality of data obtained for statistical purposes. Legislation, proposed but not passed in 1991, to establish a separate statistical agency in a Department of Environmental Protection had similarly weak confidentiality provisions.
FINDINGS AND RECOMMENDATIONS
Businesses are subject to many kinds of regulations and are also eligible for various benefits. Monitoring compliance with regulations and determining eligibility for benefits require substantial amounts of data for individually identifiable units. Such data are used to make decisions that directly affect individual businesses. They can also be used for statistical analyses, which are sometimes focused directly on the program for which the data were collected and sometimes may be entirely unrelated to it.
Some kinds of business data, however, are of interest only for statistical research and analysis. Such data can best be collected by statistical agencies that have the authority and mandate to ensure that the data are fully protected from disclosure and from any use whatsoever for nonstatistical purposes. The collection of data that have both nonstatistical and statistical uses should be left to programmatic and regulatory agencies. Such data, with identifiers if needed, can be acquired by statistical agencies and used for statistical purposes, but once in the possession of a statistical agency, the data should be given the same confidentiality protection as data collected directly by the agency.
Recommendation 7.1 The principle of functional separation, which the panel endorsed in Recommendation 5.1(a), should apply equally to data for persons and data for organizations.
A similar position was adopted by the Conference of European Statisticians (1991). Its Resolution on the Fundamental Principles of Official Statistics in the Region of the Economic Commission for Europe states that "individual data collected by statistical agencies for statistical compilation, whether they refer to natural or legal persons [such as business organizations], are to be strictly confidential and used exclusively for statistical purposes" (p. 8).
As illustrated by the EIA example, a statistical agency's intention to operate under this principle is not sufficient.
Recommendation 7.2 Legislation that authorizes and requires protection of the confidentiality of data for persons and organizations should be sought for all federal statistical agencies that do not now have it and for any new federal statistical agencies that may be created (see also Recommendation 5.1).
An opposing argument is that, for the sake of efficiency, federal agencies needing data for nonstatistical purposes, especially if related to compliance of businesses with laws and regulations, should be permitted to acquire the data from any agency that has them. Such a policy would seriously threaten the quality of the nation's economic statistics. Businesses, knowing that their census and survey forms could be used for any purpose, would be less inclined to submit complete, accurate, and timely data. One has only to review the history of economic data in the socialist economies over the past 50 years to understand this. Why risk such consequences for the relatively small efficiencies that might be realized by compliance agencies in deciding which businesses should be the targets of their detailed investigations?
The EIA example also illustrates the need to exercise care in the development of statements that notify respondents to mandatory surveys about how their data will be used and who will have access to them.
Recommendation 7.3 Data providers, whether persons or organizations, should have ready access to as much information as they want about the uses of the information they are requested or required to provide to federal statistical agencies. They should be told who will have access to their data in individually identifiable form. Statements of the collecting agency's intentions should be clearly distinguished from statements describing what is authorized and required by statute.
INABILITY TO SHARE BUSINESS LISTS: AN EMBARRASSMENT TO THE FEDERAL STATISTICAL SYSTEM
The question of access to business lists for statistical purposes by federal agencies has a history covering more than half a century. It is a question that has important implications for the cost, quality, and internal consistency of the economic statistics produced by the federal statistical system. In essence, the problem is that there are significant barriers to interagency sharing of business lists among agencies in the decentralized federal statistical system.
Broadly speaking, business lists are lists of companies, establishments, employers, and other kinds of economic units. The lists contain identifiers, such as name, address, and Employer Identification Number, and classifiers, such as a Standard Industrial Classification (SIC) code and size codes based on employment, wages, production, or other measures. A primary statistical use of business lists is the development and maintenance of sampling frames for censuses and surveys. In addition, the lists may be used as a means of achieving uniformity in the classifications, especially SIC codes, that are assigned to the same units by different agencies.
Major producers of economic statistics include federal statistical agencies and operating agencies with statistical units. In the first category, the Census Bureau, the Bureau of Labor Statistics, and the National Agricultural Statistics Service (NASS) have major programs. We also allude in this case study to the use of business lists by the Bureau of Economic Analysis (BEA) and the Energy Information Administration.
In the second category, the Internal Revenue Service (IRS), with its Statistics of Income program, and the Social Security Administration (SSA), with its Continuous Work History Sample, are of major importance. These two agencies maintain extensive business lists in connection with their tax and benefit programs, lists that have great potential value (only partially realized at this time) for statistical applications. We also refer here to the use of business lists by the Small Business Administration's (SBA's) Office of Advocacy, which has developed its own lists for use in surveys and studies of small businesses.
To what extent are business lists currently being shared? The Census Bureau in the early 1970s developed its Standard Statistical Establishment List (SSEL) to serve as a master list for use in all of its economic censuses and surveys. Direct use of the SSEL for intercensal surveys has been limited, however, because of the difficulty of keeping the list current between censuses. The Census Bureau also originally intended that the SSEL would be available for use by other statistical agencies. There have been several attempts, so far unsuccessful with one small exception, to obtain legislation that would make this possible. A report by the Economic Policy Council's (1987) Working Group on the Quality of Economic Statistics recommended that BLS and NASS be administratively designated, by the Office of Management and Budget (OMB), as the "central collection agencies" for nonfarm and farm business lists, respectively. However, this recommendation was never implemented.
The Census Bureau obtains inputs to the SSEL from several sources, especially the administrative lists of the IRS and SSA. Use of tax return information for this purpose is specifically permitted by the "statistical information" exception to the disclosure constraints of the Internal Revenue Code (see Chapter 5). The terms of this exception were negotiated between IRS and the Commerce Department in the 1976 amendments that established the current disclosure policy of the Internal Revenue Code (Title 26 U.S.C.). However, those same provisions prohibit redisclosure, by recipients, of tax return information received for statistical purposes. For a particular establishment or employer whose identity was originally supplied by the IRS, the Internal Revenue Code allows the Census Bureau to contact the taxpayer, and any response returned to the Census Bureau is considered to be data collected under the authority of the Census Bureau (Title 13 U.S.C.) rather than tax return information.
There is little flow of business list information from the Census Bureau to other agencies. Exceptions are the occasional correction or updating of SIC codes on lists provided by other agencies, under the authority of a 1953 opinion issued by Attorney General McGranery (41 Op. A.G. 120), and the release of certain SSEL information to BEA, which has become possible as a result of legislation passed by the 101st Congress (Foreign Direct Investment and International Financial Data Improvements Act of 1990, P.L. 101–533). As part of the same legislation, BEA is required to share with BLS and the Census Bureau selected data on foreign direct investment that it collects from business enterprises.
There are a few other examples of business list sharing, but what is more to the point is the sharing that is not occurring. The National Agricultural Statistics Service shares its farm list information with the Census Bureau in preparation for each quinquennial census of agriculture, but Title 13 prohibits any reverse flow of information to NASS for use in its extensive program of current agricultural surveys. The Census Bureau uses farm tax return lists from the IRS as a major source of its sampling frame for the agricultural censuses, but the provisions of the Tax Reform Act of 1976 (P.L. 94–455) do not permit IRS or the Census Bureau to share the same lists with NASS. The result has been to increase substantially the cost to NASS of developing and maintaining the lists of farms that it needs for its data collection programs.
For a long time there had been no sharing of data for individual establishments or other units in either direction between the Census Bureau and BLS, the two agencies with the most extensive programs of economic surveys for the nonagricultural sectors. However, changes are occurring. Late in 1990, in preparation for the 1992 economic censuses, the Census Bureau submitted a request to OMB for approval of several classification surveys to improve industry classification for new businesses. The OMB, within its authority under the Paperwork Reduction Act of 1980 (P.L. 96–511), denied the request by the Census Bureau to collect industry classification information because, in its view, such collection would duplicate surveys already conducted by the BLS. The OMB proceeded to work with BLS, the Census Bureau, and the IRS to resolve some legal questions about the sharing of business list information. Terms negotiated with the three agencies were incorporated in an order (authorized under 44 U.S.C. 3510) directing limited sharing of the data (see MacRae, 1990). A formal interagency agreement between BLS and the Census Bureau (signed by Barbara E. Bryant, Census Bureau, and Janet L. Norwood, BLS, April 19, 1991) implementing the OMB order was negotiated with the assistance and support of OMB. The matching of BLS and Census Bureau records, based primarily on Employer Identification Numbers, is being carried out at BLS by BLS employees who are also special sworn employees of the Census Bureau. The emphasis in the list sharing is on the transfer of BLS's SIC codes to the Census Bureau. Under the McGranery opinion cited above, the Census Bureau's SIC codes could be transferred to BLS for units already on BLS lists, but BLS has not asked for the Census Bureau codes. It is not likely that information for unmatched units on either agency's list can be transferred to and used by the other agency. However, the agreement includes research to evaluate such discrepancies in coverage with a view to developing a coordinated data collection strategy, such as using a jointly sponsored data collection program, to resolve them.
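The record matching described above can be sketched in a few lines. This is a simplified illustration, not the agencies' actual procedure: it assumes one record per Employer Identification Number (in practice one employer may operate several establishments), and the field names, EINs, and SIC codes are hypothetical. In keeping with the limits described above, only the SIC code moves across for matched units, and unmatched units are merely counted.

```python
def transfer_sic_codes(census_list, bls_list):
    """Copy BLS SIC codes onto Census records that share an EIN.

    Each record is a dict with "ein" and "sic" keys (hypothetical layout).
    Returns (updated_census_records, n_unmatched_census, n_unmatched_bls).
    """
    bls_by_ein = {rec["ein"]: rec for rec in bls_list}
    updated = []
    n_unmatched_census = 0
    for rec in census_list:
        match = bls_by_ein.get(rec["ein"])
        if match is None:
            # No BLS counterpart: keep the Census record unchanged.
            n_unmatched_census += 1
            updated.append(dict(rec))
        else:
            # Matched unit: carry over only the BLS industry code.
            updated.append({**rec, "sic": match["sic"]})
    census_eins = {rec["ein"] for rec in census_list}
    n_unmatched_bls = sum(1 for rec in bls_list if rec["ein"] not in census_eins)
    return updated, n_unmatched_census, n_unmatched_bls


# Hypothetical sample lists.
census = [
    {"ein": "00-0000001", "sic": None},
    {"ein": "00-0000002", "sic": "5411"},
    {"ein": "00-0000003", "sic": None},
]
bls = [
    {"ein": "00-0000001", "sic": "2011"},
    {"ein": "00-0000002", "sic": "5812"},
    {"ein": "00-0000009", "sic": "7372"},
]

updated, census_only, bls_only = transfer_sic_codes(census, bls)
print(updated)
print("unmatched Census units:", census_only, "unmatched BLS units:", bls_only)
```

The counts of unmatched units correspond to the coverage discrepancies that the interagency agreement proposes to study.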
The EIA and the SBA's Office of Advocacy require general and specialized business lists for most of their economic surveys, but neither has access to the Census Bureau's SSEL, BLS lists developed in connection with the federal-state Unemployment Insurance program, or the business lists that could be developed from the IRS/SSA administrative systems. Previous efforts to make the SSEL available to other agencies for statistical use have excluded EIA and SBA's Office of Advocacy because, as explained above for EIA, neither agency had the kind of statutory provisions that would guarantee that it could protect the shared list information from all access for nonstatistical purposes.
The proposal for improving the quality of economic statistics issued by Chairman Michael Boskin of the Council of Economic Advisers (1991:6) included as one of seven major elements the development of legislation to permit "limited sharing of confidential statistical information solely for statistical purposes between statistical agencies under stringent safeguards." Such legislation, as now envisioned, would provide a statutory basis for sharing business lists and other kinds of data among four major federal statistical agencies: the Bureau of Economic Analysis, Bureau of Labor Statistics, Census Bureau, and the National Agricultural Statistics Service. As of early 1993, the legislation had not yet been introduced.
PRACTICES IN OTHER COUNTRIES
The panel reviewed the policies and practices of several other developed countries with respect to statistical and other uses of business lists established and maintained by government statistical agencies. The review was based mainly on eight papers presented at two meetings of the International Roundtable on Business Survey Frames, an informal international group of government statisticians, which has met annually since 1986 to discuss statistical and other uses of business lists. Relevant papers from the 1989 meeting covered the business list confidentiality and access policies of Australia, France, Japan, the Netherlands, Sweden, and the United Kingdom. Papers from the 1990 meeting covered Finland and New Zealand. We obtained information about Canadian policies directly from Statistics Canada.
The nine countries whose policies the panel reviewed vary widely as to who may have access to business lists and the purposes for which they may be used. Insofar as we could determine, Finland, France, and Sweden place no restrictions on access to and uses of basic list information other than payment of fees and, in some instances, prohibition of release to third parties. The most restrictive country is Japan, which allows access to its complete list only by other units of national and local government, solely for statistical purposes. Australia and the United Kingdom join Japan in allowing access only to other units of government, but they permit some kinds of nonstatistical uses. The Netherlands allows disclosure to other units of government and specified types of nongovernment entities, solely for statistical purposes. New Zealand makes list information available to any type of organization, but it prohibits release to third parties and some types of nonstatistical uses.
In Canada, the Statistics Act allows the disclosure of lists of businesses, by order of the chief statistician, as an exception to the general prohibition against disclosure contained in the law. A committee reviews requests for such information, using criteria set out in an internal Statistics Canada policy, and makes recommendations to the chief statistician, who has the discretion to grant or deny each request. For each request, the review committee considers the proposed uses of the lists and their potential impact. Lists may be released for the collection of statistical data if the proposed survey uses acceptable methodology, does not duplicate information already collected, and does not appear to jeopardize respondents' continued cooperation with Statistics Canada. Lists may also be released to assist data users in the analysis or interpretation of data, and for that purpose, they are sometimes included in industry publications, particularly for the manufacturing sector. Lists may include any or all of the following information: names and addresses; telephone numbers for statistical inquiries; official language preferred for statistical inquiries; services provided and products produced, manufactured, processed, transported, stored, purchased or sold; and size, expressed in terms of an employment size range (letter from Ivan P. Fellegi, chief statistician of Canada, to panel, January 26, 1993).
Three of the nine countries give units included in their lists the option to have their names excluded from some kinds of releases. The United Kingdom, in connection with one of its economic surveys, requests permission from manufacturers for their inclusion in a directory that is published at five-year intervals. Less than half of the units, in terms of employment, agree to have their information included. In Finland and the Netherlands, units may request to have their information excluded from any releases of directory information to other organizations. Neither country provided information on the number of such requests.
The most striking finding of this review was that none of the nine countries has business list policies as restrictive as those currently followed in the United States. All of the countries reviewed allow, at a minimum, access to the government's business lists by all units of national government (and generally local government units as well) for statistical purposes. Some allow unrestricted access to the lists for any purpose.
FINDINGS AND RECOMMENDATIONS
In 1939, the U.S. Central Statistical Board proclaimed the need for "a United States Business Directory or Official Mailing List which will show the name, address, and industrial classification of each important business enterprise" (Bureau of the Budget, 1961:1). Over the intervening years, other organizations and advisory groups too numerous to mention have recommended more sharing of business lists (see, e.g., American Statistical Association, 1980; Economic Policy Council, 1987), but until quite recently the trend has been in the opposite direction.
There is little doubt that significant savings and improvements in the quality and comparability of the economic data produced by BLS, the Census Bureau, and NASS could be realized if all three agencies had full access to the IRS/SSA administrative lists and to each other's lists. Further gains would accrue if other agencies that conduct economic surveys could be brought into the system.
The panel commends OMB's Statistical Policy Office for the steps it has been taking to promote limited sharing of business list information between BLS and the Census Bureau and to develop legislation that will permit further sharing of business lists, as recommended in the Boskin initiative.
Recommendation 7.4 There should be increased sharing of business lists for statistical purposes by federal and state agencies.
Detailed business lists, especially at the establishment level, that are developed by federal agencies for statistical uses should be protected against nonstatistical uses. Hence, federal agencies should have access only if they can guarantee such protection. Two potential statistical users, EIA and the SBA's Office of Advocacy, are currently unable to meet this requirement.
Recommendation 7.5 New legislation on sharing of business lists for statistical purposes should provide that government agencies that are now unable to guarantee protection against nonstatistical uses can have access to business lists if they acquire statutory authority for such protection in the future.
WAIVERS: WHOSE INFORMATION IS IT?
Statistical agencies sometimes request permission from survey respondents to use the latter's information in ways that depart from standard agency policies for the protection of confidentiality. For example, an agency might wish to
transfer individually identifiable information for an organization to another agency for a statistical purpose,
release tabulations without application of some of the masking techniques that would usually be used, or
include identification information for respondents in a published directory of organizations.
Agencies may seek waivers from respondents for such purposes, either because the proposed uses of the latter's data would not usually be permitted by applicable statutes or regulations or because the uses would be contrary to announced agency policies. In terms of fair information principles, the process of requesting waivers, if carried out according to accepted procedures for informed consent, allows respondents greater control over how information about them is used.
Organizations may benefit in some ways from granting waivers requested by statistical agencies. For example, if a waiver permits two agencies to share an organization's data for statistical uses, the organization will not have to provide it to each one separately. If the waiver allows an agency to publish a tabulation with a production or sales data cell that is dominated by the organization, the latter may be in a better position to determine its market share. Below we present several examples of situations in which federal statistical agencies have asked organizations to waive confidentiality protections for their data.
In detailed tabulations of economic survey data, it is common for one or two data providers to dominate a single data cell. For example, one establishment in a county or state may account for a large proportion of total employment, payroll, production, or some other variable. As described in Chapter 6, most agencies have policies that suppress or mask such data cells when one or two units account for more than a specified proportion of the total. At least three federal statistical agencies sometimes seek waivers that will allow them to include such data cells, without the usual suppression or masking, in their publications. The National Agricultural Statistics Service has a formal standard for its state offices to use for obtaining permission from respondents in instances in which there are only one or two respondents in a cell or one respondent accounts for more than 60 percent of the value to be published. Written permission is required and must be updated every five years (National Agricultural Statistics Service, 1989). Consideration is being given to updating the permission every two or three years.
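The NASS standard described above amounts to a simple dominance test on the contributions to a data cell. The following is a minimal sketch of such a test in Python, not the agency's actual procedure. The function name `needs_permission`, the parameter `share_limit`, and the handling of empty or zero-total cells are assumptions made for illustration.

```python
from typing import Sequence


def needs_permission(values: Sequence[float], share_limit: float = 0.60) -> bool:
    """Return True if a cell would require respondent permission to publish.

    `values` holds one contribution per respondent in the cell.
    The 60 percent default follows the NASS standard cited in the text;
    everything else here is an illustrative assumption.
    """
    n_respondents = len(values)
    if n_respondents == 0:
        # Assumption: an empty cell has nothing to disclose.
        return False
    if n_respondents <= 2:
        # Only one or two respondents in the cell.
        return True
    total = sum(values)
    if total <= 0:
        # Assumption: a zero-total cell cannot be dominated by anyone.
        return False
    # One respondent accounts for more than the share limit of the total.
    return max(values) / total > share_limit
```

Under these assumptions, a cell with contributions of 70, 20, and 10 would require permission because the largest respondent accounts for 70 percent of the total, whereas a cell with contributions of 40, 30, and 30 would not.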
The Bureau of Labor Statistics has a cooperative program with state employment security agencies for the collection of periodic occupational employment statistics data. Frequently, a single company will account for a high proportion of the persons in particular occupations in its area. In such instances the state agency conducting the survey seeks waivers to allow publication of the affected data cells.
The Census Bureau has a Current Industrial Reports program for periodic collection, from manufacturers, of intercensal data on the production of a large number of specifically defined commodities. For some commodities, one or two manufacturers may dominate total production, even at the national level. The Census Bureau uses a waiver procedure to obtain permission from survey respondents to publish data for the affected commodities.
Until recently, the Census Bureau also used waivers for a different purpose: to share individually identifiable data about cotton ginning operations with NASS. As described in a July 18, 1990, memorandum to the panel from Frederic A. Vogel, an official of NASS,
in the past, NASS has had access to individual gin reports to compile data for the monthly cotton production forecast. This data sharing activity was done with the concurrence of the cotton gins so they could eliminate duplicate reporting.
However, OMB's legal counsel ruled in 1990 that Census Bureau employees may not release individually identifiable information collected under Title 13, even when waivers have been obtained. It was OMB's position, based on the legislative history, that the provision of Title 13 (§ 8(a)) that permits transfer of copies of reports to authorized agents was intended to apply only to special situations involving a few individual respondents, not to large-scale transfer of records from a particular survey. Further, OMB counsel argued, the right to confidentiality under Title 13 constitutes a public right (as opposed to a private right) that cannot be waived by the respondent. In this particular instance, the difficulty was resolved by transferring responsibility for the cotton ginning survey program from the Census Bureau to NASS, which is not governed by Title 13.
FINDINGS AND RECOMMENDATIONS
The panel finds it somewhat incongruous that OMB's interpretation of Title 13 prevents the Census Bureau from using a waiver procedure to share data with NASS for statistical purposes, but that the Census Bureau is able to use a waiver procedure in its Current Industrial Reports program to permit the release of data cells whose publication would usually be contrary to the confidentiality provisions of Title 13. We believe that the use of waiver procedures for the kinds of statistical purposes illustrated in this section should be permitted, provided the consequences of granting waivers are clearly explained to respondents and they are not put under any kind of pressure to grant the permission requested.
Recommendation 7.6 The Office of Management and Budget's Statistical Policy Office should develop uniform guidelines for federal statistical agencies covering the purposes for which waivers of confidentiality protections by organizations are considered acceptable and the methods of obtaining waivers from respondents. Efforts should be made to amend the confidentiality statutes of federal statistical agencies that would otherwise be prevented from using waivers for generally accepted statistical purposes.
With respect to waivers for the publication of data cells dominated by one or two large organizations, there may be some circumstances in which smaller organizations contributing to the same cell, especially if they are few in number, should also be asked for permission to publish the data for that cell. If there are only one or two smaller organizations, they may not want their large competitors to have more precise information about them than would usually be available. The written policies we reviewed did not include any provision for preventing that.
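The concern can be made concrete with a little arithmetic: any contributor to a published cell can subtract its own value from the published total to learn the combined value of the remaining contributors. The sketch below, in Python, illustrates that calculation; the function name `residual_for` and the figures in the usage note are hypothetical and are not drawn from any agency's data.

```python
from typing import Tuple


def residual_for(own_value: float, published_total: float,
                 n_contributors: int) -> Tuple[float, int]:
    """Return what one contributor can infer about the rest of the cell.

    The result is (combined value of the other contributors,
    number of other contributors). When only one other contributor
    exists, the combined value is that contributor's exact figure.
    """
    return published_total - own_value, n_contributors - 1
```

For example, if a cell total of 1,000 is published with a waiver from a dominant firm that contributes 900, and there is only one other contributor, the dominant firm learns that its competitor's value is exactly 100. With two smaller contributors, the dominant firm learns only their combined value, but each smaller firm can then work out the other's figure from its own.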
USER ACCESS: GETTING A BETTER RETURN ON INVESTMENTS IN ECONOMIC STATISTICS
As we pointed out in the introduction to this chapter, organizations vary widely on many important characteristics, and thus individual organizations are often easily recognizable on the basis of a few data items and classifiers, especially if their geographic location is given. Consequently, federal statistical agencies have been unable, with few exceptions, to issue public-use microdata sets containing individual records, minus explicit identifiers, for companies, establishments, employers, and other organizational entities. Even for aggregate data on organizations, the same considerations restrict the amount of detail by location, type of economic activity, and other classifiers that can be published.
Substantial benefits have been realized by data users, statistical agencies, and society as a whole as a result of the wide dissemination of public-use microdata sets on persons. Comparable returns on investment in data collection have not been realized from the resources devoted to statistics on businesses and other organizations. Those data are an underutilized resource.
The same constraints carry over to hierarchical files containing data on persons and organizations. As explained in Chapter 6, an important reason why microdata from the Continuous Work History Sample, a 1 percent longitudinal sample of persons issued Social Security numbers, are no longer widely available to researchers is the concern that some large employers could be identified fairly easily on the basis of their industry classifications and geographic locations. Thus, employers having access to the file might be able to identify their own employees who were in the sample and learn about their work histories and current second jobs. Similar considerations apply to data from surveys conducted by the National Center for Education Statistics in which data are collected simultaneously for students, staff, and educational institutions.
In Chapter 6, we described several forms of restricted access to federal statistical data that are provided for external users: American Statistical Association/National Science Foundation (ASA/NSF) fellowships that allow researchers to work with data at federal agencies; remote on-line access with query restrictions, as in the Luxembourg Income Study; release of encrypted microdata in CD-ROM format; and various types of licensing agreements that provide access for users at their work sites, but place restrictions on the uses that can be made of the data and often include penalties for failure to abide by the terms of the agreement.
To some extent, these kinds of arrangements have been providing greater access to data for organizations over the past few years. Some of the ASA/NSF fellows, including those who have worked at the Census Bureau's Center for Economic Studies, have had access to microdata for establishments and enterprises. Through contracts for joint research studies, the Center for Economic Studies has also provided access to such data, at the Census Bureau, to several researchers from nonprofit organizations. McGuckin (1992), in a discussion paper issued by the Center for Economic Studies, proposes that some economic microdata sets be made available to researchers, working as special sworn Census Bureau employees, at the Census Bureau's regional offices. He also recommends a broad interpretation of the requirement that research studies relying on this mode of data access be of joint interest to the Census Bureau and the researchers.
The National Agricultural Statistics Service and the Economic Research Service of the Department of Agriculture have established a research enclave that makes it possible for researchers to have limited access, in Washington, D.C., to microdata on farms from the two agencies' annual Farm Costs and Returns Survey. The National Agricultural Statistics Service has also developed administrative procedures that allow some researchers to have restricted access to statistical data at its state offices. The National Center for Education Statistics has become an active proponent of dissemination of data by means of encrypted CD-ROM diskettes and licensing agreements, and some of the data released in those ways are for schools.
Despite these developments, external users still have difficulty obtaining access to federal statistical data on organizations. This is particularly true for data on nonagricultural establishments and other economic units, for which many of the data sets that would be of most interest to researchers are maintained by the Census Bureau.
FINDINGS AND RECOMMENDATIONS
The panel's general findings and recommendations about procedures for giving external users restricted access to federal statistical data were presented in Chapter 6. In brief, we expressed our belief that a greater return on public investment in statistical programs would be possible through carefully controlled expansion of the availability of federal data sets to external users. We encouraged statistical agencies to develop and use some of the newer data dissemination techniques, such as the use of encrypted CD-ROM diskettes and licensing agreements, with appropriate confidentiality safeguards and periodic reviews of costs and benefits.
We believe there is a need for substantially expanded user access to federal statistical data about organizations, especially business establishments and other economic units.
Recommendation 7.7 Federal statistical agencies that collect data on organizations should make a special effort to improve access for statistical research and analysis by external users and, if necessary, should seek legislation that will permit them to develop licensing arrangements that allow such users to have access at their work sites, subject to penalties for violating the conditions under which they are allowed access to the data.