5 Confidentiality, Disclosure, and Data Access

In contrast to the usual situation in federal government survey data collections (in which the data are available for statistical use but are protected from being used for compliance and enforcement purposes), data on equal employment opportunity (EEO) issues are available for compliance purposes but are closely held and not often made available for research and statistical analysis purposes. This anomalous situation poses interesting challenges to the U.S. Equal Employment Opportunity Commission (EEOC) and the other federal agencies that have responsibility for the data collected from public- and private-sector employers and unions for antidiscrimination enforcement purposes.

In addition to internal EEOC compliance and analytical uses, the data collected from employers have value to other federal and state agencies for their compliance and analytical purposes, to researchers to support analysis of discrimination practices, and to those who evaluate the effectiveness and efficiency of antidiscrimination programs. These uses outside of EEOC require the agency to develop practices and procedures to protect the data that are collected from employers under a pledge of confidentiality.1

1 That pledge derives from Title VII, Section 709(e) of the Civil Rights Act of 1964, which sets the requirements for confidentiality: It shall be unlawful for any officer or employee of the Commission to make public in any manner whatever any information obtained by the Commission pursuant to its authority under this section prior to the institution of any proceeding under this subchapter involving such information. Any officer or employee of the Commission who shall make public in any manner whatever any information in violation of this subsection shall be guilty of a misdemeanor and upon conviction thereof, shall be fined not more than $1,000 or imprisoned not more than one year.
COLLECTING COMPENSATION DATA FROM EMPLOYERS

In this chapter we discuss current EEOC procedures for protecting confidential employer data in tabular and microdata form, evaluate the effectiveness of those measures, and suggest possible enhancements to those measures.

STATISTICAL PROTECTION OF TABULAR DATA AND MICRODATA

As discussed in Chapter 1, the EEOC now publishes a large amount of data that are derived from the collection of information from employers, both private and public. These data are generally published in aggregated form by geographic area and industry group detail in standard tabular packages that are posted on the EEOC website and otherwise made available to the public. To comply with the confidentiality provisions of Title VII that govern release of individually identifiable information from EEO-1 reports (see Chapter 1), the tables are assembled under reportedly elaborate but unpublished rules that provide for suppression of data that could identify a particular establishment or multi-establishment firm.

In releasing aggregated data of private employers collected from annual EEO-1 surveys, the EEOC uses a data suppression rule that is quite similar to the rule used by other federal government agencies for statistical data based on information collected from employers, including the Quarterly Census of Employment and Wages (QCEW) Program from the Bureau of Labor Statistics (BLS).2 The EEOC suppression rule is triggered when either of two primary suppression conditions is met: (1) the group has three or fewer employers, or (2) one employer makes up at least 80 percent of the group employment in the aggregate.

In applying the suppression rules to an industry group, a geographic entity, or any combination of aggregates, the EEOC withholds a group’s numbers if the group (an industry, a geographic entity, an industry-by-geography group, etc.) contains fewer than three firms (a firm being represented by the presence of any number of its establishments within the group) or if any one firm in the group (represented by the total for all of that firm’s establishments within the group) constitutes more than 80 percent of the group totals.

Unlike some other federal agencies, EEOC does not withhold aggregated data beyond its two primary suppression rules. There are no secondary suppression rules, and the agency does not further screen the aggregated data if the data have passed the fewer-than-three rule test. But although EEOC literature documents the above rules, as a general practice EEOC does not disclose the detailed methodology for suppression because the agency wants to prevent users from reverse-engineering the data in order to obtain the suppressed numbers.

2 For more information on suppression, see http://www.bls.gov/opub/hom/homch5_d.htm#Presentation [December 2011].

Cell suppression is just one means of protecting tabular data. Because there is always a risk of secondary disclosure, other means have been explored in recent years by U.S. government agencies to protect data by perturbing the data in some way (see Reznek, 2006, p. 3). Two methods are discussed here: adding noise and controlled tabular adjustment.

Noise addition is accomplished by adding random “noise” to the underlying establishment-reported data before they are tabulated. In this data perturbation method, cell values that would normally meet the criteria for suppression are changed by a large amount, while cell values that are not as sensitive are changed by a smaller amount. This technique is less complicated than cell suppression, and, by adding noise, an agency can show data for all cells and for all tables, which preserves the ability to draw inferences from all cells.3

Another effort to preserve the analytical value of protected sensitive data is being developed using a controlled tabular adjustment technique. In this technique, a sensitivity rule determines which cells are sensitive, and the technique replaces each sensitive value with a safe value that is some distance away from the sensitive value. To preserve additivity, the nonsensitive values are minimally adjusted (Reznek, 2006, p. 5).

Another increasingly popular technique that is intended to make data available for research and analytical purposes is to generate synthetic data: for generation of synthetic microdata, see Reiter (2005); for generation of synthetic tables, see Slovkovic and Lee (2010). This technique relies on sampling and simulations.
Typically, a model is developed to generate synthetic or partially synthetic data that have some of the same properties as the original data by sampling from the posterior predictive distribution of the confidential data. A typical method would be to use a sequential regression imputation. In this procedure, the original value of each variable is blanked out and replaced by a model-generated value. The technique has been used at the Census Bureau to develop a synthesized microdata file linking Social Security Administration earnings data with data from a Census Bureau demographic survey (Reznek, 2006, p. 6).

Creating publicly available data products that are statistically valid and in which confidential data are protected is a complicated process.

3 The noise addition technique is currently being used by the Census Bureau to protect confidential microdata from the Longitudinal Employer-Household Dynamics (LEHD) Program used in the Quarterly Workforce Indicators, which use, as inputs, sensitive data from unemployment insurance wage records and Census Bureau demographic and economic information (Abowd et al., 2006).
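The two primary suppression conditions and the noise addition alternative described in this section can be illustrated with a short Python sketch. It is not the actual procedure of EEOC or the Census Bureau, whose detailed methods are not published in this chapter. The cutoffs (fewer than three firms, more than 80 percent), the multiplicative noise model with its 5 to 15 percent range, the function names, and the sample records are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Assumed cutoffs. The chapter states the rule in two slightly different
# ways, so the exact boundary cases chosen here are illustrative only.
MIN_FIRMS = 3      # a cell with fewer firms than this is withheld
MAX_SHARE = 0.80   # a cell is withheld if one firm exceeds this share


def firm_totals(records):
    """Sum establishment employment into {cell: {firm: total}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for cell, firm, employment in records:
        totals[cell][firm] += employment
    return totals


def is_sensitive(by_firm):
    """Check the two primary suppression conditions for one cell."""
    if len(by_firm) < MIN_FIRMS:
        return True
    total = sum(by_firm.values())
    return total > 0 and max(by_firm.values()) / total > MAX_SHARE


def publish_with_suppression(records):
    """Return {cell: total}, with None standing in for a withheld cell."""
    return {
        cell: None if is_sensitive(by_firm) else sum(by_firm.values())
        for cell, by_firm in firm_totals(records).items()
    }


def publish_with_noise(records, rng, low=0.05, high=0.15):
    """Perturb each firm's data by a random factor, then tabulate all cells.

    Each firm gets one multiplier that is between `low` and `high` away
    from 1, in a randomly chosen direction (a hypothetical noise model).
    """
    factors = {}
    totals = defaultdict(float)
    for cell, firm, employment in records:
        if firm not in factors:
            pct = rng.uniform(low, high)
            factors[firm] = 1 + pct if rng.random() < 0.5 else 1 - pct
        totals[cell] += employment * factors[firm]
    return dict(totals)


# Made-up establishment records: (cell, firm, employment).
records = [
    ("retail", "A", 600), ("retail", "A", 300),  # firm A: two establishments
    ("retail", "B", 50), ("retail", "C", 50),    # A holds 90% of the cell
    ("mining", "D", 120), ("mining", "E", 80),   # only two firms
    ("manufacturing", "F", 300), ("manufacturing", "G", 300),
    ("manufacturing", "H", 400),                 # three firms, none dominant
]

suppressed = publish_with_suppression(records)
noisy = publish_with_noise(records, random.Random(0))
```

With these sample records, suppression withholds the "retail" and "mining" cells and publishes only "manufacturing", whereas the noise version publishes a perturbed total for every cell. Because firm A dominates "retail", that cell's total must shift by at least 3 percent under the assumed noise range, while offsetting factors can leave a cell with no dominant firm almost unchanged. This is one way to see why more sensitive cells tend to be changed by larger amounts.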
The best procedure to use depends on the type of data and their intended purposes, as well as on the risks of disclosure. For an overview of current statistical disclosure limitation practices in the United States, see Federal Committee on Statistical Methodology (2005).

Many new techniques are being developed. The most recent ones combine techniques from statistics and computer science and aim to account for increased disclosure risk due to the presence of more externally available information and better record linkage technologies. Recent advances in data redaction strategies and data sharing, which include, among others, virtual research data centers, remote access servers, privacy-preserving mechanisms for distributed databases, and differentially private mechanisms, are highlighted in a special 2009 issue of the Journal of Privacy and Confidentiality (Kinney, Karr, and Gonzalez, 2009).

PROTECTING ORIGINAL DATA

EEOC Procedures

The actual, original data collected from the forms that employers submit to EEOC are now shared with the Office of Federal Contract Compliance Programs (OFCCP) of the U.S. Department of Labor, the Civil Rights Division of the U.S. Department of Justice (DOJ), and 95 state-level fair employment practices agencies (FEPAs). There are other sharing arrangements with the U.S. Department of Education and with researchers. Often these agencies have their own procedures for assuring the confidentiality of the shared data.

The specific arrangements vary in each instance. For example, OFCCP is a statutory member of the Joint Reporting Committee with EEOC for the collection of the EEO-1 reports.
This arrangement is made known in advance to companies that provide their data to the EEOC.4 According to protocols that are in place for the Joint Reporting Committee, EEOC collects the data, edits them as needed, appends some additional identifiers to the records, and transmits a copy of the entire statistical file to OFCCP.

4 The EEO-1 instruction booklet (p. 1) states: In the interests of consistency, uniformity and economy, Standard Form 100 has been jointly developed by the Equal Employment Opportunity Commission and the Office of Federal Contract Compliance Programs of the U.S. Department of Labor, as a single form which meets the statistical needs of both programs.

The DOJ Civil Rights Division is a member of a joint state and local reporting committee with EEOC for the collection of EEO-4 reports (see Chapter 1). As it does with the EEO-1 data, EEOC collects the data and at the conclusion of the survey forwards a copy of the EEO-4 statistical file to DOJ.5 It also transmits copies of the actual individual EEO-4 reports directly to DOJ officials, allowing immediate access to reports during the reporting period, as well as access to historical data.

FEPAs are state or local authorities that investigate and resolve charges of employment discrimination filed under Title VII, the Americans with Disabilities Act (ADA), the Age Discrimination in Employment Act (ADEA), and comparable state laws and local ordinances in partnership with EEOC. Over the years, EEOC has negotiated work-sharing agreements with these agencies that allow the sharing of data. EEO-1 data are shared routinely in a charge tracking system that EEOC provides, which enables the FEPAs to retrieve the reports and run statistical comparisons. Other data are shared on an ad hoc basis.

Under the auspices of a school reporting committee, the EEOC shares EEO-5 data (see Chapter 1) with DOJ and the U.S. Department of Education. Statistical files are shared with both agencies. Specific requests for EEO-5 data are also honored, most often for DOJ.

From time to time, EEOC has entered into agreements with other federal agencies to allow the sharing of survey data. Currently, the only active agreement is with DOJ to share EEO-1 data. The memorandum of understanding (MOU), discussed below, spells out strict provisions for the protection of the confidentiality of the data.

The EEOC has also historically entered into agreements with individual researchers to allow the sharing of data: see Box 5-1. This has been a practice of the EEOC since 1969, when EEOC entered into an agreement with Eleanor Brantley Schwartz of Georgia State University to study women in management. The mechanism for sharing data in a protected environment is quite detailed, complicated, and time consuming, and it relies on giving the potential data user the status of a sworn federal employee.
5 This arrangement is described in the EEO-4 booklet (p. 1): In the interests of consistency, uniformity and economy, State and Local Government EEO-4 is being used by Federal government agencies that have responsibilities for equal employment opportunity. A joint State and Local Reporting Committee, with which this report must be filed, represents those various agencies.

Office of Federal Contract Compliance Programs Procedures

OFCCP confidential data are derived from a “scheduling letter” process in which compliance reviews are initiated and certain documents and data sets are requested. The documents consist of the written Affirmative Action Plan (AAP) for the scheduled facility, certain compensation data, and information on additional personnel practices and policies to demonstrate compliance obligations.
BOX 5-1
Intergovernmental Personnel Act Agreements with Researchers

EEOC has used Intergovernmental Personnel Act agreements that detail outside persons to an employment arrangement to allow the sharing of survey data. These agreements give the researcher the status of a federal employee and access to the data. The researcher signs an agreement that prohibits disclosure of the data to anyone (including professors, advisers, and colleagues), except those persons directly employed by the project. It also requires the researcher to submit any work based on the EEOC information to the EEOC to (a) determine whether it contains any confidential information and (b) approve any language describing the relationship between the researcher and the EEOC. The data are to be returned to EEOC at the conclusion of the project, and all working files are to be certified as destroyed. The penalties for disclosure of confidential data in Title VII are formally transferred to the researcher.

SOURCE: Summary by panel staff of sample EEOC Intergovernmental Personnel Act agreement for external researchers, provided by EEOC staff on November 28, 2011.

Unlike EEOC, OFCCP has no formal data-sharing arrangement with federal or state agencies. Its data sharing occurs on an ad hoc or informal basis, such as when OFCCP refers cases to DOJ or EEOC to pursue enforcement. Sharing can also occur on a very limited basis under the MOU with EEOC. For data collected only by OFCCP, the past instances of data sharing have been infrequent, although additional sharing with EEOC can be foreseen.

Unlike Title VII of the Civil Rights Act of 1964 as amended, Executive Order 11246, which forms the legal basis for OFCCP, is silent on rules and penalties for confidentiality of data from employers. However, confidentiality provisions that cover OFCCP are spelled out in the agency’s regulations (see 41 C.F.R. 60-1.20(f)–(g) and 60-1.43). The regulations essentially state that the disclosure of data to the public is subject to the Freedom of Information Act and the Privacy Act and also to the procedures for preclusion of certain data due to assertion of privileges during litigation.6

6 OFCCP rules were spelled out in the regulation that authorized the collection of the Equal Opportunity Survey (41 C.F.R. 60-2.18(d)). These rules state: (d) Confidentiality. OFCCP will treat information contained in the Equal Opportunity Survey as confidential to the maximum extent the information is exempt from public disclosure under the Freedom of Information Act, 5 U.S.C. 552. It is the practice of OFCCP not to release data where the contractor is still in business and, where the contractor indicates, and through the Department of Labor review process it is determined, that the data are confidential and sensitive and that the release of data would subject the contractor to commercial harm.
The OFCCP approach to data confidentiality is evolving in the direction of greater transparency. An example is a new initiative under the umbrella of the Open Government Directive,7 under which the Department of Labor (DOL) has developed a searchable “enforcement database” covering DOL enforcement agencies, including OFCCP.8 This database is available for viewing by academic researchers, stakeholders, and the public. Users can retrieve data by state or zip code, company name, North American Industry Classification System code, violation, and year. The database divides OFCCP data into two categories: evaluations (compliance reviews) and investigations (complaints). In making these administrative data available for the first time, OFCCP has a policy of limiting disclosed information. For example, it provides only data specific to the facility reviewed and only summary data (yes/no) for violations found, if any. However, it should be noted that the true underlying disclosure risks with such data are not fully understood.

Department of Justice Procedures

As noted above, DOJ’s Civil Rights Division obtains EEO-4 data from EEOC on a regular basis and holds them in confidence as a member of the joint state and local reporting committee. The DOJ uses the EEO-4 data to identify investigations that it believes should be launched, but it does not use the data directly in the investigation, nor are the data directly used in court cases. Instead, DOJ uses the data collected in the process of discovery to support its litigation.

The transmittal of EEO-1 data from EEOC to DOJ is covered by an MOU that was executed in May 2011.9 The MOU calls on EEOC to provide DOJ with data for the most recent reporting period as soon as practicable after the EEOC has reconciled and finalized the statistical file. Historical EEO-1 files are also to be provided. In turn, DOJ agrees to preserve the confidentiality of the data in the same manner that EEOC employees are required to by Title VII of the Civil Rights Act of 1964 as amended.

As one of the steps leading to identification of a possible infringement of EEO laws, the DOJ compares the profiles of the public sector organizations under the agency’s jurisdiction with similar organizations in the private sector, using the EEO-1 data that are obtained from EEOC.

7 White House, Memorandum on Transparency & Open Government, M-10-06. December 8, 2009. See http://www.whitehouse.gov/sites/default/files/omb/assets/memoranda_2010/m10-06.pdf [October 2012].
8 For details, see http://ogesdw.dol.gov [July 2012].
9 U.S. Equal Employment Opportunity Commission, Memorandum of Understanding Between the U.S. Equal Employment Opportunity Commission and the U.S. Department of Justice–Civil Rights Division for Sharing of Employer Information Report (EEO-1) Data, May 12, 2011.

FURTHER PROTECTION OF SHARED EEO DATA

As the above discussion indicates, the EEOC shares sensitive EEO-4 and EEO-1 report data with other agencies in the federal government and with the FEPAs through rather informal arrangements, most of which are not backed by force of law. This practice is in contrast to the usual practice of federal statistical agencies that protect shared data through formal agreements backed by clear legislative authority that is enforced by stern penalties. For EEOC, even when there is an agreement, such as the one with DOJ, to share EEO-1 data, there is no indication that the data are shielded from court challenge or from requests under the Freedom of Information Act when they are shared.

In recent years, a procedure for protecting shared data has been implemented by several federal statistical agencies that might well serve as a model for protecting the EEOC employer data. The Bureau of Labor Statistics, Census Bureau, and Bureau of Economic Analysis can now share confidential data obtained from employers under provisions of the Confidential Information Protection and Statistical Efficiency Act (CIPSEA). This statute, under the umbrella of the U.S. Office of Management and Budget, prohibits disclosure or release, for nonstatistical purposes, of information collected under a pledge of confidentiality. Under this law, data may not be released to unauthorized persons. Willful and knowing disclosure of protected data to unauthorized persons is a felony punishable by up to 5 years imprisonment and up to a $250,000 fine, penalties that are significantly more stringent than those that are enumerated in the Title VII legislation.
The sensitivity of the data that employers provide to EEOC would certainly be heightened if earnings data were added to the EEO data records. Employee compensation data are generally considered to be highly sensitive; they are even considered proprietary information by many private-sector employers. As this chapter points out, EEOC provides data to agencies that do not have the same level of confidentiality protections and are not covered by the same penalties that apply to EEOC employees and to researchers under Intergovernmental Personnel Act (IPA) agreements. Legislation patterned after CIPSEA could increase the protection of confidentiality of EEO data; specifically, it could authorize sharing agreements among EEOC, OFCCP, DOJ, and the state and local FEPAs and extend the Title VII penalties beyond EEOC and its IPA researchers.
Such protection could be expected to increase the willingness of employers to provide detailed employment data. It could also help mitigate concerns of other federal agencies about the matching of the EEO-1 survey records to administrative data (such as those discussed in Chapter 2) if such matching were someday deemed useful to help improve the quality of the data.