5
Confidentiality, Disclosure,
and Data Access
In contrast to the usual situation in federal government survey data
collections, in which the data are available for statistical use but are
protected from being used for compliance and enforcement purposes, data on
equal employment opportunity (EEO) issues are available for compliance
purposes but are closely held and not often made available for research
and statistical analysis purposes. This anomalous situation poses interest-
ing challenges to the U.S. Equal Employment Opportunity Commission
(EEOC) and the other federal agencies that have responsibility for the data
collected from public- and private-sector employers and unions for
antidiscrimination enforcement purposes.
In addition to internal EEOC compliance and analytical uses, the data
collected from employers have value to other federal and state agencies for
their compliance and analytical purposes, to researchers to support analy-
sis of discrimination practices, and to those who evaluate the effectiveness
and efficiency of antidiscrimination programs. These uses outside of EEOC
require the agency to develop practices and procedures to protect the data
that are collected from employers under a pledge of confidentiality.1
1 That pledge derives from Title VII, Section 709(e) of the Civil Rights Act of 1964, which
sets the requirements for confidentiality:
It shall be unlawful for any officer or employee of the Commission to make public in any manner
whatever any information obtained by the Commission pursuant to its authority under this section
prior to the institution of any proceeding under this subchapter involving such information. Any
officer or employee of the Commission who shall make public in any manner whatever any information
in violation of this subsection shall be guilty of a misdemeanor and upon conviction thereof,
shall be fined not more than $1,000 or imprisoned not more than one year.
78 COLLECTING COMPENSATION DATA FROM EMPLOYERS
In this chapter we discuss current EEOC procedures for protecting
confidential employer data in tabular and microdata form, evaluate the ef-
fectiveness of those measures, and suggest possible enhancements to those
measures.
STATISTICAL PROTECTION OF TABULAR
DATA AND MICRODATA
As discussed in Chapter 1, the EEOC now publishes a large amount of
data that are derived from the collection of information from employers,
both private and public. These data are generally published in aggregated
form by geographic area and industry group detail in standard tabular
packages that are posted on the EEOC website and otherwise made avail-
able to the public. To comply with the confidentiality provisions of Title
VII that govern release of individually identifiable information from EEO-1
reports (see Chapter 1), the tables are assembled under reportedly elaborate
but unpublished rules that provide for suppression of data that could iden-
tify a particular establishment or multi-establishment firm.
In releasing aggregated data of private employers collected from annual
EEO-1 surveys, the EEOC uses a data suppression rule that is quite similar
to the rule used by other federal government agencies for statistical data
based on information collected from employers, including the Quarterly
Census of Employment and Wages (QCEW) Program from the Bureau of
Labor Statistics (BLS).2 The EEOC suppression rule is triggered when a
group meets either of two primary suppression stipulations: (1) the group has three or
fewer employers, or (2) one employer makes up at least 80 percent of the
group employment in the aggregate.
In applying the suppression rules to industry group or geography en-
tity or any combination of aggregates, the EEOC withholds any group’s
numbers if the group (an industry or a geography entity or an industry-by-
geography group, etc.) contains fewer than three firms (represented by the
presence of any number of establishment(s) of an individual firm within the
group) or if any one firm in the group (represented by the total numbers of
all the establishment(s) of the same firm within the given group) constitutes
more than 80 percent of the group totals.
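The two primary stipulations can be illustrated with a minimal sketch. It assumes the thresholds as worded in the preceding paragraph (fewer than three firms, or one firm with more than 80 percent of the group total). The function name, record layout, and sample figures are hypothetical and are not drawn from EEOC's unpublished procedures.

```python
from collections import defaultdict


def suppressed_groups(records, min_firms=3, dominance=0.80):
    """Return the set of groups whose totals would be withheld.

    records: iterable of (group, firm_id, employment) tuples, one per
    establishment. Establishments of the same firm within a group are
    summed before the tests are applied, as the text describes.
    """
    firm_totals = defaultdict(lambda: defaultdict(int))
    for group, firm_id, employment in records:
        firm_totals[group][firm_id] += employment

    withheld = set()
    for group, firms in firm_totals.items():
        group_total = sum(firms.values())
        too_few = len(firms) < min_firms
        dominated = group_total > 0 and max(firms.values()) > dominance * group_total
        if too_few or dominated:
            withheld.add(group)
    return withheld


# Hypothetical establishment records: (group, firm, employment)
records = [
    ("A", "f1", 100), ("A", "f1", 50), ("A", "f2", 120), ("A", "f3", 130),
    ("B", "f1", 500), ("B", "f2", 40),
    ("C", "f1", 900), ("C", "f2", 50), ("C", "f3", 30), ("C", "f4", 20),
]
print(sorted(suppressed_groups(records)))  # ['B', 'C']
```

Group B fails the firm-count test and group C fails the dominance test; group A passes both because its two f1 establishments count as one firm holding 150 of 400 employees.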
Unlike some other federal agencies, EEOC does not withhold aggre-
gated data beyond its two primary suppression rules. There are no second-
ary suppression rules, and the agency does not further screen the aggregated
data if the data have passed the fewer-than-three rule test. But although
2 For more information on suppression, see http://www.bls.gov/opub/hom/homch5_d.htm#Presentation
[December 2011].
EEOC literature documents the above rules, as a general practice EEOC
does not disclose the detailed methodology for suppression because the
agency wants to prevent users from reverse-engineering the data in order
to obtain the suppressed numbers.
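The concern about reverse-engineering can be made concrete with a small sketch. When a total is published alongside its component cells and only one component is withheld, the withheld value follows by subtraction. The figures and the function name below are hypothetical.

```python
def recover_suppressed(total, published_cells):
    """Infer a single suppressed cell from a published total.

    published_cells: list of cell values, with None marking a suppressed
    cell. Returns the implied value when exactly one cell is suppressed,
    otherwise None.
    """
    missing = [i for i, value in enumerate(published_cells) if value is None]
    if len(missing) != 1:
        return None
    return total - sum(value for value in published_cells if value is not None)


# Hypothetical area total published with three industry cells, one withheld
area_total = 1250
industry_cells = [400, None, 610]
print(recover_suppressed(area_total, industry_cells))  # 240
```

With two or more cells withheld in the same row, simple subtraction no longer pins down a single value, which is the motivation for the secondary (complementary) suppression used by some other agencies.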
Cell suppression is just one means of protecting tabular data. Because
there is always a risk of secondary disclosure, other means have been
explored in recent years by U.S. government agencies to protect data by
perturbing the data in some way (see Reznek, 2006, p. 3). Two methods are
discussed here: adding noise and controlled tabular adjustment.
Noise addition is accomplished by adding random “noise” to the un-
derlying establishment-reported data before they are tabulated. In this data
perturbation method, cell values that would normally meet the criteria for
suppression are changed by a large amount, while cell values that are not
as sensitive are changed by a smaller amount. This technique is less compli-
cated than cell suppression, and, by adding noise, an agency can show data
for all cells and for all tables, which preserves the ability to draw inferences
from all cells.3 Another effort to preserve the analytical value of protected
sensitive data is being developed using a controlled tabular adjustment
technique. In this technique, a sensitivity rule determines which cells are
sensitive, and the technique replaces each sensitive value with a safe value
that is some distance away from the sensitive value. To preserve additivity,
the nonsensitive values are minimally adjusted (Reznek, 2006, p. 5).
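The noise-addition idea can be sketched as follows. The sketch assumes a multiplicative scheme in which each firm receives one random factor that is always at least 5 percent and at most 15 percent away from 1. That scheme, the range, and all names are illustrative assumptions, not the parameters used by any agency.

```python
import random
from collections import defaultdict


def noise_factor(rng, low=0.05, high=0.15):
    """Draw a multiplier that is between `low` and `high` away from 1."""
    magnitude = rng.uniform(low, high)
    direction = rng.choice((-1, 1))
    return 1 + direction * magnitude


def noisy_cell_totals(records, seed=0):
    """Perturb each firm's values before tabulation.

    records: iterable of (cell, firm_id, value) tuples. Each firm gets a
    single factor that is applied to all of its records.
    Returns (true_totals, noisy_totals) as dictionaries keyed by cell.
    """
    rng = random.Random(seed)
    factors = {}
    true_totals = defaultdict(float)
    noisy_totals = defaultdict(float)
    for cell, firm_id, value in records:
        if firm_id not in factors:
            factors[firm_id] = noise_factor(rng)
        true_totals[cell] += value
        noisy_totals[cell] += value * factors[firm_id]
    return dict(true_totals), dict(noisy_totals)


# Hypothetical data: one single-firm cell and one cell with four firms
records = [
    ("small", "f1", 1000.0),
    ("large", "f2", 200.0), ("large", "f3", 210.0),
    ("large", "f4", 190.0), ("large", "f5", 205.0),
]
true_totals, noisy_totals = noisy_cell_totals(records)
for cell in true_totals:
    change = abs(noisy_totals[cell] - true_totals[cell]) / true_totals[cell]
    print(cell, round(change, 3))
```

A cell made up of a single firm moves by that firm's full factor, whereas in a cell with several firms the upward and downward factors tend to offset one another, so the published total usually moves less.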
Another increasingly popular technique that is intended to make data
available for research and analytical purposes is to generate synthetic data:
for generation of synthetic microdata, see Reiter (2005); for generation
of synthetic tables, see Slavkovic and Lee (2010). This technique relies
on sampling and simulations. Typically, a model is developed to generate
synthetic or partially synthetic data that have some of the same properties
as the original data by sampling from the posterior predictive distribution
of the confidential data. A typical method would be to use a sequential
regression imputation. In this procedure, the original value of each variable
is blanked out and replaced by a model-generated value. The technique has
been used at the Census Bureau to develop a synthesized microdata file link-
ing Social Security Administration earnings data with data from a Census
Bureau demographic survey (Reznek, 2006, p. 6).
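A single step of this kind of synthesis can be sketched for one variable. The sketch assumes a simple linear regression with a noninformative prior, and it replaces every value of one variable with a draw from the posterior predictive distribution. A sequential procedure would repeat such a step variable by variable. The model choice, names, and data are hypothetical and are not the Census Bureau's implementation.

```python
import random


def synthesize(x, y, seed=0):
    """Replace each y with a posterior predictive draw from a simple
    linear regression of y on x (noninformative prior, n >= 3)."""
    rng = random.Random(seed)
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    sse = sum((yi - y_bar - slope_hat * (xi - x_bar)) ** 2
              for xi, yi in zip(x, y))

    # Step 1: draw the residual variance, sigma^2 ~ SSE / chi-square(n - 2).
    # A chi-square(k) variate is a gamma variate with shape k/2 and scale 2.
    chi2 = rng.gammavariate((n - 2) / 2, 2)
    sigma = (sse / chi2) ** 0.5

    # Step 2: draw the coefficients given sigma. Centering x makes the
    # level (value at the mean of x) and the slope independent.
    level = rng.gauss(y_bar, sigma / n ** 0.5)
    slope = rng.gauss(slope_hat, sigma / sxx ** 0.5)

    # Step 3: blank out each original value and draw a replacement.
    return [rng.gauss(level + slope * (xi - x_bar), sigma) for xi in x]


# Hypothetical records: establishment size and a confidential payroll figure
size = [10.0, 25.0, 40.0, 55.0, 70.0, 90.0]
payroll = [310.0, 720.0, 1190.0, 1600.0, 2150.0, 2660.0]
synthetic_payroll = synthesize(size, payroll, seed=1)
print([round(value, 1) for value in synthetic_payroll])
```

Because the parameters are themselves drawn rather than fixed at their estimates, the synthetic values reflect uncertainty about the model as well as record-level variation.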
Creating publicly available data products that are statistically valid
and in which confidential data are protected is a complicated process.
3 The technique is currently being used by the Census Bureau to protect confidential
microdata from the Longitudinal Employer-Household Dynamics (LEHD) Program used in the
Quarterly Workforce Indicators, which use, as inputs, sensitive data from unemployment
insurance wage records and Census Bureau demographic and economic information (Abowd
et al., 2006).
The best procedure to use depends on the type of data and their intended
purposes, as well as on the risks of disclosure. For an overview of current
statistical disclosure limitation practices in the United States, see Federal
Committee on Statistical Methodology (2005). Many new techniques are
being developed. The most recent ones combine techniques from statistics
and computer sciences and aim to account for increased disclosure risk due
to the presence of more externally available information and better record
linkage technologies. Recent advances in data redaction strategies and
data sharing, which include, among others, virtual research data centers,
remote access servers, privacy-preserving mechanisms for distributed
databases, and differentially private mechanisms, are highlighted in a special
2009 issue of the Journal of Privacy and Confidentiality (Kinney, Karr,
and Gonzalez, 2009).
PROTECTING ORIGINAL DATA
EEOC Procedures
The actual, original data collected from the forms that employers sub-
mit to EEOC are now shared with the Office of Federal Contract Compli-
ance Programs (OFCCP) of the U.S. Department of Labor, the Civil Rights
Division of the U.S. Department of Justice (DOJ), and 95 state-level fair
employment practices agencies (FEPAs). There are other sharing arrange-
ments with the U.S. Department of Education and with researchers. Often
these agencies have their own procedures for assuring the confidentiality
of the shared data.
The specific arrangements vary in each instance. For example, OFCCP
is a statutory member of the Joint Reporting Committee with EEOC for
the collection of the EEO-1 reports. This arrangement is made known in
advance to companies that provide their data to the EEOC.4 According
to protocols that are in place for the Joint Reporting Committee, EEOC
collects the data, edits them as needed, appends some additional identifiers
to the records, and transmits a copy of the entire statistical file to OFCCP.
The DOJ Civil Rights Division is a member of a joint state and local
reporting committee with EEOC for the collection of EEO-4 reports (see
Chapter 1). As it does with the EEO-1 data, EEOC collects the data and at
the conclusion of the survey forwards a copy of the EEO-4 statistical file
4 The EEO-1 instruction booklet (p. 1) states:
In the interests of consistency, uniformity and economy, Standard Form 100 has been jointly de-
veloped by the Equal Employment Opportunity Commission and the Office of Federal Contract
Compliance Programs of the U.S. Department of Labor, as a single form which meets the statistical
needs of both programs.
to DOJ.5 It also transmits copies of the actual individual EEO-4 reports
directly to DOJ officials, allowing immediate access to reports during the
reporting period, as well as access to historical data.
FEPAs are state or local authorities that investigate and resolve charges
of employment discrimination filed under Title VII, the Americans with
Disabilities Act (ADA), the Age Discrimination in Employment Act (ADEA),
and comparable state laws and local ordinances in partnership with EEOC.
Over the years, EEOC has negotiated work-sharing agreements with these
agencies that allow the sharing of data. EEO-1 data are shared routinely in
a charge tracking system that EEOC provides, which enables the FEPAs to
retrieve the reports and run statistical comparisons. Other data are shared
on an ad hoc basis.
Under the auspices of a school reporting committee, the EEOC shares
EEO-5 data (see Chapter 1) with DOJ and the U.S. Department of Educa-
tion. Statistical files are shared with both agencies. Specific requests for
EEO-5 data are also honored, most often for DOJ.
From time to time, EEOC has entered into agreements with other
federal agencies to allow the sharing of survey data. Currently, the only
active agreement is with DOJ to share EEO-1 data. The memorandum of
understanding (MOU), discussed below, spells out strict provisions
for the protection of the confidentiality of the data.
The EEOC has also historically entered into agreements with individual
researchers to allow the sharing of data: see Box 5-1. This has been a prac-
tice of the EEOC since 1969, when EEOC entered into an agreement with
Eleanor Brantley Schwartz of Georgia State University to study women in
management. The mechanism for sharing data in a protected environment
is quite detailed, complicated, and time consuming, and it relies on giving
the potential data user the status of a sworn federal employee.
Office of Federal Contract Compliance Programs Procedures
OFCCP confidential data are derived from a “scheduling letter” process
in which compliance reviews are initiated and certain documents and data
sets are requested. The documents consist of the written Affirmative Ac-
tion Plan (AAP) for the scheduled facility, certain compensation data, and
information on additional personnel practices and policies to demonstrate
compliance obligations.
5 This arrangement is described in the EEO-4 booklet (p. 1):
In the interests of consistency, uniformity and economy, State and Local Government EEO-4 is being
used by Federal government agencies that have responsibilities for equal employment opportunity.
A joint State and Local Reporting Committee, with which this report must be filed, represents
those various agencies.
BOX 5-1
Intergovernmental Personnel Act Agreements with Researchers
EEOC has used Intergovernmental Personnel Act agreements that detail
outside persons to an employment arrangement to allow the sharing of survey
data. These agreements give the researcher the status of a federal employee and
access to the data. The researcher signs an agreement that prohibits disclosure of
the data to anyone (including professors, advisers, and colleagues), except those
persons directly employed by the project. It also requires the researcher to submit
any work based on the EEOC information to the EEOC to (a) determine whether it
contains any confidential information and (b) approve any language describing the
relationship between the researcher and the EEOC. The data are to be returned to
EEOC at the conclusion of the project, and all working files are to be certified as
destroyed. The penalties for disclosure of confidential data in Title VII are formally
transferred to the researcher.
SOURCE: Summary by panel staff of sample EEOC Intergovernmental Personnel Act agree-
ment for external researchers, provided by EEOC staff on November 28, 2011.
Unlike EEOC, OFCCP has no formal data-sharing arrangement with
federal or state agencies. Its data sharing occurs on an ad hoc or informal
basis, such as when OFCCP refers cases to DOJ or EEOC to pursue en-
forcement. Sharing can also occur on a very limited basis under the MOU
with EEOC. For data collected only by OFCCP, the past instances of data
sharing have been infrequent, although additional sharing with EEOC can
be foreseen.
Unlike Title VII of the Civil Rights Act of 1964 as amended, Executive
Order 11246, which provides the legal basis for OFCCP, is silent on
rules and penalties for confidentiality of data from employers. However,
confidentiality provisions that cover OFCCP are spelled out in the agency’s
regulations (see 41 C.F.R. 60-1.20(f)–(g) and 60-1.43). The regulations es-
sentially state that the disclosure of data to the public is subject to the Free-
dom of Information Act and the Privacy Act and also to the procedures for
preclusion of certain data due to assertion of privileges during litigation.6
6 OFCCP rules were spelled out in the regulation that authorized the collection of the
Equal Opportunity Survey (41 C.F.R. 60-2.18(d)). These rules state:
(d) Confidentiality. OFCCP will treat information contained in the Equal Opportunity Survey as
confidential to the maximum extent the information is exempt from public disclosure under the
Freedom of Information Act, 5 U.S.C. 552. It is the practice of OFCCP not to release data where
the contractor is still in business and, where the contractor indicates, and through the Department
of Labor review process it is determined, that the data are confidential and sensitive and that the
release of data would subject the contractor to commercial harm.
The OFCCP approach to data confidentiality is evolving in the direction
of greater transparency. An example is a new initiative under the umbrella
of the Open Government Directive,7 under which the Department of Labor
(DOL) has developed a searchable “enforcement database” that covers data from
DOL enforcement agencies, including OFCCP.8 This database is available
for viewing by academic researchers, stakeholders, and the public. Users
can retrieve data by state or zip code, the company name, North American
Industry Classification System codes, violation, and year. The database
divides OFCCP data into two categories: evaluations (compliance reviews)
and investigations (complaints). In making these administrative data avail-
able for the first time, OFCCP has a policy of limiting disclosed informa-
tion. For example, it provides only data specific to the facility reviewed and
only summary data (yes/no) for violations found, if any. However, it should
be noted that the true underlying disclosure risks with such data are not
fully understood.
Department of Justice Procedures
As noted above, DOJ’s Civil Rights Division obtains EEO-4 data from
EEOC on a regular basis and holds them in confidence as a member of the
joint state and local reporting committee. The DOJ uses the EEO-4 data to
identify investigations that it believes should be launched, but it does not
use the data directly in the investigation, nor are the data directly used in
court cases. Instead, DOJ uses the data collected in the process of discovery
to support its litigation.
The transmittal of EEO-1 data from EEOC to DOJ is covered by
an MOU that was executed in May 2011.9 The MOU calls on EEOC to
provide DOJ with data for the most recent reporting period as soon as
practicable after the EEOC has reconciled and finalized the statistical file.
Historical EEO-1 files are also to be provided. In turn, DOJ agrees to pre-
serve the confidentiality of the data in the same manner that EEOC employ-
ees are required by Title VII of the Civil Rights Act of 1964 as amended.
Among the steps leading to identification of a possible infringement of
EEO laws, the DOJ compares the profiles of the public sector organizations
7 White House, Memorandum on Transparency & Open Government, M-10-06. December 8, 2009.
See http://www.whitehouse.gov/sites/default/files/omb/assets/memoranda_2010/m10-06.pdf
[October 2012].
8 For details, see http://ogesdw.dol.gov [July 2012].
9 U.S. Equal Employment Opportunity Commission, Memorandum of Understanding
Between the U.S. Equal Employment Opportunity Commission and the U.S. Department of
Justice–Civil Rights Division for Sharing of Employer Information Report (EEO-1) Data,
May 12, 2011.
under the agency’s jurisdiction with similar organizations in the private sec-
tor, using the EEO-1 data that are obtained from EEOC.
FURTHER PROTECTION OF SHARED EEO DATA
As the above discussion indicates, the EEOC shares sensitive EEO-4
and EEO-1 report data with other agencies in the federal government and
with the FEPAs through rather informal arrangements, most of which are
not backed by force of law. This practice is in contrast to the usual prac-
tice of federal statistical agencies that protect shared data through formal
agreements backed by clear legislative authority that is enforced by stern
penalties. For EEOC, even when there is an agreement, such as the one with
DOJ, to share EEO-1 data, there is no indication that the data are shielded
from court challenge or from requests under the Freedom of Information
Act when they are shared.
In recent years, a procedure for protecting shared data has been imple-
mented by several federal statistical agencies that might well serve as a
model for protecting the EEOC employer data. The Bureau of Labor Sta-
tistics, Census Bureau, and Bureau of Economic Analysis can now share
confidential data obtained from employers under provisions of the Confi-
dential Information Protection and Statistical Efficiency Act (CIPSEA). This
statute, under the umbrella of the U.S. Office of Management and Budget,
prohibits disclosure or release, for nonstatistical purposes, of information
collected under a pledge of confidentiality. Under this law, data may not be
released to unauthorized persons. Willful and knowing disclosure of pro-
tected data to unauthorized persons is a felony punishable by up to 5 years
imprisonment and up to a $250,000 fine—penalties that are significantly
more stringent than those that are enumerated in the Title VII legislation.
It is certain that the sensitivity of the data that employers provide to
EEOC would be heightened if earnings data were to be added to the EEO
data records. Employee compensation data are generally considered to be
highly sensitive; they are even considered proprietary information by many
private-sector employers.
As this chapter points out, EEOC provides data to agencies that do not
have the same level of confidentiality protections and are not covered by
the same penalties that apply to EEOC employees and researchers under
Intergovernmental Personnel Act (IPA) agreements. Legislation patterned after
CIPSEA could increase the protection of confidentiality of EEO data,
specifically by authorizing sharing agreements between EEOC, OFCCP, DOJ,
and the state and local FEPAs and by extending the Title VII penalties beyond
EEOC and its IPA researchers.
Such protection could be expected to increase the willingness of em-
ployers to provide detailed employment data. It could also help mitigate
concerns of other federal agencies about the matching of the EEO-1 survey
records to administrative data (such as those discussed in Chapter 2) if
such matching were someday deemed useful to help improve the quality of
the data.