The race-specific average of $Y_{it}$ gives
the proportion of tests, for each race, in which auditors were treated
favorably.
The gross measure of discrimination is the proportion of tests in
which the minority auditor was treated unfavorably and the white
auditor was treated favorably, P10. The net measure of
discrimination is the difference
between the gross measure (P10) and the proportion of tests
in which the white auditor was treated unfavorably compared with the
minority auditor, P01 (i.e., net measure =
P10 – P01) (see also Chapter 4). P01 is a proxy for the frequency of
adverse treatment incidences against minorities that are unrelated to
race. P01 may be a poor proxy, however, if it includes
deliberate reverse discrimination, which is subtracted out of the net
measure.
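As an illustration of the two measures just defined, the following minimal Python sketch tabulates the joint proportions from a list of paired test outcomes and derives the gross (P10) and net (P10 – P01) measures. The function name, the input layout, and the example counts are assumptions made for illustration; they are not taken from the HDS.

```python
from collections import Counter


def adverse_treatment_measures(pairs):
    """Summarize paired tests.

    `pairs` is an iterable of (white_favorable, minority_favorable)
    outcomes, each truthy if that auditor was treated favorably.
    """
    coded = [(int(bool(w)), int(bool(m))) for w, m in pairs]
    if not coded:
        raise ValueError("at least one paired test is required")
    counts = Counter(coded)
    n = len(coded)
    # First index is the white auditor's outcome, second the minority auditor's.
    result = {f"P{w}{m}": counts[(w, m)] / n for w in (1, 0) for m in (1, 0)}
    result["gross"] = result["P10"]                  # white favored, minority not
    result["net"] = result["P10"] - result["P01"]    # gross minus the reverse case
    return result


# Hypothetical counts, for illustration only (not HDS data).
example_pairs = [(1, 1)] * 50 + [(1, 0)] * 25 + [(0, 1)] * 10 + [(0, 0)] * 15
measures = adverse_treatment_measures(example_pairs)
print(measures)
```

With these made-up counts the gross measure is 0.25 and the net measure is about 0.15, which shows how the net measure discounts the gross measure by the share of tests in which the white auditor fared worse.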
The discussion frequently returned to the need to clearly define the
concept of discrimination. To find the correct measure of
discrimination, participants contemplated a conceptual experiment in
which auditors are matched perfectly on all observable characteristics
and encounter completely identical circumstances during their visit to
a housing agent. Under these circumstances, the researchers believe
the correct measure of the incidence of disparate treatment
discrimination is the gross measure. One could measure both reverse
racial discrimination (P01) and racial discrimination
(P10), although the latter is the quantity of interest. The
Urban Institute researchers noted, however, that this conceptual
experiment is unachievable.
Additional discussion centered on the standard for housing market
transactions, more specifically, the solutions for the joint
probabilities in Table 5-2 in the absence of
housing discrimination. Workshop participants suggested that the Urban
Institute should consider the solutions for Pij
and their implications for the net and gross measures of adverse
treatment. These solutions for varying levels of housing
discrimination would help the Urban Institute assess whether the gross
and net measures are adequately capturing discrimination in the
market. While the discussion addressed this issue, of major concern
was the measurement of discrimination in the context of the population
of interest and a clear definition of discrimination. Each of these
issues is presented in separate sections of this report.

TABLE 5-2 Proportion of Auditors Receiving Favorable Treatment

                         Minority
White            Favorable     Unfavorable
Favorable        P11           P10 (a)
Unfavorable      P01           P00

(a) Gross measure. The net measure is P10 – P01.

Some participants expressed their preference for the net measure since
it captures the difference in unfavorable treatment of minority and
white testers. The gross measure will reveal the number of instances
of discrimination against minorities and may appear high; however, the
frequency of these instances may be equivalent to that for whites. The
net measure will capture this by calibrating the magnitude of the
discrimination.
Charles Manski, Board of Trustees Professor, Department of Economics,
Northwestern University, and Susan Murphy, Associate Professor,
Statistics Department, and Senior Associate Research Scientist, Survey
Research Center, University of Michigan, also commented on the breadth
of methodological issues in the 2000 HDS and the implications of these
issues for measuring discrimination in the national housing market.
Their comments included a discussion of the strengths and weaknesses
of the current methodology and some alternative methodologies that
could be applied.
Manski's discussion addressed measuring the severity or magnitude of
discrimination rather than just the occurrence of discrimination. For
example, the extent to which the characteristics of minority
households must be altered so they appear more qualified than white
households could serve as a measure of the magnitude of
discrimination. During his comments, Manski also proposed that by
collecting richer data, researchers could distinguish between
statistical and prejudicial discrimination.
FACTORS AFFECTING HOUSING DISCRIMINATION
The HDS focuses predominantly on economic and family-size
characteristics. These attributes of the individual are expected to
drive housing needs and thus the units shown or suggested to the
auditor. The initial model posits that disparate treatment is due
to the individual's race and observable circumstances that could
arise during the tester's visit. During the workshop, Urban
Institute researchers acknowledged an inability to match auditors
on the myriad of possible unobservable characteristics. They stated
that their goal was different: to structure a study that could test
whether those unobservable characteristics really matter in
racially differential treatment of the auditors.
Some participants raised questions about the power of the
statistical tests being performed and the need to control for
covariates even if the
paired-testing methodology appears to control for them. One argument
for the use of covariates is that favorable or unfavorable treatment
by housing agents may depend on the sector of the housing market or
type of transaction observed. The audit methodology results in
identical agents observing auditors with similar characteristics.
Including the covariates in the model would allow the researchers to
observe how the estimated marginal probabilities in Table 5-2 respond to this methodology. Another question
raised during the workshop was whether the discussion of power for the
statistical tests and the need to control for covariates is necessary
in the absence of a clearly defined population. An appropriate model
may be one that accounts for the measurement of outcomes that
represent a mix of different measured phenomena.
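The workshop summary does not say which statistical tests were under discussion. As one hedged illustration of what a test of the net measure could look like, the sketch below applies an exact two-sided sign test (the exact form of McNemar's test) to the discordant pairs, under the null hypothesis that P10 equals P01. The choice of test and the function name are assumptions, not the Urban Institute's stated method.

```python
from math import comb


def exact_sign_test_p(n10, n01):
    """Two-sided exact p-value for H0: P10 == P01.

    n10: tests where only the white auditor was treated favorably.
    n01: tests where only the minority auditor was treated favorably.
    Concordant pairs carry no information about the difference and are ignored.
    """
    n = n10 + n01
    if n == 0:
        return 1.0
    k = max(n10, n01)
    # Upper-tail probability of a Binomial(n, 1/2) count at least as large as k.
    tail = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
p_balanced = exact_sign_test_p(5, 5)
p_one_sided = exact_sign_test_p(10, 0)
print(p_balanced, p_one_sided)
```

Because only discordant pairs enter the calculation, the power of such a test depends on how many discordant pairs the design produces, which is one reason the number of tests per site matters.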
CHARACTERISTICS OF TESTER PAIRS
Several participants expressed concern about the
“actual” characteristics of auditors—those not
assigned by the test coordinator—and their potential effect
on the validity of the test. More specifically, participants asked
how test coordinators ensure that the members of an audit pair are believable
potential renters or purchasers of the advertised housing unit. The
discussion encompassed whether auditors appear able to afford a
particular housing unit, as well as how close an auditor's actual
residence is to the test site.
An additional concern of workshop participants was heterogeneity
among white testers, given that two such testers of differing
ancestry may receive very different treatment by a housing agent.
Participants suggested that the test coordinator be mindful of this
heterogeneity when pairing white with minority testers. Otherwise,
the result of the test may reflect not solely minority-white
differences, but also the housing agent's perceptions, based on
ethnicity or other factors, of a white applicant's attractiveness
as a buyer or renter.
In contrast with previous audits, testing agencies participating in
HDS 2000 collect actual tester characteristics, such as income,
level of education, employment experience, and testing experience.
Tester training and test protocols are designed to limit the effect
of variation among tester pairs. Participants stressed the
importance of addressing the issue of heterogeneity among the
auditors, the housing units, and the housing agents. Heterogeneity
in any of these elements may have an impact on both the gross and
net measures of adverse treatment.
Sanders Korenman, Center for the Study of Business and Government,
Baruch College, City University of New York, commented that auditors'
assigned characteristics should reflect the legal definition of
discrimination. Researchers should control for attributes that provide
legally allowable reasons to deny housing. Researchers may want the
minority or white auditor to represent the subset of the minority or
white population possessing those allowable characteristics. Korenman
believes it would then be unnecessary to control for other differences
correlated with race (e.g., language) if these differences are
irrelevant to the housing transaction.
Joseph Altonji, Department of Economics, Northwestern University,
presented the following model for dealing with the above issues:
$$y_{it} = f(x_{it}, \varepsilon_{it}, z_{it}, v_{it}; R_i)$$
where $i$ denotes the auditor and $t$ denotes the test. In this model,
$y_{it}$ is the outcome measure representing favorable
or unfavorable treatment (e.g., whether the auditor was shown the
unit). The variable $x_{it}$ is a vector
containing the characteristics of the auditor that are observed by or
known to the researchers and are used to match audit pairs. It
includes both assigned and nonassigned attributes, the latter having
been collected by the researcher during the application process. The
variable $\varepsilon_{it}$ is a vector of characteristics of the
auditor that are relevant to the agent's assessment of the suitability
of the auditor for the unit and are observed by the agent but not used
to match audit pairs. The elements of $\varepsilon_{it}$ vary
across auditors and over time for a given auditor. Both $x_{it}$
and $\varepsilon_{it}$ are limited to
factors that are legitimate indicators of the suitability of the
auditor for the housing unit and may legally be used by the agent to
make judgments. The variables $z_{it}$ and
$v_{it}$ represent, respectively, observed or known and
unobserved or unknown characteristics of the unit that determine how
the agent weighs the characteristics $x_{it}$
and $\varepsilon_{it}$ of the auditor. Finally, the variable
$R_i$ denotes the race of the auditor.
In terms of the model, a natural benchmark for discrimination is the
situation in which race, $R_i$, plays a role
in the agent's decision function given the characteristics of the unit
$z$ and $v$ and the characteristics of the auditor
$x$ and $\varepsilon$. $R$ will play a role in the agent's
decisions if (1) there is institutional discrimination or racial
preference, whether conscious or subconscious, on the part of the
agent; and/or (2) the agent uses the race of the auditor to draw
inferences about the suitability of the auditor for the unit, such as
ability to pay the rent, maintain the unit, or get along with
neighbors, or the degree of interest in the unit.
Note that the housing provider may draw inferences about the auditor's
suitability for the unit on the basis of the characteristics
x and ε. However, if the housing
provider uses race to draw any inferences about characteristics that
are relevant to the housing transaction, he or she is discriminating.
The audit methodology is to send auditors with the same value of
$x_{it}$ to inquire about a housing unit. The
fraction of times the outcome is favorable for whites but not for
non-whites is sometimes interpreted as a measure of discrimination
against non-whites. The fraction of times the outcome is favorable for
non-whites but not for whites is sometimes interpreted as a measure of
discrimination against whites. The sum of these two fractions is
referred to as the gross discrimination rate. The difference between
these two fractions is a measure of net discrimination against
non-whites.
The problem with the gross measure of discrimination is that random
variation across testers in $\varepsilon_{it}$, differences in the
distribution of $\varepsilon_{it}$ that are related to
race, and random variation in $z$ and $v$ between tester
visits to a particular unit will lead to differences in the outcomes
even though the audit pairs have been matched on
$x_{it}$. (Variation in $z$ and $v$ may
arise, for example, from situational changes in the housing provider
that occur between the two audit visits, or different weights placed
by a particular agent on the characteristics $x$ and
$\varepsilon$ in the event the auditors see different
agents.) That is, the gross measure of discrimination will be positive
even if there is no discrimination and $R$ plays no role in
the decision of any of the agents. Note that the variation in
$z_{it}$ or in elements of
$\varepsilon_{it}$ that is observed by the
researchers could be accounted for in analyzing the results of the
audits. The problem with the net measure of discrimination against
non-whites is that it will overstate discrimination to the extent that
the values of the uncontrolled auditor characteristics
$\varepsilon_{it}$ are systematically
related to race.
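The point about the gross measure can be illustrated with a small simulation. The sketch below is a toy version of the model above in which race never enters the agent's decision: both auditors in a pair share the matched characteristics and the observed unit characteristics, but each draws independent idiosyncratic terms. The additive decision rule, the normal distributions, and all parameter values are assumptions chosen for illustration only.

```python
import random


def simulate_audits(n_tests, threshold=0.0, seed=0):
    """Simulate paired audits in which race plays no role in the outcome."""
    rng = random.Random(seed)
    n10 = n01 = 0
    for _ in range(n_tests):
        x = rng.gauss(0.0, 1.0)  # matched auditor characteristics (shared by the pair)
        z = rng.gauss(0.0, 1.0)  # observed unit characteristics (shared by the pair)
        outcomes = []
        for _auditor in ("white", "minority"):
            eps = rng.gauss(0.0, 1.0)  # unmatched auditor characteristics
            v = rng.gauss(0.0, 0.5)    # unobserved variation at the time of the visit
            # The decision rule ignores race entirely.
            outcomes.append(x + z + eps + v > threshold)
        white, minority = outcomes
        if white and not minority:
            n10 += 1
        elif minority and not white:
            n01 += 1
    p10, p01 = n10 / n_tests, n01 / n_tests
    return {"P10": p10, "P01": p01, "net": p10 - p01}


sim = simulate_audits(20000)
print(sim)
```

In runs of this sketch, P10 and P01 are both clearly positive while their difference is close to zero, which mirrors the argument that a positive gross measure need not reflect discrimination. The sketch does not address the separate concern that the net measure is biased when the unmatched characteristics differ systematically by race.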
The design and analysis of the audit studies should account for the
differences among the auditors and housing providers that are
reflected in $\varepsilon_{it}$,
$z_{it}$, and $v_{it}$
in the above model. Altonji offered four
comments on how the Urban Institute could address heterogeneity in the
study. First, researchers could look for differences in the outcomes
of auditors of the same race who have visited similar housing units.
This method would assess treatment outcomes within racial groups.
Second, researchers could have individual auditors perform multiple
tests involving similar units. This method would provide information
about the influence of variation across auditors in
$\varepsilon_{it}$ on the distribution of
outcomes. Third, auditors could perform sandwich tests, in which
auditors are sent on a test in triples, rather than pairs. The fourth
comment is that more information should be gathered about the auditors
even if it is not used to form matched pairs. Additional
steps should also be taken to gather preferences and
characteristics relevant to housing providers. While the 2000 HDS has
started to collect these data, more information could be gathered.
From this information, the audit researchers could assess which
characteristics are most important in matching auditors and assigning
attributes. Researchers have considered using the information on the
treatment of whites in all the audits to improve estimates of the
treatment of whites. These estimates would increase the precision of
the net adverse treatment measure.
Murphy's discussion of the methodological aspects of the 2000 HDS also
addressed the interaction of auditor characteristics and the structure
of audit pairs. She commented that, given the number of audit pairs
and the number of visits per audit pair, researchers would not
accumulate information within an audit pair because individual
characteristics, which may not vary by race, persist across audit
pairs. The resulting estimate of discrimination obtained for these
audit pairs may be due to individual characteristics that are equally
distributed across race or due to discrimination. Provided that
researchers have matched testers on characteristics that matter to the
housing providers, researchers can obtain better estimates of
discrimination by looking across tester pairs.
APPLICATION OF SAMPLING WEIGHTS TO A MEASURE OF HOUSING DISCRIMINATION
A secondary objective of the workshop was for participants to
discuss the notion of preserving probability in the selection of
advertisements by sampling with probabilities proportional to the
size of the audit site. The Urban Institute uses classical
population sampling to draw inferences about a population. It is
not clear that application of these methods is necessary, however,
since the study will not draw the usual theoretical inferences
about population parameters. Rather than estimating a known
population parameter, the researchers are trying to estimate an
underlying phenomenon that exists within the population. The
underlying universe encompasses this conceptual model of
discrimination and the character or prevalence of discrimination
activities that occur in the interaction between two hypothetical
individuals.
There was considerable discussion during the workshop about the
relevance of sampling weights to the analysis. For certain
statistical analyses, weighting is important; however, many
participants do not believe sample survey weights are relevant for
the type of analysis the Urban Institute is
performing. The researchers argued for maintaining weights because
advertisements are stratified by weeks. During high-volume weeks,
fewer tests are performed. If discriminatory agents represent a large
proportion of advertisements during high-volume weeks, they will also
be overrepresented in the sample. Not allowing for weighting of the
advertisements will ignore the potential bias in the estimate.
Altonji offered another suggestion for addressing weights. He
suggested the Urban Institute weight the results using not the
advertisements, but the characteristics of the housing unit. The audit
results could then be compared with a national database containing the
distribution and characteristics of the housing stock in the United
States, namely occupancy or vacancy rates. The audit results could be
weighted to reflect the expected availability of different housing
stock in the market at a particular point in time. It was noted that
if weighting is appropriate, approximate weights for the correct
population are preferred over equal probability weights that are
generated for the incorrect population.
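To make the weighting argument concrete, the sketch below compares a pooled (unweighted) net measure with one weighted by each week's share of advertisements. The stratum layout, the field names, and the counts are hypothetical and chosen only to show how the two calculations can diverge when weeks are sampled at different rates.

```python
def net_measure(outcomes):
    """Net measure (P10 - P01) for a list of (white_favorable, minority_favorable)."""
    n = len(outcomes)
    n10 = sum(1 for w, m in outcomes if w and not m)
    n01 = sum(1 for w, m in outcomes if m and not w)
    return (n10 - n01) / n


def weighted_net_measure(strata):
    """Weight each stratum's net measure by its share of all advertisements."""
    total_ads = sum(s["ads"] for s in strata)
    return sum(s["ads"] / total_ads * net_measure(s["outcomes"]) for s in strata)


# Hypothetical weeks: the same number of tests, very different ad volumes.
weeks = [
    {"ads": 100, "outcomes": [(1, 0)] * 4 + [(0, 1)] * 1 + [(1, 1)] * 5},
    {"ads": 300, "outcomes": [(1, 0)] * 2 + [(0, 1)] * 2 + [(1, 1)] * 6},
]
pooled = net_measure([o for s in weeks for o in s["outcomes"]])
weighted = weighted_net_measure(weeks)
print(pooled, weighted)
```

Here the pooled estimate is about 0.15 and the weighted estimate about 0.075, because the high-volume week, which is sampled at a lower rate, shows no net difference. The same structure would apply if the weights came from housing stock characteristics rather than advertisement counts, as Altonji suggested.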
Workshop participants discussed the use of multiple newspapers in the
original sampling frame instead of just in the pilot phase.
Researchers from the Urban Institute questioned whether
they had placed too much emphasis on the potential overlap in
advertising and the fact that a single unit may be advertised in
multiple newspapers. Analysts noted that the use of multiple
newspapers could not be applied because the Phase I analysis of the
2000 HDS must remain comparable to the 1989 analysis. In discussing
potential changes in the design of Phase II, workshop participants
suggested the analysts merge all newspaper advertisement sources.
Fienberg noted that once the sample has been obtained, analysts can
perform the calculation two ways: (1) reweighting according to the
sampling probabilities and (2) not reweighting or disregarding the
potential overlap. Participants also discussed the feasibility of
providing separate estimates for subsets of newspaper sources or for a
clearly defined population of newspapers—for example, having the
ability to estimate the likelihood of discrimination for the major
newspapers in a particular area without concern for drawing inferences
about the U.S. housing market. Several variations could be explored,
including oversampling of underrepresented housing unit types.
A recurring theme throughout the workshop was characterization of the
housing market. Specification of the population of housing units has
implications for the inferences drawn, as well as the appropriate
weighting scheme. Workshop participants proposed that while the U.S.
housing market is a candidate for the population, it may not reflect the true
population of interest to the researchers. More specifically, if
researchers are interested in discrimination against minority
households, the population might be restricted to housing units in
which this subgroup would be interested. The entire U.S. housing
market may be the housing choice set of minority groups, or that set
may be restricted to particular housing types. One proposal for
restricting the housing market was to segment it by housing costs or
affordability.
METHODOLOGICAL IMPLICATIONS OF THE PHASE II DESIGN
Tom Louis of the RAND Corporation addressed methodological
implications of the Phase II design. He discussed the importance of
identifying a set of primary goals for the study in a
nonstatistical way. For instance, if the design includes the whole
population, however defined, what summaries will be obtained, and
what will they mean? Without being concerned with sample weights or
statistical tests, what do the estimates mean, and do they provide
the information needed? Once the proper estimates have been
obtained and their meaning understood, the problem can be designed
with the appropriate weights and statistical model. A premise of
the audit design is that the survey design and weights can be
extrapolated to a population. Inherent in the variables of interest
is that these extrapolations capture contrasts in the population.
The design should serve the objective of comparing treatment
between white and minority home seekers. The weights will provide
metropolitan-area estimates based on the distribution of
advertisements within the sample relative to the population.
Louis also discussed the importance of weights applied to the
sample of advertisements. If the contrast in white and minority
treatment measured by some metric (e.g., the difference or odds
ratio) has either no or low interaction with attributes used to
form strata or sampling frames, the within-sample weights are
adjusted. Louis addressed the design of later study phases in view
of the findings from earlier phases. He suggested Phase II could
serve the objective of providing reasonable estimates of the
variance components associated with auditors, housing providers,
and advertisement sources. The later phases of the study would rely
on exploration of the interactions between audit pairs and other
methodological concerns identified in earlier phases.
Louis suggested that a more appropriate primary goal of the HDS might
be to better understand transactions in the general housing market
rather than to conduct a definitive study representing the population
of housing market transactions. These statistical and policy-related
decisions on the study design and objectives will determine how
samples are allocated. Louis's remarks also addressed matching of
audit pairs and its implications for the interpretation of audit
results. He expressed concern about the large variance component for
the matched pairs on the one hand and the inability to properly model
tester heterogeneity on the other. He suggested that matching auditors
on the wrong attributes—characteristics that have high variance
components—could be worse than not attempting to match auditors
at all. He did not suggest abandoning the matching of auditor pairs.
Rather, he stressed matching on important attributes and formulating a
model that would allow for the specification of covariance
adjustments.
As noted earlier, sandwich tests, in which two auditors of the same
race view the advertised unit—one prior to and the other after
the minority tester—can provide important information about
differential treatment in housing transactions. Louis noted that
similar information could be obtained without performing an actual
sandwich test. By combining information within racial groups across
audits for similar housing units, researchers could explore variation
within racial groups, particularly for matched characteristics.
Analyses across audit sites could also provide information needed in
low-population sites, such as underserved communities. For some sites,
the definition of an underserved community restricts the study to
small sample sizes. Louis proposed that a mix of design- and
model-based analyses that incorporates results from various test sites
could help in obtaining estimates within smaller sites. Participants
did not offer definitive ways of addressing these issues, but noted
the importance of raising them.
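One way to read the sandwich design is that disagreement between the two same-race visits gives a baseline for variation unrelated to race, against which cross-race disagreement can be compared. The sketch below computes both rates from a list of triples. The data layout, the use of the first white visit for the cross-race comparison, and the counts are assumptions made for illustration.

```python
def sandwich_summary(triples):
    """Summarize sandwich tests.

    Each triple is (white_first, minority, white_second), with 1 for
    favorable treatment and 0 for unfavorable treatment.
    """
    n = len(triples)
    if n == 0:
        raise ValueError("at least one sandwich test is required")
    # Share of tests in which the two white auditors were treated differently.
    within_white = sum(1 for w1, _, w2 in triples if w1 != w2) / n
    # Share of tests in which the first white auditor and the minority
    # auditor were treated differently.
    cross_race = sum(1 for w1, m, _ in triples if w1 != m) / n
    return {"within_white": within_white, "cross_race": cross_race}


# Hypothetical outcomes, for illustration only.
triples = [(1, 1, 1)] * 6 + [(1, 0, 1)] * 2 + [(1, 1, 0)] * 1 + [(0, 0, 1)] * 1
summary = sandwich_summary(triples)
print(summary)
```

In this made-up example the same-race and cross-race disagreement rates are both 0.2, the kind of pattern that would suggest the cross-race differences are no larger than ordinary visit-to-visit variation. Louis's alternative of pooling same-race results across audits of similar units would replace the triples with matched records drawn from separate audits.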
Korenman presented several methodological implications of the Phase II
design. He emphasized the need to assess the quality of an estimator
with respect to how the researchers and other members of the housing
community will use the measure. He also mentioned the importance of
having a definition of discrimination and identifying what the study
attempts to measure. He reiterated two uses of the latter: providing a
benchmark for racial discrimination in U.S. housing markets and
identifying target communities for enforcement audits.
Returning to an issue discussed earlier, Korenman also addressed which
measure—gross or net adverse treatment—is most appropriate
for estimating discrimination. He noted that while the objective is
not to provide one measure, but rather various components of an
overall benchmark, each component should be a credible and reliable
estimate. He noted the importance of having the gross and net measures
capture the desired phenomenon and move in directions consistent with
what is known about housing discrimination from other sources. One
aspect of this issue is the need to measure adverse treatment relative
to the legal definition of discrimination or adverse treatment.
Korenman stated that, consistent with the legal definition,
researchers could assign profiles and match testers on attributes that
constitute legal bases for differential treatment.
Korenman also expressed the need for a better understanding of the
processes that generate variation across time and space in the
measurement of housing discrimination. He did not propose that such
analysis be added to the scope of the HDS, but observed that the
issues involved are important and call for some caution in
interpreting results.
Korenman commented as well on the proposed remedies for selection bias
in the newspaper sampling methodology. In addition to underrepresented
areas, the sampling frame may underrepresent housing unit types (e.g.,
rent control units). The modified sampling frame would still miss some
unit types. Participants discussed capturing available housing stock
by linking vacancy rates with actual rentals or turnovers to buttress
the newspaper selection methodology. Korenman commented on the
screening call, in which a white tester calls about the housing unit
to determine whether it is still available. He asked what information
is retained from such calls and whether researchers could test to see
whether the race of the auditor making the initial screening call
matters.
Finally, participants discussed the implications of changes in
demographics for the legal definition of discrimination and the audit
methodology. Some participants commented on the basis of casual
observations that discrimination against whites may be more prevalent
in some high-minority housing markets. Also, in some housing markets
where whites are a small minority of the population, white-minority
testing may not make sense; rather, it may be more appropriate to pair
a second- or third-generation Hispanic or Asian auditor with an
African American auditor. These multiracial and multiethnic pairs may
be more reflective of the actual housing search pattern in these types
of communities. The 2000 census represents the first time respondents
could multiply identify on race and
ethnicity on a full national scale. Data obtained from the census may
indicate potential modifications to the paired-testing methodology.
Participants raised the issues of (1) how to measure discrimination in
housing markets with changing demographics, and (2) whether sending
individual auditors as opposed to pairs of auditors representing a
household would better capture the housing market.