Bertrand and Mullainathan (2002) conducted a large-scale field experiment on job hiring by sending résumés in response to over 1,300 help-wanted advertisements in Boston and Chicago newspapers, submitting four résumés per ad. In all, they submitted 4,890 résumés. For each city, the authors took résumés of actual job seekers, made them anonymous, and divided them into two pools based on job qualifications: high and low. Two résumés from each pool were assigned to each advertisement, and race was randomly assigned within each pair. Thus, for each ad, they randomly assigned white-sounding names (e.g., Allison and Brad) to two of the résumés and black-sounding names (e.g., Ebony and Darnell) to the remaining two. This crucial randomization step breaks the tie between résumé characteristics and race. Addresses were also randomized across résumés, so that the tie between race and neighborhood characteristics and the tie between résumé attributes and neighborhood characteristics were also broken. Thus, for each ad, the researchers were able to observe differential callbacks by race both within and between the high- and low-qualified résumé pools.
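The assignment procedure described above can be sketched in a few lines of Python. This is an illustration only, not the authors' actual procedure: the helper `assign_for_ad`, the pool contents, the address list, and the choice to give the four applications to one ad distinct addresses are all assumptions made for the example. Only the four example names come from the text.

```python
import random

# Example names quoted in the text; the study's full name lists are not given here.
WHITE_NAMES = ["Allison", "Brad"]
BLACK_NAMES = ["Ebony", "Darnell"]


def assign_for_ad(high_pool, low_pool, addresses, rng):
    """Build the four applications sent to one advertisement (hypothetical helper)."""
    # Assumption: the four applications to one ad get four distinct addresses.
    ad_addresses = rng.sample(addresses, 4)
    applications = []
    for quality, pool in (("high", high_pool), ("low", low_pool)):
        pair = rng.sample(pool, 2)  # two different résumés from this quality pool
        races = ["white", "black"]
        rng.shuffle(races)  # random: which résumé in the pair gets which race
        for resume, race in zip(pair, races):
            names = WHITE_NAMES if race == "white" else BLACK_NAMES
            applications.append({
                "resume": resume,
                "quality": quality,
                "race": race,
                "name": rng.choice(names),
                "address": ad_addresses.pop(),
            })
    return applications


rng = random.Random(0)  # fixed seed so the sketch is reproducible
high_pool = [f"high_{i}" for i in range(10)]      # placeholder résumé labels
low_pool = [f"low_{i}" for i in range(10)]        # placeholder résumé labels
addresses = [f"address_{i}" for i in range(8)]    # placeholder addresses
for application in assign_for_ad(high_pool, low_pool, addresses, rng):
    print(application)
```

Because the race label is shuffled within each quality pair, every ad receives exactly one "white" and one "black" application at each quality level, and neither the résumé content nor the address can systematically track race.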
Using callback rate as the outcome of interest, the authors found that, on average, applicants with white-sounding names received 50 percent more callbacks than applicants with black-sounding names. Specifically, the researchers found a 12 percent callback rate for interviews for “white” applicants compared with a 7 percent callback rate for “black” applicants. They also found that higher-quality résumés yielded significant returns for white applicants (a 14 percent callback rate for white applicants with high-quality résumés versus 10 percent for those with low-quality résumés) but not for black applicants (7.7 percent for high-quality résumés versus 7.0 percent for low-quality résumés). The authors concluded that, for black applicants, having more productive skills may not necessarily reduce discrimination.
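The comparison of returns to résumé quality can be made explicit with a little arithmetic. The sketch below uses only the four quality-specific callback rates quoted above; the dictionary layout and function names are illustrative assumptions. It is arithmetic on reported percentages and says nothing about statistical significance, which would require the underlying counts.

```python
# Callback rates (percent) by race and résumé quality, as quoted in the text.
callback_rate = {
    ("white", "high"): 14.0,
    ("white", "low"): 10.0,
    ("black", "high"): 7.7,
    ("black", "low"): 7.0,
}


def return_to_quality(race):
    """Percentage-point gain from a high- versus a low-quality résumé."""
    return callback_rate[(race, "high")] - callback_rate[(race, "low")]


def race_gap(quality):
    """Percentage-point white-minus-black callback gap within one quality level."""
    return callback_rate[("white", quality)] - callback_rate[("black", quality)]


white_gain = return_to_quality("white")   # 4 percentage points
black_gain = return_to_quality("black")   # about 0.7 percentage points
gain_difference = white_gain - black_gain  # about 3.3 percentage points

print(f"Return to quality, white applicants: {white_gain:.1f} points")
print(f"Return to quality, black applicants: {black_gain:.1f} points")
print(f"Difference in returns: {gain_difference:.1f} points")
for quality in ("high", "low"):
    print(f"White-black gap, {quality}-quality résumés: {race_gap(quality):.1f} points")
```

On these figures, the gap between white and black applicants is wider among high-quality résumés (about 6.3 points) than among low-quality ones (3.0 points), which is the pattern behind the authors' conclusion.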
By randomizing the assignment of race, the authors made it possible to directly estimate the usual missing counterfactual: whether a callback would have been received if the résumé had belonged to an applicant likely to be perceived as being of the other race. Two résumés were selected from each pool (high- and low-qualified) because the same résumé, identical in content but with different names and addresses attached, could not be sent twice in response to a single advertisement. Because race was randomized within each quality pair, any difference by race in résumé quality (within a quality pool) for a particular advertisement could be expected to average out over a large number of advertisements. Thus the outcomes of the two résumés within a quality level could be compared, and the average of these comparisons could provide an estimate of the effect of race on callbacks within each quality level, which