enough that the FBI and other forensic laboratories keep separate databases for whites and blacks, and two separate databases for Hispanics, one for those from the eastern United States and another for those from the West.
The data in Table 4.1 can be used for another purpose. As mentioned in Chapter 2, DQA data can distinguish samples from different individuals 93% of the time, clearing many innocent suspects. The overall probability that two independent persons will have the same DQA genotype is the sum of the squares of the genotype frequencies, as illustrated in Box 4.1.⁴
Box 4.1. Calculating the Exclusion Power of a Locus
We can illustrate the 93% average exclusion power of DQA by reference to the data in Table 4.1. The probability that two randomly chosen persons both have a particular genotype is the square of its frequency in the population. The probability that two randomly chosen persons have the same unspecified genotype is the sum of the squares of the frequencies of all the genotypes. Summing the squares of the expected genotype frequencies (in parentheses) for the black population yields $0.023^2 + 0.079^2 + \cdots + 0.092^2 = 0.078$. We used expected rather than observed genotype frequencies to obtain greater statistical precision. For the white population, the value is 0.063. The average is about 0.07. The exclusion power is the probability that the two persons do not have the same genotype, or $1 - 0.07 = 0.93$.
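The single-locus calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reproduction of Table 4.1: the allele names and frequencies below are hypothetical, and the function names `hw_genotype_frequencies` and `match_probability` are our own. Expected genotype frequencies are taken to be in HW proportions.

```python
from itertools import combinations_with_replacement


def hw_genotype_frequencies(allele_freqs):
    """Expected genotype frequencies under HW proportions.

    allele_freqs maps allele name -> frequency (frequencies sum to 1).
    Homozygotes have frequency p^2, heterozygotes 2pq.
    """
    freqs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p, q = allele_freqs[a], allele_freqs[b]
        freqs[(a, b)] = p * p if a == b else 2 * p * q
    return freqs


def match_probability(genotype_freqs):
    """Probability that two random persons share the same genotype:
    the sum of the squared genotype frequencies."""
    return sum(f * f for f in genotype_freqs.values())


# Hypothetical allele frequencies for illustration only (NOT Table 4.1).
alleles = {"A1": 0.4, "A2": 0.3, "A3": 0.2, "A4": 0.1}

genotypes = hw_genotype_frequencies(alleles)
p_match = match_probability(genotypes)
exclusion_power = 1 - p_match

print(f"{p_match:.4f}")          # 0.1446
print(f"{exclusion_power:.4f}")  # 0.8554
```

With only four alleles the exclusion power here is lower than the 0.93 quoted for DQA; substituting the expected genotype frequencies from Table 4.1 into `match_probability` would give the values in the text.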
If there are $n$ loci, and the sum of squares of the genotype frequencies at locus $i$ is $P_i$, then the exclusion power is $1 - P_1 P_2 \cdots P_n$. Five loci with the power of DQA would give an exclusion power of $1 - (0.07)^5 = 0.999998$.
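The multilocus arithmetic can be checked directly. The sketch below assumes, as the product form does, that genotypes at the different loci are independent; the function name `combined_exclusion_power` is ours.

```python
import math


def combined_exclusion_power(match_probs):
    """Exclusion power across loci: 1 minus the product of the
    per-locus match probabilities (assumes independent loci)."""
    return 1 - math.prod(match_probs)


# Five loci, each with the DQA-like match probability of 0.07.
five_loci = combined_exclusion_power([0.07] * 5)
print(f"{five_loci:.6f}")  # 0.999998
```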
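⁴ The concept of exclusion power was initially described by Fisher (1951). The calculation of the exclusion power can be simplified, especially if the number of alleles is large, by noting that in HW proportions the unconditional probability of identical genotypes is
$$\sum_i p_i^4 + \sum_{i<j} (2 p_i p_j)^2 = 2\Bigl(\sum_i p_i^2\Bigr)^2 - \sum_i p_i^4,$$
where $p_i$ is the frequency of allele $i$.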
Each sum on the right has $n$ terms, where $n$ is the number of alleles, rather than $n(n + 1)/2$, the number of genotypes. Note that the sum in parentheses on the right-hand side is the homozygosity, $f_s$.
An approximation to the probability of identical genotypes, due to Wong et al. (1987; see also Brenner and Morris 1990), is $2f_s^2 - f_s^3$. This gives the maximum value and is quite accurate for small $f_s$ or when the allele frequencies are roughly equal.
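As a numerical check on this footnote, the sketch below compares three quantities for a set of allele frequencies in HW proportions: the sum of squared genotype frequencies computed genotype by genotype, the allele-based form $2(\sum_i p_i^2)^2 - \sum_i p_i^4$, and the approximation $2f_s^2 - f_s^3$. The allele frequencies are hypothetical and the function names are ours.

```python
def brute_force_identity(p):
    """Sum of squared HW genotype frequencies over all n(n+1)/2 genotypes."""
    n = len(p)
    total = sum(x ** 4 for x in p)  # homozygotes: (p_i^2)^2
    for i in range(n):
        for j in range(i + 1, n):
            total += (2 * p[i] * p[j]) ** 2  # heterozygotes
    return total


def allele_based_identity(p):
    """Same quantity using only sums over the n alleles."""
    fs = sum(x ** 2 for x in p)  # homozygosity
    return 2 * fs ** 2 - sum(x ** 4 for x in p)


def approx_identity(p):
    """Approximation 2*fs^2 - fs^3 based on homozygosity alone."""
    fs = sum(x ** 2 for x in p)
    return 2 * fs ** 2 - fs ** 3


skewed = [0.4, 0.3, 0.2, 0.1]  # hypothetical, unequal frequencies
equal = [0.25] * 4             # hypothetical, equal frequencies

for p in (skewed, equal):
    print(f"{brute_force_identity(p):.6f} "
          f"{allele_based_identity(p):.6f} "
          f"{approx_identity(p):.6f}")
# 0.144600 0.144600 0.153000
# 0.109375 0.109375 0.109375
```

For the unequal frequencies the approximation exceeds the exact value, and for equal frequencies the two coincide, in line with the statement that the approximation gives the maximum value.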