Unlike many of the technical aspects of DNA typing that are validated by daily use in hundreds of laboratories, the extraordinary population-frequency estimates sometimes reported for DNA typing do not arise in research or medical applications that would provide useful validation of the frequency of any particular person's DNA profile. Because it is impossible or impractical to draw a large enough population to test calculated frequencies for any particular DNA profile much below 1 in 1,000, there is not a sufficient body of empirical data on which to base a claim that such frequency calculations are reliable or valid per se. The assumption of independence must be strictly scrutinized and estimation procedures appropriately adjusted if possible. (The rarity of all the genotypes represented in the databank can be demonstrated by pairwise comparisons. Thus, in a recently reported analysis of the FBI database, no exactly matching pairs of profiles were found in five-locus DNA profiles, and the closest match was a single three-locus match among 7.6 million pairwise comparisons.)13
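The pairwise-comparison check mentioned in the parenthesis above can be sketched in a few lines. What follows is a minimal illustration, not the reported FBI analysis: the profile encoding (a tuple with one genotype label per locus), the function name `pairwise_match_counts`, and the toy data are all assumptions, and exact equality of labels stands in for whatever match rule a laboratory actually applies.

```python
from collections import Counter
from itertools import combinations


def pairwise_match_counts(profiles):
    """Tally, over all unordered pairs of profiles, how many loci match.

    `profiles` is a list of equal-length tuples, one genotype label per locus.
    Exact equality is a stand-in for a real match rule (assumption).
    Returns a Counter mapping number-of-matching-loci -> number of pairs.
    """
    counts = Counter()
    for a, b in combinations(profiles, 2):
        matching_loci = sum(1 for ga, gb in zip(a, b) if ga == gb)
        counts[matching_loci] += 1
    return counts


# Toy database of four three-locus profiles (invented labels).
toy_db = [
    ("A1/A2", "B3/B3", "C1/C4"),
    ("A1/A2", "B1/B2", "C2/C3"),
    ("A3/A4", "B3/B3", "C1/C4"),
    ("A5/A6", "B4/B5", "C5/C6"),
]
counts = pairwise_match_counts(toy_db)

# A database of n profiles yields n * (n - 1) / 2 unordered pairs.
total_pairs = len(toy_db) * (len(toy_db) - 1) // 2
```

Because the number of pairs grows quadratically with the number of profiles, even a database of a few thousand profiles supports millions of comparisons. Such a tally shows how rarely profiles in the database coincide, but it does not by itself validate a frequency estimate for any one profile.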
The multiplication rule has been routinely applied to blood-group frequencies in the forensic setting. However, that situation is substantially different: Because conventional genetic markers are only modestly polymorphic (with the exception of human leukocyte antigen, HLA, which usually cannot be typed in forensic specimens), the multilocus genotype frequencies are often about 1 in 100. Such estimates have been tested by simple empirical counting. Pairwise comparisons of allele frequencies have not revealed any correlation across loci. Hence, the multiplication rule does not appear to lead to the risk of extrapolating beyond the available data for conventional markers. In contrast, highly polymorphic DNA markers exceed the informative power of protein markers, so multiplication leads to estimates that are less than the reciprocal of the size of the databases.
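To make the contrast concrete, here is a small arithmetic sketch. The per-locus genotype frequencies and the database size are invented for illustration and are not taken from any real database; the point is only that a product of several small frequencies quickly falls below 1/N, where N is the number of profiles available for direct counting.

```python
from math import prod


def multiplied_frequency(per_locus_freqs):
    """Multiplication rule: product of per-locus genotype frequencies.

    Meaningful only if the loci are statistically independent.
    """
    return prod(per_locus_freqs)


database_size = 1_000                    # hypothetical number of profiles
conventional = [0.2, 0.25, 0.2]          # hypothetical modestly polymorphic markers
dna_markers = [0.01, 0.02, 0.01, 0.02]   # hypothetical highly polymorphic markers

conv_est = multiplied_frequency(conventional)  # about 1 in 100
dna_est = multiplied_frequency(dna_markers)    # about 1 in 25 million

# An estimate can be checked by direct counting only if the database is
# large enough to expect at least about one occurrence of the genotype.
conv_checkable = conv_est >= 1 / database_size
dna_checkable = dna_est >= 1 / database_size
```

In this toy setting the conventional-marker estimate stays within the range a database of 1,000 can test by counting, while the DNA-marker estimate lies several orders of magnitude below it and rests entirely on the independence assumption.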
The multiplication rule is based on the assumption that the population does not contain subpopulations with distinct allele frequencies—that each individual's alleles constitute statistically independent random selections from a common gene pool. Under this assumption, the procedure for calculating the population frequency of a genotype is straightforward:
Count the frequency of alleles. For each allele in the genotype, examine a random sample of the population and count the proportion of matching alleles—that is, alleles that would be declared to match according to the rule that is used for declaring matches in a forensic context. This step requires only the selection of a sample that is truly random with reference to the genetic type; it does not appeal to any theoretical models.
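The counting step can be sketched as follows. This is an illustration under stated assumptions: alleles are represented by measured fragment lengths, and the match rule is a hypothetical window of plus or minus a fixed fraction (`tolerance`) around the target allele, standing in for whatever rule is used for declaring matches in a forensic context. The function name and sample values are invented.

```python
def matching_allele_proportion(sample_alleles, target, tolerance=0.025):
    """Proportion of alleles in a population sample that would be declared
    to match `target` under a +/- `tolerance` window (assumed match rule)."""
    if not sample_alleles:
        raise ValueError("sample must contain at least one allele")
    lower = target * (1 - tolerance)
    upper = target * (1 + tolerance)
    matches = sum(1 for allele in sample_alleles if lower <= allele <= upper)
    return matches / len(sample_alleles)


# Invented fragment lengths standing in for a random population sample.
sample = [1500, 1520, 1980, 2010, 2400, 1490, 3100, 2000, 1505, 2600]
freq_2000 = matching_allele_proportion(sample, target=2000)
freq_1500 = matching_allele_proportion(sample, target=1500)
```

The function only counts; it invokes no population-genetic model, which mirrors the point that this step depends solely on the sample being random with respect to the genetic type.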