For each allele, a modified ceiling frequency should be determined by (1) calculating the 95% upper confidence limit for the allele frequency in each of the existing population samples and (2) using the largest of these values or 10%, whichever is larger. The use of the 95% upper confidence limit represents a pragmatic approach to recognize the uncertainties in current population sampling. The use of a lower bound of 10% (until data from ethnic population studies are available) is designed to address a remaining concern that populations might be substructured in unknown ways with unknown effect and the concern that the suspect might belong to a population not represented by existing databanks or a subpopulation within a heterogeneous group. We note that a 10% lower bound is recommended while awaiting the results of the population studies of ethnic groups, whereas a 5% lower bound will likely be appropriate afterwards. In the context of the discussion of the ceiling principle, the higher threshold reflects the greater uncertainty in using allele frequency estimates as predictors for unsampled subpopulations.
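The two-step rule above can be sketched in a few lines of Python. This is only an illustration, not the committee's procedure verbatim: it assumes the normal-approximation limit given in the footnote to the worked example (observed frequency plus 1.96 standard errors, with N counted in chromosomes, two per person), and the names `upper_95_limit` and `ceiling_frequency` are hypothetical.

```python
import math


def upper_95_limit(p, n_chromosomes):
    """Upper 95% confidence limit for an observed allele frequency p.

    Assumption: normal approximation, p + 1.96 * sqrt(p(1-p)/N),
    where N is the number of chromosomes studied.
    """
    return p + 1.96 * math.sqrt(p * (1.0 - p) / n_chromosomes)


def ceiling_frequency(samples, floor=0.10):
    """Modified ceiling frequency for one allele.

    samples: list of (observed_frequency, persons_sampled) pairs,
             one pair per population sample.
    floor:   the lower bound (10% for now, per the recommendation).
    """
    # Step 1: upper 95% limit in each sample (2 chromosomes per person).
    limits = [upper_95_limit(p, 2 * persons) for p, persons in samples]
    # Step 2: the largest of these limits or the floor, whichever is larger.
    return max(max(limits), floor)


# 0.124 observed in a 200-person sample gives a ceiling of about 0.156.
print(round(ceiling_frequency([(0.124, 200)]), 3))
# A rare allele (hypothetical 1% in 750 persons) falls back to the 0.10 floor.
print(ceiling_frequency([(0.01, 750)]))
```

Passing `floor=0.05` would model the lower bound that the text anticipates once the ethnic population studies are complete.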

Once the ceiling for each allele is determined, the multiplication rule should be applied. The race of the suspect should be ignored in performing these calculations.
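As a rough sketch of the multiplication step, the function below multiplies per-locus genotype frequencies computed from allele ceilings. It assumes every locus is heterozygous, so each locus contributes 2pq, as in the report's two-locus example; the function name `genotype_frequency` is hypothetical, and homozygous loci are not handled here.

```python
def genotype_frequency(loci):
    """Multiply heterozygote frequencies (2 * p * q) across loci.

    loci: list of (p, q) pairs, the ceiling frequencies of the two
          alleles carried at each locus. Assumes all loci are
          heterozygous.
    """
    freq = 1.0
    for p, q in loci:
        freq *= 2.0 * p * q
    return freq


# Ceiling values taken from the report's two-locus example:
# locus A (0.10, 0.156) and locus B (0.10, 0.249).
estimate = genotype_frequency([(0.10, 0.156), (0.10, 0.249)])
print(round(estimate, 6))   # 0.001554
print(round(1 / estimate))  # 644, i.e. roughly 1 in 644
```

Because the ceilings are the same whatever the suspect's background, the function takes no population or race argument.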

Regardless of the calculated frequency, an expert should, given the relatively small number of loci used and the available population data, avoid assertions in court that a particular genotype is unique in the population. Finally, we recommend that the testing laboratory point out that the reported population frequency, although it represents a reasonable scientific judgment based on available data, is an estimate derived from assumptions about the U.S. population that are being further investigated.

As an example, suppose that a suspect has genotype A1/A2, B1/B2 at loci A and B and that three U.S. populations have been sampled in the current "convenience sample" manner and typed for these loci. The likelihood of a match for this two-locus genotype would be estimated as follows:


                Population 1     Population 2     Population 3     Derived frequency
                (750 persons)    (500 persons)    (200 persons)

Locus A
  Allele A1                                                        Use 0.10
  Allele A2                                                        0.124 + 0.032 = 0.156 (a)

Locus B
  Allele B1                                                        Use 0.10
  Allele B2                                                        0.228 + 0.021 = 0.249 (a)

Loci A and B    [2(0.10)(0.156)][2(0.10)(0.249)] = 0.001554

(a) The upper 95% confidence limit is given by the formula $p + 1.96\sqrt{p(1-p)/N}$, where p is the observed frequency and N is the number of chromosomes studied.
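As a check on the two derived values, the normal-approximation limit $p + 1.96\sqrt{p(1-p)/N}$ can be worked through by hand. The sample sizes used here are an inference rather than something stated with the values: N = 400 chromosomes (the 200-person sample) and N = 1500 chromosomes (the 750-person sample) are the sizes among the three samples that reproduce the printed margins of 0.032 and 0.021.

```latex
\begin{align*}
\text{Allele A2:}\quad
0.124 + 1.96\sqrt{\frac{0.124 \times 0.876}{400}}
  &\approx 0.124 + 0.032 = 0.156 \\[4pt]
\text{Allele B2:}\quad
0.228 + 1.96\sqrt{\frac{0.228 \times 0.772}{1500}}
  &\approx 0.228 + 0.021 = 0.249
\end{align*}
```

Both limits exceed the 10% floor, so they are used directly as the ceilings for A2 and B2.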
