$Q/[(1 - Q)Np]$; when $Np$ is small, this is essentially the same as the preceding case. A Bayesian analysis is particularly well suited to the case where the database can be expected to contain almost all reasonable suspects. In that case, the prior odds, $Q/(1 - Q)$, would be large.

Equation 5.5 can be derived as follows. Let $M$ denote the number of pairwise matches in the population when $K$ loci are typed. To evaluate $P\{M \ge 1\}$, we let $E = \sum_j \pi_j^2$, where $\pi_j$ is the probability of the $j$th genotype in some fixed enumeration of the set of all possible genotypes. As an application of the "birthday problem" with unequal probabilities (Aldous 1989, p 109), we have

$$P\{M \ge 1\} \approx 1 - \exp\left(-N^2 E / 2\right) \tag{5.12}$$

if $N$ is large and $\max_j \pi_j$ is small. The contribution to $E$ of a single locus, expressed in terms of the allele frequencies $p_i$ and homozygosity $f = \sum_i p_i^2$ at that locus, is

$$\sum_i p_i^4 + \sum_{i<j} (2 p_i p_j)^2 = 2f^2 - \sum_i p_i^4 \le 2f^2.$$

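Taking the product over all loci, we find that an upper bound for $E$ is $\prod_L [2 f_L^2]$, where $f_L$ is the homozygosity at locus $L$. Hence, a simple approximate upper bound for the desired probability is

$$P\{M \ge 1\} \le 1 - \exp\left[-N^2 (2f^2)^K / 2\right] \tag{5.13}$$

where $f = \left(\prod_L f_L\right)^{1/K}$ is the geometric mean of the homozygosities.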
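The single-locus step can be checked numerically. The following is a minimal sketch, assuming Hardy-Weinberg genotype proportions (homozygote frequency $p_i^2$, heterozygote frequency $2 p_i p_j$); the allele frequencies and function names are hypothetical and chosen only for illustration. It enumerates every genotype at one locus, sums the squared genotype frequencies, and compares the result with $2f^2 - \sum_i p_i^4$ and with the simpler bound $2f^2$.

```python
from itertools import combinations_with_replacement


def enumerated_match_probability(allele_freqs):
    """Sum of squared genotype frequencies at one locus.

    Assumes Hardy-Weinberg proportions: p_i^2 for homozygotes and
    2 * p_i * p_j for heterozygotes.
    """
    total = 0.0
    n_alleles = len(allele_freqs)
    for i, j in combinations_with_replacement(range(n_alleles), 2):
        p_i, p_j = allele_freqs[i], allele_freqs[j]
        genotype_freq = p_i * p_i if i == j else 2.0 * p_i * p_j
        total += genotype_freq ** 2
    return total


def closed_forms(allele_freqs):
    """Return (2f^2 - sum p_i^4, 2f^2) for homozygosity f = sum p_i^2."""
    f = sum(p ** 2 for p in allele_freqs)
    fourth_powers = sum(p ** 4 for p in allele_freqs)
    return 2.0 * f ** 2 - fourth_powers, 2.0 * f ** 2


# Hypothetical three-allele locus; these frequencies are illustrative only.
freqs = [0.5, 0.3, 0.2]
enumerated = enumerated_match_probability(freqs)
exact, bound = closed_forms(freqs)
print(enumerated, exact, bound)
```

The enumerated sum and the closed form agree up to floating-point rounding, and both lie below $2f^2$, which is the quantity carried into the product over loci.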

If the homozygosity of some loci is moderate or high, as for some PCR loci, the following refinement of our approximate upper bound can be useful because it shows that a smaller number of loci may yield uniqueness at each given probability level. In the above derivation, instead of dropping $\sum_i p_i^4$, note from Jensen's inequality (see, for example, James and James 1959) that $\sum_i p_i^4 \ge \left(\sum_i p_i^2\right)^3 = f^3$. That leads to the approximate upper bound obtained by setting

$$E = \prod_L \left(2 f_L^2 - f_L^3\right) \tag{5.14}$$

in Equation 5.12.
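The Jensen step can be spelled out as follows. This is one way to see it, not necessarily the route taken in the cited reference; the random variable $X$ is introduced here purely for illustration. Draw an allele index $I$ with probability $p_I$ and set $X = p_I$.

```latex
\begin{align*}
\mathbb{E}[X]   &= \sum_i p_i \cdot p_i   = \sum_i p_i^2 = f, \\
\mathbb{E}[X^3] &= \sum_i p_i \cdot p_i^3 = \sum_i p_i^4, \\
\mathbb{E}[X^3] &\ge \left(\mathbb{E}[X]\right)^3
  \quad\text{(Jensen, since } x \mapsto x^3 \text{ is convex for } x \ge 0\text{)}, \\
\text{so}\quad 2f^2 - \sum_i p_i^4 &\le 2f^2 - f^3 .
\end{align*}
```

Because $2f^2 - f^3 < 2f^2$ whenever $f > 0$, the refined per-locus factor is always smaller than the simple one, and the gap widens as the homozygosity grows.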

As an example, suppose $N = 5 \times 10^9$ and $f_L = 0.5$ for every $L$. If we insist that the probability of simultaneous uniqueness of all profiles exceed 0.99, then Equation 5.13 requires 71 loci, whereas Equations 5.12 and 5.14 show that 50 actually suffice.
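The arithmetic in this example can be reproduced with a short sketch. It assumes the form $P\{M \ge 1\} \approx 1 - \exp(-N^2 E/2)$ for the birthday-problem approximation and equal homozygosity at every locus, so that $E$ is a single per-locus factor raised to the power $K$. The function names are hypothetical.

```python
import math


def prob_any_match(n_people, e_match):
    """Approximate P{M >= 1} as 1 - exp(-N^2 * E / 2)."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is small.
    return -math.expm1(-0.5 * n_people ** 2 * e_match)


def loci_needed(n_people, per_locus_factor, max_prob):
    """Smallest K for which the match probability is at most max_prob.

    Assumes every locus contributes the same factor, so E = factor ** K.
    """
    k = 1
    while prob_any_match(n_people, per_locus_factor ** k) > max_prob:
        k += 1
    return k


N = 5e9          # population size used in the example
f = 0.5          # homozygosity at every locus
max_prob = 0.01  # uniqueness probability must exceed 0.99

simple = loci_needed(N, 2 * f ** 2, max_prob)            # per-locus factor 2f^2
refined = loci_needed(N, 2 * f ** 2 - f ** 3, max_prob)  # per-locus factor 2f^2 - f^3
print(simple, refined)  # 71 50
```

Under these assumptions the search returns 71 loci for the simple bound and 50 for the refined one, matching the figures quoted above.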