Page 89

4 Population Genetics

Much of the controversy about the forensic use of DNA has involved population genetics. In this chapter, we first explain the principles that are generally applicable. We then consider the special problem that arises because the population of the United States includes different population groups and subgroups with different allele frequencies. We develop and illustrate procedures for taking substructure into account in calculating match probabilities. We then show how those procedures can be applied to VNTRs and PCR-based systems.

Consider the comparison of DNA from a crime-scene specimen and from a suspect. (Actually, the evidence DNA need not come from the crime scene, nor the second sample from a suspect, but we use this vocabulary for convenience.) Under current procedures, if the DNA profile from the crime-scene sample reportedly matches that of the suspect, there are two possibilities (aside from error): The DNA at the crime scene came from the suspect or the DNA at the crime scene came from someone else who had the same profile as the suspect. If the DNA profile in question is common in the population, the crime-scene DNA might well have come from someone other than the suspect. If it is rare, the matching of the two DNA profiles is unlikely to be a mere coincidence; the rarer the profile, the less likely it is that the two DNA samples came from different persons.

To assess the probability that DNA from a randomly selected person has the same profile as the evidence DNA, we need to know the frequency of that profile in the population. That frequency is usually determined by comparison with some reference data set. A very small proportion of the trillions of possible profiles are found in any database, so it is necessary to use the frequencies of individual alleles to estimate the frequency of a given profile. That approach necessitates some assumptions about the mating structure of the population, and that is where population genetics comes in.¹
Allele and Genotype Proportions
It is conventional in genetics to designate each gene or marker locus with a letter and each allele at that locus with a subscript numeral. So, $A_{10}$ designates the tenth allele at locus A, $B_5$ the fifth allele at locus B, and so on. When we want a statement to apply to any of the alleles of a given locus, we use a literal subscript, such as $i$ or $j$. We designate the frequencies (it is customary to use the word frequency for relative frequency, meaning proportion) of alleles with the letter $p$ and a corresponding subscript. Thus, the frequency of allele $A_3$ is $p_3$ and of allele $A_i$ is $p_i$. The sum of all the $p_i$ values is 1 because it includes all the possibilities. Symbolically, if $\Sigma$ stands for summation, $\Sigma p_i = 1$.
At the DQA locus, discussed in Chapter 2, six alleles are customarily used in forensic analysis (Table 4.1). For example, allele D1.1 (designated as 1.1 in the table) has a proportion of 0.150, or 15.0%, in the black population; this was computed from the observed proportions in the genotype portion of the table. The first six genotypes include the 1.1 allele (the top one has two copies), and adding their frequencies, 0.036 + (0.076 + 0.009 + 0.036 + 0.027 + 0.080)/2, yields 0.150. The division by 2 is needed because in heterozygotes only half the alleles are D1.1.
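The allele-counting step just described can be sketched in a few lines of Python. This is only an illustration: the function name `allele_frequency` and the dictionary layout are assumptions made here, and the six input values are the observed black-sample genotype proportions from Table 4.1 that involve allele 1.1.

```python
# Observed genotype proportions (black sample, Table 4.1) for the six
# genotypes that carry allele 1.1. The dictionary layout is an assumption
# made for this sketch.
observed = {
    ("1.1", "1.1"): 0.036,
    ("1.1", "1.2"): 0.076,
    ("1.1", "1.3"): 0.009,
    ("1.1", "2"): 0.036,
    ("1.1", "3"): 0.027,
    ("1.1", "4"): 0.080,
}


def allele_frequency(genotype_freqs, allele):
    """Sum genotype frequencies, weighting each by (copies of allele) / 2."""
    total = 0.0
    for (a, b), freq in genotype_freqs.items():
        copies = (a == allele) + (b == allele)  # 2 for a homozygote, 1 for a heterozygote
        total += freq * copies / 2
    return total


freq_1_1 = allele_frequency(observed, "1.1")
print(round(freq_1_1, 3))  # 0.15
```

The homozygote contributes its full frequency and each heterozygote contributes half, which is the same arithmetic as the sum written out in the text.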
Random Mating and Hardy-Weinberg Proportions
In the simplest population structure, mates are chosen at random. Clearly, the population of the United States does not mate at random; a person from Oregon is more likely to mate with another from Oregon than with one from Florida. Furthermore, people often choose mates according to physical and behavioral attributes, such as height and personality. But they do not choose each other according to the markers used for forensic studies, such as VNTRs and STRs. Rather, the proportion of matings between people with two marker genotypes is determined by their frequencies in the mating population. If the allele frequencies in Oregon and Florida are the same as those in the nation as a whole, then the proportions of genotypes in the two states will be the same as those for the United States, even though the population of the whole country clearly does not mate at random.
We use random mating to refer to choice of mates independently of genotype at the relevant loci and independently of ancestry. The expected proportions with random mating are called the Hardy-Weinberg (HW) proportions, after G. H. Hardy, a British mathematician, and Wilhelm Weinberg, a German physician. For example, suppose that the proportions of alleles $A_1$, $A_2$, and $A_3$ are $p_1$, $p_2$, and $p_3$, respectively. The proportions of the three alleles among the sperm are given along the top of Table 4.2, and among the eggs, along the left margin. (It is intuitively reasonable and easily demonstrated that random mating is equivalent to combining gametes at random.) The genotypes and their frequencies are given in the interior of the table. The proportion, or frequency, of $A_1A_1$ homozygotes is thus $p_1^2$, and the proportion of $A_2A_3$ (we do not distinguish between $A_2A_3$ and $A_3A_2$) heterozygotes is $p_2p_3 + p_3p_2 = 2p_2p_3$.
1 An elementary exposition of population genetics is found in Hartl and Clark (1989). A more advanced text, with discussion of many of the formulae used here, is Nei (1987). Practical details of estimation and analysis are given by Weir (1990). See also Weir (1995a).
TABLE 4.1 Observed and Expected Frequencies of DQA Genotypes Based on 224 Blacks and 413 Whites (a)

Allele Frequency (%)
Allele      Black     White
1.1         15.0      13.7
1.2         26.3      19.7
1.3          4.5       8.5
2           12.1      10.9
3           11.8      20.1
4           30.3      27.1

Observed (Expected) Genotype Frequency (%)
Genotype         Black            White
1.1/1.1 *         3.6 (2.3)        2.2 (1.9)
1.1/1.2           7.6 (7.9)        3.6 (5.4)
1.1/1.3           0.9 (1.4)        2.9 (2.3)
1.1/2             3.6 (3.6)        1.9 (3.0)
1.1/3             2.7 (3.5)        5.3 (5.5)
1.1/4             8.0 (9.1)        9.2 (7.4)
1.2/1.2 *         8.5 (6.9)        4.6 (3.9)
1.2/1.3           2.2 (2.4)        3.4 (3.4)
1.2/2             4.0 (6.4)        4.6 (4.3)
1.2/3             7.1 (6.2)        8.2 (7.9)
1.2/4            14.7 (16.0)      10.4 (10.7)
1.3/1.3 *         0.0 (0.2)        1.2 (0.7)
1.3/2             2.2 (1.1)        1.5 (1.9)
1.3/3             1.3 (1.1)        1.7 (3.4)
1.3/4             2.2 (2.7)        5.1 (4.6)
2/2 *             2.2 (1.5)        2.2 (1.2)
2/3               1.3 (2.9)        4.8 (4.4)
2/4               8.5 (7.4)        4.6 (5.9)
3/3 *             0.9 (1.4)        4.4 (4.0)
3/4               9.4 (7.2)       11.4 (10.9)
4/4 *             8.9 (9.2)        6.8 (7.3)
Homozygotes      24.1 (21.5)      21.4 (19.0)
Heterozygotes    75.7 (78.9)      78.6 (81.0)

(a) Homozygous genotypes are marked with an asterisk (boldface in the printed table). Data from Maryland State Crime Laboratory (Helmuth, Fildes, et al. 1990).
TABLE 4.2 Hardy-Weinberg Proportions for a Locus with Three Alleles

Alleles (and           Alleles (and Frequencies) in Sperm
Frequencies) in Eggs   A1 (p1)         A2 (p2)         A3 (p3)
A1 (p1)                A1A1 (p1p1)     A1A2 (p1p2)     A1A3 (p1p3)
A2 (p2)                A2A1 (p2p1)     A2A2 (p2p2)     A2A3 (p2p3)
A3 (p3)                A3A1 (p3p1)     A3A2 (p3p2)     A3A3 (p3p3)

According to Table 4.1, the proportions of alleles D2 and D4 in the white population are 0.109 and 0.271. If we assume HW and treat the sample allele frequencies as if they were the true population frequencies, then the proportion of genotype D2D2 would be $(0.109)^2 = 0.012$, or 1.2%; as Table 4.1 shows, the observed fraction in this sample is 2.2%. The proportion of genotype D2D4 would be $2(0.109)(0.271) = 0.059$, or 5.9%; the observed value is 4.6%. Neither of those differences is statistically significant. (Note that genotype D1.3D1.3 was not found in the black database of 224 persons. With multiple alleles and four or five loci, as with VNTRs, most genotypes are not found in any given database.)
The HW relationship is easily stated symbolically. Using letter subscripts for generality, we let $p_i$ and $p_j$ be the population proportions of two alleles $A_i$ and $A_j$. If capital letters designate the genotypic proportions, the HW expectations are
$$P_{ii} = p_i^2, \tag{4.1a}$$
$$P_{ij} = 2p_ip_j, \quad i \neq j. \tag{4.1b}$$
In words, the simple rule is: The proportion of persons with two copies of the same allele is the square of that allele's frequency, and the proportion of persons with two different alleles is twice the product of the two frequencies.
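As a minimal sketch of this rule, the following Python function builds the full set of HW genotype proportions from a dictionary of allele frequencies. The function name `hardy_weinberg` and the data layout are assumptions made for illustration; the allele frequencies are the white-sample values from Table 4.1.

```python
from itertools import combinations_with_replacement


def hardy_weinberg(allele_freqs):
    """Return expected genotype proportions: p_i^2 for homozygotes, 2 p_i p_j otherwise."""
    expected = {}
    for a, b in combinations_with_replacement(list(allele_freqs), 2):
        pa, pb = allele_freqs[a], allele_freqs[b]
        expected[(a, b)] = pa * pa if a == b else 2 * pa * pb
    return expected


# White-sample DQA allele frequencies from Table 4.1.
white = {"1.1": 0.137, "1.2": 0.197, "1.3": 0.085, "2": 0.109, "3": 0.201, "4": 0.271}
hw_white = hardy_weinberg(white)

print(round(hw_white[("2", "2")], 3))  # 0.012
print(round(hw_white[("2", "4")], 3))  # 0.059
print(len(hw_white))                   # 21 genotypes for 6 alleles
```

The two printed proportions are the D2D2 and D2D4 expectations worked out in the text, and the 21 proportions sum to 1 because the allele frequencies do.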
If for some reason a population does not exhibit HW proportions, as will be the case if mating in the previous generation(s) has not been random, only a single generation of random mating is needed to produce HW proportions. This is clear from Table 4.2, which shows that the proportions of gametes that unite to produce individuals in the next generation depend only on the allele frequencies, not the parental genotypes of the current generation. That property adds greatly to the usefulness of Equations 4.1, because it increases the probability that they are accurate. Populations from different parts of the world with different allele frequencies can be homogenized in a single generation, provided that mating is random. Of course, exactly random mating is very unlikely, but the equations are accurate enough for many practical purposes. In Chapter 5 we give estimates of the degree of uncertainty caused by departures from random mating proportions.
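The single-generation property can be checked numerically. The sketch below starts from a deliberately extreme, hypothetical parental population with no heterozygotes, forms the next generation by random union of gametes, and then repeats the step; the function names and the starting proportions are assumptions chosen only for illustration.

```python
def allele_freqs_from_genotypes(genotype_freqs):
    """Gamete (allele) frequencies: each genotype passes on each of its two alleles half the time."""
    freqs = {}
    for (a, b), g in genotype_freqs.items():
        freqs[a] = freqs.get(a, 0.0) + g / 2
        freqs[b] = freqs.get(b, 0.0) + g / 2
    return freqs


def random_mating_offspring(genotype_freqs):
    """Offspring genotype proportions after one generation of random union of gametes."""
    p = allele_freqs_from_genotypes(genotype_freqs)
    alleles = sorted(p)
    offspring = {}
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            offspring[(a, b)] = p[a] ** 2 if a == b else 2 * p[a] * p[b]
    return offspring


# Hypothetical parental population far from HW proportions: no heterozygotes at all.
parents = {("A1", "A1"): 0.6, ("A2", "A2"): 0.4}
gen1 = random_mating_offspring(parents)
gen2 = random_mating_offspring(gen1)

print({k: round(v, 2) for k, v in gen1.items()})
print({k: round(v, 2) for k, v in gen2.items()})
```

The first generation already shows the HW proportions 0.36, 0.48, and 0.16, and the second generation is unchanged, because the offspring proportions depend only on the allele frequencies 0.6 and 0.4.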
Table 4.1 shows how close actual populations come to HW proportions for DQA. The deviations from HW expectations are not great. In the white population, there is a small but statistically significant excess of homozygotes ($P \approx 0.03$); there is an excess in the black population also, but it is not statistically significant.² It is not unusual to find a slightly higher proportion of homozygotes than predicted. We consider reasons for that later in the chapter.
In forensic applications, we are often interested in the magnitude of a difference, not just its statistical significance.³ In the example above, the deficiency in the observed frequency of heterozygotes is greater in the black population than in the white, but only in the latter is it statistically significant. This is because statistical significance depends strongly on sample size: In large samples, quite small differences can be statistically significant but may not be biologically meaningful.
HW Proportions in a Large Sample
The data in Table 4.1 show approximate agreement with HW expectations, but there is some discrepancy. In the black population, the deficiency of heterozygotes is about 4%, and in the white population, it is about 3%. Most of this discrepancy comes from uncertainty introduced because of the sizes of the databases (224 and 413 persons). With larger samples, we would expect the agreement to be better.
2 The usual $\chi^2$ procedure is weak as a test for departure from HW proportions. The following test has considerably more power to detect departures from equilibrium of particular interest in population genetics (Robertson and Hill 1984). In a database of size $N$, let $X_{ij}$ denote the number of persons of genotype $A_iA_j$. We assume the model
$$P_{ii} = p_i^2 + p_i(1 - p_i)\theta, \qquad P_{ij} = 2p_ip_j(1 - \theta), \quad i \neq j.$$
We want to test the hypothesis that $\theta = 0$ (i.e., HW proportions; see section on subpopulation theory for a discussion of $\theta$). It can be shown that a score test, which can be expected to be particularly powerful in detecting small values of $\theta$, is based on the statistic
$$T = \frac{\sum_{i=1}^{K} X_{ii}/Q_i - N}{\sqrt{N(K - 1)}},$$
where $K$ is the number of alleles and $Q_i$ is the maximum likelihood estimate of $p_i$ if $\theta = 0$: in this case, $Q_i$ is the observed proportion of $A_i$ alleles.
An excess of homozygotes will lead to a positive value of $T$. Provided that $N$ is large enough, the statistic $T$ has approximately a standard normal distribution if $\theta = 0$.
In this case, for the white population in Table 4.1, the $X_{ii}$ values are 413(0.022), 413(0.046), . . . ; the estimated values of $p_i$ are 0.137, 0.197, . . . ; $N = 413$; and $K = 6$. Substituting those values into the equation gives $T = 1.88$, which from a table of the normal probability integral gives $P \approx 0.03$. For the black population, $T = 0.77$, giving $P \approx 0.22$, where $P$ refers to the probability.
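The calculation in this footnote can be sketched in Python. The sketch assumes that the score statistic has the form $T = (\sum_i X_{ii}/Q_i - N)/\sqrt{N(K-1)}$; that form is an assumption here, adopted because it reproduces the $T$ and $P$ values quoted above from the Table 4.1 data. The function names are hypothetical, and the one-sided $P$ value is taken from the standard normal distribution.

```python
from math import erf, sqrt


def hw_score_statistic(homozygote_props, allele_freqs, n_persons):
    """Assumed form: T = (sum_i X_ii / Q_i - N) / sqrt(N (K - 1)), with X_ii = N * proportion."""
    k = len(allele_freqs)
    ratio_sum = sum(x / q for x, q in zip(homozygote_props, allele_freqs))
    return (ratio_sum - 1.0) * sqrt(n_persons / (k - 1))


def upper_tail_p(t):
    """One-sided P value for a standard normal deviate t."""
    return 0.5 * (1.0 - erf(t / sqrt(2.0)))


# Allele frequencies and observed homozygote proportions from Table 4.1,
# in the allele order 1.1, 1.2, 1.3, 2, 3, 4.
white_q = [0.137, 0.197, 0.085, 0.109, 0.201, 0.271]
white_hom = [0.022, 0.046, 0.012, 0.022, 0.044, 0.068]
black_q = [0.150, 0.263, 0.045, 0.121, 0.118, 0.303]
black_hom = [0.036, 0.085, 0.000, 0.022, 0.009, 0.089]

t_white = hw_score_statistic(white_hom, white_q, 413)
t_black = hw_score_statistic(black_hom, black_q, 224)
print(round(t_white, 2), round(upper_tail_p(t_white), 2))  # 1.88 0.03
print(round(t_black, 2), round(upper_tail_p(t_black), 2))  # 0.77 0.22

# Same observed proportions in a hypothetical database four times as large.
t_black_4x = hw_score_statistic(black_hom, black_q, 4 * 224)
```

With the proportions held fixed, $T$ grows as the square root of the database size, so `t_black_4x` is exactly twice `t_black`. That is the sample-size dependence of statistical significance discussed in the main text.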
3 The homozygote excess in this data set is larger than is usually found for this locus in more extensive recent studies (such as Rivas et al. 1995). The data in Table 4.1 come from a variety of sources. The data on the black population come mainly from disease-screening programs in California. The data on whites come from a forensic laboratory and from the CEPH (Centre d'Etude du Polymorphisme Humain) collection of family data, stored in France and used for genetic linkage studies.

To examine a much larger sample, we consider data on the M-N blood group locus in the New York City white population for six periods between 1931 and 1969. At this locus, there are two alleles, M and N, and therefore three genotypes, MM, MN, and NN. The data include 6,001 persons (12,002 genes). We chose this locus for three reasons. First, there are only two alleles, and all three genotypes are identified. Second, the allele frequencies are close to 1/2, maximizing the power to detect departures from HW ratios. Finally, the observations are highly reliable technically. They are from A. S. Wiener, the leading blood-group expert of the time. New York City is certainly not a homogeneous population. The persistence of two alleles at intermediate frequencies in many populations suggests that these blood groups are subject to natural selection, but the selection is probably weak, and there are only minor allele-frequency differences among various European countries (Mourant et al. 1976, p 251-260).
These blood-group data (Table 4.3) show that, even in a population as heterogeneous as that of New York City, HW ratios are very closely approximated for traits that are not factors in mate selection. The overall heterozygote frequency is within about 1% of its HW expectation. Agreement with HW expectations should be at least as close for loci, such as most of those used in forensics, that are thought to be selectively neutral.
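The relative error reported for each sample in Table 4.3 can be recomputed from the genotype counts alone. The sketch below is illustrative: the function name is an assumption, and only the first sample and the pooled total from the table are shown.

```python
def heterozygote_relative_error(mm, mn, nn):
    """Return (p_M, expected MN count, relative error) from M-N genotype counts."""
    total = mm + mn + nn
    p_m = (2 * mm + mn) / (2 * total)   # each MM person carries two M genes, each MN person one
    p_n = 1.0 - p_m
    expected_mn = 2 * p_m * p_n * total  # HW expectation for heterozygotes
    return p_m, expected_mn, (expected_mn - mn) / expected_mn


# Genotype counts (MM, MN, NN) from Table 4.3.
sample_1 = (71, 116, 49)
all_samples = (1851, 2990, 1160)

p_m_1, exp_1, err_1 = heterozygote_relative_error(*sample_1)
p_m_all, exp_all, err_all = heterozygote_relative_error(*all_samples)
print(round(p_m_1, 4), round(exp_1, 3), round(err_1, 4))   # 0.5466 116.975 0.0083
print(round(p_m_all, 4), round(err_all, 4))                # 0.5576 -0.0099
```

A positive relative error means fewer heterozygotes than HW predicts; a negative value, as in the pooled data, means slightly more.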
TABLE 4.3 M-N Blood Group Genotypes in New York City Whites (a)

Sample    Total     MM       MN       NN       p_M       p_N       Relative Error
1           236       71      116       49     0.5466    0.4534     0.0083
2           461      132      232       97     0.5380    0.4620    -0.0123
3           582      166      289      127     0.5335    0.4665     0.0024
4         3,268    1,037    1,623      608     0.5656    0.4344    -0.0107
5           954      287      481      186     0.5529    0.4471    -0.0198
6           500      158      249       93     0.5650    0.4350    -0.0131
Total     6,001    1,851    2,990    1,160     0.5576    0.4424    -0.0099

(a) The columns show the total number, numbers of the three genotypes, the allele frequencies, and the relative error, computed as follows: The expected number of heterozygotes is $2p_Mp_N \times$ Total. For sample 1 this is 2(0.5466)(0.4534)(236) = 116.975; relative error = (116.975 - 116.0)/116.975 = 0.0083, or 0.83%. The sources of the six convenience samples are (1) parents, (2) mothers, (3) patients and hospital staff, (4) donors and paternity cases, (5) professional donors, (6) paternity cases. Data from Mourant et al. (1976), p 274.
In the United States, bin frequencies within a racial group are usually similar in different regions. The top two graphs in Figure 4.1 show the similar distribution in white populations in Illinois and Georgia. Comparison of the black and the white populations illustrates a point often made by population geneticists, namely, that differences among individuals within a race are much larger than the differences between races. Nevertheless, the intergroup differences are large enough that the FBI and other forensic laboratories keep separate databases for whites and blacks, and two separate databases for Hispanics, one for those from the eastern United States and another for those from the West.
Figure 4.1 Fixed VNTR bins with frequencies of each bin in the United States. The locus is D2S44 with the enzyme HAE III: (A) Illinois white population, (B) Georgia white population, (C) US black population. From FBI (1993b), p 52, 51, 185.
Exclusion Power of a Locus
The data in Table 4.1 can be used for another purpose. As mentioned in Chapter 2, DQA data can distinguish samples from different individuals 93% of the time, clearing many innocent suspects. The overall probability that two independent persons will have the same DQA genotype is the sum of the squares of the genotype frequencies, as illustrated in Box 4.1.⁴
Box 4.1. Calculating the Exclusion Power of a Locus
We can illustrate the 93% average exclusion power of DQA by reference to the data in Table 4.1. The probability that two randomly chosen persons both have a particular genotype is the square of its frequency in the population. The probability that two randomly chosen persons have the same unspecified genotype is the sum of the squares of the frequencies of all the genotypes. Summing the squares of the expected genotype frequencies (in parentheses) for the black population yields $0.023^2 + 0.079^2 + \cdots + 0.092^2 = 0.078$. We used expected rather than observed genotype frequencies to obtain greater statistical precision. For the white population, the value is 0.063. The average is about 0.07. The exclusion power is the probability that the two persons do not have the same genotype, or $1 - 0.07 = 0.93$.
If there are $n$ loci, and the sum of squares of the genotype frequencies at locus $i$ is $P_i$, then the exclusion power is $1 - (P_1P_2 \cdots P_n)$. Five loci with the power of DQA would give an exclusion power of $1 - (0.07)^5 = 0.999998$.
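The Box 4.1 calculation can be sketched as follows. The sketch recomputes the expected (HW) genotype frequencies from the Table 4.1 allele frequencies rather than reading the rounded values in parentheses, so its results agree with the quoted 0.078 and 0.063 only to about the third decimal place. The function name is an assumption.

```python
from itertools import combinations_with_replacement


def same_genotype_probability(allele_freqs):
    """Sum of squared HW genotype frequencies: chance two random persons share a genotype."""
    total = 0.0
    for i, j in combinations_with_replacement(range(len(allele_freqs)), 2):
        if i == j:
            g = allele_freqs[i] ** 2
        else:
            g = 2 * allele_freqs[i] * allele_freqs[j]
        total += g * g
    return total


# DQA allele frequencies from Table 4.1 (alleles 1.1, 1.2, 1.3, 2, 3, 4).
black = [0.150, 0.263, 0.045, 0.121, 0.118, 0.303]
white = [0.137, 0.197, 0.085, 0.109, 0.201, 0.271]

pm_black = same_genotype_probability(black)   # close to the 0.078 quoted in Box 4.1
pm_white = same_genotype_probability(white)   # close to the 0.063 quoted in Box 4.1
exclusion_dqa = 1 - (pm_black + pm_white) / 2
exclusion_five_loci = 1 - 0.07 ** 5           # five loci, each with P_i = 0.07

print(round(exclusion_dqa, 2))        # 0.93
print(round(exclusion_five_loci, 6))  # 0.999998
```

Multiplying the per-locus values $P_i$ before subtracting from 1 treats the loci as independent, which is the assumption behind the multilocus formula in the box.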
4 The concept of exclusion power was initially described by Fisher (1951). The calculation of the exclusion power can be simplified, especially if the number of alleles is large, by noting that in HW proportions the unconditional probability of identical genotypes is
$$\sum_i p_i^4 + \sum_{i<j} (2p_ip_j)^2 = 2\Big(\sum_i p_i^2\Big)^2 - \sum_i p_i^4.$$
Each sum on the right has $n$ terms, where $n$ is the number of alleles, rather than $n(n + 1)/2$, the number of genotypes. Note that the sum in parentheses on the right-hand side is the homozygosity, $f_S$.
An approximation to the probability of identical genotypes, due to Wong et al. (1987; see also Brenner and Morris 1990), is $2f_S^2 - f_S^3$. This gives the maximum value and is quite accurate for small $f_S$ or when the allele frequencies are roughly equal.

Table 4.4 shows the frequency of bins (the VNTR equivalent of alleles; see Chapter 2) for two VNTR loci. D2S44 has an exclusion power of about 99%. The exclusion power of D17S79 is smaller because it has fewer alleles and more varied bin frequencies; its exclusion power is about 93%.
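These two exclusion powers can be recomputed from the bin counts in Table 4.4. The sketch below assumes HW proportions and uses the identity that the probability of identical genotypes equals $2(\sum_i p_i^2)^2 - \sum_i p_i^4$, which avoids enumerating every genotype. The function name is hypothetical, and bin proportions are computed from the counts $N$ rather than from the rounded proportions in the table.

```python
def exclusion_power(bin_counts):
    """1 minus the HW probability that two random persons share a genotype at one locus."""
    n_genes = sum(bin_counts)
    p = [c / n_genes for c in bin_counts]
    homozygosity = sum(x ** 2 for x in p)
    same_genotype = 2 * homozygosity ** 2 - sum(x ** 4 for x in p)
    return 1 - same_genotype


# Gene counts per bin from Table 4.4 (US white population).
d2s44 = [8, 5, 24, 38, 73, 55, 197, 170, 131, 79, 131,
         60, 65, 63, 136, 141, 119, 36, 27, 13, 13]
d17s79 = [16, 5, 11, 6, 23, 348, 307, 408, 309, 44, 50, 16, 9]

print(round(exclusion_power(d2s44), 2))   # 0.99
print(round(exclusion_power(d17s79), 2))  # 0.93
```

D17S79 scores lower because four bins hold most of its genes, so two unrelated persons are more likely to fall into the same pair of bins.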
TABLE 4.4 Bin (Allele) Frequencies at Two VNTR Loci (D2S44 and D17S79) in US White Population (a)

D2S44
Bin     Size Range        N        Prop.
3           0-871            8     0.005
4         872-963            5     0.003
5         964-1,077         24     0.015
6       1,078-1,196         38     0.024
7       1,197-1,352         73     0.046
8       1,353-1,507         55     0.035
9       1,508-1,637        197     0.124
10      1,638-1,788        170     0.107
11      1,789-1,924        131     0.083
12      1,925-2,088         79     0.050
13      2,089-2,351        131     0.083
14      2,352-2,522         60     0.038
15      2,523-2,692         65     0.041
16      2,693-2,862         63     0.040
17      2,863-3,033        136     0.086
18      3,034-3,329        141     0.089
19      3,330-3,674        119     0.075
20      3,675-3,979         36     0.023
21      3,980-4,323         27     0.017
22      4,324-5,685         13     0.008
25      5,686-              13     0.008
Total                    1,584     1.000

D17S79
Bin     Size Range        N        Prop.
1           0-639           16     0.010
2         640-772            5     0.003
3         773-871           11     0.007
4         872-1,077          6     0.004
6       1,078-1,196         23     0.015
7       1,197-1,352        348     0.224
8       1,353-1,507        307     0.198
9       1,508-1,637        408     0.263
10      1,638-1,788        309     0.199
11      1,789-1,924         44     0.028
12      1,925-2,088         50     0.032
13      2,089-2,351         16     0.010
14      2,352-               9     0.006
Total                    1,552     0.999

(a) D2 and D17 indicate that these loci are on chromosomes 2 and 17. N is the number of genes (twice the number of persons). Each bin includes a range of sizes (in base pairs) grouped so that no bin has fewer than five genes in the data set; this accounts for nonconsecutive bin numbers. Data from FBI (1993b), p 439, 530; see Budowle, Monson, et al. (1991).
Departures from HW Proportions
Clearly, the HW assumption is hardly ever exactly correct. The issue in forensic DNA analysis is whether the departures are large enough to be important. The earlier report (NRC 1992) recommended that databases be tested for agreement with HW expectations and that loci that exhibit statistically significant differences from the expectation be discarded. In our view, that places too much emphasis on formal statistical significance. In practice, statistically significant departures are more likely to be found in large databases because the larger the sample size, the more likely it is that a small (and perhaps unimportant) deviation will be detected; in a small database, even a large departure might not be statistically significant (see Table 4.1 for an example). If the approach recommended in 1992 is followed, the loci with the largest databases, which are the most reliable, would often not be used. As stated earlier, our approach is different. We explicitly assume that departures from HW proportions exist and use a theory that takes them into account. But, as can be seen from the MN data in Table 4.3, we expect the deviations to be small.
Departures from HW proportions in populations can occur for three principal reasons. First, parents might be related, leading to inbreeding. Inbreeding decreases the proportion of heterozygotes, with a compensatory increase in homozygotes.
Second, the population can be subdivided, as in the United States. There are major racial groups (black, Hispanic, American Indian, East Asian, white). Allele frequencies are often sufficiently different between racial groups that it is desirable to have separate databases. Within a race, there is likely to be subdivision. The blending in the melting pot is far from complete, and in the white population, for example, some groups of people reflect to a greater or lesser extent their European origins. A consequence of population subdivision is that mates might have a common origin. Translated into genetic terms, that means that they share some common ancestry; that is, they are related. Thus, the consequences of population structure are qualitatively the same as those of inbreeding: a decrease of heterozygotes and an increase of homozygotes.⁵
Third, persons with different genotypes might survive and reproduce at different rates. That is called selection. We shall not consider this possibility, however, because the VNTR and other loci traditionally used in forensic analysis are chosen specifically because they are thought to be selectively neutral or nearly so. Some, such as DQA, are associated with functional loci that are thought to be selected but show no important departures from HW expectations.
Inbreeding and Kinship
Inbreeding means mating of two persons who are more closely related than if they were chosen at random. The theory of inbreeding was worked out 75 years ago by Sewall Wright, who defined the inbreeding coefficient, $F$ (explained in Wright 1951). He gave a simple algorithm for computing $F$ for any degree of relationship of parents. The kinship coefficient, also designated by $F$ and used to measure degree of relationship between two persons, is the same as the inbreeding coefficient of a (perhaps hypothetical) child.⁶ For parent and child, $F = 1/4$; for sibs, 1/4; for half sibs, 1/8; for uncle (or aunt) and nephew (or niece), 1/8; for first cousins, 1/16; and for second cousins, 1/64.
5 There is a theoretical possibility of an increase in heterozygosity. It can happen in a population of first-generation children of different ancestral populations. But such populations are usually mixed with second-generation children, in whom heterozygosity is reduced, and there are other matings. So the effect of population subdivision is to increase homozygosity in the overwhelming majority, if not all, cases.
With inbreeding, the expected proportion of heterozygotes is reduced by a fraction $F$; that of homozygotes is correspondingly increased. Thus, with inbreeding,
$$P_{ii} = p_i^2 + p_i(1 - p_i)F, \tag{4.2a}$$
$$P_{ij} = 2p_ip_j(1 - F), \quad i \neq j. \tag{4.2b}$$
Because F for first cousins is 1/16, a population in which everybody had married a first cousin in the previous generation would be 1/16 less heterozygous than if marriages occurred without regard to family relationships.
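A short numerical sketch of the first-cousin example follows. It assumes the usual inbreeding adjustment, in which each homozygote proportion becomes $p_i^2 + p_i(1 - p_i)F$ and each heterozygote proportion becomes $2p_ip_j(1 - F)$; the allele frequencies are hypothetical and the function name is an assumption.

```python
def genotype_proportions(allele_freqs, f):
    """Genotype proportions under inbreeding coefficient f (f = 0 gives HW proportions)."""
    props = {}
    n = len(allele_freqs)
    for i in range(n):
        for j in range(i, n):
            if i == j:
                p = allele_freqs[i]
                props[(i, j)] = p * p + p * (1 - p) * f
            else:
                props[(i, j)] = 2 * allele_freqs[i] * allele_freqs[j] * (1 - f)
    return props


def heterozygosity(props):
    """Total proportion of heterozygous genotypes."""
    return sum(v for (i, j), v in props.items() if i != j)


p = [0.5, 0.3, 0.2]                          # hypothetical allele frequencies
random_mating = genotype_proportions(p, 0.0)
first_cousins = genotype_proportions(p, 1 / 16)

ratio = heterozygosity(first_cousins) / heterozygosity(random_mating)
print(round(ratio, 4))  # 0.9375
```

The ratio 0.9375 is 15/16: heterozygosity falls by exactly the fraction $F = 1/16$, while the genotype proportions still sum to 1.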
Population Subgroups
The white population of the United States is a mixture of people of various origins, mostly European. The black and Hispanic populations also have multiple origins. Matings tend to occur between persons who are likely to share some common ancestry and thus to be somewhat related. Therefore, homozygotes are somewhat more common and heterozygotes less common than if mating were random.
The related problem of greatest concern in forensic applications is that profile frequencies are computed (under the assumption of HW proportions) from the population-average allele frequencies. If there is subdivision, that practice will always lead to an underestimate of homozygous genotype frequencies and usually to an overestimate of heterozygote frequencies.
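The direction of these errors can be illustrated with a small numerical sketch. It assumes two equally sized, hypothetical subpopulations, each in HW proportions, and compares the actual genotype proportions in the combined population with those computed from the population-average allele frequencies. All names and numbers are illustrative.

```python
def hw_proportion(freqs, i, j):
    """HW proportion of genotype (i, j) for a list of allele frequencies."""
    return freqs[i] ** 2 if i == j else 2 * freqs[i] * freqs[j]


def actual_vs_average(subpops, weights):
    """Map each genotype to (actual pooled proportion, proportion from average frequencies)."""
    n = len(subpops[0])
    average = [sum(w * s[k] for s, w in zip(subpops, weights)) for k in range(n)]
    result = {}
    for i in range(n):
        for j in range(i, n):
            actual = sum(w * hw_proportion(s, i, j) for s, w in zip(subpops, weights))
            result[(i, j)] = (actual, hw_proportion(average, i, j))
    return result


# Two hypothetical subpopulations with three alleles each.
subpops = [[0.2, 0.3, 0.5], [0.4, 0.3, 0.3]]
comparison = actual_vs_average(subpops, [0.5, 0.5])

for genotype, (actual, from_average) in comparison.items():
    print(genotype, round(actual, 3), round(from_average, 3))
```

In this example every homozygote is at least as frequent as the average-frequency calculation suggests (0.10 versus 0.09 for the first allele), and the heterozygote of the first and third alleles is less frequent (0.22 versus 0.24). Genotypes involving the second allele, whose frequency is the same in both subpopulations, show no discrepancy.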
To understand that, consider a population divided into subpopulations, each in HW proportions. Let $p_i$ denote the frequency of the allele $A_i$ in the entire population. If that entire population mated at random, the frequencies of the genotypes $A_iA_i$ and $A_iA_j$ ($i \neq j$) would be $p_i^2$ and $2p_ip_j$, respectively. The relationship between those hypothetical genotype frequencies and the actual frequencies of homozygotes, $P_{ii}$, and heterozygotes, $P_{ij}$, in the entire population is given by
6 Wright's algorithm is given in standard textbooks (Hartl and Clark 1989, p 238ff; see also Wright 1951). One definition of the inbreeding coefficient is the probability that the two homologous genes in a person are descended from the same gene in a common ancestor. The kinship coefficient of two persons is the corresponding probability of identity by descent of two genes, randomly chosen, one from each person. From those definitions, Wright's algorithm can readily be derived. The algorithm is easily modified for genes on the X-chromosome, but since they constitute such a small fraction of the genome, this is an unnecessary refinement for our purposes.

Page 114
be known that the DNA came from a white person, in which case the white database is appropriate. If the race is not known or if the population is of racially mixed ancestry, the calculations can be made with each of the appropriate databases and these presented to the court. Alternatively, if a single number is preferred, one might present the calculations for the major racial group that gives the largest probability of a match. Similar procedures can be used for persons of mixed ancestry.
If it is known that the contributor of the evidence DNA and the suspect are from the same subpopulation and there are data for that subpopulation, this is clearly the set of frequencies to use to obtain the most accurate estimate of the genotype frequency in the set of possible perpetrators of the crime. Of course, the database should be large enough to be statistically reliable (at least several hundred persons), and rare alleles should be rebinned (see Chapter 5) so that no allele is represented fewer than five times in the database. The product rule is appropriate, in that departures from random mating within a subgroup are not likely to be important (and, as mentioned above, this is supported empirically). The use of the 2p rule makes the product rule conservative.
Some have argued that even if there is no direct evidence, it should be assumed for calculation purposes that the person contributing the evidence and the suspect are from the same subgroup (Balding and Nichols 1994). Even though it is not known to which subpopulation both persons belong, Balding and Nichols assume that the two are likely to be more similar than if they were chosen randomly from the population at large. In our view, that is unnecessarily conservative, and we prefer to make this assumption only when there is good reason to think it appropriate—for example, if the suspect and all the possible perpetrators are from the same small, isolated town. Most of the time, we believe, the subgroup of the suspect is irrelevant.
To continue with the assumption that the person contributing the evidence and the suspect are from the same subgroup, an appropriate procedure is to write the conditional probability of the suspect genotype, given that of the perpetrator. As before, we measure the degree of population subdivision by , although a single parameter is not sufficient to describe the situation exactly. A number of formulae have been proposed to deal with this (Morton 1992; Crow and Denniston 1993; Balding and Nichols 1994, 1995; Roeder 1994; Weir 1994). They depend on different assumptions and methods of derivation but agree very closely for realistic values of and p.15 The simplest of the more accurate formulae is due to Balding and Nichols (1994, 1995):
(4.10 a)
15 Deriving a formula for these conditional probabilities requires some assumption about the population structure. Some models that have been used are a pure random-drift model, a mutation-drift, infinite-allele model, or a mathematically identical migration-drift infinite-allele model; or
(footnote continued on next page)

OCR for page 89

Page 115
( 4.10b)
Nothing in population-genetics theory tells us that should be independent of genotype. In fact, there is likely to be a different for each pair of alleles Ai and Aj. Since individual genotypes are usually rare, these values are inaccurately measured and ordinarily unknown. The best procedure is to use a conservative value of in Equations 4.10, knowing that the true individual values are likely to be smaller. Balding and Nichols (1994) extend Equations 4.10 to account for undetected bands. They also give an upper limit for homozygotes, analogous to the 2p rule. Their upper bound on the conditional probability is . We believe, however, that because Equation 4.10a is already conservative, this rule is usually unnecessary.
The value of has been estimated for several populations. As mentioned above, typical values for white and black populations are less than 0.01, usually about 0.002. Values for Hispanics are slightly higher, as expected because of the greater heterogeneity of this group, defined as it is mainly by linguistic criteria.
Table 4.9 gives numerical examples of calculations for three racial groups, using the data of Table 4.8. Two alternative assumptions are made: that the evidence profile is heterozygous (there are two clear bands) at all four loci, and that locus A has a single band at allele A6. In this example, the three racial groups are very similar; if all are heterozygous or if the 2p rule is used for homozygotes, they are within a factor of 3. That will not always be true. If one locus is single-banded, the 2p rule makes a substantial difference in the calculation. With four multiallelic loci, such as VNTRs, most four-locus profiles will be heterozygous at all loci. (For example, if the heterozygosity per locus is 0.93, as it is for D2S44, the probability that all four loci will be heterozygous is about 0.75.)
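The parenthetical figure follows from multiplying the per-locus heterozygosities, which is valid if the loci are independent. A minimal sketch, assuming for illustration that all four loci share the heterozygosity of 0.93 quoted for D2S44 (the function name is hypothetical):

```python
import math

def prob_all_heterozygous(heterozygosities):
    """Probability that every locus in a profile is heterozygous,
    assuming the loci are statistically independent."""
    return math.prod(heterozygosities)

# Assumption for illustration: four loci, each with heterozygosity 0.93.
p_all_het = prob_all_heterozygous([0.93, 0.93, 0.93, 0.93])
print(f"{p_all_het:.3f}")
```

The result rounds to the "about 0.75" given in the text; loci with lower heterozygosity would reduce it.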
If all loci are heterozygous, then assuming that the evidence DNA and the DNA from the suspect came from the same subpopulation, using Equations 4.10 has a fairly small effect on the calculations when θ = 0.01. However, using a value of θ = 0.03 decreases the likelihood ratio (increases the match probability; see Chapter 5) by a factor of 10. If the A locus is homozygous, then Equation 4.1a with the 2p rule is more conservative than Equation 4.10a with θ = 0.01 and very close to Equation 4.10a with θ = 0.03.
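The sensitivity to θ can be explored numerically. The sketch below assumes the usual published form of the Balding and Nichols conditional probabilities for Equations 4.10; the allele frequencies are hypothetical and are not the Table 4.8 data, so the output will not reproduce Table 4.9.

```python
def bn_homozygote(p_i, theta):
    """Equation 4.10a (assumed form): P(AiAi | AiAi)."""
    num = (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i)
    return num / ((1 + theta) * (1 + 2 * theta))

def bn_heterozygote(p_i, p_j, theta):
    """Equation 4.10b (assumed form): P(AiAj | AiAj)."""
    num = 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j)
    return num / ((1 + theta) * (1 + 2 * theta))

def likelihood_ratio(profile, theta):
    """Reciprocal of the match probability for an all-heterozygous
    profile, multiplying across loci (product rule)."""
    match_prob = 1.0
    for p_i, p_j in profile:
        match_prob *= bn_heterozygote(p_i, p_j, theta)
    return 1.0 / match_prob

# Hypothetical allele (bin) frequencies at four heterozygous loci.
profile = [(0.05, 0.08), (0.04, 0.10), (0.06, 0.07), (0.03, 0.09)]

for theta in (0.0, 0.01, 0.03):
    print(theta, f"{likelihood_ratio(profile, theta):.3g}")
```

With θ = 0 the two functions reduce to the Hardy-Weinberg values pi² and 2pipj, which gives a quick sanity check on the implementation.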
(footnote continued from previous page)
various statistical assumptions concerning the distribution of allele frequencies among the subpopulations. A more appropriate model would be a stepwise-mutation theory because VNTR lengths tend to change by small steps, but that has not been worked out. Even that would not be completely satisfactory unless one also takes migration, which may be more important than mutation, into account. When θ is small (< 0.02), the formulae derived from different models agree closely. Although the specific models are highly idealized, when different assumptions lead to similar results, it increases our confidence in the final formulae. The formulae given are from Balding and Nichols (1994), and were chosen because they are both simple to evaluate and accurate.

Page 116
TABLE 4.9 Likelihood Ratio (Reciprocal of Match Probability) for Four-Locus Profiles in Three Populations Calculated by Various Formulae^a

Formula and profile                 White          Black          Hispanic
Equations 4.1
  All loci heterozygous             3.79 × 10⁸     3.52 × 10⁸     6.56 × 10⁸
  A-locus homozygous                1.80 × 10⁹     3.60 × 10⁸     2.25 × 10⁸
  A-locus single band, 2p rule      3.14 × 10⁷     1.66 × 10⁷     1.18 × 10⁷
Equations 4.10, θ = 0.01
  All loci heterozygous             1.20 × 10⁸     1.16 × 10⁸     1.74 × 10⁸
  A-locus homozygous                2.80 × 10⁸     9.87 × 10⁷     6.63 × 10⁷
Equations 4.10, θ = 0.03
  All loci heterozygous             2.04 × 10⁷     2.06 × 10⁷     2.53 × 10⁷
  A-locus homozygous                2.48 × 10⁷     1.39 × 10⁷     1.02 × 10⁷

                                    All Races
Interim Ceiling Principle
  All loci heterozygous             2.68 × 10⁶
  A-locus single band, 2p rule      2.68 × 10⁵

^a The data are from Table 4.8. The evidence profile is either (1) all loci heterozygous, A6A11 B8B14 C10C13 D9D16, or (2) A-locus single-banded, A6 -. All calculations use the product rule.
For urban populations, 0.01 is a conservative value. A higher value—say, 0.03—could be used for isolated villages.16
The table also gives calculations based on the interim ceiling principle (using 1.645 instead of the value 1.96 cited in NRC 1992—see Chapter 5). As will be explained in Chapter 5, we believe that the ceiling principles are unnecessary. We give the calculation for illustration only.
PCR-Based Systems
As described in Chapter 2, other systems are coming into greater use. Most of them are based on PCR, require much smaller amounts of DNA, and have the additional advantage that the exact allele can usually be determined, so the
16 Empirical estimates of θ, essentially the same as FST and GST, are found throughout the population-genetics literature. An extensive compilation is given by Cavalli-Sforza et al. (1994). The values in the compilation are sometimes considerably higher than the values that we use. There are two reasons: The Cavalli-Sforza comparisons are often between major groups, and many of the comparisons are for blood groups and similar polymorphisms, which have much lower mutation rates than VNTRs and are often subject to selection. Selection can differ in different populations; for example, selection for malaria-resistance genes is strong in hot, wet areas. We regard the empirical estimates of θ from VNTRs, made either from comparison of homozygote and heterozygote frequencies (when the interpretation of single bands is not a substantial problem) or directly by comparisons among groups, as being a much better guide for forensic calculations.

Page 117
complications of matching and binning are eliminated. That is true for mitochondrial DNA, DQA, and other markers such as STRs.
The newer systems have not had the large amount of population study that VNTRs have had. The databases are smaller, but the studies that have been done show the same agreement with HW and LE that VNTRs do (Herrin et al. 1994; Budowle, Baechtel, et al. 1995; Budowle, Lindsay, et al. 1995). STRs and some of the other loci share the property of VNTRs of not producing a protein product or having any known selectable function. Their chromosomal positions are known, and they can be chosen so that no two are linked. It should be relatively easy to get more population data, because it is not necessary to find the people; DNA samples for large populations already exist.
The previously mentioned advantages of STRs and other new methods (exact genotype determination, fast turnaround, lower cost, and small DNA-sample requirements) are such that the use of these methods will continue to increase. We also expect that population data will continue to accumulate and that tests, particularly of HW and LE, will continue to be carried out; and thus, the new methods will soon be on the same solid footing as VNTRs. Meanwhile, the similarity of some of these loci to VNTR loci and the results of studies already done offer evidence that the methods given here will provide the degree of accuracy required for forensic use.
A locus that is being increasingly used is D1S80. It is also a length variant, but unlike VNTRs, the size of the DNA fragment is small enough to permit PCR analysis. The locus consists of 16-base units, each of which is repeated from 14 to 41 times. It has been validated, both for robustness to environmental insults and for agreement with HW proportions (Sajantila et al. 1992; Budowle, Baechtel, et al. 1995; Cosso and Reynolds 1995).
STR loci appear to be particularly appropriate for forensic use. Like VNTRs, they can be chosen to be in noncoding regions and therefore can be expected to be selectively neutral. Also, they have many alleles, and there are potentially a very large number of loci. Unlike VNTRs, they can be amplified with PCR, and the individual alleles are identifiable.
Table 4.10 compares VNTR loci with two PCR-based systems, STR and Polymarker.17 The total gene diversity is the proportion of heterozygotes that would exist if the entire population were in random-mating proportions. In the table, the gene diversity within subpopulations is given as a fraction of this total
17 The six STR loci represent seven populations from three races, grouped as follows (subgroups within races are in parentheses): east Asians (Chinese, Japanese, Houston Asians), whites (German, Houston), and blacks (Nigeria, Houston). The Polymarker data come from 12 populations from five races: Eskimos (Barrow, Bethel), whites (two US samples, Swiss), blacks (two US samples), Hispanics (three US samples), and east Asians (Chinese, Japanese). Polymarker designations are: DQA (part of the HLA region); LDLR (low density lipoprotein receptor); GYPA (glycophorin A, the MN blood group); HBGG (hemoglobin G gamma globin); D7S8 (a marker of unknown function on chromosome seven); and GC (group specific component).

Page 118
TABLE 4.10 Comparison of VNTR, STR, and Polymarker Systems^a

                                                 Gene Diversity
                                                            Proportion
Locus       No. of Alleles   Repeat Size   Total      (a)      (b)      (c)
VNTR loci   ≤ 31 bins
  D1S7                       9             0.9470     0.995    0.005    0.001
  D2S44                      31            0.9342     0.985    0.007    0.009
  D4S139                     32            0.9103     0.989    0.005    0.006
  D10S28                     33            0.9489     0.990    0.005    0.005
  D17S79                     38            0.8366     0.971    0.011    0.018
  Mean                                     0.9154     0.986    0.006    0.008
STR loci
  CSF1R     10               4             0.751      0.987    0.005    0.008
  THO1      8                4             0.781      0.905    0.011    0.084
  PLA2A     9                3             0.814      0.945    0.004    0.051
  F13A1     14               4             0.798      0.902    0.006    0.092
  CYP19     10               4             0.723      0.947    0.007    0.046
  LPL       7                4             0.656      0.956    0.006    0.038
  Mean                                     0.708      0.939    0.007    0.054
Polymarker loci plus DQA
  DQA1      6                              0.788      0.948    0.009    0.043
  LDLR      2                              0.483      0.914    0.004    0.082
  GYPA      2                              0.478      0.971    0.012    0.017
  HBGG      3                              0.539      0.876    0.003    0.121
  D7S8      2                              0.475      0.995    0.002    0.003
  GC        3                              0.654      0.909    0.003    0.088
  Mean                                     0.571      0.934    0.006    0.060

^a (a) Proportion of gene diversity accounted for by between-individual variability within subpopulations; (b) proportion within races between subpopulations; (c) proportion between races (Chakraborty, Jin, et al. 1995).
(a), as are the increments added by subpopulation differences (b) and racial differences (c). As these figures emphasize, for VNTRs, almost all the variability is between individuals within subgroups. Although these proportions, based on limited data sets, suggest that (b) and (c) are approximately the same, in general the divergence between races is larger than that between subgroups within a race (Latter 1980; Chakraborty and Kidd 1991; Devlin and Risch 1992; Devlin, Risch and Roeder 1993, 1994).
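To make the three proportions concrete, the following sketch partitions gene diversity for one locus in a two-level hierarchy (subpopulations within races). It assumes a simple Nei-style decomposition with equal weights for subpopulations and races; this is an illustrative assumption, not necessarily the estimator used by Chakraborty, Jin, et al. (1995), and the allele frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity under random mating: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def mean_freqs(freq_vectors):
    """Unweighted average of several allele-frequency vectors."""
    n = len(freq_vectors)
    return [sum(col) / n for col in zip(*freq_vectors)]

def partition_diversity(races):
    """races maps a race label to a list of subpopulation frequency vectors.
    Returns (total diversity, a, b, c), where a, b, c sum to 1."""
    race_means = [mean_freqs(subs) for subs in races.values()]
    h_total = gene_diversity(mean_freqs(race_means))
    # Mean diversity within subpopulations, then within races.
    h_sub = sum(sum(gene_diversity(s) for s in subs) / len(subs)
                for subs in races.values()) / len(races)
    h_race = sum(gene_diversity(m) for m in race_means) / len(races)
    a = h_sub / h_total                # within subpopulations
    b = (h_race - h_sub) / h_total     # between subpopulations within races
    c = (h_total - h_race) / h_total   # between races
    return h_total, a, b, c

# Hypothetical two-allele locus, two races with two subpopulations each.
races = {"X": [[0.5, 0.5], [0.6, 0.4]], "Y": [[0.2, 0.8], [0.3, 0.7]]}
h_total, a, b, c = partition_diversity(races)
print(round(h_total, 3), round(a, 3), round(b, 3), round(c, 3))
```

In this toy data set the races differ more than the subpopulations within them, so (c) exceeds (b), which is the general pattern the text describes.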
The population genetics of the Polymarker loci make these loci less advantageous than VNTRs, for three reasons. First, the number of alleles is small, and that is reflected in the lower gene diversity; several more loci are required than for VNTRs. Second, the variability between races is greater. That is particularly

Page 119
true for the loci LDLR, HBGG, and GC, which are all associated with functional genes (Chapter 2). Third, Polymarker loci have lower mutation rates and are less likely to be selectively neutral than VNTRs and STRs. These factors might cause the differences between groups.
STRs are intermediate in diversity between VNTRs and Polymarkers, as expected given that they have an intermediate number of alleles. The allocation of gene diversity to individual versus group and subgroup differences is also intermediate. Additional data from different STRs in different racial populations are in substantial agreement with the findings presented in the table (e.g., Bever and Creacy 1995; Meyer et al. 1995). An extensive study of blacks, whites, and Hispanics in Houston involving 12 STR loci found a mean heterozygosity (diversity) of about 75%, and 97.6% of the genetic diversity was within racial groups (Edwards et al. 1992; Hammond et al. 1994), in good agreement with the data in Table 4.10.
Compared with VNTRs, STRs have less exclusion power per locus, and Polymarker loci have less than STRs. The power of exclusion depends strongly on the heterozygosity (see footnote 4). Assuming HW proportions and LE and using the data in Table 4.10, the probability that two randomly selected individuals would have the same profile is about 10⁻¹⁰ for the five VNTR loci, about 10⁻⁶ for the six STR loci (using the 12 STRs mentioned in the paragraph above would lower the probability to about 10⁻¹²), and about 10⁻⁴ for the six Polymarker loci.
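The probability that two randomly selected individuals share a profile can be sketched as follows. Under HW proportions, the per-locus probability is the sum of squared genotype frequencies; under LE, the per-locus values are multiplied. The allele frequencies below are hypothetical (equally frequent alleles), so the output only illustrates how more alleles per locus lower the probability.

```python
from itertools import combinations

def locus_match_probability(freqs):
    """P(two random individuals share a genotype) at one locus,
    assuming Hardy-Weinberg proportions."""
    homozygotes = sum(p ** 4 for p in freqs)                   # (p_i^2)^2
    heterozygotes = sum((2 * p * q) ** 2
                        for p, q in combinations(freqs, 2))    # (2 p_i p_j)^2
    return homozygotes + heterozygotes

def profile_match_probability(loci):
    """Multiply per-locus values, assuming linkage equilibrium."""
    result = 1.0
    for freqs in loci:
        result *= locus_match_probability(freqs)
    return result

# Hypothetical loci: 2 alleles (Polymarker-like) versus 10 alleles (STR-like).
two_allele = [0.5, 0.5]
ten_allele = [0.1] * 10
print(profile_match_probability([two_allele] * 6))
print(profile_match_probability([ten_allele] * 6))
```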
Whereas the total database for VNTRs now numbers in the tens of thousands, the number for the newer systems is still in the hundreds, but the numbers are increasing rapidly, and the studies are being extended to different populations.
It is quite proper to combine different systems (e.g., VNTRs and STRs) in the product rule, provided, of course, that the loci are close to LE.
What do we conclude about PCR-based systems? We believe that they are ready to be used along with VNTRs. Newer data (Chakraborty et al. 1995; Gill and Evett 1995; Promega 1995; Evett, Gill et al. 1996) show low values of θ, comparable to those for VNTRs. Within the limitations of the data, there is good agreement with HW and LE. Graphs such as those in Figures 5.3 and 5.4 show about the same degree of uncertainty as VNTRs. Most STRs are at neutral loci. PCR-based systems have fewer alleles and hence higher allele frequencies than VNTRs. This means that the value of θ has less influence (see Equation 4.4a). Yet, mutation rates for PCR loci are generally lower than those for VNTRs, and this might lead one to expect greater values of θ.
We conclude that PCR-based systems should be used. A value of 0.01 for θ would be appropriate. However, in view of the greater uncertainty of PCR-based markers because of less extensive data than for VNTRs, a more conservative value of 0.03 may be chosen.
A Conservative Rule for PCR Loci
For VNTRs, we used the 2p rule and showed that it was conservative for populations in which the values of θ are positive. The rule was originally

Page 120
introduced to adjust for uncertainty as to whether a single band is a homozygote or heterozygote. That problem does not arise with loci at which there is no ambiguity about allele identification. Is there a conservative adjustment for subdivided populations for such loci that corresponds to the 2p rule? It is simple to choose one:18 Assign to each homozygote a frequency pi (rather than pi²). This, however, is unnecessarily conservative.19
A more accurate but still conservative procedure, and one that we recommend, is to use Equation 4.4a with a conservative value of θ. Since observed values of θ are usually less than 0.01, this value would be appropriate. (In view of the greater uncertainty of PCR calculations because of less extensive population data than for VNTRs, a more conservative value of 0.03 might be chosen.) For small, isolated populations, a value of 0.03 is appropriate. This value is intermediate between those that would be found in populations of first- and of second-cousin matings and is a reasonable upper limit for what might be expected.
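The three ways of assigning a homozygote frequency can be compared directly. This sketch takes Equation 4.4a to have the form Pii = pi² + pi(1 − pi)θ with a single conservative θ, which is an assumption about an equation defined earlier in the chapter. The values p = 0.03 and θ = 0.03 are chosen because they reproduce the figures 0.0009 and 0.0018 quoted in footnote 19.

```python
def homozygote_frequency(p, theta):
    """Equation 4.4a (assumed form): p^2 + p(1 - p) * theta."""
    return p * p + p * (1 - p) * theta

p, theta = 0.03, 0.03
hw = p * p                                   # Hardy-Weinberg estimate
adjusted = homozygote_frequency(p, theta)    # recommended procedure
p_rule = p                                   # the "p rule"

print(f"HW: {hw:.4f}  Eq. 4.4a: {adjusted:.4f}  p rule: {p_rule:.4f}")
print(f"Eq. 4.4a / HW: {adjusted / hw:.2f}   p rule / Eq. 4.4a: {p_rule / adjusted:.1f}")
```

The ratios show why the p rule is regarded as needlessly conservative: the adjustment from Equation 4.4a roughly doubles the HW estimate, whereas the p rule inflates it by a further factor of about 17.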
The 2p rule for VNTRs was introduced because single bands may actually come from heterozygotes. If the techniques are or become good enough that this ambiguity does not exist, then VNTRs should be treated like the PCR-based systems, and the procedure of the previous paragraph should be applied. Conversely, even in PCR-based systems, it may be desirable to use the 2p rule if there is uncertainty caused by null alleles. In a well-characterized system, the frequency of null alleles can often be estimated, and a more accurate correction can then be applied.
Development of New Systems
PCR-based systems have several advantages, the most important being that they can be used when source material is sparse or degraded and a second being that there need not be uncertainties of measurement. But there are also
18 Here are two proofs in the style and notation of Footnote 9. First, we have
Second, note that
as above.
19 The error involved in assuming HW ratios and ignoring subpopulations makes little difference for heterozygotes. From Equation 4.4b, we see that the frequency is overestimated by a factor 1/(1 − θ), or approximately 1 + θ when θ is small. Furthermore, the error is in the desired direction of conservatism. In contrast, from Equation 4.4a it is seen that the error for homozygotes can be considerable, and in the wrong direction. For example, if pi = 0.03 and θ = 0.03, assuming HW gives an estimate of 0.0009, whereas Equation 4.4a gives 0.0018, a two-fold error. But note that this "p rule" is excessively conservative in assigning a value of 0.03 instead of 0.0018, a 17-fold difference; that is too conservative, we believe.

Page 121
disadvantages. VNTRs have many alleles, none of which is at a high frequency. Presumably, the high mutation rate accounts for that and for the small differences in frequencies among subgroups.20 The VNTRs used for forensics also occur at loci that have no function and therefore are probably not affected by natural selection. Some of the loci used in PCR-based systems have only a small number of alleles, and the loci are at functional genes, which means that there is less assurance of HW and LE. Many more loci are required to produce the same probability levels than are required for VNTRs.
Yet, the statistical uncertainties with VNTRs (discussed in more detail in Chapter 5) make it desirable to bring new loci into the system. The extensive activity in mapping human genes is leading to the rapid discovery of many more possible markers, some of which are expected to have the kinds of properties that are desirable for forensic use: high mutation rate, multiple alleles, lack of function (which increases the probability of neutrality), speed of analysis, low cost, and unambiguous identification of alleles. We encourage the development and validation of such systems.
Inadequate Databases
There are situations in which the database is inadequate. The population of possible suspects might be so structured that no reasonable average allele frequency can be determined, or there might be no basis for estimating θ. Such a situation may be found among some American Indian tribes, Inuits, or isolated immigrant groups. As databases become more extensive and varied, such gaps should be filled.
If an inadequate database is encountered, one procedure is to use allele frequencies from other groups. These should be groups for which the databases are large enough to be reliable, and they should be as closely related to the group in question as possible. We emphasize that they be closely related to discourage the use of a population, possibly unrelated, solely because it has a set of frequencies favorable to the position being argued. For the same reason, we believe that the number of groups examined should be limited. The calculations based on each of the groups, or some sort of average—or if the desire is for the most conservative estimate, the one that is most favorable to the defendant—can be presented to the court.
20 VNTR systems have a high mutation rate, and mutations usually consist of small changes in the length of the VNTR segment. These two factors are largely responsible for the large number of alleles, none of which is very common, in VNTR systems. The resulting high diversity between individuals and small diversity between groups make VNTRs particularly useful as forensic evidence. Although the mutation rates for STRs are not as high as those for VNTRs, the rates are still much higher for STRs than for classical loci. A high mutation rate is desirable for forensic identification (although not for paternity testing).

Page 122
Conclusions and Recommendations
Sufficient data now exist for various groups and subgroups within the United States that analysts should present the best estimates for profile frequencies. For VNTRs, using the 2p rule for single bands and HW for double bands is generally conservative for an individual locus. For multiple loci, departures from LE are not great enough to cause errors comparable to those from uncertainty of allele frequencies estimated from databases.
With appropriate consideration of the data, the principles in this report can be applied to PCR-based systems. For those in which exact genotypes can be determined, the 2p rule should not be used. A conservative estimate is given by using the HW relation for heterozygotes and a conservative value of θ in place of θii in Equation 4.4a for homozygotes.
Recommendation 4.1: In general, the calculation of a profile frequency should be made with the product rule. If the race of the person who left the evidence-sample DNA is known, the database for the person's race should be used; if the race is not known, calculations for all racial groups to which possible suspects belong should be made. For systems such as VNTRs, in which a heterozygous locus can be mistaken for a homozygous one, if an upper bound on the genotypic frequency at an apparently homozygous locus (single band) is desired, then twice the allele (bin) frequency, 2p, should be used instead of p². For systems in which exact genotypes can be determined, p² + p(1 − p)θ should be used for the frequency at such a locus instead of p². A conservative value of θ for the US population is 0.01; for some small, isolated populations, a value of 0.03 may be more appropriate. For both kinds of systems, 2pipj should be used for heterozygotes.
A more conservative value of θ = 0.03 might be chosen for PCR-based systems in view of the greater uncertainty of calculations for such systems because of less extensive and less varied population data than for VNTRs.
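The per-locus rules in Recommendation 4.1 can be combined in a short routine. This is a minimal sketch under stated assumptions: the homozygote adjustment is taken to be p² + p(1 − p)θ, and the data layout, function names, locus names, and allele frequencies are all hypothetical.

```python
def genotype_frequency(alleles, freqs, theta=0.01, exact_genotypes=True):
    """Frequency of the genotype observed at one locus.

    alleles: tuple with the one or two alleles (bands) observed.
    freqs: mapping from allele name to database frequency.
    exact_genotypes: False for systems such as VNTRs, where a single
        band may hide a second allele, so the 2p rule is applied.
    """
    if len(set(alleles)) == 2:
        a, b = alleles
        return 2 * freqs[a] * freqs[b]          # heterozygote: 2 p_i p_j
    p = freqs[alleles[0]]
    if exact_genotypes:
        return p * p + p * (1 - p) * theta      # assumed form of Eq. 4.4a
    return 2 * p                                # 2p rule for a single band

def profile_frequency(profile, database, theta=0.01):
    """Product rule across loci."""
    freq = 1.0
    for locus, alleles, exact in profile:
        freq *= genotype_frequency(alleles, database[locus], theta, exact)
    return freq

# Hypothetical database and evidence profile.
database = {"L1": {"A6": 0.10, "A11": 0.05}, "L2": {"B8": 0.20}}
profile = [("L1", ("A6", "A11"), False),   # VNTR-like, two bands
           ("L2", ("B8",), True)]          # PCR-like, exact homozygote
print(profile_frequency(profile, database, theta=0.01))
```

If the race of the contributor is unknown, the same routine would be run once per relevant database and each result reported.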
Evidence DNA and Suspect from the Same Subgroup
Sometimes there is evidence that the suspect and other possible sources of the sample belong to the same subgroup. That can happen, e.g., if they are all members of an isolated village. In this case, a modification of the procedure is desirable.
Recommendation 4.2: If the particular subpopulation from which the evidence sample came is known, the allele frequencies for the specific subgroup should be used as described in Recommendation 4.1. If allele frequencies for the subgroup are not available, although data for the full population are, then the calculations should use the population-structure Equations 4.10 for each locus, and the resulting values should then be multiplied.

Page 123
Insufficient Data
For some groups (several American Indian and Inuit tribes are in this category), there are insufficient data to estimate frequencies reliably, and even the overall average might be unreliable. In this case, data from other, related groups provide the best information. The groups chosen should be the most closely related for which adequate databases exist. These might be chosen because of geographical proximity, or a physical anthropologist might be consulted. There should be a limit on the number of such subgroups analyzed to prevent inclusion of more remote groups less relevant to the case.
Recommendation 4.3: If the person who contributed the evidence sample is from a group or tribe for which no adequate database exists, data from several other groups or tribes thought to be closely related to it should be used. The profile frequency should be calculated as described in Recommendation 4.1 for each group or tribe.
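In practice, this recommendation amounts to repeating the same calculation over a small, fixed set of related databases. The sketch below uses hypothetical group names and allele frequencies and, for brevity, handles only heterozygous loci with the product rule; the most conservative figure is the largest profile frequency, since that is the one most favorable to the defendant.

```python
def heterozygous_profile_frequency(profile, freqs):
    """Product rule for a profile that is heterozygous at every locus."""
    result = 1.0
    for locus, (a, b) in profile.items():
        result *= 2 * freqs[locus][a] * freqs[locus][b]
    return result

# Hypothetical evidence profile and databases for three related groups.
profile = {"L1": ("A1", "A2"), "L2": ("B1", "B2")}
related_groups = {
    "group_1": {"L1": {"A1": 0.10, "A2": 0.05}, "L2": {"B1": 0.20, "B2": 0.10}},
    "group_2": {"L1": {"A1": 0.12, "A2": 0.04}, "L2": {"B1": 0.15, "B2": 0.10}},
    "group_3": {"L1": {"A1": 0.08, "A2": 0.08}, "L2": {"B1": 0.25, "B2": 0.12}},
}

results = {name: heterozygous_profile_frequency(profile, freqs)
           for name, freqs in related_groups.items()}
most_conservative = max(results, key=results.get)

for name, value in results.items():
    print(name, f"{value:.2e}")
print("most conservative:", most_conservative)
```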
Dealing with Relatives
In some instances, there is evidence that one or more relatives of the suspect are possible perpetrators.
Recommendation 4.4: If the possible contributors of the evidence sample include relatives of the suspect, DNA profiles of those relatives should be obtained. If these profiles cannot be obtained, the probability of finding the evidentiary profile in those relatives should be calculated with Formulae 4.8 or 4.9.
Appendix 4A
Here, we derive the relation (Equation 4.5) between the average of the parameters θij and Wright's (1951) fixation index, FIT (Nei 1977, 1987, p. 159-164; Chakraborty and Danker-Hopfe 1991; Chakraborty 1993). We begin with an arbitrary mating pattern; in particular, we do not assume that random mating occurs within subpopulations, or even that distinct subpopulations occur. Later, we posit distinct subpopulations and random mating in each of them.
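The homozygosity, f0, and heterozygosity, h0, in the substructured population are
$$f_0 = \sum_i P_{ii}, \qquad h_0 = \sum_{i<j} P_{ij} = 1 - f_0,$$
where Pij is the frequency of genotype AiAj. If the entire population mated at random, these quantities would become
$$f_T = \sum_i p_i^2, \qquad h_T = 1 - f_T,$$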

Page 124
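where the allele frequencies pj satisfy
$$\sum_j p_j = 1.$$
We can rewrite hT as
$$h_T = \sum_i p_i(1 - p_i) = 2\sum_{i<j} p_i p_j.$$
First, we express the homozygote parameters θii in terms of the heterozygote parameters θij. Substituting Equations 4.4 into the equation
$$p_i = P_{ii} + \tfrac{1}{2}\sum_{j \neq i} P_{ij}$$
and noting that pi ≠ 0 leads to
$$(1 - p_i)\,\theta_{ii} = \sum_{j \neq i} p_j\,\theta_{ij}.$$
Multiplying that by pi and summing over i enables us to define the mean
$$\bar{\theta} = \frac{1}{h_T}\sum_i p_i(1 - p_i)\,\theta_{ii} = \frac{1}{h_T}\sum_i \sum_{j \neq i} p_i p_j\,\theta_{ij}.$$
Thus, the weighted means of the homozygote and heterozygote parameters are equal.
We insert Equation 4.4b to deduce that
$$h_0 = \sum_{i<j} 2 p_i p_j (1 - \theta_{ij}) = h_T(1 - \bar{\theta}), \qquad\text{so that}\qquad \bar{\theta} = \frac{h_T - h_0}{h_T} = F_{IT}.$$
If the subpopulations are distinct and mating is random in each subpopulation, then FIT = FST, and hence θ̄ = FST.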
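The result of this appendix can be checked numerically. The sketch below pools two hypothetical subpopulations that each mate at random, solves for the θ parameters, and compares their weighted means with the fixation index. It assumes that Equations 4.4 have the forms Pii = pi² + pi(1 − pi)θii and Pij = 2pipj(1 − θij); the allele frequencies and equal subpopulation weights are illustrative.

```python
from itertools import combinations

# Hypothetical example: two equally sized subpopulations, three alleles,
# random mating (HW proportions) within each subpopulation.
subpops = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
weights = [0.5, 0.5]
n = len(subpops[0])
pairs = list(combinations(range(n), 2))

# Pooled allele and genotype frequencies.
p = [sum(w * f[i] for w, f in zip(weights, subpops)) for i in range(n)]
P_hom = [sum(w * f[i] ** 2 for w, f in zip(weights, subpops)) for i in range(n)]
P_het = {(i, j): sum(w * 2 * f[i] * f[j] for w, f in zip(weights, subpops))
         for i, j in pairs}

h0 = sum(P_het.values())             # observed heterozygosity
hT = 1 - sum(x * x for x in p)       # heterozygosity under random mating
F_IT = (hT - h0) / hT                # equals F_ST in this setting

# Solve the assumed forms of Equations 4.4a and 4.4b for the theta values.
theta_hom = [(P_hom[i] - p[i] ** 2) / (p[i] * (1 - p[i])) for i in range(n)]
theta_het = {(i, j): 1 - P_het[(i, j)] / (2 * p[i] * p[j]) for i, j in pairs}

# Weighted means of the homozygote and heterozygote parameters.
mean_theta_hom = sum(p[i] * (1 - p[i]) * theta_hom[i] for i in range(n)) / hT
mean_theta_het = sum(2 * p[i] * p[j] * theta_het[(i, j)] for i, j in pairs) / hT

print(F_IT, mean_theta_hom, mean_theta_het)
```

All three printed values agree to within floating-point rounding, even though the individual θii and θij differ from one genotype to another.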