DNA Typing: Technical Considerations
"DNA typing" is a catch-all term for a wide range of methods for studying genetic variations. Each method has its own advantages and limitations, and each is at a different state of technical development. Each DNA typing method involves three steps:
Laboratory analysis of samples to determine their genetic-marker types at multiple sites of potential variation.
Comparison of the genetic-marker types of the samples to determine whether the types match and thus whether the samples could have come from the same source.
If the types match, statistical analysis of the population frequency of the types to determine the probability that such a match might have been observed by chance in a comparison of samples from different persons.
Before any particular DNA typing method is used for forensic purposes, it is essential that precise and scientifically reliable procedures be established for performing all three steps. This chapter discusses the first two—laboratory analysis and pattern comparison—and Chapter 3 focuses on statistical analysis.
There is no scientific dispute about the validity of the general principles underlying DNA typing: scientists agree that DNA varies substantially among humans, that variation can be detected in the laboratory, and that DNA comparison can provide a basis for distinguishing samples from different persons. However, a given DNA typing method might or might not be scientifically appropriate for forensic use. Before a method can be accepted as valid for forensic use, it must be rigorously characterized in both research and forensic settings to determine the circumstances under which it will and will not yield reliable results. It is meaningless to speak of the reliability of DNA typing in general—i.e., without specifying a particular method. Some states have adopted vaguely worded statutes regarding admissibility of DNA typing results without specifying the methods intended to be covered. Such laws obviously were intended to cover only conventional RFLP analysis of single-locus probes on Southern blots—the only method in common use at the time of passage of the legislation. We trust that courts will recognize the limitations inherent in such statutes.
Forensic DNA analysis should be governed by the highest standards of scientific rigor in analysis and interpretation. Such high standards are appropriate for two reasons: the probative power of DNA typing can be so great that it can outweigh all other evidence in a trial; and the procedures for DNA typing are complex, and judges and juries cannot properly weigh and evaluate conclusions based on differing standards of rigor.
The committee cannot provide comprehensive technical descriptions for DNA typing in this report: too many methods exist or are planned, and too many issues must be addressed in detail for each method. Instead, our main goal is to provide a general framework for the evaluation of any DNA typing method.
ESSENTIALS OF A FORENSIC DNA TYPING PROCEDURE
The forensic use of DNA typing is an outgrowth of its medical diagnostic use—analysis of disease-causing genes based on comparison of a patient's DNA with that of family members to study inheritance patterns of genes or with reference standards to detect mutations. To understand the challenges involved in such technology transfer, it is instructive to compare forensic DNA typing with DNA diagnostics.
DNA diagnostics usually involves clean tissue samples from known sources. It can usually be repeated to resolve ambiguities. It involves comparison of discrete alternatives (e.g., which of two alleles did a child inherit from a parent?) and thus includes built-in consistency checks against artifacts. It requires no knowledge of the distribution of patterns in the general population.
Forensic DNA typing often involves samples that are degraded, contaminated, or from multiple unknown sources. It sometimes cannot be repeated, because there is too little sample. It often involves matching of samples from a wide range of alternatives present in the population and thus lacks built-in consistency checks. Except in cases where the DNA evidence excludes a suspect, assessing the significance of a result requires statistical analysis of population frequencies.
Despite the challenges of forensic DNA typing, we believe that it is possible to develop reliable forensic DNA typing systems, provided that adequate scientific care is taken to define and characterize the methods. We outline below the principal issues that must be addressed for each DNA typing procedure.
Written Laboratory Protocol
An essential element of any clinical or forensic DNA typing method is a detailed written laboratory protocol. Such a protocol should not only specify steps and reagents, but also provide precise instructions for interpreting results, which is crucial for evaluating the reliability of a method. Moreover, the complete protocol should be made freely available so that it can be subjected to scientific scrutiny.
Procedure For Identifying Patterns
There must be an objective and quantitative procedure for identifying the pattern of a sample. Although the popular press sometimes likens DNA patterns to bar codes, laboratory results from most methods of DNA testing are not discrete data, but rather continuous data. Typically, such results consist of an image—such as an autoradiogram, a photograph, spots on a strip, or the fluorometric tracings of a DNA sequence—and the image must be quantitatively analyzed to determine the genotype or genotypes represented in the sample. Quantitation is especially important in forensic applications, because of the ever-present possibility of mixed samples.
Patterns must be identified separately and independently in suspect and evidence samples. It is not permissible to decide which features of an evidence sample to count and which to discount on the basis of a comparison with a suspect sample, because this can bias one's interpretation.
Procedure For Declaring a Match
When individual patterns of DNA in evidence sample and suspect sample have been identified, it is time to make comparisons to determine whether they match. Whether this step is easy or difficult depends on the resolving power of the system to distinguish alleles. Some DNA typing methods involve small collections of alleles that can be perfectly distinguished from one another—e.g., a two-allele RFLP system based on a polymorphism at a single locus. Other methods involve large collections of similar alleles that are imperfectly distinguished from one another—e.g., the hypervariable VNTR systems in common forensic use, in which a single sample might yield somewhat different allele sizes on repeat measurements.1 It is easy to determine whether two samples match in the former case (assuming that the patterns have been correctly identified), but the latter case requires a match criterion—i.e., an objective and quantitative rule for deciding whether two samples match. For example, a match criterion for VNTR systems might declare a match between two samples if the restriction-fragment sizes lie within 3% of one another.
The match criterion must be based on the actual variability in measurement observed in appropriate test experiments conducted in each testing laboratory. The criterion must be objective, precise, and uniformly applied. If two samples lie outside the matching rule, they must be declared to be either "inconclusive" or a "nonmatch." Considerable controversy arose in early cases over the use of subjective matching rules (e.g., comparison by eye) and the failure to adhere to a stated matching rule.
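As an illustration of what an objective and quantitative matching rule might look like, the following minimal Python sketch applies a percentage window to two independently identified patterns. The function names, the choice to measure the window against the mean of the two sizes, and the treatment of unequal band counts are assumptions made for the example; an actual criterion would have to be derived from each laboratory's own measurement variability.

```python
def within_window(size_a, size_b, window=0.03):
    """True if two fragment sizes differ by no more than `window`
    (a fraction) of their mean size. Using the mean as the reference
    is an assumption of this sketch."""
    mean_size = (size_a + size_b) / 2.0
    return abs(size_a - size_b) <= window * mean_size


def patterns_match(evidence, suspect, window=0.03):
    """Apply the window to every band of two patterns (lists of sizes in bp).

    Patterns with different numbers of bands are treated as falling
    outside the rule; whether that outcome is then reported as
    "inconclusive" or "nonmatch" is left to laboratory policy.
    """
    if len(evidence) != len(suspect):
        return False
    paired = zip(sorted(evidence), sorted(suspect))
    return all(within_window(a, b, window) for a, b in paired)
```

The same rule is applied to every band and every case, which is the property the text requires; only the numerical window would change from one laboratory to another.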
Identification of Potential Artifacts
All laboratory procedures are subject to potential artifacts, which can lead to incorrect interpretation if not recognized. Accordingly, each DNA typing method must be rigorously characterized with respect to the types of possible artifacts, the conditions under which they are likely to occur, the scientific controls for detecting their occurrence, and the steps to be taken when they occur, which can range from reinterpreting results to correcting for the presence of artifacts, repeating some portion of the experiment, or deciding that samples cannot be reliably used.
Regardless of the particular DNA typing method, artifacts can alter a pattern in three ways: Pattern A can be transformed into Pattern B; Pattern A can be transformed into Pattern A + B; and Pattern A + B can be transformed into Pattern B. It is important to identify the circumstances under which each transformation can occur, because only then can controls and corrections be devised. For example, RFLP analysis is subject to such artifacts as band shifting, in which DNA samples migrate at different speeds and yield shifted patterns (A→B), and incomplete digestion, in which the failure of a restriction enzyme to cleave at all restriction sites results in additional bands (A→A + B).
Some potential problems can be identified on the basis of the chemistry of DNA and the mechanism of detection in the genetic-typing system. Anticipation of potential sources of DNA typing error allows systematic empirical investigation to determine whether a problem exists in practice. If so, the range of conditions in which an assay is subject to artifact must be characterized. In either case, the results of testing for artifacts should be documented. Empirical testing is necessary, whether one is considering a new method, a new locus, a new set of reagents (probe or enzyme) for a pre-existing locus, or a new device. Under some circumstances, even small changes in procedure can change the pattern of artifacts.
Once potential artifacts have been identified, it is necessary to design scientific controls to serve as internal checks in each experiment to test whether the artifacts have occurred. Once the appropriate controls are identified, analysts must use them consistently when interpreting test results. If the appropriate control has not been performed, no result should be reported. When a control indicates irregularities in an experiment, the results in question must be considered inconclusive; if possible, the experiment should be repeated. A well-designed DNA typing test should be a matter of standardized, objective analysis.
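The reporting rule in the preceding paragraph can be written as a small gate. This is only a sketch: the function name and the boolean inputs are hypothetical, and deciding whether a control "indicates irregularities" is itself a laboratory judgment that the code does not capture.

```python
def reportable_result(control_performed, control_ok, raw_result):
    """Return what may be reported for one experiment.

    control_performed: whether the appropriate control was run.
    control_ok: whether the control showed no irregularities.
    raw_result: the analyst's interpretation of the test lanes.
    """
    if not control_performed:
        return None  # no result should be reported
    if not control_ok:
        return "inconclusive"  # repeat the experiment if possible
    return raw_result
```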
Sensitivity to Quantity, Mixture, and Contamination
Evidence samples might contain very little DNA, might contain a mixture of DNA from multiple sources, and might be contaminated with chemicals that can interfere with analysis. It is essential to understand the limits of each DNA typing method under such circumstances.
Before a new DNA typing method can be used, it requires not only a solid scientific foundation, but also a solid base of experience in forensic application. Traditionally, forensic scientists have applied five steps to the implementation of genetic marker systems:2,3
Gain familiarity with a system by using fresh samples.
Test marker survival in dried stains (e.g., bloodstains).
Test the system on simulated evidence samples that have been exposed to a variety of environmental conditions.
Establish basic competence in using the system through blind trials.
Test the system on nonprobative evidence samples whose origin is known, as a check on reliability.
When a technique is initially developed, all five steps should be carefully followed. As laboratories adopt the technique, it will not always be necessary for them to repeat all the steps, but they must demonstrate familiarity and competence by following steps 1, 4, and 5.4
Most important, there is no substitute for rigorous external proficiency testing via blind trials. Such proficiency testing constitutes scientific confirmation that a laboratory's implementation of a method is valid not only in theory, but also in practice. No laboratory should let its results with a new DNA typing method be used in court, unless it has undergone such proficiency testing via blind trials. (See Chapter 4 for discussion of proficiency testing.)
Publication and Scientific Scrutiny
If a new DNA typing method (or a substantial variation on an existing one) is to be used in court, publication and scientific scrutiny are very important. Extensive empirical characterization must be undertaken. Results must be published in appropriate scientific journals. Publication is the mechanism that initiates the process of scientific confirmation and eventual acceptance or rejection of a method.
Some of the controversy concerning the forensic use of DNA typing can be traced to the failure to publish a detailed explanation and justification of methods. Without the benefit of open scientific scrutiny, some testing laboratories initially used methods (for such fundamental steps as identifying patterns, declaring matches, making comparison with a databank, and correcting for band shifting) that they later agreed were not experimentally supported. In some cases, those errors resulted in exclusion of DNA evidence or dismissal of charges.
TECHNICAL ISSUES IN RFLP ANALYSIS
Choice of Probes
A DNA probe used in forensic applications should have the following properties:
It should recognize a single human locus (or site), preferably one whose chromosomal location has been determined.
It should detect a constant number of bands per allele in most humans.
It should be characterized in the published literature, including its typical range of alleles, and its tendency to recognize DNA from other species.
It should be readily available for scientific study by any interested person.
The committee recommends against forensic use of multilocus probes, which detect many fragments per person. Because such probes might detect fragments with quite different intensities, it is difficult to know whether one has detected all fragments in a sample—particularly with small and degraded forensic samples—and difficult to recognize artifacts and mixtures. Such problems increase the difficulty of pattern interpretation. Multilocus probes increase the risk of incorrect interpretation, and numerous single-locus probes, which do not pose such problems, are available. The use of enough single-locus probes gains the advantages of the single multilocus probes without the problems of interpretation.
Southern Blot Preparation
The basic protocol for preparing Southern blots is fairly standard, but testing laboratories vary in such matters as choice of restriction enzyme, gel length and composition, and electrophoresis conditions. Such differences do not fundamentally affect the reliability of the general method, but some enzymes might require characterization (e.g., each restriction enzyme must be characterized for sensitivity to inhibitors, for tendency to cut at anomalous recognition sites under some conditions—often called "star activity"—and for tendency to produce partial digestions), and differences in gels and electrophoresis conditions will affect resolution of fragments and retention of small fragments.
Questions have arisen concerning the use of ethidium bromide, a fluorescent dye that binds to DNA and so allows it to be visualized. Some laboratories incorporate ethidium bromide into analytical gels before electrophoresis; others stain gels with ethidium bromide after electrophoresis. The committee strongly recommends the latter, for two reasons:
Ethidium bromide binds to DNA in a concentration-dependent manner and has been shown to alter the mobility of fragments at high DNA concentrations, thus decreasing the reliability of fragment-size measurements.
Staining after electrophoresis requires smaller amounts of ethidium bromide, and that is preferable, because the dye is a known carcinogen and thus poses problems of exposure and disposal.
Because there are several advantages and no drawbacks to staining after electrophoresis, we conclude that there is no present justification for use of ethidium bromide in analytical gels.
Identification of DNA Patterns
Identification of the DNA pattern of each sample should be carried out very carefully. When analyzed with a single-locus probe, each lane will ideally show fragments derived from at most two alleles and nothing else. However, complications can arise. To interpret such complications properly, an examiner requires considerable knowledge and skill and might need to examine control experiments.
Examination of a Control Pattern
Every Southern blot procedure should be applied to a known DNA sample (in addition to the evidence samples in question), to verify that the hybridization was performed correctly. If this control sample does not yield a clean result that shows the correct pattern for a particular hybridization, the result of the test hybridization should be discounted.
Sometimes, only a single band will be detected when two distinct alleles are present. That might occur because the second allele is so small that it has migrated off the end of the gel, because the second allele is similar in size to the first allele and thus is not resolved, or because the second allele is much larger, and larger fragments are preferentially lost in partially degraded samples.
When only a single band is found, the interpretation should always include the possibility that a second band has been missed—i.e., that the pattern is actually of a heterozygote, not a homozygote. (For statistical interpretation, the frequency of a single-band pattern should be taken to be the sum of the frequencies of all patterns containing this band. This is approximately twice the allele frequency of the band.) In some cases, it could be important to interpret the absence of a second larger fragment—e.g., when two samples match in a smaller band, but the questioned sample lacks a second larger band. That could arise either because the samples are from different persons or because the samples come from the same person but the questioned sample is partially degraded. Ideally, to distinguish these alternatives, one should determine whether a second larger band could have been detected in the questioned sample by hybridizing the membrane with a single-copy probe that detects an even larger monomorphic fragment—i.e., one that is constant in all humans. In contrast, it would not be sufficient simply to estimate the degree of degradation from the ethidium bromide staining pattern of the sample.
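The "approximately twice the allele frequency" figure can be checked with a short calculation. The sketch below assumes Hardy-Weinberg proportions, which the paragraph does not state explicitly, and the function name is hypothetical. With allele frequency p, the patterns containing the band are the homozygote (p²) and all heterozygotes carrying it (2p(1 − p)), which sum to 2p − p²; for small p that is close to 2p.

```python
def single_band_frequency(p):
    """Sum of the frequencies of all genotypes that contain an allele of
    frequency p, assuming Hardy-Weinberg proportions (an assumption of
    this sketch)."""
    homozygote = p * p
    heterozygotes = 2.0 * p * (1.0 - p)
    return homozygote + heterozygotes
```

For an allele frequency of 0.05, for example, the sum is 0.0975, slightly below the approximation 2p = 0.10.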
A sample might show more than two bands for various reasons: the hybridization conditions were improper and caused the probe to hybridize to incorrect fragments; the probe was contaminated with another sequence, which caused it to recognize other fragments; the membrane was incompletely stripped after a previous use, so a pattern seen on the previous hybridization is still being detected; the restriction digestion did not proceed to completion, so the region recognized by the probe is present in incompletely cut fragments of multiple sizes; or the sample actually contains a mixture of multiple DNAs. The last possibility is extremely important to recognize, because it can bear directly on a case. Whenever extra bands are observed, their origin should be determined.
The following clues provide a partial decision tree:
If the hybridization conditions were improper or the probe contaminated, the pattern in the control DNA should be seen to be incorrect; the hybridization should be repeated.
If the membrane was improperly stripped, the extra bands will be in the same location as in the previous hybridization and might be present in the control sample; the hybridization should be repeated.
If the restriction digestion was incomplete, one should see additional bands, even with the use of monomorphic probes that typically give only a single constant band. To ascribe extra bands to incomplete digestion, one should therefore perform such a hybridization. If incomplete digestion has occurred, the sample ideally should be re-extracted and redigested, and a new Southern blot should be prepared. If that is not possible, because there is too little sample, it will usually be difficult to get a reliable result.
If the samples are mixtures from more than one person, one should see additional bands for all or most polymorphic probes, but not for a single-copy monomorphic probe. Mixed samples can be very difficult to interpret, because the components can be present in different quantities and states of degradation. It is important to examine the results of multiple RFLPs, as a consistency check. Typically, it will be impossible to distinguish the individual genotypes of each contributor. If a suspect's pattern is found within the mixed pattern, the appropriate frequency to assign such a "match" is the sum of the frequencies of all genotypes that are contained within (i.e., that are a subset of) the mixed pattern.
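The frequency rule for a mixed pattern at a single locus can be sketched by enumerating every genotype whose alleles all appear among the observed bands. The sketch assumes Hardy-Weinberg proportions and known allele frequencies for each band; both are assumptions of the example, and the function name is hypothetical.

```python
from itertools import combinations_with_replacement


def mixture_inclusion_frequency(band_frequencies):
    """Sum the frequencies of all genotypes contained within a mixed pattern.

    band_frequencies: population frequencies of the alleles seen as bands
    in the mixed sample at one locus. Genotype frequencies are computed
    under Hardy-Weinberg proportions (an assumption of this sketch).
    """
    total = 0.0
    indices = range(len(band_frequencies))
    for i, j in combinations_with_replacement(indices, 2):
        p, q = band_frequencies[i], band_frequencies[j]
        total += p * p if i == j else 2.0 * p * q
    return total
```

Under these assumptions the sum equals the square of the total frequency of the observed bands, so a mixture showing bands with frequencies 0.10, 0.20, and 0.05 includes about 12% of genotypes at that locus, which is far more than any single two-band genotype.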
Another possible cause of extra bands is leakage between adjacent sample lanes or misloading of two samples in a single lane. Such an occurrence can be exceedingly difficult to detect and could result in an incorrect conclusion. It is therefore important to leave a blank lane between a suspect sample and an evidence sample, so that leakage can be detected and will not lead to false-positive results.
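The clues for extra bands listed above can be arranged as a simple sequence of checks. The following is a sketch only: the order of the checks, the argument names, and the wording of the returned messages are assumptions, and the text itself describes the clues as a partial decision tree.

```python
def extra_band_cause(control_pattern_correct,
                     bands_repeat_previous_hybridization,
                     extra_bands_with_monomorphic_probe,
                     extra_bands_with_most_polymorphic_probes):
    """Suggest a likely origin of extra bands from four observations."""
    if not control_pattern_correct:
        return "improper hybridization or contaminated probe: repeat the hybridization"
    if bands_repeat_previous_hybridization:
        return "incomplete stripping: repeat the hybridization"
    if extra_bands_with_monomorphic_probe:
        return "incomplete digestion: re-extract and redigest if the sample permits"
    if extra_bands_with_most_polymorphic_probes:
        return "possible mixture: compare results across multiple RFLPs"
    # None of the listed clues applies; lane leakage or misloading remains possible.
    return "undetermined: consider lane leakage or misloading"
```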
Reporting of Anomalies
Examiners should document their interpretations of samples thoroughly in writing. They should note all observed bands and any questionable densities that they do not consider to be bands. Anomalous bands should be explained on the basis of appropriate control experiments of the sorts described above.
Measurement of Fragments
Molecular-weight measurements of fragments should initially be made by comparing band positions with known molecular-weight standards run in separate lanes on the same gel (so-called external molecular-weight standards). Measurements should be performed with a computer-assisted or computer-automated system, in which the operator identifies the positions of the bands with a digitizing pen or similar device that directly records them, visually inspects them, or both. Computer-based procedures ensure appropriate documentation of the measurement and promote objectivity.
External molecular-weight standards alone, however, are not sufficient, because anomalies in electrophoresis can lead to errors in RFLP typing caused by band shifting.5 Such anomalies can be due to differences in salt or DNA concentrations among samples (which could be corrected by repeated extraction) or to covalent or noncovalent modifications of the DNA (which might be irreversible). Band shifting could cause two DNA samples from one person to show different patterns or DNA samples from two different persons to show the same pattern. Band shifting also makes it impossible to measure fragment sizes relative to external molecular-weight standards, because the standards have migrated at a different speed.
Band shifting is easy to detect by hybridizing the Southern blot with monomorphic probes—that is, probes that detect constant-length fragments that are always in the same position in all people. If several monomorphic fragments are in the same position in both lanes, it is safe to assume that no band shifting has occurred. If the monomorphic fragments are in different positions, band shifting is present. The committee considers it desirable for all samples to be tested for band shifting by hybridization with monomorphic probes that cover a wide range of fragment sizes in the gel. That approach will eliminate the rare production of a match by shifting of bands in an evidence sample to the same positions as in a suspect sample. Testing laboratories now investigate the possibility of band shifting only when they find two samples with patterns that appear to be similar but shifted relative to one another. (Multiple monomorphic probes might not be available for some systems and might need to be developed.)
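The monomorphic-probe check can be expressed as a comparison of band positions between two lanes. In this sketch the positions are migration distances in millimetres and `tolerance_mm` is a hypothetical allowance for ordinary measurement scatter; neither the units nor any particular tolerance comes from the text.

```python
def band_shift_detected(lane_a, lane_b, tolerance_mm):
    """Compare migration distances (mm) of the same monomorphic fragments
    in two lanes, listed in the same fragment order.

    Returns True if any fragment differs between lanes by more than
    tolerance_mm, which this sketch treats as evidence of band shifting.
    """
    if len(lane_a) != len(lane_b) or not lane_a:
        raise ValueError("need the same non-empty set of monomorphic fragments in both lanes")
    return any(abs(a - b) > tolerance_mm for a, b in zip(lane_a, lane_b))
```

Using several fragments that span the gel matters here, because a single fragment would reveal a shift only in its own region.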
Testing for band shifting is easy, but correcting it is harder. The best approach is to clean the samples (by re-extraction, dialysis, or other measures) and repeat the experiment in the hope of avoiding band shifting. When that is impossible because too little sample is available or it fails (perhaps because of covalent modification of the DNA), it is possible in principle to determine the molecular weights of polymorphic fragments in a sample by comparing them with monomorphic human bands in the same lane—so-called internal molecular-weight standards. These monomorphic fragments are expected to have undergone the same band shift, so they should provide an accurate internal ruler for measurement. (Note that the polymorphic fragments and the internal molecular-weight standards are visualized on separate hybridizations, but can be superimposed on one another, if the external molecular-weight standards are used to align the gels.)
In practice, however, the use of internal standards presents serious difficulties. Accurate size determination requires a number of internal standards. If band shifting caused all fragments to change their mobility by the same percentage, one would need only a single monomorphic fragment to determine the extent of shift. But band shifting appears to be more complex than that. Different regions of the gel shift by different amounts.
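To show why several internal standards are needed, the sketch below estimates a fragment size from the two monomorphic bands that flank it in the same lane. It assumes that the logarithm of fragment size varies linearly with migration distance between adjacent standards, a common approximation that is not taken from the text. The function name and data layout are hypothetical, and the example illustrates the idea only; it is not a validated correction method.

```python
import math
from bisect import bisect_left


def size_from_internal_standards(position, standards):
    """Estimate a fragment size (bp) from internal standards in the same lane.

    position: migration distance of the polymorphic band.
    standards: list of (migration_distance, known_size_bp) pairs for
    monomorphic bands. Interpolation is log-linear between the two
    flanking standards (an assumption of this sketch).
    """
    points = sorted(standards)
    if len(points) < 2:
        raise ValueError("at least two internal standards are required")
    distances = [d for d, _ in points]
    if not distances[0] <= position <= distances[-1]:
        raise ValueError("band lies outside the range covered by the standards")
    i = bisect_left(distances, position)
    if distances[i] == position:
        return float(points[i][1])
    (d0, s0), (d1, s1) = points[i - 1], points[i]
    fraction = (position - d0) / (d1 - d0)
    log_size = math.log10(s0) + fraction * (math.log10(s1) - math.log10(s0))
    return 10.0 ** log_size
```

Because each estimate depends only on the nearest standards, a region of the gel without a nearby monomorphic band cannot be corrected reliably, which is consistent with the difficulty described above.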
Little has been published on the nature of band shifting, on the number of monomorphic internal control bands needed for reliable correction, and on the accuracy and reproducibility of measurements made with such correction. For the present, several laboratories have decided against attempting quantitative corrections; samples that lie outside the match criterion because of apparent band shifting are declared to be "inconclusive." The committee urges further study of the problems associated with band shifting. Until testing laboratories have published adequate studies on the accuracy and reliability of such corrections, we recommend that they adopt the policy of declaring samples that show apparent band shifting to be "inconclusive." The committee recommends that all measurement data be made readily available, including the computer-based images and records. Any analytical software for image processing or molecular-weight determination should also be readily available. All fragment sizes for both known and questioned samples should be clearly listed on the formal report of the testing laboratory.
Current RFLP-based tests use VNTR probes that have dozens of closely spaced alleles. On the one hand, the high degree of polymorphism increases the power of the test to detect differences among persons. On the other hand, the large number of alleles increases the complexity of matching samples, because gels have little ability to resolve nearby alleles (which can differ by as little as 9 basepairs, so that, for practical purposes, the distribution of alleles can appear to be continuous).
Because of the limited resolution, two samples from a single person will often lead to slightly different measurements—e.g., 3.00 and 2.45 kilobases (kb) in one case, 3.03 and 2.40 kb in another. To decide whether two samples match, each laboratory must have a match criterion.6 The match criterion should provide an objective and quantitative rule for deciding whether two patterns match—e.g., all fragments must lie within 2% of one another. When samples fall outside the match criterion, they should be declared to be "inconclusive" or "nonmatching."
The match criterion must be based on reproducibility studies that show the actual degree of variability observed when multiple samples from the same person are separately prepared and analyzed under typical forensic conditions. Some testing laboratories originally used matching rules that were based on the average spacing of fragment sizes in each region of the gel, rather than on actual studies of reproducibility. Other laboratories used purely visual matching criteria. Both are inadequate. Each testing laboratory must carry out its own reproducibility studies, because reproducibility varies among laboratories. The precise match criterion of each laboratory should be made freely available to all interested persons and should be stated in forensic reports.
The match criterion is also used in the calculation of allele frequencies. To determine the probability that a matching allele was found by chance, one counts the number of matching alleles in an appropriately chosen reference population. For the calculation to be valid, the same match criterion must be applied in screening the population databank and in comparing the forensic samples. Some testing laboratories originally used less stringent rules for declaring a match between forensic samples and more stringent rules for determining the frequency of matching alleles in the databank; the effect was an understatement of the probability of obtaining a match by chance.
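The effect of using different windows for the two purposes can be seen in a small sketch. The window definition (a fraction of the mean of the two sizes), the function names, and the toy databank are all assumptions made for illustration and do not represent real population data.

```python
def within_window(size_a, size_b, window):
    """True if two fragment sizes differ by no more than `window`
    (a fraction) of their mean size (an assumed form of the rule)."""
    return abs(size_a - size_b) <= window * (size_a + size_b) / 2.0


def databank_allele_frequency(fragment_size, databank_sizes, window):
    """Fraction of databank alleles that would be declared to match
    fragment_size under the given window."""
    if not databank_sizes:
        raise ValueError("databank is empty")
    matches = sum(1 for s in databank_sizes if within_window(fragment_size, s, window))
    return matches / len(databank_sizes)


# Hypothetical databank of allele sizes in base pairs.
toy_databank = [2900, 2950, 3000, 3050, 3100, 4000, 5000, 6000, 7000, 8000]
same_rule = databank_allele_frequency(3000, toy_databank, window=0.03)
stricter_rule = databank_allele_frequency(3000, toy_databank, window=0.01)
```

In this toy example the stricter databank window counts fewer matching alleles than the window used for the forensic comparison would, so the reported chance-match frequency comes out lower than it should.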
Some have advocated that testing laboratories, instead of using a match criterion, should report a likelihood ratio—the ratio of the probability that the measurements would have arisen if the samples came from the same person to the probability that they would have arisen if they came from different persons. No testing laboratories in the United States now use that approach. The committee recognizes its intellectual appeal, but recommends against it. Accuracy with it requires detailed information about the joint distribution of fragment positions, and it is not clear that information about a match could be understood easily by lay persons.
A laboratory's level of reproducibility can increase or decrease over time. Reproducibility should be measured not only when a laboratory first implements DNA typing, but continually on the basis of actual casework, as well as external proficiency testing (see Chapter 4). One easy way is to record the fragment measurements from the control samples of known DNA included on the membrane and regularly examine the variability in these measurements. A drawback of that approach is that the control pattern might become too well known to the examiners. A slight variation would eliminate the problem. Examiners would continue to use a fixed known control sample on every membrane, but would also be given a blind control sample as a bloodstain to analyze with each case. The latter sample would be randomly selected from a collection of a few dozen known samples. The examiners would not know its specific identity, but only a code number. They would compare the blind control sample against the known patterns, to determine whether it matched to the expected extent. Such an internal test of reproducibility would provide continuing internal measurement of a laboratory's reproducibility. It would likely be a powerful tool for quality control in a laboratory. For convenience, blind control samples could be distributed by a professional association or a private-sector firm. The committee recommends that testing laboratories adopt such a system for continuing measurement of reproducibility and that they regularly examine and report the results. Recommendations for mandating such testing systems are discussed in Chapter 4.
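One simple way to summarize the variability of repeated control measurements is the coefficient of variation. The sketch below is a hypothetical illustration: the choice of statistic and the function name are assumptions, and a laboratory might reasonably prefer another summary.

```python
from statistics import mean, stdev


def percent_cv(measurements):
    """Coefficient of variation (%) of repeated size measurements of one
    control band, using the sample standard deviation."""
    if len(measurements) < 2:
        raise ValueError("at least two measurements are required")
    return 100.0 * stdev(measurements) / mean(measurements)
```

Computed periodically for each control band, a statistic of this kind would show whether reproducibility is drifting and whether the stated match criterion still reflects actual practice.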
Retention of Sample
Scientifically, the best way to resolve ambiguity is often to repeat the experiment. The U.S. justice system guarantees opposing sides the right to have repeat experiments performed by experts of their choice, whenever that is possible. Accordingly, testing laboratories should measure DNA samples before analysis (with accurate devices, such as fluorometers, as well as with ethidium-stained "yield" gels) and should use only the quantity of DNA required for reliable Southern blot analysis. When they can, they should retain enough of a sample to repeat the entire analysis.
TECHNICAL ISSUES IN PCR-BASED METHODS
PCR is a relatively new technique in molecular biology, having come into common use in research laboratories only in the last 4 years. Although the basic exponential amplification procedure is well understood, many technical details are not, including why some primer pairs amplify much better than others, why some loci cause systematically unfaithful amplification, and why some assays are much more sensitive to variations in conditions. Nonetheless, it is an extremely powerful technique that holds great promise for forensic applications because of its great sensitivity and the potential of its use on degraded DNA.
We discuss here two broad categories of technical issues concerning PCR methods: issues related to the amplification step and issues related to the detection of amplified product.
Technical Issues Related to Amplification
The quality and specificity of amplification with PCR depend on the amplification conditions: the amplification cycling program (temperatures, times, and number of cycles), the composition of the amplification mixture (e.g., primer, nucleotide, polymerase, and magnesium concentrations), and the amount and nature of the target DNA in the sample (single-stranded or
double-stranded).7 In some cases, results can vary among thermocyclers from different manufacturers, among thermocyclers of a single manufacturer, and even among different sample wells in a single machine. It is therefore essential that precise conditions be established for each typing system and that the system be thoroughly characterized for its sensitivity to variations in these conditions. If an assay yields spurious or confusing results under particular conditions, it may be necessary to prescribe strict condition limits or to discard the assay altogether. No PCR assay should be used until it has been rigorously characterized in this way.
Qualitative and Quantitative Fidelity
Ideally, PCR amplification products would faithfully represent the starting material in the sample—both qualitatively and quantitatively. But that is not always the case.
PCR amplification is known to result in misincorporation of nucleotides at the relatively low rate of less than one per 10,000 nucleotides per cycle.8 Amplification is usually performed on a sample that contains a large number of molecules; if the misincorporation is random, the low frequency of random errors will not be detected in most systems and will pose no problem for the typing result. Difficulties arise for systems in which the misincorporation is not random. For example, DNA sequences that contain tandem repeat sequences—such as the dinucleotide (CA)n or some VNTRs—present serious problems. Apparently, the DNA polymerase can slip during amplification, introduce or delete copies of the repeat, and produce a heterogeneous collection of fragments, often making interpretation difficult. That drawback is unfortunate: such simple sequence repeats tend to be highly polymorphic in the human population and so would seem to be useful for forensics. Because there is no way to predict which PCR assays will be subject to this problem, each assay must be thoroughly characterized.
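The reason that random misincorporation is rarely detectable can be illustrated with a short calculation. This is a sketch under stated assumptions: errors are treated as independent and random, the rate is taken at the upper figure quoted above, and the cycle number of 30 is assumed purely for illustration.

```python
def max_error_fraction(rate_per_cycle, cycles):
    """Upper bound on the fraction of product strands that carry a
    misincorporation at one given nucleotide position.

    A strand in the final product descends from the template through at
    most `cycles` copying events, each with error probability
    `rate_per_cycle` at that position (errors assumed independent)."""
    return 1.0 - (1.0 - rate_per_cycle) ** cycles

rate = 1e-4   # one error per 10,000 nucleotides per cycle (upper figure in the text)
cycles = 30   # assumed cycle number, for illustration only
bound = max_error_fraction(rate, cycles)
print(f"at most about {bound:.2%} of strands differ at any one position")
```

Under these assumptions only a fraction of a percent of strands differ from the template at any given position, so the correct nucleotide dominates the signal. The bound says nothing about nonrandom errors such as polymerase slippage in tandem repeats.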
In some cases, PCR can be qualitatively faithful but quantitatively unfaithful, because some alleles amplify more efficiently than others. A sample might contain a 50:50 mixture of two alleles and yield an amplified product with a 90:10 ratio.9 Differential amplification can arise through several mechanisms. It has been observed in the amplification of allelic products of different sizes (larger products tend to amplify less efficiently than shorter products) and in the amplification of sequences that differ significantly in GC content (because of differing denaturation efficiencies). In some cases, faithful amplification occurs at some temperatures and differential amplification at other temperatures.8 The possibility of differential amplification needs to be addressed in the design and development of amplification protocols for each genetic-marker system. The safeguards to
ensure that differential amplification does not occur should be defined and documented.
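A small difference in per-cycle efficiency is enough to produce the kind of distortion described above. The sketch below assumes a simple exponential model in which each allele grows by a constant factor per cycle; the model, the function names, and the cycle number of 30 are assumptions for illustration, not measured properties of any assay.

```python
def amplified_fractions(start_a, start_b, eff_a, eff_b, cycles):
    """Fractions of alleles A and B in the product, assuming each allele
    grows by a constant factor (1 + efficiency) in every cycle."""
    a = start_a * (1.0 + eff_a) ** cycles
    b = start_b * (1.0 + eff_b) ** cycles
    total = a + b
    return a / total, b / total

def per_cycle_advantage(final_ratio, cycles):
    """Ratio of per-cycle growth factors that turns a 1:1 start
    into final_ratio:1 after the given number of cycles."""
    return final_ratio ** (1.0 / cycles)

cycles = 30                              # assumed, for illustration
adv = per_cycle_advantage(9.0, cycles)   # 90:10 is a 9:1 ratio
eff_a = 1.0                              # allele A assumed to double every cycle
eff_b = (1.0 + eff_a) / adv - 1.0        # allele B slightly less efficient
frac_a, frac_b = amplified_fractions(0.5, 0.5, eff_a, eff_b, cycles)
print(f"per-cycle advantage: {adv:.3f}, product: {frac_a:.0%} A, {frac_b:.0%} B")
```

In this model, a growth-factor advantage of under 8 percent per cycle converts a 50:50 sample into a 90:10 product after 30 cycles.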
Quantitative analysis of mixed samples with PCR might be problematic. Suppose that PCR amplification reveals four alleles in a sample, and alleles 1 and 2 give a stronger signal than alleles 3 and 4. A conclusion that the two stronger alleles correspond to one contributor with genotype 1/2 and the two weaker alleles to a contributor with genotype 3/4 would be justified only if one had demonstrated that the amplification and detection process yielded signals that were directly proportional to the initial quantities of the alleles. If the locus were subject to differential amplification, the conclusion might be unjustified. This underscores the importance of characterizing possible differential amplification.
Ideally, primer pairs should amplify only the desired target locus. However, nonspecific amplification can be seen if amplification is continued for too many cycles. Limits on the cycle number might be required as a safeguard against nonspecific products.
Some forensic samples contain factors that inhibit amplification, either by binding to the target DNA or by inhibiting the polymerase. In particular, amplification inhibition is often seen with DNA from older bloodstains. It can usually be remedied by re-extracting the DNA to remove the inhibiting factor, by diluting the offending DNA, or by increasing the concentration of polymerase. There is no evidence that any of those procedures affects typing adversely. Nevertheless, the nature of inhibiting factors and the mechanism of the inhibition effect deserve additional study. Each PCR system should be thoroughly characterized on a range of simulated and known forensic samples, to document any effect on reliability.
One of the most serious concerns regarding PCR-based typing is contamination of evidence samples with other human DNA. PCR is not discriminating as to the source of the DNA it amplifies, and it can be exceedingly sensitive. Potentially, amplification of contaminant DNA could lead to spurious typing results. Three sorts of contamination can be identified, as set forth below; each has its own solutions.
Mixed samples. Some evidence samples occur as mixtures, e.g., sexual-assault evidence, which often contains a mixture of semen and vaginal fluids. In mixed samples that contain semen, it is possible to extract the
sperm DNA and the DNA of vaginal epithelial cells separately. That allows the genetic contribution of the male and female to be distinguished. However, there is one important caveat: if the sperm fraction shows a genotype that matches that of the victim, one cannot conclude that this represents the genotype of the perpetrator, inasmuch as it could be due to residual vaginal epithelial cells. The problem should disappear as PCR-based assays for more loci become available. For other mixtures, such separation is not possible. For example, it is not possible to separate the DNA contributed by different persons in mixed bloodstains or in sexual-assault samples that involve two or more perpetrators. Mixed samples are a reality of the forensic world that must be accommodated in interpretation and reconstruction. As a rule, mixed samples must be interpreted with great caution. Their interpretation should always be based on results from multiple PCR assays, so that one can check for consistency across various loci. Interpretations based on quantity can be particularly problematic—e.g., if one saw two alleles of strong intensity and two of weak intensity, it would be improper to assign the first pair to one contributor and the second pair to a second contributor, unless it had been firmly established that the system was quantitatively faithful under the conditions used.
Contamination from handling in the field and laboratory. It is conceivable that DNA can be transferred to evidence samples or reaction solutions through handling, either from the person doing the handling or in transfer from other evidence samples. There are no hard data on the amounts of DNA transferred by physical contact, but there are anecdotal reports of experimenters who contaminated their PCR mixtures with their own DNA. It is difficult to assess the likelihood of this sort of contamination. Steps should be taken to minimize it, such as handling samples with gloves and preparing solutions and processing samples in separate areas. Contamination of solutions can be recognized with appropriate positive-control and blank-control amplifications, which should be used routinely. When a stain composed of blood, semen, or other biological material is analyzed with PCR, it is important to analyze unstained materials next to the stain with PCR as a control for contamination.
PCR product carryover contamination. The most serious problem is contamination of evidence samples and reaction solutions with PCR products from prior amplifications. Such products can contain a target sequence at a concentration a million times greater than that in an unamplified sample, and even a relatively small quantity could swamp the correct signal from the evidence sample. Even the simple act of flipping the top of a plastic tube might aerosolize enough DNA to pose a problem.
Many research and diagnostic laboratories have been afflicted with the problem of PCR carryover. Contamination risks can be minimized by strict
adherence to sterile technique; the use of separate work areas for sample processing, solution preparation, amplification, and type testing; the use of separate pipettes in each area (pipettes are a major source of carryover contamination); and maintenance of a one-way flow of materials from the evidence-storage area to the sample-preparation area to the type-testing area.
Those precautions focus primarily on preventing carryover contamination. But it has become clear that any products carried over from one PCR reaction to another must also be eliminated. One way is to use the nucleotide dUTP in place of dTTP in all PCR reactions.10 PCR products made in this manner can be selectively destroyed by the enzyme uracil N-glycosylase (UNG), which excises the incorporated uracil. Accordingly, all evidence samples would be treated before PCR amplification with UNG, to destroy contamination from previous PCR reactions. The method holds promise, although it has not yet been extensively tested in practice. Methods of detecting and preventing contamination from one PCR reaction to another in forensic laboratories are generally still in their early stages, and additional development should be encouraged.
As with contamination due to handling, carryover contamination can be signaled by the appearance of product in blank controls and of mixed or inappropriate types in samples and positive controls. Such controls should be used rigorously. Moreover, it should be remembered that the controls are useful for monitoring general contamination in the laboratory, not the accuracy of a particular experiment. If a blank control is positive in one experiment, it indicates a potential problem not just for that experiment, but for any experiment performed at about the same time—even in a laboratory contaminated with PCR carryover, blank controls do not necessarily become contaminated on every occasion. It will be wise to repeat all work with samples that have never been exposed to the PCR-typing laboratory.
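The point that a positive blank casts doubt on neighboring experiments can be expressed as a simple review rule. The sketch below is hypothetical throughout: the run records, the function name, and the seven-day window are assumptions chosen for illustration, and a laboratory would need to set its own window.

```python
from datetime import date, timedelta

def runs_to_review(runs, window_days=7):
    """Return IDs of runs performed within `window_days` of any run whose
    blank control showed product.

    `runs` is a list of (run_id, run_date, blank_positive) tuples."""
    window = timedelta(days=window_days)
    positive_dates = [d for _, d, positive in runs if positive]
    return [
        run_id
        for run_id, d, _ in runs
        if any(abs(d - p) <= window for p in positive_dates)
    ]

# Hypothetical run log; only R3 had a positive blank control.
runs = [
    ("R1", date(1991, 3, 1), False),
    ("R2", date(1991, 3, 10), False),
    ("R3", date(1991, 3, 12), True),
    ("R4", date(1991, 3, 15), False),
    ("R5", date(1991, 4, 20), False),
]
flagged = runs_to_review(runs)
print(flagged)
```

Runs with clean blanks are flagged along with the contaminated one, because a clean blank does not show that a run performed at about the same time was free of carryover.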
In view of the problem of contamination due to handling and carryover, laboratories must incorporate contamination control into their standard operating procedures. Outbreaks of contamination, and the steps taken to correct them, should be documented.
One of the best safeguards against contamination is to have DNA typing independently performed in two laboratories, each starting with a piece of the unprocessed evidence sample. Because typing is inexpensive, serious consideration should be given to independent replication of results, at least during the early stages of this technology.
Issues Related to Detection of Amplified Product
Variation after PCR amplification can be detected in several ways. The most popular detection schemes for nonforensic analyses are reverse dot hybridization, analysis of PCR products for size variation with gel electro-
phoresis, analysis of PCR products with gel electrophoresis after restriction-endonuclease digestion, analysis for the presence of amplification after use of allele-specific amplification primers, and analysis of the nucleotide sequence. No matter what detection scheme is used, contamination of the test sample with a second DNA sample or differential amplification of one allele in a sample that contains two alleles at the test locus can produce an error in typing. Differential amplification could result in the typing of a true heterozygote as a homozygote, or the low level of hybridization of the second allele could suggest the presence of a contaminating DNA in the test sample. A repeat PCR amplification and analysis might be successful and pinpoint a problem in the first amplification procedure. Besides those general problems, each detection format can entail its own technical problems.
Reverse Dot Hybridization
In forensic analysis today, the single PCR-based kit available uses reverse dot hybridization to detect variation at the HLA-DQ locus.11 Reverse dot hybridization is based on a yes-no detection scheme. Theoretically, the absence of a signal in a dot means that the test allele is not present in the DNA sample, and a signal in a dot indicates the presence of the test allele. An intermediate signal in a dot—a signal that is considerably less intense than a second signal in the test hybridization—can result from a second DNA sample's contaminating the test sample, technical variation in the conditions of analysis (e.g., hybridization temperature), or true heterozygosity for the allele that produces the intermediate signal. Usually, such problems can be resolved through repeat experiments or by comparing results from a number of loci with many alleles (such loci could shed light on the nature of mixtures). If the DNA sample of a crime victim contains the allele in question, it would suggest contamination as the source. If repeat of the hybridization and washing procedures eliminates the intermediate signal, it would suggest its spurious nature. Rarely, the origin of an intermediate hybridization signal might remain unresolved; if the type of the sample is still questionable, the data should be discarded.
Other Detection Methods
When restriction-enzyme digestion of a PCR product is necessary to demonstrate alleles, the major pitfall is incomplete digestion. One can control the problem in the test sample by using a restriction enzyme for which a constant site that produces a fragment of constant size exists in the test fragment. For example, suppose that digestion of a 500-bp PCR fragment by HaeIII yields a constant 300-bp fragment and either a 200-bp fragment or 120-bp and 80-bp fragments; a poor yield of the 300-bp fragment in a particular sample would suggest incomplete digestion of the PCR product.
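The HaeIII example can be turned into a simple check on the yield of the constant fragment. The following sketch assumes that band signal is proportional to fragment mass, so that a complete digest places 300/500 of the total signal in the 300-bp band; the tolerance, the function names, and the signal values are hypothetical.

```python
def constant_fragment_share(band_signal, constant=300):
    """Share of total signal found in the constant fragment.

    `band_signal` maps fragment size in bp to measured signal
    (arbitrary units, assumed proportional to DNA mass)."""
    total = sum(band_signal.values())
    if total <= 0:
        raise ValueError("no signal measured")
    return band_signal.get(constant, 0.0) / total

def digestion_suspect(band_signal, constant=300, full_length=500, tolerance=0.1):
    """Flag a lane whose constant-fragment share falls well below the
    share expected from a complete digest (constant / full_length)."""
    expected = constant / full_length
    return constant_fragment_share(band_signal, constant) < expected - tolerance

# Hypothetical lanes: a complete digest, and one with uncut 500-bp product.
complete = {300: 60.0, 200: 40.0}
partial = {500: 50.0, 300: 30.0, 200: 20.0}
print(digestion_suspect(complete), digestion_suspect(partial))
```

A flagged lane would call for a repeat digestion rather than for an allele assignment.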
Allele-specific amplification has been used in DNA diagnosis of genetic disease. Diagnosis can be based solely on absence of an amplification product, so it is a difficult technique to control adequately. Absence of a fragment can indicate either failure of the amplification procedure or absence of the allele in question from the test sample. If the former applies, a typing error would result. For this reason, the committee recommends that this method not be used for forensic analysis.
DNA sequence analysis of PCR products is commonly carried out either manually or with automation. Some ambiguity of nucleotide sequence at one or more positions is common and can signal DNA contamination or a technical problem in the analysis. Another important problem occurs when the DNA sequence of a fragment demonstrates variation at more than one position in the nucleotide sequence. Because both alleles at a locus are sequenced in the procedure, it is difficult to determine what fraction of the variation is contributed by each allele. For example, if heterozygosity is observed at two positions in a sequence, one cannot know, without further experimentation, whether one allele contains both variants or each allele contains one.
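The ambiguity can be made concrete by listing the allele pairs that are consistent with a given set of heterozygous positions. The sketch below is illustrative only; the function name and the example bases are hypothetical.

```python
from itertools import product

def possible_haplotype_pairs(het_sites):
    """Return the set of unordered allele (haplotype) pairs consistent
    with the observed heterozygous positions.

    `het_sites` is a list of (base1, base2) tuples, one for each
    position at which two different bases were observed."""
    pairs = set()
    for choice in product((0, 1), repeat=len(het_sites)):
        # One allele takes the chosen base at each site; the other
        # allele takes the remaining base.
        hap1 = "".join(site[c] for site, c in zip(het_sites, choice))
        hap2 = "".join(site[1 - c] for site, c in zip(het_sites, choice))
        pairs.add(frozenset((hap1, hap2)))
    return pairs

# Hypothetical result: A/G observed at one position, C/T at another.
pairs = possible_haplotype_pairs([("A", "G"), ("C", "T")])
for pair in sorted(sorted(p) for p in pairs):
    print(" + ".join(pair))
```

With two heterozygous positions there are two possible pairs, and each additional heterozygous position doubles the number, which is why further experimentation is needed to assign the variants to alleles.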
New methods of detection of PCR products will surely be devised. Well-controlled, extensive studies of the methods will be required before their use in forensic science, and the quality-assurance procedures described in Chapter 4 will be important to ensure their rigorous testing and reliability.
Use of Kits
One commercial kit for forensic PCR analysis has been marketed. Other such kits will probably be ready for commercial markets soon. The committee sees a potential for introduction of unreliable kits and the misuse of kits. The existence of a kit suggests ease of use and low chance of technical error. The committee believes that nonexpert laboratories will run a significant chance of error in using kits. We therefore recommend that a standing committee (discussed later in this chapter) consider the issue of regulatory approval of kits for commercial use in forensic DNA analysis. Even though no precedent exists for regulation of tests in forensic applications, we believe that it might be necessary for a government agency to test and approve kits for DNA analysis before their actual forensic use.
Prospects of PCR-Based Methods
PCR analysis has a number of desirable features for forensic applications. It requires very little DNA (less than for a Southern blot by a factor of 100-150) in the evidence sample. It is thus feasible to amplify dozens of
loci. It generates a large quantity of relatively pure product that can be analyzed with much greater precision than Southern blots, even down to the nucleotide level. At the same time, it poses even more serious issues of proficiency, control, and technology transfer than RFLP typing.
In summary, it is well established that one can greatly amplify a locus with authenticity and that one can reliably detect alleles or sequence variation at the amplified locus with any of a number of techniques. PCR analysis is extremely powerful in medical technology, but it has not yet achieved full acceptance in the forensic setting. The theory of PCR analysis, even though it is the analysis of synthetic DNA, as opposed to the natural sample, is scientifically accepted and has been accepted by a number of courts. However, most forensic laboratories have invested their energy in development of RFLP technology and have left the development of forensic PCR technology to a few other laboratories. Thus, there is no broad base of experience in the use of the technique in identity testing.
Forensic PCR-based testing is now limited for the most part to analysis of genetic variation at the DQ locus in the HLA complex. Potential ambiguities in typing results cannot yet be checked by studying a number of other loci in the same DNA sample. That shortcoming will be rectified with the addition of new PCR markers for forensic analysis. However, it is clear that analysis of the DQ locus with PCR can often provide useful information during the investigative phase in the forensic setting.
In general, further experience should be gained with respect to PCR in identity testing. Information on the extent of the contamination problem in PCR analysis and the differential amplification of mixed samples needs to be further developed and published. A great deal of this information can be obtained when a number of polymorphic systems are available for PCR analysis. Ambiguous results obtained with a number of polymorphic markers will signal contamination or mixtures of DNA in a sample.
Quantification of PCR results needs to be explored, to make the results more reliable. Laboratories that gain experience with PCR should determine the relationship between cycle number and percentage of contaminating DNA easily detected for each system used. Control primers that amplify small amounts of DNA reliably and robustly need to be added to test amplifications. In general, information derived from new polymorphic loci under standardized conditions with easily quantifiable results or end points is needed. Considerable advances in the use of PCR in forensic analysis can be expected soon; the method has enormous promise.
NATIONAL COMMITTEE ON FORENSIC DNA TYPING
Forensic DNA typing is advancing rapidly. RFLP-based typing methods continue to be refined and improved, PCR typing methods are being
used in some court cases, and other methods are being developed in scientific and commercial research laboratories. Typing methods will continue to be replaced with ever more sophisticated approaches for some time to come. These developments hold great promise for increasing the sensitivity and reliability of forensic DNA typing.
The rapidity of development creates a need to balance two competing societal objectives. On the one hand, new technologies should be made available quickly. On the other hand, forensic typing methods should not be used until their soundness is established both in principle and in practice. The problem involves both technology and technology transfer. Forensic DNA typing is drawing methods from the cutting edge of molecular genetics, but must apply them to quite different circumstances.
The committee believes that the field of forensic science would be best served by the creation of a National Committee on Forensic DNA Typing (NCFDT) to provide advice on scientific and technical issues as they arise. NCFDT would consist primarily of molecular geneticists, population geneticists, forensic scientists, and additional members knowledgeable in law and ethics. Its charges would be to provide guidance about the power and limitations of DNA typing methods, to identify potential problems and their solutions, to provide guidance about whether new technologies are ready for practical use in the forensic laboratory, and to provide advice concerning the regulation of kits for forensic DNA typing. In addition (as discussed in Chapter 3), NCFDT would provide advice on population genetics and statistical interpretation.
Such a committee could play a critical role in smoothing the acceptance of DNA typing technologies in the courtroom while ensuring their reliability. Although NCFDT would have no formal regulatory authority, we anticipate that substantial influence would derive from its stature and the quality of its advice, so that courts could look to its recommendations in making their decisions.
The present committee recommends that NCFDT be convened under the auspices of an appropriate government agency. Because its task is fundamentally scientific, we feel that the agency should be one whose primary mission is scientific, rather than related to law enforcement. To avoid any appearance of conflict of interest, an agency that uses forensic DNA typing itself would be unsuitable. Two excellent choices would be the National Institutes of Health (NIH) or the National Institute of Standards and Technology (NIST). NIH has extensive experience in molecular biology, population genetics, and laboratory practice. NIST has less direct experience in those fields, but has considerable experience in evaluating technologies. Regardless of which agency convenes NCFDT, we believe that the effort should have broad government support from NIST, NIH, the National Science Foundation, the National Institute of Justice, the Federal
Bureau of Investigation, and the State Justice Institute. NCFDT should also have broad support from the American Society of Crime Laboratory Directors, the Genetics Society of America, and the American Society of Human Genetics.
The creation of an expert advisory committee is a somewhat unusual step for forensic science. However, we feel that it is the appropriate way to ensure that the field can incorporate new developments promptly while maintaining high standards.
SUMMARY OF RECOMMENDATIONS
Any new DNA typing method (or substantial variation on an existing method) must be rigorously characterized in both research and forensic settings, to determine the circumstances under which it will yield reliable results.
DNA analysis in forensic science should be governed by the highest standards of scientific rigor, including the following requirements:
Each DNA typing procedure must be completely described in a detailed, written laboratory protocol.
Each DNA typing procedure requires objective and quantitative rules for identifying the pattern of a sample.
Each DNA typing procedure requires a precise and objective matching rule for declaring whether two samples match.
Potential artifacts should be identified by empirical testing, and scientific controls should be designed to serve as internal checks to test for the occurrence of artifacts.
The limits of each DNA typing procedure should be understood, especially when the DNA sample is small, is a mixture of DNA from multiple sources, or is contaminated with interfering chemicals.
Empirical characterization of a DNA typing procedure must be published in appropriate scientific journals.
Before a new DNA typing procedure can be used, it must have not only a solid scientific foundation but also a solid base of experience.
Regarding RFLP-based typing, the committee makes a number of technical recommendations, including specific recommendations about the choice of probes, the use of ethidium bromide in gels, controls for anomalous bands, measurement of fragment sizes, controls for band shifting, match criteria, and sample retention.
Regarding PCR-based typing, the committee makes a number of technical recommendations, including recommendations for thorough characterization of each PCR assay, to define the range of conditions under which it will perform reliably, and for strict contamination measures and other control procedures.
The committee strongly recommends the establishment of a National Committee on Forensic DNA Typing under the auspices of an appropriate government agency, such as NIH or NIST, to provide expert advice primarily on scientific and technical issues concerning forensic DNA typing.
1. Balazs I, Baird M, Clyne M, Meade E. Human population genetic studies of five hypervariable DNA loci. Am J Hum Genet. 44:182-190, 1989.
2. Culliford BJ. Determination of phosphoglucomutase types in bloodstains. J Forensic Sci Sociol. 7:131-133, 1967.
3. Culliford BJ. The examination and typing of blood stains in the crime laboratory. Washington, D.C.: U.S. Government Printing Office, 1971.
4. Sensabaugh GF. Biochemical markers of individuality, pp. 338-415 in: Saferstein R, ed. Forensic science handbook. Englewood Cliffs, New Jersey: Prentice-Hall, 1982.
5. McNally L, Baird M, McElfresh K, Eisenberg A, Balazs I. Increased migration rate observed in DNA from evidentiary material precludes the use of sample mixing to resolve forensic cases of identity. Appl Theor Electrophoresis. 5:267-272, 1990.
6. Thompson WC, Ford S. The meaning of a match: sources of ambiguity in the interpretation of a DNA print in forensic DNA technology, pp. 93-152 in: Farley M, Harrington J, eds. Forensic DNA technology. Chelsea, Michigan: Lewis Publishing, 1991.
7. Arnheim N, Levenson CH. Polymerase chain reaction. C&E News. 68:36-47, October 1, 1991.
8. Erlich HA, Gelfand D, Sninsky JJ. Recent advances in the polymerase chain reaction. Science. 252:1643-1651, 1991.
9. Comey CT, Jung JM, Budowle B. Use of formamide to improve amplification of HLA DQa sequences. Biotechniques. 10:60-61, 1991.
10. Longo MC, Berninger MS, Hartley JL. Use of uracil DNA glycosylase to control carryover contamination in polymerase chain reactions. Gene. 93:125, 1990.
11. Saiki RK, Walsh PS, Levenson CH, Erlich HA. Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes. Proc Natl Acad Sci USA. 86:6230-6234, 1989.