Suggested Citation:"Workshop Proceedings." National Research Council. 2006. Path to Effective Recovering of DNA from Formalin-Fixed Biological Samples in Natural History Collections: Workshop Summary. Washington, DC: The National Academies Press. doi: 10.17226/11712.

Workshop Proceedings

The workshop was divided into four sessions (Appendix C). The first focused on properties of DNA after formalin fixation. The second examined ways to obtain sequence information from formalin-fixed samples. In the third session, participants discussed applications of bioinformatics for reconstructing DNA sequences from formalin-fixed samples. Each session began with brief presentations by participants with relevant expertise, followed by open discussion. The challenge to participants was to identify a path to successful recovery of DNA sequence information from formalin-fixed samples stored in either alcohol or formalin. In the final session, participants made suggestions on areas of research and experimentation needed to investigate the mechanisms and kinetics of DNA damage by formalin fixation and on how to develop ways to repair DNA that would make it useful for study.

Workshop cochair Donald M. Crothers (Yale University) acknowledged the enormous potential for advancing science that could accrue to DNA sequence information from museum specimens. Biologists in various disciplines, for example, would like to use natural history specimens collected over 100-year spans for evolutionary, molecular, and genetic studies. However, users encounter major problems in obtaining both the DNA and the sequence information from those samples because of interference from the formalin fixation. Although the proximate goal is to obtain sequence information of the cytochrome c oxidase subunit 1 (COI) gene for DNA barcoding,1 the ultimate goal is to obtain genome sequences for other studies.

Biologists have used different methods for extracting DNA from formalin-fixed samples, and some have yielded DNA sequence information, but only under narrow conditions. However, those nominal successes suggest that the problem of recovering DNA sequence information from formalin-fixed samples is solvable. This workshop brought a group of experts together to discuss potential solutions and alternative methods. Crothers clarified that the specimens in question had been fixed and stored in 5 to 10 percent formalin solution or had been fixed in formalin solution for a few days and then preserved in ethanol. Workshop cochair Ann Bucklin (University of Connecticut) explained that in some cases the formalin for preservation or storage was unbuffered and therefore acidic.

Mark Rubin (Brigham and Women’s Hospital) and David Schindel (Consortium for the Barcode of Life; CBOL) suggested that, although the workshop’s focus was on biological samples stored in aqueous solution, much could be learned from protocol development for DNA extraction from formalin-fixed and paraffin-embedded samples. Marvin Caruthers (University of Colorado) said that paraffin embedding creates a more stable environment for the formalin-fixed sample than storage in aqueous formalin or alcohol. For example, whereas the pH of the paraffin does not change, the formaldehyde in formalin can be oxidized to formic acid by exposure to atmospheric oxygen, thereby reducing its pH.

DNA IN SAMPLES EXPOSED TO FORMALIN

Reactions of DNA and Formaldehyde

To begin the discussion on the effect of formalin exposure on DNA, Crothers showed a slide that sums up the reactions that occur in formaldehyde fixation of a drug, adriamycin (Figure 1). During fixation, formaldehyde reacts with amino groups of guanine (G), adenine (A), and cytosine (C), and a guanine residue is a typical reaction product. The formaldehyde forms covalent linkages with the amino groups, which then cross-link with proteins, so the drug is linked to the guanine moiety. The other guanine moiety has a strong hydrogen bond with adriamycin, which produces a tight, stable complex. However, the drug must be kept cold to maintain the stability of the fixation; heat can cause the disassociation of the entire complex. The cross-links are labile to the aromatic amines of DNA, and the cross-links or the reaction to formaldehyde are stable.

FIGURE 1 Covalent adriamycin-DNA adduct. SOURCE: Zeman et al., 1998.

1 “DNA barcoding is a technique for characterizing species of organisms using a short DNA sequence from a standard and agreed-upon position in the genome. DNA barcode sequences are very short relative to the entire genome, and they can be obtained reasonably quickly and cheaply. The cytochrome c oxidase subunit 1 mitochondrial region (COI) is emerging as the standard barcode region for higher animals. It is 648 nucleotide base pairs long in most groups, a very short sequence relative to 3 billion base pairs in the human genome, for example” (“DNA barcoding,” Consortium for the Barcode of Life, http://barcoding.si.edu/DNABarCoding.htm).

The process of formaldehyde fixation alters DNA in three ways: through fragmentation, sequence modification, and cross-linking. Cross-linking is not destructive to nucleic acids, and is reversible. Using 13C-labeled formaldehyde, Crothers said, it is possible to see that the methylene carbon came from the formaldehyde (Figure 1). Nuclear magnetic resonance (NMR) spectroscopy can be used to show what happens to the formaldehyde carbon when it reacts with DNA in different circumstances so that the kinetics of the reactions between formaldehyde and different kinds of nucleotides over time can be revealed. Using modern NMR methods to study formaldehyde’s reactions with double-helical or single-stranded oligonucleotides could reveal information on the kinetics.

If formaldehyde is left unbuffered, it is oxidized to form formic acid, which has destabilizing properties. The formic acid depurinates DNA; that is, it cleaves the N-glycosidic link between purine bases and deoxyribose, which results in the loss of purines from the DNA backbone, and the degradation is likely to be irreversible. Crothers mentioned a paper by Quach and colleagues (2004) that assessed sequence modifications due to formalin fixation. That group reported that formalin fixation speeds sequence modification, but that the rate does not depend on the duration of formalin fixation. The ability to make a longer amplicon using polymerase chain reaction (PCR) analysis decreases dramatically with increasing duration of formalin fixation. Crothers said he suspected that the formaldehyde used for fixation is oxidized to formic acid over time, which causes denaturation of DNA and more cross-linking reactions. Storage of samples in unbuffered formalin for prolonged periods is likely to produce DNA that is so degraded it cannot be used for PCR analysis.

Crothers cautioned that, in some cases, even if PCR did not produce amplified DNA, the lack of an amplicon does not imply an absence of DNA. Rather, DNA purification reagents could contain PCR inhibitors. In response to that comment, Charles Cantor (Sequenom, Inc.) suggested the use of an internal control (that is, adding copies of a standard that is known to amplify with its primers) to ensure that PCR was not inhibited. Crothers suggested mass spectrometry or single-molecule sequencing for small DNA fragments as an alternative to PCR. Because single-molecule sequencing can be done on multiple molecules, the resulting sequences could be compared to locate the damage in each sequenced molecule.

Caruthers agreed with Crothers that sequencing DNA from formalin-fixed samples is comparable to sequencing apurinic acid with small stretches of pyrimidines. He suggested that sequence information can be recovered from those small stretches but the informatics involved would be challenging. Mitochondrial DNA (MtDNA) has many adenine-thymine base pairs (that is, it is A-T rich) so that the method that Caruthers suggested is not likely to work, said Robert DeSalle (American Museum of Natural History). Timothy O’Leary (U.S. Department of Veterans Affairs) added that MtDNA is less accessible than is nuclear DNA, probably because of the abundance of adenine-thymine base pairs.

Caruthers stated that the major problem is in the cross-linkages. First, there are cross-linkages between bases: adenine with adenine or adenine with cytosine. Second, there are cross-linkages with proteins, for example, histones with a lot of lysine. The samples would require protease treatment to yield DNA. Proteinase K is commonly used to degrade proteins, but Caruthers said that the cross-linkages would halt DNA polymerization. He suspected that the polymerase would be stopped by lysine-adenosine or adenine-adenine linkages. Aside from cross-linkages, Caruthers said it is not clear how DNA would be degraded by formalin unless a solution were acidic and acid hydrolysis were causing depurination.

Daniel Ryan (Agilent Technologies, Inc.) suggested a possible solution to the depurination problem. In DNA, purines are base paired with pyrimidines so that all that remains in depurinated DNA is the pyrimidines—absent their complementary bases. Ryan and his colleagues are working with microarrays that are functionally similar to addressable beads. They have seen a single nucleic acid bound to a microarray spot, and they have detected one molecule or one of those spots. He suggested that it might be possible to isolate depurinated DNA using the DNA’s remaining binding energy. Crothers questioned whether there is a hybridization system that can recognize depurinated DNA. Ryan suggested that it is a stringency problem.

Cantor suggested a method complementary to Ryan’s. He said that if a universal base, such as inosine, could be added to those apurinic sites, the fragments would become nucleic acids again, and working with nucleic acids is simpler. Caruthers and Timothy Harris (Helicos BioSciences Corporation) agreed that sequence information could be obtained from nucleic acids reconstructed from fragments of depurinated DNA by single-molecule sequencing. However, Tom Evans (New England Biolabs, Inc.) said the proposed repair method would work only with double-stranded DNA.

O’Leary showed the reactions of formaldehyde with nucleotides and nucleic acids over a short period (under 24 hours) (Equations 1 and 2).

[Equation 1]

[Equation 2]


He reported that the reactions are reversible. The reaction between formaldehyde and nucleic acid carried out at 24°C could be reversed by incubation at 70°C or by dialysis. However, if the sample were transferred to alcohol after fixation, other reactions and molecular alterations would occur.

Transferring the formalin-fixed sample to ethanol triggers depurination of DNA in the sample and formation of ethanol adducts. The ethanol adduct could be cleaved at pH 4, which is not acidic enough to cause depurination. Even a short period of formalin fixation followed by dehydration can cause damage, but the nature of that reaction is not well studied. Investigating the reaction with single nucleotides could prove useful.

Reliability of Sequence Information

Some participants questioned the reliability of the sequence information obtained from formalin-fixed samples. For example, would certain DNA strands be more susceptible to sequence modification as a result of formalin fixation? If so, the DNA sequence obtained would not be a true representation of the original specimen. DeSalle suggested that multiple clones could be examined to see whether there is consensus in the DNA sequences among the clones. Harris pointed out that examination of multiple clones would be useful only if the sequence alterations that result from fixation are random. By comparing multiple clones and looking for overlap in sequences, a true sequence can be deciphered. However, if the alteration is systematically biased (that is, if sequence modification occurs in the same region of every clone), then there is no way to determine where the alteration occurs unless the formalin-fixed sample can be compared with a fresh sample.
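The distinction Harris drew between random and systematic alterations can be illustrated with a small sketch. The Python below is a hypothetical majority-vote consensus over clone sequences that are assumed to be already aligned; the function name `consensus`, the `min_fraction` threshold, and the toy sequences are assumptions made for illustration and are not a protocol discussed at the workshop.

```python
from collections import Counter


def consensus(aligned_clones, min_fraction=0.5):
    """Majority-vote consensus over equal-length, pre-aligned clone sequences.

    A position is called only if one base is seen in more than
    `min_fraction` of the clones; otherwise it is reported as "N".
    """
    if not aligned_clones:
        raise ValueError("need at least one clone sequence")
    length = len(aligned_clones[0])
    if any(len(seq) != length for seq in aligned_clones):
        raise ValueError("clone sequences must be aligned to the same length")

    called = []
    for column in zip(*aligned_clones):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(column) > min_fraction else "N")
    return "".join(called)


# Hypothetical original sequence and clones (illustrative data only).
original = "ACGTACGT"

# Random damage: clones are altered at different positions,
# so the vote recovers the original sequence.
random_damage = ["ACGTACGT", "ACTTACGT", "ACGTACGA", "ACGTACGT"]

# Systematic damage: every clone carries the same change,
# so the vote reproduces the altered base without any warning.
systematic_damage = ["ACTTACGT"] * 4

recovered = consensus(random_damage)
biased = consensus(systematic_damage)
```

In this toy case `recovered` matches `original` while `biased` does not, which is why a fresh reference sample would be needed to detect a systematic change.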

Rubin pointed out that conducting a systematic comparison of fresh and preserved samples could lead to a better understanding of the problems associated with recovering DNA from formalin-fixed samples. The comparative study would allow documentation of alterations in formalin-fixed samples. Then, more research could determine whether any of those alterations hampers the determination of the original sequence.

Oxidative Damage

Miral Dizdaroglu (National Institute of Standards and Technology) discussed his work on oxidative stress and damage. His laboratory uses mass spectrometric techniques to observe oxidative damage in DNA isolated from animal tissues between 10,000 and 20,000 years old. Oxidative stress causes the formation of highly reactive hydroxyl radicals, which react with DNA bases and with the sugar moiety of DNA, possibly to cause base damage, sugar damage, DNA-protein cross-links, and single- and double-strand breaks in DNA. The damage occurs because a hydroxyl radical can add to the double bonds, forming other intermediate radicals that react further. Cytosine and purine lesions, for example, form as a result of reactions with hydroxyl radicals. Most lesions are mutagenic, which means that polymerase will not stop but will insert the wrong base opposite those lesions. Some lesions are lethal; they stop polymerase, and DNA cannot be synthesized from that point on. Dizdaroglu has found thymidine dimers in samples exposed to ultraviolet radiation, but he had not compared lesions in DNA extracted from fresh samples with those in DNA extracted from formalin-fixed samples. Harris questioned whether a vial of DNA that sustains oxidative damage could be repaired by an enzyme cocktail. Several participants said that it could be repaired to some extent. Basic repair, however, involves many enzymes, said Dizdaroglu. Crothers added that DNA had to remain double stranded for successful enzymatic repair. Furthermore, hydroxyl damage has some minimal sequence preference, so lesions might not always occur in the same region.

Variations in Curatorial Treatments

Participants who work with natural history collections discussed the variations in the curatorial processing of biological samples. Buffered and unbuffered formalin, for example, have been used for fixation and storage. Although most zooplankton samples are fixed and stored in formalin, others often are fixed in formalin and transferred to a 70 percent ethanol solution after fixation. The duration of formalin fixation varies widely among samples stored in ethanol.

Ryan asked whether anyone had determined the size of DNA extracted from formalin-fixed specimens. Specifically, he was wondering whether a specific formalin treatment or curatorial treatment of a specimen leads to increased fragmentation of DNA. Crothers asked whether anyone had obtained PCR products from specimens preserved in aqueous formalin. Bucklin replied that she and her colleagues had examined DNA sequences for northern krill, Meganyctiphanes norvegica (Crustacea, Euphausiacea), fixed and stored in formalin for 2, 3, 15, and 18 years (Bucklin and Allen, 2004). When they amplified the DNA to determine the size of fragments, they found that the longer the specimen had been preserved in formalin, the shorter the DNA fragments. Bucklin and colleagues had not been able to obtain DNA from specimens stored in unbuffered formalin. Crothers reiterated that obtaining DNA from samples that had been stored in pH 2 formalin is fruitless because the formic acid is likely to have irreversibly depurinated the DNA. Caruthers stressed that an attempt to restore damaged DNA by neutralizing pH 2 formalin might incur more damage, at least to the sample’s RNA.

Christoffer Schander (University of Bergen) also has compared the success of different protocols for extracting DNA from tissues fixed in formalin for different durations (Schander and Halanych, 2003). He reported that the success of DNA extraction depends not only on the protocol, but on the tissue from which the DNA was extracted. O’Leary suggested that bones and teeth could be useful alternatives to soft tissue for obtaining DNA. Bones and teeth might be better protected from damage by formalin. However, Evon Hekkala (U.S. Environmental Protection Agency) and Gonzalo Giribet (Museum of Comparative Zoology, Harvard University) pointed out that the strategy would be useless for many organisms that have neither bones nor teeth.

Some participants asked how many repeats of the sequencing process would be needed to obtain reliable information. The number, according to Ernie Mueller (Sigma-Aldrich Company), would depend on the size of the DNA fragments. The shorter the fragment, the more repeats are needed. One participant indicated that fragments of 500-600 base pairs were the longest that had been obtained from formalin-fixed samples stored in aqueous solution.

Rubin suggested that information about the curatorial history of samples is critical to developing an optimal protocol. He mentioned that he chaired a task force at the National Cancer Institute to devise an optimal protocol for obtaining DNA from archival human tissues. That group reported that researchers in different laboratories use different protocols for sample preservation, and sometimes, even a slight variation can make a big difference in the success of DNA extraction. Determining which variation in a preservation or processing protocol has the largest effect is an important step in identifying optimal protocols for DNA extraction. Hekkala thought that Rubin’s approach might be useful for developing a matrix that could be used to predict whether particular specimens would be useful for recovering sequence information. Participants agreed that a survey of curatorial practices could be useful for determining which other factors should be considered in identifying specimens that have the potential for DNA extraction.

Giribet recalled a project by Bhadury and colleagues (2005) that examined nematodes preserved in formalin. When the nematodes were fixed for 7 days, the extracted and amplified DNA showed clear bands on a gel. But when the nematodes were fixed for 11 days, the gel showed smears instead of distinct bands, indicating that much of the usable material was lost. Bucklin said that she had obtained good DNA from samples fixed for a week or less, regardless of whether the formalin was buffered or not. Steven Hofstadler (Isis Pharmaceuticals) questioned whether Bhadury’s group quantified the extracted DNA. If not, the results could be interpreted as a lower yield in extracted DNA from samples fixed for longer duration instead of as a decrease in amplification.

Schindel asked whether there is a way to detect the DNA’s integrity without extraction. Crothers said that would be a true analytical challenge. Cantor mentioned that the DNA sequences with multiple thymines in a row (called poly-T tracts) are more stable than others. Those sequences preserve well because they do not react with formaldehyde, said Crothers. One of Cantor’s students found that poly-T tracts in closely related bacteria can be distinguished if they are measured precisely. However, Cantor does not know the variability of poly-T tracts in higher organisms.

Schindel asked about the relative importance of DNA degradation and PCR inhibition when DNA amplification has not been observed. There is a potential for small molecules to block PCR, especially if there are only a few copies of the DNA to be amplified, said Crothers. Therefore, an internal control would ensure that PCR was not inhibited. Hofstadler cautioned that the amount of internal standard could mask the DNA to be amplified if the DNA is present only in a low concentration. In addition to small-molecule inhibitors, a lesion in DNA also can inhibit PCR, said Crothers. A lesion is an inhibitor in the sense that even though a molecule is of a given length, it cannot be amplified because the enzymes cannot get through it. That kind of PCR inhibition is more difficult to control for. The larger the DNA fragment, the more likely it is to have a lesion or protein bound that blocks PCR.
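The link between fragment length and lesion-blocked PCR can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that blocking lesions occur independently at a constant per-base rate p, so that a template of L bases is lesion-free with probability (1 − p)^L. The function name and the example rate are hypothetical; no such model or rate was presented at the workshop.

```python
def fraction_amplifiable(length_bp, lesion_rate_per_base):
    """Probability that a template of `length_bp` bases carries no blocking lesion.

    Assumes independent lesions at a constant per-base rate, which is an
    illustrative simplification rather than a measured property.
    """
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    if not 0.0 <= lesion_rate_per_base <= 1.0:
        raise ValueError("lesion_rate_per_base must be between 0 and 1")
    return (1.0 - lesion_rate_per_base) ** length_bp


# Assumed rate of one blocking lesion per 200 bases (hypothetical value).
rate = 1 / 200
short_amplicon = fraction_amplifiable(100, rate)
barcode_length = fraction_amplifiable(648, rate)  # 648 bp is the COI barcode length
```

Under this assumption the amplifiable fraction falls off exponentially with length, which is consistent with the observation that short amplicons are easier to recover than long ones.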

The discussion turned to questions of how to assess the integrity of DNA before extraction. Alison Williams (Princeton University) suggested that capillary electrophoresis might be sensitive enough, and that perhaps a few bases could be observed from one or two kilobases. Cantor suggested ethidium bromide and 4′,6-diamidino-2-phenylindole (commonly known as DAPI), and DeSalle suggested spectroscopic analysis. Ryan suggested reversing cross-linking by hydrolysis with water vapor at 65°C. Intercalating dyes that have specific fluorescence signatures, such as ethidium bromide or the more sensitive PicoGreen, can be added (Ahn et al., 1996).

Schindel asked whether heating the samples would enhance extraction. O’Leary explained that the aqueous heating might not be useful for samples from natural history collections because those samples are likely to sustain more extensive damage than are paraffin-embedded, formalin-fixed samples. Hekkala said she had followed the critical drying method proposed by Fang et al. (2002) but did not obtain any amplifiable DNA from her samples. Crothers noted that some participants had not shared information about failed attempts at DNA extraction from formalin-fixed samples until this workshop. The user community has no means of communicating the protocols that they have tried to use to extract DNA and failed. Sharing that information is important because the collective information could shed light on why some attempts succeed. Crothers suggested a Web site be set up for that purpose.

The development of a set of standardized reference samples to test DNA extraction protocols would be useful for comparing the protocols to determine what works under which conditions, said Schander. Scott Miller (Smithsonian Institution) explained that the Smithsonian has a set of specimens (goldfish exposed to a series of different formalin treatments) that could be used as the standards. The specimen set includes samples of frozen tissue that has not been exposed to formalin. Schindel said he would like to see the goldfish standard held in reserve until an acceptable approach to extraction protocol testing is developed. For example, Schindel had hoped that chemists could elucidate the degradation processes in formalin fixation that block DNA extraction or hamper PCR analysis, and then identify or develop better protocols. “There is no gold standard method at the moment,” Crothers said. Because of variations in curatorial processes, the chemistry of DNA degradation in formalin-fixed specimens is largely unknown.

In summation, Crothers listed possible alterations or damage to DNA exposed to formalin. They include irreversible depurination caused by acidification (if formaldehyde is unbuffered), cross-linking, oxidation from reactions with minor content of formaldehyde, cytosine deamination, and minor adducts. The chemistry of cross-linking is not well understood, and some cross-linkages are rather stable. Cytosine deamination is enzymatically reversible in samples that contain double-stranded DNA.

OPTIMIZATION OF DNA SEQUENCE INFORMATION

This section explored how sequence information can be optimized after DNA is extracted from samples. Several participants explained methods that could be used to assemble sequence information from DNA.

PCR-Mass Spectrometry

Cantor explained a PCR-mass spectrometry method developed at Sequenom. The method’s advantages are in its sensitivity, precision, and cost-effectiveness. Its sensitivity is better than what is possible in conventional sequencing methods because mass spectrometry produces no background noise, and it is less expensive than real-time PCR analysis. Although the technology cannot be used to survey an entire genome, it is a cost-effective method for examining hundreds of loci in thousands of samples. PCR-mass spectrometry is fully automated, and it can process about 3000 samples per day. Some 160 entities are using the PCR-mass spectrometry technology; many are organizations that provide the service for a fee.

PCR-mass spectrometry is a multiplex method that can analyze 30 genotyping or 20 gene expression samples simultaneously in one tube. Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry requires a smaller amount of DNA as input than PCR, so that the method is optimally designed for small amplicons. The mass spectrometry method covers an unlimited dynamic range. Because mass spectrometry is expensive, it is not used for standard sequencing. Rather, nucleic acids are sequenced by base-specific cleavage reactions. Sequenom has all four single base-specific cleavage reactions working with complicated, but single-tube, technology. It is possible to quantify reactions at every locus in mixed, complicated samples, including deaminated samples. Conventional sequencing requires one continuous target. But sequencing in the mass spectrometer by fragmentation does not require a continuous target, so a discontinuous set of sequences can be sampled for the cost of a single sequencing reaction. The PCR-mass spectrometry method yields 98-99 percent correct typing. Cantor encouraged the users to try the method because it is a mature and available technology. Caruthers asked whether multiplexing PCR is a problem. Cantor explained it is not, because the amplicons used are short and because all amplicons are close to the same size. Sequenom had experience working with short amplicons, and the multiplex is designed by software. Generally, 28 of 30 multiplexes work well.

Mass Spectrometry for Interrogating Nucleic Acid

Hofstadler explained another mass spectrometry method for obtaining base composition information from PCR products. He reiterated Cantor’s point that the mass spectrometer is a powerful tool for analyzing base composition of nucleic acids. Unlike conventional sequencing, mass spectrometry as described by Hofstadler derives a “base composition” signature, which represents the exact count of adenine, guanine, cytosine, and thymine (for example, A10 G23 C32 T17), on the basis of the precise measurements of the PCR product’s mass. Unlike a sequencing technique, the approach can provide information on a mixture of nucleic acids with a dynamic range of about 100:1. Mass spectrometry is highly automated, so it can run round the clock to analyze more than 1500 samples in 24 hours. It also is more sensitive than conventional sequencing. Hofstadler said he had performed single-molecule detection at the stochastic limit of PCR to obtain PCR amplicons from 10-20 copies of a genomic template or reaction.

Hofstadler described the electrospray interface and solution conditions he used to ionize intact DNA. The procedure starts with DNA molecules in solution. Using specific interface and buffer conditions, the DNA is denatured in the gas phase into complementary strands. The mass spectrometer measurement provides independent measurements of the forward and reverse strands of the amplicon. From those strands, an unambiguous base composition can be determined for the complementary amplicon pair. Because large molecules yield many charge states upon electrospray ionization, measurement of those charge states from the same molecule facilitates the determination of molecular weight, which is in turn used to determine base composition.
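How multiple charge states constrain molecular weight can be sketched as follows. The Python below assumes negative-ion electrospray (common for nucleic acids, although the ionization mode is not stated above), two peaks from adjacent charge states of the same molecule, and idealized peak positions with no measurement error. The function name and the example numbers are hypothetical and do not describe Hofstadler's software.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mass_from_adjacent_peaks(mz_low_charge, mz_high_charge):
    """Infer charge and neutral mass from two adjacent negative-ion charge states.

    For an ion that has lost z protons, m/z = (M - z * PROTON_MASS) / z.
    `mz_low_charge` is the peak with charge z (larger m/z) and
    `mz_high_charge` is the peak with charge z + 1 (smaller m/z).
    """
    if mz_high_charge >= mz_low_charge:
        raise ValueError("the higher charge state must appear at lower m/z")
    # (m/z + proton) equals M/z for each peak, so the ratio below equals z.
    z = round((mz_high_charge + PROTON_MASS) / (mz_low_charge - mz_high_charge))
    neutral_mass = z * (mz_low_charge + PROTON_MASS)
    return z, neutral_mass


# Simulated peaks for an assumed 25,129.09 Da strand at charges 20 and 21.
true_mass = 25129.09
peak_20 = true_mass / 20 - PROTON_MASS
peak_21 = true_mass / 21 - PROTON_MASS
charge, mass = mass_from_adjacent_peaks(peak_20, peak_21)
```

Each additional pair of adjacent peaks gives another independent estimate of the same mass, which is one way to read the statement that many charge states facilitate the determination of molecular weight.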

Analytical chemists routinely identify the elemental composition of small molecules based on mass measurements. Knowing the weight of the molecule and the atomic mass of the elements, chemists can calculate the proportion of each element in the molecule. Similarly, base composition of DNA can be determined from the mass of the DNA strand and from the known masses of its adenine, guanine, thymine, and cytosine. The PCR products that Hofstadler typically examines are no more than 150 base pairs long. Molecular weights from both strands are used to derive base compositions because determination of base composition from a single strand is ambiguous even with high-precision measurements (for example, to one part per million [ppm]). For a DNA strand that weighs about 33,000 mass units, there typically are 80-90 base compositions that add up to the same mass, within achievable mass measurement uncertainties. However, because PCR reveals complementary products (the number of adenine on one strand equals the number of thymine on the other), the determination of base composition is simple as long as a mass measurement of 25 ppm is achievable. PCR typically is driven to saturation (35-40 cycles), so analytically useful measurements are derived routinely from low numbers of copies.
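The reasoning above (many compositions fit one strand's mass, but far fewer fit both strands at once) can be sketched in Python. This is a minimal illustration, not the method Hofstadler described: the residue masses are approximate average values, terminal groups are ignored, and the function names are hypothetical. The forward composition A10 G23 C32 T17 is the example signature quoted earlier in the text; the reverse strand is derived from it by base pairing.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA strand.
# Terminal groups are ignored; the values are for illustration only.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}


def strand_mass(a, g, c, t):
    """Mass of a strand with the given counts of A, G, C, and T."""
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"])


def candidate_compositions(measured_mass, ppm):
    """All (A, G, C, T) counts whose mass lies within `ppm` of `measured_mass`."""
    tol = measured_mass * ppm * 1e-6
    upper = measured_mass + tol
    found = []
    for a in range(int(upper // RESIDUE_MASS["A"]) + 1):
        mass_a = a * RESIDUE_MASS["A"]
        for g in range(int((upper - mass_a) // RESIDUE_MASS["G"]) + 1):
            mass_ag = mass_a + g * RESIDUE_MASS["G"]
            for c in range(int((upper - mass_ag) // RESIDUE_MASS["C"]) + 1):
                # Only the nearest whole number of T residues can fit the remainder.
                remainder = measured_mass - mass_ag - c * RESIDUE_MASS["C"]
                t = round(remainder / RESIDUE_MASS["T"])
                if t >= 0 and abs(strand_mass(a, g, c, t) - measured_mass) <= tol:
                    found.append((a, g, c, t))
    return found


def joint_compositions(forward_mass, reverse_mass, ppm):
    """Forward-strand candidates whose complement also matches the reverse-strand mass."""
    reverse_tol = reverse_mass * ppm * 1e-6
    keep = []
    for a, g, c, t in candidate_compositions(forward_mass, ppm):
        # Watson-Crick pairing: the complement swaps A with T and G with C.
        if abs(strand_mass(t, c, g, a) - reverse_mass) <= reverse_tol:
            keep.append((a, g, c, t))
    return keep


forward = (10, 23, 32, 17)                   # A10 G23 C32 T17
forward_mass = strand_mass(*forward)
reverse_mass = strand_mass(17, 32, 23, 10)   # complementary strand
single = candidate_compositions(forward_mass, ppm=25)
paired = joint_compositions(forward_mass, reverse_mass, ppm=25)
```

Comparing `len(single)` with `len(paired)` shows how much the second strand narrows the candidate list under these assumed masses; real instruments would also need to account for terminal groups, adducts, and isotope distributions.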

The base composition approach offers at least two potential advantages over conventional sequencing: Complex mixtures can be interrogated directly, and useful signatures can be derived from relatively short pieces of highly degraded DNA.

Single-Molecule Sequencing by Synthesis

Harris discussed single-molecule sequencing, a method first presented by Braslavsky and colleagues (2003) and designed for short-read resequencing of genomes that have a reference sequence. Single-molecule sequencing requires fragmentation of genomic DNA but, as Hofstadler pointed out, that might not be necessary because the DNA obtained from formalin-fixed tissue is already fragmented. In the method, genomic DNA is fragmented into pieces on the order of 100 base pairs long, and the fragments are then melted to single strands (Figure 2). If the DNA is to be repaired, it should be repaired at the ends, and then poly-A tails are added to the 3′ end of each fragment. For convenience, a Cy-3-labeled dideoxy-nucleotide also is added at the 3′ end so the strand cannot be extended in that direction. Each fragment shown at the bottom of Figure 2 represents a single molecule of DNA that will be probed as an individual target template. Slides are made with poly-T on the surface. The poly-A serves as the capture probe and as a primer for sequencing. A picture is then taken in the green channel to record the position of each template. A dye-labeled nucleotide “X” is added to the DNA fragments enzymatically at sites where the target strand contains a nucleotide that is complementary to X. Misincorporation is negligible (less than 0.1 percent). The excess deoxyribonucleotide triphosphate is then washed out, and a picture is taken of each dye molecule.

Finally, the dye molecules are cleaved from the DNA and the sequencing reaction is repeated for nucleotide Y (X, Y, and so on correspond to A, T, G, C). Each piece is a single molecule, sequenced by synthesis. The read length is limited to fewer than 50 nucleotides (50mers), usually about 30 nucleotides (30mers), because the sequence is synthesized one base at a time and because of unknown chemical failures. Harris asked the bioinformaticians present whether they could assemble 700 base pairs (1400 bases) of complementary sequence from 30mers. Steven Salzberg (University of Maryland, College Park) stated that would be possible if the sequence were only 700 bases long, but not if the sequence has 7 million bases.

FIGURE 2 The first three steps in single-molecule sequencing. SOURCE: T. Harris, Helicos BioSciences Corporation.
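The add, wash, image, and cleave cycle described above can be mimicked with a small simulation. The flow order, the cycle count, and the function name are assumptions chosen for illustration; misincorporation and chemical failures are ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_synthesis(template, flow_order="ACGT", n_cycles=120):
    """Simulate one template molecule and return the incorporated bases.

    `template` is written in the direction in which the polymerase reads it.
    One labeled nucleotide type is offered per cycle, and at most one base
    is added per cycle.
    """
    read = []
    pos = 0
    for cycle in range(n_cycles):
        if pos == len(template):
            break  # the whole template has been copied
        nucleotide = flow_order[cycle % len(flow_order)]
        if COMPLEMENT[template[pos]] == nucleotide:
            read.append(nucleotide)  # a dye signal would be imaged here
            pos += 1
    return "".join(read)
```

In this sketch the read is the complement of the template, and the number of cycles rather than the template length sets the read length, which is one way to see why reads stay short.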

DeSalle asked whether Harris had encountered problems with normalization. Harris replied that although no problems had arisen yet, the representation of sequence is not uniform. Some sequences appear overrepresented (excess coverage), others are underrepresented. This nonuniform variation of sequence coverage was found in single-molecule sequencing for viruses. It is not known whether the coverage bias arises from biased sample fragmentation, from bias in the sequence generation process, or both.

In Vitro Repair

Evans discussed a “PreCR” method for in vitro DNA repair that involves treating damaged DNA in vitro with a mixture of DNA repair enzymes before PCR. The enzyme mixture emulates base excision repair; the core enzymes are a DNA polymerase and a DNA ligase. A broad spectrum of DNA damage is repaired by including other DNA repair enzymes that act in concert with the ligase and polymerase. At the time of the workshop, the PreCR enzyme mix Evans had been using could repair abasic sites, nicks, gaps, ultraviolet radiation damage, deaminated cytosine, and some forms of oxidative damage. DNA cross-links or highly fragmented DNA could not be repaired effectively, and the enzyme mix cannot repair highly damaged DNA.

The PreCR repair mixture was tested against templates purposely damaged by methylene blue, which causes oxidative damage, and by low pH, heat, and ultraviolet radiation. 8-oxo-guanine is reported to be a common oxidative product that causes cytosine-to-thymidine changes in the PCR product. If oxidatively damaged DNA is a template for PCR, the amplicon contains many more mutagenic lesions than generally would be found in damaged DNA treated with FPG (formamidopyrimidine [fapy]-DNA glycosylase). FPG excises 8-oxo-guanine so that the number of mutagenic lesions decreases, but the DNA template is fragmented in the process, and the PCR results are affected. If FPG is used in conjunction with other enzymes, specifically a ligase, polymerase, and endonuclease IV, the mutagenic lesions are effectively removed and an amplicon is obtained.

The repair mix was designed to repair abasic sites, thymine dimers, gaps, nicks, and deaminated cytosine. A purposely damaged template was used during development to optimize the repair. However, PreCR was not successful when tested on formalin-fixed samples. The DNA damage needs to be better defined so that researchers can devise more effective solutions. Evans suggested the need to identify markers that indicate different types of DNA damage, so that he and his colleagues could design specific enzyme mixes for use in DNA repair. The lack of success could result from the presence of unrecognized DNA lesions that were not repaired by the enzymes currently in the PreCR mix.

Whole Genome Amplification

Mueller presented an overview of GenomePlex®, a Sigma-Aldrich whole genome-amplification system for use on damaged DNA and possibly on formalin-fixed tissue. The system starts with partially fragmenting genomic DNA, a step that would be unnecessary with the fragmented DNA found in formalin-fixed samples. Mueller, however, cautioned that the system would not work with DNA that is too fragmented. The fragments are efficiently primed and amplified to make a library of fragments that have a known sequence at each end, and thus are PCR-amplifiable units. PCR is then performed on those fragments with universal primers to create an amplified Omniplex® library (Figure 3).

Before amplification, Mueller tested the amount of DNA extracted from formalin-fixed and fresh tissues using two protocols: the Leeds protocol, which uses proteinase K (Jackson et al., 1990), and a detergent-mediated lysis, CelLytic Y. His results showed a consistently lower amount (30 times less) of amplifiable DNA in formalin-fixed tissue than was available from fresh tissue. DNA yield varied slightly with the type of tissue used in the extraction. Because extraction gave consistent yields, as shown by quantitative PCR, Mueller and his colleagues presumed that the limited amplification was attributable to the quality of the extracted DNA. Large additions of extracted lysate carried over from the extraction process often resulted in PCR inhibition. There are two issues associated with formalin-fixed samples. First, a substantial amount of DNA is lost in the fixation process: only three to five percent of the DNA can be extracted for PCR. Second, the DNA extraction process itself could introduce PCR inhibitors.

FIGURE 3 Overview of GenomePlex® whole genome amplification. SOURCE: E. Mueller, Sigma-Aldrich Company.

Mueller also mentioned that Sigma-Aldrich has been pursuing the development of a strand-based technique for DNA repair, similar to the one Evans presented. Mueller and his colleagues found a model system that repairs abasic sites. They sent the system for testing by M. Hajibabaei at the University of Guelph, Canada. Hajibabaei was working with the COI gene (cytochrome c oxidase subunit 1) in formalin-fixed samples. The model system worked for one batch of samples but failed for a second batch. Mueller said that the Smithsonian goldfish specimens might provide a good standard test for elucidating those inconsistent results.

Protocols for Obtaining Sequence Information from Formalin-Fixed Samples

Hekkala described her successes and failures in experimenting with various protocols for obtaining sequence information from formalin-fixed samples. She used crocodile tissue (brain, heart, and muscle) in her tests. She obtained different yields by various extraction protocols, although the quality of DNA did not differ. She had about 20 percent success in obtaining sequence information with the Shedlock protocol (Shedlock et al., 1997). Using identical tissue, she produced no sequence information with the critical point drying method (Fang et al., 2002) or with the Qiagen protocol (Wickham et al., 2001). The Shedlock protocol uses a glycine buffer to “soak out” the formalin. In that test, Hekkala ran 40 samples, 10 of which produced amplicons. She ran the extraction again on those 10 samples, and 8 yielded amplicons. There was no change from the 30 samples that produced no amplicons either on the first run or in subsequent runs. The 40 samples were obtained from different individual specimens provided by different museums. The 10 samples that yielded amplicons were taken from specimens provided by the American Museum of Natural History and the California Academy of Sciences. Twelve specimens provided by the Field Museum of Natural History yielded no amplicons. Curatorial processes at different museums could have affected the ability to extract and amplify the DNA.

Choosing the Path to Optimize Sequence Information

After hearing about different methods for optimizing sequence information from extracted DNA, Schander raised the question of what path could lead to sequence information from formalin-fixed samples. Should the effort focus on developing DNA extraction protocols suitable for formalin-fixed samples? Should it work at identifying enzymes for DNA repair? Should bioinformatics be used to assemble short fragments of DNA? Schindel said he favored research on improving DNA extraction, but DeSalle disagreed. Based on the presentations and discussions in the workshop, DeSalle thought the success of obtaining DNA sequence information would depend on whether the formalin-fixed samples in fact contained extractable DNA and not on the availability of an extraction protocol or the suitability of an existing one. Extraction might produce long or short fragments, but sequence information can be obtained using some of the methods already discussed. Therefore, he said, repairing DNA to obtain longer fragments might not be necessary, and identifying the samples with extractable DNA is more important than is refining extraction protocols. Evans countered that it is not known which aspect (damage to DNA, inefficient DNA extraction, or the presence of PCR inhibitors) is the largest problem in blocking retrieval of DNA sequence information from formalin-fixed samples. Therefore, DNA repair should not be discounted as part of the solution.

Schander echoed DeSalle’s thought that, because extraction, PCR, and sequencing are time-consuming, it would be useful to determine whether a formalin-fixed sample contains any extractable DNA (samples left in unbuffered formalin are not likely to have extractable DNA). Hekkala agreed and suggested experiments be designed to correlate traits of biological samples (curatorial history, pH, taxonomic group) with the potential for obtaining DNA from those samples. The correlations could inform the choice of samples for sequencing. Participants agreed that developing a framework to identify samples most likely to yield usable DNA would be useful. For example, samples kept in unbuffered formalin are not likely to be usable because of formic acid damage, and samples with free purines, which could be detected by mass spectrometry, also are unlikely to be usable. A formal assessment of the physical and chemical conditions and the curatorial history of formalin-fixed samples in natural history collections coupled with controlled DNA extraction experiments would be necessary to develop the framework. Crothers suggested a pilot exploratory assessment because the DNA degradation processes in formalin-fixed samples are largely unknown, although controlled experiments could shed light on that. He said that characterizing the quantity and quality of DNA extracted would be important, and that no one, at least at the workshop, seemed to have done that.

Determining the mechanism of DNA degradation (for example, depurination or formation of formaldehyde adducts), the efficiency of DNA extraction (whether extraction yields any DNA at all), and the efficiency of PCR amplification (whether damaged sites are amplified or PCR inhibitors are present) could help explain why sequence information cannot be obtained from particular formalin-fixed samples. Some participants said that the type of tissue used for extraction also affects the success of extraction or PCR amplification. Ryan suggested having a set of samples that included different tissue types sent to different laboratories to test the same DNA extraction protocols. The outcome of systematic, simultaneous, independent testing of extraction protocols would provide information about protocol success, and whether some tissue types yield larger amounts of extracted DNA than others. Schindel and Hofstadler also said that the Smithsonian’s goldfish specimens would be useful for systematic testing.

BIOINFORMATICS FOR RECONSTRUCTING DNA SEQUENCES

Can sequence information be obtained from the ordinarily short DNA fragments obtained from formalin-fixed samples? Schander asked whether sequence information obtained from short fragments can be considered reliable. Salzberg had indicated earlier that assembling a 700 base region from 30mers is possible, but assembling an entire genome from 30mers is impossible. Neil Hall (The Institute for Genomic Research) added that if random 30mers of a 700 base pair region are generated, then the 30mers can be assembled into larger contigs, but the problem is to pull out the right regions.
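To make the 700-base case concrete, here is a minimal greedy overlap assembler for error-free 30mers. It is only a sketch under strong assumptions (no sequencing errors, no repeats longer than the minimum overlap, reads all from one strand); the function names and the 15-base minimum overlap are hypothetical, and real assemblers are considerably more elaborate.

```python
def best_extension(contig, reads, min_overlap):
    """Bases that the best-overlapping read would add to the contig's right end."""
    best_k, best_tail = 0, ""
    for read in reads:
        # Try the longest proper prefix of the read first.
        for k in range(len(read) - 1, min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                if k > best_k:
                    best_k, best_tail = k, read[k:]
                break
    return best_tail


def extend_right(contig, reads, min_overlap):
    """Keep appending read tails until no read overlaps the right end."""
    while True:
        tail = best_extension(contig, reads, min_overlap)
        if not tail:
            return contig
        contig += tail


def assemble(reads, seed_index=0, min_overlap=15):
    """Grow one contig in both directions from a seed read."""
    contig = extend_right(reads[seed_index], reads, min_overlap)
    # Extending to the left is extending to the right on reversed strings.
    reversed_reads = [read[::-1] for read in reads]
    return extend_right(contig[::-1], reversed_reads, min_overlap)[::-1]
```

For a short, essentially repeat-free region, overlapping 30mers determine a single contig. In a genome of millions of bases many 15-base overlaps would occur by chance or through repeats, which is consistent with the distinction Salzberg drew.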

Hall said that the reliability of sequence information depends on what is done with the PCR product. Assuming that the PCR analysis is working well and the DNA is amplified to generate long sequences, there is no way to verify whether the reconstructed sequence is representative of the original if sequencing is performed directly on the PCR products. Bioinformatics also would be unable to determine the reliability of the sequence. However, if resequencing were done by cloning the PCR products so that the entire population of DNA in the sample were analyzed, it would be possible to identify and correct the sequence modifications by assembling those reads.

Salzberg emphasized that depth of coverage is necessary to distinguish between sequencing errors, sequence modifications, and true mutations. A minimum of fourfold coverage could be necessary, especially if the DNA is likely to have been damaged; the cost of sequencing 700 bases is within reason for most laboratories. Schander agreed that it would be worthwhile to increase depth of coverage to examine sequence modifications caused by formalin fixation. If sequence modification is induced by formalin, and if that is a common phenomenon, then the more that is known about it, the better its effects could be predicted, said Hall. He reiterated that error screening requires sequencing of the cloned PCR products, and not the direct sequencing of a PCR product.

Participants discussed how many replicate sequences of the COI gene would be needed to create a reference library for DNA barcoding. Schindel said the goal of the barcoding project is to create a reference library with bidirectional reads of five specimens per species, but no replicate per individual specimen. Hekkala said that sequencing several individuals of the same species is important so that sequence variation within a group can be observed. More important, if one formalin-fixed specimen is used for sequencing, it would be necessary to compare it against fresh or frozen tissue to ensure that sequence variation is not an artifact of formalin fixation. Schander questioned the likelihood of sequence error attributable to properties of formalin fixation. O’Leary stated that cloning would be more likely to introduce artifacts than would cycle sequencing. He suggested that data collection on the likelihood of sequence errors as a result of formalin fixation could help determine whether 5 replicates per 10,000 or more specimens would be necessary. Error introduction in PCR sequences is not unheard of, said Hall. Bioinformatics could be used to reassemble the short sequences and to identify errors in sequences if many small PCR products with overlapping regions were being assembled. The overlapping regions show where the sequence information differs. However, Hall said he did not have a good sense of the magnitude of error that would be introduced to a PCR sequence as a result of formalin fixation, so it would be difficult to decide whether it is worth investigating. Turning the question around, Schindel asked how many times a formalin-fixed specimen that produces DNA fragments would need to be resequenced to ensure that a correct sequence was assembled. Salzberg said the number of replications would depend on the confidence level sought. Bioinformaticians can quantify the likelihood of an error in a sequence if they are given a particular number of raw sequences. On the basis of that information, they can estimate the number of replicates necessary to achieve a given confidence level.
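The relationship between the number of replicates and the confidence level can be sketched with a simple binomial model. The model is an assumption made here for illustration, not something presented at the workshop: errors are treated as independent between reads, every erroneous read is pessimistically assumed to show the same wrong base, and ties count as failures.

```python
from math import comb


def consensus_error(n_reads, per_read_error):
    """Probability that at least half of n_reads are wrong at a single site."""
    # Smallest number of wrong reads that defeats (or ties) a majority vote.
    first_bad = (n_reads + 1) // 2
    return sum(
        comb(n_reads, k)
        * per_read_error**k
        * (1 - per_read_error) ** (n_reads - k)
        for k in range(first_bad, n_reads + 1)
    )


def replicates_needed(per_read_error, confidence, max_reads=101):
    """Smallest read count whose per-site consensus reaches `confidence`."""
    for n in range(1, max_reads + 1):
        if 1 - consensus_error(n, per_read_error) >= confidence:
            return n
    return None  # not reachable within max_reads under this model
```

With a hypothetical per-read error rate of 1 percent, `replicates_needed(0.01, 0.999)` returns 3 under these assumptions. The per-read error rate for formalin-fixed material is exactly the quantity participants said was unknown.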

SUMMING UP

This section reviews the questions in the charge to the workshop participants and their answers to the questions as presented by the rapporteurs in their summary on the second day of the meeting. The questions are listed in boldface type.


What is the state of preservation of DNA in the presence of formalin? Are the DNA chains intact or broken? Does formalin denature DNA or is it the process of extraction that is fragmenting the DNA? Are the nucleotides at each site being preserved or altered?

The quality of DNA in a sample, the percentage of recoverable or amplifiable DNA, the length of the fragments, and whether the DNA is well preserved or nucleated in formalin-fixed samples are largely unknown. The variations in processing of formalin-fixed samples partly contribute to that lack of knowledge. For example, some samples are stored in unbuffered formalin, others are fixed in formalin for different durations, and some are transferred to ethanol after fixation. Because of those variations in curatorial treatment, the kinetics of formaldehyde and DNA reactions and the byproducts of different reactions are largely unknown. DNA damage and degradation that can occur in formalin-fixed samples include:

  • Cross-linking with formaldehyde.

  • Fragmentation.

  • Sequence modification.

  • Modifications to adenosine, including methylol adduct formation and depurination.

  • Formation of oxidative adducts that lead to mutagenic lesions.

  • Modification of bases, including adduct formation, if the sample is stored in ethanol.

It is not known whether extraction processes fragment DNA, but they could introduce PCR inhibitors that prevent amplification of the extracted product. Therefore, quantification and characterization of DNA extracted from formalin-fixed samples would reveal both whether DNA can be obtained from those samples and what the quality of the extracted DNA would be. Amplification of an internal control sequence would verify that PCR inhibitors are not the source of the problem.
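The logic of the internal control can be written out as a small decision function. This is a sketch that assumes the control is a template known to amplify in the absence of inhibitors; the function name and the wording of the interpretations are illustrative.

```python
def interpret_pcr(target_amplified, control_amplified):
    """Interpret one PCR run that contains an internal control template."""
    if target_amplified and control_amplified:
        return "target amplified; no sign of inhibition"
    if control_amplified:
        # The reaction itself works, so the extracted template is the suspect.
        return "no target product; suspect damaged or absent template DNA"
    if target_amplified:
        return "control failed but target amplified; repeat with a fresh control"
    return "no product at all; suspect PCR inhibitors or a failed reaction"
```

Only the case in which the control amplifies but the target does not points toward the DNA itself, which is why the control is needed before a failure is attributed to fixation damage.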


How can the physical and chemical states of the DNA-formalin cross-linkages be better characterized? What additional information on these cross-linkages is needed?

The condition of the DNA obtained from formalin-fixed tissue can be characterized by NMR spectroscopy and by mass spectrometry. In addition to characterizing the cross-linkages and other damage to DNA, it is important to correlate the type of damage and degradation with different curatorial practices. Detailed knowledge of curatorial history might signal the likely damage or degradation. For example, if a specimen was kept in unbuffered formalin, its DNA is likely to have become depurinated and useless for sequencing. Additional important information includes data on the kinetic stability of formaldehyde adducts and cross-linkages, whether the stable products are read as mutations by DNA polymerase, and whether they serve to block polymerase altogether. Mass spectrometry and NMR on small, single-stranded and duplex DNA samples would aid in characterizing the structure and the stability of the formaldehyde reaction products. Additional work also could focus on the reactions that ensue when a sample is exposed to ethanol.


What new chemical and physical methods for DNA extraction should be tested, beyond those that have already been applied to formalin-fixed tissue?

Participants agreed that testing new methods for DNA extraction will not be fruitful if the condition of the DNA in formalin-fixed tissue is largely unknown, because a failure to obtain sequence cannot be attributed unambiguously to a failure of the extraction protocol, the absence of usable DNA in formalin-fixed samples, or the presence of PCR inhibitors. Some participants, including Schander, Bucklin, and Hekkala, reported that published protocols have led to some success in obtaining sequence information from formalin-fixed tissue. Instead of testing new protocols, they said testing existing protocols with a set of standardized samples could provide greater insight. Those samples would include several tissue samples from one organism fixed in formalin for different periods; frozen or fresh tissue would be used as a control. Testing different protocols on samples that had been fixed and preserved with standardized curatorial methods could shed light on which extraction protocol is optimal for each type of sample. Once DNA is successfully extracted from formalin-fixed samples, different methods (mass spectrometry, single-molecule sequencing, or whole-genome amplification) could be used to obtain sequence information. As with the extraction process, the optimal sequencing method might depend on the quality of the extracted DNA.


In what ways and to what extent can fragmented DNA be repaired physically and chemically after extraction from formalin?

In some cases, fragmented DNA can be repaired with excision enzymes mixed with other enzymes and with polymerase. But without more information about the type of damage sustained as a result of formalin fixation, designing the appropriate mix of enzymes for the repair will be difficult.


Can bioinformatics techniques be used to reconstruct the original sequence in silico from the DNA fragments recovered from formalin?

Bioinformatics can be used to construct large, contiguous, consensus sequences from short fragments of DNA. One complication is that formalin fixation could cause sequence modification and a sequence obtained from a formalin-fixed sample might not accurately represent the original sequence of the untreated sample. Whether formalin fixation introduces random or systematic error into a DNA sequence is not known, but it is worth investigating. The potential for error introduction could be studied by repeated sequencing of cloned PCR products and by repeating PCR analysis from replicated DNA preparations. The repeated sequencing and PCR from replicated DNA preparations could reveal whether there are overlapping consensus sequences. Based on the overlapping sequences, random errors could be corrected accordingly. Systematic errors would be more difficult to correct. They tend to occur consistently in the same place in the sequence, thereby appearing to be a correct base. In that case, the only way to determine whether formalin fixation alters the sample’s sequence would be to compare it with a sequence from a fresh sample.
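The difference between random and systematic errors can be seen in a toy majority-vote consensus. The sketch assumes the reads are already aligned and of equal length, and the example sequences are invented for illustration.

```python
from collections import Counter


def consensus(aligned_reads):
    """Most frequent base in each column of equal-length, pre-aligned reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


original = "ACGTACGT"

# Random errors fall at different positions in different reads.
random_error_reads = ["ACGTACGT", "ACGAACGT", "ACGTACGT", "TCGTACGT"]

# A systematic error (here C to T at the second base) appears in every read.
systematic_error_reads = ["ATGTACGT", "ATGTACGT", "ATGTACGT", "ATGTACGT"]

recovered = consensus(random_error_reads)
misleading = consensus(systematic_error_reads)
```

Here `recovered` equals `original`, while `misleading` does not, and no amount of additional coverage would change that. Only comparison with a sequence from fresh tissue would expose the systematic change.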

In his summary, Cantor stressed that without knowledge of the quality of the DNA, finding a solution to the problem of obtaining sequence information from formalin-fixed samples is difficult. Rubin agreed and suggested initiating a process to determine which formalin-fixed samples could be used for DNA extraction and how. He listed three elements. First, it would be necessary to screen the specimen before DNA extraction to assess its usability. Screening could be done with mass spectrometry to detect free purines or by testing sample pH. If the DNA damage appeared reversible, a repair could be attempted. The second phase would involve using different protocols to extract DNA from samples that have been subjected to various curatorial treatments. That test would provide a framework for predicting which specimens would be most likely to yield high-quality DNA in each protocol. The last phase would test how well the framework developed in the second phase could predict success in DNA extraction with a whole new set of specimens. To reveal practical limitations, O’Leary said, the process should be iterative and cover a spectrum of samples representing various species and curatorial treatments.
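The first screening phase could be prototyped as a simple rule-based filter. The record fields, the function name, and especially the pH cutoff are hypothetical placeholders; participants named the criteria (unbuffered formalin, free purines, sample pH) but gave no thresholds.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PH = 6.0  # hypothetical cutoff, not a value from the workshop


@dataclass
class Specimen:
    name: str
    buffered_formalin: Optional[bool]  # None when curatorial history is unknown
    free_purines_detected: bool        # for example, from mass spectrometry
    ph: Optional[float] = None         # None when pH was not measured


def screen(specimen):
    """Return the reasons a specimen looks unpromising (empty list if none)."""
    reasons = []
    if specimen.buffered_formalin is False:
        reasons.append("stored in unbuffered formalin")
    if specimen.free_purines_detected:
        reasons.append("free purines detected")
    if specimen.ph is not None and specimen.ph < MIN_PH:
        reasons.append("low pH")
    return reasons
```

Returning reasons instead of a yes-or-no flag keeps a record that the second and third phases could use to check whether each rule actually predicts extraction success.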

THE PATH TO EFFECTIVE RETRIEVAL OF GENOMIC INFORMATION FROM FORMALIN-FIXED SAMPLES

A better understanding of the quality of DNA in samples and of how quality relates to the success of DNA extraction will be needed to inform solutions for effective retrieval of genomic information from formalin-fixed samples. To conclude the workshop, Crothers urged participants to suggest action items for advancing the retrieval of genomic information from formalin-fixed samples. This section compiles the participants’ suggestions.


Properly characterize formalin-fixed samples for DNA extraction.

Discussion during the workshop involved the difficulty of deriving effective ways to obtain sequence information from formalin-fixed samples, especially when there is little information about the causes of the problems. To help identify cause-and-effect relationships, participants developed a table with columns of curatorial treatments and rows of problems caused by each (Table 1). The information to be collected would include curatorial history (duration of formalin fixation and whether the formalin was in a buffered solution) and information about the quality of the DNA in the sample (presence of free purines or adducts). The quantity of DNA and its ability to be amplified would be assessed after extraction, and PCR would be conducted on highly conserved sequences. The information collected would be used to fill in the table’s rows and columns, and those data in turn would serve as a guide for determining whether sequence information could be obtained from a given sample.

TABLE 1 Curatorial Treatments of Formalin-Fixed Samples and Factors That May Impair DNA Extraction or PCR Amplification

Factors Prohibiting DNA Extraction or PCR Amplification | Excessive Fixation | Excessive Heat | Impurities in Alcohol | Low Alcohol Level | Unbuffered Formalin | Other Treatments
Cross-linking | | | | | |
Cytosine deamination | | | | | |
Denaturation | | | | | |
Depurination | | | | | |
Formalin-ethanol interaction | | | | | |
Oxidative damage | | | | | |
Point sequence modification | | | | | |
Presence of PCR inhibitors | | | | | |
Other factors | | | | | |

The matrix is designed to identify classes of samples in natural history collections that should not be used for DNA and sequence information on the basis of their curatorial history. The table provides some examples of curatorial treatments that could affect the quality of DNA and PCR amplification. Others that could affect the quality of samples also should be considered.
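The matrix in Table 1 maps naturally onto a small data structure that laboratories could fill in as results arrive. The sketch below is one possible layout; the function names and the use of `None` for cells without data are assumptions.

```python
TREATMENTS = [
    "Excessive Fixation", "Excessive Heat", "Impurities in Alcohol",
    "Low Alcohol Level", "Unbuffered Formalin", "Other Treatments",
]
FACTORS = [
    "Cross-linking", "Cytosine deamination", "Denaturation", "Depurination",
    "Formalin-ethanol interaction", "Oxidative damage",
    "Point sequence modification", "Presence of PCR inhibitors",
    "Other factors",
]


def empty_matrix():
    """One row per factor, one column per treatment; None means no data yet."""
    return {factor: {t: None for t in TREATMENTS} for factor in FACTORS}


def record(matrix, factor, treatment, observed):
    """Store whether `factor` was observed in samples given `treatment`."""
    if factor not in matrix or treatment not in matrix[factor]:
        raise KeyError(f"unknown cell: {factor!r} / {treatment!r}")
    matrix[factor][treatment] = bool(observed)


def treatments_with_problems(matrix):
    """Treatments linked to at least one observed factor, sorted by name."""
    return sorted(
        {t for row in matrix.values() for t, seen in row.items() if seen}
    )
```

For example, `record(matrix, "Depurination", "Unbuffered Formalin", True)` would encode the association mentioned earlier in the summary, and `treatments_with_problems` would then flag that class of samples.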


A survey could be sent to curators to obtain information on the variety of curatorial treatments in natural history collections. Establishing a network also would be useful to facilitate communication among researchers who want to obtain DNA and sequence information from formalin-fixed samples.

Table 1 presents only the curatorial treatments discussed during the workshop. Other factors could inhibit DNA extraction or PCR amplification, especially given the multiplicity of treatments used to preserve natural history specimens. Participants suggested designing a survey, in consultation with experts at institutions that have natural history collections, to gather information on curatorial history and on any successes or failures in the extraction of DNA or PCR amplification.

Participants also noted that testing of DNA extraction protocols has been done mostly by groups interested in obtaining sequence information from formalin-fixed samples. Although occasional successful attempts are reported in the literature, failed attempts are not reported. Yet comparison of successful and failed attempts could provide clues about determining factors. Thus, the establishment of a Web forum was suggested to allow researchers to pool information on their work with formalin-fixed samples.


In addition to retrospective assessment, controlled experiments on standardized, formalin-fixed samples could be used to examine the mechanisms and kinetics of chemical and physical reactions that could hamper efficient DNA extraction or PCR amplification.

Several experiments could begin immediately to reveal the mechanisms that prevent efficient DNA extraction:

  • Preliminary studies in selected laboratories could be conducted to assess the effect of formalin and alcohol on the integrity of duplex DNA and RNA. The time course of DNA degradation and the efficacy of DNA repair also could be examined.

  • The effects of extraction versus fixation could be examined by studying the properties of freshly fixed tissue, oligonucleotides, and DNA samples mixed with protein.

  • The standard goldfish samples that had been subjected to different curatorial treatments can be used for testing DNA extraction protocols and for elucidating the effects of different treatments on DNA. In addition, each goldfish sample could be sent to different institutions for independent testing. Each institution could then test its own extraction method by quantifying DNA yield (using an Agilent analyzer, for example) and then using the extracted DNA for PCR amplification. Because the fixation and extraction processes could introduce PCR inhibitors, PCR analysis would be conducted with an internal control to ensure that the reaction is not blocked. Results from the multiple-institution protocol testing would be used to develop a standard protocol for DNA extraction from formalin-fixed samples. Repeating the experiment with an invertebrate standard (for example, a flatworm) could provide useful information.

  • Genomic library clones could be established for samples from which DNA had been successfully extracted. The DNA could then be correlated to the fixation process and to sample history and properties.

  • Cloning PCR products for selected formalin-fixed samples that have a known gene sequence or an equivalent frozen or fresh sample could reveal whether formalin fixation induces sequence modification. A comparison of cloned PCR products from fixed tissue with products from fresh or frozen samples, or with the known sequence, would help to quantify mutations. Replicating the experiment on different samples and different species could lead to characterization of the patterns and levels of sequence modification attributable to fixation.
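
The clone comparison described in the last item can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each cloned PCR product is taken to be already aligned to the known reference sequence and of the same length (no insertions or deletions), and the function name `count_substitutions` and the example sequences are hypothetical, not taken from the workshop.

```python
from collections import Counter


def count_substitutions(reference, clones):
    """Tally base substitutions in aligned clone sequences against a reference.

    Assumes every clone is pre-aligned and the same length as the reference
    (no indels). Positions where a clone has 'N' are skipped as uncalled.
    Returns (counts keyed by 'ref>observed', substitutions per base compared).
    """
    counts = Counter()
    compared = 0
    for clone in clones:
        if len(clone) != len(reference):
            raise ValueError("clone and reference must be the same length")
        for ref_base, obs_base in zip(reference.upper(), clone.upper()):
            if obs_base == "N":
                continue
            compared += 1
            if obs_base != ref_base:
                counts[f"{ref_base}>{obs_base}"] += 1
    rate = sum(counts.values()) / compared if compared else 0.0
    return counts, rate


# Made-up sequences for illustration: a known reference and two clones
# derived from fixed tissue.
reference = "ACGTACGTAC"
fixed_clones = ["ACGTATGTAC", "ACGTACGTAT"]
counts, rate = count_substitutions(reference, fixed_clones)
print(dict(counts), rate)
```

Running the same tally on clones from matched fresh or frozen tissue would give a baseline for errors introduced by PCR and cloning themselves, so that only the excess seen in fixed samples is considered as a possible effect of fixation.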

A database could be established to collate the information collected in retrospective and experimental studies. Information in the database could guide the determination of whether particular formalin-fixed specimens could be used for DNA sequencing on the basis of the specimens’ chemical and physical properties.

Participants drafted an example of how the data collected from retrospective and experimental studies could be organized (Table 2). Because institutions with natural history collections have so many formalin-fixed specimens, an assessment of curatorial history and of the chemical and physical properties of the specimens would help identify those that are still useful for DNA sequencing and help prioritize sequencing and barcoding efforts.
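
One way a record in such a database might be represented is sketched below in Python. The fields mirror the column headings of Table 2, but the class name, field names, the choice of nanograms and days as units, and all example values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ExtractionRecord:
    """One row of a Table 2-style database (all names are illustrative)."""
    fixation_type: str                        # e.g. "formalin-fixed, stored in ethanol"
    tissue_type: str
    fixation_duration_days: Optional[float]   # None when the history is unknown
    # Quantity of DNA obtained per protocol, assumed here to be in nanograms;
    # None records an attempt that yielded no measurable DNA.
    dna_yield_ng: Dict[str, Optional[float]] = field(default_factory=dict)


def protocols_with_yield(records, fixation_type, tissue_type):
    """Return protocols that gave measurable DNA for comparable specimens."""
    found = set()
    for rec in records:
        if rec.fixation_type == fixation_type and rec.tissue_type == tissue_type:
            for protocol, ng in rec.dna_yield_ng.items():
                if ng is not None and ng > 0:
                    found.add(protocol)
    return sorted(found)


# Made-up example rows; the numbers are placeholders, not measurements.
records = [
    ExtractionRecord("formalin-fixed, stored in ethanol", "muscle", 30.0,
                     {"Qiagen": 12.5, "Chelex": None}),
    ExtractionRecord("formalin-fixed, stored in ethanol", "muscle", None,
                     {"Chelex": 3.0}),
    ExtractionRecord("formalin-fixed, paraffin embedded", "liver", 2.0,
                     {"Qiagen": 0.0}),
]
print(protocols_with_yield(records, "formalin-fixed, stored in ethanol", "muscle"))
```

Storing failed attempts explicitly (as `None` or zero) rather than omitting them keeps the record of what did not work, which participants noted is rarely reported.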


TABLE 2 An Example of How Data Collected from the Retrospective and Experimental Studies on DNA Extraction and Sequencing from Formalin-Fixed Biological Samples Could Be Organized. Such a Database Could Serve as a Tool for Assessing the Feasibility of Using Certain Specimens for DNA Sequencing.

Columns 4 to 9 fall under the heading "Quantity of DNA Obtained, by Protocols." Cells are blank in the example.

| Fixation Type | Tissue Type | Duration of Formalin Fixation | Shedlock Protocol | Leeds | Qiagen | Chelex | Critical Drying Point | Other |
| Formalin-fixed, paraffin embedded | | | | | | | | |
| Formalin-fixed, stored | | | | | | | | |
| Formalin-fixed, stored in ethanol | | | | | | | | |
| Other fixation type | | | | | | | | |


After potentially useful DNA extraction protocols are identified from the preliminary experiments, random and focused sampling of PCR products could be conducted on a variety of sample types (from different taxa or fixed and preserved with different curatorial treatments) to identify the best protocol for each type. The issues to be addressed involve the recoverability of DNA, including sequences other than polypyrimidine tracts. Samples with euchromatic and heterochromatic DNA also could be considered.
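
As a rough illustration of how results from such sampling might be summarized, the Python sketch below tallies PCR outcomes by sample type and protocol and reports the protocol with the highest success rate for each type. The function name, the tuple layout, and the example trials are hypothetical; success rate is only one possible criterion, and a real comparison would need enough trials per combination to be meaningful.

```python
from collections import defaultdict


def best_protocol_by_sample_type(trials):
    """Pick, for each sample type, the protocol with the highest PCR success rate.

    `trials` is an iterable of (sample_type, protocol, succeeded) tuples.
    Ties go to the protocol whose name sorts first, so the result is deterministic.
    """
    # (sample_type, protocol) -> [successes, attempts]
    tally = defaultdict(lambda: [0, 0])
    for sample_type, protocol, succeeded in trials:
        entry = tally[(sample_type, protocol)]
        entry[0] += 1 if succeeded else 0
        entry[1] += 1

    best = {}
    for (sample_type, protocol), (successes, attempts) in sorted(tally.items()):
        rate = successes / attempts
        if sample_type not in best or rate > best[sample_type][1]:
            best[sample_type] = (protocol, rate)
    return best


# Made-up outcomes for illustration only.
trials = [
    ("fish, ethanol-stored", "Qiagen", True),
    ("fish, ethanol-stored", "Qiagen", False),
    ("fish, ethanol-stored", "Chelex", True),
    ("flatworm, formalin-stored", "Chelex", False),
    ("flatworm, formalin-stored", "Qiagen", False),
]
best = best_protocol_by_sample_type(trials)
for sample_type, (protocol, rate) in best.items():
    print(sample_type, protocol, rate)
```

A sample type for which every protocol failed still appears in the result with a rate of 0.0, which flags it as a class of specimens for which no tested protocol has worked.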

In the long term, consideration of high-throughput processing of formalin-fixed samples for DNA barcoding and genomic studies would be appropriate, given the large number of samples in museum collections. Some individuals in institutions with collections are identifying specimens in their collections or taxa suitable for high-throughput processing, but a systematic and collaborative effort could facilitate and speed up the process.

Ideas and suggestions from the workshop participants could further the efficient extraction of DNA from formalin-fixed samples, and that in turn could improve access to the sequence information of many rare or difficult-to-collect species in natural history collections. Action by the Consortium for the Barcode of Life to follow up on the workshop participants’ ideas and suggestions could facilitate the effective recovery of sequence information from formalin-fixed biological samples.

Museums catalogue our knowledge of the Earth's biodiversity, and their collections represent many decades of work by experts. Access to DNA sequence information in archival specimens would greatly extend knowledge of the genetic relationships within our biosphere. However, molecular genetic analysis of museum specimens has been slowed by the usual practice of fixation and storage of samples in formalin. Formalin is an environmental toxin and induces genetic and chromosomal alterations to the samples.

Few of the many attempts to obtain and sequence DNA from formalin-fixed specimens stored in aqueous formalin or ethanol have been successful. All of the protocols are slow, difficult, and often expensive, and few produce DNA fragments longer than 500 base pairs. Path to Effective Recovering of DNA from Formalin-Fixed Biological Samples in Natural History Collections examines past attempts on DNA recovery from formalin-preserved biological specimens and discusses the research needed to advance the development of similar but more efficient and cost-effective protocols.
