most of the projects, the starting material has been DNA extracted from pure cultures of organisms grown in the laboratory or in association with animal or plant cells. Regardless of the organism selected for sequencing, the goal of the projects has been the same: to generate a complete or nearly complete genome sequence that can serve as the substrate for genome annotation and analysis (see Figure 4-2). For metagenomics projects, it will be important to accumulate additional complete genome sequences, especially for currently underrepresented taxa. Such sequences should help identify otherwise unidentifiable open reading frames in metagenomic fragments and facilitate scaffolding of metagenomic data.
Metagenomics projects differ from traditional microbial-sequencing projects in many respects. The starting material is a mixture of DNA from a community of organisms that may include bacterial, archaeal, eukaryotic, and viral species at different levels of diversity and abundance. Most of the organisms will elude attempts at cultivation. In some projects, sample collection may be confounded by limited amounts of DNA or by contaminating DNA or other compounds that interfere with DNA extraction. These factors make generating complete or nearly complete genome sequences from metagenomics projects much more challenging. Often, generating complete genomes will not