9
Epilogue

Twenty years ago, the Human Genome Project, and the nascent genomic sciences more generally, were highly controversial. Many biologists thought that investing resources in such “molecular natural history” was economically wasteful and intellectually suspect. Now, practically all biologists are genomicists. If not directly pursuing genome sequencing and the other “omic” methods, biologists nevertheless often ground their particular genetic, biochemical, physiological, behavioral, or ecological studies in the work of someone who is. Genomics has been transformative in the deepest sense, not only answering many questions about how organisms function, develop, and evolve, but also driving a radical reformulation of the terms in which such questions are asked. Although initially many of us thought of genomics mostly as a more economical and efficient way (because of economies of scale) to recover and study the behavior of individual genes, in fact it has shifted focus to the collective and integrated activities of genes functioning together, to the networks of interactions between them, and to how these are integrated (and have evolved) in the highly complex and coordinated business of living and reproducing at the level of cells and organisms. As noted earlier, genomics and the associated high-throughput “omic” technologies targeting gene expression, protein synthesis (and modification), protein interactions and protein structure are all becoming experimental subdisciplines of a new concept-driven computational science called systems biology.

What, then, will metagenomics have become in 20 years? We believe that it too will be a concept-driven computational science with subdisciplines that have evolved from the fusion of “omic” approaches and more traditional disciplines, such as environmental and clinical microbiology, biogeochemistry, biological oceanography, soil sciences, and theoretical ecology. It will indeed be the systems biology of the most inclusive biological system we know about: the biosphere of the planet. These disciplines will in the process be transformed and many questions redefined and refocused, most often at a level below (genes and genomes) or above (communities and ecosystems) the organism and species levels at which microbial ecologists have traditionally concentrated their efforts. Although individual microbial cells will always be suitable units of study, the “species,” because we have just begun to uncover the enormous genomic diversity within it, may no longer be a reliable or useful ecological unit. Instead, we will understand ecosystems in terms of the collective activities and interactions of the genes they contain, how these are distributed and expressed in space and time, and how they function together.

We can expect, in 20 years, enormous advances on three fronts (technical, computational, and biological), as well as a host of specific applications.

POTENTIAL TECHNICAL ADVANCES

Sequencing technology will have reduced the per-base price of finished sequence to fractions of a cent, and the cost of sequence-data acquisition will no longer be a serious consideration in studies of specific ecosystems. Sequencing methods now in use will have increased run lengths substantially but will themselves probably have been replaced with even more direct, and often also cloning-independent, approaches, perhaps single-molecule technologies now under development or others yet to be imagined. Single-cell genome sequencing will be routine, and cell-sorting methods that readily permit recovery of even unique individual cells will be well advanced.
Complete genome sequences, some produced by “traditional” methods based on isolates (or single cells) but others acquired metagenomically, will number in the thousands, perhaps even tens of thousands. There will be many “species” for which hundreds of individual isolates will have been sequenced.

Transcriptomic and proteomic applications to community samples will be comparable in their reliability and efficiency with such methods as are used in human genomics today. Incremental improvements in microarray sensitivity, specificity, and reproducibility will make it possible to assess community membership and abundance down to the “species” level, however that concept is then understood. New normalization protocols will allow a census of even the rarest members of a community, and whole-community RNA amplification will access their transcriptomes. We will be able routinely to classify or type ecosystems and monitor changes in their compositions and activities with arrays (and their future equivalents, which may be microfluidics-based) that are inexpensive and readily available commercially. Such monitoring will indeed be routine practice in many environment-based business and regulatory activities and in epidemiology. New “omic” methods and sciences will have been developed for characterizing communities and their genetic, physiological, biochemical, and biogeochemical activities.

Many currently unculturable organisms and consortia will have been “domesticated” by using knowledge of their individual needs and potentialities as derived from community metagenomics. As we come to appreciate the true extent of diversity (even within designated species), we will know that even such facilitated pure-culture or defined-culture studies will never be adequate for global understanding, but they will provide excellent models of physiological interactions and support the refinement of computational models for such interactions.

POTENTIAL COMPUTATIONAL ADVANCES

In 20 years, infrastructural accommodations will have been made for the almost unimaginable amount of metagenomic data that will have accumulated. For reasons elaborated in Chapter 5, the metagenomics databases are expected to dwarf genomic databases, no matter the predicted rate of growth of the latter. Although all sequences and trace data (or their future technological equivalents) will be available through GenBank or comparable public repositories, there will be specialized (but fully public and interoperable) databases of all sorts. It will be possible to answer questions like those sketched in Box 5-1 by direct queries to the databases, which will also be rich in associated metadata. Just as much biological research is now conducted by computer scientists, much microbial ecology will be purely computational.
Indeed, these downstream activities may be the dominant form of metagenomics employment; but metabioinformaticians will need even broader interdisciplinary training and collaborative links in geochemistry, oceanography, earth and atmospheric sciences, biochemistry, microbiology, ecology, genetics and genomics, statistics, and computer science.

Although traditional microbial classification practices (phenotypic characterization and identification at the level of species and genus) may remain useful, the basis on which we predict properties of isolates will be sequence- and computation-driven and probabilistic. Equally often, investigations of community activities of any magnitude (from the tiny but complex ecosystem of a termite’s gut to the Pacific Ocean) will be conducted at the level of genes and their interactions: understanding the “games being played,” with decreased emphasis on phylogenetic identification of the “players.”

POTENTIAL BIOLOGICAL ADVANCES

It is of course pure science fiction to predict what we will know about the biosphere 20 years from now, and it is in the nature of a transformative science to be unpredictable. But it is of some value to guess at the kinds of breakthroughs in biological science that metagenomics will make possible.

Viruses

There are many more viruses (and possibly more kinds of viruses) than there are cells (or kinds of cells). In many ecosystems, viruses are the principal regulators of organismal abundance and may well be the principal agents of genetic exchange between organisms. Their genomes collectively harbor a vast number of genes about which we know almost nothing and that can be exchanged between viruses and cells in a mix-and-match fashion. In 20 years, we hope to have some good idea of the depth of this enormous gene pool and (through comparative genomics, ab initio structural modeling, and extensive structural genomics) a vastly better understanding of what many of the genes do for their viral or cellular hosts and what they might do for us. We will understand and be able to monitor the exchange of information between viruses in the environment and those infecting us and the animals and plants that we use. Our ability to monitor and predict the emergence of viral diseases will be much enhanced.

Cells and Their Genes and Genomes

We will have come to an understanding of the diversity of gene content within species, of how many strain-specific genes are involved in strain-specific biology, and of how many are “just passing through.” We will have a vast inventory of gene sequences and, through structural genomics, a vast reservoir of genes with reasonably inferred functions even if the organisms of origin and the roles of the genes in their biology remain a mystery.
We will be able to say whether adaptation to environmental change of any sort most often involves recruitment of preadapted lineages from elsewhere or cobbling together of novel lineages by exchange and assembly of genes already present.

Species

We will have enough information on the diversity of environmental gene sequences to allow us to redefine the species concept as a more consistent, accurate, defensible, and enduring concept that will have broad value across numerous disciplines and applications. We will have relegated so much of the task of identification of isolates and prediction of their properties to computers and sequence databases that it will be the predictions, not formal identification, that we care about. We will understand the various processes that might be termed “speciation” and have a good idea of their relative frequencies in nature. We will have redefined questions of diversity (“How many species are there in an environment or in the world?”) in terms of the sequences of genes and the composition of genomes.

Biogeography

We will have mapped an enormous number and diversity of genes and genome compositions in space and time and will be able to retrieve and reanalyze this information and associated physical, chemical, and biological metadata. We will have substantial gene-expression and metabolomic data on the same sites and can begin to look at Earth as though it were an organism-like spatiotemporally defined entity with an evolved and homeostasis-promoting global “metabolism.” Gene frequency and expression will make sense in that context even though Earth is not an organism. The question of whether “everything is everywhere” will be subsumed into this gene-level and genome-level analysis, which will be recast in terms of relative rates of divergence and dispersal of genes.

Community Structure and Function

Model-community projects undertaken in the next 5 years will have been completed and, in addition to a deep understanding of their target systems, will provide templates for other studies, smaller in scope but greater in number and ultimately interconnectable.
Community structure will be understood and described (“profiled”) in terms more of gene presence and abundance than of species presence and abundance, and we will have developed a typology or catalog of communities that will allow us to infer what sort of biogeochemistry is happening at any place and time and to monitor changes. Such profiling is already done with ribosomal RNA and a few other markers, but comprehensive functional gene (and gene-function) assessment will be vastly more subtle and informative. One safe prediction is that such profiling will be extensively applied and prove of great value in disease diagnosis and determination of nutritional status of humans (individuals and communities) and of animals and plants that they use or care about. Probiotic therapies and regimens will become evidence-based and increasingly valuable, as will microbiome profiling in the detection of diseases that originate in the host.

Interactions Within and Between Communities

Gene frequency and expression data will, in 2027, have long been the basis for constructing community “interactome” maps, comparable in character to, but vastly more complex than, maps now used by systems biologists to study individual organisms and their responses to perturbations. The combinations of genes and organisms that influence community robustness will have been identified, and predictive principles of community behavior will have been derived. The development and implementation of such analytical models will allow computational microbial ecologists to predict responses (at the level of gene frequency, expression, and exchange) to environmental challenges of all sorts. Testing such predictions will lead to better models. Such reiterative approaches are already used, but models based on all genes rather than a few diagnostic markers will have immensely more explanatory and predictive power. The ultimate goal, perhaps in sight by 2027, would be a metacommunity model that seeks to explain and predict (and retrodict) the behavior of the biosphere as though it were a single superorganism. Such a “genomics of Gaia” would be the ultimate implementation of systems biology. The enormous challenge that creation of such a metamodel represents is matched by its importance for the future of the human species.

POTENTIAL ADVANCES IN EDUCATION AND PUBLIC UNDERSTANDING

By 2027, we will have many more mechanisms for communication than we have now, but all will be usable to teach the public about microbes through the excitement and “big science” appeal of metagenomics. Microbiology will be required in the K-12 curriculum and as a prerequisite for teaching certification, and metagenomics centers across the United States will have developed robust mechanisms for communication with diverse people, including those who do not have access to a university.
The mechanisms might include distance-education courses, mobile microbiology units, press releases about milestones in projects, hosting of teachers in research laboratories, and teaching by metagenomics scientists in K-12 classrooms. Graduate students will be trained to teach microbiology in the classroom and in the larger community.

SOME POTENTIAL SPECIFIC APPLICATIONS

We see metagenomics as a new basic science with many eminently useful (and in tomorrow’s world essential) applications, some accomplishable over the short term and probably most on the drawing board by 2027.

Earth Sciences

The biological forcing of elemental cycles is key to understanding a wide variety of Earth-system processes. Large-scale, ecosystemwide fluxes of energy and matter, however, are difficult to model accurately or to study in the laboratory. By 2027, Earth-system processes will have been examined in much greater detail with metagenomics coupled with other synoptic physicochemical and biological measurements. Microbial-community genomics will provide information important for understanding energy fluxes and biogeochemical mechanisms in the deep subsurface, modeling biologically mediated rock weathering and surface chemistry, and defining the key genetic and biogeochemical drivers of processes that influence greenhouse-gas production and consumption. The oceans, which harbor millions of microbes in each teaspoonful of seawater, will be modeled more fully as we become able to visualize the rich biological systems they encompass.

In a practical sense, such processes as uranium immobilization or acid mine drainage cleanup, which involve coupled biological-geochemical interactions, will be enhanced and improved with new community-genomic datasets. Microbe-enabled oil recovery, subsurface methane production and consumption, and carbon storage and turnover are other critical interfaces between the microbial world and the Earth system.

The new “whole-Earth catalog” of microbial genes and genomes provided by metagenomics will propel a new understanding and new technologies for more appropriate resource use and sustenance of the living Earth system. Predictive models of many vital biogeochemical processes will inform enlightened policy makers. We will be able to say, for instance, why it might or might not be a good idea to seed oceans with iron to increase carbon sequestration. Similarly, we will be able to model (and predict the extent of) methanogenesis in the permafrost as it thaws.
Metagenomics-based environmental monitoring will be a thriving industry.

Life Sciences

Through a fine-scale and nuanced understanding of genetic and ecological processes, we will demolish many generalizations about microbes, replacing them with particularized knowledge. We anticipate that many basic concepts that have vexed biologists for decades (sometimes centuries), a few of which were alluded to earlier in this epilogue, will be recast in molecular terms. Taxonomy, the science of identifying and naming organisms according to their relationships, will be radically transformed. The enormous combined genomic and metagenomics databases will enable us to predict the behavior of an isolate, a consortium, or a complex community on the basis of carefully targeted sequence or other molecular information.

Metagenomic methodology and concepts will have expanded well beyond the realm of viruses, bacteria, and archaea, to embrace the population biology and biogeography of microbial eukaryotes (protists, algae, and fungi). Indeed, the new research methodology and paradigm will have found uses even for macroscopic organisms, when it is population or ecological processes that are of interest. And with a proper appreciation of the roles of microbes in the balance of life, a new global systems ecology embracing all species, including humans, will have been born. This will mandate changes in how we teach biology at all levels. The teaching of microbiology, ecology, and evolutionary biology will all be profoundly affected by metagenomics, bringing the focus of a generation of students back “down to the ground,” where problems can be directly addressed.

Biomedical Sciences

The full extent of interindividual diversity within the human microbiome will be understood, and changes in microbial-community composition that contribute to or are responsible for a number of acute and chronic diseases will have been elucidated. Microbiome-based diagnosis will be an essential component in treatment for many diseases. Preventive medicine will be a major component of health care and health industries with the development of rational probiotic therapy as a means of maintaining a “healthy” human microbiome. By understanding how the human microbiome differs in health and disease, physicians will be on a much better footing to understand and predict the incidence of chronic inflammatory and infectious diseases, both viral and microbial. Therapeutic interventions (in addition to probiotics) will be based on comprehensive knowledge of the effects of treatment (such as with antibiotics) on the microbiota as a whole.
New antibiotics from currently unknown natural (and generally microbial) sources will have come on line, and new strategies (such as those described below) for forestalling the development and spread of antibiotic resistance will have been devised.

Agriculture

Microbial communities will continue to affect productivity in agriculture, both plant-based and animal-based. Metagenomics studies of gut populations in poultry, pigs, and other food animals will increase our knowledge of gut-microbe interactions, which will help to formulate more effective probiotic mixtures in the future. We expect a comparable impact on plant-based agriculture. The function of the crenarchaeotes and other microbes that colonize plant roots and their importance to carbon and nitrogen cycles will be better understood. We will understand how plants and their beneficial microbial partners deal with antagonistic microbes. Lessons will have been learned from the food crops that have been successfully cultivated over the centuries. Using metagenomic approaches, we will exploit the interplay of microbes and plants more intelligently for human benefit.

Bioenergy

Fossil fuels are a nonrenewable natural resource. It is projected that energy demand will increase by more than 50% by 2025 (US Department of Energy 2005). The US economy depends on oil imports, so there is an interest in augmenting domestic energy production. Corn serves as the major feedstock for ethanol production, and biofuel-producing companies are using specialist microbes to convert cornstarch to ethanol, a high-octane, environmentally friendly biofuel. Cellulosic ethanol, made from such agricultural wastes as corn fiber, corn stalks, and wheat straw and from other biomass such as switchgrass and miscanthus, uses as substrates products that are not usable by humans as food. Furthermore, cellulosic materials are inexpensive and renewable, and their efficient use will reduce the cost of ethanol production. Most of the known ethanol-producing microbes are incapable of using cellulose to produce ethanol, because they lack the enzymes required to break it down. In nature, however, several microbes are equipped with arrays of enzymes that act together to release glucose from cellulose. The glucose can then be fermented to ethanol. Metagenomics will enable discovery of new cellulosic enzymes and novel microbial strategies for hydrolysis of biomass. These discoveries will lead to engineering of enzyme complexes and novel pathways for enzymatic hydrolysis of cellulose and a concomitant increase in production of biofuels from cellulosic materials.

Bioremediation

Metagenomics will shape bioremediation in many interrelated ways.
First, vastly increased understanding of how microbes form “bucket brigades” for the degradation of xenobiotic compounds will allow us to distinguish contaminated sites in which the native microbiota is competent to restore environmental health from sites in which intervention in the form of in situ bioaugmentation or intensive ex situ treatment at special facilities is needed. Second, metagenomics will facilitate sensitive monitoring of remediation activities of either sort. Third, it will identify key microbial processes and keystone species and indicate how community composition could best be complemented. Fourth, it will lead to the isolation of specific strains or consortia that could be used for such complementation. Fifth, a host of novel enzymes that might be useful in cell-free treatments of specific contaminants will be found. And sixth, where appropriate and permitted, the metagenomics database will provide a rich stock of genes for the construction of novel specialized strains for targeted use in bioremediation.

Biotechnology

The biotechnology industry already employs hundreds of microbial enzymes and related products, and the global industrial enzyme market is currently in excess of $2 billion per year, primarily in technical (including scientific, pulp and paper), food, and agriculture and feed applications. The great majority of such enzymes are the result of traditional approaches: enrichment, culture, isolation, and enzyme purification. Collectively, the metagenomics database and the effort, now in full swing, to express, crystallize, and characterize structurally and functionally entire proteomes of many model organisms are likely to enhance the rate of discovery of such valuable catalysts by at least an order of magnitude, a revolution in green chemistry. Ironically, some of the key products of such activities to date have vital applications in the discovery process itself. For instance, the polymerase chain reaction, which is the basis of modern molecular environmental microbiology, DNA forensics, and molecular diagnosis, is based on genes cloned from thermophilic bacteria and archaea.

Biodefense and Microbial Forensics

The same methods that will allow us to assess community composition and activity will enable construction of biosensors for biodefense and microbial forensics. In 2027, the threat of terrorist or criminal use of pathogenic organisms and their toxins against human populations or agricultural (plant and animal) targets may still be of concern. However, society’s ability to anticipate and respond to these threats will be markedly enhanced through the continued application of new technologies that will allow us to assess microbial community composition and activity in various environments.
This will permit precise, rapid, and sensitive monitoring of air, water, and food supplies for potential biothreat agents with novel biosensors. We will be better able to identify the presence of a natural or engineered biothreat agent against a large natural microbial background, and we will be able to predict virulence properties and sensitivity to antiviral or antimicrobial drugs. Another anticipated outcome of research in biodefense will be a strong forensic capability to carry out attribution for acts of bioterrorism that use animal, plant, and foodborne pathogens and toxins. Such capability will provide the law-enforcement, intelligence, agriculture, public-health, and homeland-security communities with information to assist in identifying perpetrators of biocrimes and bioterrorism and to serve as a deterrence factor.