National Academies Press: OpenBook

The Role of Theory in Advancing 21st-Century Biology: Catalyzing Transformative Research (2008)

Chapter: 3 Are There Still New Life Forms to Be Discovered? The Diversity of Life - Why It Exists and Why It's Important

Suggested Citation:"3 Are There Still New Life Forms to Be Discovered? The Diversity of Life - Why It Exists and Why It's Important." National Research Council. 2008. The Role of Theory in Advancing 21st-Century Biology: Catalyzing Transformative Research. Washington, DC: The National Academies Press. doi: 10.17226/12026.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

3 Are There Still New Life Forms to Be Discovered? The Diversity of Life—Why It Exists and Why It’s Important

In an age when people can visit the bottom of the ocean or the inside of a volcano from the comfort of their living rooms, it may seem strange to ask whether there are any new life forms to be discovered. But, in fact, the extent of life’s diversity has not yet been determined. Just 30 years ago, scientists on board the deep-sea submersible Alvin discovered an unexpectedly diverse community of sea life in hydrothermal springs 2.5 kilometers below the surface of the ocean near the Galapagos. Alvin’s crew found giant tubeworms, huge clams, and ghost-like crabs thriving around the hot submarine springs (Van Dover, 2000). This complex ecosystem was fueled not by photosynthesis, which harvests the sun’s energy, but by energy that bacteria derive from the hydrogen sulfide spewing from the vents.

The study of life’s diversity involves more than just going into the world or the laboratory and looking for new things. The places we look, the tools we use, and the experiments we do are influenced by our theoretical and conceptual understanding of the limits of life, the mechanisms of evolution, and the role and significance of diversity. Conversely, new observations and experimental results are constantly forcing us to adjust our theoretical framework. This chapter gives examples of the extent of diversity at several different scales in biology and illustrates the many roles that theory plays in the study of these different kinds of diversity.

The fantastic creatures that populate the ocean’s hydrothermal vents are just one example of discoveries that have triggered an expansion of biology’s theoretical framework. Our views of where life can exist have been regularly revisited; organisms are being discovered in

habitats—from the human stomach to more than a mile underground—where conditions were thought to be too harsh to allow life. New birds, plants, and mammals are still found with some regularity. Entomologists name and describe new insect species at a rate of about 1,500 per year.

The evidence that some genes have been conserved throughout evolution, together with the availability of the polymerase chain reaction to survey those genes, made it possible to begin exploring the diversity of the microscopic world. Suddenly, tiny organisms that appeared under the microscope to have only a few basic and uncomplicated body forms were revealed to be unimaginably diverse—in fact a new domain of life, the Archaea, was discovered to be as different from bacteria as bacteria are from eukaryotes (Woese et al., 1990). The advent of high-throughput sequencing and sophisticated computational analysis has allowed biologists to begin to plumb the diversity of the microbial world, and it appears that life at the microscopic level is vastly more diverse than biologists ever imagined. A recent survey of microbes in the ocean using an approach called metagenomics revealed not only thousands of previously unseen genes but also hundreds of novel protein families. Families of proteins that were already known, like the rhodopsins that absorb light in the human retina, were found to have hundreds of distinct members in the ocean sample (Bejà et al., 2000, 2001). The vast numbers of new genes are not necessarily mere variations on known themes; the potential functional diversity—in other words, proteins and synthetic pathways that carry out currently unknown reactions—to be found in microbial communities is enormous (e.g., Venter et al., 2004; Zhang et al., 2006; Gill et al., 2006).

What is the significance of discovering one more beetle, one more bacterium, or one more protein?
One answer lies in the incredible diversity of functions that evolution has generated. Nature has foreshadowed our technical developments, and functional biodiversity can be a fertile source of ideas for technology. For example, a group of neuroscientists has found a parasitic fly that can locate the sounds of its hosts—field crickets—with unparalleled accuracy. Remarkably, the fly’s ears are tiny and only one-half millimeter apart (Mason et al., 2001). The fly’s ears have inspired the design of directional microphones and a new generation of directional hearing aids.

Another example is a group of brittlestars (relatives of sea stars) that have turned their skeletons into a visual system made up of arrays of microscopic lenses (Aizenberg et al., 2001). The lenses detect light and allow the animals to find dark hiding places on the ocean bottom. Such small lenses are beyond current human engineering capability. However, their precisely curved shape and the way they are arrayed are prompting engineers to create novel optical devices.

Recognizing that nature provides a vast toolbox is only one motivation for studying life’s diversity. The complex interconnected web of living species is critical to human life. Humans depend on the living world in countless ways. The connection between biological diversity and the stability of ecosystems is only imperfectly understood. Clearly, the living world will continue to evolve in response to environmental change, but from the human perspective the time scale of that adaptation is crucial. Understanding the role of biological diversity and how it is generated, maintained, and lost is a critical goal for 21st-century biology.

MAKING SENSE OF LIFE’S DIVERSITY

The Diversity of Species

The effort to identify, describe, and name distinct organisms in a systematic and coherent framework has been underway for hundreds of years. These activities are called taxonomy. Currently, systematists—a name change that reflects a change in the underlying conceptual basis of classifying diversity—study the details of organisms’ characteristics and the interrelationship of characteristics between different organisms (e.g., whether the middle finger of a human corresponds to the middle digit of a bird; Wagner and Gauthier, 1999). Systematists use such comparisons to organize organisms into a classification system that rationally groups similar organisms together. Both the methods by which these activities are carried out and the description of the astonishing diversity of organisms are works in progress. They are essential works, for a system of nomenclature and classification is necessary in order to organize knowledge about the millions of species, known and yet to be described.

Clearly a system of classification requires the underpinning of a robust theoretical framework. The still commonly taught hierarchical Linnaean form of classification (species, genus, family, etc.) was proposed and developed by Carl Linnaeus (1707-1778) a century before The Origin of Species.
While Linnaeus is credited with devising a system for the orderly classification of species, in fact, his own classification schema for plants grouped them strictly according to the number and arrangement of their reproductive parts, leading to groupings, like castor beans with conifers, that now sound illogical. Linnaeus’s binomial naming system has survived, but subsequent taxonomists followed the example of naturalists like John Ray (1628-1707), who had begun to classify organisms on the basis of groups of morphological and physiological characteristics. The prevailing theory underlying the study of diversity at that time was that there existed a fixed number of species and that the job of naturalists was to name and catalog each of them in a logical way. The fastidious work of specimen collection followed by comparative morphology and physiology, while carried out within what is now seen to be a false theoretical framework (that the number of species was fixed and that species did not change over time), nevertheless

provided the body of data that Darwin used to develop the new theory of descent with modification. With the addition of Darwin’s theory of evolution, comparative morphology and physiology became a richer undertaking, and it became possible also to integrate extinct life forms into the tree of life by studying the characteristics of fossils.

The classification systems developed by comparative taxonomists from John Ray forward, indeed, correspond surprisingly well with the genetic data that began to emerge after the identification of DNA as the molecule of heredity. The theoretical relationships between organisms proposed by taxonomists can now often be demonstrated through computational comparison of their genetic sequences, a field known as “phylogenetics.” Indeed, the theoretical hypothesis of descent with modification provided a rich source of potential experiments that could be carried out bioinformatically. The comparison of gene sequences through phylogenetics has confirmed that many of the taxa (hierarchical groups of organisms such as “arthropods” or “insects”) recognized by pregenetic classification schemes correspond to evolutionary lineages.

The theoretical basis of modern systematics rests on grouping species into taxa, or “clades,” that, according to the best interpretation of data, have descended from a common ancestor and thus form one branch of the great tree of life, the phylogeny of all organisms. Classification of organisms into named grouping entities (i.e., taxa) is a nontrivial task, but there has been enormous progress in phylogenetic systematics, owing both to the development of increasingly sophisticated statistical methods and algorithms for inferring phylogeny and to DNA sequencing.
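To make "computational comparison of genetic sequences" concrete, the following is a minimal sketch of one of the simplest possible comparisons: the proportion of differing sites (often called the p-distance) between pairs of aligned sequences. It is offered only as an illustration under stated assumptions; the taxon names and sequences are invented, the sequences are assumed to be already aligned, and real phylogenetic analyses use statistical models of sequence evolution rather than raw counts.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped sites
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differences / compared


# Hypothetical aligned sequences, invented for illustration only.
alignment = {
    "taxon_a": "ACGTACGTAC",
    "taxon_b": "ACGTACGTAT",
    "taxon_c": "ACGAACCTTT",
}

# Pairwise distance table: smaller values suggest closer relatives.
distances = {
    (name_1, name_2): p_distance(alignment[name_1], alignment[name_2])
    for name_1, name_2 in combinations(sorted(alignment), 2)
}

for pair, dist in distances.items():
    print(pair, dist)
```

In this toy data set, taxon_a and taxon_b differ at the fewest sites, so a distance-based method would group them together before joining taxon_c.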
DNA sequences in particular provide data that can be treated quantitatively and are more broadly comparable across the diversity of life than the type of data that predated the molecular revolution (Kim, 2001a). Indeed, the recent availability of genome-scale information and whole genomes enhances our ability to construct phylogenetic relationships by considering multiple related genes, genomic rearrangements, genomic content, or even functional relationships of genomic components (e.g., Boore, 2006; Wolfe and Li, 2003).

Phylogenetic descriptions of diversity are immensely useful, partly because they capture a great deal of information and partly because they give us a guide to the history of organisms and their characteristics. Phylogenies summarize a great deal of history and can be used for tracing the evolution of the traits and molecular characteristics of even extinct organisms (see Box 3-1). Historical trends as revealed by phylogeny can have important applications as well. Just as the knowledge of past trajectory is used to gauge the future landing site of a thrown football, phylogenetic reconstructions can be used to estimate prospective evolution of rapidly evolving organisms such as influenza viruses or antibiotic-resistant bacteria and hence to develop vaccine and treatment strategies (Smith et al., 2004; Koelle et al., 2006).

Footnote: A phylogeny is a tree-like diagram where branches represent evolutionary lineages and leaves represent current organisms.

Box 3-1
What Could Dinosaurs See?

By comparing current DNA sequences, biologists can deduce the sequences of those genes in the ancestors of current species. Chang and colleagues (2002) investigated the characteristics of the visual pigments (rhodopsins) of archosaurs, the ancestors of dinosaurs, birds, and crocodiles. Phylogenetic analyses allowed the comparison of rhodopsin genes of a wide variety of living organisms and generation of the best estimate of what the gene sequence would have been in their distant, common ancestor. Most interestingly, the theoretically deduced gene sequence could be cloned into laboratory bacteria where it was shown to code for a functional protein. The function of the reconstructed protein could then be tested. It was shown to be most sensitive to light of the wavelength of 508 nm—a slightly longer wavelength than that perceived by modern vertebrates—suggesting that archosaurs may have been able to see in dim light. Thus the work both sheds light on the lifestyles of extinct organisms and validates the general approach of theoretical estimation of ancestral gene sequences, followed by direct laboratory study of the reconstructed proteins.

SOURCE: Chang et al. (2002).

There remain both practical and conceptual limitations to using phylogenetic trees to create classifications. The limitations fall into two basic categories. First, the mathematics is extremely complicated. Second, while evolution is driven by general rules of natural selection, there is also an element of chance.
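The idea in Box 3-1 of deducing an ancestral sequence from living descendants can be sketched with Fitch's small-parsimony algorithm applied to a single aligned site. This is not the method of the study described in the box; parsimony is swapped in here only because it fits in a few lines. The tree shape, the taxon labels, and the bases are all hypothetical, chosen for illustration.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name (str) or a pair of subtrees (left, right).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, leaf_states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up pass of Fitch parsimony for one site.

    Returns the set of equally parsimonious states at the root of `tree`
    and the minimum number of changes needed below it.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-taxon tree and single-site observations.
toy_tree: Tree = (("bird", "crocodile"), ("lizard", "mammal"))

clear_site = {"bird": "A", "crocodile": "A", "lizard": "G", "mammal": "A"}
ambiguous_site = {"bird": "A", "crocodile": "G", "lizard": "A", "mammal": "G"}

print(fitch(toy_tree, clear_site))      # root state set and change count
print(fitch(toy_tree, ambiguous_site))  # more than one candidate ancestor
```

For the first site the root is resolved to a single base with one change; for the second, two bases are equally good candidates, which illustrates how a descendant observed today can be consistent with more than one possible ancestor.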
Many possible genotypes may have the same phenotype and fitness, so that the eventual descendant whose sequence is studied today could have many equally possible ancestors.

There are biological, statistical, and computational challenges in phylogenetic reconstruction. First, on the biological side, there are approximately 1.5 million described organisms and vastly more undescribed organisms. It is still a huge challenge to obtain phylogenetically relevant information from such a large collection of organisms—the development of new technologies such as massively parallel sequencing will be critical to solving this problem. Accurate estimates of phylogeny require statistical models of evolution as a base starting point. There are still

considerable problems in constructing biologically reasonable yet computationally approachable statistical models. For example, it is very difficult to resolve the branching order among lineages that diverged either very recently or very long ago. Solutions to these problems will likely require statistical models of molecular processes other than simple single base-pair DNA mutations as employed now.

The major challenge of phylogenetic tree estimation lies in the computational domain. Consider that for only 10 species there are over 34 million possible alternative rooted, bifurcating phylogenetic trees. For 30 species there are roughly 5 × 10^38, and with a little over 50 species the number of possible trees exceeds common estimates of the number of atoms in the observable universe (about 10^80). The goal of phylogenetic estimation algorithms is to select the optimal tree among such impossibly large numbers of possibilities. The magnitude of this computational challenge has led a computer scientist to exclaim “There are enough problems, already formulated or yet to be developed, to keep teams of algorithm designers busy for many years, and just the right combination of real data, credible simulation, and scaling issues to make phylogenetics [italics ours] the ideal testing ground for algorithm engineering” (Moret, 2005). In other words, the problems of phylogenetics are challenging enough to test the mettle of the state-of-the-art approaches of mathematicians, engineers, and computer scientists.

The importance of getting it right, however, is high because the tree of life is our map to life’s history and to the relationships among organisms. The tree of life is used as a guide for research and to find out the origin of traits, including why human bodies are vulnerable to certain kinds of failure.
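The figure of over 34 million trees for 10 species matches the standard count of rooted, bifurcating, leaf-labeled trees, which for n species is the double factorial (2n - 3)!! = 3 × 5 × ... × (2n - 3). The sketch below assumes that counting convention (the text does not state which one it uses) and takes 10^80 as a rough, commonly quoted estimate of the number of atoms in the observable universe. The function names are made up for this example.

```python
def rooted_tree_count(n_species: int) -> int:
    """Number of rooted, bifurcating, leaf-labeled trees: (2n - 3)!!."""
    if n_species < 1:
        raise ValueError("need at least one species")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for factor in range(3, 2 * n_species - 2, 2):
        count *= factor
    return count


def first_species_count_exceeding(threshold: int) -> int:
    """Smallest number of species whose tree count is larger than threshold."""
    n_species = 1
    while rooted_tree_count(n_species) <= threshold:
        n_species += 1
    return n_species


ATOMS_ESTIMATE = 10 ** 80  # assumption: rough order-of-magnitude estimate

print(rooted_tree_count(10))                          # 34459425
print(f"{rooted_tree_count(30):.2e}")                 # order of 10^38
print(first_species_count_exceeding(ATOMS_ESTIMATE))  # species needed to pass 10^80
```

Because each added species multiplies the count by the next odd number, the growth is faster than exponential, which is why exhaustive search over trees is impractical and heuristic or statistical search strategies are used instead.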
The seemingly inexplicable narrowness of our birth canal and the persistence of genes that cause diseases have their origin in our evolutionary history, and why humans live as long as we do can be bet- ter understood when scientists find our position in the tree of life and trace how the working features of organisms have evolved along its branches (Nesse et al., 2006). The Challenge of Microbial Diversity A basic concept underlying phylogeny is that diversity arises from the branching of lineages from a common ancestor rather than from fusion (hybridization) of distinct lineages. The many species of finches on the Gala- pagos arose from a single ancestor species whose descendants specialized on different food sources, not from mixing and matching between an ancestral finch and other specialized birds. Therefore, evolutionary theory suggests that evolution should create genealogical trees rather than networks. This idea captures the broad pattern of evolution and has been immensely useful, yet it can be problematic for some organisms, especially the noneukaryotes. Early results of metagenomics studies (see Box 3-2) demonstrate that the genomes of bacteria and archaea are extremely variable. Organisms that

44 THE ROLE OF THEORY IN ADVANCING 21ST-CENTURY BIOLOGY Box 3-2 Theoretical Questions That Can Be Addressed with Metagenomics One of the most exciting recent developments in microbiology is community genomics or metagenomics. Instead of trying to isolate and study individual micro- bial species, practitioners of this approach characterize DNA from entire mixed mi- crobial communities. The metagenome of a habitat includes the genomes of all the microbes living in that habitat. Thus, in metagenomics, genes and their functions are studied independently of the species from which the DNA is derived. Metage- nomics makes accessible the diversity of the microbial world and has considerable potential to transform biologists’ view of life. A recent report (The New Science of Metagenomics, NRC, 2007) expanded on the conceptual and theoretical ques- tions that may gain new answers in the light of metagenomic research. “Decades of genetic, molecular, and biochemical dissection of microbial life have revealed the detailed structure and inner workings of several bacteria and archaea. Although there is much more to learn even about model organisms, such as E. coli, many individual pathways for nutrient cycling, gene regulation, and reproduction are understood at a satisfying level of precision. But these processes in the majority of microbes remain unknown and knowledge of the evolution and ecology of microbial communities lags far behind cellular microbiology. Basic ideas that organize biologists’ understanding of the living world may need refinement in the face of greater under- standing of community function. What is a genome? The number of genes in the genome of a free-living bacterium ranges from 500 to 10,000 or more; the largest bacterial genomes are more than twice the size of the smallest eukaryotic genomes. In contrast, the genomes of many parasitic or symbiotic microbes are highly reduced, with not nearly enough genes to support them independently of their hosts. 
As more data accumulate, the definition of what constitutes a microbial genome will be better informed and underlying principles governing genomic plasticity in microbes may emerge. . . . If having a more flexible and dynamic genome structure is a fundamental life-strategy difference between bacteria and archaea, on the one hand, and eukaryotes, on the other, what are its advantages and limits? Can understanding the phenomenon help to explain the emergence of multicellular organisms that have more fixed genomes?

What is the role of microbes in maintaining the health of their hosts?

Closely associated microbial communities appear to be a common, if not universal, feature of the physiology of multicellular organisms. These communities contribute to a variety of functions, from digestion to defense against pathogens. All plants and animals, including humans, can be considered superorganisms composed of many species: animal, bacterial, archaeal, and viral. Using the human as an example, the human “metagenome” might be considered an amalgamation of the genes contained in the Homo sapiens genome and in the microbial communities that colonize the body inside and out. The organisms within these communities are collectively known as the human “microbiome.” The metagenome of these communities encodes physiological traits that humans have not had to evolve, including the ability to harvest nutrients and energy from food that would otherwise be lost because humans lack the necessary digestive enzymes. Metagenomics will enable us to address a number of fundamental questions. . . . Is there an identifiable core microbiome shared by all humans? How is each individual’s microbiome selected? What is the role of host genotype? Should differences in each individual’s microbiome be viewed, with the immune and nervous systems, as features of our biology that are profoundly affected by individual environmental exposures? How is the human microbiome evolving (within and between individuals) over different time scales as a function of changing diets, lifestyle, and biosphere? How can this knowledge be used to manipulate microbial communities to optimize their performance in a person or in a population? Most obviously, how does the microbiome affect health, and vice versa? In the future, previously unrecognized microbial involvement with disease states will be uncovered. Many host physiological states with primary genetic or biochemical causation will affect the microbiome in ways that may aid in diagnosis. Of course, these questions do not apply only to humans; study of host-associated microbial communities will contribute to understanding of the physiology of all organisms.

What ecological and evolutionary role do viruses play?

Viruses are important not only as pathogens but as agents of lateral gene transfer and catalysts that generate tremendous genetic variation in their specific hosts. Viral activity also has important consequences for turnover of the elements, for example, in carbon cycling in aquatic systems. It has only recently been recognized that virus particle numbers are enormous, often exceeding those of co-occurring cellular life. For example, seawater contains 10 times more bacteriophage than cellular microbes. Estimates suggest the biosphere harbors perhaps as many as 10^31 viral particles (Edwards and Rohwer, 2005). Given these vast numbers, the influence of viruses on biodiversity and evolutionary catalysis, and their role in biogeochemical cycling, there is considerable interest in characterizing naturally occurring virus populations. Metagenomics has recently provided an important avenue for exploring these ubiquitous and biologically important entities.” SOURCE: NRC (2007).

would be considered the same species on the basis of the similarity of certain highly conserved genes may be found to have only 50 percent of the rest of their genes in common, with many other genes that are not found in every individual. Microbiologists are developing the concept of a “pan-genome” to describe the set of genes that are shared by all members of a microbial species (Tettelin et al., 2005). The great variability of microbial genomes is the result of horizontal gene transfer; bacteria and archaea can exchange genetic material by a number of different mechanisms, even with organisms that are distantly related. The prevalence of horizontal gene transfer means that the phylogenetic relationships of microbes may look

more like networks than trees and all of an organism’s different genes may not have the same phylogenetic relationships. For some microbiologists, the very concept of “species” seems problematic for organisms whose genomes can be so variable, but others maintain that the concept of species will be useful for categorizing noneukaryotic organisms. Until more is known about the extent and pattern of horizontal gene transfer, this conceptual issue will remain open. Horizontal gene transfer is most common in noneukaryotes, but there is evidence of transfer of genes between symbiotic partners (Hoffmeister and Martin, 2003). While such events may be rare and not affect the overall shape of the tree of life, their existence provides evidence of additional sources of genetic variability on which natural selection can act. Defining the role of horizontal gene transfer is only one of several fundamental theoretical issues raised by the study of microbial communities (Box 3-2).

Genetic Diversity Is Itself Diverse

Biological diversity is more than species diversity. The study of biodiversity usually focuses on changes in species numbers in time and space. Life, however, is diverse at all scales. There is diversity in the organization of genomes; in genes and their protein products; in genetic networks and the molecular machines they assemble and regulate; in strategies for defense against pathogens, mobility, and detection and reaction to the environment; and in the morphological, behavioral, and physiological characteristics of individuals within species. At all these levels, there is constant interaction between the theories currently used to describe the extent and consequences of diversity and the relentless flow of new examples of diversity.

Genome Size

The genome of an average mammal has around 3 billion pairs of nucleotides.
This is about a hundred times longer than all the letters in a 20-volume encyclopedia arranged in a line (Avise, 2004). Genome sizes vary from a few thousand base pairs in viruses to 600,000 base pairs in some bacteria to more than 200 billion base pairs in some animals. Genome sizes do not correlate with position on the tree of life: bacterial genomes range from 0.6 Mbp (Mycoplasma genitalium, an intracellular pathogen) to approximately 1 Mbp for many free-living bacteria, to 10 Mbp for the filamentous cyanobacterium Nostoc punctiforme. Invertebrate genome sizes vary by more than three orders of magnitude, from 29 Mbp (the root-knot nematode) to 63 billion base pairs (an amphipod), while vertebrates vary about 400-fold in size (from the 342 Mbp of the green pufferfish to the 129 billion base pairs of the marbled lungfish), as indicated in Figure 3-1. The lack of obvious correlation between genome size and either phylogenetic relationships or organism complexity has stimulated the development of a new area of biological inquiry and experiment.

FIGURE 3-1 Genome size in various organisms. SOURCE: Molecular Biology of the Cell, 2002, by Alberts et al. Reproduced with permission of Li and Sinauer and Garland Science/Taylor & Francis LLC.

The sheer size of the genome can accommodate a lot of variation, and indeed genomes can differ enormously even within a single species. Stephens et al. (2001) have estimated that random pairs of homologous DNA sequences from humans would differ in about 1 out of every 1,000 base pairs, meaning that one human differs from another at an average of 3 million sites. Individual base pairs are not the only place at which genomes can vary; a recent study of 270 individuals found that approximately 12 percent of the genome showed differences in gene copy number from one individual to another (Redon et al., 2006). Repetitive genetic elements and transposable genetic elements (segments of DNA that can move from one spot to another in the genomes of their hosts) may be found in different places in different individuals.
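The arithmetic behind the figures quoted above can be checked in a few lines. The sketch below is only a back-of-the-envelope illustration: it assumes the human genome matches the "around 3 billion" base pairs given for an average mammal, and the function names are invented for this example.

```python
# Figures quoted in the text. The human genome is ASSUMED to match the
# "around 3 billion" base pairs given for an average mammal.
GENOME_BP = 3_000_000_000
BP_PER_DIFFERENCE = 1_000  # about 1 differing site per 1,000 base pairs


def expected_differences(genome_bp, bp_per_difference):
    """Expected number of differing sites between two random genomes."""
    return genome_bp // bp_per_difference


def fold_range(smallest_bp, largest_bp):
    """How many times larger the largest genome is than the smallest."""
    return largest_bp / smallest_bp


sites = expected_differences(GENOME_BP, BP_PER_DIFFERENCE)
invertebrates = fold_range(29e6, 63e9)   # root-knot nematode vs. amphipod
vertebrates = fold_range(342e6, 129e9)   # green pufferfish vs. marbled lungfish

print(f"differing sites: {sites:,}")
print(f"invertebrate range: {invertebrates:,.0f}-fold")
print(f"vertebrate range: {vertebrates:,.0f}-fold")
```

The invertebrate ratio comes out a little above 2,000, consistent with "more than three orders of magnitude," and the vertebrate ratio is a little under 400, consistent with "about 400-fold."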

Neutral Theory

Variation in the genetic code is the raw material of natural selection and thus evolution. However, it is only relatively recently that it has been understood how vast the extent of genetic variation is, how many different forms it can take, and how its magnitude can be estimated. Levels of genetic variation within a population are determined by important natural processes, including mutation, demographic structure and fluctuations, and natural selection. Genetic variation across species is governed by similar factors, albeit at a longer time scale. Thus, understanding the extent and limits of variation is a critical component of a theoretical understanding of evolution.

Prior to the molecular era, the magnitude of genetic variation was controversial. One camp (the “classic” camp) argued that genetic variability was low and that most individuals in a population shared the same form of each gene. The alternative camp (the “balance” camp) maintained that variation was high and that most individuals had different forms of the same gene (Lewontin, 1974). The controversy simmered for years because genetic variation was so difficult to measure. The history of the exploration of genetic diversity is a good example of how scientific progress comes about from the interaction of the development of new technologies, the data generated, and the theory developed to make sense of the data. Only about 40 years ago, in 1966, several laboratories used the newly developed method of gel electrophoresis to separate the proteins produced by a gene. The method suggested that the genomes of humans and fruit flies had a lot more variation than anybody expected. The broad applicability of the initial observations was debated, but the spread of the measurement technologies soon revealed that the larger-than-expected variation was common for many different genes.
Thus, the protein electrophoresis era helped to resolve the theoretical debate about estimates of genetic variability and shifted the debate from the amount of genetic variability to its causes. The “balance” school argued that genetic variation was the outcome of natural selection (in the jargon of population genetics, of superiority of heterozygotes, frequency-dependent selection, and variation in fitness among habitats). In the meantime, the development of methods to sequence proteins produced data that suggested that in vertebrates new amino acid variants become fixed in a typical 100 amino acid protein at the rate of about 1 per 28 million years. Extrapolating to the size of the typical genome, Motoo Kimura in 1968 made calculations to show that such a rate would imply one amino acid variant being replaced in the entire population once every three years. If such a replacement were due to new advantageous variants, then all individuals without the variant must be eliminated from the population, an unsustainable “substitution load” for the population. Thus, he boldly hypothesized that most of the genetic

variation in a population must be neutral variants that are randomly changing in their frequency. Neutral mutations are genotypic changes that do not cause phenotypic changes. Because natural selection is conspicuously absent from the equations, the theory is called “neutral.” Initially, the idea that mutational variants would have a neutral effect on selective dynamics was controversial: a mutation should surely be either advantageous or deleterious. However, current molecular data conclusively confirm that the predominant molecular variation in populations must be neutral. The theory of natural selection is not thereby overturned; it is clear that many DNA sequences have experienced natural selection. The addition of the “neutral” theory of genetic change, however, makes us look at variation in the genome in a new way, because other changes in the genome are the result of neutral processes (Bustamante et al., 2005). The theory of neutral evolution is a canonical example of theory providing an explanatory framework for new data with far-reaching implications for understanding the process of evolution and the functional consequences of molecular changes.

The evidence that some differences between genomes have functional significance and some are neutral naturally led to efforts to find ways to distinguish between the two. Theoreticians have developed methods to detect the telltale molecular evidence of natural selection and thus to quantify the relative importance of selection and neutral processes. Relevant examples have emerged from the fine details of the major histocompatibility complex system (Schaschi et al., 2006), from the self-incompatibility mating system in plants (Charlesworth et al., 2005), and from the evolution of the mechanisms that plants use to resist the attacks of their natural enemies (Rausher, 2001) (see Box 3-3).
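One of the simplest signals such methods build on is whether a difference between two aligned coding sequences changes the encoded amino acid (a nonsynonymous difference) or leaves it unchanged (a synonymous one). The sketch below is a deliberately crude illustration, not any published method: it only tallies raw codon differences under the standard genetic code, whereas real tests estimate per-site rates and correct for multiple substitutions. The two sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_differences(seq_a, seq_b):
    """Count differing codons as (synonymous, nonsynonymous).

    Both inputs must be aligned, gap-free coding sequences of equal
    length, and that length must be a multiple of three.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned whole codons")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid, different codon
        else:
            nonsynonymous += 1   # amino acid changed
    return synonymous, nonsynonymous


# Hypothetical toy sequences, invented purely for illustration.
print(classify_differences("ATGCTTGAAGGT", "ATGCTCGATGGA"))
```

An excess of nonsynonymous over synonymous change, once properly normalized by the number of sites of each kind, is the sort of pattern taken as evidence of selection for new function.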
The sequence of bases is not the only information stored in the genome; chemical modifications of DNA, such as methylation, and the three-dimensional packaging of DNA have important effects on when various genes are expressed. These “epigenetic” mechanisms (mechanisms that affect the expression of genes or inheritance of traits in ways other than changing the sequence of the DNA) are yet another example of common phenomena that have a role in the origin, maintenance, and loss of diversity.

Box 3-3 Plant Resistance to Pathogens

In plants, detection of a pathogen infection initiates a cascade of processes beginning with the death of cells at the site of infection and activation of a systemic protection system that attacks the pathogen. A receptor protein that recognizes the invading pathogen initiates the defense cascade. The specificity of the receptor to the pathogen is one line of evidence for the idea that the receptor is adapted against a pathogen. Several other lines of molecular evidence support this conjecture: (1) The genes for these receptors have undergone large numbers of changes that have led to numerous amino acid substitutions over a short period of time. (2) The rate of base substitutions that lead to amino acid changes (nonsynonymous substitutions) is higher in these genes than the rate of substitutions that do not lead to amino acid changes (synonymous substitutions). (3) The changes in the receptor gene are concentrated in the region that interacts with the pathogen’s molecule that elicits the response. (4) Finally, in some of these resistance genes, the phylogenetic evidence suggests that several forms of the same gene are often maintained in the population for a very long time, presumably as a result of natural selection that favors the maintenance of several different variants (“balancing selection”; Stahl et al., 1999). The availability of extensive sequence data has led theoreticians to develop an arsenal of statistical techniques to deduce the probable action of natural selection on DNA sequences (Ford, 2002).

Gene Duplication

Sometimes a genome contains multiple copies of related genes. These gene families originated by gene duplications. In all three domains of life, a large proportion of all distinct genes were generated by gene duplication (Zhang, 2003; Bowers et al., 2003). Estimates of the percentage of duplicate genes range from 17 percent in some bacteria (Himmelreich et al., 1996) to 65 percent in Arabidopsis thaliana (Arabidopsis Genome Initiative, 2000). Duplicated genes can be grouped in families that share common ancestors and in which the members can have diverse functions, but a common theme emerges: What is the fate of a gene after it duplicates? After a gene duplicates in an individual, its fate is similar to that of a new mutational variant. If the duplication is neutral, it has a tiny probability of being fixed.
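How tiny that probability is can be made concrete under a standard idealization. In the Wright-Fisher model of random genetic drift (an assumption introduced here, not something the text specifies), a single new neutral variant among a fixed number of gene copies eventually becomes fixed with probability equal to its starting frequency: one divided by the number of copies, or 1/(2N) for N diploid individuals. The simulation below is a minimal sketch with an arbitrarily small population and trial count chosen so that it runs quickly.

```python
import random


def neutral_fixation_fraction(copies, trials, seed=0):
    """Fraction of trials in which one new neutral variant reaches fixation."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        count = 1  # a single new variant among `copies` gene copies
        while 0 < count < copies:
            p = count / copies
            # Wright-Fisher resampling: each copy in the next generation
            # descends from a variant copy with probability p.
            count = sum(rng.random() < p for _ in range(copies))
        if count == copies:
            fixed += 1
    return fixed / trials


# Illustrative values only: 20 gene copies (10 diploid individuals).
estimate = neutral_fixation_fraction(copies=20, trials=20000)
print(f"simulated: {estimate:.3f}   expected 1/copies: {1 / 20:.3f}")
```

With realistic population sizes the same rule gives far smaller probabilities, which is why most neutral duplications are lost soon after they arise.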
Sometimes the presence of a duplicate gene can be selectively beneficial because two genes make more RNA and protein. In this case, purifying selection acts to maintain the function of the two copies (Wagner, 2002). Sometimes the duplicated gene is redundant and the accumulation of deleterious mutations in one of the two genes transforms it into a pseudogene (a nonfunctional copy of an active gene). This process seems to be one of the sources of the many pseudogenes in genomes. Harrison et al. (2002) suggest that there is one pseudogene for every two functional genes in the human genome. Selection can favor the retention of two or more functional duplicates if the sequences of the two genes diverge and lead to different functions (see Box 3-4).

Box 3-4 New Function Through Gene Duplication

One of the important outcomes of gene duplication is the origin of novel, albeit often related, function. One nice example is the genes that code for the red- and green-sensitive opsins in humans, which were generated by the duplication of a sex-linked gene in hominoids and Old World monkeys, and which give us trichromatic vision. Howler monkeys, a group of New World monkeys, evolved trichromatism independently through a duplication of the same gene in the X chromosome. The olfactory receptor (OR) genes that form the largest gene family in mammalian genomes are another good example. A high percentage of these genes are “pseudogenes” that have lost their function, presumably as a result of disuse. Interestingly, the frequency of pseudogenes among the OR family members differs greatly among species. While humans, nonhuman primates, and mice have roughly the same number of OR genes (about 1,000), in humans about 60 percent of these are pseudogenes, while nonhuman apes have about 30 percent, and mice have only about 20 percent (Menashe et al., 2003). What are the factors that may cause this large interspecific variation in the proportion of pseudogenes in the OR family?

Gilad et al. (2004) randomly sequenced 100 distinct OR genes from each of 18 primate species: four apes, six Old World monkeys, seven New World monkeys, and one prosimian. They found that Old World monkeys had roughly the same percentage of OR pseudogenes as nonhuman apes (about 30 percent) but a much higher percentage than New World monkeys (about 17 percent), except for howler monkeys. The percentage of OR pseudogenes in the howler monkey was about 30 percent, much closer to that seen in the Old World monkeys and apes than in its New World relatives. The higher frequency of pseudogenes in the OR family must have evolved independently in howler monkeys and Old World monkeys. Recall that howler monkeys share trichromatic color vision with apes and Old World monkeys. The evolution of trichromatism seems to have coincided with the deterioration of the sense of smell. This leaves the question of why humans have such high frequencies of OR pseudogenes. Gilad et al. (2003) speculated that cooking food reduces the need to identify odorous toxins in food, which may be denatured by heating. Paradoxically, cooking, which we associate with delicious aromas, may have diminished our capacity to smell diverse odors.

RNase1, for example, has a double function: It is secreted by the pancreas into the intestinal lumen where it digests RNA, and it is expressed in many tissues where it defends against viral infection. Colobine monkeys have two copies of the RNase gene (RNase1 and RNase1B), one of which retains the presumably ancestral function (RNase1) and another that helps the monkeys digest bacterial RNA (RNase1B). Unlike other primates, colobine monkeys are foregut fermenters and must digest large amounts of RNA from the rapidly growing fermenting bacteria in their guts (Zhang, 2003). Zhang et al. (2002) found that since duplication, RNase1B had much higher rates of nucleotide substitutions at nonsynonymous sites

(sites where a DNA base change results in the incorporation of a different amino acid in the protein) than at synonymous and noncoding sites, providing evidence of selection for a function that complements that of the ancestral gene.

Diversity in Functional Noncoding Sequences

The “central dogma,” which states that the role of DNA is to code for RNA, which in turn codes for protein, focused scientists’ attention on documenting the variation in protein-encoding genes for much of the past 50 years. The prevailing theory suggested that understanding such variation would explain much of life’s diversity. This focus prevailed even when biologists studying complex multicellular organisms, such as mammals and plants, knew that only a small percentage of their genomes actually encoded proteins. The rest of the genome was often referred to as “junk” DNA, which was thought to be made up of mostly remnants of transposable elements, DNA that selfishly existed only to replicate (with little impact on genome function), or pseudogenes. As more data accumulate, it is becoming clear that at least some of this “junk” DNA does contribute to functional diversity and thus could contribute to variation upon which selection can act. The diversity found in the portion of the genome previously considered more or less inert is vast when one compares sequences between closely related species; considerable diversity is sometimes found even within a species. How much of this diversity contributes to function is still unknown, but results deriving from comparative genomics and high-throughput methods to examine genome-wide expression patterns combined with functional genetic analyses in fungi, plants, and animals challenge our previous conceptions and suggest much remains to be learned about how genome diversity dictates functional diversity.
The discussion below is not meant to be comprehensive but serves to illustrate that while the “central dogma” is broadly correct for protein-coding genes, it is apparent that our theoretical framework explaining how genomes function requires expansion.

A large portion of many eukaryotic genomes is made up of repetitive sequences, existing in tens, hundreds, thousands, or millions of copies within a genome (Morgante, 2006; Jurka et al., 2007). It is this repetitive portion of genomes that is usually not conserved at the nucleotide level between even closely related species, although organisms as different as plants and animals do share the same classes of sequences. Some of these sequences are simple tandem repeats (Armour, 2006), stretches of DNA where the same short sequence is repeated hundreds or thousands of times. The number of repeats can vary so much between individuals that these sequences are excellent markers for genetic and forensic studies (Armour, 2006). While the function of most of these sequences is unknown, there are a number of diseases associated with variations in triplet repeat lengths (Mirkin, 2006).

Other repetitive sequences are derived from reverse transcription of RNA molecules into DNA, with subsequent integration into the genome. There are several different examples of such sequences, called transposable elements, which can move around the genome. Much of the repetitive DNA in the genome seems to consist of defective copies of these transposable elements that have suffered mutations, so they can no longer transpose. Some of these sequences have expanded tremendously with a single type of element contributing millions of copies to a genome. Functional transposable elements (those that encode the proteins necessary for transcription, reverse transcription, and integration and thus can still move) are found in much lower numbers (Ding et al., 2006). These active elements can not only move themselves, they can move related defective elements and reverse transcribe mRNA or structural RNAs to generate pseudogenes. Still other classes of transposable elements transpose via DNA replication mechanisms (Morgante, 2006; Jurka et al., 2007). Integration of any of these types of sequences can affect the expression of adjacent genes through the regulatory sequences they contain or disruption of regulatory sequences at the insertion site. Thus, diversity in where these sequences are located within an individual’s genome can have consequences for gene and genome function. In some species, such as humans, the insertion of many of these defective transposable element sequences was ancient. However, there are subsets of elements that have moved more recently and insertion sites for these differ from person to person.
In a species such as maize, many more transposons are currently active relative to what has been described in mammals, which likely contributes to the amazing diversity between inbred lines in terms of numbers and organization of genes and gene fragments (Morgante, 2006).

It is commonly thought that gene fragments are the ultimate in junk DNA; it is hard to imagine a function for a fragment of a gene inserted into a noncoding region between genes. However, the observation that many of these sequences are transcribed, sometimes on both strands, combined with the discovery of a number of RNA-mediated gene-silencing mechanisms involving double-stranded RNA, raises the possibility that in some instances these gene fragments contribute to diversity of gene expression patterns by targeting functional genes containing the same sequence. RNA interference, RNAi, is an evolutionarily conserved mechanism in fungi, plants, and animals that generates short 21-23 nucleotide RNAs (siRNA) from double-stranded RNA, which then target corresponding mRNAs for cleavage (Rana, 2007). MicroRNAs (miRNAs) are a class of short RNAs that are encoded in the intergenic nonprotein-encoding regions of animal

and plant genomes. miRNAs are produced through processing of imperfect RNA hairpins and, depending on their degree of complementarity, elicit either translational control or mRNA cleavage, resulting in gene silencing that is essential for animal and plant development (Zhang et al., 2007). In addition to post-transcriptional gene silencing, there are RNAi-related pathways that regulate gene expression by modifying DNA methylation or how DNA is packaged with the result that the functional gene sharing the RNAi sequence is silenced (Matzke and Birchler, 2005; Chan et al., 2005; Grewal and Elgin, 2007). The repetitive elements discussed previously are major targets of these RNAi-related transcriptional silencing pathways. When the pathways are disrupted, there can be significant consequences to the organism; other genes can also be regulated via this mechanism (Zaratiegui et al., 2007). That there appear to be so many evolutionarily conserved and regulated sequences and regulatory pathways outside the traditional genes is a relatively new observation, ripe for theoretical input.

In addition to the types of sequences discussed above, comparisons of genomes across diverse species from vertebrates, invertebrates, plants, and yeast have identified a large fraction of conserved nonprotein- and non-RNA-encoding sequences under selective constraints (e.g., Waterston et al., 2002; Kaplinsky et al., 2002; Siepel et al., 2005). While some of these sequences are likely to be regulatory, transcription factor binding sites do not necessarily show high sequence conservation even though a fraction can be functionally conserved (Dermitzakis and Clark, 2002; Fisher et al., 2006). From studies done to date, it is clear that noncoding DNA sequences can have a significant effect on phenotype and are subject to natural selection (reviewed in Bird et al., 2006).
However, the functions of most conserved noncoding DNA sequences are unknown, let alone the functions of nonconserved noncoding DNA sequences; it is possible that species- or genera-specific sequences may serve a much wider range of roles than currently imagined.

In summary, the relatively recent recognition of new RNA pathways for controlling gene regulation, as well as the extensive transcription of the human (ENCODE Project Consortium, 2007) and plant genomes (Stolc et al., 2005; Hanada et al., 2007) that results in the majority of DNA sequences being represented by transcripts, combined with the lack of understanding of evolutionary constraints on noncoding DNA, suggest much remains to be learned. A focus on only the variation in the protein-encoding portion of the genome is unlikely to lead to full understanding of life’s diversity or the mechanisms and evolution of genome function. New computational methods and new theory will be required to fully understand the function of the vast majority of genomes, the noncoding DNA.

Diversity of Molecular Function

The previous section discussed the many ways that diversity can be generated at the level of genes and genomes. Over the billions of years of evolution, this variation has produced vast numbers of genes that encode functional proteins. Determining the function of a protein in one organism can be useful for predicting its homolog’s function in another organism. However, even for organisms that are very well studied, like yeast or humans, the functions of all gene products are not yet known. Determining the function of each gene product experimentally is not only inefficient but can also be misleading as the activity of a protein may differ according to context. Therefore, improving our ability to predict computationally the function of gene products, or to understand the functional consequences of mutation, is an important challenge.

Since the mid-1990s, the increasing availability of genomic sequences and molecular diversity data has stimulated interest in the fields of bioinformatics and computational biology. The recent discovery of great molecular diversity in functional genomic elements other than protein-coding sequences (e.g., ENCODE, 2007) as described above suggests even greater theoretical challenges in this area. Accurate computational prediction of molecular function from sequence information and the use of comparative diversity data in genomic annotations remain a great challenge. Genomic sequence information includes both coding and noncoding functional sequences. Function prediction from this information includes everything from prediction of molecular structure from protein sequences and RNA sequences to organization of these structures into functionally predictive frameworks. What ultimately is required is a collection of models that allow us to construct a “map” from sequence to molecular function to organismal function.
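A toy version of the first step toward such a map is to reduce each sequence to a vector of features and to compare vectors with a distance measure. The sketch below makes two illustrative assumptions that are not taken from the cited work: the features are relative frequencies of overlapping k-mers, and the distance is Euclidean. The sequences and their labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=2):
    """Map a sequence to relative frequencies of its overlapping k-mers."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    kmers = set(p) | set(q)
    return sqrt(sum((p.get(m, 0.0) - q.get(m, 0.0)) ** 2 for m in kmers))


# Hypothetical toy sequences, invented purely for illustration.
annotated = {"repeat_like": "ATATATATATAT", "gc_rich": "GCGCGGCCGCGC"}
query = "ATATATTATATA"

profiles = {name: kmer_profile(seq) for name, seq in annotated.items()}
query_profile = kmer_profile(query)
nearest = min(profiles, key=lambda name: profile_distance(profiles[name], query_profile))
print(nearest)
```

Assigning a query the label of its nearest annotated neighbor is the statistical, example-driven style of prediction; it says nothing about function derived from first principles.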
Conceptually, this requires first a construction of a “sequence feature space,” that is, a distillation of sequence features relevant to function prediction and a relational metric (distance measure) using those features (Kim, 2001b). At present, most standard approaches involve statistical characterization of known examples; the expanding information on molecular diversity greatly helps these approaches. However, the ultimate goal, especially when presented with entirely novel sequences from, for example, metagenomics projects (see Box 3-2) where even the organism of origin is unknown, is the derivation from first principles of a functional theory of biomolecular sequences. At present, determining protein function from gene sequence is hard. It is complicated by the fact that proteins are part of complex machines, and many years of work may be required to determine the full set of interactions and functions of any protein. However, once it has been done, scientists can benefit from the ability to extrapolate across the phylogenetic tree to other organisms. It is clear that a systematic

computational/theoretical framework for the prediction of function would provide a critical boost in efficiency compared to empirically driven, eclectic approaches.

Diversity of Social and Behavioral Systems

As if life were not diverse enough at the molecular, genomic, species, functional, and community levels, organisms also have wildly diverse behavioral and social interactions. Even a brief survey of the range of diversity at this level would be difficult, so this section discusses one particular topic that crosses genetic, evolutionary, behavioral, and social boundaries: the area of sex, gender, and sexuality. This particular area is controversial and often even politically charged, but incontrovertibly reproduction is an essential characteristic of all living organisms. The debate over whether the accepted theoretical framework regarding the role of sexual selection in evolution, initially outlined by Darwin and subsequently built on for over a century, can accommodate new data and perspectives, serves as an example of the integral and often unacknowledged role of theory in biological research.

Some biologists have drawn attention to many examples of expressions of sex, gender, and sexuality throughout the animal kingdom that are unanticipated by and challenging to the prevailing theoretical framework. Within evolutionary biology, the conceptual treatment of sex roles originated with Darwin’s theory of sexual selection. Darwin introduced this theory because of traits like the peacock’s tail that are termed ornaments and that are not readily understood as adaptations for survival. Instead, Darwin hypothesized that such traits find their evolutionary value in how they promote mating.
The process that causes traits to evolve because of how they contribute to mating is called “sexual selection,” which Darwin contrasted with “natural selection,” the process causing traits to evolve that promote survival. When Darwin proposed his theory of sexual selection, he took the peacock and peahen, and the stag and doe, as emblematic of males and females generally. He asserted generalizations like, “Males of almost all animals have stronger passions than females” and “the female . . . with the rarest of exceptions is less eager than the male . . . she is coy” (Darwin, 1871). Darwin amassed examples to support these claims of universality. Sexual selection thus enunciates a norm of natural sexual conduct. Species that depart from the sexual selection templates of passionate male and coy female are then seen as “exceptions” meriting special discussion to account for their deviant behavior.

However, there are many species in which males and females are virtually indistinguishable, as with the guinea pigs many people raise as pets,

or birds like penguins, where sexes can only be distinguished by careful inspection of the genitals. In other species, males are not passionate, nor females coy, and the females consistently pursue the males. Female alpine accentors from the central Pyrénées of France, for example, solicit males for mating every 8.5 minutes during the breeding season. Ninety-three percent of all solicitations are initiated by the female approaching the male, with the other 7 percent by him approaching her (Davies et al., 1996). This frequent sexual contact greatly exceeds that needed specifically to fertilize the relatively few eggs that are reared.

Or what can be concluded from the seahorse and pipefish, in which the male is drab and the female ornamented, and in which the male raises the young in a pouch into which the female deposits eggs? Such species exhibit what biologists call “sex role reversal.” The females are said to compete for access to males, with the males choosing females for their ornaments, resulting in showy females and drab males, the reverse of the peacock. Such a situation contradicts the traditional assumption that the cheapness of sperm invites passionate male promiscuity and the expensiveness of eggs necessitates female coyness during their careful choice of good gene-bearing males. But male seahorses make tiny sperm just as male peacocks do, and female seahorses make large eggs just as peahens do; nonetheless, male seahorses care for the young and female seahorses entrust their eggs to a male’s pouch.

In many species, multiple types of males and females, each with distinct identifying characteristics, carry out special roles at the nest both before and after mating takes place. In the sandpiper-like European ruff, black-collared males build nests in small defended territories called courts within a communal display area called a lek.
Meanwhile, white-collared males accompany females while the females feed. The white-collared males then leave the company of the females and fly to the lek where they are solicited by the black-collared males to join them in their courts. When the females eventually arrive at the lek to lay eggs, they are romanced in some courts by pairs of males (one black-collared male paired with one white-collared male) and in other courts by single black-collared males. Evidently, females prefer to lay eggs in nests hosted by a pair of black-collared and white-collared males at which both males serve as parents, rather than in nests hosted solely by one black-collared male, perhaps because the white-collared male has formed a bond with the females while he was accompanying them during their feeding. Perhaps white-collared males serve as “brokers” who introduce females to the black-collared males, who have not previously had the opportunity to meet females while they were busy setting up and defending courts in the lekking area. There are, in fact, many examples of family organizations consisting of trios such as the ruffs, or of species with reproductive social groups that consist of more

than one male and one female tending offspring together after mating takes place, or even participating jointly in courtship before mating takes place.

Same-sex sexuality is also evident in many species. In more than 300 species of vertebrates, same-sex sexuality has been documented in the primary peer-reviewed scientific literature as a natural component of the social system (Bagemihl, 1999). Examples include species of reptiles like lizards, birds like the pukeko of New Zealand and the European oystercatcher, and mammals like giraffes, elephants, dolphins, whales, sheep, monkeys, and one of our closest relatives, the bonobo chimpanzee.

For some biologists, this cornucopia of diversity in gender expression and sexuality severely strains Darwin’s sexual selection theory. At the same time, the last 50 years have witnessed a great expansion of Darwin’s sexual selection narrative that was originally focused rather narrowly on secondary sexual characters like peacock tails and deer antlers. Many, perhaps even most, evolutionary biologists do not feel that the accumulation of counterexamples and exceptions has risen to the level of requiring a major overhaul of sexual selection theory. Others argue that, just as the fossil record undermined the theory that each species was individually created and unchanging, these “exceptions” cannot be reconciled with current theory. It is not the role of this report to resolve that controversy but merely to use it as an example of the more universal process whereby observation, experimentation, and the building and testing of models and hypotheses are intimately affected by one’s initial theoretical viewpoint and the evolution of that theoretical viewpoint in response to ongoing research.
Diversity in Context

Diversity at the molecular, functional, and organismal levels is multiplied at the environmental level, where groups of species co-inhabit countless overlapping ecosystems. This is the context in which evolution plays out, where all the different kinds of variation at the genetic level provide, or fail to provide, a selective advantage and where external changes in an environment eventually lead to the adaptation, migration, or extinction of local species. The field of ecology has a long history of theoretical approaches to the understanding and prediction of what governs species diversity in different environments, the role of species diversity in ecosystem stability, and the impact of environmental change.

What Governs the Assembly of Communities?

What is it that determines how many and which species will form an ecosystem? How much of the resulting community is due to chance, to history, or to underlying principles of energy and resource availability? The

greater our ability to identify underlying governing principles, the better the predictions of the effects of change.

According to the competitive exclusion principle, two or more species that are identical in their use of a limiting resource (such as space or food) cannot coexist indefinitely, and only one of the populations will survive competition; if one is competitively superior, exclusion of the others proceeds all the more quickly. Many mathematically formulated hypotheses have been proposed, and tested to various extents, to explain assemblages or communities of coexisting species. The simplest is “niche partitioning,” whereby competing species do not fully overlap in resource use, each having a “refuge” resource of which it is the sole or competitively superior consumer. Any textbook of ecology describes examples that conform to this prediction. Such patterns are ascribable both to evolutionary responses of species to each other and to purely ecological processes of assembly, wherein members of a species pool colonize a location and either form a stable population or not, depending on whether or not they “fit.”

Resource partitioning among species is not always evident, especially among organisms such as plankton and terrestrial plants. Among the major factors proposed to maintain diversity are predation and disturbance. A panoply of specialized predators (or parasites), each specific to a different prey species, may hold each prey species at a low enough density to enable other species to persist. For instance, specialized consumers of seeds or seedlings may contribute to maintenance of tree species diversity in forests (Janzen, 1970; Connell, 1971). More generalized predators may likewise maintain diversity by preventing competitively dominant prey species from excluding others, although prey species that are less able to escape predation may be eliminated.
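As a rough illustration of the competitive exclusion principle and niche partitioning described above, the following minimal sketch integrates the classic two-species Lotka-Volterra competition equations with a simple Euler step. The choice of this particular model, the function name `simulate_competition`, and all parameter values (carrying capacities, overlap coefficients, growth rate, step size) are assumptions made for illustration and are not taken from the report.

```python
def simulate_competition(k1, k2, a12, a21, r=1.0, n0=10.0, steps=20000, dt=0.01):
    """Euler integration of two-species Lotka-Volterra competition.

    k1, k2   : carrying capacities of species 1 and 2 (hypothetical values)
    a12, a21 : overlap coefficients; 1.0 means identical use of the limiting
               resource, values below 1.0 mean partial overlap
    Returns the final densities (n1, n2).
    """
    n1, n2 = n0, n0
    for _ in range(steps):
        # Each species is limited by its own density plus the competitor's,
        # weighted by how strongly their resource use overlaps.
        d1 = r * n1 * (1.0 - (n1 + a12 * n2) / k1)
        d2 = r * n2 * (1.0 - (n2 + a21 * n1) / k2)
        n1 += d1 * dt
        n2 += d2 * dt
    return n1, n2


if __name__ == "__main__":
    # Complete overlap: the species with the higher carrying capacity wins.
    print(simulate_competition(k1=100.0, k2=90.0, a12=1.0, a21=1.0))
    # Partial overlap (niche partitioning): both species persist.
    print(simulate_competition(k1=100.0, k2=90.0, a12=0.5, a21=0.5))
```

With complete overlap the second species declines toward zero while the first settles at its carrying capacity; with partial overlap both settle at positive densities. This is only one of many possible formalizations, and it ignores predation, disturbance, and chance.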
Likewise, physical disturbances may open sites for colonization, and species capable of high dispersal (or which lie in wait, as do buried seeds) may persist if they can reproduce before they are excluded by dominant competitors. Such “fugitive” species often characterize early stages in ecological succession. This idea underlies a number of models of patch dynamics, including lottery models in which ecologically equivalent species persist almost indefinitely if enough gaps open at random in a sufficiently large landscape.

Lottery models mark a shift in ecological thinking from equilibrium to nonequilibrium models, the most renowned of which may be MacArthur and Wilson’s (1967) model of island biogeography, in which the number of ecologically equivalent species on an island is set by rates of distance-dependent colonization and area-dependent extinction. This is the simplest explanation for the dependence of diversity on area, one of the most abundantly documented of ecological patterns, and postulates that the diversity in a local area (e.g., an island) is not determined solely by local interactions but also by the species diversity and dynamics of a larger region that

feeds local diversity by immigration. Ecologists have increasingly accepted that this principle holds for local assemblages in continental sites as well, so landscape-level processes and regional species diversity strongly affect diversity and dynamics at a local level (Ricklefs and Schluter, 1993).

MacArthur and Wilson’s model was extended, moreover, to continental biotas and to evolutionary time by Rosenzweig (1975), who modeled species diversity as a consequence of rates of speciation and extinction. Hubbell (2001) has developed this approach to its fullest extent in his “neutral theory of biodiversity,” in which the population genetic theory of genetic drift is applied to ecologically equivalent species. Although Hubbell does not deny that species often partition resources and are differentially resistant to predation and disease, his model shows that these processes may not need to be invoked to explain the patterns of diversity in many communities, such as abundance distributions of tropical forest trees.

Why Are Some Communities More Diverse Than Others?

Community ecologists have long felt that a theory of species diversity in communities should be able to explain variation in the number of coexisting species among assemblages in different environments and different parts of the world. The challenge may be epitomized by the latitudinal gradient in species diversity: In most higher taxa of plants and animals, diversity is highest in tropical regions and declines toward both poles. On land, diversity declines from warm, wet environments (such as those that harbor tropical wet forest) toward colder high altitudes and toward more arid regions.
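The balance between colonization and extinction in island biogeography, summarized earlier under “What Governs the Assembly of Communities?”, can be sketched in a few lines. The version below assumes the common textbook simplification of linear rates (colonization proportional to the number of source-pool species not yet present, extinction proportional to the number present). The function names, the rate constants, and the way distance and area are folded into those constants are hypothetical choices for illustration, not MacArthur and Wilson’s original parameterization.

```python
def equilibrium_richness(pool_size, colonization_rate, extinction_rate):
    """Species number at which colonization and extinction balance.

    Assumed linear rates (an illustrative simplification):
        colonization = colonization_rate * (pool_size - s)
        extinction   = extinction_rate * s
    Setting them equal gives s* = c * P / (c + e).
    """
    return colonization_rate * pool_size / (colonization_rate + extinction_rate)


def simulate_richness(pool_size, colonization_rate, extinction_rate,
                      s0=0.0, steps=5000, dt=0.1):
    """Euler integration of ds/dt = colonization - extinction from s0."""
    s = s0
    for _ in range(steps):
        gain = colonization_rate * (pool_size - s)
        loss = extinction_rate * s
        s += (gain - loss) * dt
    return s


if __name__ == "__main__":
    pool = 200  # hypothetical number of species in the regional source pool
    # Hypothetical rate constants: a near island gets a higher colonization
    # rate, a large island gets a lower per-species extinction rate.
    near_large = equilibrium_richness(pool, colonization_rate=0.05, extinction_rate=0.02)
    far_small = equilibrium_richness(pool, colonization_rate=0.01, extinction_rate=0.08)
    print(near_large, far_small)
```

Under these assumptions a near, large island holds more species than a far, small one, and the simulated richness converges on the same value regardless of how many species the island starts with, which is the sense in which regional dynamics set local diversity.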
Traditional theory assumed both ecological and evolutionary equilibrium: It would not do to say that cold regions have fewer species because they pose special adaptive challenges, since that simply shifts the question to why cold-adapted clades should not have diversified as much as warm-adapted clades have. As many as 100 hypotheses for these patterns have been distinguished (Willig et al., 2003). Many ecological explanations suggested either that plant communities in warm, wet climates have higher productivity, and that this would support more species, or that tropical regions experience less variable climate, so that more specialized species could evolve and coexist by finely dividing resources among them. However, tropical regions are not more climatically stable (they are often more variable in rainfall than temperate regions), and there is little or no evidence that tropical species are more specialized; for example, herbivorous insects in tropical wet forest appear to be no more host specific than in temperate-zone forests (Novotny et al., 2006). The primary productivity of tropical wet forests may actually be lower than that of high-latitude forests (Huston, 1994), and although high productivity might support higher population

densities of animal species and therefore reduce their extinction rate, it is hard to see how it would sustain higher plant diversity. In fact, whether species diversity of plants increases monotonically with productivity or peaks at intermediate productivity is a subject of some controversy (Huston, 1994; Gillman and Wright, 2006).

In contrast, nonequilibrium explanations of the latitudinal diversity gradient, advanced in various forms for decades (e.g., Fischer, 1960), are gaining favor. One class of hypotheses holds that speciation rates are higher in tropical regions. The fossil record of bivalves (Jablonski et al., 2006) and of foraminifera and other planktonic organisms (Buzas et al., 2002; Allen and Gillooly, 2006) supports this hypothesis; in fact, bivalve taxa have originated mostly in the tropics and expanded toward the poles. Why, then, should speciation rates have a latitudinal bias? One possibility is that terrestrial tropical species, living in more constant temperatures, are physiologically intolerant of very different temperatures and are less capable of surviving the temperature stress they would experience in dispersing over mountain ranges (Janzen, 1967). Few data bear on this hypothesis, but those few largely support it (Ghalambor et al., 2006). It has also been suggested that high temperature increases rates of mutation and that this heightens evolutionary rates in general and speciation rates in particular (Allen et al., 2006; Gillman and Wright, 2006). A reported correlation between rates of molecular evolution and speciation (Webster et al., 2003) may support this hypothesis (which parts from the traditional supposition of population geneticists that genetic variation is so plentiful that phenotypic evolution is seldom limited by the rate of origin of adaptive mutations).
A more deeply historical view, rapidly gaining adherents, is that the tropics have more species because most clades originated in tropical environments and have remained mostly restricted to them because of the several factors that cause “niche conservatism” (Brown and Lomolino, 1998; Ricklefs, 2004). Until about 30 million years ago, tropical climates embraced a far greater area than they do now; in fact, the diversity of tree species in tropical, temperate, and boreal biomes is correlated with the area typified by those climates during the geological time (Eocene to Miocene) when most clades evolved (Fine and Ree, 2006). This “tropical conservatism hypothesis” (Wiens and Donoghue, 2004) builds on the strong correlation between species richness and geographic area and articulates in modern terms the older hypothesis that there has been more time for diversification in tropical regions (Stebbins, 1974). Plant genera that are distributed across continents have highly correlated latitudinal distributions (Ricklefs and Latham, 1992), exemplifying the long-sustained niche conservatism that is central to this hypothesis. A phylogenetic analysis showed that hylid frogs originated in the tropics, spread only recently into temperate regions, and display a strong correlation between the species richness of a region and

when that region was colonized (Wiens et al., 2006). An almost inescapable conclusion is that patterns of species diversity can be understood best by taking into account evolutionary processes over very long periods of geological time.

There is, perhaps, a profound lesson in this brief summary of efforts to develop and test general theories explaining patterns of species diversity. Many models and computational approaches have been brought to bear on understanding the complex relationships linking a community of species to one another and their physical environment. It now appears that at least part, perhaps a large part, of the explanation lies in history. The increasing availability of genomic sequences and refinement of phylogenetic theory will contribute to the validation of this theory, but if the role of historical chance is significant, there are both practical and philosophical implications. If biodiversity depends on evolutionary processes acting on the available genetic reservoir over geological time scales, the loss of species due to rapid, human-caused environmental change has profound consequences for the stock of genetic possibilities for the future. Philosophically, if biodiversity is largely the consequence of natural selection acting on random genetic events in specific communities and environments over very long time periods, the search for underlying, quantifiable, predictable order in the origin, maintenance, and loss of species is made vastly more difficult.

Loss of Diversity

A population or species becomes extinct when its last member dies. Most ecological analyses of extinction follow either a “small population” paradigm or a “declining population” paradigm (Caughley, 1994). The former focuses on risks of extinction faced by small populations even in favorable environments, owing to stochastic fluctuations (Lande et al., 2003).
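The small-population paradigm can be illustrated with a minimal simulation of chance extinction. The sketch below assumes a deliberately simple branching process in which each individual leaves either two offspring or none; the offspring rule, the function names, the population cap, and all numerical settings are hypothetical choices for illustration and are not drawn from the cited analyses.

```python
import random


def goes_extinct(n0, p_two=0.55, cap=500, max_generations=500, rng=None):
    """Follow one population; return True if it dies out.

    Each individual independently leaves two offspring with probability
    p_two and none otherwise, so the mean is 2 * p_two (1.1 by default,
    i.e. a favorable environment). Reaching `cap` individuals is treated
    as having escaped the danger of chance extinction.
    """
    rng = rng or random.Random()
    n = n0
    for _ in range(max_generations):
        if n == 0:
            return True
        if n >= cap:
            return False
        # Draw the next generation one parent at a time.
        n = sum(2 for _ in range(n) if rng.random() < p_two)
    return n == 0


def extinction_fraction(n0, trials=1000, seed=1):
    """Fraction of simulated populations of initial size n0 that die out."""
    rng = random.Random(seed)
    return sum(goes_extinct(n0, rng=rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for founders in (5, 50):
        print(founders, extinction_fraction(founders))
```

With these hypothetical settings, branching-process theory gives a per-lineage extinction probability of 9/11, so a founding population of 5 should die out in roughly (9/11)^5, or about 0.37, of runs, whereas one of 50 almost never does, even though the expected growth rate per individual is identical.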
In addition, some local populations (“sink” populations) cannot maintain a positive rate of increase without immigration from other populations and dwindle if immigration is curtailed. In the declining population paradigm, populations are driven to low numbers by deterministic forces, including abiotic environmental changes (in climate, for example), changes in landscape (especially habitat loss), and changes in the biotic environment. Most extinctions of entire species probably are attributable to these kinds of causes.

Even aside from “mass extinction” events such as the K/T extinction (in which the dinosaurs perished) that has been attributed to a bolide impact, “background” extinctions have occurred throughout evolutionary history and have befallen far more than 99 percent of the species that have ever existed. Clearly a species is a transient thing in this statistical sense. Remarkably little is known about the causes of these extinctions, although certain

species characteristics, such as broad geographic range, ecological breadth, and high dispersal capability, tend to be correlated with longer persistence times (Jablonski, 1995). Still, the ecological factors that cause extinction, and the organism-level or species-level traits that determine survival versus extinction, are little known. Even the factors that limit geographic ranges along environmental gradients, where local populations cannot persist, are understood for very few species (Parmesan et al., 2005). Some of the most immediate current threats to populations and species, however, are anthropogenic and are fairly obvious: overexploitation (especially of large vertebrates and marine resource species) and habitat destruction. Much of conservation biology focuses on understanding how species can be saved in the face of these threats. Models of population dynamics and of dispersal among subpopulations in increasingly patchy landscapes are important tools in conservation.

Extinct species are those that have not adapted to whatever environmental changes befell them. The population genetic theory of microevolution should, ideally, enable us to predict population survival versus extinction, but doing so will require both significant theoretical advances and far more information than is currently available. The first question is whether or not the environmental change is one that would be expected to trigger an adaptive response. This can occur only if there is a change in the rank order of the fitness of different genotypes. Some changes, however, reduce population size without altering relative fitness. If a critical resource such as food or habitat dwindles, individuals may experience the same resource environment as when it is abundant, so there may be no change in relative fitness. Williams (1966) described such species as “running out of niche” but remaining well adapted to that niche to the bitter end.
We need a better understanding of what environmental changes do not alter the regime of natural selection.

When an environmental change does engender selection for adaptive change, there begins a race between a demographic process of declining population size and the evolutionary process of adaptation (Holt and Gomulkiewicz, 2004). The simplest models of adaptation to changing environments envisioned selection on a single quantitative character such as body size, in which the population mean can track a moving optimum, although lagging behind it, and the population can maintain positive population growth if the genetic variance of the character is high enough (Lynch and Lande, 1993). Since directional selection will exhaust initial genetic variation, long-continued evolution will then depend on a sufficiently high rate of mutational input of new genetic variation, which depends on population size. More realistic models must take into account the reduction in population size that results from the lag, the various genetic architectures that a trait may have, and the realistic expectation that the environmental

change may impose selection on multiple traits. Population genetic theory has shown that adaptation is likely to be slower, the greater the number of independent characters, or “dimensions” of genetic variation (Wagner, 1988; Orr, 2000), and that genetic correlations among characters may enhance or retard the rate of evolution, depending on where the new phenotypic optimum lies, relative to the multidimensional axis of greatest variation (Lande, 1979; Kirkpatrick and Lofsvold, 1992).

Predicting which species will survive and which will become extinct as a result of an environmental change is an important and exceedingly difficult challenge. Consider the global temperature change, already underway, that inevitably will transpire at a rate that has perhaps never been equaled in evolutionary history (Parmesan, 2006). What aspects of a species’ environment will change, what characteristics might, by evolving, provide adaptation to these alterations, and what levels of selectable genetic variation might enable adaptive change in these features are all major unknowns. The negative impacts on populations are not at all limited to thermal stress; they are already known to include phenological (seasonal) mismatch between a species’ life cycle and the phenology of its food supply, critical changes in its physical environment (e.g., polar bears depend on dwindling ice floes for hunting seals), and changes in the community of species with which a species interacts (Parmesan, 2006). For any particular species, it would be hard to identify all the characteristics that might be directionally selected, given such a multiplicity of possible impacts.
And there is increasing evidence that populations may have little or no genetic variation in some ecologically critical characteristics (Blows and Hoffmann, 2005), such as desiccation resistance in flies (Hoffmann et al., 2003), the capacity of herbivorous insects to adapt to certain plants (Futuyma et al., 1995), and the ability of plants to adapt to toxic soils (Bradshaw, 1991). It is perhaps no wonder, then, that species display niche conservatism (Wiens and Graham, 2005) and that the response of most species to Pleistocene glacial/interglacial oscillations was not adaptation to the climatic changes visited upon their original locations but massive, repeated shifts in geographic range as species tracked the climatic “envelope” to which they were already adapted (Williams et al., 2004).

Because of complex ecological linkages, species do not become extinct independently, and the extinction of key species can have cascading effects. For example, overexploitation of fish populations has had devastating effects on coral reefs, kelp beds, and even the pelagic food web (Scheffer et al., 2005). Consequently, ecologists are increasingly concerned that the loss of species diversity may have drastic effects on ecosystem “services” such as productivity and may result in ecosystem collapse. Preliminary models, as well as data on the consequences of marine biodiversity loss, give credence to these fears (see Figure 3-2; Dobson et al., 2006; Worm et al., 2006). The

possibility of devastating ecological effects of human impacts underscores the need for increasing theoretical and empirical studies of the interplay between species diversity and ecosystem characteristics.

FIGURE 3-2 Global loss of species from large marine ecosystems (LMEs). (A) Trajectories of collapsed fish and invertebrate taxa over the past 50 years (diamonds, collapses by year; triangles, cumulative collapses). Data are shown for all (black), species-poor (<500 species, blue), and species-rich (>500 species, red) LMEs. Regression lines are best-fit power models corrected for temporal autocorrelation. (B) Map of all 64 LMEs, color-coded according to their total fish species richness.
SOURCE: Worm, B. 2006. Impacts of Biodiversity Loss on Ocean Ecosystem Services. Science 314:787-790. Reprinted with permission from AAAS.

Extinction is, then, one of the least well-understood phenomena in ecology and evolutionary history. In evolutionary biology, a deeper understanding is required of the causes of niche conservatism, the dimensionality of genetic variation, the factors that determine variability (the capacity of characters to vary), and the nature of and linkages between genetic and demographic processes in changing environments. Theoretical and empirical advances are needed in ecology to address questions about the abiotic and biotic factors that can extinguish populations and about the linkages among species and ecosystem processes that might accelerate losses in diversity, productivity, and ecosystem health.

CONCLUSION

The diversity of biological systems extends from the molecular to the global scale and all of the levels are linked. Survival or extinction of a species and the stability of an ecosystem may depend on the level of random, neutral genetic variations that have built up in individual members of various species over time and on the balance between the size of those species’ populations and the rapidity of change in their environment. At all levels, general theories to explain and predict diversity would be a great advance: from defining the evolutionary relationship of species, to predicting the function of proteins from gene sequence, to relating the form and functions of organisms to their genomes, to predicting the stability of ecosystems from their constituent species. The vastness of the diversity and the important, but as yet undefined, role of chance and history in biological systems make the development of such theories a grand challenge indeed.


