
In the Light of Evolution: Volume X: Comparative Phylogeography (2017)

Chapter: 6 The Probability of Monophyly of a Sample of Gene Lineages on a Species Tree - Rohan S. Mehta, David Bryant, and Noah A. Rosenberg

Suggested Citation:"6 The Probability of Monophyly of a Sample of Gene Lineages on a Species Tree - Rohan S. Mehta, David Bryant, and Noah A. Rosenberg." National Academy of Sciences. 2017. In the Light of Evolution: Volume X: Comparative Phylogeography. Washington, DC: The National Academies Press. doi: 10.17226/23542.

6

The Probability of Monophyly of a Sample of Gene Lineages on a Species Tree


ROHAN S. MEHTA,* DAVID BRYANT, AND NOAH A. ROSENBERG*

Monophyletic groups—groups that consist of all of the descendants of a most recent common ancestor—arise naturally as a consequence of descent processes that result in meaningful distinctions between organisms. Aspects of monophyly are therefore central to fields that examine and use genealogical descent. In particular, studies in conservation genetics, phylogeography, population genetics, species delimitation, and systematics can all make use of mathematical predictions under evolutionary models about features of monophyly. One important calculation, the probability that a set of gene lineages is monophyletic under a two-species neutral coalescent model, has been used in many studies. Here, we extend this calculation for a species tree model that contains arbitrarily many species. We study the effects of species tree topology and branch lengths on the monophyly probability. These analyses reveal new behavior, including the maintenance of nontrivial monophyly probabilities for gene lineage samples that span multiple species and even for lineages that do not derive from a monophyletic species group. We illustrate the mathematical results using an example application to data from maize and teosinte.

__________________

* Department of Biology, Stanford University, Stanford, CA 94305; and Department of Mathematics and Statistics, University of Otago, Dunedin 9054, New Zealand. To whom correspondence should be addressed. Email: rsmehta@stanford.edu.


Mathematical computations under coalescent models have been central in developing a modern view of the descent of gene lineages along the branches of species phylogenies. Since early in the development of coalescent theory and phylogeography, coalescent formulas and related simulations have contributed to a probabilistic understanding of the shapes of multispecies gene trees (Tajima, 1983; Takahata and Nei, 1985; Neigel and Avise, 1986), enabling novel predictions about gene tree shapes under evolutionary hypotheses (Rosenberg, 2003; Degnan and Salter, 2005), new ways of testing hypotheses about gene tree discordances (Wu, 1991; Yu et al., 2012), and new algorithms for problems of species tree inference (Liu et al., 2009a; Wu, 2012) and species delimitation (Knowles and Carstens, 2007; Yang and Rannala, 2010). A “multispecies coalescent” model, in which coalescent processes on separate species tree branches merge back in time as species reach a common ancestor (Degnan and Rosenberg, 2009), has become a key tool for theoretical predictions, simulation design, and evaluation of inference methods, and as a null model for data analysis.

A fundamental concept in genealogical studies is that of monophyly. In a genealogy, a group that is monophyletic consists of all of the descendants of its most recent common ancestor (MRCA): every lineage in the group—and no lineage outside it—descends from this ancestor. Backward in time, a monophyletic group has all of its lineages coalesce with each other before any coalesces with a lineage from outside the group.

The phylogenetic and phylogeographic importance of monophyly traces to the fact that monophyly enables a natural definition of a genealogical unit. Such a unit can describe a distinctive set of organisms that differs from other groups of organisms in ways that are evolutionarily meaningful. Species can be delimited by characters present in every member of a species and absent outside the species, and that therefore can reflect monophyly (Sites and Marshall, 2004; De Queiroz, 2007). In conservation biology, monophyly can be used as a prioritization criterion because groups with many monophyletic loci are likely to possess unique evolutionary features (Moritz, 1994). Reciprocal monophyly, in which a set of lineages is divided into two groups that are simultaneously monophyletic, is often used in a genealogical approach to species divergence (Baum and Shaw, 1995; Hudson and Coyne, 2002). The proportion of loci that are reciprocally monophyletic is informative about the time since species divergence and can assist in representing the level of differentiation between groups (Edwards and Beerli, 2000; Rosenberg, 2003).

Many empirical investigations of genealogical phenomena have made use of conceptual and statistical properties of monophyly (Funk and Omland, 2003). Comparisons of observed monophyly levels to model predictions have been used to provide information about species divergence times (Hare and Weinberg, 2005; Syring et al., 2007). Model-based monophyly computations have been used alongside DNA sequence differences between and within proposed clades to argue for the existence of the clades (Birky et al., 2005), and tests involving reciprocal monophyly have been used to explain differing phylogeographic patterns across species (Carstens and Richards, 2007). Comparisons of observed levels of monophyly with the level expected by chance alone (Rosenberg, 2007) have assisted in establishing the distinctiveness of taxonomic groups (Neilson and Stepien, 2009; Kubatko et al., 2011). Loci that conflict with expected monophyly levels have provided signatures of genic roles in species divergences (Wang et al., 1999; Ting et al., 2000; Dopman et al., 2005).

For lineages from two species under a model of population divergence, Rosenberg (2003) computed probabilities of four different genealogical shapes: reciprocal monophyly of both species, monophyly of only one of the species, monophyly of only the other species, and monophyly of neither species. The computation permitted arbitrary species divergence times and sample sizes—generalizing earlier small-sample computations (Tajima, 1983; Takahata and Nei, 1985; Neigel and Avise, 1986; Takahata and Slatkin, 1990; Wakeley, 2000)—and illustrated the transition from the species divergence, when monophyly is unlikely for both species, to long after divergence, when reciprocal monophyly becomes extremely likely. Between these extremes, the species can pass through a period during which monophyly of one species but not the other is the most probable state.

Although this two-species computation has contributed to various insights about empirical monophyly patterns (Birky et al., 2005; Hickerson et al., 2006a; Carstens and Knowles, 2007; Carstens and Richards, 2007; Syring et al., 2007; Bergsten et al., 2012), many scenarios deal with more than two species. Because multispecies monophyly probability computations have been unavailable—except in limited cases with up to four species (Rosenberg, 2002, 2003; Degnan, 2010; Zhu et al., 2011; Eldon and Degnan, 2012)—multispecies studies have been forced to rely on two-species models, restricting attention to species pairs (Baker et al., 2009; Neilson and Stepien, 2009; Bergsten et al., 2012) or pooling disparate lineages and disregarding their taxonomic distinctiveness (Ting et al., 2000; Carstens and Richards, 2007).

Here, we derive an extension to the two-species monophyly probability computation, examining arbitrarily many species related by an evolutionary tree. Furthermore, we eliminate the past restriction (Rosenberg, 2003) that the lineages whose monophyly is examined all derive from the same population. This generalization is analogous to the assumption, made in computing the probability of a binary evolutionary character (RoyChoudhury et al., 2008; Bryant et al., 2012; RoyChoudhury and Thompson, 2012), that one or both character states can appear in multiple species. Our approach uses a pruning algorithm, generalizing the two-species formula in a conceptually similar manner to other recursive coalescent computations on arbitrary trees (Efromovich and Kubatko, 2008; RoyChoudhury et al., 2008; Bryant et al., 2012; RoyChoudhury and Thompson, 2012; Stadler and Degnan, 2012; Wu, 2012).

Like the work of Degnan and Salter (2005), which considered probability distributions for gene tree topologies under the multispecies coalescent model, our work generalizes a coalescent computation known only for small trees (Rosenberg, 2002, 2003) to arbitrary species trees. We study the dependence of the monophyly probability on the model parameters, providing an understanding of factors that contribute to monophyly in species trees of arbitrary size. Finally, we explore the utility of monophyly probabilities in an application to genomewide data from maize and teosinte.

RESULTS

Model and Notation

Overview

Consider a rooted binary species tree 𝒯 with ℓ leaves and specified topology and branch lengths. For each of the ℓ species represented by leaves of 𝒯, a number of sampled lineages is specified. Given a specified partition of the lineages into two subsets, we consider a condition describing whether one, the other, both, or neither of the two subsets of lineages is monophyletic. Our goal is to provide a recursive computation of the probability that the condition holds under the multispecies coalescent model. Notation appears in Table S1 (see footnote 1).

Lineage Classes

The initial sampled lineages are partitioned into class S (subset) for lineages within a chosen subset, and class C (complement) for all lineages not included in S. Coalescence between an S lineage and a C lineage produces an M (mixed) lineage. Any coalescence involving an M lineage also produces an M lineage. Coalescences between two S or two C lineages produce S and C lineages, respectively (Table 6.1).

__________________

1 Supporting information for this chapter, which includes Table S1, Figure S1, and Datasets S1 and S2, is available online at http://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1601074113/-/DCSupplemental.


TABLE 6.1 Lineage Classes Produced by Coalescence Events

                 Class 2
                 S    C    M
Class 1    S     S    M    M
           C     M    C    M
           M     M    M    M

NOTE: Intraclass coalescences between pairs of S lineages or pairs of C lineages preserve the class; interclass coalescences, and all coalescences involving an M lineage, result in M lineages.
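The class rules of Table 6.1 can be written as a short function. The following Python sketch is an illustration only; the names `CLASSES` and `coalesce_class` are hypothetical labels introduced here and do not come from the chapter.

```python
CLASSES = ("S", "C", "M")  # subset, complement, mixed


def coalesce_class(a, b):
    """Return the class of the lineage formed when classes a and b coalesce."""
    if a not in CLASSES or b not in CLASSES:
        raise ValueError("lineage classes must be 'S', 'C', or 'M'")
    # Two S lineages give S and two C lineages give C;
    # every other pairing produces a mixed (M) lineage.
    return a if (a == b and a != "M") else "M"


# Print the rows of the table in the order S, C, M.
for a in CLASSES:
    print(a, [coalesce_class(a, b) for b in CLASSES])
```

The function is symmetric in its arguments, matching the symmetry of the table.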

Letting the number of S and C lineages present initially in the ith leaf be S_i and C_i, respectively, the model parameters are S_i and C_i for each leaf i, 1 ≤ i ≤ ℓ, and the species tree 𝒯. For convenience, we aggregate the S_i and C_i with 𝒯 into a parameter collection 𝒯_SC that we call the initialized species tree.

Monophyly Events

A monophyly event Ei is an assignment of labels to lineage classes S and C. We can choose to label a class “monophyletic” or “not monophyletic,” or assign no label at all, so that nine monophyly events are possible, six of which are relevant for our purposes (Table 6.2). All lineages in a monophyletic class must coalesce within the class to a single lineage before any coalesces outside the class. If multiple classes are labeled monophyletic, then each class must be separately monophyletic.

Species-Merging Events

We orient the species tree vertically, with “up” toward the root and “down” toward the leaves. From a coalescent backward-in-time perspective, at every internal node of the species tree (each representing a species-merging event), lineages enter from the two branches directly below the node. We label one of these branches “left” and the other “right,” based on an arbitrarily labeled diagram of species tree 𝒯. These labels are used only for bookkeeping; the labeling does not affect subsequent calculations. Lineages entering from the left and right branches are called “left inputs” and “right inputs,” respectively. Each node x of 𝒯 is associated with exactly one branch, leading from node x to its immediate predecessor on 𝒯. We refer to this branch with the shared label x.

TABLE 6.2 Possible Monophyly Events for Two Disjoint Lineage Classes, S and C

Monophyletic Groups    Description             Notation
S                      Monophyly of S          ES
C                      Monophyly of C          EC
Only S                 Paraphyly of C          ESC′
Only C                 Paraphyly of S          ES′C
Both S and C           Reciprocal monophyly    ESC
Neither S nor C        Polyphyly               ES′C′

For an internal branch x in 𝒯, the numbers of left inputs of classes S, C, and M are recorded separately from the numbers of right inputs of each class. The total number of class-S inputs of x, left and right combined, is s_x^I (c_x^I for class C, m_x^I for class M). The lineages that exit branch x, entering a branch farther up the species tree, are the outputs of branch x; their numbers are s_x^O, c_x^O, and m_x^O for classes S, C, and M, respectively.

We combine the input and output values into two three-entry vectors: the “input states” n_x^I = (s_x^I, c_x^I, m_x^I) and the “output states” n_x^O = (s_x^O, c_x^O, m_x^O). Note that the input state is the componentwise sum of the left and right inputs. We refer to the nodes directly below node x corresponding to its left and right incoming branches by xL and xR, respectively, and to nodes farther down the tree by sequences of Ls and Rs, which, read from left to right, give the steps needed to reach them from x. For example, xRL follows down from x to the right (xR), then from xR to the left (xRL).

The time interval associated with node x is Tx, the length of branch x. Branch lengths are measured in coalescent time units of N generations, where N represents the haploid population size along the branch and is assumed to be constant. Thus, for a fixed number of generations, larger population sizes correspond to shorter lengths of time in coalescent units. Coalescences between inputs during time Tx yield the outputs of x. The root branch of 𝒯 has infinite length.

The outputs of any nonroot branch are exactly the left or the right inputs of another branch farther up the tree; the outputs of the root are the outputs of the species tree. The root has only one output lineage, so the entries of its output state sum to 1. Inputs of a node x are the outputs of xL and xR, so that n_x^I = n_xL^O + n_xR^O. For convenience, when node x corresponds to leaf i, we let n_x^I = (S_i, C_i, 0) (Fig. 6.1).

We define 𝒯_SC^x to be the initialized species subtree with root x and E_i^x to be the monophyly event E_i for the subtree with root x, ignoring the rest of the species tree.

FIGURE 6.1 Notation for computing monophyly probabilities above a species tree node x. Nodes xLL, xLR, and xR are leaves. S lineages appear in blue, C lineages in orange, and M lineages in green. The figure illustrates reciprocal monophyly. Each state is written as a vector that lists the numbers of S, C, and M lineages in that order. The caption gives the outputs and inputs of branch x, then the inputs of branches xL and xR (adopting the convention that leaf inputs enter from the left), and finally, descending one more level, which is only possible for xL, the inputs of branches xLL and xLR [numerical state vectors not recovered]. Branch widths represent constant population sizes but do not indicate relative magnitudes of these sizes.

Coalescence Sequences

A coalescence sequence is a sequence of coalescences that reduces a set of lineages to another set of lineages. As an example, consider four lineages—labeled A, B, C, and D—that coalesce to a single lineage. One sequence has A and C coalesce first, followed by B and D, then the lineages resulting from the AC and BD coalescences. This sequence could be described as (A, C), (B, D), (AC, BD). If the first two coalescences happened in opposite order, the sequence would be (B, D), (A, C), (AC, BD).
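The example above can be checked by brute force. The following Python sketch (the function name `count_coalescence_sequences` is a hypothetical label introduced here) enumerates every ordered sequence of pairwise coalescences that reduces a set of labeled lineages to a target number of lineages. It treats sequences that differ only in the order of their coalescences as distinct, as in the (A, C), (B, D) example.

```python
from itertools import combinations


def count_coalescence_sequences(lineages, target=1):
    """Count ordered sequences of pairwise coalescences that reduce
    the labeled `lineages` to `target` lineages."""
    lineages = tuple(lineages)
    if len(lineages) == target:
        return 1  # nothing left to do: exactly one (empty) continuation
    total = 0
    # Choose which pair coalesces next, then recurse on the reduced set.
    for i, j in combinations(range(len(lineages)), 2):
        merged = lineages[i] + lineages[j]
        rest = tuple(x for k, x in enumerate(lineages) if k not in (i, j))
        total += count_coalescence_sequences(rest + (merged,), target)
    return total


print(count_coalescence_sequences("ABCD"))  # 18
```

For four lineages the count is 6 × 3 × 1 = 18: six choices for the first pair, three for the second, and one for the last.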


Combinatorial Functions

The probability gn,j(T) that n lineages coalesce to j lineages in time T is given by equation 6.1 of Tavaré (1984). It is nonzero only when n ≥ j ≥ 1 and T ≥ 0, except that we set g0,0(T) = 1.
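The chapter cites this probability without reproducing the formula. As a hedged illustration, the Python sketch below uses the closed form in which this distribution is commonly written, with time in coalescent units so that each pair of lineages coalesces at rate 1. The exact expression should be checked against Tavaré (1984); the helper names `rising`, `falling`, and `g` are hypothetical labels introduced here.

```python
from math import exp, factorial


def rising(a, k):
    """Rising factorial a(a+1)...(a+k-1); equals 1 when k = 0."""
    out = 1
    for i in range(k):
        out *= a + i
    return out


def falling(a, k):
    """Falling factorial a(a-1)...(a-k+1); equals 1 when k = 0."""
    out = 1
    for i in range(k):
        out *= a - i
    return out


def g(n, j, T):
    """Probability that n lineages coalesce to j lineages in time T."""
    if n == 0 and j == 0:
        return 1.0  # convention stated in the text
    if not (n >= j >= 1) or T < 0:
        return 0.0
    total = 0.0
    for k in range(j, n + 1):
        rate = k * (k - 1) / 2
        # Guard the rate-0 term so that T = infinity does not produce 0 * inf.
        decay = exp(-rate * T) if rate > 0 else 1.0
        total += (decay * (2 * k - 1) * (-1) ** (k - j)
                  * rising(j, k - 1) * falling(n, k)
                  / (factorial(j) * factorial(k - j) * rising(n, k)))
    return total


print(g(2, 1, 1.0))  # equals 1 - exp(-1) for two lineages
```

For two lineages the function reduces to 1 − e^(−T), the probability that a single pair has coalesced by time T, and for any n the values over j = 1, ..., n sum to 1.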

Following equation 4 of Rosenberg (2003), the number of coalescence sequences that reduce n lineages to k lineages is I_{n,k} = [n!(n − 1)!]/[2^(n−k) k!(k − 1)!]. This function is nonzero only when n ≥ k ≥ 1, with the convention I_{0,0} = 1.
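A minimal Python sketch of this counting function follows. The name `num_sequences` is a hypothetical label introduced here. The factorial ratio equals the product of the binomial coefficients C(i, 2) for i = k + 1, ..., n, so the integer division is exact.

```python
from math import comb, factorial, prod


def num_sequences(n, k):
    """Number of coalescence sequences reducing n lineages to k lineages."""
    if n == 0 and k == 0:
        return 1  # convention stated in the text
    if not (n >= k >= 1):
        return 0
    return (factorial(n) * factorial(n - 1)
            // (2 ** (n - k) * factorial(k) * factorial(k - 1)))


# Cross-check against the product of pair counts C(i, 2), i = k+1..n.
assert num_sequences(6, 2) == prod(comb(i, 2) for i in range(3, 7))
print(num_sequences(4, 1))  # 18
```

The value 18 for four lineages agrees with the direct count 6 × 3 × 1 of ordered pairings.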

Finally, the binomial coefficient

\[ W_2(r_1, r_2) = \binom{r_1 + r_2}{r_1}, \]

by equation 5 from Rosenberg (2003), gives the number of ways that separate coalescence sequences consisting of r1 and r2 coalescences can be ordered in a larger sequence containing them both as subsequences. W2(r1,r2) is defined when r1,r2 ≥ 0.
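The interleaving interpretation can be verified directly. In the Python sketch below, `W2` evaluates the binomial coefficient and `count_interleavings` (a hypothetical helper introduced here) counts the orderings recursively by choosing, at each step, which of the two sequences supplies the next coalescence.

```python
from math import comb


def W2(r1, r2):
    """Number of ways to interleave sequences of r1 and r2 coalescences."""
    if r1 < 0 or r2 < 0:
        raise ValueError("r1 and r2 must be nonnegative")
    return comb(r1 + r2, r1)


def count_interleavings(r1, r2):
    """Brute-force count: the next event comes from sequence 1 or sequence 2."""
    if r1 == 0 or r2 == 0:
        return 1  # only one sequence remains, so the order is forced
    return count_interleavings(r1 - 1, r2) + count_interleavings(r1, r2 - 1)


print(W2(2, 1), count_interleavings(2, 1))  # 3 3
```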

The Central Recursion

Overview

We develop a recursion for the probability of a particular output state n_x^O and monophyly event E_i^x for a branch x given the initialized species subtree 𝒯_SC^x. We use the law of total probability to write the desired probability as a sum over all possible input states n_x^I of the probability of the input state multiplied by the conditional probability of the output given the input. Keeping in mind that {inputs of x} = {outputs of xL} ∪ {outputs of xR}, we then use the independence of the outputs for branches xL and xR to decompose the probability of the input state of x into a product of the probabilities of the output states of xL and xR. Schematically,

\[ P(\text{outputs of } x) = \sum_{\text{inputs of } x} P(\text{outputs of } x_L)\, P(\text{outputs of } x_R)\, P(\text{outputs of } x \mid \text{inputs of } x). \tag{6.1} \]

The third term on the right-hand side of Eq. 6.1, which we represent by F, is the probability that the inputs coalesce to the specified outputs during time Tx in accord with the monophyly event. We write the random variable for the output state of branch x as Zx, labeling the particular values attained by the random variable by n_x^O. By formalizing Eq. 6.1, we can write the central recursion of our analysis:

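\[ P(Z_x = n_x^O, E_i^x \mid \mathcal{T}_{SC}^x) = \sum_{n_{x_L}^O} \sum_{n_{x_R}^O} P(Z_{x_L} = n_{x_L}^O, E_i^{x_L} \mid \mathcal{T}_{SC}^{x_L})\, P(Z_{x_R} = n_{x_R}^O, E_i^{x_R} \mid \mathcal{T}_{SC}^{x_R})\, F(n_x^O, E_i^x \mid n_x^I, \mathcal{T}_{SC}^x), \qquad n_x^I = n_{x_L}^O + n_{x_R}^O. \tag{6.2} \]

In this equation, the upper bounds of summation for the S and C entries are the total numbers of class-S and class-C lineages sampled across all of the leaves subtended by xL or xR, respectively. Each of the two summations is a nested triple sum, proceeding componentwise over the three entries of the left and right input vectors; for example, for the left inputs, we sum the S and C entries from 0 to these class totals and the M entry from 0 to 1. We now explain the basis for this recursion.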

Bounds of Summation

The sums in Eq. 6.2 traverse all possible inputs of branch x. We use summation bounds that only require information contained in the initialized species subtree 𝒯_SC^x. Numbers of inputs are nonnegative, and for each lineage class, some branches have the possibility of having no inputs in the class. Thus, all lower bounds are 0.

For the upper bounds, because coalescence does not create new S and C lineages (Table 6.1), the numbers of S and C lineages never exceed the numbers of S and C leaves in the gene tree, respectively. Thus, for branch x, an upper bound for the possible number of inputs of class S or C from one side (L or R) is the total number of S lineages or C lineages, respectively, initially sampled in the leaves subtended by that side.

We use Eq. 6.2 to calculate probabilities only for ES, EC, and ESC (Table 6.2), using them to obtain probabilities for the remaining events. These three events require complete intraclass coalescence separately in the appropriate classes before interclass coalescences are possible. As a result, they permit exactly one coalescence between an S lineage and a C lineage. Because the leaves possess no M lineages and because only the unique coalescence between an S and a C lineage creates an M lineage (Table 6.1), the number of M lineages never exceeds 1.

Probability of the Outputs of a Node Given the Inputs

Separating the function F from Eq. 6.2 into a term for the probability that the correct number of outputs is produced from the inputs and a combinatorial term Ki for the probability that the coalescence sequence generating those outputs occurs in accord with the monophyly event Ei, F takes the form

\[ F(n_x^O, E_i^x \mid n_x^I, \mathcal{T}_{SC}^x) = g_{|n_x^I|,\,|n_x^O|}(T_x)\, K_i(n_x^I, n_x^O), \tag{6.3} \]

where |n| = s + c + m denotes the total number of lineages in a state n = (s, c, m). For the case of i = S, in which monophyly of S is of interest, we have:

[Equation 6.4: piecewise definition of KS, with cases 1a–1e, case 2, case 3, and “otherwise”; equation image not recovered]

Here, s_x records the total number of class-S lineages in the species tree at the species-merging event corresponding to node x. For cases 1 and 3, 0 < c2 ≤ c1 and 0 < s2 ≤ s1. For case 2, 0 ≤ c2 < c1, 0 < c1, and 0 < s_x. Note that it is not strictly necessary to require s_x^I = s_x in case 2 (violation of ES would be accommodated elsewhere in the calculation, on another species tree branch), but we retain this condition for clarity.

Function F (Eq. 6.3) describes the probability of an output state and monophyly event given an input state and the initialized species tree. Its g term records the probability that the correct number of coalescences occur during the time Tx, defining a space of coalescence sequences from the input state to any output state with the same number of lineages as the desired output. Ki (Eq. 6.4) records the fraction of those sequences that produce the correct output and preserve the monophyly event Ei (in this case, ES).

The cases in Eq. 6.4 represent distinct scenarios for the types of input and output lineages present (Fig. 6.2A–G). In case 1 (Fig. 6.2A–E), no coalescence violates ES, as all coalescences have types (S, S) (case 1e), (C, C) (cases 1b, 1c, 1d), or (C, M) (cases 1c, 1d). No coalescences occur in case 1a. The correct output state is guaranteed (KS = 1), as each coalescence decrements the number of S (case 1e) or C lineages (cases 1b, 1c, 1d), and the only change from input to output is a reduction in S or C lineages.

In cases 2 and 3, both S and C lineages are present, and we enumerate the ways to obtain the desired output state from the input state in accord with the monophyly event. To obtain KS, we divide by the total number of coalescence sequences of correct length.

FIGURE 6.2 All cases required for computing combinatorial terms KS and KSC in monophyly probabilities. (A–G) Cases for monophyly of S (Eq. 6.4). (H) A case for reciprocal monophyly (Eq. 6.5). In each panel, lineages coalesce from bottom to top, with the width of a shape corresponding to the number of lineages present. A single lineage is represented by a line, and multiple freely coalescing lineages are represented by shaded polygons with horizontal cross-section proportional to the number of extant lineages. Lineages represented in the same shape or in touching shapes can coalesce with each other. Lineage colors follow Fig. 6.1.

Case 2 describes the only possible way an S lineage and a C lineage can coalesce with each other under ES (Fig. 6.2F). All extant S lineages at the time of node x (s_x^I = s_x) must coalesce to a single lineage, and that lineage must coalesce with a C lineage when k class-C lineages remain from the c_x^I = c1 extant C lineages present in both species at node x. This coalescence results in a single M lineage and k − 1 lineages of class C, which can coalesce in any order to a single class-M lineage and c_x^O = c2 class-C lineages.

The number of ways that s_x lineages can coalesce to one lineage is I_{s_x,1}. The number of ways that c1 lineages can coalesce to k lineages is I_{c1,k}. These separate sequences of s_x − 1 and c1 − k coalescences can be ordered in W2(s_x − 1, c1 − k) ways. The number of ways that a single S lineage can coalesce with one of k lineages of class C is k. Finally, k lineages (one M lineage and k − 1 class-C lineages) can coalesce to c2 + 1 lineages in I_{k,c2+1} ways. The desired number of coalescence sequences of correct length that result in the correct output state without violating ES is obtained by summing the product of these terms over possible values of k. The smallest value, k = c2 + 1, leaves just enough C lineages to allow the correct number of output C lineages (c2): the resultant single S lineage coalesces with one C lineage and then no other coalescence occurs. The largest value, k = c1, is the total number of incoming C lineages and applies when all of the S lineages coalesce before any of the C lineages coalesce. The denominator of the ratio KS is the total number of ways of coalescing s_x + c1 input lineages to c2 + 1 output lineages: I_{s_x+c1,c2+1}. Note that setting c2 = 0 in the ratio, reflecting a scenario with only one output lineage, of class M, reduces the formula to the two-species equation 11 from Rosenberg (2003) (Supporting Information).

Case 3 describes any situation with S and C lineages present and no interclass coalescence (Fig. 6.2G). At node x, the s_x^I = s1 class-S lineages coalesce to s_x^O = s2 class-S lineages, and the c_x^I = c1 class-C lineages to c_x^O = c2 class-C lineages. Group S has not yet coalesced with the other sampled lineages and does not do so within this species tree branch; its monophyly is not necessarily determined on the branch. The number of ways s1 lineages can coalesce to s2 lineages is I_{s1,s2}, and c1 lineages can coalesce to c2 lineages in I_{c1,c2} ways. These sequences can be ordered in W2(s1 − s2, c1 − c2) ways. The numerator in the fraction of coalescence sequences of the correct length that result in the correct output state without violating ES is the product of these three terms. The denominator is the total number of ways of coalescing s1 + c1 input lineages to s2 + c2 outputs: I_{s1+c1,s2+c2}.

Any pairing of an input state and an output state that does not belong in cases 1–3 of Eq. 6.4 must violate ES. This violation yields an output probability of KS = 0.
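The counting arguments for cases 2 and 3 can be turned into code directly from their verbal descriptions. The Python sketch below is an illustration under that reading of the text, not a transcription of Eq. 6.4; the names `I`, `W2`, `K_S_case2`, and `K_S_case3` are hypothetical labels introduced here, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from math import comb, factorial


def I(n, k):
    """Number of coalescence sequences reducing n lineages to k lineages."""
    if n == 0 and k == 0:
        return 1
    if not (n >= k >= 1):
        return 0
    return (factorial(n) * factorial(n - 1)
            // (2 ** (n - k) * factorial(k) * factorial(k - 1)))


def W2(r1, r2):
    """Number of interleavings of two coalescence sequences."""
    return comb(r1 + r2, r1)


def K_S_case2(s_x, c1, c2):
    """Inputs (s_x, c1, 0) to outputs (0, c2, 1): the S lineages coalesce to
    one lineage, which then joins a C lineage while k C lineages remain."""
    if not (s_x > 0 and c1 > 0 and 0 <= c2 < c1):
        raise ValueError("case 2 requires s_x > 0, c1 > 0, and 0 <= c2 < c1")
    numerator = sum(I(s_x, 1) * I(c1, k) * W2(s_x - 1, c1 - k) * k * I(k, c2 + 1)
                    for k in range(c2 + 1, c1 + 1))
    return Fraction(numerator, I(s_x + c1, c2 + 1))


def K_S_case3(s1, s2, c1, c2):
    """Inputs (s1, c1, 0) to outputs (s2, c2, 0): no interclass coalescence."""
    if not (0 < s2 <= s1 and 0 < c2 <= c1):
        raise ValueError("case 3 requires 0 < s2 <= s1 and 0 < c2 <= c1")
    return Fraction(I(s1, s2) * I(c1, c2) * W2(s1 - s2, c1 - c2),
                    I(s1 + c1, s2 + c2))


print(K_S_case2(2, 2, 0))     # 2/9
print(K_S_case3(2, 1, 2, 1))  # 1/9
```

As a hand check, with two S and two C lineages reduced to a single lineage, 4 of the 18 coalescence sequences keep S monophyletic, giving 2/9. With the same inputs reduced to one S and one C lineage, only the 2 sequences that pair S with S and C with C qualify, giving 1/9.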

Reciprocal Monophyly

Monophyly events ESC and ES differ in that for ESC, unlike for ES, C and M lineages cannot coexist. Thus, cases 1c and 1d of Eq. 6.4 move to “otherwise” for KSC, producing KSC = 0 for the input states of those cases. Additionally, for ESC, an interclass coalescence can occur only after all S lineages have coalesced to a single S lineage and all C lineages have coalesced to a single C lineage, whereas ES required only that all S lineages coalesce. For ES, interclass coalescences occur only in case 2 of Eq. 6.4; for ESC, we modify this case in two ways. First, we require that before the interclass coalescence, the C lineages must be all C lineages in the tree at the time of node x (c_x^I = c_x, as we required for S lineages in case 2 of Eq. 6.4). Second, we require k = 1 and c2 = 0, so all C lineages coalesce to a single lineage before the interclass coalescence. Setting k = 1, c2 = 0, substituting c_x for c1 in case 2 of Eq. 6.4, and noting that I_{1,1} = 1, we obtain case 2 for KSC (Fig. 6.2H), applicable when n_x^I = (s_x, c_x, 0) and n_x^O = (0, 0, 1):

\[ K_{SC} = \frac{I_{s_x,1}\, I_{c_x,1}\, W_2(s_x - 1,\, c_x - 1)}{I_{s_x + c_x,\,1}}. \tag{6.5} \]

For ESC, the input condition for case 2 can be satisfied only at the root of 𝒯. For all input states other than those of Eq. 6.5 or cases 1c and 1d of Eq. 6.4, KSC = KS.
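The reciprocal monophyly term follows from the substitutions described above (k = 1, c2 = 0, and c_x in place of c1). The Python sketch below implements that reading; the names `I`, `W2`, and `K_SC_case2` are hypothetical labels introduced here.

```python
from fractions import Fraction
from math import comb, factorial


def I(n, k):
    """Number of coalescence sequences reducing n lineages to k lineages."""
    if n == 0 and k == 0:
        return 1
    if not (n >= k >= 1):
        return 0
    return (factorial(n) * factorial(n - 1)
            // (2 ** (n - k) * factorial(k) * factorial(k - 1)))


def W2(r1, r2):
    """Number of interleavings of two coalescence sequences."""
    return comb(r1 + r2, r1)


def K_SC_case2(s_x, c_x):
    """Inputs (s_x, c_x, 0) to outputs (0, 0, 1): each class coalesces to a
    single lineage before the one interclass coalescence."""
    if s_x < 1 or c_x < 1:
        raise ValueError("both classes must be present")
    return Fraction(I(s_x, 1) * I(c_x, 1) * W2(s_x - 1, c_x - 1),
                    I(s_x + c_x, 1))


print(K_SC_case2(2, 2))  # 1/9
```

With two lineages in each class, only the 2 sequences that pair S with S and C with C before the final coalescence, out of 18, give reciprocal monophyly.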

Completing the Calculation

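Having obtained a recursion that propagates monophyly probabilities through a species tree, we apply Eq. 6.2 at the root to complete the calculation of the probability of a monophyly event on 𝒯_SC:

\[ P(E_i \mid \mathcal{T}_{SC}) = P(Z_{root} = n_{root}^O, E_i^{root} \mid \mathcal{T}_{SC}^{root}). \tag{6.6} \]

Specifying each possible monophyly event E_i^root in Eq. 6.6,

\begin{align*}
P(E_S \mid \mathcal{T}_{SC}) &= P(Z_{root} = (0,0,1), E_S^{root} \mid \mathcal{T}_{SC}), \tag{6.7}\\
P(E_C \mid \mathcal{T}_{SC}) &= P(E_S \mid \mathcal{T}_{CS}),\\
P(E_{SC} \mid \mathcal{T}_{SC}) &= P(Z_{root} = (0,0,1), E_{SC}^{root} \mid \mathcal{T}_{SC}),\\
P(E_{SC'} \mid \mathcal{T}_{SC}) &= P(E_S \mid \mathcal{T}_{SC}) - P(E_{SC} \mid \mathcal{T}_{SC}),\\
P(E_{S'C} \mid \mathcal{T}_{SC}) &= P(E_C \mid \mathcal{T}_{SC}) - P(E_{SC} \mid \mathcal{T}_{SC}),\\
P(E_{S'C'} \mid \mathcal{T}_{SC}) &= 1 - P(E_S \mid \mathcal{T}_{SC}) - P(E_C \mid \mathcal{T}_{SC}) + P(E_{SC} \mid \mathcal{T}_{SC}),
\end{align*}

where 𝒯_CS is 𝒯_SC with the labels S and C switched. These recursive computations reduce to the known values for the two-species case (Supporting Information).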
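Because only ES, EC, and ESC are computed by the recursion, the remaining three events of Table 6.2 follow by inclusion and exclusion. The Python sketch below shows one way to organize that step; the function name `all_event_probabilities` and the dictionary keys are hypothetical labels introduced here, and the inputs are illustrative values for two S and two C lineages in a single population.

```python
from fractions import Fraction


def all_event_probabilities(p_S, p_C, p_SC):
    """Derive the six event probabilities of Table 6.2 from
    P(E_S), P(E_C), and P(E_SC)."""
    if not (0 <= p_SC <= min(p_S, p_C) and max(p_S, p_C) <= 1):
        raise ValueError("need 0 <= P(E_SC) <= min(P(E_S), P(E_C)) <= 1")
    return {
        "E_S": p_S,                       # S monophyletic, C unconstrained
        "E_C": p_C,                       # C monophyletic, S unconstrained
        "E_SC": p_SC,                     # reciprocal monophyly
        "E_SC'": p_S - p_SC,              # only S monophyletic
        "E_S'C": p_C - p_SC,              # only C monophyletic
        "E_S'C'": 1 - p_S - p_C + p_SC,   # neither monophyletic
    }


probs = all_event_probabilities(Fraction(2, 9), Fraction(2, 9), Fraction(1, 9))
for event, p in probs.items():
    print(event, p)
```

The four mutually exclusive events (ESC, ESC′, ES′C, and ES′C′) sum to 1, which gives a quick consistency check on any set of computed values.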


Effect of Species Tree Height T

To illustrate the features of monophyly probabilities, we now examine the effects on the probabilities of model parameters. First, we vary the tree height T and preserve relative branch length proportions, studying the limiting cases of T = 0 and T → ∞.

T = 0

At T = 0, nonroot species tree branches have length 0, so the species tree is a single infinitely long branch (the root) with initial sample sizes equal to the sums of the values at the leaves. Formally, because gi,j(0) = 1 if i = j, every nonroot branch outputs exactly its inputs. All s = Σ_i S_i class-S lineages and all c = Σ_i C_i class-C lineages enter the root. Using Eq. 6.7, and noting that gi,1(∞) = 1, we find that P(ES | 𝒯_SC) is a simple function of the total numbers of S and C lineages:

\[ P(E_S \mid \mathcal{T}_{SC}) = f(s, c) = \sum_{k=1}^{c} \frac{I_{s,1}\, I_{c,k}\, W_2(s-1,\, c-k)\, k\, I_{k,1}}{I_{s+c,1}} = \frac{2(s+c)}{s(s+1)\binom{s+c}{s}}, \]

with the last equality from equation 11 in Rosenberg (2003). Function f decreases with increasing s or c, as adding any lineage increases the chance of a monophyly-violating interclass coalescence.
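The T = 0 probability can be evaluated in two ways and compared. The Python sketch below computes f(s, c) as the case 2 sum with c2 = 0 and also with the closed form 2(s + c)/[s(s + 1) C(s + c, s)]. That closed form is the standard single-population expression for the monophyly of s lineages among s + c, and it is used here as an assumption to be checked against Rosenberg (2003). The function names are hypothetical labels introduced here.

```python
from fractions import Fraction
from math import comb, factorial


def I(n, k):
    """Number of coalescence sequences reducing n lineages to k lineages."""
    if n == 0 and k == 0:
        return 1
    if not (n >= k >= 1):
        return 0
    return (factorial(n) * factorial(n - 1)
            // (2 ** (n - k) * factorial(k) * factorial(k - 1)))


def f_sum(s, c):
    """Case 2 counting argument with c2 = 0: all lineages reach one M lineage."""
    numerator = sum(I(s, 1) * I(c, k) * comb(s - 1 + c - k, s - 1) * k * I(k, 1)
                    for k in range(1, c + 1))
    return Fraction(numerator, I(s + c, 1))


def f_closed(s, c):
    """Assumed closed form for the same probability."""
    return Fraction(2 * (s + c), s * (s + 1) * comb(s + c, s))


for s, c in [(1, 3), (2, 1), (2, 2), (3, 2)]:
    print(s, c, f_sum(s, c), f_closed(s, c))
```

For the small cases tabulated, the two evaluations agree, and the values do not increase when either s or c grows; when s = 1 the probability is 1 for every c, since a single lineage is trivially monophyletic.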

T → ∞

As T → ∞, because limTgi,j(T) = 1 when j = 1, every branch exhibits complete coalescence. We define the minimal subtree with respect to S, images, as the smallest subtree of the species tree whose leaves contain all of the initial S lineages in the tree.

For large T, the monophyly probability depends on properties of the minimal subtree. To be monophyletic, the S lineages must encounter C lineages only above their root. If the minimal subtree contains no C lineages, then complete coalescence in each branch implies monophyly of S lineages, and the monophyly probability is 1. If the minimal subtree contains C lineages and is a leaf, k, then the limiting probability is $f(S_k, C_k)$. Complete coalescence in every branch makes this leaf analogous to the root in the T = 0 case. Note that if $S_k > 1$, then the limit $f(S_k, C_k)$ lies in the interior of the unit interval. This result contrasts with Rosenberg (2003), where lineage classes correspond to species tree leaves and the T → ∞ probability of ES is 1. In our scenario, because multiple lineage classes are permitted at a leaf, a nonzero limit can be below 1.

If the minimal subtree contains C lineages but is not a leaf, however, then complete coalescence in every branch implies that some proper subset of S lineages must coalesce with C lineages before all of the S lineages can coalesce with each other. In this case, the limiting monophyly probability is 0.
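The three T → ∞ cases can be sketched as a small classifier. The code below is illustrative only: the tree encoding (a leaf is a tuple `(S_k, C_k)`, an internal node is a list of subtrees), the function names, and the example trees are all hypothetical, and the closed form used for f is an assumption, as the displayed equation for f is not reproduced in this text.

```python
from fractions import Fraction
from math import comb


def f(s: int, c: int) -> Fraction:
    """Assumed closed form for single-population monophyly of S (s, c >= 1)."""
    return Fraction(2 * (s + c), s * (s + 1) * comb(s + c, s))


def totals(tree):
    """Total (S, C) lineage counts over the leaves of a subtree."""
    if isinstance(tree, tuple):
        return tree
    s = c = 0
    for child in tree:
        child_s, child_c = totals(child)
        s += child_s
        c += child_c
    return s, c


def minimal_subtree(tree):
    """Smallest subtree whose leaves contain all initial S lineages."""
    if totals(tree)[0] == 0:
        raise ValueError("the tree contains no S lineages")
    node = tree
    while not isinstance(node, tuple):
        holders = [child for child in node if totals(child)[0] > 0]
        if len(holders) != 1:
            break  # S lineages occupy two or more child subtrees
        node = holders[0]
    return node


def limiting_probability(tree) -> Fraction:
    """Limit of the probability of ES as T grows, following the three cases."""
    sub = minimal_subtree(tree)
    s, c = totals(sub)
    if c == 0:
        return Fraction(1)      # S lineages coalesce before meeting any C
    if isinstance(sub, tuple):
        return f(s, c)          # S and C share a single leaf
    return Fraction(0)          # some S must join C before all S unite


if __name__ == "__main__":
    # Hypothetical three-species trees with topology ((1,2),3).
    print(limiting_probability([[(2, 0), (2, 0)], (0, 4)]))
    print(limiting_probability([[(2, 2), (0, 2)], (0, 2)]))
    print(limiting_probability([[(2, 2), (2, 0)], (0, 2)]))
```

The classifier ignores branch lengths entirely, which is appropriate only in the limit; for finite T the full recursion is required.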

Finite, Nonzero T

The extreme cases assist in understanding the behavior of the probability of ES for intermediate T. We enumerate the possible situations based on the minimal subtree, continuing to assume that relative branch lengths are fixed and that a changing tree height changes all branch lengths proportionally.

If the minimal subtree contains no C lineages, then decreasing the tree height decreases the probability of monophyly by decreasing the time during which S lineages are able to coalesce with only themselves, eventually approaching a minimum f(s,c) achieved at T = 0. Similarly, increasing T increases the monophyly probability toward 1 as T → ∞.

If the minimal subtree contains C lineages and is a leaf, then decreasing the tree height decreases the monophyly probability by decreasing the time before more C lineages are added to the population that contains the S lineages. Shrinking the tree also increases the expected number of additional C lineages introduced at species merging events, further decreasing the monophyly probability. The minimal probability of monophyly therefore occurs at T = 0. Similarly, increasing the tree height increases the probability of monophyly, approaching a maximal value as T → ∞. Consequently, in this case, as in the previous case, the probability increases monotonically in T.

If the minimal subtree contains C lineages and is not a leaf, then the minimal probability of monophyly, approached as T → ∞, is 0. As we will see in numerical examples, however, monotonicity of the monophyly probability with T is not guaranteed, and different initial sample sizes on the same species tree can generate different behavior.

Effect of Relative Branch Lengths

Next, to investigate the behavior of the monophyly probability as T increases, we devise a simple three-species, two-parameter scenario, subdividing the tree height T by a parameter r. We calculate the probability of ES for different sample-size conditions, varying r and T.

Fig. 6.3 shows the species tree and its resulting monophyly probabilities for four representative initial conditions. For each lineage class, S and C, the four cases place one or more lineage pairs into the three species, using different placements across the four cases. The cases include scenarios in which at least one species contains both S and C lineages (B, D, E), in which one (C) or both lineage classes span multiple species (B, D, E), and in which the species containing S lineages are not monophyletic in the species tree (B, C).

FIGURE 6.3 The effect on monophyly probabilities of changing two branch lengths in relation to each other. (A) Model species tree. If the branch length coefficient r is 0, then the tree has a polytomy, and if r = 1, then the tree reduces to a two-species tree. (B–E) The probabilities of ES (Eq. 6.7) for monophyly of S for the tree in A under different scenarios: (B) (S1,S2,S3) = (2,0,2), (C1,C2,C3) = (2,2,2). (C) (S1,S2,S3) = (2,0,2), (C1,C2,C3) = (0,2,0). (D) (S1,S2,S3) = (2,2,0), (C1,C2,C3) = (2,0,2). (E) (S1,S2,S3) = (2,2,0), (C1,C2,C3) = (2,2,2).

The four cases (Fig. 6.3B–E) illustrate differences in the pattern of increase or decrease in the monophyly probability with changes in r at fixed tree height T (Supporting Information). In most cases with fixed r, the probability decreases to 0 with increasing T, although in some boundary cases with r = 0 and r = 1 that change the case for the limiting behavior with T (see above on T → ∞), it approaches a positive value strictly within the unit interval. These scenarios highlight the fact that depending on the relative branch lengths and distribution of lineage classes across species, the monophyly probability can be monotonically increasing in T, monotonically decreasing, or not monotonic at all.

Effect of Pooling

Our next scenario simulates the difference between separating and pooling distinct species when computing monophyly probabilities, recalling that tests with more than two species have until now required the pooling of multiple clades (Carstens and Richards, 2007; Kubatko et al., 2011).

We consider four species trees with equal height and 12 lineages (Fig. 6.4). Six class-C lineages appear in one species descended from the root. The other six—the S lineage class—are evenly divided between one, two, three, or six other leaves. If we interpret the seven-leaf tree in Fig. 6.4D to be the “true” species tree, then the other trees represent pooling schemes, the two-leaf tree (Fig. 6.4A) being the only one possible to analyze using previous results.

Fig. 6.4E–J displays the probabilities of all possible monophyly events for each tree. For each event, pooling does not affect the extreme cases T = 0 and T → ∞. For intermediate T, the monophyly probability for the S lineages decreases as pooling is reduced from the case in which the six class-S lineages are treated as belonging to a single species to the case in which each lineage is in its own species (Fig. 6.4E); the monophyly probability for C remains largely unchanged (Fig. 6.4F). As pooling is reduced, the probability of monophyly of only S and not C decreases (Fig. 6.4G), and that of only C and not S increases (Fig. 6.4H). The reciprocal monophyly probability decreases (Fig. 6.4I) and the probability of no monophyly increases (Fig. 6.4J).

In this scenario, the S and C lineages meet only at the species tree root, and the monophyly probabilities are determined by the numbers of lineages that reach the root. Coalescence is faster with more nonisolated lineages; pooling species together results in more coalescence events and fewer S lineages entering the root, increasing the probability of monophyly of both S and C lineages as well as the reciprocal monophyly probability (Fig. 6.4E, F, and I). Decreasing the number of S lineages at the root decreases the number of coalescences needed to produce ES above the root, decreasing the chance of an interclass coalescence, whereas decreasing the number of S lineages does not change the number of coalescences necessary to produce EC and has a smaller effect on its probability (cf. Fig. 6.4E and F). The probability for ESC closely follows that of ES, as production of reciprocal monophyly is limited by the monophyly of the individual classes.

FIGURE 6.4 The effect on monophyly probabilities of pooling lineages from separate species. (A–D) Model species trees. Labels record numbers of input lineages (S in blue, C in orange). (E–J) Probabilities of monophyly events. The trajectories represent species trees with six class-S lineages evenly distributed over one (A), two (B), three (C), and six (D) species. (E) ES (Eq. 6.7). (F) EC (Eq. 6.8). (G) Monophyly of S but not C (Eq. 6.10). (H) Monophyly of C but not S (Eq. 6.11). (I) Reciprocal monophyly, ESC (Eq. 6.9). (J) Monophyly of neither S nor C (Eq. 6.12).

As can be seen from the increase in probability for ES as pooling is increased (Fig. 6.4E), the correct monophyly probability for clades that have been pooled tends to be lower than that obtained under a model where the pooled clades are treated as a single clade. The monophyly probability will likely be overestimated if populations are pooled.
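The direction of the pooling bias can be illustrated with a minimal numerical sketch. It relies on the observation above that, when S and C lineages meet only at the root, the probability of ES is determined by the numbers of lineages entering the root. Two configurations are compared: a two-leaf tree with six S lineages in one species and six C lineages in the other (as in Fig. 6.4A), and a hypothetical extreme in which each S lineage is isolated in its own species until the root, so that all six S lineages enter the root. The second tree is an assumption made for illustration and is not the seven-leaf tree of Fig. 6.4D, whose internal branch lengths are not given in this text. The forms of g and f are likewise assumed standard forms.

```python
from math import comb, exp, factorial, prod


def g(i: int, j: int, t: float) -> float:
    """Assumed standard probability that i lineages have j ancestors after time t."""
    if not 1 <= j <= i:
        return 0.0
    total = 0.0
    for k in range(j, i + 1):
        coeff = (2 * k - 1) * prod(range(j, j + k - 1)) * prod(range(i - k + 1, i + 1)) / (
            factorial(j) * factorial(k - j) * prod(range(i, i + k))
        )
        total += (-1) ** (k - j) * coeff * exp(-k * (k - 1) * t / 2)
    return total


def f(s: int, c: int) -> float:
    """Assumed closed form for monophyly of s S lineages among s + c lineages."""
    return 2 * (s + c) / (s * (s + 1) * comb(s + c, s))


def es_pooled(s: int, c: int, t: float) -> float:
    """Two-leaf tree: S lineages coalesce in one leaf, C lineages in the other."""
    return sum(
        g(s, i, t) * g(c, j, t) * f(i, j)
        for i in range(1, s + 1)
        for j in range(1, c + 1)
    )


def es_isolated(s: int, c: int, t: float) -> float:
    """Hypothetical extreme: every S lineage stays alone until the root."""
    return sum(g(c, j, t) * f(s, j) for j in range(1, c + 1))


if __name__ == "__main__":
    for t in (0.0, 0.1, 0.5, 1.0, 2.0):
        print(t, es_pooled(6, 6, t), es_isolated(6, 6, t))
```

Because f decreases in its first argument, the pooled value can never fall below the isolated value under these assumptions, which is consistent with the overestimation described above.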

Application to Data

To illustrate the empirical use of Eq. 6.7 and to test whether our theoretical results reasonably replicate patterns in real data, we perform an analysis of monophyly frequencies using Zea mays maize and teosinte genomic data (Chia et al., 2012).

Hufford et al. (2012) analyzed 75 individuals from the data of Chia et al. (2012), considering four groups: teosinte varieties var. parviglumis (“parviglumis”) and var. mexicana (“mexicana”) and domesticated maize landraces (“landraces”) and improved lines (“improved”). Modifying the estimated tree of individuals from figure 1 in Hufford et al. (2012) to make a model “species” tree the leaves of which are the four groups (Fig. 6.5A), we compute theoretical monophyly probabilities for each of the groups via Eq. 6.7. We also estimate the empirical frequency of monophyly for each group by randomly sampling individuals from each group, constructing multiple gene trees per sample from SNP blocks, and averaging frequencies of monophyly in the gene trees over the random samples. This procedure employs 100 unique random samples of eight individuals from the Hufford et al. subset, each containing two individuals from each of the four groups. Finally, we compare the observed and theoretical monophyly frequencies.

The monophyly frequencies appear in Fig. 6.5B and are summarized in Table S2. The theoretical frequencies predict the observations reasonably well. For each clade, especially parviglumis and mexicana, the mean observed monophyly frequency over 100 samples closely coincides with the theoretical monophyly probability (Fig. 6.5B). Although the theoretical probability is noticeably below the mean for the improved and landrace clades and above the mean for parviglumis and mexicana, it lies well inside the observed distributions.

FIGURE 6.5 Monophyly frequencies in maize and teosinte. (A) Model species tree. (B) Violin-plot distributions across lineage subsamples of monophyly frequencies for four clades. Means of the observed distributions (excluding outliers for the improved and parviglumis clades) appear as circles and theoretical values appear as triangles. Outliers appear for a single point at frequency ~0.43 in the improved clade and for several points at frequency >0.17 in the parviglumis clade, with the cross indicating the mean of the parviglumis outliers (Supporting Information).

Eq. 6.7 relies on a model with selectively neutral loci and constant population size; a deviation from theoretical probabilities could suggest a violation of one of the model assumptions. Domestication imposes strong selection and population bottlenecks (Wang et al., 1999; Innan and Kim, 2004; Wright et al., 2005), factors that violate our model in a manner that would increase monophyly frequencies. Excess empirical monophyly in the improved and landrace clades (Fig. 6.5B, Table S2) is thus compatible with domestication in the history of these domesticated groups.

DISCUSSION

Extending a past computation (Rosenberg, 2003) from 2 to n species, we have obtained a general algorithm for the probability of any monophyly event of two lineage classes in a species tree of any size. In our generalization, unlike in previous calculations, no restriction exists on the class labeling of lineages, so that monophyly probabilities can be computed on samples aggregated across multiple species. We have uncovered behaviors absent in the two-species case, including nonmonotonicity of the monophyly probability in the tree height and positive limiting probabilities below 1. Both phenomena occur in scenarios newly possible to include in monophyly calculations, in which the lineage set whose monophyly is of interest spans multiple species, or in which lineages of at least one species span both classes.

We have used a pruning algorithm similar to other species tree computations (Efromovich and Kubatko, 2008; RoyChoudhury et al., 2008; Bryant et al., 2012; RoyChoudhury and Thompson, 2012; Stadler and Degnan, 2012; Wu, 2012) that evaluate a quantity at a parent node in terms of corresponding values for daughter nodes. In previous applications of this idea, the states recorded at a node are generally simpler than our input and output states. For example, in evaluating the time to the MRCA (Efromovich and Kubatko, 2008), they are one-dimensional; our approach instead tracks lineage classes as three variables, accommodating complex transitions that occur at interclass coalescences.

Previous work on monophyly probabilities has been limited to small numbers of species (Rosenberg, 2002, 2003; Degnan, 2010; Zhu et al., 2011; Eldon and Degnan, 2012). This limitation has forced investigators to either group multiple species together into a single clade (Carstens and Richards, 2007; Kubatko et al., 2011)—a choice that our tree-pooling experiment shows can overestimate monophyly probabilities—or to consider pairwise comparisons when multispecies analyses would be preferable (Baker et al., 2009; Neilson and Stepien, 2009; Bergsten et al., 2012). By identifying a bias that occurs when pooling distinct species in monophyly probability computations, our experiment suggests that pooling should be avoided when possible. Our results allow researchers to move beyond such simplifications by performing monophyly calculations in larger species groups.

One application of our results is to extend a test of a null hypothesis that an observed monophyletic pattern is due to chance alone (Rosenberg, 2007). This test has been available only in situations with species-specific lineages and two-species trees; it can now be extended to arbitrary trees and non-species-specific lineages. The results also provide a step toward computations for monophyly events on three or more lineage groups considered jointly.

As an empirical demonstration, we analyzed data from maize and teosinte, calculating theoretical and observed monophyly frequencies in four groups. The empirical frequencies generally match the predictions; frequencies exceeding predicted values in the domesticated species may reflect the fact that domestication bottlenecks and strong selection can violate our model in a manner that increases the likelihood of monophyly.

We note that our Z. mays results should be viewed with caution. We assumed a model of instantaneous divergence events without incorporating the subsequent gene flow that likely occurred in this system (Hufford et al., 2012). Furthermore, our model species tree contains uncertainty; however, we do not expect a bias in any specific direction to have resulted from its construction. Perhaps more seriously, we generated the model tree from the same study whose data we used for constructing gene trees. However, considerations of monophyly were irrelevant in producing the model tree, so that construction of the model did not guarantee the agreement we obtained between theoretical and observed monophyly.

The maize analysis illustrates how our framework can be used to study monophyly in multispecies genomic data. The formulas derived here allow for greater flexibility in studies of monophyly and its relationship to species trees, contributing to a more comprehensive toolkit for phylogeographic, systematic, and evolutionary studies.

MATERIALS AND METHODS

Maize Species Tree

We used maize HapMap V2 SNP data from www.panzea.org/#!genotypes/cctl (Chia et al., 2012) consisting of 55 million SNPs and small indels from 103 Z. mays inbred lines. To construct Fig. 6.5A, we determined relative branch lengths from figure 1 in Hufford et al. (2012). We chose a tree height of 0.04, measured in units of N generations, where N is the haploid population size, noting that a ~10,000-year domestication time (Hufford et al., 2012) translates via conversion factors calculated from figure 7 in Ross-Ibarra et al. (2009) (top panel, TD column) to 0.036 units of N generations. We chose our root as the root of the Hufford et al. ingroup tree (second node from left in figure 1 of Hufford et al. (2012), call it x), our Parviglumis/Domesticated node as the MRCA of all domesticated lineages and parviglumis lineages TIL01, TIL03, TIL11, and TIL14 [y = xLLLLL in figure 1 of Hufford et al. (2012), oriented so that L is “down” rather than “left”], and our Landrace/Improved node as the MRCA of all domesticated lineages [yL in figure 1 of Hufford et al. (2012)].

Maize Samples

We chose 100 samples of four lineage pairs, selecting randomly among 29 improved, 12 landrace, 8 parviglumis, and 2 mexicana individuals. We chose pairs within groups so that the Hufford et al. tree, a genomewide tree of individuals, restricted to each eight-lineage sample would display the model species tree in Fig. 6.5A, irrespective of which lineage in a pair was chosen to represent its group (Supporting Information).
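The basic sampling step can be sketched as follows. This is a simplified illustration: it draws unique eight-lineage samples with two individuals per group but does not apply the additional constraint on pairs described above, whose details are in the Supporting Information. The individual labels (`improved_0`, and so on), the function name, and the seed are hypothetical.

```python
import random

# Group sizes as stated in the text; individual labels are placeholders.
GROUP_SIZES = {"improved": 29, "landrace": 12, "parviglumis": 8, "mexicana": 2}


def draw_samples(n_samples: int = 100, per_group: int = 2, seed: int = 0):
    """Draw n_samples distinct samples, each with per_group individuals per group."""
    rng = random.Random(seed)
    pools = {
        group: [f"{group}_{k}" for k in range(size)]
        for group, size in GROUP_SIZES.items()
    }
    seen = set()
    while len(seen) < n_samples:
        # Sort within each group so that the same pair is always recorded identically.
        sample = tuple(
            tuple(sorted(rng.sample(pools[group], per_group)))
            for group in GROUP_SIZES
        )
        seen.add(sample)  # a repeated draw leaves the set unchanged
    return sorted(seen)


samples = draw_samples()

if __name__ == "__main__":
    print(len(samples))
    print(samples[0])
```

With only two mexicana individuals, every sample contains the same mexicana pair, so the variation among samples comes entirely from the other three groups.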

Maize Gene Trees

The maize genome has ~2.3 × 10^9 bp (Schnable et al., 2009), with linkage disequilibrium (LD) decay at ~1,500 bp (Remington et al., 2001). For simplicity and to accommodate large quantities of missing data, despite genomewide variation in recombination rate and SNP density, we fixed a single block size for analyses throughout the genome. With ~5 × 10^7 SNPs in the dataset, SNP density per “LD block” is (5 × 10^7 / 2.3 × 10^9) × 1,500 ≈ 32.6, which we round to 30. We divided the SNPs into nonoverlapping 30-SNP blocks and used every hundredth block in a concatenated genome starting from chromosome 1, resulting in ~6,000–7,000 gene trees per sample after removing blocks monomorphic in the sample and gene trees polytomic for the sample. We concatenated SNPs within blocks, computed blockwise Hamming distance matrices, and obtained gene trees using the hclust UPGMA (unweighted pair group method with arithmetic mean) clustering function in the R stats package. SNPs with missing data for a lineage pair were excluded in distance calculations.
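The per-block steps (pairwise Hamming distances with missing data excluded, UPGMA clustering, and a monophyly check) can be sketched in a few lines. This is a pure-Python stand-in for illustration, not the R hclust call used in the analysis. Several choices are assumptions: the missing-data code `"N"`, the use of the proportion of differing sites among jointly observed sites as the distance, the toy sequences, and the arbitrary breaking of ties (the analysis instead discarded polytomic gene trees).

```python
from itertools import combinations

MISSING = "N"  # hypothetical missing-data code


def hamming(a: str, b: str) -> float:
    """Proportion of differing sites among sites observed in both sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x != MISSING and y != MISSING]
    if not pairs:
        raise ValueError("no jointly observed sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma_clades(seqs):
    """Return the clades (leaf sets) created by successive UPGMA merges."""
    dist = {
        frozenset((a, b)): hamming(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }

    def average(c1, c2):
        # UPGMA distance: mean of all leaf-to-leaf distances between clusters.
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [frozenset([name]) for name in seqs]
    clades = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average(*pair))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        clades.append(merged)
    return clades


def is_monophyletic(group, clades) -> bool:
    """A group is monophyletic if its members form exactly one clade of the tree."""
    group = frozenset(group)
    return len(group) == 1 or group in clades


# Toy 10-SNP block for four lineages in two groups.
seqs = {
    "a1": "AAAAAAAAAA",
    "a2": "AAAAAAAAAT",
    "b1": "TTTTTAAAAA",
    "b2": "TTTTTAAANA",
}
clades = upgma_clades(seqs)

if __name__ == "__main__":
    print(is_monophyletic({"a1", "a2"}, clades))
    print(is_monophyletic({"a1", "b1"}, clades))
```

Averaging the monophyly indicator over blocks within a sample, and then over samples, would give an empirical frequency of the kind compared with the theoretical value.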

Software Implementation

The Monophyler software package implementing Eqs. 6.7, 6.8, and 6.9 can be found at rosenberglab.stanford.edu/monophyler.html.

ACKNOWLEDGMENTS

We thank Jeff Ross-Ibarra for assistance with the maize data and John Rhodes and two reviewers for comments on a draft of the manuscript. We acknowledge support from NIH Grant R01 GM117590, NSF Grant DBI-1458059, a New Zealand Marsden grant, and a Stanford Graduate Fellowship.
