CALIBRATING THE CLOCK: USING STOCHASTIC PROCESSES TO MEASURE THE RATE OF EVOLUTION

OVERVIEW

To illustrate the methods, we use a set of North American Indian mitochondrial sequences described in Ward et al. (1991). These authors sequenced the first 360 base pairs of the mitochondrial control region for a sample of 63 Nuu-Chah-Nulth (Nootka) Indians from Vancouver Island. The sample comprises individuals who were maternally unrelated for four generations, chosen from 13 of the 14 tribal bands. As a consequence the sample deviates from a truly random sample, although it will be treated as such for the purposes of this chapter.

An important parameter in the analysis is the effective population size of the group. This is approximated by the number of reproducing females, giving a value of about 600 for the long-term effective population size N.

The most common DNA changes seen in mitochondria are transitions (changes from one pyrimidine base to the other or one purine base to the other, that is, C ↔ T or A ↔ G) rather than transversions (changes from a pyrimidine to a purine or vice versa). Indeed, the sequenced region shows no transversions, so that each site in the sequences has one of just two possible nucleotides. We focus on the pyrimidine (C or T) sites in the region. There are 201 such sites, in which 21 variable (or segregating) sites define 24 distinct sequences (called alleles or lineages). The details of the data, including the allele frequencies, are given in Table 5.1.

The parameter of particular interest here is θ, the population geneticist's stock in trade. The parameter θ is a measure of the mutation rate in the region, and it figures in many important theoretical formulas in population genetics. For mitochondrial data, it is defined by θ = 2Nu, where N is the effective population size referred to earlier, and u is the mutation rate per gene per generation. Once θ is estimated, we can estimate u if N is known or N if u is known.
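The relationship θ = 2Nu can be turned into a small calculation. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the value of θ used below is an arbitrary placeholder chosen only to show the arithmetic, not an estimate obtained from the data. Only N = 600 is taken from the text.

```python
def mutation_rate_from_theta(theta: float, n_eff: float) -> float:
    """Solve theta = 2 * N * u for u (mutation rate per gene per generation)."""
    if n_eff <= 0:
        raise ValueError("effective population size must be positive")
    return theta / (2.0 * n_eff)


def population_size_from_theta(theta: float, u: float) -> float:
    """Solve theta = 2 * N * u for N (effective population size)."""
    if u <= 0:
        raise ValueError("mutation rate must be positive")
    return theta / (2.0 * u)


# N is the long-term effective population size quoted in the text.
N_EFF = 600

# ASSUMPTION: placeholder value of theta for illustration only;
# it is not an estimate from the Nuu-Chah-Nulth sample.
THETA_PLACEHOLDER = 3.0

u = mutation_rate_from_theta(THETA_PLACEHOLDER, N_EFF)
n_back = population_size_from_theta(THETA_PLACEHOLDER, u)

print(f"u = {u:.6f} per gene per generation")
print(f"N recovered from theta and u = {n_back:.1f}")
```

Because θ is the product of N and u, the data alone cannot separate the two; one of them has to be supplied from outside information, which is why the sketch takes either N or u as an input.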
In what follows, we estimate the compound parameter θ rather than its components. In the section immediately following, we begin by outlining the structure of the coalescent, a robust description of the genealogy of samples taken from large populations. The effects of mutation are superimposed on this genealogy in several ways. The classical case, which