Hayasaka, 1990; Vigilant et al., 1989, 1991; Ward et al., 1991). Studies of mitochondrial sequences of different Native American tribes strongly suggest that there were multiple waves of colonization of North America by migrant groups from Asia, and even allow one to estimate the dates of these events (Schurr et al., 1990; Ward et al., 1991). Assuming a constant evolutionary rate, the pattern of mutations between diverse human groups has been used to argue (Cann et al., 1987) that the mitochondria of all living humans descended from a mother that lived in Africa some 200,000 years ago: the so-called Eve hypothesis. Although the precise details of the hypothesis are disputed (Maddison, 1991; Nei, 1992; Templeton, 1992), the general power of the methodology is well accepted. (As an aside, the reader should note that the existence of a common ancestor, Eve, so to speak, is a mathematical necessity in any branching process that satisfies very weak conditions. The biological controversies pertain to when and where Eve lived.)
Each of these applications requires a knowledge of the rate at which mutations occur in an mtDNA sequence. Estimates of this rate have been obtained by comparing a single DNA sequence from each of several species whose times of divergence are presumed known. Divergence is calculated from the number of nucleotide differences between species (using methods that correct for the possibility of multiple mutations at a site), and rate estimates are obtained by dividing the amount of sequence divergence by the divergence time. For data taken from multiple individuals in a single population, one requires a model that takes account of the population genetic aspect of the sampling: individuals in the sample are correlated by their common ancestry. In this chapter, we describe the underlying stochastic structure of this ancestry and use the results to estimate substitution rates.
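To make the between-species calculation described above concrete, the following is a minimal sketch rather than the method used for any particular published estimate. It assumes the Jukes-Cantor one-parameter model as the correction for multiple mutations at a site (one of several possible corrections; the text does not specify which is used), and it follows the text in dividing the corrected divergence by the divergence time. The function names and the toy inputs in the usage note are hypothetical, chosen only for illustration.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def jukes_cantor_distance(p):
    """Corrected divergence (expected substitutions per site).

    Assumption: the Jukes-Cantor model, under which all substitutions
    are equally likely. The correction is d = -(3/4) ln(1 - 4p/3),
    defined only for 0 <= p < 3/4.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def rate_estimate(divergence, divergence_time):
    """Divergence divided by divergence time, as described in the text.

    The result is in substitutions per site per unit of time, counted
    over both lineages; halve it for a per-lineage rate.
    """
    if divergence_time <= 0:
        raise ValueError("divergence_time must be positive")
    return divergence / divergence_time
```

For example, with two hypothetical aligned sequences `a` and `b` and a presumed divergence time `t` in years, `rate_estimate(jukes_cantor_distance(p_distance(a, b)), t)` gives a rate in substitutions per site per year. The corrected divergence always exceeds the raw proportion of differences when that proportion is positive, because some sites will have mutated more than once. This calculation treats each species as a single sequence and so does not address the correlation among individuals sampled from one population.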
We have chosen to focus on rate estimation to give the chapter a single theme. We are not interested per se in statistical aspects of tests for selective neutrality of DNA differences; rather, we assume neutrality for the data sets discussed as examples. The techniques described here should be regarded as illustrative of the theoretical and practical problems that arise in sequence analysis of samples from closely related individuals. The emphasis is on exploratory methods that might be used to summarize the structure of such samples.