several other eastern African populations (Underhill et al., 2000; Cruciani et al., 2002; Semino et al., 2002, 2004; Knight et al., 2003). Consistent with the mtDNA and NRY data, our MDS analysis shows that the Hadza and Sandawe cluster closely with each other and with other eastern African populations (Fig. 5.3). Additionally, the Hadza are slightly farther from the SAK than the Sandawe are along both dimensions (Fig. 5.3).

Tishkoff et al. (2009) provide evidence for an ancient common ancestry of Khoesan and Pygmy populations, suggesting the possibility of a proto-Khoesan hunter-gatherer population in eastern Africa that diverged >30 kya. STRUCTURE analysis revealed that the Pygmies cluster with other hunter-gatherer samples, including the SAK, Hadza, and Sandawe, at low K values (K = 3) and then differentiate at higher K values (K = 5) (Table 5.1). The analysis also shows that the Mbuti Pygmies cluster with the SAK at higher K values (K = 7), which could be due to either common ancestry or more recent gene flow. In addition, recent work on mtDNA, NRY, and autosomal data estimated the TMRCA of the Pygmy and agricultural populations to be approximately 60–70 kya and the TMRCA of western and eastern Pygmies to be approximately 20 kya (Destro-Bisol et al., 2004; Quintana-Murci et al., 2008; Patin et al., 2009). The findings of Tishkoff et al. (2009) raise the possibility that the Pygmy populations, who have lost their indigenous language, once spoke some form of proto-Khoesan with click consonants. Interestingly, linguistic analysis of the SAK suggests that they originated in eastern Africa, possibly as far north as Ethiopia, before migrating into southern Africa. This is consistent with the identification of rock art in the Sandawe homeland and in southern Africa that is thought to be related to Khoesan speakers (Lim, 1992). There is further evidence that, although there has not been recent gene flow among these populations, there has been recent admixture between the Sandawe and neighboring populations as well as between the Pygmies and neighboring populations. This recent admixture may be obscuring the more ancient relationships among the hunter-gatherer populations (Tishkoff et al., 2009). Future analyses that incorporate data from across the genome, together with full-likelihood or approximate Bayesian computation methods, will be necessary to understand these complex population histories more fully.


We have presented here a synthesis of the archaeological, linguistic, and genetic data used to infer African population history. The general picture that emerges is that genetic variation in Africa is structured geographically and, to a lesser extent, linguistically. This is consistent with the fact that populations in close geographic proximity to each other as well

