of estimating a demographic model is time-consuming and computationally intensive, and it requires substantial population genetic expertise.
Thus far, the bottom-up approach to studying domestication has been applied only to maize. Wright et al. (2005) formulated a demographic model of maize domestication using sequence polymorphism data from 793 genes in 14 maize inbred lines and 16 haploid plants from its wild progenitor, teosinte. With these data, they first sought to estimate a plausible demographic model and then asked whether the data were more likely if directional selection on a subset of loci was included in the model. Applying a novel likelihood ratio approach to this problem, they estimated that 2–4% of their loci were linked to a target of artificial selection during domestication. Their approach also allowed them to rank loci in terms of evidence for selection. The list of selected genes is enriched for transcription factors, genes implicated in plant growth, and genes involved in amino acid biosynthesis. Moreover, genes identified as targets of selection cluster nonrandomly around previously identified QTLs for domestication traits (Wright et al., 2005) and are more highly expressed than random genes only in the maize ear (K. M. Hufford and B.S.G., unpublished results), an organ expected a priori to be the target of selection.
The demographic approach for finding candidate “adaptive” genes is model-intensive. As an alternative to estimating the demographic model, several studies have simply ranked genes empirically (Toomajian et al., 2006; Voight et al., 2006). This is an acceptable, but not optimal, solution based on a straightforward idea. Under the selection scenario described in Fig. 11.2, one expects that genes contributing to adaptive traits should have low genetic variation or skewed allele frequencies compared with nonselected genes (Tajima, 1989). Without knowing the exact demographic model, it thus makes sense to assay genetic polymorphism in a number of genes, compare them, and rank them by summary statistics. The candidate gene, if selected, should fall into the extreme tail of the empirical distribution of a summary statistic such as S, the number of SNPs in the gene, or Tajima’s D, a measure of the allele frequency spectrum. If the gene is extreme, then the polymorphism data are consistent with an adaptive hypothesis. This idea can be applied to a genome-wide sample of genes to identify candidate genes de novo via bottom-up methods, or, alternatively, to compare a candidate gene identified by top-down approaches to a sample of reference loci.
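The ranking idea can be illustrated with a minimal Python sketch. It is not the procedure used in any of the studies cited above. It assumes that each locus is available as a list of aligned, equal-length haplotype strings with no missing data, computes S and Tajima's D with the standard formulas of Tajima (1989), and reports where a candidate falls in the lower tail of a set of reference values. The function names (`segregating_sites_and_pi`, `tajimas_d`, `lower_tail_rank`) are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from math import sqrt


def segregating_sites_and_pi(haplotypes):
    """Return (S, pi) for a list of aligned, equal-length haplotype strings.

    S is the number of segregating sites; pi is the mean number of
    pairwise differences between haplotypes. Assumes at least two
    haplotypes and no missing data.
    """
    n = len(haplotypes)
    n_pairs = n * (n - 1) / 2
    num_segregating = 0
    pairwise_diffs = 0.0
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) > 1:
            num_segregating += 1
            # Number of haplotype pairs that differ at this site.
            pairwise_diffs += (n * n - sum(c * c for c in counts.values())) / 2
    return num_segregating, pairwise_diffs / n_pairs


def tajimas_d(haplotypes):
    """Tajima's D, or None when it is undefined (n < 4 or S == 0)."""
    n = len(haplotypes)
    if n < 4:
        return None  # the variance term is zero for n < 4
    s, pi = segregating_sites_and_pi(haplotypes)
    if s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def lower_tail_rank(candidate_value, reference_values):
    """Fraction of reference loci whose statistic is <= the candidate's."""
    at_or_below = sum(1 for v in reference_values if v <= candidate_value)
    return at_or_below / len(reference_values)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    candidate = ["1000", "0100", "0010", "0001"]  # all singletons
    reference_loci = [
        ["11", "11", "00", "00"],
        ["110", "101", "011", "000"],
        ["1000", "1100", "0011", "0001"],
    ]
    reference_d = [tajimas_d(locus) for locus in reference_loci]
    reference_d = [d for d in reference_d if d is not None]
    print(tajimas_d(candidate), lower_tail_rank(tajimas_d(candidate), reference_d))
```

A small lower-tail fraction indicates only that the candidate is unusual relative to the reference loci. Because no demographic model is fitted, the fraction should not be read as a formal P value.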
Although empirical ranking is a suitable approach, its efficacy depends greatly on the particulars of individual evolutionary histories