natural selection. Natural selection can account for the absence of synonymous variation at any one of the 10 loci shown in Table 2 if that gene (or a gene linked to it) has been subject to a recent worldwide selective sweep, without sufficient time for the accumulation of new synonymous mutations. However, the 10 genes are located on at least six different chromosomes (Table 2), and thus six independent selective sweeps would need to have occurred more or less concurrently, which seems prima facie unlikely. A selective sweep simultaneously affecting all chromosomes could happen if one accepts the hypothesis that the population structure of P. falciparum is predominantly clonal, rather than sexual (see Escalante and Ayala, 1994; Ayala et al., 1999). This hypothesis is controversial, although we have argued that it may indeed be the case, the capacity for sexual reproduction of the parasite notwithstanding (Rich et al., 1997; Ayala et al., 1999).
The absence of synonymous polymorphisms in most P. falciparum genes must be reconciled with the substantial levels of polymorphism observed in such antigenic genes as Csp, Msp-1, and Msp-2. We propose that nucleotide polymorphism in antigenic genes arises through natural selection acting on the variation generated by two different “mutation” processes. First, the familiar process of single-site nucleotide mutation generates amino acid replacements that give rise to polymorphisms at antigenic sites subject to diversifying selection. Second, intragenic recombination generates variation at a rapid rate in the repetitive segments (often arranged in tandem) of antigenic genes. The variation generated by intragenic recombination is also subject to diversifying natural selection because it contributes to the parasite's ability to evade the immune response of the human host. We will show that some of the reported nucleotide variation between antigenic alleles is an artifact stemming from misalignment of gene sequences that differ in length because intragenic recombination has generated unequal numbers of repeats.
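The misalignment artifact can be illustrated with a toy calculation. The sketch below is not the alignment procedure used in this study; it assumes two hypothetical 12-nt repeat units (`UNIT_A`, `UNIT_B`) and two hypothetical alleles that differ only by the loss of one repeat unit, and all function names are introduced here for illustration. It compares a naive left-anchored, ungapped comparison with one that allows gaps of whole repeat units.

```python
from itertools import combinations

REPEAT_LEN = 12  # assumed unit length for this toy example

# Hypothetical repeat units, for illustration only (not data from Table 2).
UNIT_A = "AATGCAAACCCA"
UNIT_B = "AATGTAGATCCT"


def hamming(s, t):
    """Count mismatched positions between two equal-length strings."""
    assert len(s) == len(t)
    return sum(a != b for a, b in zip(s, t))


def split_units(seq, unit_len=REPEAT_LEN):
    """Cut a repetitive region into consecutive fixed-length units."""
    assert len(seq) % unit_len == 0
    return [seq[i:i + unit_len] for i in range(0, len(seq), unit_len)]


def naive_mismatches(seq1, seq2):
    """Left-anchored, ungapped comparison over the shared length."""
    n = min(len(seq1), len(seq2))
    return hamming(seq1[:n], seq2[:n])


def repeat_aware_mismatches(seq1, seq2, unit_len=REPEAT_LEN):
    """Minimum mismatches when whole-unit gaps are allowed in the shorter allele.

    Brute force over which units of the longer allele are paired with the
    units of the shorter one (order preserved); adequate for toy inputs only.
    """
    units = [split_units(seq1, unit_len), split_units(seq2, unit_len)]
    long_u, short_u = sorted(units, key=len, reverse=True)
    return min(
        sum(hamming(long_u[i], u) for i, u in zip(kept, short_u))
        for kept in combinations(range(len(long_u)), len(short_u))
    )


# Hypothetical alleles: allele_2 equals allele_1 minus one UNIT_A repeat.
allele_1 = UNIT_A + UNIT_A + UNIT_B + UNIT_A + UNIT_A
allele_2 = UNIT_A + UNIT_B + UNIT_A + UNIT_A

print(naive_mismatches(allele_1, allele_2))         # 8 apparent differences
print(repeat_aware_mismatches(allele_1, allele_2))  # 0 once the unit gap is placed
```

In this toy case the ungapped comparison reports eight nucleotide differences, although the two alleles contain no point substitutions and differ only in repeat number; every apparent difference disappears once a one-unit gap is allowed.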
The Csp gene consists of two nonrepetitive terminal regions (5′ NR and 3′ NR), which flank a central region (CR) made up of a variable number (mostly between 40 and 50) of tandemly arranged 12-nt-long repeats. As shown in Table 2, there are no silent polymorphisms in the 5′ NR and 3′ NR regions of the gene, which is part of the evidence supporting a recent origin of P. falciparum populations.