genome into functional categories and we then study the pattern of mutational change within each functional category. The functional categories are (i) DNA regions that do not code for tRNA, ribosomal RNA (rRNA), or protein (referred to as "noncoding DNA"); (ii) protein-coding genes; and (iii) chloroplast introns. The methodological approach is that of comparative sequence analysis, where complete DNA sequences from phylogenetically structured samples are analyzed to reveal the pattern of mutational change in evolution.
The chloroplast genome is highly condensed compared with eukaryotic nuclear genomes; for example, only 32% of the rice chloroplast genome is noncoding. Most of this noncoding DNA is found in very short segments separating functional genes. A number of recent studies have revealed complex patterns of mutational change in noncoding regions. The most widely studied noncoding region of the chloroplast genome is the region downstream of the rbcL gene in the grass family (Poaceae). This noncoding sequence is flanked by the genes rbcL and psaI (the gene encoding photosystem I polypeptide I) and is 1694 bases long in the rice chloroplast genome, making it one of the longest noncoding regions of the genome and the longest when introns are excluded (Figure 1). A pseudogene for the chloroplast gene rpl23 is also located within this noncoding segment (Hiratsuka et al., 1989). Hiratsuka et al. (1989) argued that a large inversion, unique to the grass family, arose through a recombinational interaction between nonhomologous tRNA genes. The same process of recombination between short repeats has been invoked as the mechanism responsible for the origin of the rpl23 pseudogene, Ψrpl23 (Ogihara et al., 1992).
The functional rpl23 gene of the rice genome is located in the inverted repeat about 27 kb away. Bowman et al. (1988) suggested that the pseudogene was being converted by the functional rpl23 gene because the genetic divergence among pseudogenes was lower than the divergence observed for the surrounding noncoding regions. Subsequent work, based on a phylogenetic analysis (Morton and Clegg, 1993), has provided additional support for a model of gene conversion between the rpl23 pseudogene and its functional counterpart in at least two lineages of the grass family.
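The comparison underlying the gene conversion argument, lower divergence among pseudogene copies than among the surrounding noncoding sequences, can be illustrated with a minimal sketch. It assumes pre-aligned sequences and uses the uncorrected proportion of differing sites (p-distance) as the divergence measure; the cited studies may have used other estimators. The function names and the toy sequences are hypothetical and are not data from any of the cited work.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def mean_pairwise_divergence(seqs):
    """Average p-distance over all pairs of sequences in one region."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignments, one sequence per species (illustration only).
pseudogene_region = ["ATGGCTAAGC", "ATGGCTAAGC", "ATGGCTAAGT"]
flanking_region = ["TTAGCCGATA", "TTGGCCGTTA", "CTAGACGATA"]

d_pseudo = mean_pairwise_divergence(pseudogene_region)
d_flank = mean_pairwise_divergence(flanking_region)

# A lower value for the pseudogene is the pattern described in the text.
print(f"pseudogene: {d_pseudo:.3f}  flanking: {d_flank:.3f}")
```

A lower mean divergence in the pseudogene than in its flanking sequence is consistent with conversion by the functional copy, but it is not sufficient on its own to establish it; the phylogenetic analysis cited above addresses that question more directly.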
Four independent deletion events of at least 850 bases in length have been observed spanning almost identical stretches of the noncoding region between rbcL and psaI (Morton and Clegg, 1993; Ogihara et al., 1988). Based on flanking sequence data, it has been suggested that recombination between short direct repeats was responsible for these