α-helices; and (ii) an amino-terminal N domain consisting of a five-stranded antiparallel β-sheet and four or five α-helices that form a lid-like structure covering the barrel of an adjacent large subunit. Furthermore, the active site is well characterized. It consists of charged and polar residues at the carboxyl-terminal ends of the β-strands in the α/β-barrel of the C domain of one subunit and asparagine and glutamic acid residues from the lid of the adjacent N domain.
One approach to examining patterns of amino acid replacements in a structural context is to map amino acid replacements on a fully resolved, unambiguous tree. Since the evolutionary relationships among all of the >750 taxa are not known with certainty, a subset of the taxa was chosen for this analysis. We used 105 taxa, including three prokaryotes, four algae (including Charophytes), five bryophytes and ferns, eight gymnosperms, and 85 angiosperms (for details, contact the authors). In general, conventional phylogenetic relationships that were supported by the analysis of the rbcL nucleotide sequence data (Chase et al., 1993; Duvall et al., 1993) were used to construct this tree. The amino acid sequences for the 105 taxa were translated from the nucleotide sequences and aligned (along with the >700 amino acid sequences in the large data set) by using the following method. A preliminary alignment obtained using CLUSTAL V (Higgins et al., 1992) was refined by aligning similar, presumably homologous features of the solved three-dimensional crystal structures for the RuBisCo large subunit. Nine gaps were required to align the sequences. There were 494 sites in the aligned data set. The computer program MACCLADE version 3.04 (Maddison and Maddison, 1992) was then used to locate amino acid replacements on branches of the tree and to count the total number of amino acid changes required through the phylogeny.
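The counting step described above can be illustrated with a minimal sketch of Fitch parsimony for a single alignment site on a rooted binary tree. This is a standard method for unordered characters and is offered only as an illustration; it is not necessarily the exact procedure MACCLADE applies. The function name `fitch_site`, the nested-tuple tree encoding, and the five taxon labels are hypothetical and are not taken from the data set.

```python
def fitch_site(tree, states):
    """Fitch parsimony for one site on a rooted binary tree.

    tree   : nested 2-tuples whose leaves are taxon names (hypothetical encoding)
    states : dict mapping each taxon name to its residue at this site
    Returns (minimum number of changes, candidate residue set at the root).
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed residue
            return {states[node]}
        left = visit(node[0])
        right = visit(node[1])
        shared = left & right
        if shared:                         # children agree: no change needed here
            return shared
        changes += 1                       # children disagree: one replacement
        return left | right

    root_set = visit(tree)
    return changes, root_set


# Hypothetical five-taxon tree: (((t1, t2), t3), (t4, t5))
tree = ((("t1", "t2"), "t3"), ("t4", "t5"))

# Site with a single Glu -> Asp replacement on the branch to t2
site_a = {"t1": "E", "t2": "D", "t3": "E", "t4": "E", "t5": "E"}
# Site whose root residue cannot be resolved between Ala and Ser
site_b = {"t1": "A", "t2": "S", "t3": "S", "t4": "A", "t5": "A"}

print(fitch_site(tree, site_a))
print(fitch_site(tree, site_b))
```

Summing the first returned value over all aligned sites gives the total number of changes required on the tree. A root set containing more than one residue, as for `site_b`, is one simple sign that the ancestral state at that site cannot be reconstructed unequivocally.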
Of the 1350 amino acid replacements, 762 could be unambiguously inferred; 568 and 182 were in α-helices and β-strands, respectively. Only 488 (36%) of the replacements were in the α/β-barrel structure, which constitutes 46% of the sites. For the complete sequence, the most common unambiguous replacements were Glu -> Asp and Ala -> Ser (40 and 35 changes, respectively). When the number and types of replacements were examined for the various structures, interesting patterns emerged. Figure 2 shows the fractions of sites with replacements and the number of changes for the complete sequences, α-helices, β-strands, and other structures. The distributions are highly skewed, indicating that some sites may accept as many as 26 replacement events while other sites do not accept amino acid replacements. [Character state distributions (amino acids at a given site) for the 105-taxon dendrogram do not allow unequivocal reconstruction of the particular residues at all nodes in the dendrogram. For these residues the number