The parameters used for the simulations were (i) a length of 118 potentially variable amino acids (162 aligned positions minus 44 permanently invariable positions); (ii) number of covarions = 28 (determined as shown in Results); (iii) persistence = 0.01; (iv) two alternative amino acids at each of half the variable sites and three alternatives at each of the other half; and (v) clock times set by the paleontological dates in Table 1, with a rate of six replacements per 10 Myr.
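The arithmetic behind these parameters can be sketched as follows. This is only an illustrative sketch, not the simulation program itself: the class name `SimulationParams` and its field and method names are hypothetical, and two readings are assumptions, namely that the variable sites split evenly (59 and 59) and that the rate of six replacements per 10 Myr applies to the whole sequence along one lineage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationParams:
    """Hypothetical container for the simulation parameters listed in the text."""
    aligned_positions: int = 162        # total aligned positions
    invariable_positions: int = 44      # permanently invariable positions
    n_covarions: int = 28               # concomitantly variable codons
    persistence: float = 0.01
    replacements_per_10_myr: float = 6.0

    @property
    def variable_positions(self) -> int:
        # 162 aligned positions minus 44 invariable positions
        return self.aligned_positions - self.invariable_positions

    @property
    def sites_by_alternatives(self) -> dict:
        # Assumption: an even split, half with two alternatives, half with three
        half = self.variable_positions // 2
        return {2: half, 3: self.variable_positions - half}

    def expected_replacements(self, myr: float) -> float:
        # Assumption: the rate is per sequence along a single lineage
        return self.replacements_per_10_myr * myr / 10.0


params = SimulationParams()
print(params.variable_positions)      # potentially variable amino acids
print(params.sites_by_alternatives)   # number of sites per count of alternatives
print(params.expected_replacements(100.0))
```

Under these assumptions, a lineage that has been evolving for 100 Myr would be expected to accumulate 60 replacements before any correction for the limited number of covarions.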
The SODs are abundant enzymes in aerobic organisms, with a highly specific activity that protects the cell against the harmful effects of free oxygen radicals (Fridovich, 1986). The SODs have active centers that contain either iron, manganese, or both copper and zinc (Fridovich, 1986). The Cu,Zn SOD is a well-studied protein found in eukaryotes and also in some bacteria (Steinman, 1988). Its amino acid sequence is known for many organisms, including plants, animals, fungi, and bacteria. The three-dimensional structure of the bovine SOD has been determined at 2-Å resolution (Tainer et al., 1982); it is conserved in humans (Getzoff et al., 1989) and presumably in Drosophila (Kwiatowski et al., 1992) and bacteria (Bannister and Parker, 1985). The amino acids essential for catalytic action (Tainer et al., 1983), as well as those essential for protein structure, are strongly conserved (Getzoff et al., 1989; Kwiatowski et al., 1992).
The Phylogeny. The tree used for this study is shown in Figure 1. It is not the most parsimonious tree. Rather, we required the tree to conform to what is believed to be correct on a priori grounds (i.e., based on knowledge that is independent of the SOD data), and we let parsimony dictate only those regions of the tree where other evidence does not seem to us to be determinative. We did so because we wish to optimize the correctness of the tree rather than its parsimony: the estimates of divergence against which clock measures are to be tested will be more valid the closer the tree is to reality. The accuracy of the covarion estimate is similarly constrained. The most parsimonious tree we have found requires 1940 nucleotide substitutions; the tree in Figure 1 requires 1984. The sequences used were amino acid sequences, back-translated into ambiguous codons so that the changes are counted as nucleotide substitutions rather than amino acid replacements, although nearly all substitutions are replacement substitutions. The number of differences between pairs of sequences used for the clock test does not depend upon the topology of the tree. The average differences are shown in Table 1 for those contrasts for which there are reasonable, nonmolecular, paleontological dates.
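Pairwise differences are independent of the tree because they are computed directly from the aligned sequences, as the minimal sketch below illustrates. It rests on assumptions: the function name `pairwise_differences` and the toy sequences are hypothetical, differences are counted at the amino acid level for simplicity rather than on back-translated codons, and positions where either sequence has a gap are skipped.

```python
from itertools import combinations


def pairwise_differences(seqs, gap="-"):
    """Count differing aligned positions for every pair of sequences.

    seqs maps a sequence name to an aligned string; all strings are
    assumed to have the same length. Positions with a gap in either
    sequence are ignored (an assumption of this sketch).
    """
    result = {}
    for a, b in combinations(sorted(seqs), 2):
        result[(a, b)] = sum(
            1
            for x, y in zip(seqs[a], seqs[b])
            if x != y and gap not in (x, y)
        )
    return result


# Hypothetical toy alignment, not real SOD data
toy = {
    "sp_A": "MKAVCVLKG",
    "sp_B": "MKAVCVLRG",
    "sp_C": "MRAV-VLRG",
}

for pair, n_diff in pairwise_differences(toy).items():
    print(pair, n_diff)
```

No tree appears anywhere in the calculation, so rearranging the phylogeny leaves these counts unchanged; only the inferred number of substitutions along the branches depends on topology.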