identities of HM mutations present in such a data set. After partitioning the excess caused by HM mutations out of the data set, we can determine whether the remaining excess is consistent with what we would expect given our sampling scheme. We made this determination by comparison with trees constructed from samples of a simulated data set.
Fig. 1 shows the phylogenetic tree for which we recently reported an excess of mutations assigned to the terminal branches (Bush et al., 1999a, b). This tree was constructed with the maximum parsimony routine of PAUP* 4.0b2 (Swofford, 1999) from 357 sequences, each 987 nt in length, obtained from isolates collected between 1983 and September 30, 1997 (Bush et al., 1999a, b). The terminal nodes of a tree are the sequences obtained from isolates in the laboratory. Internal nodes are the ancestors of the terminal nodes as reconstructed by the parsimony algorithm. A terminal branch attaches a terminal node, that is, the sequence from an isolate, to the tree. All other branches are internal branches.
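The distinction between terminal and internal branches can be illustrated with a minimal sketch. The toy tree, the node names, and the function classify_branches below are hypothetical and are not taken from the data set or from PAUP*; the sketch assumes a rooted tree stored as a child-to-parent mapping, with each branch identified by the node at its lower end.

```python
# Toy rooted tree stored as child -> parent.
# All node names are hypothetical; they are not isolates from the study.
parent = {
    "anc1": "root",
    "anc2": "root",
    "isolate_A": "anc1",
    "isolate_B": "anc1",
    "isolate_C": "anc2",
    "isolate_D": "anc2",
}


def classify_branches(parent):
    """Label each branch, identified by its child node, as terminal or internal.

    A branch is terminal when its child node has no descendants, that is,
    when the child is an observed sequence rather than a reconstructed ancestor.
    """
    nodes_with_children = set(parent.values())
    return {
        child: ("internal" if child in nodes_with_children else "terminal")
        for child in parent
    }


branch_type = classify_branches(parent)
terminal = sorted(c for c, t in branch_type.items() if t == "terminal")
internal = sorted(c for c, t in branch_type.items() if t == "internal")
```

In this toy tree the four isolate branches are terminal and the two branches leading to reconstructed ancestors are internal.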
We use the term egg isolates when referring to the 152 isolates that were propagated in embryonated chicken eggs in the laboratory. The egg isolates also may have been previously propagated in cell culture. We use the term cell isolates to refer to the 148 isolates propagated in cell culture but never in eggs. The remaining sequences were obtained from direct PCR (n = 3) or from isolates of partially unknown passage history (n = 54). The propagation histories of these isolates (GenBank accession nos. AF008656–AF008909 and AF180564–AF180666) can be found in the curated influenza database at Los Alamos National Laboratory (http://www.flu.lanl.gov/).
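As a consistency check, the four passage categories above should account for all 357 sequences. The sketch below also counts the accession numbers in the two GenBank ranges; it assumes that every accession number within each range corresponds to exactly one sequence in the data set, which the text does not state explicitly. The dictionary keys and the helper accession_range_size are illustrative names only.

```python
# Isolate counts by passage history, as given in the text.
counts = {
    "egg": 152,
    "cell": 148,
    "direct_pcr": 3,
    "partially_unknown": 54,
}
total_isolates = sum(counts.values())


def accession_range_size(first, last, prefix_len=2):
    """Number of accession numbers in an inclusive range such as AF008656-AF008909.

    Assumes both accessions share the same alphabetic prefix of length
    prefix_len and that the numeric parts are contiguous.
    """
    return int(last[prefix_len:]) - int(first[prefix_len:]) + 1


# Assumption: one sequence per accession number in each range.
n_accessions = (
    accession_range_size("AF008656", "AF008909")
    + accession_range_size("AF180564", "AF180666")
)
```

Under that assumption, both totals agree with the 357 sequences used to construct the tree in Fig. 1.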
For this study we constructed additional trees by using two different samples of the original data set. Trees constructed using only the 152 sequences obtained from isolates propagated in eggs or using only the 148 sequences from viruses propagated in cells are referred to as the egg tree and the cell tree, respectively. Twenty-two codons (Table 1) have been reported to undergo HM replacements in influenza isolates grown in eggs (Bush et al., 1999a). We refer to these 22 codons as the HM codon set and to the other 307 codons as the non-HM codon set. Silent and nonsilent nucleotide substitutions are abbreviated as S and NS, respectively.
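The codon partition and the S/NS labels can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the analyses: the standard genetic code is assumed, codons are numbered 1 to 329 within the 987-nt sequence, and the 22 positions in placeholder_hm are arbitrary stand-ins for the HM codons listed in Table 1. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

SEQ_LENGTH_NT = 987
N_CODONS = SEQ_LENGTH_NT // 3  # 987 nt correspond to 329 codons


def classify_substitution(ancestral_codon, derived_codon):
    """Return 'S' for a silent change and 'NS' for a nonsilent change."""
    unchanged = CODON_TABLE[ancestral_codon] == CODON_TABLE[derived_codon]
    return "S" if unchanged else "NS"


def partition_codons(hm_codons, n_codons=N_CODONS):
    """Split codon positions 1..n_codons into the HM set and the non-HM set."""
    hm = set(hm_codons)
    non_hm = set(range(1, n_codons + 1)) - hm
    return hm, non_hm


# PLACEHOLDER positions only: the real HM codons are those listed in Table 1.
placeholder_hm = set(range(1, 23))
hm, non_hm = partition_codons(placeholder_hm)
```

With any set of 22 HM positions, the partition leaves 307 codons in the non-HM codon set, matching the counts given above.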
Analyses reported in this paper were performed by using all substitutions, only nonsilent substitutions, or only silent substitutions. Because the results in most cases were very similar, and because nonsilent and