"6 Effects of Passage History and Sampling Bias on Phylogenetic Reconstruction of Human Influenza A Evolution." Variation and Evolution in Plants and Microorganisms: Toward a New Synthesis 50 Years after Stebbins. Washington, DC: The National Academies Press, 2000.
TABLE 5. The phylogenetic distribution of nonsilent (NS) substitutions on trees constructed using only sequences from egg-cultured or cell-cultured isolates and using only non-HM codons

| Branches | Number of branches | Exp NS | Obs NS | χ² |
|---|---|---|---|---|
| Egg tree, terminal | 152 | 119.3 | 155 | 10.7 |
| Egg tree, internal | 150 | 117.7 | 82 | 10.8 |
| Egg tree, total | 302 | 237.0 | 237 | 21.5 |
| Cell tree, terminal | 148 | 121.3 | 158 | 11.1 |
| Cell tree, internal | 146 | 119.7 | 83 | 11.2 |
| Cell tree, total | 294 | 241.0 | 241 | 22.3 |
Trees constructed without the HM codons using sequences from isolates propagated in egg culture or in cell culture both showed significant excesses of nonsilent substitutions on their terminal branches (P < 0.05, df = 1 for both tests). The percent excess of nonsilent substitutions on the terminal branches was 30% for both the egg and cell trees.
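The arithmetic behind Table 5 can be reproduced with a short sketch. It assumes that the expected counts are obtained by distributing the total number of nonsilent substitutions across terminal and internal branches in proportion to the number of branches of each kind. That assumption is inferred from the tabulated values rather than stated in this excerpt. The function name `branch_test` and its return keys are hypothetical.

```python
# Sketch: reproduce the expected counts, chi-square values, and percent
# excess in Table 5. Assumption (inferred from the table, not stated in
# the text): expected NS counts are proportional to branch counts.

CHI2_CRIT_DF1_P05 = 3.841  # standard chi-square critical value, df = 1, P = 0.05


def branch_test(n_terminal, n_internal, obs_terminal, obs_internal):
    """Return expected counts, chi-square terms, and percent excess."""
    total_branches = n_terminal + n_internal
    total_ns = obs_terminal + obs_internal

    # Expected NS substitutions if they were spread evenly over branches.
    exp_terminal = total_ns * n_terminal / total_branches
    exp_internal = total_ns * n_internal / total_branches

    # Per-class chi-square contributions, (O - E)^2 / E.
    chi2_terminal = (obs_terminal - exp_terminal) ** 2 / exp_terminal
    chi2_internal = (obs_internal - exp_internal) ** 2 / exp_internal

    return {
        "exp_terminal": exp_terminal,
        "exp_internal": exp_internal,
        "chi2_total": chi2_terminal + chi2_internal,
        "percent_excess_terminal": 100 * (obs_terminal - exp_terminal) / exp_terminal,
    }


# Branch counts and observed NS counts taken from Table 5.
egg = branch_test(n_terminal=152, n_internal=150, obs_terminal=155, obs_internal=82)
cell = branch_test(n_terminal=148, n_internal=146, obs_terminal=158, obs_internal=83)

for name, result in (("egg", egg), ("cell", cell)):
    significant = result["chi2_total"] > CHI2_CRIT_DF1_P05
    print(name, {k: round(v, 1) for k, v in result.items()}, "significant:", significant)
```

Under this assumption the computed values agree with the table to the precision shown, and both totals exceed the df = 1 critical value at P = 0.05.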
we do not believe, based on HI tests, to be closely related to isolates already sequenced. For instance, in the 1996–1997 influenza season we sequenced only 7% of the isolates on which we performed HI assays. Because the isolates sent to the Centers for Disease Control and Prevention from the World Health Organization collaborating laboratories may themselves already be biased against commonly occurring isolates, the bias against sequencing closely related viruses is even greater than 7%. Based on this bias we expect the terminal branches on our trees to be longer on average than they would have been had we sampled randomly. Because we do not know the genetic structure of the influenza population circulating in nature, we cannot know how we actually sampled it. Thus, we cannot calculate the exact distribution of mutations we should expect on the terminal and internal branches of the tree constructed by using our sample. We can, however, determine whether the excess we have observed is consistent with what we would expect based on our sampling protocol.
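The expectation that avoiding closely related isolates lengthens terminal branches can be made concrete with a toy calculation. The sketch below is not the authors' analysis. It assumes a perfectly balanced 16-leaf binary tree with unit-length branches, leaves numbered 0 to 15, and two hypothetical eight-leaf samples (`DISPERSED` and `CLUMPED`). It then measures the terminal and internal branch lengths of the tree induced by each sample.

```python
# Toy illustration (all assumptions, not the authors' data): a balanced
# binary tree with 16 leaves and unit-length branches. The ancestor of
# leaf i at height h is identified by the integer i >> h.

HEIGHT = 4  # 2**4 = 16 leaves

# Hypothetical samples; both include the first and last leaf (0 and 15).
DISPERSED = [0, 2, 4, 6, 9, 11, 13, 15]
CLUMPED = [0, 1, 2, 3, 4, 5, 6, 15]


def terminal_length(leaf, sample):
    """Edges from a leaf up to its first ancestor shared with another sampled leaf."""
    for h in range(1, HEIGHT + 1):
        if any((other >> h) == (leaf >> h) for other in sample if other != leaf):
            return h
    raise ValueError("sample must contain at least two leaves")


def tree_lengths(sample):
    """Return (terminal, internal) branch length of the tree induced by the sample."""
    terminal = sum(terminal_length(leaf, sample) for leaf in sample)
    # Height at which all sampled leaves share a single ancestor.
    root_h = next(h for h in range(HEIGHT + 1)
                  if len({leaf >> h for leaf in sample}) == 1)
    # Each distinct node below that ancestor contributes one edge to its parent.
    total = sum(len({leaf >> h for leaf in sample}) for h in range(root_h))
    return terminal, total - terminal


for name, sample in (("dispersed", DISPERSED), ("clumped", CLUMPED)):
    terminal, internal = tree_lengths(sample)
    share = terminal / (terminal + internal)
    print(f"{name}: terminal={terminal}, internal={internal}, terminal share={share:.2f}")
```

If substitutions accumulate in proportion to branch length, the share of total length on terminal branches is also the expected share of substitutions there, so in this toy setting the dispersed sample places a larger fraction of substitutions on terminal branches than the clumped one.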
We sampled a simulated viral population using various sampling schemes to determine the extent to which our observation is consistent with this hypothesis. We constructed a hypothetical population of 16 viral isolates and sampled it as illustrated in Fig. 3. The samples consisted of eight relatively unrelated isolates (the dispersed sample), eight closely related isolates (the clumped sample), and two collections of eight isolates sampled in an intermediate manner. To ensure that samples all included the total range of variation in the population, each included the upper-most and bottom-most isolate on the 16-isolate tree. The percent excess of