SUPPLEMENTARY FIGURE A14-5 For each gene segment (PB2, PB1, PA, HA, NP, NA, M and NS), we plot the isolation date of each influenza sequence against the genetic distance from that sequence to the root of the phylogeny. The gradient of the linear regression is therefore an estimate of the rate of sequence evolution, and the x-intercept is an estimate of the TMRCA (time of the most recent common ancestor) of the whole phylogeny. Phylogenies were estimated using neighbour-joining, with the root position chosen to maximise the regression fit. The chosen root was typically very close to the earliest sampled sequence. Residual analysis was performed to identify and remove significant outliers, which most likely result from isolation date annotation errors in the sequence database. For each gene, the degree of scatter about the regression line reflects evolutionary rate heterogeneity among lineages, such that a "strict clock" corresponds to all the points falling exactly on the regression line. The 2009 outbreak sequences (highlighted in light blue) are entirely typical of the long-term trends in divergence; hence there is no evidence that the branch leading to the outbreak has evolved unusually rapidly or slowly. For further discussion of this methodology, see Drummond AJ, Pybus OG, Rambaut A. 2003. Inference of viral evolutionary rates from molecular sequences. Advances in Parasitology 54:331-358.
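The regression described in this caption can be illustrated with a minimal sketch. It assumes that isolation dates are expressed as decimal years and that root-to-tip distances have already been extracted from a rooted tree. The function name `root_to_tip_regression` and the example values are hypothetical and are not taken from the original analysis. The sketch fits an ordinary least-squares line, then reports the gradient (the rate estimate), the x-intercept (the TMRCA estimate), the R² of the fit and the per-sequence residuals that a residual analysis would inspect.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on isolation date.

    dates      -- isolation dates as decimal years (assumed input format)
    distances  -- genetic distance from each sequence to the root
    Returns (rate, tmrca, r_squared, residuals).
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (date, distance) pairs")

    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0 or sxy == 0:
        raise ValueError("no temporal signal: cannot estimate a rate")

    rate = sxy / sxx                    # gradient: substitutions/site/year
    intercept = mean_y - rate * mean_x  # distance at date zero
    tmrca = -intercept / rate           # x-intercept: date at zero divergence
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0

    # Residuals: large absolute values flag candidate outliers,
    # e.g. sequences with mis-annotated isolation dates.
    residuals = [y - (intercept + rate * x) for x, y in zip(dates, distances)]
    return rate, tmrca, r_squared, residuals


# Synthetic illustration only: points lying exactly on a line ("strict clock").
example_dates = [1990.0, 1995.0, 2000.0, 2005.0]
example_distances = [0.002 * (d - 1980.0) for d in example_dates]
rate, tmrca, r_squared, residuals = root_to_tip_regression(
    example_dates, example_distances
)
```

Under a strict clock all residuals are zero and R² equals one; increasing scatter about the line corresponds to greater rate heterogeneity among lineages. Choosing the root to maximise the regression fit, as in the caption, would amount to repeating this calculation for each candidate root and keeping the one with the best fit; that search is not shown here.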