Read "Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology" at NAP.edu

« Previous: Alignment Given

Page 96 Cite

Suggested Citation:"Alignment Unknown." National Research Council. 1995. Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology. Washington, DC: The National Academies Press. doi: 10.17226/2121.

Page 97 Cite

Page 98 Cite

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

HEARING DISTANT ECHOES: USING EXTREMAL STATISTICS TO PROBE EVOLUTIONARY ORIGINS 96 therefore cannot be optimized. We give the statistical distribution of the alignment score for completeness, however. Let s(A,B) be a real valued random variable. Define the score S by and let E(S) denote the expectation of S and Var(S) denote the variance. Clearly, E(S) = nE(s(A,B)) and Var(S) = nVar(s(A, B)). Since S is the sum of independent, identically distributed random variables s(A,B), the central limit theorem implies that for large n S â Normal(nE(s(A, B)), nVar(s(A, B))). If s(A, B) â {0, 1}, then S is binomial (n, p), where p = P{s(A, B) = 1}. Even when the letters are not identically distributed, the limiting distribution is normal under mild assumptions (Chung, 1974). Alignment Unknown The assumptions of the last section are carried over: AlA2. . . An and B1B2. . . Bm are composed of independent and identically distributed letters and s(A, B) is a real valued random variable on pairs of letters. We extend s(Â·,Â·) to s(A,â) and s(â,B) so that deletions are included. We assume that the value of s for all deletion scores is smaller than maxs(A,B). An alignment score S is the maximum over all possible alignments

HEARING DISTANT ECHOES: USING EXTREMAL STATISTICS TO PROBE EVOLUTIONARY ORIGINS 97 The optimization destroys the classical normal distribution of alignment score, but an easy application of a beautiful theorem known as Kingman's subadditive ergodic theorem gives an interesting result: Theorem 4.1 (Kingman, 1973) For s,t nonnegative integers with 0 â¤ s â¤ t, let Xs,t be a collection of random variables that satisfy (i) Whenever s t u, Xs,u â¤ Xs,t + Xt,u, (ii) The joint distribution of Xs,t} is the same as that of {Xs+l, t+1}, (iii) The expectation gt = EX0,t exists and satisfies gt â¥ â Kt for some constant Then the finite limtââ X0,t / t = Î» exists with probability 1 and in the mean. The essential assumption is (ii), the subadditivity condition. To motivate and illustrate this theorem, recall the strong law of large numbers (SLLN), which treats independent, identically distributed (iid) random variables W1,W2,. . . with Âµ = E(Wj). The SLLN asserts that with probability 1. It is easy to see that additivity holds. Set Of course (i) is satisfied:

HEARING DISTANT ECHOES: USING EXTREMAL STATISTICS TO PROBE EVOLUTIONARY ORIGINS 98 Since the Wi are iid, (ii) is evidently true. Finally, g(t) = E(U0,t) = tÂµ, so that (iii) holds with Âµ = -K. Therefore the limit exists and is constant with probability 1. Notice that this setup does not allow us to conclude that the limit is Âµ. This is a price of relaxing the assumption of additivity. Returning to the statistical distribution of alignment score, recall that an alignment score S is the maximum over all possible alignments Define X,, by Then evidently, âXs,u â¥ (âXs,t) + (âXt,u) and Xs,u â¤ Xs,t + Xt,u. We have that gt = E(X0,t) exists since the expectation of a single alignment exists and âX0,t is the maximum of a finite number of alignment scores. The final hypothesis to check is gt â¥ âKt for some constant K and all t > 1. Clearly, E(âX0,t) â¤ t maxs(A, B) so that gt â¥ â(maxs(A, B))t = âKt.

Next: Alignment Given »

Welcome to OpenBook!

You're looking at OpenBook, NAP.edu's online reading room since 1999. Based on feedback from you, our users, we've made some improvements that make it easier than ever to read thousands of publications on our website.

Do you want to take a quick tour of the OpenBook's features?

No Thanks

Take a Tour »

Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology (1995)

Chapter: Alignment Unknown

Welcome to OpenBook!

Get Email Updates