National Academies Press: OpenBook
« Previous: FINDING LOCAL SIMILARITIES
Suggested Citation:"Variations in Gap Cost Penalties." National Research Council. 1995. Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology. Washington, DC: The National Academies Press. doi: 10.17226/2121.
×
Page 70
Suggested Citation:"Variations in Gap Cost Penalties." National Research Council. 1995. Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology. Washington, DC: The National Academies Press. doi: 10.17226/2121.
×
Page 71

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

SEEING CONSERVED SIGNALS: USING ALGORITHMS TO DETECT SIMILARITIES BETWEEN BIOSEQUENCES 70 dot plots give a meaningful visualization of all the similarities between segments in a single snapshot and are ubiquitous. VARIATIONS ON SEQUENCE COMPARISON In this section a number of the most important variations on sequence comparison are examined. The survey is by no means exhaustive. Variations in Gap Cost Penalties How to assign scores to alignment gaps has always been more problematic than scoring aligned symbols, because the statistical effect of gaps is not well understood (see Chapter 4). Nature frequently deletes or inserts entire substrings as a unit, as opposed to individual polymer Figure 3.7 Dot plot of somatotropin alignments.

SEEING CONSERVED SIGNALS: USING ALGORITHMS TO DETECT SIMILARITIES BETWEEN BIOSEQUENCES 71 elements. It is thus natural to think of cost models in which the score of a gap is not just the sum of scores assigned to the individual symbols in the gap, as was used in the previous two sections, but rather a more general function, gap(x), of its length x. For example, it is common to score a gap according to the affine function gap(x) = r + sx, where r > 0 is the penalty for the introduction of the gap and s > 0 is the penalty for each symbol in the gap. Such affine gap costs are particularly important when comparing proteins. For example, a gap penalty of 8 + 4x works well in conjunction with the aligned symbol scores of Figure 3.5. Because a gap is viewed as detracting from similarity, its score is a penalty that is subtracted from the total. Accommodating affine gap scores involves the following variation on the central recurrence (Gotoh, 1982). For each subproblem, Ai versus Bj, one develops recurrences for (1) the best alignment that ends with an A-gap, Ag(i, j), (2) the best alignment that ends with a B-gap, Bg(i, j), and (3) the best overall alignment, S(i ,j). This leads to the following system of recurrence equations: Ag(i,j) = max{Ag(i− 1,j)− s,S(i− 1,j)− (r + s)} Bg(i,j) = max{Bg(i, j − 1)− s,S(i, j − l)−(r + s)} S(i,j) = max{S(i − 1, j 1)+(ai, b),Ag(i,j),Bg(i,j)} . S terms contributing to an Ag or Bg value are penalized r + s because a gap is being initiated from that term. Ag terms contributing to Ag values and Bg terms contributing to Bg values are penalized only s because the gap is just being extended. An algorithm that applies these recurrences at each (i,j) leads to an O(MN) time algorithm for global alignments with affine gap costs. Simply adding a 0 term to the S-recurrence gives an algorithm for local alignments with affine gap costs. Summation and affine functions are not the only options available for scoring gaps. The gap cost function gap(x) can be taken to be a concave (flat or cupped downward) function of length, that is, a function such that gap(x + 1)− gap(x)≤ gap(x)− gap(x −1) for all x > 0. The class of concave gap cost functions includes affine functions but is much wider than just affine functions. For example, for positive a and b, the function gap(x) = a logx + b is a concave function that finds occasional use in

Next: The Duality Between Similarity and Difference Measures »
Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology Get This Book
×
 Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology
Buy Paperback | $80.00
MyNAP members save 10% online.
Login or Register to save!
Download Free PDF

As researchers have pursued biology's secrets to the molecular level, mathematical and computer sciences have played an increasingly important role—in genome mapping, population genetics, and even the controversial search for "Eve," hypothetical mother of the human race.

In this first-ever survey of the partnership between the two fields, leading experts look at how mathematical research and methods have made possible important discoveries in biology.

The volume explores how differential geometry, topology, and differential mechanics have allowed researchers to "wind" and "unwind" DNA's double helix to understand the phenomenon of supercoiling. It explains how mathematical tools are revealing the workings of enzymes and proteins. And it describes how mathematicians are detecting echoes from the origin of life by applying stochastic and statistical theory to the study of DNA sequences.

This informative and motivational book will be of interest to researchers, research administrators, and educators and students in mathematics, computer sciences, and biology.

READ FREE ONLINE

  1. ×

    Welcome to OpenBook!

    You're looking at OpenBook, NAP.edu's online reading room since 1999. Based on feedback from you, our users, we've made some improvements that make it easier than ever to read thousands of publications on our website.

    Do you want to take a quick tour of the OpenBook's features?

    No Thanks Take a Tour »
  2. ×

    Show this book's table of contents, where you can jump to any chapter by name.

    « Back Next »
  3. ×

    ...or use these buttons to go back to the previous chapter or skip to the next one.

    « Back Next »
  4. ×

    Jump up to the previous page or down to the next one. Also, you can type in a page number and press Enter to go directly to that page in the book.

    « Back Next »
  5. ×

    To search the entire text of this book, type in your search term here and press Enter.

    « Back Next »
  6. ×

    Share a link to this book page on your preferred social network or via email.

    « Back Next »
  7. ×

    View our suggested citation for this chapter.

    « Back Next »
  8. ×

    Ready to take your reading offline? Click here to buy this book in print or download it as a free PDF, if available.

    « Back Next »
Stay Connected!