Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.
SEEING CONSERVED SIGNALS: USING ALGORITHMS TO DETECT SIMILARITIES BETWEEN BIOSEQUENCES 78 scores in order or give a color-coded visualization of the edit graph that colors edges according to the score of the best path utilizing the edge. Another interesting variation is to explore the range of solutions not by enumerating near-optimal answers, but by studying the range of optimal answers produced by parametrically varying aspects of the underlying scoring scheme (Waterman et al., 1992). Approximate Pattern Matching A variation on the local alignments problem discussed above is the approximate match problem. For this problem, imagine that A is a very long sequence and B a comparatively short query sequence. The problem is to find substrings of A, called match sites, that align with the entirety of B with a score greater than some user- specified threshold. An example might be to find all locations in a chromosome's DNA sequence (A) where a particular DNA sequence element (B) or some sequence like it occurs. It is not hard to see that this problem is equivalent to finding sufficiently high scoring paths that begin at a vertex in row 0 and end at row N of the edit graph for A and B. By simply permitting 0 to be a term in the computation of S-values in row 0 and checking values in row N, one obtains the desired modification of the basic dynamic programming algorithm. The problem is taken to another level by generalizing B, the query, from a sequence to a pattern (that describes a set of sequences). This variation is called approximate pattern matching. Computer scientists working on text-searching applications have long studied the problem of finding exact matches to a pattern in a long text. That is, given a pattern as a query, and a text as a database, one seeks substrings of the database text that match the pattern (exactly). Pattern types that have been much studied include the cases of a simple sequence, a regular expression, and a context-free language. Such patterns are notations that denote a possibly infinite set of sequences, each of which is said to (exactly) match the pattern. For example, the regular expression A(T|C)G* denotes the set of sequences that start with an A followed by a T or a C and then zero or more G's, that is, the set {AT, AC, ATG, ACG, ATGG, ACGG, ATGGG,. . .}. Assuming the pattern takes P symbols to specify and the text is of length