MAPPING HEREDITY: USING PROBABILISTIC MODELS AND ALGORITHMS TO MAP GENES AND GENOMES

• False negatives: one may fail to identify some proportion of the YACs containing an STS;
• False positives: some proportion of the YACs detected as containing an STS may not actually do so; and
• Chimeric YACs: some proportion of the YACs may not represent a single contiguous region, but two unrelated regions that have been joined together in a single clone.

Moreover, the occurrence of false negatives and false positives may not be random but systematic (owing to deletions of clones or contamination of samples). In short, algorithms must be robust to errors in the data. Producing such algorithms is an interesting challenge that draws on methods from graph theory, operations research, and statistics. As of this writing, the best approach has not yet been determined.

CONCLUSION

Genetic and physical mapping are key tools for describing the function and structure of chromosomes. Only in the simplest cases is such mapping completely devoid of mathematical issues. In the case of human genetics, mathematics plays a crucial role. In essence, mapping problems, like many problems in computational biology, involve indirect inference of the structure of a biological entity, such as a chromosome, based on whatever data can be effectively gathered in the laboratory. It is not surprising that mapping problems draw on statistics, probability, and combinatorics. Although the field of mapping dates nearly to the beginning of the 20th century, the area remains rich with new challenges, because new laboratory methods constantly push back the frontiers of the maps and features that can be mapped in DNA.
Figure 2.10 Expected coverage properties for STS content mapping, as a function of the coverage a in YACs and b in STSs. Calculations assume YACs of constant length L and a genome of length G. The graphs show (A) the expected proportion of the genome covered by anchored "contigs"; (C) the expected number of anchored contigs; and (D) the expected length of an anchored contig. Graphs show the situation for a = 1, 2, ..., 10 (only the cases a = 1, 5, 10 are explicitly marked). Results are expressed in units of G/L. Table B lists the value of G/L for certain representative genomes and cloning vectors, including two different sizes of YACs. Reprinted, by permission, from Arratia et al. (1991). Copyright 1991 by Genomics.
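The quantities in the caption can be approximated by a small Monte Carlo experiment. This is a minimal sketch under stated assumptions, not the calculation of Arratia et al. (1991). It assumes that a = NL/G for N YACs and b = ML/G for M STSs. It assumes YAC and STS positions are uniform on a linear genome. It treats a YAC as anchored if it contains at least one STS, and counts a single anchored YAC as a contig. It assumes error-free data. The function name `simulate_sts_content_map` and its parameters are hypothetical.

```python
import bisect
import random


def simulate_sts_content_map(a, b, genome_len=2000.0, yac_len=1.0, seed=0):
    """Simulate error-free STS content mapping on a linear genome.

    a: assumed YAC coverage, N * L / G.
    b: assumed STS density, M * L / G (expected STSs per YAC length).
    """
    rng = random.Random(seed)
    n_yacs = round(a * genome_len / yac_len)
    n_sts = round(b * genome_len / yac_len)

    # Uniform positions; each YAC is placed so that it fits inside the genome.
    starts = sorted(rng.uniform(0.0, genome_len - yac_len) for _ in range(n_yacs))
    sts = sorted(rng.uniform(0.0, genome_len) for _ in range(n_sts))

    def has_sts(lo, hi):
        """Return True if at least one STS lies in the closed interval [lo, hi]."""
        if lo > hi:
            return False
        i = bisect.bisect_left(sts, lo)
        return i < len(sts) and sts[i] <= hi

    # A YAC is anchored if it contains at least one STS.
    anchored = [s for s in starts if has_sts(s, s + yac_len)]

    # With constant-length YACs sorted by start, two YACs in the same contig
    # are always linked through consecutive anchored YACs, so one pass suffices.
    contigs = []  # each entry is [left_end, right_end]
    for s in anchored:
        # The shared STS must lie in the overlap [s, end of previous anchored YAC].
        if contigs and has_sts(s, contigs[-1][1]):
            contigs[-1][1] = s + yac_len
        else:
            contigs.append([s, s + yac_len])

    # Contigs can overlap physically without sharing an STS, so take the union.
    covered = 0.0
    reach = 0.0
    for left, right in contigs:
        if right > reach:
            covered += right - max(left, reach)
            reach = right

    mean_len = sum(r - l for l, r in contigs) / len(contigs) if contigs else 0.0
    return {
        "proportion_covered": covered / genome_len,
        "contigs_per_G_over_L": len(contigs) / (genome_len / yac_len),
        "mean_contig_length_in_L": mean_len / yac_len,
    }


if __name__ == "__main__":
    for a in (1, 5, 10):
        print(a, simulate_sts_content_map(a=a, b=5))
```

Averaging the returned values over many seeds gives rough estimates of the expected proportion covered, the number of contigs, and the contig length. The single-pass contig construction relies on the constant-length assumption stated in the caption; variable-length clones would need a general connected-components step.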