
Index

A

α-helix, 242-248, 254

Adenine (A), 8, 9, 99

Algorithms, 35-36, 84-86, 87

approximate pattern matching, 78-79

difference measures, 72-73

dynamic programming, 60-64, 78, 82, 84, 85, 86, 109

in evolutionary analysis, 106, 110-112

gap cost penalties, 70-72

in genetic mapping, 35-36

global alignment, 58-64, 94-99

heuristic, 82-84

K-best alignments, 76-78

local alignment, 65-70, 99-106

multiple alignments, 73-76

in physical mapping, 46-51

Alleles, 6

Amino acids, 4, 57

see also DNA; Protein folding; Sequence similarity and comparison

Amplification, see Polymerase chain reaction

Ancestry, see Evolutionary analysis

ANREP systems, 87

Antidiagonals, 62, 79-80

APC gene, 34, 37-38

Approximate pattern matching, 78-79, 86

Approximate repeats, 87

ARIADNE systems, 87

Assay techniques, 2-3

Autosomes, 26

B

Base pairs, 8, 26, 48, 153, 154, 163, 179, 185, 188, 189, 191, 194, 204, 249

see also Adenine, Thymine, Cytosine, Guanine, Uracil

Bayesian statistics, 35

β-sheet, 242-248, 254

Bernoulli random variables, 102, 125

Biochemistry, 2-5

Biosequences, see Databases of DNA sequences; Sequence similarity and comparison; Sequencing methods and technology

BLASTA algorithm, 82-84

Booth-Lueker algorithm, 50-51

BRCA1 (breast cancer) gene, 33

C

Cancer, 33, 34, 37-42, 58, 91, 183, 196

Catenanes, 205, 212

Cauchy's formula, 136

Cellular structures, 9

Chaperonins, 238-239

Chen-Stein method, 102, 106, 110

Chimeras, 51



Chirality, 213-215

Chromosomal walking, 17, 18, 42, 43

Clones and cloning, 13, 14, 26, 42-43, 209

Closed circular DNA, 153-154, 155, 156, 157, 181, 204

Coalescent, 117, 119-121

combinatorial structures, 119, 136-148

Ewens sampling formula, 119, 122-124, 136-139

K-allele model, 130-132

likelihood methods, 146-148

tree construction and movement, 124-127

see also Finitely-many-sites model; Infinitely-many-sites model

Codons, 12, 115, 239

Colon cancer, 34, 37-42

Combinatorics, 119, 136-148, 185

Computing time and memory capacity

algorithmic efficiencies, 35-36, 84-86, 87

approximate pattern matching, 79, 87

dynamic programming algorithms, 62-63, 64, 68, 83, 84

gap cost functions, 72

heuristic algorithms, 83-84

K-best paths, 77

multiple alignments, 75

parallel processing, 79-81, 84

sublinear similarity searches, 84-85

Consecutive ones property, 50

Consensus scores, 76

Contigs, 47-50

Crick and Watson model, 153, 204-205

Crossovers, 27-29

Cruciforms, 154

Crystallography, 202, 203, 240

Cystic fibrosis (CF), 16-18, 20-21, 26

Cytosine (C), 8, 9, 99

D

Databases of DNA sequences, 13, 17, 56, 81, 87

similarity searches in, 78-79, 82-86, 87, 91-92, 94

see also FASTA, BLASTA

Dayhoff matrix, 66, 67, 83

Diagnostics, see Genetic diagnostics

Difference measures, 72-73

Diffusion processes, 37-42, 148

Dimers, 212

DNA (deoxyribonucleic acid), 8-9, 92

primers, 13, 15, 16

protein binding, 166-167, 168, 170-171, 181

transcription, 9-12, 154, 179, 196-198, 204-205

see also DNA polymorphisms and mutations; Protein folding; Sequence similarity and comparison; Sequencing methods and technology; Strand separation and unwinding; Supercoiling

DNA polymerases, 8, 16, 154

DNA polymorphisms and mutations, 8-9, 16-17, 26, 30, 34, 57, 106

in evolutionary analysis, 114-135

as markers, 31, 34

minimal cost alignments, 72-73

in mitochondria, 115-116, 117, 118, 148-149

rates of, 66, 67, 116, 117, 124-125

see also Genetic maps and mapping

Dot plots, 68, 70

Duplex unwinding elements (DUEs), 183, 194, 195

Dynamic programming algorithm, 60-64, 78, 82, 84, 85, 86, 109, 251

E

Edit graphs, 59-61, 68-70, 75

Effective population size, 117

Efficient algorithms, 35-36, 84-86, 87

Electron microscopy, 202, 211, 227

Electrostatic interactions, 251

Energetics, 154, 180, 182, 186-195

Enzymes, 3, 7, 180, 238

see also under names of specific types

Eve hypothesis, 116

Evolutionary analysis, 57-58, 90-94

coalescent structures, 117, 119-135, 148-149

common origins, 57, 248

extremal statistical methods, 106-112

minimal cost alignments, 72-73

multiple alignments, 73, 76

random combinatorial structures, 136-148

use of mitochondrial DNA, 57-58, 90-94, 115-116, 117, 148-149

trees, 73, 76, 87, 124, 129, 132, 266

see also Eve hypothesis

Ewens sampling formula (ESF), 119, 122-124, 136-139

Extremal statistical methods, 106-112

global sequence comparisons, 94-99

local sequence comparisons, 99-106

F

False negatives and positives, 51

Familial adenomatous polyposis (FAP), 37-38

FASTA algorithm, 82, 83, 84

Fingerprinting methods, 42-47

Finitely-many-sites model, 132-135

Fleming-Viot process, 148

Foldases, 237-238

Fourier transforms, coefficient, 240

4-plat knots, 215-216, 220, 222

Fractionation, 2-3

Free energy, 154, 180, 182, 186-195

G

Gap costs, 70-72, 77-78

Gaussian processes, 41

Gel electrophoresis, 210-211, 227

GENBANK database, 81

Generalized Levenshtein measure, 73, 87

Gene splicing, see Recombinant DNA technology

Gene therapy, 18

Genetic code, 12, 239

Genetic diagnostics, 16, 17

Genetic distance, 28-29

Genetic heterogeneity, 34

Genetic maps and mapping, 16, 18-19, 26, 27-30, 51

and incomplete pedigree information, 30, 31, 34-35

markers in, 31

and maximum likelihood estimation, 34-42

and non-Mendelian genetics, 30, 31, 33-34

Genetic markers, 31, 34, 42

Genetics, 5-7

Genotype, 38, 40

Geometry, 166, 203, 210, 211, 220, 223

descriptors and methods, 155-163

see also Topology

Global alignment, 5, 58-64, 94-99

maximum-scoring, 63

Graph theory, 46, 51

Guanine (G), 8, 9, 99

H

Haldane mapping function, 29, 41

Hierarchical condensation methods, 248-251

Helix, 8, 9, 153

destabilization, 184, 188, 196

Helical periodicity, 154

Heterozygotes, 6, 16, 31

Heuristic algorithms, 82-84

Histones, 154, 175

HIV protease structure, 254-255

Homeomorphisms, 212-213

Homology modeling, 252

Homozygotes, 6, 31

Human Genome Project, 18-22, 26

Hydrophilic side chains, 244, 253, 263

Hydrophobic side chains, 244, 245, 253

Hydrophobicity, 4

I

Incomplete penetrance, 31, 33, 34

Independent assortment, 29

Indexing, of databases, 87

Infinitely-many-sites/alleles model, 122, 124, 125, 127-130

In vitro assays, 3

Isomerases, 238

K

K-allele model, 130-132

K-best alignments, 76-78

kDNA (kinetoplast DNA), 231

Kingman's subadditive ergodic theorem, 97

Knot theory, 212

see also Tangles and knots

L

Large Deviation Theory of Diffusion Processes, 37-42

Levenshtein measure, 73, 87

LexA binding sites, 198-199

Ligases, 13

Likelihood methods, 34-42, 146-148

Linear DNA, 155, 156

Linking number (Lk), 155, 157-158, 163-164, 173-174, 181

minichromosomes, 175, 177

surface, 167-171, 173-174

topoisomerase reactions, 164-166

Local alignment, 5, 65-70, 99-106

Longest common subsequence, 99

M

Macromolecules, 3

Mapping, see Genetic maps and mapping; Physical maps and mapping; Restriction maps; Sequencing methods and technology

Markers, see Genetic markers

Markov models, processes, 36, 146-147, 249

Maximum likelihood estimation, 34-35

and efficient algorithms, 35-36

and statistical significance, 37-42

Measure-valued diffusions, 148

Membrane-bound transporters, 17-18, 20

Mendelian genetics, 5-7, 27, 31

Minichromosomes, 174-177

Min (multiple intestinal neoplasia) trait, 38-39

Mirror images, 213-215

Mismatch ratio, 86

Mitochondrial DNA (mtDNA), 115-116, 117, 118, 135, 148-149, 204

Molecular biology, overview, 7-12

Möbius, 143, 181

Monte Carlo methods, 146-147, 149, 241

Morgans, 28

mRNA (messenger RNA), 9, 12, 92

Multiple alignments, 73-76

Multiple minima problem, 241

Mutation, see DNA polymorphisms and mutations

Myoglobin, 265-266

N

Native American population studies, 116, 117

Neighborhood concept, 83

Neural networks, 259-263

Nonadditive scoring schemes, 87

Nuclear magnetic resonance (NMR), 203, 240

Nucleic acids, 3

Nucleosomes, 154, 166, 174-177

Nucleotides, 8, 57, 118, 204

distances, 29, 81

O

Oncogenes, 58, 91, 196

Ornstein-Uhlenbeck process, 41

Overwinding, 154

P

Packing density, 252

Palindromes, 87

Papilloma virus, 196, 199-200

Parallel computing, 79-81, 84, 87

Penetrance, 31, 33, 35

Phenocopy, 34

Phenotype, 38, 40

Phylogeny, 73, 76, 87

see also evolutionary trees

Physical maps and mapping, 17, 19, 26, 29

fingerprinting methods, 42-47

PIR database, 81

PLANS (Pattern Language for Amino and Nucleic Acids Sequences), 263-264

Platelet-derived growth factor (PDGF), 91

Plectonemic forms, 154, 156, 169, 170, 215-216

Poisson distributions, 144

see also Boltzmann equation, 254; Dirichlet distribution, 144

in coalescent trees, 121, 124-127

in sequence comparisons, 29, 100-104, 108-110

Poly-adenylation, 196

Polygenic inheritance, 34

Polymerase chain reaction (PCR), 13, 15, 16, 46

Polymorphism, see DNA polymorphisms and mutations

Polyoma virus, 196

Primers, 13, 15, 16

Principle of optimality, 63

Probabilistic combinatorics, 136

Processing time, see Computing time and memory capacity

Protein folding, 5, 12, 236-248

hierarchical condensation methods, 248-251, 256-265

prediction of, 5, 254-255, 265-266

threading methods, 248-254

Proteins, 3-5, 7-8, 57, 92

see also Amino acids; Protein folding; Sequence similarity and comparison

Public databases, see Databases of DNA sequences

Pure breeding, 5

Purines (R), 99, 117, 200

Pyrimidines (Y), 99, 117, 118, 123, 128, 200

Q

QUEST systems, 87

R

Rational tangles, 218-221, 228-229

RecA binding, 198-199, 211, 227

Recessive traits, 16

Recombinant DNA technology, 13-16, 17

Recombination, 27-28, 205, 213, 225-230

frequency, 28-30, 31, 35

site-specific, 207-212, 222-225

Replication processes, 92, 154, 179-180, 183, 204

Resolvase, 213, 225-230

Restriction enzymes, 13

Restriction fragment lists, 45-46

Restriction maps, 44-45, 87

R-group, 237

Ribosomes, 9, 10, 12, 92

RNA (ribonucleic acid), 9, 179, 196, 237

evolutionary analysis, 92-93, 106-107, 110-112

polymerase, 9

rRNA, 92, 93, 106, 107, 110, 112

see also mRNA, tRNA, 11

Rule-based methods, 263-264

S

Scoring schemes

gap cost penalties, 70-72

global alignments, 59-64

K-best alignments, 76-78

local alignments, 65-68

minimal cost alignments, 72-73

multiple alignments, 74-76

nonadditive, 87

unit-cost, 58-59, 86

Sedimentation rate, 100

Self-replication, 92

Sequence similarity and comparison, 56-58, 86-87, 91, 199

approximate pattern matching, 78-79, 86

database searches, 78-79, 82-86, 87, 91-92, 94

difference measures, 72-73

in evolutionary analysis, 57-58, 72-73, 76, 90-94, 106-112, 115

gap cost penalties, 70-72

global alignment, 5, 58-64, 94-99

heuristic algorithms, 82-84

K-best alignments, 76-78

local alignment, 5, 65-70, 99-106

multiple alignments, 73-76

parallel computing, 79-81, 84, 87

sublinear, 84-86

Sequence tagged sites (STSs), 46, 47-53

Sequencing methods and technology, 13, 17, 19, 26, 81

error detection and correction, 73

shotgun method, 43-44

Sex chromosomes, 26

Shotgun method, 43-44

SIMD (single-instruction, multiple-data) computers, 80

Site-specific recombination, 207-212, 222-225

Smith-Waterman algorithm, 66, 68, 83, 84, 109

Solvent-accessible contact areas, 252-253

SOS genes, response, 183, 198-199, 200

Statistics of coverage, 46-51

Stochastic processes, 26, 48

coalescent structures, 119-135, 148-149

combinatorial structures, 119, 136-148

likelihood methods, 146-148

Storage capacities, see Computing time and memory capacity

Strand separation and unwinding, 8, 179-180, 181-184, 219-220

energy states, 154, 180, 182, 186-195

site prediction, 184-186, 196-200

Stress responses, 183, 198-199, 200, 204

Strong law of large numbers (SLLN), 97, 98, 100

Sublinear similarity searches, 84-86

Sum-of-pairs scores, 76

Supercoiling processes, 153-163

closed curves, 153-154, 155, 156, 157, 181, 204

nucleosomes, 154, 174-177

topoisomerase reactions, 163-166

see also Strand separation and unwinding; Superhelicity

Superhelicity, 162, 181-183, 193

Surface linking number (Slk), 167-171, 173-174

Synapsis, 207-209, 223-225, 226-227

Systolic arrays, 80-81, 83-84

T

Tangles and knots, 204-207, 211, 212-222

gel mobility, 231-232

recognition, 230-231

site-specific recombination models, 222-225

Threading methods, 248-254

Thymine (T), 8, 99

Topology, 155, 166-167, 168, 170-171, 203-204, 205, 207, 244, 247

of strand separation, 180, 181-184, 219-220

surface linking number, 167-171, 173-174

tangles and knots, 204-207, 211, 212-225, 230-231

see also Geometry

Topoisomerase, 164-166, 175

Toroidal surfaces, 155, 168, 170, 182, 228-229, 231

Traceback procedures, 64

Transcription processes, 9-12, 154, 179, 196-198, 204-205

Transitions, 117, 185-186, 188, 190, 194-195

Trivial tangles, 218-219

tRNA (transfer RNA), 92, 93, 106, 107, 110-112

t-test, 40-41

Twist (Tw), 157, 159-160, 162, 164, 173-174

topoisomerase reactions, 164-166

U

Underwinding, 154, 164

Unit-cost scoring scheme, 58-59, 86

Unwinding, see Strand separation and unwinding

Uracil (U), 9, 12

V

Variable population size processes, 148-149

Virtual surfaces, 17-171

Vitalism, 3

VLSI (very large scale integration) chips, 80-81

v-sis oncogene, 58, 91

W

Winding number, 167, 171-172, 173-174

Writhe (Wr), 157, 159, 160, 161, 162, 164

topoisomerase reactions, 164-166

X

X-ray crystallography, 203, 240

Y

YAC (yeast artificial chromosomes) libraries, 46-47, 53

Z

z-DNA, 154
