APPENDIX

B

Participant Biographies

Bill Andersen is chief technology officer at Knowledge Bus, Inc. The company is working with the European Media Laboratory Scientific Databases and Visualization Group in Heidelberg, Germany, on the creation of an ontology comprising knowledge of biochemical pathways. This ontology will be used to support multiple activities, including database generation, visualization, simulation, and natural-language processing of textual research reports. This work is being done in collaboration with ZMBH, EMBL, and Lion Bioscience AG and has as its initial goal the comprehensive analysis of Mycoplasma pneumoniae. Mr. Andersen's work has been primarily in artificial intelligence and databases. His graduate work at the University of Maryland was on parallel algorithms for frame-based inference systems and on management of large knowledge-based systems. Starting in 1995, while working for the US Department of Defense, he began work on the automatic generation of databases from computational ontologies, leading eventually to the founding of Knowledge Bus, Inc., in 1998 to commercialize the technology. Mr. Andersen has a BA in Russian language and a BS in computer science from the University of Maryland. He is currently working on his PhD in computer science at the University of Maryland.

James Bower is professor of Biology at California Institute of Technology. His laboratory created and continues to support the GENESIS neural simulation system, which is one of the two leading software systems used around the world to construct biologically realistic neural models at levels of scale from subcellular to systems. As part of the GENESIS project, Dr. Bower's research group has been developing software tools to facilitate access of modelers to the data on which their models depend and access of nonmodelers to model-based analysis of their systems. The GENESIS project also involves a significant educational component, which now forms the basis for many courses in computational neuroscience around the globe. Overall, the GENESIS project is intended to provide a new mechanism for scientific communication and collaboration involving both models and data. Dr. Bower's laboratory has also been involved in the development of silicon-based neural probes for large-scale multineuron recording procedures. These data are critical for the evaluation of network models of nervous-system function. Dr. Bower has a BS in zoology from Montana State University and a PhD in neurophysiology from the University of Wisconsin, Madison. He was a postdoctoral fellow at New York University and at the Marine Biological Laboratory in Woods Hole. He has been at California Institute of Technology since 1993.

Douglas Brutlag is director of the Bioinformatics Resource at Stanford University School of Medicine and professor of biochemistry and medicine at Stanford University. Dr. Brutlag's group works in functional genomics, structural genomics, and bioinformatics. They develop methods that can learn conserved structures, functions, features, and motifs from known protein and DNA sequences and use them to predict the function and structures of novel genes and proteins from the genomic efforts. The group uses statistical methods and machine learning to discover first principles of molecular and structural biology from known examples.
They are also interested in predicting the interactions between ligands and proteins and between two interacting macromolecules and are actively studying the mechanisms of ligand-protein and protein-protein docking. Their research approach uses a variety of different representations of sequences and structures. Multiple representations of sequences include simple motif consensus sequence patterns, parametric representations, probabilistic techniques, graph theoretic approaches, and computer simulations. Much of the work consists of developing a new representation of a structure or a function of a macromolecule, applying the methods of machine learning to this representation, and then evaluating the accuracy of the method. The group has developed novel representations of sequence correlations that have predicted amino acid side-chain interactions that stabilize protein strands and helices. They have developed novel algorithms for aligning sequences that give insight into the secondary structure of proteins and developed novel methods for discovering both sequence and structural motifs in proteins that help establish semantics of protein structure and function. Dr. Brutlag obtained his PhD in biochemistry from Stanford University in 1972.

Michael Cherry is head of the Genome Databases Group at the Department of Genetics, Stanford University School of Medicine; project manager and head curator, Saccharomyces Genome Database; principal investigator, Arabidopsis thaliana Database; computing manager, Stanford DNA Microarray Database; and co-principal investigator, Arabidopsis Functional Genomic Consortium. His group at Stanford is involved with bioinformatics and computational genomics. The group provides two resources: the Saccharomyces Genome Database and the Stanford Microarray Database. It provided the Arabidopsis thaliana Database until September 1999. The genome databases are service projects for the scientific community that collect, maintain, and distribute information. The group also creates software tools and adds value to these Web resources via curation. The group is involved in various analyses of genomes and their gene products. The third major project is on DNA expression microarrays. It is working with Stanford laboratories on yeast, human, mouse, E. coli, C. elegans, and Arabidopsis microarrays. Dr. Cherry's interests are in integrating and facilitating the analysis of the vast amounts of information in genome and microarray databases.

*Susan B. Davidson is professor of Computer and Information Science and co-director of the Center for Bioinformatics at the University of Pennsylvania, where she has been since 1982. She received her BS in mathematics from Cornell University (1978) and her PhD in electrical engineering and computer science from Princeton University (1982). Jointly with G.
Christian Overton, Val Tannen, Peter Buneman, and Limsoon Wong at Penn, she has developed BioKleisli, a system for integrating biomedical databases that is being used within the Tambis project at the University of Manchester and for several projects in SmithKline Beecham pharmaceuticals. Her current research projects include techniques for automating the development, annotation, and refreshing of biomedical-data warehouses and the use of high-speed networks to connect Mouse Brain Atlas image data with genomic data.

*David Eisenberg is director of the UCLA-Department of Energy Laboratory of Structural Biology and Molecular Medicine and professor of chemistry and biochemistry in the Department of Biological Chemistry, UCLA. Following a thread of discovery from his earlier work on sequence families and assignment of protein sequences to 3D folds, he is now concentrating on assigning genome sequences to biologic functions. The new methods that he and his co-workers developed go beyond traditional sequence similarity; they depend on correlation of other properties: correlated inheritance of proteins into species, correlated fusion of domains into single protein chains, and correlated mRNA expression patterns. These methods are intended to guide, complement, and interpret experiments. When applied to whole sequenced genomes, these methods show astonishing power for identifying protein functions, protein pathways, and networks of function. His honors include a Rhodes Scholarship, an Alfred P. Sloan Fellowship, a Guggenheim Fellowship, National Academy of Sciences membership, American Academy of Arts and Sciences membership, the Protein Society Stein and Moore Award, the Pierce Award of the Immunotoxin Society, a Repligen Award in Molecular Biology, Biophysical Society fellowship, and the Amgen Award of the Protein Society.

* Planning Group Members

*David Galas is chief academic officer of the Keck Graduate Institute in Claremont, CA. He was formerly president and chief scientific officer of Seattle-based Chiroscience R&D, Inc., one of the first biotechnology companies to assemble a full gene-to-drug discovery program. Previously, Dr. Galas served as director for health and environmental research at the US Department of Energy, where he headed the Human Genome Project from 1990 to 1993. He also served as professor of molecular biology at the University of Southern California, where he directed the molecular-biology section for 5 years. Dr. Galas earned his PhD in physics from the University of California, Davis-Livermore in 1972.

Daniel Gardner is professor of Physiology and Physiology in Neuroscience at Cornell University. Dr. Gardner has just published the first comprehensive description of a data structure for neurobiologic databases.
In collaboration with cortical neurophysiologists at several institutions, he is also developing an Internet-accessible database called the Cortical Neuron Database. It will contain electrophysiologic and other information describing cortical neurons and their characteristic responses to somatosensory and other stimuli. Dr. Gardner is using a Common Data Model, designed to serve the needs of interoperability between disparate neuroscience data resources throughout the Human Brain Project and beyond. In addition, he heads the development of the Aplysia database project.

James Garrels is cofounder of Proteome, Inc. Dr. Garrels spent 17 years at the Cold Spring Harbor Laboratory, where his group developed the QUEST system for two-dimensional gel electrophoresis and computer analysis. This was a leading-edge facility in a field that is now called proteomics. In 1995, he cofounded Proteome, Inc. with his wife, Dr. Brooks. They have built a growing business around the production of highly annotated proteome databases using genomic and literature sources. They have comprehensively curated proteome databases for yeast and worm (C. elegans), with more species on the way.

William Gelbart is professor of Molecular and Cellular Biology at Harvard University. Dr. Gelbart is also a scientific adviser to the Genomes Division of the National Center for Biotechnology Information and an external adviser to the National Human Genome Research Institute (NHGRI) large-scale human and mouse genome-sequencing projects. Since its inception 7 years ago, Dr. Gelbart has been the principal investigator of FlyBase, the NHGRI-funded database of the genome and genetics of the fruit fly Drosophila. Among its other duties, the roughly 30-person FlyBase group is responsible for maintaining the annotation of the soon-to-be-released full sequence of the Drosophila melanogaster genome. In addition, it maintains a thorough curation of the Drosophila literature and through collaborations with other databases is involved in many projects to provide a rich set of links and relationships with information from other biologic systems. Such database interoperability is one of the major issues facing bioinformatics, and FlyBase is heavily involved in exploring this area. Dr. Gelbart obtained his PhD in 1971 from the University of Wisconsin. He did his postdoctoral work at California Institute of Technology and the University of Connecticut.

Peter D. Karp is senior computer scientist and director of the Bioinformatics Group at SRI International. His bioinformatics work has focused on metabolic-pathway bioinformatics and on biologic databases.
He is the bioinformatics architect of EcoCyc and of MetaCyc. MetaCyc is a multispecies metabolic-pathway database. EcoCyc is a pathway-genome database for E. coli that integrates information about its full metabolic-pathway complement and its genome. Those data are combined with a powerful graphical user interface. EcoCyc is the first database to describe the full metabolic map of a free-living organism. Dr. Karp has also developed novel algorithms for predicting the metabolic map of an organism from its genome. His work on databases has included development of the object-oriented database system that underlies EcoCyc and work in the area of interoperation of heterogeneous biologic databases. He has organized two workshops in this area and has written several publications on database interoperation. Dr. Karp earned his PhD in computer science from Stanford University in 1989. He was a postdoctoral fellow at the NIH National Center for Biotechnology Information.

*Richard M. Karp is a professor of Computer Science and Engineering at the University of California, Berkeley and a senior research scientist at the International Computer Science Institute in Berkeley. He received his AB (1955), SM (1956), and PhD (1959) from Harvard University. He has worked at IBM Research (1959-1968), Berkeley (1968-1994, 1999-present), and the University of Washington (1995-1999). The unifying theme in his work has been the study of combinatorial algorithms. He has worked on NP completeness, parallel algorithms, probabilistic analysis of algorithms, randomized algorithms, and on-line algorithms. His current research is concerned with strategies for sequencing genomes, the analysis of gene-expression data, and other combinatorial problems arising in molecular biology. He has received the US National Medal of Science, the Harvey Prize (Technion), the Turing Award (Association for Computing Machinery), the Centennial Medal (Harvard University), the Fulkerson Prize (American Mathematical Society and Mathematical Programming Society), the von Neumann Theory Prize (Operations Research Society of America and the Institute for Management Science), the Lanchester Prize (Operations Research Society of America and the Institute for Management Science), the Von Neumann Lectureship (Society for Industrial and Applied Mathematics), and the Distinguished Teaching Award (University of California, Berkeley). He is a member of the National Academy of Sciences, the National Academy of Engineering, and the American Philosophical Society and a fellow of the American Academy of Arts and Sciences. He holds four honorary degrees.

Stephen H. Koslow is director of the Office on Neuroinformatics and associate director of the National Institute of Mental Health (NIMH).
From 1990 to 1999, he served as the director of the NIMH Division of Neuroscience Research. Before that he worked in the NIMH Intramural Research Laboratories and in the extramural programs, where he established the first neuroscience research program. Dr. Koslow serves as the chair of a Neuroinformatics Working Group of the OECD Megascience Forum and as a cochair of the EC-US Neuroinformatics Committee. He has received numerous awards in recognition of his accomplishments, serves on the editorial boards of numerous neuroscience journals, and is a consultant to a number of private organizations and businesses. He received his BS from Columbia University and his PhD in pharmacology from the University of Chicago.

John Mazziotta is professor of Neurology, Radiological Sciences, and Pharmacology at UCLA; director of the division of brain mapping; and associate director of the Neuropsychiatric Institute. He runs the largest consortium of the Human Brain Project and is constructing a probabilistic database for brain imaging.

*Perry L. Miller is director of the Center for Medical Informatics and professor of anesthesiology at Yale University School of Medicine. He has been involved in a number of research projects involving clinical informatics, including work on computer-based clinical-decision support, network-based clinical information access, informatics in support of clinical research, and work as part of the Next Generation Internet initiative. He collaborates with several colleagues at Yale doing neuroinformatics research as part of the national Human Brain Project. He has also collaborated for many years with various researchers to build databases and informatics tools in support of genetics and genomics. Dr. Miller received his PhD in computer science from Massachusetts Institute of Technology and his MD from the University of Miami.

G. Christian Overton was the director of the Center for Bioinformatics at the University of Pennsylvania. He held dual appointments as associate professor at University of Pennsylvania in the Departments of Genetics and Computer and Information Sciences. His work focused on annotation of the human genome through computational analyses and database integration. Database integration, which remains one of the more formidable challenges facing bioinformatics, enables access to vertical information within a species (genome, transcriptome, and proteome information) and horizontally across species to identify orthologous relationships. Dr. Overton received his PhD in biophysics from Johns Hopkins University in 1978.
He did his postdoctoral work at the Wistar Institute in developmental biology and earned a master's degree in computer and information science at the University of Pennsylvania. After spending 5 years in the computer industry, he returned to academe to participate in the Human Genome Project.

Brian Ray is senior editor of Science and editor of the Signal Transduction Knowledge Environment (STKE). Dr. Ray is responsible for the selection and editing of research papers in signal transduction, the cell cycle, and cell biology. His interest and experience in bioinformatics have developed from his role in the design and implementation of Science's Signal Transduction Knowledge Environment. The Knowledge Environment is a resource for scientists that uses the World Wide Web to provide efficient access to multiple kinds of information, including a large database of information on signaling molecules and their interactions. Dr. Ray earned his bachelor's degree from the University of California, Berkeley and his PhD from the University of Virginia. He did postdoctoral research with Tom Sturgill. Dr. Ray is best known for his work in the discovery of mitogen-activated protein (MAP) kinase, now known to be a member of a class of enzymes that participate in regulation of a broad range of cellular processes from cell division to cell death.

Dong-Guk Shin is professor of Computer Science and Engineering at the University of Connecticut. Dr. Shin's research interests include database interoperability, knowledge discovery from databases, and graphical user interface design for databases. For the last few years, Dr. Shin has been leading a number of research projects related to bioinformatics through funding from the National Institutes of Health, the National Science Foundation, and the Department of Energy. He has developed a user-friendly graphical ad hoc query interface that enables computational biologists to quickly learn and examine public genome database schemata and produce semantically correct SQL queries graphically. He has also been developing a graphical data-flow editor that computational biologists can use to integrate a series of data analysis and database querying activities into one seamless data flow. Recently, Dr. Shin has been expanding his previous graphical query editor work so that it can allow computational biologists to express queries against GenBank in any manner they wish. Another current project of Dr. Shin is to develop a database including physiology models, cell images, and biochemical and electrophysiologic data to support the Virtual Cell framework. In 1999, Dr.
Shin was the recipient of the University of Connecticut's Chancellor's Information Technology Award. Dr. Shin holds MSE (1981) and PhD (1985) degrees in computer science and engineering from the University of Michigan, Ann Arbor. He joined the University of Connecticut faculty in 1986. During the 1993-1994 academic year, he was a visiting faculty member at the Genome Data Base at Johns Hopkins University.

*Ray White is the Thomas D. Dee II Professor of Human Genetics at the University of Utah, founding director and senior director of science at the Huntsman Cancer Institute, and chair of the Department of Oncological Sciences at the University of Utah. His research is directed toward the identification and characterization of genes associated with inherited cancer syndromes. In the early 1980s, his work was instrumental in clarifying the genetic mechanism underlying development of retinoblastoma, an inherited cancer of the eye; his concept provided a paradigm for a class of genes that have come to be called tumor suppressors. Honors have included the 1993 Rosenblatt Prize for Excellence from the University of Utah, the Rosenthal Foundation Award from the American Association for Cancer Research, the Charles S. Mott Prize for Cancer Research from the General Motors Foundation, the National Medical Research Award from the National Health Council, the Distinguished Research Award from the University of Utah, the Allan Award for Cancer Research from the American Society of Human Genetics, the Friedrich von Recklinghausen Award from the National Neurofibromatosis Foundation, and the Brandeis University Lewis S. Rosenstiel Award for Distinguished Work in Basic Medical Sciences. Dr. White earned a BS from the University of Oregon and a PhD from the Massachusetts Institute of Technology. He pursued postdoctoral study at Stanford University and was a member of the faculty of the University of Massachusetts School of Medicine at Worcester before going to Utah in 1980.

Gio Wiederhold is professor of Computer Science at Stanford University, with courtesy appointments in the Departments of Medicine and Electrical Engineering. Dr. Wiederhold's current research focus is on gaining precision in integration of information from multiple autonomous sources. Issues addressed in that domain include the resolution of semantic inconsistencies, effective delegation of processing to remote nodes, simulation for augmentation of results, and providing security and privacy in collaborative settings.
His early research contributions included development of real-time data-acquisition technology for medical research (1966), time-oriented databases for ambulatory care (1972), the initiation of knowledge-based research (1977), and the concept of mediated architectures for information integration (1990). During a 3-year leave at ARPA/DARPA, Dr. Wiederhold initiated programs in intelligent information integration and participated actively in the establishment of the National Science Foundation Digital Library initiative. Recent research results include an approach to protect the release of private data in settings where broad access must be granted. Dr. Wiederhold was educated in the Netherlands, started programming there in 1957, and came to the United States in 1958. In 1965, he joined Stanford to direct a computing project for professors Feigenbaum and Lederberg. He obtained a PhD in medical information science from the University of California, San Francisco, and joined the Stanford faculty in 1976. He has been elected a fellow of the ACMI, the IEEE, and the ACM.