of polymorphisms for those genes, and to foster epidemiological studies of gene-environment interactions in disease etiology. As discussed later in this chapter, evidence is particularly strong for increased chemical susceptibilities of individuals with polymorphisms of genes encoding drug-metabolizing enzymes (DMEs). EGP is using genomic technologies, such as high-throughput sequence analysis developed for the HGP, which will facilitate epidemiological studies of gene-environment interactions in disease.

The types of genomic technologies used in CGAP and EGP, and many of the data themselves, are expected to have an enormous impact on future research in developmental toxicology. Ideally, a developmental toxicology counterpart to programs such as CGAP and EGP would focus on obtaining data on gene expression at all times and in all tissues during normal development. Embryos and fetuses from pregnant control (normal) animals could then be compared with those from pregnant treated animals to look for differences in gene expression. Changes could be identified in the expression of genes encoding proteins known to function in cell signaling, transcriptional regulation, cell division, cell motility, cell adhesion, apoptosis, differentiation, metabolism, repair, electrolyte balance, homeostasis, or transport. Such studies will help to elucidate the mechanisms by which extrinsic chemicals (potential developmental toxicants) act as agonists or antagonists of receptor- and enzyme-mediated subcellular processes during embryogenesis and fetal development. Such research will be a major force in merging the fields of developmental biology, genomics, and developmental toxicology.
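The control-versus-treated comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not a method prescribed by CGAP or EGP: the gene names, expression values, pseudocount, and threshold are all hypothetical, and a real study would use replicate measurements and statistical testing rather than a simple cutoff.

```python
import math


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes measured in both groups.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a gene is not detected in one group.
    """
    shared = control.keys() & treated.keys()
    return {
        gene: math.log2(
            (treated[gene] + pseudocount) / (control[gene] + pseudocount)
        )
        for gene in shared
    }


def flag_changed_genes(fold_changes, threshold=1.0):
    """Return, sorted by name, genes whose |log2 fold change| >= threshold."""
    return sorted(
        gene for gene, change in fold_changes.items() if abs(change) >= threshold
    )


# Hypothetical expression levels (arbitrary units) for embryos from
# control and treated animals; these are illustrative values, not data.
control = {"gene_A": 7.0, "gene_B": 15.0, "gene_C": 3.0}
treated = {"gene_A": 31.0, "gene_B": 15.0, "gene_C": 0.0}

changes = log2_fold_changes(control, treated)
changed = flag_changed_genes(changes)
print(changed)
```

With these made-up values, gene_A is flagged as increased and gene_C as decreased in treated embryos, while gene_B is unchanged. The flagged genes would then be examined for roles in processes such as cell signaling or apoptosis.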

Management of Genome Sequence and Functional Genomics Data

The explosive growth of molecular biology in the past two decades has led to enormous advances in DNA sequencing, which in turn have led to the increasingly rapid identification of genes as ORFs and as EST sites and to the identification of the function of gene products by sequence motifs (e.g., homeodomains, zinc-finger domains, kinase domains, and SH2 and SH3 domains). There are four major nucleotide sequence databases: GenBank and the Genome Sequence Database (GSDB) in the United States, the European Molecular Biology Laboratory (EMBL) Data Library, and the DNA Data Bank of Japan (DDBJ). All groups exchange new and updated sequences electronically, usually on the day of submission.
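To illustrate what identifying a candidate gene as an ORF involves computationally, the following minimal Python sketch scans the three forward reading frames of a DNA sequence for a start codon (ATG) followed in frame by a stop codon. It is a simplification under stated assumptions: it ignores the reverse strand, introns, and alternative start codons, and the function name, `min_codons` parameter, and example sequence are hypothetical. It does not represent how any of the databases named above annotate sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=2):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the index of the A in ATG; end is the index just past the
    stop codon. min_codons counts codons from ATG up to, but not
    including, the stop codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume the search after this stop codon
    return orfs


# Hypothetical sequence: ATG AAA TTT TAG begins at index 2 (frame 2).
example = "CCATGAAATTTTAGGG"
orfs = find_orfs(example)
print(orfs)
```

Motif-based functional assignment (for example, recognizing a homeodomain or kinase domain) works on the translated protein sequence and requires curated motif models, which this sketch does not attempt.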

These databases have a two-decade history. The Los Alamos Sequence Library was started in 1979 at the Department of Energy (DOE)-sponsored Los Alamos National Laboratory (LANL) with the goal of storing DNA sequence data in electronic form. Within the same year, a similar database was established at the EMBL in Heidelberg. In 1982, it was agreed that any data submitted to or entered by one group would be forwarded immediately to the other, thereby avoiding duplication of effort. Also in 1982, the LANL database became GenBank when Bolt, Beranek and Newman (BBN) became the primary contractor for distribution of


