Appendix C

The Rapidly Expanding Field of
“-Omics” Technologies

Technologic advances in “-omics” technologies, especially in genomics, proteomics, metabolomics, bioinformatics, and related fields of the molecular sciences (referred to here collectively as panomics), have transformed the understanding of biologic processes at the molecular level and should eventually allow detailed characterization of the molecular pathways that underlie the biologic responses of humans and other organisms to environmental perturbations. The following sections discuss recent advances in -omics technologies and approaches. They also discuss some of the implications of -omics technologies for the US Environmental Protection Agency (EPA), areas in which EPA is at the leading edge of applying the technologies to address environmental problems, and areas in which EPA could benefit from more extensive engagement.

GENOMICS

Beginning in the late 1990s, the Human Genome Project (DOE 2011) ushered in an unprecedented leap in technologies that allow scientists to discern the fundamental sequences of genes of entire genomes—not only the human genome but a plethora of model organisms, such as plants, microorganisms, invertebrates, vertebrates, and even the long-extinct woolly mammoth (Miller et al. 2008; NHGRI 2012). The ability to derive, quickly and relatively inexpensively, the entire sequence of an organism’s genome provides unprecedented opportunities in biologic and ecologic sciences, including the opportunity to understand how environmental factors influence biology at the molecular level.

The Human Genome Project fueled the development of faster and less expensive DNA sequencing. So-called first-generation sequencing technologies, originally described by Sanger and Coulson (1975), have served as the primary technology for DNA sequencing for the last several decades, with estimated costs of $3 billion to sequence the human genome (NHGRI 2010; Woollard et al. 2011).



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.



Science For Environmental Protection: The Road Ahead

Large-scale sequencing projects based on several next-generation sequencing technologies can now be conducted faster and less expensively than was possible with previous generations of technologies. Next-generation sequencing technologies are substantially different from those based on the original Sanger method (Box C-1) and promise remarkable increases in sequencing capabilities.

Next-generation sequencing instruments have made it possible to sequence huge amounts of DNA quickly, thoroughly, and affordably and have opened opportunities to study a wide array of biologic questions, from the metagenomics of water, to characterization of the genetic basis of species differences in response to environmental insults, to human variability in susceptibility to environmentally related diseases. Third-generation sequencing promises to provide full genome sequencing of individuals (humans or other organisms) for less than $1,000 per genome by the end of 2013 (Valigra 2012), and at least one company already offers such services at about $5,000 per genome (Knome 2012).

TRANSCRIPTOMICS

The sequencing of the human genome, and of the genomes of hundreds of other model organisms of great importance for human and environmental health, constitutes an enormous step forward in understanding genetic origins of disease, genetic variability, evolutionary biology, and many other subjects of scientific relevance to EPA. However, from a biologic perspective, it is the expression of the genes in specific cells and tissues that ultimately defines an organism and how it responds to its environment. Thus, measuring the extent of gene expression at a given time in a particular cell or tissue is potentially even more informative of biologic mechanisms. The universe of small RNA molecules that are transcribed from DNA and that are present in a cell or tissue at any given time is referred to as the transcriptome.

In the last 2 decades, new tools have been developed that allow one to analyze the entire transcriptome in a cell or tissue and to study changes in gene expression that might be created by changes in the environment, such as exposure to a chemical. There are now microarray methods that allow for the analysis of virtually all mRNA molecules that are transcribed from active genes. Typically, these arrays contain hundreds of thousands of unique features that quantitatively identify the amount of a particular mRNA transcript in the sample. Having multiple features on the array that look at different parts of a single gene, such as different exons or exon-intron boundaries (potential splice sites), provides a remarkable snapshot of which genes are functioning in a cell at a particular time.

To study complex and common diseases that may be influenced by environmental factors (such as cardiovascular disease and cancer), human studies typically require high-quality DNA from thousands of patients, often from small quantities of tissues or blood. Several common commercial microarrays for RNA applications in studies of this sort have been available for more than a decade and measure the expression of individual genes. However, understanding the human transcriptome is much more complex than simply measuring the complement of mRNAs from the genome because alternative splicing¹ is common and contributes substantially to protein and functional diversity in humans and other higher organisms (Xu et al. 2011). Technologies for measuring mRNA transcripts in all their varieties, including alternatively spliced transcripts and copy-number variants, have grown rapidly in the last few years. For example, a new approach called the Glue Grant Human Transcriptome Array performs a comprehensive analysis of the human transcriptome using a 6.9 million-feature oligonucleotide array. The array assesses gene-level and exon-level expression by using high-density tiling of probes that cover a large portion of the transcriptome. It can also detect alternative splicing and can analyze noncoding transcripts and common variants (such as single nucleotide polymorphisms) of genes (so-called cSNPs) (Xu et al. 2011). This technology was recently used in a multicenter clinical program that produced high-quality reproducible data (Xu et al. 2011). It is an example of the rapid change in technologies in the -omics world and will increasingly provide new approaches to understanding how environmental factors influence the development of common diseases. Such technologies will also have many applications in the fields of microbial genomics, evolutionary biology, and other areas of interest to EPA.

¹ Alternative splicing is "the process by which individual exons of pre-mRNAs are spliced to produce different isoforms of mRNA transcripts from the same gene" (Xu et al. 2011).

BOX C-1 Comparison of Sanger and Next-Generation Sequencing (NGS)

The initial preparation of the DNA sample is more labor intensive for NGS than for Sanger, but the amount of sequence data obtained per sample is substantially more. The number of sequencing reads from a single instrument per run is of the order of thousands with Sanger, but millions to billions with NGS; for example, a bacterial genome can be sequenced in a single run in days using NGS, versus months using Sanger sequencing. Read lengths from Sanger sequencing are up to 900 [base pairs], but in NGS vary from 30 to 500 [base pairs] depending on the platform. DNA sequencing costs have been driven down by NGS and base pair per dollar costs show a consistent 19-months doubling time reduction for Sanger sequencing. For NGS, the equivalent figure is approximately 5-months doubling time cost reduction. NGS can detect somatic mutations at [less than or equal to] 1%, whereas Sanger sequencing has significantly less sensitivity. The greater versatility of NGS is illustrated in generating whole-genome datasets, such as miRNA and ChIP-Seq; Sanger sequencing lacks this capability.

Abbreviations: ChIP-Seq, chromatin immunoprecipitation sequencing; miRNA, micro RNA; NGS, next-generation sequencing.

Source: Woollard et al. 2011.
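
The doubling times quoted in Box C-1 imply very different rates of improvement. The following is a minimal sketch of that arithmetic, assuming idealized exponential growth in base pairs per dollar at a constant doubling time. The function name `fold_increase` and the 5-year horizon are illustrative assumptions, not taken from Woollard et al. (2011).

```python
def fold_increase(elapsed_months: float, doubling_time_months: float) -> float:
    """Fold increase in base pairs per dollar after `elapsed_months`,
    assuming (as an idealization) a constant doubling time."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (elapsed_months / doubling_time_months)


# Doubling times quoted in Box C-1, in months.
SANGER_DOUBLING_MONTHS = 19
NGS_DOUBLING_MONTHS = 5

# Illustrative horizon (an assumption): 5 years = 60 months.
horizon_months = 60
sanger_gain = fold_increase(horizon_months, SANGER_DOUBLING_MONTHS)
ngs_gain = fold_increase(horizon_months, NGS_DOUBLING_MONTHS)

print(f"Sanger: about {sanger_gain:.1f}-fold more base pairs per dollar")
print(f"NGS: about {ngs_gain:.0f}-fold more base pairs per dollar")
```

Under these assumptions, 60 months spans 12 NGS doublings (2^12 = 4,096-fold) but only a little more than 3 Sanger doublings (roughly 9-fold), which is one way to read the gap between the two figures in the box.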

PROTEOMICS

Proteomics is the study of the entire complement of proteins in a cell or tissue, known as the proteome. The proteome is much more complicated than the genome because the proteome differs from cell to cell and from time to time, whereas the genome of an organism is largely unchanged between cells and over time. Furthermore, most proteins in a cell undergo posttranslational modifications (for example, phosphorylation, glycosylation, methylation, and ubiquitination), which can result in several functional forms of the same protein. The proteome is potentially far more informative than the genome with respect to environmental response. Measuring and understanding changes in the proteome after environmental perturbations are therefore increasingly important in many fields of environmental science and engineering.

Proteomic technologies and approaches will have an increasingly important role in environmental monitoring and health risk assessment of relevance to EPA. For example, proteome-based biomarkers may be useful in deciphering the associations between pesticide exposure and cancer and will perhaps lead to predictive biomarkers of pesticide-induced carcinogenesis (George and Shukla 2011).

Proteomics has been used to explore "a multitude of bacterial processes, ranging from the analysis of environmental communities [and the] identification of virulence factors to the proteome-guided optimization of production strains" (Chao and Hansmeier 2012). Proteomics has become a valuable tool for the global analysis of bacterial physiology and pathogenicity, although many challenges remain, especially in the accurate prediction of phenotypic consequences based on a given proteome composition (Chao and Hansmeier 2012). Lemos et al. (2010) have discussed the advantages of and challenges to using proteomics in ecosystems research.

METABOLOMICS

Substantial improvements in instrumentation, especially nuclear magnetic resonance spectroscopy (Serkova and Niemann 2006) and mass spectrometry (Dettmer et al. 2007), provide increasingly sensitive approaches to measuring hundreds or even thousands of small molecules in a cell in a matter of minutes. The new technologies have given rise to a promising new -omics technology referred to as metabolomics, the "systematic study of the unique chemical fingerprints that specific cellular processes leave behind" (Bennett 2005) or, more specifically, the study of their small-molecule metabolite profiles. "In analogy to the genome, which is used as synonym for the entirety of all genetic information, the metabolome represents the entirety of the metabolites within a biological system" (Oldiges et al. 2007). The total number of metabolites in a single cell, tissue, or organism is, of course, highly variable and depends on the biologic system investigated. Hundreds of distinct metabolites have been identified in microorganisms. For example, the Escherichia coli database EcoCyc contains over 2,000 metabolite entries (Keseler et al. 2011), and the metabolome of the common baker's yeast, Saccharomyces cerevisiae, has about 600 metabolites, the major ones having molecular weight below 300 g/mol (reviewed in Oldiges et al. 2007). It has been projected that plants have more than 200,000 primary and secondary metabolites (Mungur et al. 2005).

Although far less mature than transcriptomics and proteomics, metabolomics offers great promise for the development of early biomarkers of disease (Hollywood et al. 2006) and other uses of relevance to EPA. Because metabolomics in many ways is the final integration of genomics, transcriptomics, and proteomics, it is likely that future developments in this area will become essential for understanding the functions of the genomes of organisms of interest to EPA, ranging from pathogenic bacteria in drinking water to humans. Indeed, EPA scientists are applying metabolomics approaches to aquatic toxicology (Ekman et al. 2011), in vitro assessments for developmental toxicology (Kleinstreuer et al. 2011), and carcinogenic risk assessment (Wilson et al. 2012 in press), to name a few.

EPIGENETICS

As noted by Rothstein et al. (2009), "epigenetics is one of the most scientifically important, and legally and ethically significant, cutting-edge subjects of scientific discovery." Epigenetic changes are chemical modifications of DNA that do not involve changes in its nucleotide sequence. Those alterations play a critical role in how and when a particular gene is expressed. It is clear that environmental factors, including diet, can influence how epigenetic regulation of gene expression occurs. Epigenetic regulation is especially important during periods of cell and tissue growth, such as embryonic and fetal development. Epigenetic changes can be triggered by environmental factors.
For example, exposure to metals, persistent organic pollutants, and some endocrine disruptors modulates epigenetic markers in mammalian cells and in other environmentally relevant species and has the potential to cause disease (Vandegehuchte and Janssen 2011; Guerrero-Bosagna and Skinner 2012). Some studies have demonstrated that epigenetic changes can sometimes be transferred to later generations, even in the absence of the external factors that induced the epigenetic changes (Skinner 2011).

EPA scientists in the National Health and Environmental Effects Research Laboratory (NHEERL) are aware of the growing importance of epigenetics in environmental health assessment. A seminal review of the application of epigenetic mechanisms to carcinogenic risk assessment was published by NHEERL scientist Julian Preston (2007). Since then, relatively few publications from NHEERL or other EPA laboratories have addressed epigenetics. A PubMed search identified five publications by EPA scientists in the last 5 years. A recent review by Jardim (2011) discussed the implications of microRNAs (a form of epigenetic regulation of gene expression) for air-pollution research, and Lau et al. (2011) reviewed fetal programming of adult disease (also thought to be an epigenetic phenomenon) and its implications for prenatal care. Hsu et al. (2007) addressed the implications of epigenetics in the carcinogenic mode of action of nitrobenzene, but only two original research publications that provided experimental data from EPA have directly assessed epigenetic mechanisms. One study (Grace et al. 2011) evaluated the role of maternal influences on epigenetic programming in the in utero development of endocrine signaling in the brain. The second (DeAngelo et al. 2008) provided dose-response data on the development of hepatocellular neoplasia in male mice exposed over a lifetime to trichloroacetic acid, a putative carcinogenic product of trichloroethylene solvent breakdown and a chlorination disinfection byproduct. Although the authors did not assess epigenetic changes experimentally, they suggested that epigenetic mechanisms might explain the observed tumors inasmuch as the compound was not genotoxic.

EPA has not published many original papers on epigenetics, but the EPA grants database lists 36 extramural research grants to universities across the country that are exploring the role of epigenetics in environmental response (EPA 2012). Given the relevance of this emerging field, it is important that EPA scientists and regulators become more active in the accumulation of epigenetic knowledge and its application to human and environmental health risk assessment. Although much remains to be learned about epigenetic phenomena, epigenetic regulation is likely to be a critical contributor to many diseases that have both a genetic and an environmental component, and will be especially important in understanding how exposures early in life might contribute to disease onset later in life.

BIOINFORMATICS

Rapid advances in biotechnology have resulted in an explosion in -omics data and in information on biochemical and physiologic processes in complex biologic systems.
The advent of the internet, new technologies, and high-throughput sequencing has spurred further growth of -omics data and has made it possible to disseminate data globally (Attwood et al. 2011). Since the 1990s, the field of bioinformatics has grown in response to the need for the generation, storage, retrieval, processing, analysis, and interpretation of -omics data. It draws on the principles, theories, and methods of the biologic sciences, computer science and engineering, mathematics, and statistics, and it has always been at the core of understanding of biologic processes and disease pathways (Attwood et al. 2011). As the -omics revolution continues, bioinformatics will continue to evolve, and EPA will continue to require in-house expertise and state-of-the-science capacity in the field.

Analysis of biologic data has evolved from comparisons of various kinds of sequence data (Needleman and Wunsch 1970; Smith and Waterman 1981; Lipman and Pearson 1985) to algorithms that can search various sequence databases. Methods and tools have also been developed for the analysis of sequence, annotation, and expression data in support of a wide variety of applications, such as pattern recognition, protein and RNA structure prediction, microarray data analysis (Attwood et al. 2011), and biomarker discovery (Baumgartner et al. 2011; Roy et al. 2011). There is an increasing emphasis on understanding biologic systems through modeling of biologic, physiologic, and biochemical processes (Deville et al. 2003; Ng et al. 2006; Viswanathan et al. 2008), including gene-gene and protein-protein interactions (Tong et al. 2004; Rual et al. 2005); pathway analysis (Schilling et al. 2000; Wishart 2007; Viswanathan et al. 2008); and network mapping (Lee and Tzou 2009).

An integrative approach is needed to use different types of databases to identify distinct system components (organized in modules and subnetworks) and to understand their relationships and thereby reduce the complexity of a biologic system as a whole (Lee and Tzou 2009). There are outstanding challenges to the integrative modeling of biologic systems, some of which are summarized in a recent report from the SYSGENET Bioinformatics Working Group (Durrant et al. 2011). Because integrative systems modeling requires synthesizing and harmonizing the analyses of transcriptome, proteome, interactome, metabolome, and phenome data, which are likely to be held in numerous heterogeneous databases, it is critical to improve the interoperability, compatibility, and exchange of software modules that are the foundation of data-processing platforms (such as TIQS and xQTL), database platforms (such as GeneNetwork and XGAP), and data-analysis toolboxes (such as HAPPY and R/QTL). A standard computer language for software development and cloud sourcing would facilitate efficient software dissemination to the bioinformatics community. In addition, further development of public repositories for data models and software source code would promote the use of common data structures and file formats.

To stay at the cutting edge of bioinformatics and take full advantage of its rapid advance, EPA will need a highly skilled bioinformatics workforce that can closely follow developments in bioinformatics tools and software.
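
The pairwise sequence comparison cited earlier in this section (Needleman and Wunsch 1970) can be illustrated with a minimal sketch. The code below uses the textbook dynamic-programming formulation of global alignment that is commonly associated with that paper; the function name and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions, not parameters taken from the cited work.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "AGT"))
```

The sketch returns only the score. Recovering the alignment itself would require a traceback through the table, and database-scale searches rely on faster heuristics such as those described by Lipman and Pearson (1985).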

As discussed in Chapter 3, EPA already has a leadership role in bioinformatics as applied to toxicity assessment and is well positioned to contribute to standardization and harmonization processes in the field.

REFERENCES

Attwood, T.K., A. Gisel, N.E. Eriksson, and E. Bongcam-Rudloff. 2011. Concepts, historical milestones and the central place of bioinformatics in modern biology: A European perspective. Chapter 1 in Bioinformatics - Trends and Methodologies, M.A. Mahdavi, ed. InTech-Open Access [online]. Available: http://www.intechopen.com/books/bioinformatics-trends-and-methodologies [accessed Mar. 30, 2012].
Baumgartner, C., M. Osl, M. Netzer, and D. Baumgartner. 2011. Bioinformatic-driven search for metabolic biomarkers in disease. J. Clin. Bioinform. 1:2, doi:10.1186/2043-9113-1-2.
Bennett, D. 2005. Growing pains for metabolomics. The Scientist 19(8):25-28.
Chao, T.C., and N. Hansmeier. 2012. The current state of microbial proteomics: Where we are and where we want to go. Proteomics 12(4-5):638-650.
DeAngelo, A.B., F.B. Daniel, D.M. Wong, and M.H. George. 2008. The induction of hepatocellular neoplasia by trichloroacetic acid administered in the drinking water of the male B6C3F1 mouse. J. Toxicol. Environ. Health A 71(16):1056-1068.
Dettmer, K., P.A. Aronov, and B.D. Hammock. 2007. Mass spectrometry-based metabolomics. Mass Spectrom. Rev. 26(1):51-78.
Deville, Y., D. Gilbert, J. van Helden, and S.J. Wodak. 2003. An overview of data models for the analysis of biochemical pathways. Brief. Bioinform. 4(3):246-259.
DOE (US Department of Energy). 2011. Major Events in the US Human Genome Project and Related Projects. Office of Science, US Department of Energy [online]. Available: http://www.ornl.gov/sci/techresources/Human_Genome/project/timeline.shtml [accessed Mar. 30, 2012].
Durrant, C., M.A. Swertz, R. Alberts, D. Arends, S. Möller, R. Mott, J.C.P. Prins, K.J. van der Velde, R.C. Jansen, and K. Schughart. 2011. Bioinformatics tools and database resources for systems genetics analysis in mice - a short review and an evaluation of future needs. Brief. Bioinform. 13(2):135-142.
Ekman, D.R., D.L. Villeneuve, Q. Teng, K.J. Ralston-Hooper, D. Martinovic-Weigelt, M.D. Kahl, K.M. Jensen, E.J. Durhan, E.A. Makynen, G.T. Ankley, and T.W. Collette. 2011. Use of gene expression, biochemical and metabolite profiles to enhance exposure and effects assessment of the model androgen 17β-trenbolone in fish. Environ. Toxicol. Chem. 30(2):319-329.
EPA (US Environmental Protection Agency). 2012. Research Project Search. Extramural Research. US Environmental Protection Agency [online]. Available: http://cfpub.epa.gov/ncer_abstracts/index.cfm/fuseaction/search.welcome [accessed Apr. 4, 2012].
George, J., and Y. Shukla. 2011. Pesticides and cancer: Insights into toxicoproteomic-based findings. J. Proteomics 74(12):2713-2722.
Grace, C.E., S.J. Kim, and J.M. Rogers. 2011. Maternal influences on epigenetic programming of the developing hypothalamic-pituitary-adrenal axis. Birth Defects Res. A Clin. Mol. Teratol. 91(8):797-805.
Guerrero-Bosagna, C., and M.K. Skinner. 2012. Environmentally induced epigenetic transgenerational inheritance of phenotype and disease. Mol. Cell Endocrinol. 354(1-2):3-8.
Hollywood, K., D.R. Brison, and R. Goodacre. 2006. Metabolomics: Current technologies and future trends. Proteomics 6(17):4716-4723.
Hsu, C.H., T. Stedeford, E. Okochi-Takada, T. Ushijima, H. Noguchi, C. Muro-Cacho, J.W. Holder, and M. Banasik. 2007. Framework analysis for the carcinogenic mode of action of nitrobenzene. J. Environ. Sci. Health C Environ. Carcinog. Ecotoxicol. Rev. 25(2):155-184.
Jardim, M.J. 2011. microRNAs: Implications for air pollution research. Mutat. Res. 717(1-2):38-45.
Jeong, Y., B.F. Sanders, and S.B. Grant. 2006. The information content of high-frequency environmental monitoring data signals pollution events in the coastal ocean. Environ. Sci. Technol. 40(20):6215-6220.
Keseler, I.M., J. Collado-Vides, A. Santos-Zavaleta, M. Peralta-Gil, S. Gama-Castro, L. Muñiz-Rascado, C. Bonavides-Martinez, S. Paley, M. Krummenacker, T. Altman, P. Kaipa, A. Spaulding, J. Pacheco, M. Latendresse, C. Fulcher, M. Sarker, A.G. Shearer, A. Mackie, I. Paulsen, R.P. Gunsalus, and P.D. Karp. 2011. EcoCyc: A comprehensive database of Escherichia coli biology. Nucleic Acids Res. 39(suppl. 1):D583-D590.
Kleinstreuer, N.C., A.M. Smith, P.R. West, K.R. Conard, B.R. Fontaine, A.M. Weir-Hauptman, J.A. Palmer, T.B. Knudsen, D.J. Dix, E.L. Donley, and G.G. Cezar. 2011. Identifying developmental toxicity pathways for a subset of ToxCast chemicals using human embryonic stem cells and metabolomics. Toxicol. Appl. Pharmacol. 257(1):111-121.
Knome. 2012. Knome, Inc. [online]. Available: http://www.knome.com/ [accessed Apr. 2, 2012].
Lau, C., J.M. Rogers, M. Desai, and M.G. Ross. 2011. Fetal programming of adult disease: Implications for prenatal care. Obstet. Gynecol. 117(4):978-985.
Lee, W.P., and W.S. Tzou. 2009. Computational methods for discovering gene networks from expression data. Brief. Bioinform. 10(4):408-423.
Lemos, M.F., A.M. Soares, A.C. Correia, and A.C. Esteves. 2010. Proteins in ecotoxicology - how, why and why not? Proteomics 10(4):873-887.
Lipman, D.J., and W.R. Pearson. 1985. Rapid and sensitive protein similarity searches. Science 227(4693):1435-1441.
Miller, W., D.I. Drautz, A. Ratan, B. Pusey, J. Qi, A.M. Lesk, L.P. Tomsho, M.D. Packard, F. Zhao, A. Sher, A. Tikhonov, B. Raney, N. Patterson, K. Lindblad-Toh, E.S. Lander, J.R. Knight, G.P. Irzyk, K.M. Fredrikson, T.T. Harkins, S. Sheridan, T. Pringle, and S.C. Schuster. 2008. Sequencing the nuclear genome of the extinct woolly mammoth. Nature 456(7220):387-390.
Mungur, R., A.D. Glass, D.B. Goodenow, and D.A. Lightfoot. 2005. Metabolite fingerprinting in transgenic Nicotiana tabacum altered by the Escherichia coli glutamate dehydrogenase gene. J. Biomed. Biotechnol. 2005(2):198-214.
Needleman, S.B., and C.D. Wunsch. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3):443-453.
Ng, A., B. Bursteinas, Q. Gao, E. Mollison, and M. Zvelebil. 2006. Resources for integrative systems biology: From data through databases to networks and dynamic system models. Brief. Bioinform. 7(4):318-330.
NHGRI (National Human Genome Research Institute). 2010. The Human Genome Project Completion: Frequently Asked Questions. National Human Genome Research Institute [online]. Available: http://www.genome.gov/11006943 [accessed Apr. 3, 2012].
NHGRI (National Human Genome Research Institute). 2012. Highlights. National Human Genome Research Institute [online]. Available: http://www.genome.gov/ [accessed Apr. 3, 2012].
Oldiges, M., S. Lütz, S. Pflug, K. Schroer, N. Stein, and C. Wiendahl. 2007. Metabolomics: Current state and evolving methodologies and tools. Appl. Microbiol. Biotechnol. 76(3):495-511.
Preston, R.J. 2007. Epigenetic processes and cancer risk assessment. Mutat. Res. 616(1-2):7-10.
Rothstein, M.A., Y. Cai, and G.E. Marchant. 2009. The ghost in our genes: Legal and ethical implications of epigenetics. Health Matrix J. Law Med. 19(1) [online]. Available: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1140443 [accessed Apr. 6, 2012].
Roy, P., C. Truntzer, D. Maucort-Boulch, T. Jouve, and N. Molinari. 2011. Protein mass spectra data analysis for clinical biomarker discovery: A global review. Brief. Bioinform. 12(2):176-186.
Rual, J.F., K. Venkatesan, T. Hao, T. Hirozane-Kishikawa, A. Dricot, N. Li, G.F. Berriz, F.D. Gibbons, M. Dreze, N. Ayivi-Guedehoussou, N. Klitgord, C. Simon, M. Boxem, S. Milstein, J. Rosenberg, D.S. Goldberg, L.V. Zhang, S.L. Wong, G. Franklin, S. Li, J.S. Albala, J. Lim, C. Fraughton, E. Llamosas, S. Cevik, C. Bex, P. Lamesch, R.S. Sikorski, J. Vandenhaute, H.Y. Zoghbi, A. Smolyar, S. Bosak, R. Sequerra, L. Doucette-Stamm, M.E. Cusick, D.E. Hill, F.P. Roth, and M. Vidal. 2005. Towards a proteome-scale map of the human protein-protein interaction network. Nature 437(7062):1173-1178.
Sanger, F., and A.R. Coulson. 1975. A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase. J. Mol. Biol. 94(3):441-448.
Schilling, C.H., D. Letscher, and B.Ø. Palsson. 2000. Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. J. Theor. Biol. 203(3):229-248.
Serkova, N.J., and C.U. Niemann. 2006. Pattern recognition and biomarker validation using quantitative 1H-NMR-based metabolomics. Expert Rev. Mol. Diagn. 6(5):717-731.
Skinner, M.K. 2011. Environmental epigenetic transgenerational inheritance and somatic epigenetic mitotic stability. Epigenetics 6(7):838-842.
Smith, T.F., and M.S. Waterman. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147(1):195-197.
Tong, A.H.Y., G. Lesage, G.D. Bader, H. Ding, H. Xu, X. Xin, J. Young, G.F. Berriz, R.L. Brost, M. Chang, Y. Chen, X. Cheng, G. Chua, H. Friesen, D.S. Goldberg, J. Haynes, C. Humphries, G. He, S. Hussein, L. Ke, N. Krogan, Z. Li, J.N. Levinson, H. Lu, P. Ménard, C. Munyana, A.B. Parsons, O. Ryan, R. Tonikian, T. Roberts, A.M. Sdicu, J. Shapiro, B. Sheikh, B. Suter, S.L. Wong, L.V. Zhang, H. Zhu, C.G. Burd, S. Munro, C. Sander, J. Rine, J. Greenblatt, M. Peter, A. Bretscher, G. Bell, F.P. Roth, G.W. Brown, B. Andrews, H. Bussey, and C. Boone. 2004. Global mapping of the yeast genetic interaction network. Science 303(5659):808-813.
Valigra, L. 2012. Ion Torrent Claims to be First with $1K Genome Sequencer. MHT, January 11, 2012 [online]. Available: http://www.masshightech.com/stories/2012/01/09/daily29-Ion-Torrent-claims-to-be-first-with-1K-genome-sequencer.html [accessed Apr. 3, 2012].
Vandegehuchte, M.B., and C.R. Janssen. 2011. Epigenetics and its implications for ecotoxicology. Ecotoxicology 20(3):607-624.
Viswanathan, G.A., J. Seto, S. Patil, G. Nudelman, and S.C. Sealfon. 2008. Getting started in biological pathway construction and analysis. PLoS Comput. Biol. 4(2):e16.
Wilson, V.S., N. Keshava, S. Hester, D. Segal, W. Chiu, C.M. Thompson, and S.Y. Euling. 2012. Utilizing toxicogenomic data to understand chemical mechanism of action in risk assessment. Toxicol. Appl. Pharmacol. in press.
Wishart, D.S. 2007. Current progress in computational metabolomics. Brief. Bioinform. 8(5):279-293.
Woollard, P.M., N.A. Mehta, J.J. Vamathevan, S. Van Horn, B.K. Bonde, and D.J. Dow. 2011. The application of next-generation sequencing technologies to drug discovery and development. Drug Discov. Today 16(11/12):512-519.
Xu, W., J. Seok, M.N. Mindrinos, A.C. Schweitzer, H. Jiang, J. Wilhelmy, T.A. Clark, K. Kapur, Y. Xing, M. Faham, J.D. Storey, L.L. Moldawer, R.V. Maier, R.G. Tompkins, W.H. Wong, R.W. Davis, and W. Xiao. 2011. Human transcriptome array for high-throughput clinical studies. Proc. Natl. Acad. Sci. USA 108(9):3707-3712.