quencing technologies can now be conducted faster and less expensively than was possible with previous generations of technologies. Next-generation sequencing technologies are substantially different from those based on the original Sanger method (Box C-1) and promise remarkable increases in sequencing capabilities.
Next-generation sequencing instruments have made it possible to sequence huge amounts of DNA quickly, thoroughly, and affordably and have opened opportunities to study a wide array of biologic questions, from the metagenomics of water, to characterization of the genetic basis of species differences in response to environmental insults, to human variability in susceptibility to environmentally related diseases. Third-generation sequencing promises to provide full genome sequencing of individuals (humans or other organisms) for less than $1,000 per genome by the end of 2013 (Valigra 2012), and at least one company already offers such services at about $5,000 per genome (Knome 2012).
The sequencing of the human genome and of the genomes of hundreds of other model organisms of great importance for human and environmental health constitutes an enormous step forward in understanding genetic origins of disease, genetic variability, evolutionary biology, and many other subjects of scientific relevance to EPA. However, from a biologic perspective, it is the expression of the genes in specific cells and tissues that ultimately defines an organism and how it responds to its environment. Thus, measuring the extent of gene expression at a given time in a particular cell or tissue is potentially even more informative about biologic mechanisms. The universe of RNA molecules that are transcribed from DNA and that are present in a cell or tissue at any given time is referred to as the transcriptome. In the last 2 decades, new tools have been developed that allow one to analyze the entire transcriptome in a cell or tissue and to study changes in gene expression that might be caused by changes in the environment, such as exposure to a chemical. Microarray methods now allow analysis of virtually all mRNA molecules that are transcribed from active genes. Typically, these arrays contain hundreds of thousands of unique features, each of which quantifies the amount of a particular mRNA transcript in the sample. Having multiple features on the array that interrogate different parts of a single gene, such as different exons or exon-intron boundaries (potential splice sites), provides a remarkable snapshot of which genes are functioning in a cell at a particular time.
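As a rough illustration of how feature intensities can be turned into a measure of changed gene expression after an exposure, the sketch below computes a log2 fold change per transcript between control and exposed samples. It is a minimal example under stated assumptions, not a description of any particular array platform or analysis pipeline: the gene names, intensity values, function names, and the pseudocount are all hypothetical, and normalization, background correction, and statistical testing are omitted.

```python
import math
from statistics import mean


def log2_fold_change(control, exposed, pseudocount=1.0):
    """Return log2 of (mean exposed intensity / mean control intensity).

    The pseudocount (an illustrative choice) is added to both means so that
    a mean intensity of zero does not cause a division by zero.
    """
    return math.log2((mean(exposed) + pseudocount) / (mean(control) + pseudocount))


def rank_by_change(data):
    """Sort genes by the absolute size of their log2 fold change, largest first."""
    changes = {
        gene: log2_fold_change(values["control"], values["exposed"])
        for gene, values in data.items()
    }
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical feature intensities (arbitrary units), three replicates each.
intensities = {
    "GENE_A": {"control": [100.0, 120.0, 110.0], "exposed": [420.0, 440.0, 460.0]},
    "GENE_B": {"control": [200.0, 210.0, 190.0], "exposed": [205.0, 195.0, 200.0]},
    "GENE_C": {"control": [800.0, 820.0, 780.0], "exposed": [200.0, 190.0, 210.0]},
}

ranked = rank_by_change(intensities)

if __name__ == "__main__":
    for gene, change in ranked:
        print(f"{gene}: log2 fold change = {change:+.2f}")
```

A positive value indicates higher expression in the exposed samples and a negative value indicates lower expression; a value near zero indicates little change. The log scale is used so that a fourfold increase and a fourfold decrease have the same magnitude with opposite signs.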
To study complex and common diseases that may be influenced by environmental factors (such as cardiovascular disease and cancer), human studies typically require high-quality DNA from thousands of patients, often from small quantities of tissues or blood. Several common commercial microarrays for RNA applications in studies of this sort have been available for more than a decade and measure the expression of individual genes. However, understanding the human transcriptome is much more complex than simply measuring the com-