SUMMARY OF THE WORKSHOP

INTRODUCTION

A workshop on the validation of toxicogenomic technologies was held on July 7, 2005, in Washington, DC, by the National Research Council (NRC). The workshop concept was developed during deliberations of the Committee on Emerging Issues and Data on Environmental Contaminants (see Box 1 for a description of the committee and its purpose) and was planned by the ad hoc workshop planning committee (the ad hoc committee membership and biosketches are included in Appendix A). These activities are sponsored by the National Institute of Environmental Health Sciences (NIEHS). The day-long workshop featured invited speakers from industry, academia, and government who discussed the validation practices used in gene-expression (microarray) assays1,2 and other toxicogenomic technologies. The workshop also included roundtable discussions on the current status of these validation efforts and how they might be strengthened.

1

The microarray technologies referred to in this report measure mRNA levels in biologic samples. DNA from tens of thousands of known genes (for example, genes that code for toxicologically important enzymes such as cytochrome P450) is placed on small glass slides, with each gene in a specific position. These chips are exposed to mRNA isolated from biologic samples (for example, from rats that have been exposed to a pharmaceutical compound of interest). The mRNA in the sample is treated so that when it hybridizes with the complementary DNA strand on the chip, the resulting complex can be detected. Because the chips can hold DNA from thousands of genes, the expression (the level of each mRNA) of all of these genes can be detected simultaneously.

2

These technologies are commonly referred to as gene-expression arrays, transcript/transcriptional profiling, DNA microarray expression analysis, DNA microarrays, or gene chips; more broadly, the use of these technologies is referred to as transcriptomics.




BOX 1
Overview of the Committee on Emerging Issues and Data on Environmental Contaminants

The Committee on Emerging Issues and Data on Environmental Contaminants was convened by the National Research Council (NRC) at the request of NIEHS. The committee serves to provide a public forum for communication between government, industry, environmental groups, and the academic community about emerging issues in the environmental health sciences. At present, the committee is focused on toxicogenomics and its applications in environmental and pharmaceutical safety assessment, risk communication, and public policy. A primary function of this committee is to sponsor workshops on issues of interest in the evolving field of toxicogenomics. These workshops are developed by ad hoc NRC committees largely composed of members from the standing committee.

In addition, the standing committee benefits from input from the Federal Liaison Group. The group, chaired at the time of the meeting by Samuel Wilson, of NIEHS, consists of representatives from various federal agencies with interest in toxicogenomic technologies and applications. Members of the Federal Liaison Group are listed in Appendix C of this report.

The workshop agenda (see Appendix B) had two related sections. Part 1 of the workshop, on current validation strategies and associated issues, provided background presentations on several components essential to the technical validation of toxicogenomic experiments, including experimental design, reproducibility, and statistical analysis. In addition, this session featured a presentation on regulatory considerations in the validation of toxicogenomic technologies. The presentations in Part 2 of the workshop emphasized the validation approaches used in published studies where microarray technologies were used to evaluate a chemical's mode of action.3

This summary is intended to provide an overview of the presentations and discussions that took place during the workshop. This summary only describes those subjects discussed at the workshop and is not intended to be a comprehensive review of the field.

3

Mode of action refers to the pharmacologic or toxicologic end point or event in an organism that is elicited by a compound.

To provide greater depth and insight into the presentations from Part 1 of the workshop, original extended abstracts by the presenters are included as Attachments 1 through 4. In addition, the presenters' slides and the audio from the meeting are available on the Emerging Issues Committee's Web site.4

WORKSHOP SUMMARY

Introduction

Kenneth S. Ramos, of the University of Louisville and co-chair of the workshop planning committee, opened the workshop with welcoming remarks, background on the standing and workshop planning committees, and speaker introductions. Ramos also provided a brief historical perspective on the technological advances and applications of toxicogenomics. Beginning in the early 1980s, new technologies, such as those based on polymerase chain reaction (PCR),5 began to permit evaluation of the expression of individual genes. Recent technological advances (for instance, the development of microarray technologies) have expanded those evaluations to permit the simultaneous detection of the expression of tens of thousands of genes and to support holistic evaluations of the entire genome. The application of these technologies has enabled researchers to unravel complexities of cell biology and, in conjunction with toxicologic evaluations, the technologies are used to probe and gain insight into questions of toxicologic relevance. As a result, the use of the technologies has become increasingly important for scientists in academia, as well as for the regulatory and drug development process.

John Quackenbush, of the Dana-Farber Cancer Institute and co-chair of the workshop, followed up with a discussion of the workshop concept and goals. The workshop concept was generated in response to the standing committee's and other groups' recognition that the promises of toxicogenomic technologies can only be realized if these technologies are validated. The application of toxicogenomic technologies, such as DNA microarrays, to the study of drug and chemical toxicity has improved the ability to understand the biologic spectrum and totality of the toxic response and to elucidate potential modes of toxic action.

4

At http://dels.nas.edu/emergingissues.

5

PCR is a highly sensitive method that uses an enzyme system to amplify (increase) small amounts of mRNA so that it can be more easily detected.

Although early studies energized the field, some scientists continue to question whether results can be generalized beyond the initial test data sets and the steps necessary to validate the applications. In recognition of the importance of these issues, the NRC committee dedicated this workshop to reflecting critically on the technologies to more fully understand the issues relevant to the establishment of validated toxicogenomic applications. Because transcript profiling using DNA microarrays to detect changes in patterns of gene expression is in many ways the most advanced and widely used of all toxicogenomic approaches, the workshop focused primarily on validation of mRNA transcript profiling using DNA microarrays. Some of the issues raised may be relevant to proteomic and metabolic studies.

Validation can be broadly defined in different terms depending on context. Quackenbush delineated three components of validation: technical validation, biologic validation, and regulatory validation (see Box 2).6 Because of the broad nature of the topic, the workshop was designed to primarily address technical aspects of validation. For example, do the technologies actually provide reproducible and reliable results? Are conclusions dependent on the particular technology, platform, or method being used?

Part 1: Current Validation Strategies and Associated Issues

The first session of the workshop was designed to provide background information on the various experimental, statistical, and bioinformatics issues that accompany the technical validation of microarray analyses. Presenters were asked to address a component of technical validation from their perspective and experience; the presentations were not intended to serve as comprehensive reviews. A short summary of the topics in each presentation and of the discussion between presenters and other workshop participants is presented below.

6

Another aspect of validation discussed by Russell Wolfinger, of the SAS Institute and workshop planning committee member, was statistical validation, which involves verifying that data processing algorithms are performing as intended and are producing results that are reliable, reproducible, specific, and sensitive. However, he commented that considering statistical validation separately is debatable because statistical and bioinformatics methods could be viewed as an integral part of the other three kinds of validation described (technical, biologic, and regulatory).

This information is intended to be accessible to a general scientific audience. The reader is referred to the presenters' attachments to this report for greater technical detail and a comprehensive discussion of each presentation.

BOX 2
Validation: Technical Issues Are the First Consideration in a Much Broader Discussion

In general, the concept of validation is considered at three levels: technical, biologic, and regulatory.

Technical validation focuses on whether the technology being used provides reproducible and reliable results. The types of questions addressed are, for example, whether the technologies provide consistent and reproducible answers and whether the answers are dependent on the choice of one particular technology versus another.

Biologic validation evaluates whether the underlying biology is reflected in the answers obtained from the technologies. For example, does a microarray response indicate the assayed biologic response (for example, toxicity or carcinogenicity)?

Regulatory validation begins when technical and biologic validation are established and when the technologies are to be used as a regulatory tool. In this regard, do the new technologies generate information useful for addressing regulatory questions? For example, do the results demonstrate environmental or human health safety?

Experimental Design of Microarray Studies

Kevin Dobbin, of the National Cancer Institute, provided an overview of experimental design issues encountered in conducting microarray assays. Dobbin began by discussing experimental objectives and explaining that there is no one best design for every case because the design must reflect the objective a researcher is trying to achieve and the practical constraints of the experiments being done. Although the high-level goal of many microarray experiments is to identify important pathways or genes associated with a particular disease or treatment, there are different ways to approach this problem. Thus, it is important to clearly define the experimental objectives and to design a study that is driven by those objectives. Experimental approaches in toxicogenomics can typically be grouped into three categories based on objective: class comparison, class prediction, or class discovery (see Box 3 and the description in Attachment 1).

BOX 3
Typical Experimental Objectives in mRNA Microarray Analyses

Class Comparison
Goal: Identify genes differentially expressed among predefined classes of samples.
Example: Measure gene products before and after toxicant exposure to identify mechanisms of action (Hossain et al. 2000).
Example: Compare liver biopsies from individuals with chronic arsenic exposure to those of healthy individuals (Lu et al. 2001).

Class Prediction
Goal: Develop a multigene predictor of class membership.
Example: Identify gene sets predictive of toxic outcome (Thomas et al. 2001).

Class Discovery
Goal: Identify sets of genes (or samples) that share similar patterns of expression and that can be grouped together. Class discovery can also refer to the identification of new classes or subtypes of disease rather than the identification of clusters of genes with similar patterns.
Example: Cluster temporal gene-expression patterns to gain insight into genetic regulation in response to toxic insult (Huang et al. 2001).

Dobbin's presentation outlined several experimental design issues faced by researchers conducting microarray analyses. He discussed the level of biologic and technical replication7 necessary for making statistically supported comparisons between groups.

7

Biologic replicates are mRNA samples from separate individual subjects that were experimentally treated in an identical manner (for example, an mRNA isolate from each of five identically exposed animals). Technical replicates would, for example, be tests of different sample aliquots drawn from the same biologic sample.

8

Microarray technologies use two different approaches to detecting RNAs that have hybridized to the DNA probes on the array. Single-label technologies use a single fluorescent dye to detect hybridization of a single RNA sample to a single array, and comparisons are then made between arrays. Dual-label technologies compare two samples on each array by labeling each RNA with a unique fluorescent dye (often represented as red and green) before applying them to the arrayed probes.
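
To make the class-comparison objective in Box 3 concrete, the short sketch below runs a per-gene two-sample t-test on log2 expression values from two predefined classes (control and treated biological replicates). It is an illustrative example only, not an analysis presented at the workshop; the simulated data, gene counts, and significance threshold are assumptions chosen for demonstration.

```python
# Minimal sketch of the "class comparison" objective from Box 3:
# per-gene two-sample t-tests on log2 expression values, comparing
# control and treated biological replicates. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_genes = 5000

# Rows = biological replicates (one array per animal), columns = genes.
control = rng.normal(loc=8.0, scale=1.0, size=(5, n_genes))
treated = rng.normal(loc=8.0, scale=1.0, size=(5, n_genes))
treated[:, :50] += 1.0  # simulate 50 genes with a 2-fold (1 log2 unit) increase

# Per-gene t-test across the two predefined classes.
t_stat, p_val = stats.ttest_ind(treated, control, axis=0)

# Genes passing an (uncorrected) significance threshold; multiple-testing
# control is discussed later in this summary.
candidates = np.where(p_val < 0.001)[0]
print(f"{candidates.size} genes flagged at p < 0.001")
```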

He also discussed issues related to the study design that arise when using dual-label microarrays,8 including strategies for the selection of samples to be compared on each microarray, the use of control samples, and issues related to dye bias.9 The costs and benefits of pooling RNA samples for analysis on microarrays were discussed in relation to the study's design and goals.

As an example to help guide investigators, Dobbin presented a sample-size formula to determine the number of arrays needed for a class comparison experiment (see Equation 1). This formula calculates the statistical power of a study based on the variability estimates of the data, the number of arrays, the level of technical replication, the target fold-change in expression that would be considered acceptable, and the desired level of statistical significance to be achieved (see Attachment 1 for further details). In Equation 1,

n = number of arrays needed,
m = technical replicates per sample,
δ = effect size on the base 2 log scale (e.g., 1 = 2-fold),
α = significance level (e.g., 0.001),
1 - β = power,
z = normal percentiles (t percentiles preferable),
τ²g = biological variation within class, and
σ²g = technical variation.

The ensuing workshop discussion on Dobbin's presentation focused on the interplay between using technical replicates and using biologic replicates. Dobbin emphasized the importance of biologic replication compared with technical replication for making statistically powerful comparisons between groups, because it captures not only the variability in the technology but also samples the variation of gene expression within a population.

9

When two dyes are used, slight differences in their efficiencies at each step in the process—labeling, hybridization, and detection—can cause systematic biases in the measurements that must be estimated from the data and then removed so that effective comparisons can be made.
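
The exact form of Equation 1 is given in Attachment 1. As a rough illustration of how the quantities listed above combine, the sketch below implements the standard normal-approximation sample-size relation for a two-class comparison, with per-sample variance τ²g + σ²g/m. The specific formula, function name, and example numbers here are assumptions for demonstration, not a reproduction of Dobbin's equation.

```python
# Illustrative sample-size calculation assembled from the quantities listed
# for Equation 1. The form used here, k = 2 * (z_{alpha/2} + z_beta)^2 *
# (tau2_g + sigma2_g/m) / delta^2 biological samples per class, is a
# standard normal-approximation relation and is an assumption, not a quote.
import math
from scipy.stats import norm

def samples_per_class(delta, tau2_g, sigma2_g, m=1, alpha=0.001, power=0.95):
    """Approximate number of biological samples needed per class.

    delta    : smallest effect size of interest on the base 2 log scale
               (1.0 corresponds to a 2-fold change)
    tau2_g   : biological variation within a class
    sigma2_g : technical variation
    m        : technical replicates per biological sample
    alpha    : two-sided significance level
    power    : desired power (1 - beta)
    """
    z_alpha = norm.ppf(1.0 - alpha / 2.0)  # legend notes t percentiles are preferable
    z_beta = norm.ppf(power)
    per_sample_var = tau2_g + sigma2_g / m
    k = 2.0 * (z_alpha + z_beta) ** 2 * per_sample_var / delta ** 2
    return math.ceil(k)

# Example: detect a 2-fold change at alpha = 0.001 with 95% power.
k = samples_per_class(delta=1.0, tau2_g=0.25, sigma2_g=0.09, m=2)
print(f"{k} biological samples per class ({k * 2} arrays per class with m = 2)")
```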

Multiple-Laboratory Comparison of Microarray Platforms

Rafael Irizarry, of Johns Hopkins University, described published studies that examined issues related to reproducibility of microarray analyses and focused on between-laboratory and between-platform comparisons. The presentation examined factors driving the variability of measurements made using different microarray platforms (or other mRNA measurement technologies), including the "lab effect,"10 practitioner experience, and use of different statistical-assessment and data-processing techniques to determine gene-expression levels.

Irizarry's presentation focused on understanding the magnitude of the lab effect, and he described a study where a number of laboratories analyzed the same RNA samples to assess the variability in results (Irizarry et al. 2005). Overall, the results suggest that labs using the Affymetrix microarray systems have better accuracy than the two-color platforms, although the most accurate signal measure was attained by a lab using a two-color platform. In this analysis, a small group of genes had relatively large fold differences between platforms. These differences may relate to the lack of accurate transcript information on these genes. As a result, the probes used in different platforms may not be measuring the same transcript. Moreover, disparate results may be due to probes on different platforms querying different regions of the same gene that are subject to alternative splicing or that exhibit divergent transcript stabilities.

Beyond describing the results of the analysis, Irizarry provided suggestions for conducting experiments and analyses to compare various microarray platforms. The suggestions included use of relative, as opposed to absolute, measures of expression; statistical determinations of precision and accuracy; and specific plots to determine whether genes are differentially expressed between samples. These techniques are described in Attachment 2.

Irizarry also commented that reverse transcriptase PCR (RTPCR) should not be considered the gold standard for measuring gene expression and that the variability in RTPCR data is very similar to that in microarray data if enough data points are analyzed. In this regard, the large quantity of data produced by microarrays is useful in describing the variability in the technology's response. However, this attribute is sometimes portrayed as a negative because the data can appear variable.

10

The lab effect relates to differences in results from different laboratories that may relate to, for example, analyst techniques, lab equipment, or differences in reagents.
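
One of the suggestions above is to compare platforms on relative rather than absolute measures of expression. The sketch below is a hypothetical illustration, not the analysis from Irizarry et al. (2005): it correlates per-gene log-ratios from two simulated platforms and uses an MA-style difference-versus-average summary to flag genes on which the platforms disagree.

```python
# Sketch of a cross-platform comparison using relative measures of
# expression: correlate per-gene log-ratios (treated vs. control) rather
# than absolute intensities. The data are simulated stand-ins.
import numpy as np

rng = np.random.default_rng(1)
n_genes = 2000
true_log_ratio = rng.normal(0.0, 0.5, n_genes)

# Each platform reports the shared biological signal plus its own
# platform-specific measurement noise.
platform_a = true_log_ratio + rng.normal(0.0, 0.2, n_genes)
platform_b = true_log_ratio + rng.normal(0.0, 0.3, n_genes)

# Concordance of the relative measures across platforms.
r = np.corrcoef(platform_a, platform_b)[0, 1]
print(f"correlation of log-ratios between platforms: {r:.2f}")

# An MA-style view: difference between platforms (M) against the mean (A);
# large |M| flags genes on which the platforms disagree.
M = platform_a - platform_b
A = 0.5 * (platform_a + platform_b)
discordant = np.where(np.abs(M) > 1.0)[0]  # more than 2-fold disagreement
print(f"{discordant.size} genes with > 2-fold disagreement")
```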

Conversely, RTPCR produces comparatively few measurements, and one is not able to readily assess the variability.

Irizarry also commented that obtaining a relatively low correspondence between lists of genes generated by different platforms is to be expected when comparing just a few genes from the thousands of genes analyzed. On this point, it was questioned how and whether researchers can migrate from the common practice of assessing thousands of genes and selecting only a few as biomarkers to the practice of converging on a smaller number of genes that reliably predict the outcome of interest. Also, would a high-volume, high-precision platform be a preferred alternative? Further questions addressed measurement error in microarray analyses and whether, because of the magnitude of this error, it was possible to detect small or subtle changes in mRNA expression. In response, Irizarry emphasized the importance of using multiple biologic replicates so that consistent patterns of change could be discerned.

Statistical Analysis of Toxicogenomic Microarray Data

The next presentation, by Wherly Hoffman, of Eli Lilly and Company, discussed the statistical analysis of microarray data. This presentation focused on the Affymetrix platform and discussed the microarray technology and the statistical hypotheses and analysis methods for use in data evaluation. Hoffman stated that, like all microarray mRNA expression assays, the Affymetrix technology uses gene probes that hybridize to mRNA (actually to labeled cDNA derived from the mRNA) in biologic samples. This hybridization produces a signal with intensity proportional to the amount of mRNA contained in the sample. There are various algorithms that may be used to determine hybridized mRNA signal intensity relative to background signals.

Hoffman emphasized the importance of defining the scientific questions that any given experiment is intended to address and the importance of including statistical expertise early in the process to determine appropriate statistical hypotheses and analyses. During this presentation, three types of experimental questions were addressed along with the statistical techniques for their analysis (as mentioned by Hoffman, these techniques are also described in Deng et al. 2005).

The first example presented data from an experiment designed to identify differences in gene expression in animals exposed to a compound at several different doses.

Hoffman discussed the statistical techniques used to evaluate differences in expression between exposure levels while considering variation in responses from similarly dosed animals and variation in responses from replicate microarrays. In this analysis (using a one-factor [dose] nested analysis of variance [ANOVA] and t-test), it is essential to accurately define the degrees of freedom. Hoffman pointed out that the degrees of freedom are determined by the number of animal subjects and not the number of chips (when the chips are technical replicates that represent application of the same biologic sample to two or more microarrays). Thus, technical replicates should not be counted when determining the degrees of freedom. If this is not factored into the calculation, the P value is inappropriately biased because exposure differences appear to have greater significance than is warranted.

The second example included data from an experiment designed to evaluate gene expression over a time course. The statistical analysis of this type of experiment must capture the dose effect, the time effect, and the dose-time interaction. Here, a two-factor (dose and time) ANOVA is used.

The third example provided by Hoffman was an experiment to determine which genes are affected by different classes of compounds (alpha, beta, or gamma receptor agonists). This analysis evaluated dose-response trends of microarray signal intensities when known peroxisome proliferator-activated receptor (PPAR) agonists were tested on agonist knockout and wild-type mice to determine those probe sets (genes) that responded in a dose-response manner. Here, a linear regression model is used for examining the dose-response trends at each probe set. This model considers the type of mice (wild type or mutant), the dose of the compound, and their interaction.

Hoffman also discussed graphical tools to detect patterns, outliers, and errors in experimental data, including box plots, correlation plots, and principal component analysis (PCA). Other visualization tools, such as clustering analysis and volcano plots, which show the general patterns of microarray analysis results, were also presented. These tools are further discussed in Attachment 3.

Finally, multiplicity issues were discussed. Although microarray analyses are able to provide data on the expression of thousands of genes in one experiment, there is the potential to introduce a high rate of false positives. Hoffman explained various approaches used to control the rate of false positives, including the Bonferroni approach, but commented that recent progress has been made in addressing the multiple-testing problem, including work by Benjamini and Hochberg (1995). (These approaches, as well as their relative advantages and disadvantages, are further discussed in Attachment 3.)
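
Two of the points above lend themselves to a brief sketch: technical replicates are collapsed to one value per animal so that degrees of freedom reflect animals rather than chips, and the resulting per-gene p-values are adjusted with the Benjamini-Hochberg false-discovery-rate procedure. The data shapes, thresholds, and helper functions below are illustrative assumptions, not the specific analyses presented at the workshop.

```python
# (1) Average technical replicates per animal so that degrees of freedom
#     reflect the number of animals, not the number of chips.
# (2) Adjust per-gene p-values with the Benjamini-Hochberg FDR procedure
#     (Benjamini and Hochberg 1995). All data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_genes, n_animals, n_chips_per_animal = 1000, 4, 2

def collapse_technical_replicates(x):
    # x has shape (animals, technical replicates, genes); return one value per animal.
    return x.mean(axis=1)

control = collapse_technical_replicates(
    rng.normal(8.0, 1.0, (n_animals, n_chips_per_animal, n_genes)))
dosed = collapse_technical_replicates(
    rng.normal(8.0, 1.0, (n_animals, n_chips_per_animal, n_genes)))

# Two-sample t-test per gene on the animal-level means.
_, p = stats.ttest_ind(dosed, control, axis=0)

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of genes declared significant at FDR level q."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    ranked = p[order]
    thresholds = q * (np.arange(1, p.size + 1) / p.size)
    passing = np.nonzero(ranked <= thresholds)[0]
    mask = np.zeros(p.size, dtype=bool)
    if passing.size:
        mask[order[: passing[-1] + 1]] = True
    return mask

print(f"{benjamini_hochberg(p).sum()} genes significant at FDR 0.05")
```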

The short discussion following this presentation centered primarily on the visualization tools presented by Hoffman and the type of information that they convey.

Diagnostic Classifier—Gaining Confidence Through Validation

Clinical diagnosis of disease primarily relies on conventional histological and biochemical evaluations. To use toxicogenomic data in clinical diagnostics, reliable classification methods11 are needed to evaluate the data and provide accurate clinical diagnoses, treatment selections, and prognoses. Weida Tong, of the Food and Drug Administration (FDA), spoke about classification methods used with toxicogenomic approaches in clinical applications. These classification methods (learning methods) are driven by mathematical algorithms and models that "learn" features in a training set (known members of a class) to develop diagnostic classifiers and then classify unknown samples based on those features. Tong's presentation focused on the issues and challenges associated with sample classification methods using supervised12 learning methods.

The development of a diagnostic classifier can be divided into three steps: training, where gene expression or other toxicogenomic profiles are correlated with clinical outcomes to develop a classifier; validation, where profiles are validated using cross-validation13 or external valida-

11

Classification methods are algorithms used to assign test cases to one of a number of designated classes (StatSoft, Inc. 2006). Most classification schemes referred to in this workshop report refer to classifying a chemical compound based on mode of toxicologic action. Another common scheme is the classification of a biologic sample (for example, classifying a tumor into subtypes based on invasiveness potential).

12

The term supervised learning is usually applied to cases in which a particular classification is already observed and recorded in a training data set, and one wants to build a model to predict the class of a new test sample. For example, one may have a data set from compounds with a known mode of toxicologic action. The purpose of the classification analysis would be to build a model to predict which compounds (from tests of unknown compounds) would be in the same class as the test data set.

13

Cross-validation is a model evaluation method that indicates how well the learning method will perform when asked to make new predictions for data not already seen. The basic premise is not to use the entire data set when training a
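
Footnote 13 introduces cross-validation, the idea that a classifier should never be scored on the samples used to train it. The from-scratch sketch below illustrates k-fold cross-validation with a simple nearest-centroid classifier standing in for the supervised learning methods discussed in the presentation; the simulated data, class labels, and fold count are assumptions made for the example.

```python
# From-scratch sketch of k-fold cross-validation for a supervised
# classifier: the model is always evaluated on samples held out of
# training. The nearest-centroid classifier and simulated data are
# illustrative stand-ins, not methods from the presentation.
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_genes = 40, 200
X = rng.normal(0.0, 1.0, (n_samples, n_genes))
y = np.repeat([0, 1], n_samples // 2)  # known class labels (training set)
X[y == 1, :10] += 1.5                  # a few class-separating genes

def nearest_centroid_fit(X_tr, y_tr):
    return {c: X_tr[y_tr == c].mean(axis=0) for c in np.unique(y_tr)}

def nearest_centroid_predict(model, X_te):
    classes = np.array(sorted(model))
    dists = np.stack([np.linalg.norm(X_te - model[c], axis=1) for c in classes])
    return classes[np.argmin(dists, axis=0)]

def k_fold_accuracy(X, y, k=5):
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    accuracies = []
    for held_out in folds:
        train = np.setdiff1d(idx, held_out)          # never train on held-out samples
        model = nearest_centroid_fit(X[train], y[train])
        predictions = nearest_centroid_predict(model, X[held_out])
        accuracies.append(np.mean(predictions == y[held_out]))
    return float(np.mean(accuracies))

print(f"cross-validated accuracy: {k_fold_accuracy(X, y):.2f}")
```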

point between adaptation and frank toxicity. Patton commented that the current understanding of modes of action can be sufficient to manipulate the biologic systems and to use other tools (such as proteomics or pathology) to demonstrate the validity of mechanistic or mode-of-action biomarkers. Thus, biologic validation will not be achieved solely with microarray technologies. To achieve biologic validation, a framework that describes how to assemble information to execute external validation of the pathway is needed.

Impact of Individual, Species, and Environmental Variability on Microarray Results

Georgia Dunston, of Howard University, emphasized the importance of genetic variation when considering toxicogenomic results and of not extrapolating beyond the reference (for example, species and conditions) of the original experiments and platforms. The contribution of genetic variation between model systems and between individuals in response to various stressors cannot be avoided.

Leigh Anderson, of the Plasma Proteome Institute, commented on the lack of microarray studies that characterize individual variation. For example, studies on the variation between inbred rat strains would be useful because understanding the interindividual variation will be critical in choosing genes as stable biomarkers. However, although that research is relatively simple, it is not being done, which raises the question of how cross-species extrapolation can be discussed when the variation within a species has not been determined; in his view, it is still early days for these technologies.

Cheryl Walker, of the M.D. Anderson Cancer Center, commented on the impact of environmental variability on the stability of the genomic biomarkers. For instance, what happens to these biomarkers and signatures as you start to get away from a controlled light/dark cycle, diet, and nutritional status? Anderson replied that, at least in the field of proteomics, there is a known series of situations to be avoided (for example, animals undergoing sexual maturation and the use of proteins controlled by cage dominance). He commented that these types of variables and effects should be catalogued for microarray assays as well.

Hisham Hamadeh indicated that there was an ongoing effort at the International Life Sciences Institute (ILSI) to obtain data from several companies on control animals.

At the member companies of ILSI, there are different strains, different feeding regimens, and different methodologies. This effort has obtained data from about 450 chips tested on liver samples. Although the analysis is not yet complete, the goal is to identify genes with very low variance, and the results could possibly be extrapolated to other tissues or even other species.

Bill Mattes, of Gene Logic, responded that in his experience, expression changes respond to these environmental variables in predictable ways. In fact, the wealth of information provided in microarray assays can permit researchers to find errors in study implementation, and he mentioned an experience where perturbations in expression profiles indicated improper animal feeding or watering.

Sarah Gerould, of the U.S. Geological Survey (USGS), commented that Dunston's initial comments on cross-species differences were particularly important in ecotoxicology, which focuses on different kinds of fish, birds, and insects. Beyond the variety of organisms and their range of habitats, the organisms are exposed to multiple contaminants, increasing the potential difficulties in ecotoxicologic applications of these technologies.

Evaluating Low-Dose Effects of Chemicals

Jim Bus, of Dow Chemical, commented that most of the presentations during the workshop focused on screening pharmaceutical compounds in an overall attempt to avoid potential adverse outcomes as they enter the therapeutic environment. However, for the chemical manufacturing industry, the questions are of a different nature (although screening is a component) because, unlike pharmaceutical exposures, humans are not intentionally dosed, and the exposures are substantially lower. Therefore, toxicogenomic assays present an opportunity for biologic validation of effects, particularly at the low end of the dose-response curve where conventional toxicologic animal tests are insufficient. Currently, low-dose effects are addressed by, for example, 10-fold uncertainty factors or a linearized no-threshold model, but these techniques are primarily policy driven rather than biologically driven. Bus referred to "real world" environmental exposures that are thousands-fold lower than those assessed using conventional toxicologic models. In particular, he was interested in determining how toxicogenomics can assist in bridging the uncertainty associated with default uncertainty factors and models. These issues emphasize the need for biologic validation of these technologies and the potential for their application to the regulatory arena.

Federico Goodsaid, of FDA, brought up a related issue, commenting that conventional animal-based toxicology experiments use a limited number of samples and replicates, limiting the ability to see low-dose effects. Further, one of the most daunting tasks in using animal models to try to predict human safety issues is extrapolating from a limited sampling to what would happen in humans. In this effort, toxicogenomics perhaps represents a tool to go beyond what is currently available, to increase the power of the animal models, and to look at very low doses over a long period of time.

Summary Statements and Discussion

Kenneth S. Ramos moderated a summary discussion, which he initiated by asking whether participants were comfortable with technical validation of the microarray technologies and whether it is appropriate for the field to progress to focusing on biologic validation. Indeed, similar to the roundtable session, the ensuing discussion focused on issues surrounding biologic validation, and some participants brought up themes mentioned earlier, such as the need to define which mRNA expression changes do and do not constitute a negative effect and the concern that genetic diversity will confound extrapolation between species and among humans. Several participants also commented on the current state of validation efforts. The themes that emerged in this discussion are presented here.

Validation Issues with Microarray Assays Are Not Novel

Several participants suggested that many of the validation issues brought up throughout the day were not isolated to toxicogenomic assays. Linda Greer, of the Natural Resources Defense Council, noted that in conventional animal bioassays, we often do not understand the biology underlying why, for example, animals may or may not get tumors; we do not understand the individual variation within an inbred animal strain nor how to make comparisons between species. Greer stated that she was actually relieved to hear the lack of dispute regarding technical validation of microarray technologies, because a common perception among non-specialists is that the technology does not produce consistent results.

However, she noted that technical questions have been narrowed and addressed, as demonstrated by, for example, the aforementioned series of papers in Nature Methods.23 Also, the questions regarding biologic validation apply to a lot of toxicology issues. In this regard, toxicogenomic assays are in the same state as many other conventional methodologies.

Rafael Irizarry responded to concerns that microarray assays rely heavily on "black box" mathematical preprocessing, where complex electrical and optical data are converted to a gene-expression level. He noted that other technologies also rely on mathematical algorithms or other preprocessing of raw data, for instance, RTPCR and functional magnetic resonance imaging (fMRI). However, a notable difference is that these other technologies do not produce a wealth of data like the microarray technologies, possibly explaining why the issue is rarely considered.

Bill Mattes commented on the level of consistency of microarray results generated among different labs. He stated that microarrays are like other technically demanding technologies, where not everyone is able to produce reliable data. For instance, inexperienced practitioners of histopathology via microscopy, which is considered a "gold standard" for detecting pathologic responses, will not produce reliable results. In this regard, Mattes suggested discussing the development of standards to qualify good practice. Irizarry also suggested that the scientific community use a type of internal validation of practitioners where, for example, laboratories would periodically hybridize a universal standardized reference and submit results to compare against other researchers.

Validation in What Context: Technical, Biologic, or Regulatory

Leonard Schectmann, of FDA, asked which participants had used microarray technologies and whether they were confident that the technologies had been sufficiently validated and were ready for widespread use (or "prime time" as stated by a few participants). Ramos suggested that, in this context, perhaps prime time was not the best term; rather, that these technologies had undergone sufficient validation and that it was understood they could be used to generate reliable and reproducible results. Other participants also asserted that it was necessary to understand the context in which the term validation was being used.

23

May 2005, Volume 2, No. 5.

Yvonne Dragan, of FDA, said that it has been shown with microarray technologies that technical reproducibility can be achieved in the laboratory and that reproducibility across different laboratories depends on each laboratory's proficiency at the technique, with replicability being possible if the labs are proficient at the analyses. Whether reproducibility is possible across platforms depends on whether the platforms are assessing the same thing. For example, if probe sets on the microarray chip are from different locations in the same gene, then different questions are being asked. Therefore, overall, the answers depend on the level at which validation is desired. Asking whether microarrays are capable of accurately measuring mRNA levels is one question. Biologic questions are different, however, and one has to ask whether this method is the right one to address the questions being asked.

Kerry Dearfield, of USDA, commented that the question of whether a technology was ready for prime time really meant whether it was ready to be accepted by the regulatory agencies. In this regard, Dearfield commented that the microarray technologies were not quite there yet. Technical reproducibility, while important, does not specifically address the types of questions being asked in the regulatory field, and accurately answering the biologic questions is essential. It will be necessary to directly tie expression changes to some type of adverse end point and thus be able to address questions of regulatory interest (for example, safety or efficacy). Another application for risk assessment will be using toxicogenomics to examine whether effects can be seen at earlier times after dosing or at lower doses. Tough questions will remain. For example, if expression changes that can be associated with a pathologic effect are seen at low doses where that pathology has not been observed, how will that information be considered in the regulatory arena? Would regulations change based on expression changes?

To progress, it is necessary to ensure that the technologies are technically solid and generating reproducible, believable information. Then, that information has to be linked to biologic effects that people are concerned about. This type of technical and biologic validation needs to be tied together prior to use in the regulatory arena to address public health concerns. To get to this point, the technology will need to go through some form of internationally recognized process where, for example, performance measures for the technologies are specified, so the agencies can use the generated information.

Carol Henry, of the American Chemistry Council, commented on the potential for a group of independent researchers to recommend principles and practices necessary for the technical validation of these technologies, so the field can advance to the point where the technologies can be used in public health and environmental regulatory settings.

Developing these practices could aid in getting the technologies into a formal process where agencies might actually be able to accept it; right now, toxicogenomic submissions are being considered on a case-by-case basis.

Richard Canady, of the Office of Science and Technology,24 commented that a reason to go through a validation process or certification of practitioners is to improve practice overall and reduce the odds that questionable data sets would become an accepted part of the weight of evidence for regulatory decisions (particularly for data that support a decision almost entirely). He also commented that it is important to recognize the value of data to support arguments about, for instance, dose-response extrapolation below the observable range, where it may not be possible to obtain biologic validation. In these applications, data would be used in weight-of-evidence arguments to help researchers understand the biology. Although it is good to think about validation of assays and maybe even certification of practitioners, it is bad to close the door and categorically exclude information.

Goodsaid stated that efforts were under way at FDA to develop an efficient and standard process to receive genomic information and to minimize the confusion regarding potential regulatory applications of the technologies.

Wrap-Up Discussion

To finish the workshop, John Quackenbush assembled several summary statements of themes he heard emerge from the workshop discussions and projected these for the audience (see Box 6). The statements encapsulated the technical and biologic validation considerations addressed in the speakers' presentations and the discussion that followed. Discussion on the summary statements was brief, and the workshop was adjourned.

24

Dr. Canady is currently employed by the Food and Drug Administration.

BOX 6
Summary Statements from John Quackenbush, Harvard University

Technical reproducibility has been established within specific platforms.

Users of the technology now believe the technology is "valid" in the sense that we can extract "biology" (and possibly extrapolate to mechanistic understanding) from the data, provided that well-controlled and well-designed assays are completed.

Validation of the predictive algorithms and methods requires sufficiently large data sets.

Validation in a broader context will require a focus on specific applications, and there is a need to look at dose response, time-course data, and epidemiological effects. As we move to regulatory use, the best strategy is to carry out validation in an application-specific fashion.

Many of the classical questions derived from toxicology—cross-species extrapolation, population and environmental effects, dose effects, and so forth—apply to microarrays as well. "Omics" may allow us to shed light on these questions in a more quantitative fashion than classical toxicology.

The only way to address validation questions is to begin to collect data and to analyze it in the proper context.

REFERENCES

Alon, U., N. Barkai, D.A. Notterman, K. Gish, S. Ybarra, D. Mack, and A.J. Levine. 1999. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA 96(12):6745-6750.

Benjamini, Y., and Y. Hochberg. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Stat. Soc. B Met. 57(1):289-300.

Boser, B.E., I.M. Guyon, and V.N. Vapnik. 1992. A training algorithm for optimal margin classifiers. Pp. 144-152 in Proceedings of the Fifth Annual International Conference on Computational Learning Theory, 27-29 July 1992, Pittsburgh, PA. Pittsburgh, PA: ACM Press.

Brown, M.P., W.N. Grundy, D. Lin, N. Cristianini, C.W. Sugnet, T.S. Furey, M. Ares, Jr., and D. Haussler. 2000. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl. Acad. Sci. USA 97(1):262-267.

Deng, S., T.-M. Chu, Y.K. Truong, and R. Wolfinger. 2005. Statistical methods for gene expression analysis. Computational Methods for High-throughput Genetic Analysis: Expression Profiling. Chapter 5 in Bioinformatics, Vol. 7, Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. Wiley [online]. Available: http://www.wiley.com/legacy/wileychi/ggpb/aims-scope.html [accessed April 6, 2006].

Eisen, M.B., P.T. Spellman, P.O. Brown, and D. Botstein. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95(25):14863-14868.

Everitt, B. 1974. Cluster Analysis. London: Heinemann.

FDA (Food and Drug Administration). 2005. Guidance for Industry: Pharmacogenomic Data Submissions. U.S. Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research, Center for Biologics Evaluation and Research, Center for Devices and Radiological Health. March 2005 [online]. Available: http://www.fda.gov/cder/guidance/6400fnl.pdf [accessed April 10, 2006].

Furey, T.S., N. Cristianini, N. Duffy, D.W. Bednarski, M. Schummer, and D. Haussler. 2000. Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics 16(10):906-914.

Guyon, I., J. Weston, S. Barnhill, and V.N. Vapnik. 2002. Gene selection for cancer classification using support vector machines. Mach. Learn. 46(1-3):389-422.

Hamadeh, H.K., P.R. Bushel, S. Jayadev, K. Martin, O. DiSorbo, S. Sieber, L. Bennett, R. Tennant, R. Stoll, J.C. Barrett, K. Blanchard, R.S. Paules, and C.A. Afshari. 2002a. Gene expression analysis reveals chemical-specific profiles. Toxicol. Sci. 67(2):219-231.

Hamadeh, H.K., P.R. Bushel, S. Jayadev, O. DiSorbo, L. Bennett, L. Li, R. Tennant, R. Stoll, J.C. Barrett, R.S. Paules, K. Blanchard, and C.A. Afshari. 2002b. Prediction of compound signature using high density gene expression profiling. Toxicol. Sci. 67(2):232-240.

Hastie, T., R. Tibshirani, M.B. Eisen, A. Alizadeh, R. Levy, L. Staudt, W.C. Chan, D. Botstein, and P. Brown. 2000. "Gene shaving" as a method for identifying distinct sets of genes with similar expression patterns. Genome Biol. 1(2):RESEARCH0003.

Hossain, M.A., C.M.L. Bouton, J. Pevsner, and J. Laterra. 2000. Induction of vascular endothelial growth factor in human astrocytes by lead. Involvement of a protein kinase C/activator protein 1 complex-dependent and hypoxia-inducible factor 1-independent signaling pathway. J. Biol. Chem. 275(36):27874-27882.

Huang, Q., R.T. Dunn, II, S. Jayadev, O. DiSorbo, F.D. Pack, S.B. Farr, R.E. Stoll, and K.T. Blanchard. 2001. Assessment of cisplatin-induced nephrotoxicity by microarray technology. Toxicol. Sci. 63(2):196-207.

ICCVAM (Interagency Coordinating Committee on the Validation of Alternative Methods). 2003. Guidelines for Nomination and Submission of New, Revised, and Alternative Test Methods. NIH Publication No. 03-4508. U.S. Department of Health and Human Services, U.S. Public Health Service, National Institutes of Health, National Institute of Environmental Health Sciences, Research Triangle Park, NC. September 2003 [online]. Available: http://iccvam.niehs.nih.gov/docs/guidelines/subguide.htm [accessed April 10, 2006].

Iida, M., C.H. Anna, W.M. Holliday, J.B. Collins, M.L. Cunningham, R.C. Sills, and T.R. Devereux. 2005. Unique patterns of gene expression changes in liver after treatment of mice for 2 weeks with different known carcinogens and non-carcinogens. Carcinogenesis 26(3):689-699.

Irizarry, R.A., D. Warren, F. Spencer, I.F. Kim, S. Biswal, B.C. Frank, E. Gabrielson, J.G. Garcia, J. Geoghegan, G. Germino, C. Griffin, S.C. Hilmer, E. Hoffman, A.E. Jedlicka, E. Kawasaki, F. Martinez-Murillo, L. Morsberger, H. Lee, D. Petersen, J. Quackenbush, A. Scott, M. Wilson, Y. Yang, S.Q. Ye, and W. Yu. 2005. Multiple-laboratory comparison of microarray platforms. Nat. Methods 2(5):329-330.

Jolliffe, I.T. 1986. Principal Component Analysis. New York: Springer.

Kohonen, T. 1995. Self-Organizing Maps. Berlin: Springer.

Kramer, J.A., S.W. Curtiss, K.L. Kolaja, C.L. Alden, E.A. Blomme, W.C. Curtiss, J.C. Davila, C.J. Jackson, and R.T. Bunch. 2004. Acute molecular markers of rodent hepatic carcinogenesis identified by transcription profiling. Chem. Res. Toxicol. 17(4):463-470.

Lu, T., J. Liu, E.L. LeCluyse, Y.S. Zhou, M.L. Cheng, and M.P. Waalkes. 2001. Application of cDNA microarray to the study of arsenic-induced liver diseases in the population of Guizhou, China. Toxicol. Sci. 59(1):185-192.

Michel, C., R.A. Roberts, C. Desdouets, K.R. Isaacs, and E. Boitier. 2005. Characterization of an acute molecular marker of nongenotoxic rodent hepatocarcinogenesis by gene expression profiling in a long term clofibric acid study. Chem. Res. Toxicol. 18(4):611-618.

Nature Methods. 2005. May 2005, Volume 2, No. 5.

Schneider, J., and A.W. Moore. 1997. Cross validation. In A Locally Weighted Learning Tutorial Using Vizier 1.0. February 1, 1997 [online]. Available: http://www.cs.cmu.edu/~schneide/tut5/node42.html [accessed April 6, 2006].

Schölkopf, B., and A. Smola. 2002. Learning with Kernels. Cambridge, MA: MIT Press.

StatSoft, Inc. 2006. Electronic Statistics Textbook, Statistics Glossary. StatSoft, Inc., Tulsa, OK [online]. Available: http://www.statsoft.com/textbook/glosfra.html [accessed April 6, 2006].

Steiner, G., L. Suter, F. Boess, R. Gasser, M.C. de Vera, S. Albertini, and S. Ruepp. 2004. Discriminating different classes of toxicants by transcript profiling. Environ. Health Perspect. 112(12):1236-1248.

Thomas, R.S., D.R. Rank, S.G. Penn, G.M. Zastrow, K.R. Hayes, K. Pande, E. Glover, T. Silander, M.W. Craven, J.K. Reddy, S.B. Jovanovich, and C.A. Bradfield. 2001. Identification of toxicologically predictive gene sets using cDNA microarrays. Mol. Pharmacol. 60(6):1189-1194.

Tong, W. 2006. Decision Forest Classification Method [online]. Available: http://www.toxicoinformatics.com/DecisionForest.htm [accessed Aug. 10, 2006].

Tong, W., Q. Xie, H. Hong, L. Shi, H. Fang, R. Perkins, and E.F. Petricoin. 2004. Using decision forest to classify prostate cancer samples on the basis of SELDI-TOF MS data: Assessing chance correlation and prediction confidence. Environ. Health Perspect. 112(16):1622-1627.

Vapnik, V.N. 1998. Statistical Learning Theory. New York: Wiley.

Yeang, C.H., S. Ramaswamy, P. Tamayo, S. Mukherjee, R.M. Rifkin, M. Angelo, M. Reich, E. Lander, J. Mesirov, and T. Golub. 2001. Molecular classification of multiple tumor types. Bioinformatics 17(Suppl. 1):S316-S322.
