2

Evaluation of Current Animal Models

Session chair Stevin Zorn, executive vice president of neuroscience research at Lundbeck Research USA, opened the session with the following questions: What leads to poor translation of animal models of nervous system disorders to clinical practice? Is it the models themselves, researcher expectations of models, how models are used to make predictions and decisions, or perhaps the level of knowledge about underlying pathophysiology for any given disease? Finally, Zorn asked, what is the impact of generalizing animal model capabilities, of poor study design, or of publication bias?

To set the stage for discussion, Steven Paul provided background on the challenges of drug discovery for nervous system disorders and why animal models are useful. Mark Tricklebank followed with a discussion about validation of animal models for drug discovery and how translation of preclinical research can be enhanced through skillful study design, planning, and proper statistical analysis; these points were echoed by other speakers and many participants throughout the workshop. Next, Katrina Kelner described three forms of publication bias that can impact the success of animal models: (1) the tendency to publish positive findings; (2) the publication of poorly designed and executed animal studies that could contribute to incorrect conclusions; and (3) assumptions about what constitutes “good science.”




EXPECTATIONS FOR ANIMAL MODELS IN DRUG DEVELOPMENT

The past decade has been challenging for the biopharmaceutical industry, particularly with respect to the development and regulatory approval of innovative new medicines, said Steven Paul, director of the Helen and Robert Appel Alzheimer’s Disease Research Institute at Weill Cornell Medical College and former president of Lilly Research Laboratories. It has been an especially difficult time for neuroscience research and development. There have been therapeutic successes in select areas, for example, multiple sclerosis and epilepsy. In other areas, however, such as schizophrenia and depression, new drugs are not significantly better than those developed 50 years ago, and for the most part there are still no disease-modifying drugs for diseases like Alzheimer’s and Parkinson’s. Many pharmaceutical companies are restructuring and/or deemphasizing research on nervous system diseases and disorders, Paul noted, and some have left the field entirely.

A major problem in bringing a new therapy to market is attrition, Paul said. When a new drug target is discovered and validated, lead candidate molecules are identified and optimized. Preclinical testing evaluates the pharmacology and toxicology of the lead compound in animal models, and Phase I clinical studies establish safety and dosing in humans. Attrition occurs largely in Phases II and III, during the testing of whether the drug “works” as a treatment for the particular disease. A 2010 study by Paul and colleagues found that roughly 66 percent of compounds that entered Phase II development did not advance to Phase III, and about 30 percent of those that did enter Phase III failed to move on to regulatory submission (Paul et al., 2010). A more recent analysis suggests that, from 2008 through 2010, 82 percent of compounds failed to advance from Phase II clinical trials, with more than half failing because of efficacy issues (Figure 2-1). Equally concerning, Paul said, data from 2007 through 2010 show that 50 percent of compounds failed in Phase III, 66 percent of them because of efficacy concerns (Arrowsmith, 2011b). By Phase III, Paul suggested, there should be confidence in efficacy, and compounds should fail only infrequently, for example, because of rare adverse events. The approval rate for new drug applications (NDAs) by the U.S. Food and Drug Administration (FDA) is currently about 70 percent; in other words, 30 percent of new drugs that make it through Phase III to regulatory filing will not gain regulatory approval.
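Taken together, these headline rates imply a steep cumulative attrition from Phase II entry to approval. The short calculation below is an illustrative sketch only; it combines figures quoted above that come from different time windows and datasets (roughly an 18 percent Phase II success rate, a 50 percent Phase III success rate, and an approximately 70 percent NDA approval rate), so the result should be read as an order-of-magnitude estimate rather than a figure reported at the workshop.

```python
# Illustrative sketch: cumulative probability that a compound entering
# Phase II eventually gains approval, using the headline rates quoted above.
# The rates come from different datasets and time windows, so the product
# is only an order-of-magnitude estimate, not a workshop figure.

phase2_success = 1 - 0.82   # ~82% of compounds failed to advance from Phase II (2008-2010)
phase3_success = 1 - 0.50   # ~50% of compounds failed in Phase III (2007-2010)
nda_approval = 0.70         # ~70% of NDA filings gain approval

cumulative = phase2_success * phase3_success * nda_approval
print(f"Estimated probability of approval from Phase II entry: {cumulative:.1%}")
# -> roughly 6%, illustrating why late-stage attrition dominates development cost
```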

FIGURE 2-1 Failure of drugs in Phase II according to reason for failure (a) and therapeutic area (b). SOURCE: Paul and Tricklebank presentations, March 28, 2012, from Arrowsmith, 2011a.

Animal models have long had an important role in the drug development process. As outlined by Paul, current expectations are that animal models can help researchers to

• Better understand the fundamental pathology and pathogenesis of a disease, for example, how genes and mutations result in observed disease phenotypes. This knowledge of underlying disease biology can aid in selecting and validating drug targets and defining pathways of intervention;
• Test drugs and treatments that could be effective for a specific disease or a particular symptom of a complex disease;
• Ascertain the safety of a drug or treatment, for example, toxicity or adverse events; and
• Establish the therapeutic index (the dose that produces the desired effect relative to the dose that is toxic or lethal) for a given drug or treatment prior to testing in humans; a brief worked example appears at the end of this section.

Paul concluded that although animal models cannot solve all of the challenges of drug development, they are an important part of the solution. In particular, improvements in the ability of animal models to predict efficacy can help to reduce attrition.
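As a point of reference for the last expectation in Paul’s list, the therapeutic index is conventionally expressed as the ratio of a toxic dose to an effective dose, often TD50/ED50 (or LD50/ED50 when lethality is the endpoint). The sketch below uses invented numbers purely to make the ratio concrete; it is not drawn from the workshop.

```python
# Hypothetical illustration of a therapeutic index calculation.
# Convention assumed here: TI = TD50 / ED50 (dose toxic in 50% of subjects
# divided by dose effective in 50% of subjects). All values are invented.

ed50_mg_per_kg = 10.0   # dose producing the desired effect in half of the animals
td50_mg_per_kg = 150.0  # dose producing toxicity in half of the animals

therapeutic_index = td50_mg_per_kg / ed50_mg_per_kg
print(f"Therapeutic index: {therapeutic_index:.1f}")
# A larger index implies a wider safety margin between effective and toxic doses.
```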

CHOICE AND VALIDATION OF ANIMAL MODELS FOR CENTRAL NERVOUS SYSTEM DRUG DISCOVERY

Mark Tricklebank, director of the Lilly Center for Cognitive Neuroscience at Eli Lilly and Company in the United Kingdom, said there is an unprecedented number of potential molecular targets in the nervous system, but little direct, clinically based evidence to support a rational method for choosing one target over another. Selection of targets at the start of a drug discovery program would benefit from a solid biological hypothesis, which might be developed through manipulation of the target in vitro and a strong hypothesis of the expected in vivo results (Sarter and Tricklebank, 2012). Once a therapeutic target is selected, advances in in vitro screening technologies have made identification of potential drug molecules relatively easy. Tricklebank concurred with Paul that the system breaks down at the target validation stage, with too many false positives or false negatives.

Working Backward from Failure in the Clinic

Investigational new therapeutics are dismissed on the basis of the clinical result, most commonly for lack of efficacy. However, Tricklebank cited a recent analysis suggesting that in many cases where lack of efficacy is the conclusion, the hypothesis was in fact not tested adequately and the results were not definitive (Morgan et al., 2012). The way in which animal models are used may also contribute to failure in clinical trials; Tricklebank offered several possible reasons:

• Preclinical evidence supporting the hypothesis is given unrealistic weight in comparison to evidence against the hypothesis; there is pressure on researchers to focus on positive results and to ignore information that might need to be addressed and understood before deciding to move forward.
• Insufficient attention is paid to the design and analysis of experiments, so that false positives or false negatives incorrectly influence decisions.

• Preclinical data collected to support the hypothesis are irrelevant to the mechanisms underlying the disease of interest. For example, not much is known about the pathophysiology of psychiatric diseases, and it is therefore difficult to make rational drug target choices.
• The preclinical data accurately convey the drug responses in a very specific, genetically circumscribed population of animals maintained in highly controlled environments, but the findings do not necessarily hold true in the heterogeneous human clinical population.
• The measurements taken have, at best, only face validity for the variables the experimenter would like to measure, and they need to be quantified in relation to the disease of interest. There is a need to understand in greater detail what researchers want to investigate, Tricklebank said.
• The measured effects are confounded by competing responses.
• There is a species gap, so the systems being manipulated in experimental animals might never fully predict outcomes in humans.

Expanding on the issue of experimental design and analysis, Tricklebank cited a study of 513 publications in top neuroscience journals that examined the appropriateness of their statistical analyses (Nieuwenhuis et al., 2011). The study showed that 79 of these publications used an incorrect statistical approach. In a number of cases the signal was so large that the error was irrelevant, but in two-thirds of the cases the incorrect statistical analysis actually influenced the interpretation of the experimental results, Tricklebank explained.
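The summary does not spell out the error itself; the analysis cited (Nieuwenhuis et al., 2011) focused on studies that declared a difference between two effects because one was statistically significant and the other was not, instead of testing the difference directly. The sketch below is a hypothetical illustration of that distinction, not an analysis from the workshop; the group names and simulated data are invented.

```python
# Hypothetical illustration of the statistical pitfall examined by
# Nieuwenhuis et al. (2011): concluding that two effects differ because one
# is significant and the other is not, rather than testing the difference.
# All data are simulated; group labels are invented for illustration.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated treatment effects (e.g., change in a behavioral score) for two
# genotypes; the true underlying effect is the same (mean 1.0) in both.
wild_type = rng.normal(loc=1.0, scale=2.0, size=12)
knockout = rng.normal(loc=1.0, scale=2.0, size=12)

# Flawed reasoning: effect "present" in one group, "absent" in the other.
p_wt = stats.ttest_1samp(wild_type, 0.0).pvalue
p_ko = stats.ttest_1samp(knockout, 0.0).pvalue
print(f"Wild type vs. zero: p = {p_wt:.3f}; knockout vs. zero: p = {p_ko:.3f}")

# Appropriate analysis: test the group difference (the interaction) directly.
p_diff = stats.ttest_ind(wild_type, knockout).pvalue
print(f"Direct test of the group difference: p = {p_diff:.3f}")
# With noisy data, one one-sample test can cross p < 0.05 while the other does
# not, even when the direct comparison gives no evidence of a real difference.
```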

Another issue is validation of the assay and the model. Tricklebank outlined six types of validity to be considered in defining animal models (Box 2-1; see also Markou et al., 2009). These types of validity can be generalized under external validity, the “extent to which the results of an animal experiment provide a correct basis of generalizations to the human condition.” Not included in this list, but discussed throughout the workshop by participants, is the concept of internal validity, the “extent to which the design and conduct of the trial eliminate the possibility of bias” (van der Worp et al., 2010). There is some confusion, Tricklebank noted, about what an “assay” is and what a “model” is. An assay is a means of quantifying a dependent variable. A model is a theoretical description of the way a system, process, or disease works. An animal model induces over- or under-expression of a biological variable, which the assay quantifies.

BOX 2-1
Types of Validity Relative to Neuroscience Animal Models

Construct validity: Ideally mimicking the molecular and/or structural basis of the disease
Convergent validity: Evidenced by high correlation among performance patterns across cognitive tasks designed to measure the same neurocognitive process
Criterion validity: The ability of performance in one task to predict performance on another, more ecologically valid test
Discriminant validity: Evidenced by low correlation among outcomes across tasks designed to measure distinct neurocognitive constructs
Face validity: Degree of similarity to disease-specific symptoms
Predictive validity: Based on currently available therapy

SOURCE: Tricklebank presentation (March 28, 2012).

Improving the Probability of Clinical Success Through Validation

Ensuring that preclinical models have some validity for the system of interest starts with a better definition of the animal behaviors to be measured across the domains of sensory-motor function, arousal, affect, motivation, cognition, and social processes. Assays to measure these aspects of animal behavior should be designed to be as similar as possible to the assays used in the clinical situation, Tricklebank said, and assays measuring these functions in the clinical study should be as close as possible to those used in animals. Focusing on measuring the right thing, and measuring it accurately, is important. This includes maximizing signal-to-noise, removing confounds, determining both intra- and inter-laboratory reproducibility, and testing compounds against the most appropriate baseline perturbation.

Consider also the manipulation: Is it clinically relevant, and does it engage the circuitry thought to be dysfunctional in the clinical indication of interest? From a practical perspective, Tricklebank suggested that increasingly viewing psychiatric disorders as aspects of disturbed brain circuitry will lead to more rational profiling of compounds. Validation of potential compounds delivered via local injection, for example, would require evidence of engagement of the neurocircuitry involved in the disease. Biomarkers can also serve as indicators of the circuitry involved in the measured behaviors. Improving animal models might occur through the use of clinically relevant pharmacological, environmental, neurodevelopmental, and genetic methods to perturb or impair normal function.

In discussion, Sharon Rosenzweig-Lipson of IVS Pharma Consulting added that in addition to establishing the validity of an animal model, it is important to understand how decisions are made based on the model. How much of the failure to translate is attributable to novel animal models versus models that are well defined (e.g., standards of care, known mechanisms)?

Collaborative Partnerships and Cross-Disciplinary Research

The design and validation of preclinical assays and models for drug discovery is ideally served by precompetitive, collaborative approaches via industrial, academic, and clinical consortia, Tricklebank said.

The Lilly Center for Cognitive Neuroscience, he explained, has adopted a very detailed approach to profiling a molecule through assays, disease models, tools, targets, and biomarkers as the basis for Phase I clinical studies. To do this, Lilly is leveraging external capacity, capabilities, and innovation all along the development pathway through a consortium of academic and industry scientists.

On a much broader scale, the European Union (EU), through its Innovative Medicines Initiative (IMI), launched the Novel Methods leading to New Medications in Depression and Schizophrenia (NEWMEDS) project, an international consortium of academic and industry scientists (discussed further by Steckler in Chapter 6). Tricklebank is involved in NEWMEDS Work Package 2 (WP2), which is focused on cognition assays and animal models. A key aspect of WP2 is multisite validation: establishing new assays in one laboratory and then running them in partner laboratories to gauge reproducibility from methodological and pharmacological sensitivity perspectives.

One example, Tricklebank mentioned, involves the development of a touchscreen-based translational assay of animal cognition and validation of its pharmacological sensitivity (discussed further by Bussey in Chapter 4). Multiple laboratories within the consortium are using this touchscreen assay under standardized experimental conditions to compare the rate of task acquisition and the response to drugs administered during task acquisition or posttraining.

Tricklebank noted that convergent approaches combining behavior and physiology are critical for the preclinical validation of drug targets. He concluded that cross-disciplinary approaches are essential to this work and that collaborative, precompetitive approaches to verifying findings will pay off in the long run.

THE IMPACT OF PUBLICATION BIAS

The process of peer review and publication in established journals improves the quality of science and helps to filter out invalid or unimportant results. However, the scientific literature is still subject to publication bias (Easterbrook et al., 1991; Sena et al., 2010; ter Riet et al., 2012). Katrina Kelner, editor of Science Translational Medicine, reviewed three types of publication bias.

Failure to Publish Negative Results

Sociological factors strongly influence what is published in the literature and what is not, Kelner said. A much larger fraction of papers reporting positive findings are published than of those reporting negative findings, and as a result the scientific literature can be distorted. Kelner illustrated this with a hypothetical scenario (see Box 2-2).

This problem has no easy solution. There have been several calls for the pharmaceutical industry to publish its preclinical data and results from failed clinical trials (Clozel, 2011; Rogawski and Federoff, 2011). Several new journals are dedicated to publishing negative results (e.g., Journal of Negative Results in Biomedicine, Journal of Negative Results—Ecology and Evolutionary Biology, Journal of Articles in Support of the Null Hypothesis). In addition, Kelner noted that PLoS ONE will publish studies with negative findings. To this point, one participant suggested that scientific quality and transparency would increase if there were a widespread effort by journal editors to publish a certain percentage of articles each year that contain negative results.

BOX 2-2
Hypothetical Publication Bias Scenario: Failure to Publish Negative Results

Twenty laboratories tested drug X, and only one laboratory found that it lowers blood pressure and published its statistically significant results (p < 0.05). The rest of the laboratories did not observe any effect and did not publish their results. Because the p value threshold was set at 0.05, each laboratory had a 5 percent chance of getting a positive result by chance alone. If researchers could look at the results of all 20 studies done all over the world together, they would conclude that drug X is unlikely to lower blood pressure. Based on the single positive result that was published, however, drug X appeared to have important clinical potential to control blood pressure. The results were recognized by peer reviewers and endorsed by publication in a top-tier journal. Shortly after, another laboratory that worked on drug X tried to replicate the published study but could not. Its work was published in a lower-tier journal and far fewer people saw it. Three other laboratories that had found negative results decided to publish their work, but no journal was interested.

SOURCE: Kelner presentation (March 28, 2012).

During the discussion, a participant suggested that top-tier journals establish sections for “replications and challenges,” where one laboratory’s failure to replicate another’s work can receive the same level of attention as the original article. This could be a data-driven commentary on articles that have been published. Kelner supported the idea of a site that could publish negative results following review for rigorous experimental design and analysis. A participant from the National Institutes of Health (NIH) noted that the concept of a repository of negative data has been proposed many times. The NIH has, in essence, such a repository of data from its funded studies, and perhaps that knowledge of what works and what does not could be tapped to help address publication bias.

One participant noted that another pathway toward publication bias might arise if agencies disproportionately fund labs with publications of positive results in high-impact journals; this might lead to greater funding of fewer labs that emphasize singular approaches. A participant followed up that this could, in turn, push postdoctoral fellows and graduate students to seek out training opportunities in highly recognized labs, leading to further scientific biases.
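The arithmetic behind the Box 2-2 scenario can be made concrete with a short simulation. The sketch below is illustrative only and is not from the workshop; it assumes drug X truly has no effect and that each of 20 laboratories runs an independent test at a significance threshold of 0.05, in which case the chance that at least one laboratory reports a “positive” finding is about 64 percent.

```python
# Illustrative simulation of the Box 2-2 scenario (not from the workshop).
# Assumption: drug X has no real effect, so each lab's p value is uniformly
# distributed between 0 and 1, and 20 labs test the drug independently at a
# significance threshold of alpha = 0.05.

import numpy as np

rng = np.random.default_rng(1)
n_simulations, n_labs, alpha = 100_000, 20, 0.05

# One row per simulated "literature": 20 independent p values under the null.
p_values = rng.uniform(size=(n_simulations, n_labs))
any_positive = (p_values < alpha).any(axis=1)

print(f"Chance that at least one of {n_labs} labs finds p < {alpha}: "
      f"{any_positive.mean():.0%}")                     # about 64%
print(f"Analytic value: {1 - (1 - alpha) ** n_labs:.0%}")  # 1 - 0.95**20
# If only the occasional "positive" lab publishes, the literature suggests an
# effect that pooling all 20 results would not support.
```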

Poorly Designed and Executed Studies

Another type of publication bias is the publication of studies that are poorly designed, executed, and/or analyzed, in which the conclusions drawn are invalid or not meaningful. The result is a proliferation of articles that are essentially uninformative. Weeding these studies out of the submission pool can be difficult. Journals rely, by necessity, on the expertise of their peer reviewers, and not all reviewers are aware of or appropriately trained in experimental design and data analysis. Kelner noted that medical journals have been ahead of preclinical journals in adopting specific, independent review criteria, for example, the Consolidated Standards of Reporting Trials (CONSORT) guidelines (Schultz et al., 2010). (Relatedly, the National Institute of Neurological Disorders and Stroke convened major stakeholders in June 2012 to discuss how to improve the methodological reporting of animal studies in grant applications and publications. The main workshop recommendation was that, at a minimum, studies should report on sample-size estimation, whether and how animals were randomized, whether investigators were blind to the treatment, and the handling of data [Landis et al., 2012].)

Journals can help address this, Kelner said, by enforcing quality standards for peer review, and scientists can help journals by developing and disseminating relevant standards for their field. In addition, she said, funding agencies can require high-quality output from their funded scientists.

Cultural Assumptions About What Good Science Is

The third type of publication bias Kelner described results from cultural assumptions about what constitutes “good science.” A common assumption is that the best science increases fundamental knowledge, not practical application. This belief is part of the cultural fabric of the scientific community, including study sections, mentors, and some top-tier journals, she said.

Some journals, such as Science Translational Medicine, are embracing a new assumption: that the best science makes progress in improving the lives of human beings. From this new perspective, Kelner suggested, work that was previously thought to be dull becomes exciting, other studies become much less interesting, and the difficulties of doing research in humans might become worth tackling. This is not to say that science should stop doing fundamental research, she stressed; rather, unexamined assumptions about what makes “good science” can introduce biases into what research is completed and which studies are published.

New Knowledge Calls for New Models

Neuroscience has changed significantly over the past few decades, moving from the study of simple systems, such as the squid axon, to imaging studies of complex systems such as humans. Scientists can now conduct sophisticated, noninvasive studies in the living human brain to learn about diseases and the effects of treatments. In conclusion, Kelner challenged workshop participants to think about whether it is time to design a whole new set of animal models based on this knowledge, rather than trying to refine existing models. She suggested that resources be focused on advancing the understanding of neurological and psychiatric disease in people and then, on the basis of that information, building new in vitro and animal models for drug development.

In response, Paul noted that most psychiatric drugs were discovered in humans, not in animals (Conn and Roth, 2008). Chlorpromazine, for example, was developed as an anesthetic sedative but had a unique calming effect that was then tested and approved for use in psychotic schizophrenic patients. A participant added that if the mouse models for polygenic psychiatric disorders are really as poor as people think, perhaps it is time to ask under what circumstances it would be both worthwhile and ethical to go straight into human clinical trials after establishing safety.
