To bring the workshop to a close, each session chair provided a brief synopsis of his or her session, reflecting on recurring themes and thoughts for the future.¹
CURRENT ANIMAL MODELS
There is much that can be done in animal models that cannot be done in nonanimal models and humans, and there are regulatory requirements for testing in animals before investigational compounds can be tested in humans. Studies in animals allow, for example, evaluation of drug targets; testing of the pharmacokinetics, metabolism, and distribution of investigational compounds; and prediction of the dose that will be maximally efficacious and yet still tolerable and safe. Animal studies expand the understanding of nervous system diseases and disorders in a defined system and allow researchers to draw links to clinical pathways and formulate hypotheses for human testing.
In summary, said session chair Stevin Zorn, although we can learn much from animal models, it was emphasized in the presentations and discussion that what we learn depends heavily on what questions are asked, how those questions are asked, and how the results are interpreted. When failures in translation occur, some potential questions to ask are: was it the animal model itself, the analysis, the clinical trial, or another factor? In this regard, the importance of training in skillful study design and appropriate statistical analysis was discussed.
¹ The topics highlighted in this chapter are based on the summary remarks made by each session chair during the final workshop session and at the end of his or her session. Additional comments by participants related to the closing remarks are also included. As noted in Chapter 1, comments included here should not be construed as reflecting any group consensus or endorsement by the Institute of Medicine or the Forum on Neuroscience and Nervous System Disorders.
Humans are uniquely different from animals in many ways and there really is no such thing as an animal model of human disease (e.g., an “animal model of depression”). It is important to recognize that animal models are only models of what is being modeled. In other words, these animals are modeling specific perturbations in an effort to understand the biology and assess potential therapies.
Bringing a new drug to market involves numerous challenges and significant costs, and many potential therapies fail to meet expectations in Phase III clinical trials. Solving the problem of attrition of investigational new drugs will require a “renaissance,” Zorn suggested. Preclinical and clinical scientists need to reunite and redefine what is needed to enhance translation from preclinical models to human clinical trials. Precompetitive alliances and cross-sector collaborations were mentioned as potential approaches. An issue raised by multiple participants was that effective, cross-discipline discussion will require a better understanding of each other’s vocabulary.
Also discussed was the need for better definition of the behaviors and physiological measures examined in both the animal model and the human condition. Clearly defining what is measured is important in terms of gauging reproducibility and validity of model systems. There was much discussion about the importance of measuring “the right thing,” but exactly what that is for a given model or disease was, and will continue to be, passionately debated.
Another factor impacting the use of animal models is publication bias, including the tendency in the literature to publish more positive than negative findings; the publication of poorly designed, executed, or analyzed studies that could contribute to uninformative conclusions; and cultural assumptions about what constitutes “good science.”
To foster discussion it was suggested that perhaps it is time to design a whole new set of animal models based on emerging knowledge (e.g., from imaging and genetics), rather than trying to refine existing models. It was also suggested that much can now be learned from sophisticated, non-invasive studies in the living human brain that can inform development of animal models.
Potential Methods for Increasing Translation
• Development of hypothesis-driven experiments
• Sample size calculations
• Blind coding of analyses
• Randomization of treatment assignments
• Blinding of experimenter to treatment assignments
• Careful matching of control and experimental groups (e.g., sex, age, strain)
• Rigorous statistical approaches
• Reproduction in independent cohorts
• Multiple outcome measures
• Matching basic and clinical endpoints
• Regular testing for genetic drift or loss of phenotype
• Use of sensitive positive controls
• Control of testing parameters (e.g., time of testing, lighting conditions)
• Automated testing
• Low-stress, non-aversive testing environments
NOTE: This list was identified and summarized by the rapporteurs. This list is not comprehensive and should not be construed as reflecting any group consensus.
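As an illustration only, three of the listed practices (sample size calculations, randomization of treatment assignments, and blind coding) can be sketched in a few lines of Python. This sketch was not presented at the workshop. The function names, the group labels, the code format (e.g., "S001"), and the choice of a normal-approximation formula for comparing two group means are all assumptions made for the example; a real study would choose the calculation to match its planned statistical analysis.

```python
import math
import random
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Animals per group for a two-sided comparison of two group means.

    Assumed formula (normal approximation):
        n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2
    where d is the standardized effect size (difference in means / SD).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def randomize_blinded(animal_ids, groups=("control", "treated"), seed=None):
    """Randomly allocate animals to equal-sized groups and issue blind codes.

    Returns (key, blinded):
      key     maps blind code -> (animal_id, group); held by a third party.
      blinded maps animal_id -> blind code; this is all the experimenter sees.
    """
    if len(animal_ids) % len(groups) != 0:
        raise ValueError("number of animals must divide evenly among groups")
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    # Dealing the shuffled animals round-robin yields equal group sizes.
    allocation = {a: groups[i % len(groups)] for i, a in enumerate(shuffled)}
    # Codes are shuffled separately so their order reveals nothing about group.
    codes = [f"S{i:03d}" for i in range(1, len(shuffled) + 1)]
    rng.shuffle(codes)
    key = {code: (a, allocation[a]) for code, a in zip(codes, shuffled)}
    blinded = {a: code for code, (a, _) in key.items()}
    return key, blinded


if __name__ == "__main__":
    n = n_per_group(effect_size=1.0)  # hypothetical expected effect of 1 SD
    ids = [f"mouse-{i}" for i in range(1, 2 * n + 1)]
    key, blinded = randomize_blinded(ids, seed=2013)
    print(n, "animals per group;", len(blinded), "blind codes issued")
```

Keeping the unblinding key separate from the coded list is what allows both the experimenter and the analyst to remain blind until the analysis is complete.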
DISORDER-FOCUSED BREAKOUT DISCUSSIONS
To foster more in-depth analysis of the translational success of animal models, the second session of the workshop featured concurrent breakout discussions on six areas of neuroscience research: neurodegeneration, Alzheimer’s disease, stroke, schizophrenia, addiction, and pain. Based on the breakout summaries provided by each group moderator when the full workshop reconvened (see Chapter 3), session co-chair Richard Hodes offered his perspective on the common themes and differences across the groups.
Variable State of Understanding of Pathophysiology
The understanding of underlying pathophysiologic processes differs considerably across the nervous system disorders examined, Hodes said.
Our understanding of the genetic component of Alzheimer’s disease, for example, has generated therapeutic targets that have been reproduced in animal models of particular disease features. This knowledge is both a strength and a weakness, Hodes cautioned. While there are potential targets to capitalize on, it is important to remember, as Zorn noted above, that the full disease has not been modeled, only the target of interest. This is in contrast to other conditions such as schizophrenia, where there was discussion about the usefulness of current models and defining exactly what should be modeled.
Therefore, the next steps for improving translation of animal models may differ depending on the current state of knowledge about the basic biology of diseases and disorders.
Adequacy of Models
Hodes recalled the summary from the stroke breakout group, where it was noted that adequate animal models of stroke exist, but translation of the science from animal model to humans has failed. It was suggested that this discordance between animal and human studies in stroke could be due to bias in both animal and clinical study design or to the failure of animal models to adequately mimic clinical disease. This was characteristic of much of the discussion in that there are a variety of possibilities to explain where the faults may be when models do not successfully translate. Hodes cautioned against letting the animal model become the “standard” and lamenting that the clinical state is not cooperating with the model.
Risk That Research Is Constrained by Models
As alluded to above, when models are considered adequate (e.g., arterial occlusion models of stroke) or compelling (e.g., transgenic mouse models of Alzheimer’s disease pathology), there is a risk that research then becomes restricted to those areas or models. In the case of Alzheimer’s research, for example, the amyloid hypothesis has dominated. Only very recently has there been more interest in tauopathy and in mouse models that recapitulate tau-dependent neurodegeneration. It is important to keep an open mind when working with established models, and not become locked into them in a counterproductive way.
Hodes observed that the discussion of the strengths and weaknesses of collaborative efforts tended to group cooperation into two types: parallel partnerships and cooperative relationships that involve sequential roles. Some discussions, for example, focused on concerns that academic institutions and not-for-profit organizations carry out the basic research, and then hand it off to drug discovery and development, as if these are two discrete components in the process and not one continuous pathway.
Issues surrounding the standardization of animal models were discussed from the perspective of preclinical models of anxiety, cognitive assays, and Alzheimer’s disease models. Session chair Walter Koroshetz pointed out that an overarching concern for these and all models is that the process of tool development and standardization of models is not easily funded or staffed.
Koroshetz observed that inter-laboratory standardization may be more difficult for some tests than others. It is important to know what the different performance characteristics are for a particular test. For one of the studies, even with great efforts to standardize all parameters among test sites, there was inter-site variability in the results of one type of behavioral assessment (elevated plus maze), but consistency among test sites with another (voluntary alcohol consumption).
Another topic of discussion was concern that over-standardization can artificially inflate significance and reduce generalizability. This was shown in a study in which heterogeneity was introduced systematically in the comparison of two strains of mice (by altering housing cage size and lighting during testing). Under highly standardized conditions, the magnitude of the difference between strains was variable across the experiments. However, when select parameters were heterogeneous, there was remarkable consistency in the strain differences. There was also discussion about the use of primarily one or two particular inbred isogenic mouse strains for behavioral studies and whether that raises questions about generalizability beyond the model.
In discussing better ways in which to assess animal behavior, it was noted that there can be tremendous interference introduced by the experimenter. One approach is to “take the experimenter out of the experiment,” automating as much as possible to minimize experimenter contact with the animal and to reduce variability. The use of visual touchscreens for cognitive testing of both mice and humans was described as an example of standardization and translation of testing methods.
The discussion also expanded on the point made in the prior two sessions that animal models do not simulate every aspect of a complex disease, but they are very useful for dissecting out particular pieces. This applies not only to molecular pathology, but also to underlying perturbations to networks and functional circuits.
Koroshetz reiterated a list of best practices for preclinical animal studies offered by Mucke, which included blind coding of all analyses and allocations, carefully matching experimental and control groups, rigorous statistical approaches, reproducing results in independent cohorts at different times and in different conditions, using multiple outcome measures, quality control of animal models, validating across models and in the human condition, and including sensitive positive controls to help eliminate false negatives. Several participants noted that a positive control need not necessarily be a compound, but rather, some demonstration that the assay is as sensitive as expected to pharmacological manipulations.
Overall, there was spirited discussion about the value of standardization, with concerns raised that premature standardization might not be helpful and can stifle innovation. It was suggested that improving experimental procedures may be the most helpful to the field going forward. Many participants stressed the importance of best practices, training scientists in well-established principles of experimental design and analysis (e.g., statistical power), and bringing researchers together to share and compare approaches.
CORRESPONDING ANIMAL AND CLINICAL ENDPOINTS
Session chair Sharon Rosenzweig-Lipson observed that even though there may be corresponding preclinical and clinical endpoints for an aspect of disease, scientists may not know whether that aspect predicts the whole disease or disease reversal. In other words, during discussions of corresponding endpoints, researchers take a step forward in translation, but not necessarily in prediction of therapeutic efficacy. It is important to establish what the corresponding endpoint is intended to predict.
In this session, panelists discussed the role of corresponding endpoints, the choice of endpoints, and bidirectional translation for the study of nervous system disorders. Prepulse inhibition was offered as an example of the ability to study the same endpoint in an animal model and in human testing. Relatively similar manipulations alter the phenotype in a corresponding manner in both animals and humans. The prepulse inhibition assay has predictive validity for testing the activity of antipsychotics and for developing typical and atypical antipsychotics. Such a corresponding endpoint is valuable in the development of therapies for specific symptoms, in this case therapies acting at specific nodal points within the complex circuitry of schizophrenia.
The experimental autoimmune encephalomyelitis (EAE) model was described as an example of both success and failure within the same drug development program. The model predicted the efficacy of a humanized monoclonal antibody for the treatment of multiple sclerosis but failed to predict a serious adverse event. In a second example, differences in the way the EAE model was used by two different laboratories resulted in completely opposite results regarding the role of tumor necrosis factor (TNF) in multiple sclerosis. Clinical trials were conducted based on the animal study showing control of EAE by inhibition of TNF, but the trial results soon confirmed the animal study showing exacerbation of disease when TNF is blocked.
Another example described how neuroimaging tools have allowed us to understand and exploit the fact that the functional components of hippocampal circuits are very similar in their core functions across animal models and humans.
Every time we use an animal model to make a prediction, Rosenzweig-Lipson said, we need to know the level of understanding of the underlying pathophysiology on which the model is based, the validity of the model, and the level of risk in using the model to make a prediction. For some of the cognition models, for example, there is no “gold standard” model and the risk in making predictions using such a model may be very high. The risk may be lower for models based on a stronger understanding. The key questions are how big the risk is and how good the prediction is. Rosenzweig-Lipson suggested that there should be honesty in dialogues about predictive value so that later, when there is a failure, it is understood that there was, for example, only a 20 percent certainty that the model was going to make a good prediction. In some cases it is not the models that need to be improved, she said, but the dialogue about the models. One participant noted that using multiple models, rather than relying on a single model, might reduce risk through a layered strategy.
THE BASIC AND CLINICAL SCIENCE GAP
Several examples of efforts to bridge the translation gap between preclinical models and clinical trials were discussed, including National Institute of Mental Health–funded consensus-building initiatives (MATRICS and CNTRICS); a for-profit consultancy stepping into the gap between academia and industry to bring validated models to drug developers (P1vital); a company using quantitative systems pharmacology as a translation tool, applying mathematical model-based decision support to drug development (In Silico Biosciences); and a European Union government–facilitated, precompetitive public–private partnership to address specific issues in drug development (NEWMEDS).
Session chair Mark Geyer highlighted several take-away messages from the session. He continued on the theme that discussions about animal models should focus on what is being predicted. This means aiming to predict Phase IIA results rather than attempting to predict Phase III clinical trial outcomes.
As has been done in CNTRICS, Geyer suggested it is important to take the following steps:
• Be clear about what is being measured.
• Determine how to measure it in a human, both in healthy and affected individuals.
• Design tasks that are simple, manageable clinical tools for experimental medicine.
• Design tasks that are also nonverbal and amenable to study in animals.
The structure of NEWMEDS and other IMI initiatives in the European Union is enabling researchers to interact both laterally (across institutions and companies) to share data and ideas and vertically (from preclinical through to clinical). Geyer suggested that the United States can learn from this and improve upon it.
Following the session summaries, workshop co-chair Hodes called for final comments and suggestions from participants for going forward.
Participants continued to explore the impact of the terminology used to describe models, and offered suggestions for further discussion.
Model Nomenclature as a Confounder
Despite the broad recognition by many participants that animal models need to be recognized as models of only an aspect of a disease and do not represent an entire disease pathophysiology and phenotype, it was acknowledged that most models are generally referred to as an animal model of a particular disease (e.g., “an animal model of schizophrenia”). Expanding on the discussion in session IV (Chapter 5), it was suggested that equating an animal model of a particular phenotype to a human disease, and discussing the results of a study as a “treatment” or “cure” for the phenotype or disease, can be very misleading. The animal, in fact, never had the full human disease and was not cured. It was also suggested that researchers are misleading each other with these “therapeutic misconceptions.” In this context, it was suggested that authors of manuscripts, reviewers, and journal editors should carefully examine the weight assigned to published results.
It was reiterated that clinical trials are conducted for a particular disease and geared toward a drug label indication for treating specific symptoms in the context of the disease. Some participants, however, suggested engaging regulators and others in discussions on this topic during early stages of development, qualification, and validation of biomarkers and endpoints.
One can back-translate from the disease to the animal model, but that does not mean it is an animal model of the disease. In depression, for example, one model used shows an effect on emotional processing, which is a prelude to mood change. This rat model is not about mood; it is about how the animal evaluates rewards and how researchers evaluate emotional stimuli.
The use of animal models has led to a better understanding of nervous system disorders and diseases and the development of new therapeutics. However, given that there are still few treatment options for many diseases, this workshop sought to concentrate on several important questions: What leads to poor translation of animal models to clinical practice? Is it the models themselves, research expectations, how models are used to make predictions and decisions, or perhaps the level of knowledge about underlying pathophysiology for any given disease? At each step of the therapeutic development pathway, speakers and participants suggested specific areas for improvement, including the validation of targets, the design of experiments, how results are statistically analyzed, and the way in which positive and negative results are reported. Many participants supported increasing cross-sector collaboration, strengthening training programs, and improving the reproducibility of research with a goal of improving translational efforts. Finally, several participants pointed to the need to merge new tools, technologies, and techniques with current research methods, including animal models, as a way to accelerate therapeutic development for nervous system diseases and disorders.