Sex Differences in Drug Development: Policy and Practice
The innovative pharmaceutical and biotechnology industries are critical stakeholders in the long process of bringing new therapies from bench to bedside. Morgan Sheng, vice president of neuroscience at Genentech, gave the keynote address at the workshop. He offered an industry view of sex differences in translational neuroscience. Panelists from the Food and Drug Administration (FDA), the National Institutes of Health, and industry discussed the history, guidelines, and regulations regarding the inclusion of males and females in clinical trials, and how and when industry considers and addresses the study of sex differences, given current regulatory requirements.
SEX DIFFERENCES IN TRANSLATIONAL NEUROSCIENCE: A VIEW FROM INDUSTRY
Those in the neuroscience field would generally agree that there are significant sex-based differences in how the brain works, and in diseases of the nervous system, Sheng said. They concur that it is important, even ethically imperative, to consider sex differences in all levels of research, but especially in research that involves humans and drugs given to humans. But we live in a resource-limited world, and a challenge is determining the right time to invest these limited resources so that data on sex differences can be translated into actions.
Quite simply, the business of the pharmaceutical industry is to make drugs that can help people, and to sell those drugs. Genentech focuses on development of novel drugs and targeted therapies, always with the best interests of patients in mind, Sheng said. Targeted therapies focus on treating the underlying mechanism of the disease to modify the disease course. Treatment or palliation of symptoms is also important, Sheng said, but Genentech’s philosophy is to develop disease-modifying medications. Even patients with the same basic indication for treatment can have different mechanisms of disease. To define the most appropriate population for treatment, and work toward personalized health care, more diagnostic tests that differentiate these subsets are also needed.
Although some are pessimistic about the “broken business model” of the modern pharmaceutical industry, Sheng said Genentech remains optimistic about the future of the industry. Tremendous technological advances have been made in the understanding of basic biological mechanisms. Much is known about disease pathways, which are beginning to be linked into a systems biology. Sheng anticipated that in another few decades, thousands of plausible or rational disease targets would be identified for possible drug treatment. To truly personalize health care, thousands of targets and thousands of drugs are needed.
The standard flow of drug discovery starts with a basic understanding of the mechanisms of the disease. Then drug targets are identified based on rational understanding. Those targets are “druggable,” meaning they can be attacked by small molecules, proteins, antibodies, or maybe in the future, gene therapy with small interfering RNAs. Differentiated molecular medicines are then developed to attack these targets, tailored to the groups for which they are most effective.
Sex difference is one obvious way to describe subgroups of a patient population. Another is genetic groups, which are independent of sex. Race and age also have significant effects on disease. In general, Sheng said, drug development in neuroscience is at a disadvantage because the basic workings of the brain and its disease mechanisms are still poorly understood.
Considering Sex Differences in Drug Development
Sex differences can provide information about disease mechanisms and clues about why people contract a disease. Sex differences also affect the quality of animal disease models, and should guide choices of animal models. Most importantly, from a treatment point of view, sex can affect how a drug is metabolized and how well the patient responds to the drug.
How should sex differences be considered during drug development? One could analyze sex differences in a “blanket fashion.” This means covering sex differences at a purely descriptive level, with no hypothesis, regardless of the amount of labor or time expended and without thinking about the cost. This would entail including equal numbers of males and females in all genetic studies, pathological examinations, translational experiments, and efficacy studies, and analyzing them together and separately. This approach does not consider race (or strain, for mice), age, environment, or other important factors. For certain neurodegenerative illnesses, these other factors may be more important to consider than sex.
Another issue is that of experimental animals raised in cages. Disease phenotypes in mice that have been genetically engineered to have defects can be reversed simply by changing their environment: giving them more space and several other mice with whom they can interact. Environmental enrichment is costly in an animal care facility, but it may be as important as sex differences, or more so.
A better approach to determine when to consider sex differences is when doing so provides the most value, or the most benefit for the expenditure. The first step is to analyze sex differences in general, unbiased descriptive studies of humans. A great deal of existing data can be mined for sex differences, Sheng said. One approach would be to look at the human genetics of nervous system diseases, particularly through genomewide association studies (GWAS). Other approaches include, for example, prospective clinical studies of the natural history of disease, and biomarkers of disease and its progression.
One example from the recent literature involves a genetic polymorphism in the gene for brain-derived neurotrophic factor (BDNF). BDNF is involved in many processes in the brain, including plasticity of synapses, neuronal development and survival, and neurogenesis. Reduction in BDNF has been implicated in stress and depression, and antidepressant drugs increase BDNF. The Val66Met polymorphism in BDNF is associated with impaired secretion of BDNF and is linked to several neuropsychiatric illnesses. Researchers looked at the genome-wide association between the Val66Met polymorphism and major depressive illness, and found no association overall. But when they segregated males and females, they found a significant association in males (Verhagen et al., 2010). The implications of this are not clear, but it is an example of interesting studies to come, where sex differences are parsed out through large-scale, genome-wide association studies.
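The kind of sex-stratified analysis described above can be sketched in a few lines. The example below is only an illustration: every count in it is hypothetical and chosen for readability, none is taken from Verhagen et al. (2010), and the helper name `odds_ratio_ci` is an assumption of this sketch. It computes an odds ratio with a Woolf (log-based) 95 percent confidence interval for the pooled sample and then for each sex separately.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower, upper) using the Woolf log-based interval."""
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    se_log_or = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                          + 1 / control_carriers + 1 / control_noncarriers)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))

# HYPOTHETICAL counts, for illustration only (not data from any study):
# (case carriers, case non-carriers, control carriers, control non-carriers)
counts_by_sex = {
    "male": (60, 40, 40, 60),
    "female": (150, 150, 150, 150),
}

# Pool the strata by summing each cell across sexes.
pooled = tuple(sum(cell) for cell in zip(*counts_by_sex.values()))

results = {"pooled": odds_ratio_ci(*pooled)}
for sex, counts in counts_by_sex.items():
    results[sex] = odds_ratio_ci(*counts)

for label, (odds_ratio, lower, upper) in results.items():
    print(f"{label:>6}: OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these made-up counts the pooled interval spans 1 while the male-only interval does not, which is the pattern the paragraph describes: an association confined to one sex can be diluted when the sexes are analyzed together. A real analysis would also need to account for multiple testing across subgroups.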
A more difficult question than when sex differences should be studied descriptively is when they should be investigated experimentally, moving from descriptive studies to hypothesis-driven studies. This question needs to be addressed in an indication-specific fashion. One factor that might compel one to move from descriptive to experimental studies is if there are large sex differences in the human clinical disease (epidemiology, features, outcomes, etc.). Experimental studies are also possible if the basic pathogenesis of that disease is understood, and if there is a reasonable animal model in which to study the potential sex difference. If the sex difference is largely due to environmental or social factors, studying that in an animal is difficult. If there is not a large sex difference, if basic pathogenesis is poorly elucidated, or if there is no appropriate animal model, then it is very difficult to scientifically study the sex differences in preclinical translational studies, Sheng said.
Studying Sex Differences in Neurological Diseases
Neurodegenerative diseases are a scourge of modern civilization, Sheng explained. Alzheimer’s disease (AD) affects about 5 million people in the United States. It results in impaired memory, progressing to profound dementia. The disease mechanism is not yet established, and even though textbooks focus on beta-amyloid plaques, that is still a hypothesis, Sheng said. No disease-modifying treatment is available yet, although several are in various phases of clinical development.
Alzheimer’s disease is about twice as common in women as in men. This could be partly due to age, because women live longer, or it could be that men get other diseases before they get AD. Whether or how this observed sex difference may be important for the disease is unclear, Sheng noted. Although the sex difference is not huge, it would be worth studying if the right kind of animal model were available. But current animal models for AD are not ideal.
With this in mind, when should sex differences in AD be studied? Given a finite pool of money to try to find a cure or a new target for AD drugs, Sheng said he would not spend it studying the sex difference. Rather, studying age factors would be a better use of resources because of the hundred-fold increase in the incidence of AD between the ages of 60 and 90. Another area to focus on would be the apolipoprotein E gene variant, ApoE4. People who carry a single allele of ApoE4 are 3.5 times more likely to get AD, and those with two copies of the allele are 10 times more likely.
Given the lack of good animal models, the difficulty in accounting for sex differences, and the fact that not enough is known about the basic mechanism of Alzheimer’s disease, drug discovery is more likely to be successful by focusing on some of these other factors.
Another neurodegenerative disease being studied is amyotrophic lateral sclerosis (ALS, or Lou Gehrig’s disease), a relatively uncommon illness that is more prevalent in men. A mouse model does exist for these studies. Although the underlying mechanisms are not fully understood, mutations in the superoxide dismutase 1 (SOD1) gene are associated with a small percentage of ALS cases. Transgenic expression of the human SOD1 mutation in mice recapitulates the human disease. Animals survive between 21 and 25 weeks, with females surviving longer by several days to a week.
In addition, Sheng noted, an investigational therapy also shows a sex difference. In this case a sex difference in humans can be translated to a sex difference in the animal model and in a therapy response.
Sex differences have also been observed in neuropsychiatric illnesses such as depression (twice as common in women) and schizophrenia (1.4 times as common in men, with earlier onset). Sex differences in neuropsychiatric disorders are important, but may be due in part to environmental and social factors. As a result, studying relevant neuropsychiatric sex differences is difficult in preclinical experiments. Another primary hurdle in studying sex differences in these diseases is the lack of good animal models.
When so little is known about the basic physiologic and disease mechanisms of the brain, Sheng asked, how high a priority should studying sex differences in preclinical research be given? In many diseases where the pathophysiology is relatively unknown (and where there are no large sex differences), is it more reasonable to start by trying to understand the basic mechanism with the assumption that it might apply to both sexes before we attempt to analyze the differences between the sexes?
Autism is a common neurodevelopmental disorder with a significant sex difference; males are four times more likely to be affected than females. One theory for this disparity is the “extreme male brain theory of autism” by Simon Baron-Cohen. This theory posits that males generally have less empathy than females, and autism is an extreme form of that sex distribution that occurs in the general population. More recently, autism has been shown to have a significant genetic component. Very good mouse models are available for the specific genetic causes of subsets of autism spectrum disorder, including Fragile X syndrome, tuberous sclerosis, and Rett syndrome (which affects only females).
This area should be fertile ground for hypothesis-driven translational neuroscience research that investigates sex differences, Sheng said. However, opportunities are often missed because experimental expediency can corrupt the study and weaken conclusions. Behavior analysis in translational research is labor intensive and can take a very long time. The significant variability requires a very large sample size, and multiple assays are involved. Although Genentech typically includes both males and females in all behavior experiments, in general, males are often the preferred sex for behavior studies because they are considered to be less variable, and because using only one sex halves the numbers needed for analysis.
Sometimes males are selected because they make the research easier. Rett syndrome, an autism spectrum disorder, is a severe form of X-linked mental retardation affecting only females (because affected males do not survive). Rett syndrome is caused by mutations in a DNA binding protein, MECP2. There is an excellent animal model in which MECP2 knockout mice display a range of physiological and neurological abnormalities that mimic the human syndrome. The phenotype can be rescued by activating the gene again weeks after birth.
Sheng described a study showing that insulin-like growth factor 1 (IGF-1) administered to MECP2 mutant mice reverses much of the Rett syndrome-like symptoms in the animals (e.g., it prolongs life by about 50 percent, improves locomotor and autonomic functions, enhances brain plasticity). A look at the methods, however, reveals that this study was conducted on males even though this is an almost exclusively female disease in humans. The reason is that males express a far more severe phenotype. In humans, males are stillborn or die early, but in mice, they do not die immediately and express a variety of defects. As a result, measuring the effects of rescue treatments is easier in males. The researchers really should have used both sexes of mice, Sheng said, but added that the study would have taken much longer, and the female results could have been ambiguous because the phenotype is less severe.
Sex differences are significant in normal brain structure and function as well as in behavior, Sheng said. These differences are critical to understanding and treating human diseases of the nervous system. Translational research in neuroscience is particularly complex and arduous, but sex differences should be explored in translational experiments if at all feasible, and investigated when appropriate.
However, basic mechanisms of normal brain function and pathogenesis of most central nervous system disorders are still poorly understood. Therefore, for many indications, studying sex differences may not be completely appropriate yet. In some cases, the most valuable investment of resources may be in basic neuroscience, to gain improved basic knowledge (and hence improved animal models) that is essential to the investigation and understanding of sex differences in physiology and disease.
WOMEN IN CLINICAL TRIALS: FDA POLICIES
Beginning around 70 years ago, and into the early 1970s, women were prescribed diethylstilbestrol to prevent miscarriages and premature deliveries, said Ameeta Parekh, director of research and development in the Office of Women’s Health at the FDA. Years later, as daughters of these women became adults, the daughters faced miscarriages, infertility, and a higher prevalence of vaginal and cervical cancers. Another drug, thalidomide, taken by pregnant women in the 1950s to prevent nausea related to pregnancy, was later discovered to result in phocomelia (flipper-like limbs) and stunted limb growth in children. These examples represent the most extensive outbreaks of drug-induced birth defects in medical history, and were the basis behind the regulatory history that starts with the 1977 FDA guideline, General Consideration for Clinical Evaluation of Drugs. With a strong tone of protectionism, the guideline said that women of childbearing potential should be excluded from the earliest dose-ranging studies. However, the guideline was broadly taken to mean that women should be excluded from all phases of clinical trials rather than just the earliest phases. As a result, women were excluded or underrepresented in clinical trials, which in hindsight was more detrimental than beneficial. Critics of the guideline said it precluded a woman’s ability to decide for herself whether to participate, and violated the principle of informed consent. Advocacy groups contended that females were being denied access to important and innovative therapies.
In response to these concerns, several new FDA guidelines followed: the 1988 guideline, Format and Content of the Clinical and Statistical Section of an Application, and the 1989 guideline, Study of Drugs Likely to Be Used in the Elderly, both of which recommended analysis by age, race, and sex.
But a guidance is not a regulation or a mandate, and there was concern as to whether these guidelines were enough. Advocacy groups lobbied Congress, and in 1992, Congress requested a Government Accountability Office (GAO) survey of the representation of women in clinical trials. GAO concluded that women were not adequately included (GAO, 1992). For 60 percent of the drugs GAO reviewed, the representation of women in the trial was less than the prevalence of the disease. Even when women were included in the studies, the data were not analyzed for sex differences. Overall, GAO concluded that there was a lack of understanding of sex and gender differences.
In 1993, following the GAO report, a new guideline was issued that reversed the 1977 policy of exclusion of women of childbearing potential from early trials. The guideline, Study and Evaluation of Gender Differences in the Clinical Evaluation of Drugs, recommended collection and analysis of data on sex differences for effectiveness, adverse events, and pharmacokinetics. The guideline also addressed reducing the risk of fetal exposure through protocol design.
In 1998, the FDA issued regulations addressing investigational new drug applications (INDs, generally submitted to the FDA before an investigational product can be shipped for clinical trials) and new drug applications (NDAs, submitted to obtain marketing approval for a new product). The regulations require that NDA submissions and IND annual reports include information on trial participation, safety, and efficacy, with data presented by age, race, and sex.
IND regulations required that the data be tabulated by age, race, and sex, and NDA regulations required that safety and efficacy data be presented by age, race, and sex. However, there is no requirement for any specific number or percentage within any subgroup or subpopulations, and as a result, the studies may or may not be powered sufficiently to look at differences between subgroups.
In 2000, the regulation was amended to allow the FDA to put a clinical hold on IND studies of treatments for life-threatening diseases if women were excluded due to reproductive potential.
In 2001, nearly 10 years after the 1992 GAO report, Congress requested that GAO again report on the status of women in clinical trials. This time GAO found that appropriate numbers of women were included in studies submitted as part of NDAs, and that participation of women was similar to that of men, except for the earliest phase clinical trials and in select therapeutic areas, particularly cardiovascular disease (GAO, 2001). However, analysis by sex was not consistently present. More recent data, Parekh concluded, show that women’s participation in early phase trials has continued to increase since the 2001 GAO report (Pinnow et al., 2009).
A STRATEGY FOR TRANSLATIONAL PSYCHOPHARMACOLOGY IN MOOD DISORDERS
Carlos Zarate, Jr., chief of experimental therapeutics in the Mood and Anxiety Disorders Program at the National Institute of Mental Health (NIMH), described a strategy for translational psychopharmacology in mood disorders involving multimodal imaging, complex math modeling, and psychiatric stress testing.
As discussed in Chapter 3, depression and other mental disorders are complex behavioral disorders with clear biological differences between men and women. However, little has been found in terms of sex differences in treatment response in depression (e.g., dosing, pharmacokinetics, adverse effects, drug interactions, or the roles of sex-linked genetic traits, menopause, perimenopause, and the menstrual cycle).
The association between sex differences in depression and treatment response remains unclear. One area of focus is genetics, and there has been considerable interest in certain single nucleotide polymorphisms within the serotonin transporter gene that may be associated with observed differences in clinical response to selective serotonin reuptake inhibitors (SSRIs). But whether polymorphisms are relevant in clinical practice is questionable.
The largest study conducted to date on sex and treatment response in depression is the STAR*D (Sequenced Treatment Alternatives to Relieve Depression) trial funded by NIMH, in which 3,000 outpatients received the SSRI citalopram for 10 to 14 weeks. The remission rate (absence of depressive symptoms) was about 24 percent in men and 29 percent in women, a difference of about 5 percentage points in favor of women. The time to response and remission was not different between males and females. In the end, Zarate explained, the sex differences in remission are small, with low overall remission rates and an extended time to achieve remission.
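To put the size of that difference in context, the sketch below works through the arithmetic for the approximate remission rates quoted above (24 percent and 29 percent). It separates the absolute difference from the relative difference, and then estimates how many participants per sex a two-group comparison would need to detect a gap of that size. The sample-size step uses the standard normal-approximation formula for comparing two proportions; the two-sided alpha of 0.05 and the 80 percent power are assumptions of this sketch, not values taken from STAR*D, and the function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2                          # pooled proportion under the null
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Approximate remission rates quoted in the text.
remission_men, remission_women = 0.24, 0.29

absolute_difference_pp = (remission_women - remission_men) * 100
relative_difference_pct = (remission_women / remission_men - 1) * 100
required_n = n_per_group(remission_men, remission_women)

print(f"absolute difference: {absolute_difference_pp:.1f} percentage points")
print(f"relative difference: {relative_difference_pct:.1f} percent")
print(f"approximate n per sex (alpha=0.05, power=0.80): {required_n}")
```

Under these assumptions, a 5-percentage-point gap at remission rates this low calls for roughly 1,200 participants per sex. That is one reason smaller trials, which are not required to enroll any specific number per subgroup, may be unable to detect differences of this size.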
Much work is currently being done on how endophenotypes might relate to genes associated with depression. In a similar approach, Zarate is looking at treatment response first, and then which genetic underpinnings might be responsible for that treatment response. His approach employs multimodal imaging, objective data, and psychiatric stressors or challenges.
Ketamine, a non-barbiturate anesthetic used worldwide for anesthesia, is useful as a tool for translational psychopharmacology in mood disorders. Ketamine acts by blocking the N-methyl-D-aspartic acid receptor, and has been studied in many conditions throughout the years (e.g., schizophrenia, cognition, alcoholism, chronic pain syndrome). In patients with treatment-resistant major depressive disorder, treatment with ketamine resulted in a robust, rapid, and sustained antidepressant response within 2 hours of a single infusion, compared to the weeks to months that other therapies take to achieve remission (Zarate et al., 2006). Zarate is now working to identify biomarkers that will predict response to ketamine.
In preclinical models, ketamine’s mechanism of action appears to be enhanced α-amino-3-hydroxyl-5-methyl-4-isoxazole-propionate (AMPA) receptor throughput. On a clinical level, Zarate has found that synaptic plasticity or potentiation is associated with the antidepressant effects of ketamine (using slow-wave activity as a putative marker of synaptic plasticity). Using ketamine as a model, within about 4 to 6 hours responders and non-responders can be identified using multimodal imaging technologies. One can then apply psychiatric stress testing and complex math models to see what moderators might be relevant.
In conclusion, Zarate noted that sex difference matters, but the interaction with other variables is the important issue. To date little is known about the impact of sex difference in response to depression treatment. Neurobiological parameters may be valuable in predicting treatment response, and may better explain variance in response than common subdiagnostic categories. One approach to looking at sex differences in treatment response is to identify biomarkers of response and determine which combination of factors, including sex difference, explains the greatest variance of response. This is a step toward personalized health care.
PRACTICAL ISSUES OF ADDRESSING SEX DIFFERENCES DURING TRANSLATIONAL RESEARCH AND CLINICAL DRUG DEVELOPMENT
Douglas Feltner, vice president of Global Translational Medicine and Neuroscience at Pfizer Inc, reviewed some of the practical issues in drug development. First, it is important to remember that the molecules being studied in translational research are drug candidates, not drugs, and little is generally known about them. Safety and efficacy must be characterized, but cost and time are critical constraints. Ultimately, a candidate does not become a drug until it receives marketing approval and has an approved product label.
In animal toxicology studies, rodent and non-rodent animal species are exposed to drug concentrations that are far in excess of the projected efficacious concentration so that safety issues that might be of concern in humans can be identified. In the IND toxicology studies (conducted before filing an IND to support moving the product into humans), both sexes of animals are used. Animal toxicology data are used to set a human exposure limit. The key toxicology questions are: Is there a therapeutic index (a dose that is effective, but not toxic)? Are any toxic findings in animals reversible? Can they be monitored in humans? For example, tissue necrosis or cellular hyperproliferation are generally not reversible and usually not monitorable. Changes in heart rate, blood pressure, or QT intervals are generally reversible and monitorable.
Sex differences do occur in toxicology findings, Feltner said, and fall into three main types: (1) the same finding may occur in both male and female animals, but at different exposures; (2) a unique safety finding may occur only in one sex, sometimes in a reproductive organ, but other times in other organ systems; and (3) safety findings may occur at the same exposure in male and female animals, but the associated doses may be quite different due to variations in bioavailability, distribution, metabolism, or elimination.
The translatability of irreversible and non-monitorable animal safety findings with no projected therapeutic index is really not known because in nearly all of these cases, these drug candidates are dropped from development. If there is a backup molecule, and it is suspected that the toxicology findings are not related to the mechanism of action of the drug candidate, but rather some structural effect, that backup candidate might be investigated further.
On the other hand, for reversible and monitorable safety findings in animals, something is often known about translatability because those problems may have been addressed before. Sometimes sex differences in animal safety findings are reversible and monitorable, and have an acceptable therapeutic index, Feltner said. If, for example, the safety finding occurs in males at 100 times the efficacious concentration, and is not observed in females until 1,000 times the efficacious concentration, that is not really a concern because the dose would not be set that high in humans.
With regard to efficacy, the point was made throughout the workshop that most animal efficacy studies are done in males to reduce variability. Feltner noted that regardless of whether male or female rats would be more or less variable, having just one sex of rodent is likely to produce less variability than having both. Male/female efficacy differences are generally not explored in animals. The bigger issue for neuroscience discovery is successfully matching novel targets to the right patients. Too often the efficacy observed in animal models does not translate to efficacy in a patient population. Would studying both male and female animals help with this problem?
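The variability argument can be made concrete with the law of total variance: the variance of a mixed-sex cohort equals the average within-sex variance plus a term for the spread between the sex-specific means. The sketch below is illustrative only. The means and standard deviations are hypothetical values, not measurements from any study, and `mixed_cohort_variance` is a name introduced here for the example.

```python
def mixed_cohort_variance(groups):
    """Variance of a cohort mixed from subgroups.

    groups: iterable of (proportion, mean, standard deviation); proportions sum to 1.
    Applies the law of total variance: within-group part + between-group part.
    """
    groups = list(groups)
    if abs(sum(p for p, _, _ in groups) - 1.0) > 1e-9:
        raise ValueError("group proportions must sum to 1")
    grand_mean = sum(p * mean for p, mean, _ in groups)
    within = sum(p * sd ** 2 for p, _, sd in groups)
    between = sum(p * (mean - grand_mean) ** 2 for p, mean, _ in groups)
    return within + between

# HYPOTHETICAL assay values, for illustration only: (proportion, mean, sd)
males = (0.5, 10.0, 2.0)
females = (0.5, 14.0, 2.5)

mixed = mixed_cohort_variance([males, females])
print(f"male-only variance:   {males[2] ** 2:.3f}")
print(f"female-only variance: {females[2] ** 2:.3f}")
print(f"mixed-sex variance:   {mixed:.3f}")

# Counterexample with equal means: mixing does not exceed the noisier sex.
equal_means = mixed_cohort_variance([(0.5, 10.0, 1.0), (0.5, 10.0, 2.0)])
print(f"equal-means mixed variance: {equal_means:.3f}")
```

When the sexes differ in mean response, the between-sex term pushes the mixed-cohort variance above that of either sex alone, which matches the point attributed to Feltner. The counterexample shows the limit of the claim: if the means are equal, a mixed cohort is less variable than the noisier sex on its own. Analyzing sex as a factor rather than pooling removes the between-sex term, at the cost of larger group sizes.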
IND toxicology is completed prior to starting reproductive toxicology studies, so reproductive toxicology results are not generally available at the start of Phase I trials. Key data are necessary before exposing women of childbearing potential to an investigational product, including data on potential maternal and fetal toxicity, which are derived from embryo-fetal development studies in rats and rabbits. Female and male fertility studies are completed prior to initiation of Phase III trials, and pre- and postnatal development studies are completed prior to submission of the NDA. Embryo-fetal development studies are usually completed shortly before Phase IIa in order to allow for adequate patient recruitment, including a more representative population, but it depends on indication, Feltner said.
This means that, for practical reasons, males predominate in Phase I studies because embryo-fetal toxicology is not yet done. Women of non-childbearing potential can also participate in Phase I studies. After embryo-fetal developmental toxicology is done, the male-to-female ratio for recruitment depends on the disease being studied (although for reasons unknown, the actual recruitment may have slightly more males than the epidemiology of the disease would predict).
Differences in tolerability related to male/female exposure differences may be found in Phase I studies. Careful pharmacokinetic characterization allows for understanding of these sex differences in exposure by dose, but with the small sample size of Phase I studies, only limited information on differences in tolerability can be obtained, and none on efficacy.
As product development approaches the NDA submission, sex difference effects in exposure are examined in a population pharmacokinetic analysis across all of the studies that will be part of the submission. A pharmacokinetic/pharmacodynamic model is built as data accumulate; efficacy and common adverse events are related to either exposure or dose, by sex; and findings are then used to support dosing recommendations.
Ultimately, moving a drug candidate through development involves responding to the new data that are always emerging, Feltner said. In some circumstances more research is necessary to understand sex differences in efficacy or safety in the clinic, and in other circumstances, nothing more needs to be done than is currently being done. The next steps depend on the cumulative data up to that point.
Participants made a variety of additional points regarding preclinical research, including the value of animal models, reporting results of animal studies, and the need for clinical and basic researchers to work together on animal model development.
One participant expressed the opinion that psychiatric disorders are essentially human, and it is unlikely that, for example, an animal model will ever be depressed from a human point of view. However, understanding traits in animal models and looking at endophenotypes does provide relevant information about the disorders. Understanding the mechanisms behind the disruption of the reward circuitry around food motivation, for example, is extremely important for understanding the nature of eating disorders in humans.
A participant suggested that it is not necessarily the animal models that are at fault, but the quality of the science that is done around those models. A participant urged industry to make preclinical animal data available as soon as possible so that those in the public sector could review outcomes. The data do not have to be published in the traditional sense, he said, but simply made available, especially for systematic reviews.
Stevin Zorn, workshop cochair, noted that many animal models used in drug discovery research were developed more than 50 years ago, when clinicians and basic animal researchers worked closely together to model human diseases. Given what is now known about the complexities of diseases, and the impacts of not just single, but multiple, genetic defects, it is time to get basic and clinical researchers back together to reevaluate what the animal models are, what information the models can provide, and what can be modeled.
Current diagnostic criteria were raised as a barrier to progress in research. Symptom-based disorders are qualitatively different from neurodegenerative or neuroinflammatory disorders, a participant said. In the ongoing large-scale GWAS, the biggest bottleneck is going to be the phenotyping of patients. The symptom-based diagnostic categories used in pain and psychiatry are consensus-based criteria that subclassify further into artificial categories that do not consider the full syndromes. Much better phenotyping is needed, and sex-based differences should be included automatically in these phenotypes. Otherwise they are not complete, accurate, and descriptive phenotypes.
Zarate concurred with concerns about diagnostic categories. Statistics show that an individual with two comorbid anxiety disorders has a 50 percent chance of having a third. With three or four comorbid anxiety disorder diagnoses, why not just have an anxiety disorder across all of the comorbidities and focus on that? As time passes, hopefully more biomarkers will be recognized in psychiatry, and diagnostic groups will be needed less.