Genomics, Proteomics, and the Changing Research Environment
Since 1944, when Avery, MacLeod, and McCarty published experimental evidence suggesting that DNA serves as the repository of genetic information (Avery et al., 1944), our understanding of the organization and biological function of DNA has increased dramatically. Their revolutionary insight led to the elucidation of the so-called genetic code, which underpins the central dogma of molecular biology: DNA makes RNA (specifically messenger or mRNA), which makes proteins. Subsequently, exploitation of tools from physics and chemistry enabled spectacular advances in genetics, leading to the molecular biology revolution in the late 1970s to early 1980s, and ushered in the era of DNA cloning with its powerful new tools to study biology.
The Human Genome Project (HGP), with its many spin-offs such as the SNP Consortium, the HapMap Project, and the Protein Structure Initiative, aims to provide a complete working knowledge of the human genome and, in the longer term, proteomics, which together will provide information and the tools necessary for advancing our understanding of human health and disease. Most recently, the advent of new technologies permitting the simultaneous study of many thousands of genes, messenger RNAs, single nucleotide polymorphisms (SNPs), proteins, or the products of genes in parallel is producing a flood of information and claims about the role of genes in human disease and behavior.
This new knowledge is revolutionizing the field of medical diagnostics and could yield a powerful arsenal of therapies that offer the promise of cures instead of just amelioration of symptoms. Precisely because of this potential, the rise of genomics and proteomics has generated numerous policy battles, of which disputes about intellectual property are but one.
This chapter provides background information on the science of genomics and proteomics and their impact on the changing paradigm in genetic or personalized medicine and briefly describes some of the policy debates that have ensued regarding openness and access to genomic and proteomic data as they have affected the conduct of science. Chapter 3 focuses more specifically on intellectual property issues affecting these fields as they have entered the U.S. patent system and the courts.
THE IMPORTANCE OF DNA SEQUENCE
After 1953, when Watson and Crick proposed the essentially correct model for the three-dimensional structure of the DNA double-stranded helix (Watson and Crick, 1953), it soon became evident that genetic information stored in DNA was both finite and discrete (or digital) in nature. Knowledge of the order of the four bases—adenine, guanine, cytosine, and thymine (A, G, C, and T)—within each DNA strand, or sequence, of an organism provides full knowledge of all the genetic information passed from one generation to the next. According to Crick, he and Watson speculated about determining the full sequence of human DNA early on but discarded the idea as one that would not reach fruition for centuries (Crick, 2004).
Astounding progress over the ensuing three decades in the discipline now known as molecular genetics, however, proved their pessimistic estimates incorrect. A DNA fragment from any organism can be inserted (or cloned) into the bacterium E. coli, which in turn can generate for further study huge numbers of copies of the desired gene fragment. In 1977, the Nobel laureate chemist Frederick Sanger developed efficient methods for using these amplified samples of genetic fragments to determine the sequence of the DNA bases and published the entire sequence of some small viral genomes (Sanger et al., 1977). By the mid-1980s, much of the molecular genetics research community was engaged in isolating and sequencing from particular organisms DNA for individual genes of interest.
Open, facile access to this relatively limited amount of DNA sequence information became an important priority for molecular biologists and molecular geneticists alike. As a result, in 1979 GenBank was established as a nucleic acid sequence database at the Los Alamos National Laboratory and was funded by the National Institute of General Medical Sciences three years later. In 1988, the National Center for Biotechnology Information (NCBI) of the National Institutes of Health (NIH) was organized, and it took over the management of GenBank.
The GenBank database is designed to provide and encourage access to the most up-to-date and comprehensive DNA sequence information to members of the scientific community. Because protein primary structures now are determined mostly by complementary DNA (cDNA) sequence analysis, links between the nucleotide and protein sequence databases are common. GenBank belongs to an international collaboration of sequence databases, which also includes the European Molecular Biology Laboratory and the DNA Data Bank of Japan. Protein sequences are archived by another international consortium, the Universal Protein Resource (UniProt), which is a central repository of protein sequence and function.
NCBI places no restrictions on the use or distribution of the GenBank data. However, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. There were 37,893,844,733 bases in 32,549,400 sequence records as of February 2004.
THE HUMAN GENOME PROJECT
In an effort to marshal these rapid advances, Robert L. Sinsheimer of the University of California, Santa Cruz, formally proposed in 1985 the possibility of a concerted effort to sequence the human genome. In 1986, Renato Dulbecco, a Nobel laureate and a member of the Salk Institute, made in the pages of Science magazine a similar proposal to provide the underpinning for the study of cancer (Dulbecco, 1986). Influential and widely circulated reports by the U.S. Department of Energy (DOE), the congressional Office of Technology Assessment (U.S. Congress, 1988), and the National Research Council (NRC, 1988) all followed and recommended such a project. The NRC report recommended that the U.S. government financially support a project and presented an outline for a multistep research plan to accomplish the goal over 15 years. Soon thereafter, NIH and DOE signed a Memorandum of Understanding to “provide for the formal coordination” of their activities “to map and sequence the human genome.” In fiscal year 1988, Congress formally launched the Human Genome Project (HGP) by appropriating funds to both DOE and NIH for that specific purpose.
As envisioned in the NRC report, the HGP did not begin immediately with human sequencing. Instead, the program sought to build infrastructure through a variety of projects. These efforts included the exploration of alternative sequencing technologies, the adaptation of existing technologies to the simpler problem of sequencing smaller genomes of laboratory organisms, and the development of low-resolution maps of the human genome. Other countries, in particular Britain, France, and Japan, also initiated genome projects of their own, and indeed several early successes came from outside the United States.
Despite broad governmental support, the HGP generated considerable controversy in the scientific community. The shift from traditional, hypothesis-driven, small-laboratory, one-gene-, one-protein-at-a-time science to this new data-driven, large-scale engineering program initially engendered resistance in the molecular genetics community. Even the project’s supporters were far from united in their vision of how best to proceed. Many felt that the project would become feasible only with the discovery of completely novel sequencing methods that would be orders of magnitude faster and cheaper than previous methods. Others, particularly Craig Venter, then an investigator at NIH, argued that for the human genome, much of whose sequence was thought to be without function (so-called junk DNA), a much more efficient strategy would be to sequence only the protein-coding genes through cDNAs, thereby reducing the amount of required sequence by a factor of 10 or more.
Despite conservative expectations, rapid progress was made on many fronts. A framework human genetic map soon emerged, with far greater resolution than initially anticipated. Circular DNA molecules, or vectors, were devised for carrying ever-larger amounts of DNA into bacteria, thereby facilitating construction of physical maps of whole genomes. Adaptation of conventional DNA sequencing approaches to highly automated machines yielded a dramatic expansion in global DNA sequencing capacity that produced in rapid succession the sequence of the first bacterial genome (H. influenzae), the first genome of a eukaryote, an organism with a cellular nucleus (baker’s yeast or S. cerevisiae), and the first genome of a multicellular animal (the roundworm C. elegans). Given the pragmatic nature of most scientists, it came as no surprise that the enormous utility of these whole genome sequences across the biological scientific enterprise quickly overcame the objections of remaining skeptics. Novel sequencing methods were not required; instead, the basic Sanger method was almost completely transformed by new machines (developed, for example, by Lloyd Smith, Leroy Hood, and Michael Hunkapiller at the California Institute of Technology) and by software from others, such as Phil Green, to deal with the data. The early enthusiasm for sequencing cDNAs or their cousins, expressed sequence tags (ESTs), waned as this information proved to be no substitute for the full genome sequence. However, once a full genome sequence was obtained, both cDNA and EST information proved highly useful in finding genes. EST and cDNA sequencing also provided a rapid means of identifying and characterizing some medically significant genes, opening a path to early intellectual property claims. Venter pursued EST sequencing vigorously, and two companies, Incyte and Human Genome Sciences, devoted extensive resources to capturing these sequences and obtaining patent rights to them (see Chapter 3).
Based on this initial flurry of success, the international HGP began the systematic sequencing of the human genome in 1996 on a pilot scale and in 1999 initiated a full-scale effort. Because many investigators wanted to participate in such a historic project, the pilot phase included laboratories throughout the world. The pilot phase was intended to evaluate the cost and quality of the product, select among the variations in sequencing strategies that were still in play, and determine whether performance and economies of scale warranted reducing the number of participants.
Funded participants met in early 1996 to coordinate their efforts. Among the critical decisions made by the group was the adoption of the “Bermuda Rules” as the basis for data sharing and release (see discussion and Box B). In a subsequent meeting, the group also considered a proposal to switch from a clone-based strategy to a whole-genome shotgun, or fragment-based, approach. The potential value of rapid access to large parts of the genome (and therefore genes) was not disputed, but the proponents could not describe a path from the shotgun data to a high-quality complete sequence. The challenges of assembling sequences of individual DNA fragments and in turn assigning all the pieces to specific chromosomal locations in the correct order and orientation were additional concerns. After vigorous debate, the switch in strategy was rejected.
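The assembly problem at the heart of that debate can be illustrated with a toy example. The sketch below is a minimal greedy overlap merger written in Python; it is not the method used by either sequencing effort, and the function names, the minimum overlap of three bases, and the short example reads are all assumptions made for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical 15-base "genome" broken into three overlapping reads.
genome = "ATGGCGTACGTTAGC"
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]
assembled = greedy_assemble(reads)
print(assembled)
```

On such a small, repeat-free example the fragments merge back into a single sequence. Real genomes contain long repeated stretches that make overlaps ambiguous, which is one reason ordering and orienting the pieces was a concern.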
As the pilot phase drew to a close, the successful groups coalesced around a common strategy and methodology, and a few groups emerged as leaders. Economies of scale also were evident. Most importantly, the pilot phase demonstrated that the strategy was capable of producing high-quality sequences in large contiguous blocks at acceptable costs and that costs were continuing to fall. Funding agencies in the United States and the United Kingdom elected to proceed with a full-scale effort, limiting resource allocations to only a small number of highly successful research teams.
Just as these decisions were being made, Craig Venter and the DNA sequencing instrument manufacturer Applied Biosystems, Inc. (ABI) surprised the genomics community with their announcement of a joint venture to sequence the human genome using a whole-genome shotgun approach, in direct competition with the international effort. Unlike the public project, their data were to be held by a company (Celera, Inc.) and initially released only to paying subscribers. Patents would be sought for genes of interest. The scientists leading both the public and private ventures had strong motives to pursue their own courses, and they justified their plans to their funders. A race was on.
On June 26, 2000, the public and private groups announced jointly at a White House-sponsored event that each had succeeded in producing an initial draft of the human sequence, with simultaneous publications describing their findings appearing in 2001 (Lander et al., 2001; Venter et al., 2001). The international HGP published a full and significantly more accurate human genome sequence in 2004. Genome sequences from species across the evolutionary tree continue to flood the databases today.
HUMAN GENETIC VARIATION
One of the most important uses of the human genome sequence information is its explanation of how DNA sequence variation leads to differences among individuals (phenotypic variation) and guidance on how to apply that information for the betterment of humankind. The development of genetic and physical maps and the ongoing release of the human sequence over the past 15 years have greatly increased the number of genetic diseases for which the causative defective, or mutant, gene has been identified. Today the genetic bases for all the major Mendelian (single gene) diseases are known. OMIM (the Online Mendelian Inheritance in Man) now lists approximately 2,000 genes in which the molecular basis for the Mendelian phenotype is known. Just 15 years ago, only a handful of such genes were known, and the cloning of a gene responsible for human genetic disease became front-page news.
Such molecular insights into disease are leading to new strategies for diagnosis and therapy. Definitive diagnoses can be made directly on the defective genes themselves, without the ambiguities of previous indirect phenotypic measures. DNA testing also can be carried out prospectively, permitting action to be taken before overt symptoms develop, an important advantage in genes that predispose individuals to cancer, for example. Tests can even be conducted prenatally, as early as the pre-implantation stage of development, or even in vitro, allowing prospective parents a choice—a significant benefit in cases of devastating childhood genetic diseases such as Tay-Sachs, sickle cell anemia, or cystic fibrosis.
Exploiting molecular insights with which to craft alternative therapies has proven to be more challenging than developing new diagnostic tools, but important progress is being made. The most obvious gene-based strategy is called “gene transfer” or “gene therapy,” which involves correcting the underlying genetic defect by providing a patient’s cells with a functional gene that directly reverses the deleterious effects of the mutant or missing gene. To date, success has been limited to a small number of relatively special cases. More encouraging, however, is the growing realization that our knowledge of the precise molecular nature of the genetic defect or mutation can lead to specific therapies that block the consequences of the mutation indirectly. For example, the discovery and characterization of the chromosomal fusion that causes white blood cells to divide uncontrollably, giving rise to the cancer known as chronic myelogenous leukemia (CML), eventually led to the development of the small-molecule drug imatinib (Gleevec®/Glivec®), which has yielded spectacular results in patients who would otherwise have died within a few years of diagnosis. Imatinib works by blocking the inappropriate function of a fusion protein, BCR-ABL, which is encoded by a new gene that is created by fusion of two chromosomes in the patient’s leukemic cells. Alternative treatment strategies are aimed at trying to restore protein functions that have been lost as a consequence of mutations. For example, drugs that directly influence the functioning of mutant forms of the cystic fibrosis transmembrane conductance regulator protein are now going into clinical trials with the hope that they will be able to restore sufficient function to alleviate the devastating symptoms of cystic fibrosis.
We know, however, that human genetic variation is at the root of many more diseases than these relatively rare single gene disorders. Detailed comparisons of human DNA sequences have demonstrated that two copies of the human genome differ by about 1 base in every 1,300. In all, there are approximately 3 billion bases in the human genome, which means that the DNA sequences of any two individuals differ at more than 2,000,000 base positions along the DNA double helix. Among such differences are those that underlie heritable variation among individuals for an enormous number of traits, such as eye and skin color. In addition, combinations of particular genetic variations within populations give rise to genetically complicated or multifactorial diseases (e.g., hypertension, colon cancer). Medical geneticists are just now beginning to use comparative human genome sequencing to understand the extent of genetic variation and to describe the common variants shared across populations. Two prominent extensions of the HGP, the SNP project and the HapMap project, have begun to build this foundation. Indeed, understanding how genetic variation leads to individual human variation is one of the great scientific challenges of the twenty-first century.
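The figure of more than 2,000,000 differing positions follows directly from the two numbers quoted above. A short Python check, using only those rounded values (the variable names are illustrative), is:

```python
GENOME_SIZE_BASES = 3_000_000_000  # "approximately 3 billion bases"
BASES_PER_DIFFERENCE = 1_300       # "about 1 base in every 1,300"

# Expected number of positions at which two genome copies differ.
expected_differences = GENOME_SIZE_BASES / BASES_PER_DIFFERENCE
print(f"{expected_differences:,.0f}")  # 2,307,692
```

Because both inputs are approximations, the result should be read as "a little over two million" rather than as a precise count.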
The path forward will inevitably involve an increasingly broad survey of genetic variation across the genome in larger and larger groups of individuals. Correlation of genetic and phenotypic differences will establish causal relationships, ultimately revealing the identities of the multitude of genes that contribute to particular traits. Methods for assaying genetic variation are changing rapidly, with various revolutionary approaches nearing commercial testing. The most impressive of such innovations would allow for the complete cataloging of an individual’s DNA sequence at a cost of less than $1,000 per person. As these cutting-edge technologies are introduced and an increasing number of causal relationships are known, the field of diagnostics will move from its current focus on single genes to a search of all the genes responsible for a particular disease. This knowledge will be critical to realizing the goals of personalized medicine, among other potential benefits, in which drugs are targeted to small groups and even individuals who are likely to benefit from the therapy and unlikely to suffer adverse reactions. For example, Genzyme has just introduced a test (based on research from the Massachusetts General Hospital and the Dana-Farber Cancer Center) that identifies cancer patients who are more likely to have a favorable response to cancer drugs that target the epidermal growth factor receptor.
The scientific community will need freedom to operate to realize these achievements, but concerns exist about the multitude of existing patents on genes and fragments of the human genome—with the prospect of even more—that could impede or even block progress. Early-stage applications are likely to be affected more severely.
WHAT ARE GENOMICS AND PROTEOMICS?
The success of the initial phase of the HGP and the attendant availability of the human genome sequence and the genomes of numerous other organisms have transformed the study of biology. Most obviously, the full catalog of genes from each genome opens up new avenues of study. No longer does an investigator need to confine his or her inquiries to a single gene or a small set of genes; instead, the behavior of an ensemble of genes can be investigated simultaneously. At the level of science policy, the genome projects have served to validate data-driven or discovery-based approaches as legitimate intellectual competitors of more traditional hypothesis-driven research programs. Finally, at the level of methods development and scientific instrumentation, the genome project made “high-throughput” methods, including robotics and sophisticated computing, part of the biologist’s standard toolkit. This constellation of attributes (comprehensiveness, data-driven character, and large scale) distinguishes genomics from its parent science, genetics.
The characteristics of comprehensiveness, scale, and intellectual attitude differentiate proteomics from more traditional ways of studying proteins in much the same way. Interactions between pairs of proteins can be evaluated, not just those of a few likely candidates. Protein identification using mass spectrometric analysis can compare the patterns obtained to what is possible in the genome and quickly identify many of the proteins in a particular mixture, rather than exhaustively characterize each constituent one at a time. In a subdiscipline of proteomics known as structural proteomics, the three-dimensional structures of proteins can be examined in systematic, data-driven projects, rather than focusing on the precise details of a single protein. Indeed, “-omics” has become shorthand for data-driven, large-scale, comprehensive projects of an enormous variety.
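The idea of comparing mass spectrometric patterns with "what is possible in the genome" can be sketched as a simplified peptide mass fingerprint search. The Python example below is an illustration under stated assumptions rather than a description of any production tool: it assumes digestion with trypsin (cutting after K or R unless followed by P), uses standard monoisotopic residue masses, and the two-entry database, the function names, and the 0.01 Da tolerance are hypothetical.

```python
import re

# Monoisotopic amino acid residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for its free termini


def tryptic_peptides(protein: str) -> list[str]:
    """Cut after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def best_match(observed: list[float], database: dict[str, str], tol: float = 0.01) -> str:
    """Name of the entry whose predicted peptides explain the most observed masses."""
    def score(protein: str) -> int:
        predicted = [peptide_mass(p) for p in tryptic_peptides(protein)]
        return sum(any(abs(m - q) <= tol for q in predicted) for m in observed)
    return max(database, key=lambda name: score(database[name]))


# Hypothetical database of sequences predicted from a genome.
database = {
    "protein_A": "MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIR",
    "protein_B": "MKWVTFISLLLLFSSAYSRGVFRR",
}
# Simulated measurement: the peptide masses of protein_A.
observed = [peptide_mass(p) for p in tryptic_peptides(database["protein_A"])]
print(best_match(observed, database))  # protein_A
```

Real searches must also handle missed cleavages, post-translational modifications, measurement error, and statistical scoring, all of which this sketch omits.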
The results of genomics and proteomics show increasing promise for widespread adoption in medicine and biology. Simultaneous measurement of many mRNA levels now can reveal patterns of gene expression for an organism or a tissue under various conditions that can then be compared, pointing to genes characteristic of certain states or reactions. For example, distinct sub-types of large-cell lymphomas, with quite different responses to chemotherapy, can be distinguished from one another by measuring mRNA expression patterns, thereby providing a means of directing therapy (Staudt and Dave, 2005). Likewise, serum samples can be decomposed into a spectrum of proteins in a search for patterns (referred to as biomarkers) of a particular disease, opening up the possibility for early detection and diagnosis.
The DNA found within each cell contains the genetic blueprint for the entire organism. Each gene contains the information necessary to instruct the cellular machinery how to make mRNA, and in turn the protein encoded by the order of bases constituting the gene. Each one of these proteins is responsible for carrying out one or more specified molecular functions within the cell. Differing patterns of gene expression (i.e., different mRNA and protein levels) in different tissues explain differences in both cellular function and appearance.
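The flow of information from gene to mRNA to protein can be made concrete with a short Python sketch. It uses the standard genetic code and makes several simplifying assumptions: the input is the coding strand of the DNA, translation begins at the first AUG, and introns, regulation, and RNA processing are ignored. The function names and the 18-base example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in UCAG order for each codon position;
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches the DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGGTCTGTCTGATTAA"  # hypothetical coding sequence
mrna = transcribe(gene)
print(mrna)             # AUGGGUCUGUCUGAUUAA
print(translate(mrna))  # MGLSD
```

Each three-base codon specifies one amino acid, so the order of bases in the gene fixes the order of amino acids in the protein.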
The original central dogma of molecular biology posited that each gene encodes the information for the synthesis of a single protein. In recent years, however, it has become apparent that the primary RNA transcript of eukaryotic genes can be processed in more than one way, so that one gene can produce more than one protein. In one dramatic case, it is thought that a single gene in Drosophila has the potential to encode more than 30,000 closely related but distinct proteins (Gravely, 2005). Furthermore, once a protein is produced through a process called translation, it can be further modified by the covalent attachment of sugars, fats, phosphate groups, and other substances (so-called post-translational modifications) that affect the function that the protein performs for the cell. Together, these various possibilities constitute the proteome, that is, the entire set of proteins made by a cell or, in the case of multicellular organisms such as humans, by all the cells of the body. The proteome is many, many times larger and much more complex than its corresponding genome.
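A number of this size arises from simple combinatorics: if a transcript contains several clusters of mutually exclusive exons and one exon is chosen from each cluster, the number of possible mRNAs is the product of the cluster sizes. The Python sketch below assumes the cluster sizes commonly reported for the Drosophila Dscam gene (12, 48, 33, and 2 alternatives), which is presumably the case referred to above; the text itself does not name the gene, so these figures should be treated as an assumption, as should the independence of the choices.

```python
from math import prod

# Assumed numbers of mutually exclusive alternatives in each variable
# exon cluster (values commonly reported for Drosophila Dscam).
cluster_sizes = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One alternative per cluster, chosen independently.
possible_isoforms = prod(cluster_sizes.values())
print(possible_isoforms)  # 38016
```

Under these assumptions a single gene yields 38,016 potential variants, consistent with the "more than 30,000" cited in the text.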
In addition to addressing the full complement of genes in a genome, genomics involves global studies of gene expression, including expression patterns of particular cell types and gene expression under specific conditions or during particular stages of development. For example, muscle cells preferentially express the gene encoding the red-colored, oxygen-storage protein myoglobin at high levels, thereby ensuring that we can exercise. In contrast, skin cells strongly express the genes encoding the skin keratin proteins that provide the protective layer covering our bodies. Skin cells do not express myoglobin, and muscle cells do not express skin keratins. Such genes are said to be silent or “switched off.” Thus, each cell, despite having the same genome, utilizes a different complement of expressed mRNAs, or “transcriptome.”
Like genomics, proteomics aims to study the entire repertoire of proteins within an organism. Proteomics is far less advanced than the field of genomics because robust technologies to study the structure and prevalence of all proteins in a cell in a high-throughput manner are only now being fully developed. The challenge that proteomics faces is enormous because of the finding that many genes code for multiple proteins, and those proteins are modified post-translationally in complex ways. As with genomics, proteomics has much to tell us about complex disease states and our own evolution. Differences in protein levels and protein modifications can be measured by two-dimensional gel electrophoresis, mass spectrometry, and protein microarrays. Researchers have claimed that measured differences in proteins from the blood of different patients can be used to predict the onset of ovarian cancer (Petricoin et al., 2002), although these approaches have yet to be demonstrated to be reliable. Furthermore, scientists using a variety of experimental approaches can determine which proteins interact with one another while performing their cellular functions.
Today, such tools are being used to understand the protein composition of complex biological networks that are responsible for carrying out complicated tasks within cells. In some cases, these networks of proteins are used to transmit signals from the surface of the cell to the nucleus, where they can switch genes on and off. Proteins making up such signal transduction pathways have emerged as important new targets for cancer drugs because of the causal link between the expression of certain genes (e.g., BCR-ABL in CML) and uncontrolled cell division.
THE IMPORTANCE OF PROTEIN STRUCTURE
Unlike DNA, which has a double helical structure regardless of its sequence composition, proteins assume three-dimensional shapes that depend on their precise sequence of amino acids and the microenvironment of the protein. The three-dimensional shapes give rise to the various functions carried out by proteins, which are the building blocks, machines, and control networks of organisms. These three-dimensional shapes (protein structure) can be determined, as was first accomplished in 1957 for myoglobin by Sir John Kendrew and co-workers (Kendrew et al., 1958), using a physicist-invented technique called x-ray crystallography. In this method, an x-ray beam is directed through a crystal composed solely of the protein of interest. The spray of x rays emerging from the crystal (the diffraction pattern) can be analyzed using computational methods to provide a full atomic description of the three-dimensional shape of the protein. In the 1980s another method, called nuclear magnetic resonance (NMR), came into use to determine the structures of small proteins. Most recently, cryo-electron microscopy has been used to visualize the three-dimensional shapes of collections of proteins found within cells that are referred to as macromolecular “machines.”
An early example of the importance of three-dimensional structure in biology was first appreciated when Watson and Crick showed that DNA strands are usually organized in pairs into a double helix, which immediately suggested how the genetic information stored in DNA can be imparted equally to two different daughter cells upon cell division. Subsequently, Kendrew’s structure of myoglobin explained how this protein carries out the specialized task of oxygen storage in muscle cells. Simply put, and in principle, “function follows form” in biology, to recast a phrase borrowed from modern architectural theory.
Insights coming from structural proteomics hold the promise of understanding biological processes at the molecular level. The other important practical advantage of knowing the three-dimensional structure of a protein is that small-molecule drugs can be engineered to fit into regions of the protein that are responsible for cellular activities. By using the protein structure to guide optimization of small-molecule drugs, pharmaceutical and biotechnology companies are seeking to develop drugs that preferentially bind to their target proteins and do not cause unwanted side effects by binding to other proteins in the body.
As the number of protein structures elucidated with x-ray crystallography and NMR increased, a need was perceived by the scientific community to archive the atomic coordinates (or positions) for each protein. In 1971, the Protein Data Bank (PDB), the worldwide repository for three-dimensional biological macromolecular structure data, was established at Brookhaven National Laboratory with just seven structures; each year a handful more were deposited. In the 1980s, the number of deposited structures began to increase dramatically, because of improvements in technology for all aspects of the crystallographic process, the addition of structures determined by NMR methods, and the emergence of data-sharing norms in the relevant scientific community. By the early 1990s, the majority of scientific journals required the deposition of atomic coordinates into the PDB and a PDB accession code as a condition of publication. At least one U.S. funding agency (the National Institute of General Medical Sciences) adopted guidelines published by the International Union of Crystallography requiring data deposition for all three-dimensional protein structures. In 1998, the management of the PDB moved to the Research Collaboratory for Structural Bioinformatics, a consortium involving the University of California San Diego and Rutgers, the State University of New Jersey. In 2005, there were more than 30,000 structures in the PDB, and more than 10,000 researchers in biology, medicine, and computer science access the PDB Web site daily. In 2004, the journal Molecular & Cellular Proteomics introduced guidelines for authors planning to submit manuscripts containing large numbers of proteins identified primarily by multidimensional liquid chromatography (LC/LC) coupled online with tandem mass spectrometry (MS/MS), or LC-MS/MS (Carr et al., 2004). The guidelines address the need for the scientific community to make such data readily available.
CHANGING SCIENTIFIC AND CLINICAL PARADIGMS
In the last decade, these advances in genomics and increasingly in proteomics have combined with technical advances in molecular biology, liquid-handling robotics, miniaturization, image analysis, and computing platforms to transform the way in which biologists approach the study of cells and even entire organisms. Together they have produced a relentless movement away from an earlier necessary penchant for reductionism toward the goal of understanding how entire biological processes work and are regulated—the goal of systems biology. Today, systems biologists study the complex interplay of a host of genes as these genes give rise to a disease symptom, such as hypertension, or analyze hundreds of proteins in a blood sample to identify patterns that may be indicative of a particular cancer.
This movement toward a more “holistic” view of biology already has begun to change the face of the academic biological research enterprise. Interdisciplinary research teams involving biologists, chemists, physicists, mathematicians, and computer scientists are coming together in growing numbers. As discoveries stemming from genomics/proteomics are transformed into valuable items of intellectual property owned by universities, the new generation of biologists will wield ever more influence.
Clinical medicine and the pharmaceutical/biotechnology industry are faced with yet more disruptive influences from the genomics/proteomics revolution. Within a decade, DNA sequence information will become integral to the practice of medicine. Prescription drug usage ultimately may be dictated by a given patient’s genetic makeup. Instead of trying to develop therapeutics to relieve symptoms, drug companies will come under increasing pressure to tailor therapies to individual groups of patients sharing a particular genomic/proteomic signature or fingerprint (as well as certain nongenetic traits). This sea change will first become apparent in the design and execution of clinical trials, in which genetic predispositions to therapeutic benefits and risks will be analyzed. The benefit to pharmaceutical manufacturers will be cheaper, faster clinical trials promising a higher likelihood of detecting a positive signal and a reduction in the number of adverse events, leading to more rapid approvals. Such advances will, however, come with a price for pharmaceutical and biotechnology companies. The advent of personalized medicine may well bring an end to the era of so-called blockbuster drugs, because product development will be restricted to smaller target patient populations. To say the least, the current economics of drug discovery, development, and marketing will change considerably.
Until very recently the field of human genetics has been restricted to studying diseases whose etiology can be traced to mutations in a single gene (e.g., cystic fibrosis). However, few common diseases are monogenic. The emerging focus of personalized medicine increasingly is on polygenic disorders and the importance of epigenetic changes in disease or the role of a family of somatic cell mutations and epigenetic changes in tumors. In these polygenic disorders, multianalyte tests will be required, and the relative contributions of each genetic or epigenetic change will need to be defined. For statistical reasons, relatively large populations will need to be studied to define the relative contributions of each change in the DNA sequence or degree of methylation, mRNA expression level or protein concentration, and degree of post-translational modification. The opportunity to create new intellectual property is rich, and it is possible that it will be difficult for one gene patent to block the development of these tests. For example, in gene expression arrays, many genes tend to show highly correlated expression levels; thus, it is often possible to substitute one gene for another without a substantial loss of predictive power. Because of the need to conduct substantial clinical trials to prove the practical value of these poly-analyte tests, however, the companies that develop these tests will have proprietary products.
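The point about substituting one correlated gene for another can be illustrated with a small simulation. The sketch below is illustrative only: the data are synthetic, the names gene_a and gene_b are hypothetical, and the "test" is a simple threshold rule rather than any real diagnostic. It assumes that disease status tracks the expression of gene_a and that gene_b is a noisy copy of gene_a.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def accuracy(values, labels, threshold=0.0):
    """Fraction of samples where 'expression above threshold' matches the label."""
    hits = sum((v > threshold) == label for v, label in zip(values, labels))
    return hits / len(labels)


# Synthetic, assumed data: 500 samples, fixed seed for repeatability.
rng = random.Random(0)
gene_a = [rng.gauss(0.0, 1.0) for _ in range(500)]   # hypothetical marker gene
gene_b = [a + rng.gauss(0.0, 0.1) for a in gene_a]   # hypothetical co-expressed gene
labels = [a > 0.0 for a in gene_a]                   # assumed: status follows gene_a

r = pearson(gene_a, gene_b)
acc_a = accuracy(gene_a, labels)
acc_b = accuracy(gene_b, labels)
print(f"correlation={r:.3f} accuracy_with_a={acc_a:.3f} accuracy_with_b={acc_b:.3f}")
```

Under these assumptions the test built on gene_b loses only a small amount of accuracy relative to the one built on gene_a, which is the sense in which a single gene patent may be difficult to use as a block. Real expression data are noisier and the loss could be larger.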
INTELLECTUAL PROPERTY AND COMMERCIALIZATION
In the 1980s, two series of events converged with the rapid advances in science to transform the conduct of academic biological science: the development of public policies that encourage, even require, scientists and their institutions to pursue commercialization of research, and the growth of the biotechnology industry, which has benefited immensely from the intellectual capital found in academic institutions and the basic research investment made by the U.S. government in biological and biomedical research. These trends, described briefly below, radically altered the culture of biology and challenged its longstanding norms of sharing and openness.
Technology Transfer Policies
In a message on October 31, 1979, and again in his State of the Union Address on January 21, 1980, President Jimmy Carter urged Congress to spur industrial innovation by enacting a three-part reform in the policy and operation of the patent system. One part of the legislative package urged the creation of a uniform government patent policy for university and small business federal contractors and grantees under which they could retain ownership of patents arising from research performed with federal support. Although the presumption was that title to patents produced by other contractors—for example, large corporations—would be kept by the government, these contractors could be granted an exclusive license for commercial exploitation of the invention.
In 1980, in response to concerns about U.S. competitiveness in the global economy, Congress enacted two laws that encourage government-owned and government-funded research laboratories to pursue commercialization of the results of their research. These laws are known as the Stevenson-Wydler Technology Innovation Act (P.L. 96-480) and the Patent and Trademark Amendments of 1980 (P.L. 96-517), the latter also known as the Bayh-Dole Act. Their stated goal is to promote economic development, enhance U.S. competitiveness, and benefit the public by encouraging the commercialization of technologies that would otherwise not be developed into products because of the lack of incentives associated with exclusive rights.
The Stevenson-Wydler Technology Innovation Act, which established basic federal technology transfer policies, enables NIH and other federal agencies to execute license agreements with commercial entities in order to promote the development of technologies discovered by government scientists. The act also provides a financial return to the public in the form of royalty payments and related fees. In 1986, the directives of this act were augmented by its amendment, the Federal Technology Transfer Act of 1986 (FTTA), which authorizes federal agencies to enter into cooperative research and development agreements with nonfederal partners to conduct research. The FTTA also authorized federal agencies to pay a portion of royalty income (currently a maximum of $150,000 per inventor per year from all royalty sources) to inventors who had assigned their rights to the government. These payments are not considered outside income; rather, they are deemed part of the employee’s federal compensation.
The Bayh-Dole Act was designed to address barriers to commercial development affecting nongovernmental entities, with the aim of moving federally funded inventions toward commercialization. The act enables grantees and contractors, both for-profit and nonprofit, to choose to retain title to government-funded inventions, and it charges them with the responsibility to use the patent system to promote utilization, commercialization, and public availability of inventions. Other provisions ensure among other things that sponsoring agencies have a nonexclusive license to use the invention for government purposes; that nonprofit organizations cannot assign rights to the invention without the approval of the sponsoring federal agency; and that organizations other than small businesses will be prohibited from granting exclusive rights to the invention for longer than the earlier of 5 years from its first commercial use or 8 years from the date of invention. The law also empowers any federal agency to require inventors or their assignees to grant licenses in order to (1) achieve practical application of the invention in its field of use; (2) alleviate health or safety needs; (3) meet requirements for public use specified by federal regulations; or (4) achieve participation by U.S. industry in the manufacturing of an invention. And the law prohibits licensing that reduces competition.5
Recipients of federal research funds, both academic institutions and industry, have 25 years of experience in technology transfer under Bayh-Dole. To accomplish technology transfer, institutions typically seek patent protection for inventions arising from their research and license rights to private entities in order to promote commercialization. In this way, private entities interested in practicing an invention in which they have no ownership may, by entering into a licensing agreement with the patent owner, obtain rights to use and commercialize it.
Because most universities share a substantial portion of the royalty income generated from patent licenses with faculty inventors, patents offer an additional incentive for researchers to pursue projects that have commercial potential. Although the rules that universities use for allocating royalties vary, a typical payment system gives a first cut from royalty income to the university to reimburse it for the costs of filing the patent. After costs are recovered, the income is then divided among the university’s technology transfer office, the faculty members listed as inventors, the faculty members’ departments, and other departments in the university. Some of these agreements can provide faculty members with as much as 50 percent of the total royalty revenue after patent costs are recovered.
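The allocation described above can be worked through as simple arithmetic. The sketch below is illustrative only: the function name allocate_royalties, the dollar amounts, and the share percentages are all hypothetical, except that the inventors' share is set at the 50 percent upper figure mentioned in the text. Actual university formulas vary.

```python
import math


def allocate_royalties(gross, patent_costs, shares):
    """Reimburse patent costs first, then split the remainder by the given shares.

    `shares` maps each recipient to a fraction of net income; fractions must sum to 1.
    Returns (amount applied to patent costs, dict of payouts per recipient).
    """
    if not math.isclose(sum(shares.values()), 1.0):
        raise ValueError("shares must sum to 1")
    recovered = min(gross, patent_costs)  # first cut: reimburse filing costs
    net = gross - recovered               # income left to divide
    payout = {party: net * fraction for party, fraction in shares.items()}
    return recovered, payout


# Hypothetical figures for illustration only.
example_shares = {
    "inventors": 0.50,                   # upper figure cited in the text
    "technology_transfer_office": 0.15,  # assumed
    "inventors_departments": 0.20,       # assumed
    "other_departments": 0.15,           # assumed
}
recovered, payout = allocate_royalties(
    gross=100_000, patent_costs=20_000, shares=example_shares
)
for party, amount in payout.items():
    print(f"{party}: {amount:,.0f}")
```

With these assumed numbers, $20,000 of the $100,000 in royalties first reimburses patent costs, and the inventors receive half of the remaining $80,000.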
Patenting also increases incentives for faculty members to keep their findings secret until a fully developed patent application or a provisional application is filed. Secrecy can be problematic for the careers of students and junior faculty members who must publish their research findings to establish their reputations and obtain funding. For this reason, most universities strive to file patent applications quickly so that publications are not delayed. Patents, because they are not validated by other academics, may not be a source of academic credit, even though they may be sources of credit in commercial science. U.S. patent law provides a grace period enabling an inventor to disclose an invention (for example, in a paper or conference presentation) up to one year before filing a patent application. Other countries do not have this one-year grace period, so a truly valuable invention with worldwide implications cannot be disclosed before the inventor applies for a patent if foreign patents are to be sought. Many universities have adopted regulations limiting the time that commercial sponsors can delay publication so that patents can be filed. Little, however, can be done to prevent faculty members themselves from delaying disclosure of their research to protect their own interests as potential inventors (Blumenthal et al., 1986; 1996; 1997).
P.L. 96-517. Summary at http://thomas.loc.gov/cgi-bin/bdquery/z?d096:HR06933:@@@L%7CTOM:/bss/d096query.html%7C, consulted August 26, 2005.
Emergence and Expansion of the Biotechnology Industry
The importance of the university scientist to commercial biotechnology has been well documented (U.S. Congress, 1988; NRC, 1988; Blumenthal et al., 1996). Early concerns about collaborative research arrangements in biotechnology, particularly those involving universities and industry, focused primarily on issues of academic freedom, proprietary information, patent rights, and other potential conflicts of interest among collaborating partners (Blumenthal et al., 1986; Kenney, 1986; Kodish et al., 1996). Biotechnology firms and large pharmaceutical companies, however, continue to support biotechnology research in universities.
Concerns persist regarding the subtle impacts of these collaborative arrangements, specifically whether university-industry relationships adversely affect the academic environment of universities by inhibiting the free exchange of scientific information, undermining interdepartmental cooperation, creating conflict among peers, or delaying or completely impeding publication of research results (Firlik and Lowry, 2000).
Several drawbacks to university involvement with industry-sponsored research have been identified (Dueker, 1997; Firlik and Lowry, 2000). University officers and faculty are concerned about constraints on academic freedom and the inevitable conflict between commercial trade secrecy requirements and traditional academic openness. Many circumstances and forces at play prompt companies to control research conduct and protect the secrecy of research data. In surveys of the biotechnology industry conducted in the 1990s, 56 percent of companies reported that in practice, the university research they supported often or sometimes resulted in information that was kept confidential to protect its proprietary value beyond the time required to file a patent (Blumenthal et al., 1997). On occasion, conflicts between companies and faculty about the content in published reports of industry-sponsored research have been reported. For example, companies have preferred not to publish the results of studies resulting in less-than-optimal data, although academics asserted that the insights to be gained by publication would advance scientific understanding (Bodenheimer, 2000).
The birth of the modern biotechnology industry can be traced to the early 1970s, with the discovery of genetic engineering techniques, such as recombinant DNA methods and hybridoma production. These discoveries were made by biochemists and molecular biologists, many of whom were working at large academic medical centers.
The formation of Genentech is often considered the starting point of the biotechnology industry. Genentech was founded in 1976 by University of California, San Francisco, scientist Herbert W. Boyer and venture capitalist Robert Swanson. In 1978, the company announced that it had successfully cloned a human insulin gene using recombinant DNA technology. This discovery was licensed to Eli Lilly, the largest U.S. producer of insulin, and in 1982 recombinant human insulin was the first recombinant drug to gain Food and Drug Administration (FDA) approval. Human insulin was considered a significant advance in the treatment of diabetes, since a number of diabetics were allergic to traditional insulin extracted from the pancreatic glands of pigs and cows.
Genentech went on to develop and market its own recombinant drugs, the first being recombinant human growth hormone, which was approved in 1985 for use in children with a rare form of dwarfism caused by a lack of sufficient endogenous growth hormone. Prior to the development of Genentech’s human growth hormone, these children were treated with growth hormone obtained from cadaver pituitary glands. Problems with this material included periodic shortages and also, rarely, the development of a lethal neurodegenerative disease called Creutzfeldt-Jakob disease, which came from an undetectable infectious agent found in cadaver pituitary tissue.
Another example of the medical advances and commercial successes that could be obtained from genetic engineering comes from Baxter’s recombinant factor VIII (Recombinate), which was developed by the Genetics Institute, then licensed to and manufactured by Baxter. Factor VIII is a blood coagulation protein missing in hemophilia A, the genetically inherited bleeding disorder that afflicts about 20,000 males in the United States. Prior to the availability of recombinant Factor VIII, the protein was collected from pooled human blood which, prior to the use of the HIV test in 1985, was often contaminated with the AIDS virus. As a result, almost all hemophilia A patients who received Factor VIII from pooled human blood before 1985 were infected with HIV, and many have died of AIDS. Similarly, hemophiliacs had also contracted hepatitis when these viruses contaminated the blood pool. Recombinant human Factor VIII, approved in 1992, eliminated the constant problem of blood contamination and offers lifesaving benefits to hemophilia A patients (Kaufman, 1989).
Recombinant insulin, growth hormone, and Factor VIII typify the advantages that can be achieved with biotechnology. These products proved to be significantly better than previous medical treatment options, and many of the new biotechnology therapies were the first treatments available for a given disease. The drugs were patentable and could command premium pricing, which helped to offset the high development and manufacturing costs and the relatively small market for the diseases treated with recombinant proteins (Thackray, 1998).
Today genomics companies are often divided into large-scale sequencers, positional cloners, and those that do functional genomics. Large-scale sequencers, such as Human Genome Sciences, Inc., develop research databases of genes, gene fragments, or gene expression patterns, which enable drug discovery. Celera Genomics entered this field as it began its human genome sequencing.
Positional cloning companies study the genomes of individuals from families that have specific diseases and try to determine which genes cause the disease. From this information, disease genes can be identified, and tests to detect them can be developed. Companies such as Myriad Genetics and Millennium Pharmaceuticals perform this kind of work.
Functional genomics companies conduct research to identify the function of genes. For example, they compare the genes in humans to those in other species, which is valuable because genes often perform the same function regardless of the species, a phenomenon called homology, and it is usually easier to assess gene function in smaller organisms. “Tool” companies, like Affymetrix, develop “array technologies” that can analyze rapidly which genes are expressed in a given tissue or cell. By comparing differences in gene expression between diseased and healthy tissue, this technology is used to discover genetic changes leading to disease.
Notwithstanding the early successes of Celera, Millennium, and Affymetrix, current business models in the biotechnology industry have shifted dramatically from the halcyon days of the genomics company bubble (1999-2000). Today, it is generally believed that long-term value creation in biotechnology can come only from the sale of pharmaceutical products. Most of the so-called platform companies of the late 1990s have disappeared or migrated to drug discovery. The prescient (not to mention lucky) few, such as Perlegen, have evolved into product companies with varying degrees of reliance on their original platform technologies.
On the other hand, much of this research activity is mutually beneficial and also advantageous to society, because it can accelerate the commercialization of useful medical products. Such collaborations also allow companies to remain up-to-date and informed of discoveries as they come off the bench. For clinical research, such as human drug trials, access to patients at university hospitals can be a primary motivator for collaborative research.
Further, biotechnology advances have frequently returned new science and new products to universities for basic and translational research. The production of recombinant erythropoietin (EPO) by Amgen in the late 1980s provides one such example. EPO is the hormone that regulates red blood cell proliferation. Although it had been discovered several decades before, it had never been available in sufficient quantities for many studies. The availability of the recombinant protein enabled such research.
From the university perspective, working with industry carries real benefits. In the medical and agricultural sciences, universities have benefited from research collaborations with industry, especially in the face of limited government research funding. Increased access to resources allows universities to expand research programs, attract new faculty, build facilities, purchase equipment, and enhance their reputations. Industrial relationships bring valuable equipment and prototypes to the university laboratory. Collaboration with industry also provides faculty with an understanding of industrial problems, enriching the training of engineers and scientists for their future work in an industrial environment.
SCIENTIFIC NORMS AND EVOLVING SCIENCE POLICIES
Since the inception of the HGP, debates about access to data and information have remained ongoing. In addition, there have been disputes about the propriety of patenting genes, partial genes, or gene products, as well as disagreements in the research community about whether academic institutions should be encouraging patenting and licensing strategies versus facilitating open access to data and resources. Early battles about the appropriateness of patenting ESTs have led some in the scientific community to be skeptical about the general direction of patent policy in this area.6 Moreover, controversy lingers over whether research organizations should be exempt from infringement when using patented products or tools in the course of research (as discussed further in Chapter 3). Collaboration with industry may be encouraged, however, by the recent Merck KGaA v. Integra Lifesciences I, Ltd7 case, which protected researchers and their academic institutions from litigation, as well as the company that paid for the work, because the work was related to its FDA submission.8 This decision seems suggestive of a growing understanding of the need to draw clearer lines regarding what conduct may, or may not, constitute patent infringement.
As science has developed and commercial activities have increased, scientists from both the public and private sectors have been addressing policy issues and appropriate norms of behavior for genomics-related research. As described below, NIH has responded to these concerns by issuing a number of guidance documents that aim to promote the sharing of both resources and data.
Concerns About Openness and Access
The tradition of sharing materials and results with colleagues speeds scientific progress and symbolizes to the nonscientific world that the goals of science are to expand knowledge and to improve the human condition. One reason for the remarkable success of science is the communal nature of scientific activity. Thus, undue restrictions on data, information, and materials derived from science, especially publicly funded science, have been a theme of many discussions in the science policy community over the past 20 years.
Science builds on previous discoveries, with dissemination of discoveries through publication a crucial part of the process. Publication in journals has its roots in the 17th century, with the initiation of the Philosophical Transactions of the Royal Society by Henry Oldenburg. Publication does more than bring together individuals who would otherwise work in isolation from one another; it provides a record of the collective body of scientific knowledge. The expectation is that a publication will contain a description of the methods and materials detailed enough that others can attempt to replicate the results and, if they are validated, build upon them without constraint. “Science is fundamentally a cumulative enterprise. Each new discovery plays the role of one more brick in an edifice” (Lander, in NRC, 2003, p. 29). Because publication and the implied unrestricted use of the results of research are central to the activity of science, the norms associated with the dissemination of results have been reinforced both implicitly and directly, with an NRC report serving as a recent example.
Publication shares with patenting a similar purpose: inducing the investigator or inventor to reveal the discovery in order to advance knowledge. Both systems expect the description to be detailed enough to allow replication. But the social contract of the exchange is quite different. Patents give the holder the right to exclude others from making, using, or selling the patented product or process in return for disclosure; thus, patents constitute a limited monopoly. Publication ascribes credit to the authors for the primacy of their discovery with its attendant benefits in exchange for unconditional use of the discovery, including the materials and methods, for the benefit of science. Publication, of course, does not preclude patenting (or vice versa), and from a social perspective the two systems are complementary: Patenting fosters commercialization of ideas, and scientific publication communicates the ideas that build the edifice of science. Scientific publication also influences the issuance of patent rights by helping to define the landscape of the prior art and obviousness criteria used in assessing the novelty of patent claims. But the different expectations from the social contract can create tensions.
Over the past 25 years, with the increasing relevance of scientific discovery for the commercial world, these tensions have increased. This is particularly true in areas such as research tools, where the discovery itself is a method or device the main commercial value of which is in furthering research. It is further complicated when an invention has the potential to be both a research tool and a therapeutic product. For example, a gene is a research tool when it is used as a probe, but it could be a therapeutic agent in a gene transfer study. Many genomic and proteomic patents are likely to have this dual character. This can be addressed by licensing the different uses under different terms (e.g., nonexclusive licensing for research and diagnostics, and exclusive by field of use for therapeutics).
Genetics is another area of conflict in which DNA is not only a tool but is something that can be abstracted simply as information and deposited in a public database, such as GenBank. In addition, DNA often cannot be invented around—that is, if one wants to study a human gene, there is effectively only one sequence to investigate. Medical research presents even further challenges with the mixing of research and medical practice in academic medical centers, especially in the conduct of clinical trials.
The EST Patent Debate
As discussed in further detail in Chapter 3, in the late 1980s and early 1990s genomic companies, universities, and NIH itself (under the leadership of then-director Bernadine Healy) began filing patent applications on ESTs, sparking debate in the scientific community over the public health consequences of genomic patenting and its impact on the culture of openness in science.
An EST is a small region in the active part of a gene. In the absence of genome sequence, an EST can be labeled and used as a probe to locate and isolate functional genes. Combined with a genome sequence, an EST can provide a valuable clue to the presence of a gene in the genome. Generally, EST patent applications contain broad claims, and researchers typically have identified new ESTs, guessed at the biological function of the encoded protein fragments through computerized searches of the DNA and protein databases, and then sought utility patents on the proteins on grounds of hypothetical function.
That strategy stimulated a forceful statement in 2000 by Aaron Klug and Bruce Alberts, the presidents, respectively, of the Royal Society of London and the National Academy of Sciences in the United States. They called guessing at gene function by computerized searches of genomic databases “a trivial matter.” Its outcome might satisfy “current shareholders’ interests,” but it did “not serve society well.” Holding that its results did not warrant patent protection, they stressed that “the human genome itself must be freely available to all humankind” (Alberts and Klug, 2000).
On the other hand, in the mid 1990s USPTO considered ESTs patentable subject matter based on a variety of utilities, such as a DNA probe. ESTs may be novel, because the sequence has not yet been published. Further, ESTs may “enable and describe at least one use, as a DNA probe.”
The scientific community has expressed several concerns about the allowance of broad patent claims on ESTs, including:
whether a DNA sequence including well-studied genes later found to contain the EST sequence would infringe the patent on that EST sequence;
whether companies currently using gene sequences in clinical trials or those selling recombinant proteins could infringe on one or more EST patents and as a result be forced to re-engineer their gene sequence and repeat years of experiments to avoid the infringement; and
whether industry would delay or refrain from investing in genomic research and development due to uncertainty surrounding the scope of millions of secret EST claims at USPTO that have not yet been made public either by publication of the application or issuance of the patent itself.
Many scientists believe that ESTs should not be patentable, based partly on the way they are discovered. Companies in the EST race ran genetic samples through DNA-sequencing machines that automatically identified expressed sequences but did not reveal what corresponding protein the sequence encoded or its functions. NAS President Bruce Alberts noted that, “This involves very little effort and almost no originality” (Abate, 1999). The scientific community’s concerns center on the fact that ESTs have a number of immediately useful characteristics that are critical to research on hundreds of diseases. For example, an EST can be used as a label to localize that sequence on a chromosome. Because the sequence information contained in an EST is enough to distinguish one gene from all others, each EST may be used to identify the chromosomal location of its corresponding gene. The ability to identify where a particular gene is located on the chromosome is important in the detection of chromosomal mutations and corresponding disease states. Using an EST as a tool in this way could provide diagnostic tests for many diseases. Thus, restrictive patenting and licensing of ESTs potentially could throw a roadblock in the way of many pathways that investigators are taking.
In 1999, amidst growing concerns about the United States Patent and Trademark Office’s (USPTO’s) publicly signaled acceptance of broad claims on ESTs and other genomic inventions, then NIH Director Harold Varmus and the National Human Genome Research Institute Director Francis Collins wrote to USPTO urging the implementation of strict criteria for biotechnology patents, and specifically urging that the bar be raised for utility standards for DNA patenting so that longstanding requirements for utility would clearly apply. Since USPTO clarified its standards in 2001, examination has become much more rigorous; that is, fewer claims are allowed.
Ultimately, through a rigorous application of existing law, USPTO did decide that it was necessary to require more than just a piece of DNA with a unique position in the genome to establish that something was useful. The applicant also must demonstrate that he or she had conceived of some practical and substantial function of that piece of DNA. However, the utility could not be something merely general and nonspecific, for example, a computer homology search indicating a similarity to a characterized protein. On the other hand, experimental data were not needed. This policy fell short for those expecting that there would be a requirement for more rigorous experimental evidence of utility. Varmus and Collins appealed to USPTO for consideration, writing:
While we were pleased with the PTO’s new stance on the utility of polynucleotides for which only generic utilities are asserted, we were very concerned with the PTO’s apparent willingness to grant claims to polynucleotides for which a theoretical function of the encoded protein serves as the sole basis of the asserted utility.9
Currently, many patent applications directed to ESTs are pending before USPTO.
There have been persistent concerns that the patenting of an EST or indeed a gene without knowledge of its function, or with only a sketchy knowledge of its function, could preclude a product patent by some future party who discovers a much more detailed and significant functional role for that gene. Although a process patent may well be available, its value is more limited because of monitoring problems. Equally worrisome, the initial patent may block the research needed to elucidate the full role of the gene. This illustrates one feature of the patent law. A patent issued on something novel and useful can be enforced against persons seeking to use the patented invention for uses developed in the future. So far, USPTO has taken the statutory requirements for patents seriously and has issued few patents on ESTs. One well-known case illustrates the point, however. A firm received a patent on the gene encoding the CCR5 lymphocyte receptor without any prior knowledge of its link to HIV infection. Once the disease link was established, the patentee declared its intention to enforce the patent against others making use of the discovery in the development of any pharmaceutical to combat HIV (Johnson and Kaur, 2005).
Recent litigation has addressed the confusion related to the status of EST patents. In September 2005 a three-judge panel of the U.S. Court of Appeals for the Federal Circuit upheld USPTO in rejecting an EST patent application by Monsanto Corporation (In re Fisher, No. 04-1465) on the grounds that it lacked a specific, substantial, credible utility. Although the panel was split, the dissenting judge agreed that the patent should be denied, but on the grounds of obviousness rather than on utility.
Finally, now that the one-gene/one-protein postulate has proven to be overly simple (that is, one gene can encode many related variants), prior patents on genes claiming to be responsible for the production or regulation of one protein may be questioned as new discoveries are made about the many new and previously undiscovered related proteins encoded by the patented gene. Although much remains to be clarified about the appropriate standards for DNA patenting and its impact on the conduct of biological research, the burgeoning field of protein analysis raises even more consequential and distinct intellectual property policy issues. The discipline of proteomics may become an even more commercially important and active patenting arena than DNA because of its closer proximity to disease detection and therapy. Moreover, proteomics may raise novel questions of patent law that must be addressed carefully by a system that for other biological materials has evolved painfully slowly.
The Worm Model
In emerging fields, scientists often have banded together to share ideas and results before formal publication, in hopes that even more rapid scientific advances may occur. In contrast to publication, in which readers are expected to use the information without constraints, these prepublication sharing groups often impose some additional “rules of etiquette”—some restraint on the use of the information to protect the interests of the individual scientists in exchange for early access. Genetics provides several such examples, including the Drosophila community, in which the early adopters of this organism communicated data and materials amongst themselves before publication. The community of C. elegans (worm) researchers is another example and is particularly relevant, because the precedents developed can be traced directly to the data release policies of the public HGP.
C. elegans, a small free-living nematode, was selected by Sydney Brenner in the 1960s to extend his molecular genetic dissection of life from prokaryotes to animals. He considered, but discarded as too complex, the scientifically better-known and established fruit fly, Drosophila melanogaster. Despite the obscurity of the worm, Brenner’s vision of what might be learned from molecular genetic studies of this 959-celled animal soon attracted a small group of graduate students and postdoctoral fellows to Brenner’s laboratory in the Medical Research Council (MRC) Laboratory of Molecular Biology, Cambridge, England.
Within the overall goal of understanding how genes specified the development and behavior of the worm, individual students applied themselves to different aspects of its development, from the early events in embryogenesis to the wiring of the nervous system. But all were united both by the overall goal and by the need to develop the tools necessary to study this recent arrival in the laboratory. Mutants of all kinds had to be isolated and the associated DNA mutations positioned on the genetic map. The genetic tools to manipulate these mutants more effectively were revised constantly. Basic methods of manipulation had to be refined. A simple but telling example is the invention of a formed, platinum wire to replace the sharpened stick used to pick individual worms under a dissecting microscope. Word quickly spread through the laboratory and soon everyone had a platinum wire “pick.”
Within a few years, Brenner’s postdoctoral fellows went off to establish their own laboratories, often in the United States. Feeling isolated and at a competitive disadvantage relative to their peers studying the better-established Drosophila, they found that communication with their colleagues in the larger Cambridge group and with other new worm laboratories proved critical for success of the next generation. This phenomenon was institutionalized by the establishment in 1975 of the Worm Breeder’s Gazette (WBG) by Bob Edgar of Santa Cruz, an unedited compilation of contributions that was copied and sent off to all subscribers. Twice a year, laboratories submitted brief descriptions (typically one page) of new methods and exciting results, and a month later the WBG would appear. The WBG also periodically provided the community with updated genetic maps, the basic guide for all work. The information usually was shared well before any publication was planned, greatly speeding dissemination of ideas and findings. In return, readers eschewed “unfair” use of such privileged information, with fairness and individual judgment enforced by community reaction.
In addition to the WBG, the community shared stocks, mailing mutant strains around the world. Mutants were a currency of the community, providing the means of discovery. Sharing might be preceded by an explicit agreement on usage for recently obtained mutants, but mapping stocks, other tools, and published mutants were shared without constraint. These practices led eventually to the formation of a stock center, started by Bob Herman of the University of Minnesota, which centralized the activity.
It was within this community that John Sulston of the MRC Laboratory of Molecular Biology in the early 1980s began the construction of a physical map of the worm genome. The construction of the map required specialized methods carried out on thousands of clones and was considered a massive project at the time. But to be of use, the physical map had to be associated with the genetic map through genes, thereby locating the DNA fragments on the chromosomes. The community-at-large carried out this crucial activity. The central mapping laboratories (the Cambridge laboratory was joined by the Waterston laboratory at Washington University in St. Louis in the mid-1980s) provided clones to individual laboratories; in return, the laboratories identified the clones containing particular genes. As more genes were placed on the physical map, its utility became greater, rapidly escalating productivity. Again, rules of etiquette were developed to balance the interests of the community and the individual contributing laboratories. Progress on the map was communicated initially via the WBG, but as electronic media and the Internet developed, updates were provided through these media. Formal publications on the map were limited to descriptions of the methodologies and broader conclusions.
As the physical mapping project transformed into a genome sequencing project beginning in 1990, the two mapping laboratories continued their collaboration with each other and the community. The two laboratories jointly developed strategy and shared methods freely. Often the two laboratories competed in finding solutions to problems; at other times, the laboratories focused on complementary problems.
The keys to making this long-distance collaboration work were full disclosure and complete sharing of the results. In working with the community, the sequencing laboratories made the areas to be sequenced clear and submitted the results to international public databases as each segment of the genome was sequenced. But as the volume of sequence data increased along with the interest in them, it became clear that the user community could exploit the sequence data well before they were in final form. Because the public databases at the time accepted only complete sequence, the centers began posting intermediate sequence products on their Web sites. Again, rules had to be developed to govern access to these prepublication data. Users were cautioned about the incomplete, imperfect nature of the sequence and were asked to consult with the centers before publication to find out whether a more complete version of the sequence was available. The centers asked to be acknowledged for providing the sequence. The users also were asked to notify the centers of any specific genes found within the sequence. This clear and practical experience in collaboration and in sharing early sequence data with the community proved invaluable when in 1996 the nascent human sequencing centers met in Bermuda to formulate plans.
The Bermuda Rules
In 1996, an international group of scientists, from both the public and private sectors, who were engaged in genomic DNA sequencing, passed a unanimous resolution, commonly referred to as the Bermuda rules, which stated that “all human genomic DNA sequence information, generated by centers funded for large-scale human sequencing, should be freely available in the public domain in order to encourage research and development and to maximize its benefit to society” (see Box C). Since the sequencing phase of the publicly funded HGP began, all of the data generated by participants have been deposited in publicly available databases every 24 hours. By 2003, an essentially complete copy of the human genome sequence was posted on the Internet, with no subscription fees or other barriers to its use.
Box C
The following principles were endorsed by all participants. These included officers from, and scientists supported by, the Wellcome Trust, the U.K. Medical Research Council, the NIH NCHGR (National Center for Human Genome Research), the DOE (U.S. Department of Energy), the German Human Genome Programme, the European Commission, HUGO (Human Genome Organisation), and the Human Genome Project of Japan.
Primary Genomic Sequence Should Be in the Public Domain
It was agreed that all human genomic sequence information, generated by centres funded for large-scale human sequencing, should be freely available and in the public domain in order to encourage research and development and to maximise its benefit to society.
Primary Genomic Sequence Should Be Rapidly Released
- Sequence assemblies should be released as soon as possible; in some centres, assemblies of greater than 1 Kb would be released automatically on a daily basis.
- Finished annotated sequence should be submitted immediately to the public databases.
It was agreed that these principles should apply for all human genomic sequence generated by large-scale sequencing centres, funded for the public good, in order to prevent such centres establishing a privileged position in the exploitation and control of human sequence information. It was also agreed that patents should not be sought.
With the 1998 entry of Celera Genomics into the race to sequence the human genome, issues of access to the emerging data became more contentious between the public and private projects. On March 14, 2000, President Bill Clinton and U.K. Prime Minister Tony Blair issued a joint statement: “to realize the full promise of this research, raw fundamental data on the human genome, including the human DNA sequence and its variations, should be made freely available to scientists everywhere.”10
At the beginning of the structural genomics initiative, international meetings were held to determine policies for data sharing. At a meeting of genomics and proteomics investigators held at Airlie House in Virginia in the summer of 2000, participants agreed that the coordinates for structures would be deposited in the PDB no later than six months after the completion of a structure determination.11 The proposed delay was in direct response to the patent policies in Europe and Asia and the funding model for work being carried out on those continents. The policy for the pilot study phase of the U.S. Protein Structure Initiative (PSI) projects required data deposition within six weeks of completion of a structure determination. In Phase 2 of the PSI, the waiting period will be reduced to four weeks.
Unlike the consensus leading to the Bermuda Rules governing the HGP, general agreement on the desirability of instantaneous release of all interim data produced by structural genomics efforts was a nonstarter at Airlie House and subsequent international meetings. As for the issue of structural data release described above, much of the resistance was fueled by perceptions that patents on protein structures could generate significant financial returns within Japan and Europe. Funding agency representatives from these countries argued that without the prospect of such benefits, they would be hard pressed to make the case for funding structural genomics initiatives.
NRC Report Regarding Sharing
In 2003, NRC published Sharing Publication-Related Data and Materials, which stated that “Community standards for sharing publication-related data and materials should flow from the general principle that the publication of scientific information is intended to move science forward” (NRC, 2003, p. 4). The report authors argued that an author’s obligation is not only to release data and materials to enable others to verify or replicate published findings but also to provide them in a form upon which other scientists can build with further research. Furthermore the report stated, “All members of the scientific community—whether working in academia, government, or a commercial enterprise—have equal responsibility for upholding community standards as participants in the publication system, and all should be equally able to derive benefits from it” (p. 4).
10 Office of the Press Secretary. March 14, 2000. Joint statement by President Clinton and Prime Minister Tony Blair of the United Kingdom. Available at http://clinton4.nara.gov/WH/New/html/20000315_2.html. Accessed June 21, 2005.
One of the principles embraced by the NRC committee was that if material integral to a publication is patented, the provider of the material should make the material available under a license for research use:
When publication-related materials are requested of an author, it is understood that the author provides them (or has placed them in an authorized repository) for the purpose of enabling further research. That is true whether the author of a paper and the requestor of the materials are from the academic, public, private not-for-profit, or commercial (for-profit) sector. Notwithstanding legal restrictions on the distribution of some materials, authors have a responsibility to make published materials available to all other investigators on similar, if not identical, terms (p. 7).
NIH Policies for Sharing and Nonexclusive Licensing
In 1999 and 2004, NIH leadership, concerned about increasingly restrictive access to research resources and data, issued guidance in two areas to encourage best practices in the scientific community. The 1999 Principles and Guidelines for Recipients of NIH Research Grants and Contracts on Obtaining and Disseminating Biomedical Research Resources (64 FR 72090)12 were aspirational principles aimed at NIH-funded institutions and intended to balance the need to protect intellectual property rights with the need to disseminate new discoveries broadly. The principles apply to all NIH-funded entities and address biomedical materials, which are defined broadly to include cell lines, monoclonal antibodies, reagents, animal models, combinatorial chemistry libraries, clones and cloning tools, databases, and software (under some circumstances).13
Sharing Biomedical Research Resources
The 1999 principles were developed in response to complaints from researchers that restrictive terms in material transfer agreements14 were impeding the sharing of research resources. These restrictions came both from industry sponsors and from research institutions. In the Principles and Guidelines, NIH urges recipient institutions to adopt policies and procedures to encourage the exchange of research tools, specifically: (1) minimizing administrative impediments to the exchange of biomedical research tools; (2) ensuring timely disclosure of research findings; (3) ensuring appropriate implementation of the Bayh-Dole Act; and (4) ensuring dissemination of research resources developed with NIH funds.
12 A copy of the complete principles can be obtained at the NIH Web site at www.nih.gov/od/ott/RTguide_final.htm.
13 The Guidelines were issued following recommendations made to the NIH Advisory Committee to the Director by a special subcommittee chaired by Rebecca Eisenberg.
14 In the conduct of research, there is often a need to obtain compounds, reagents, test animals, cell lines, or other materials from outside individuals or entities. This is sometimes a matter of convenience (to save time and the expense of creating new research inputs) and sometimes a matter of necessity. But it also can be motivated by a desire to facilitate a research collaboration with investigators at another institution or to enable a potential corporate partner to evaluate the merit of an invention. Material transfers sometimes occur without formalities, but increasingly a material transfer agreement is used to define the rights and obligations of the parties (see Chapter 4).
Four main principles are addressed in the report:
Ensure Academic Freedom and Publication
Recipients are expected to avoid signing agreements that unduly limit the freedom of investigators to collaborate and publish.
Brief delays in publication may be appropriate to permit the filing of patent applications and to ensure that confidential information obtained from a sponsor or the provider of a research tool is not inadvertently disclosed.
Ensure Appropriate Implementation of the Bayh-Dole Act
Recipients must maximize the use of their research findings by making them available to the research community and the public, and through their timely transfer to industry for commercialization.
The use of patents and exclusive licenses is not the only, nor in some cases the most appropriate, means of implementing the act. Where the subject invention is useful primarily as a research tool, inappropriate licensing practices are likely to thwart rather than promote utilization, commercialization, and public availability of the invention.
Utilization, commercialization, and public availability of technologies that are useful primarily as research tools rarely require patent protection; further research, development, and private investment are not needed to realize their usefulness as research tools. In such cases, the goals of the act can be met through publication, deposit in an appropriate databank or repository, widespread nonexclusive licensing for nominal or cost-recovery fees, or any other number of dissemination techniques.
Minimize Administrative Impediments to Academic Research
Recipients should take every reasonable step to streamline the process of transferring their own research tools freely to other academic research institutions using either no formal agreement, a cover letter, the Simple Letter Agreement of the Uniform Biological Materials Transfer Agreement (UBMTA), or the UBMTA itself.
Ensure Dissemination of Research Resources Developed with NIH Funds
Unique research resources arising from NIH-funded research must be made available to the scientific research community.
A second section of the report, “Guidelines for Disseminating Research Resources Arising Out of NIH-Funded Research,” contains both a sample document, a UBMTA, used when transferring nonpatented materials, and sample phrases that can be used in license agreements. The goal of the NIH guidance is to simplify the process of transferring research resources from one party to another. By streamlining the process, it will help to avoid confusion about how to implement the Bayh-Dole Act properly, and will help ensure that the interests of all parties, as well as the public health, are properly balanced.
NIH Best Practices for the Licensing of Genomic Inventions
Consistent with its ongoing interest in facilitating broad access to government-sponsored research results, in 2004 NIH issued Best Practices for the Licensing of Genomic Inventions. This document aims to maximize the public benefit whenever Public Health Service-owned or -funded technologies are transferred to the commercial sector. In this document, NIH recommends that “whenever possible, nonexclusive licensing should be pursued as a best practice. A nonexclusive licensing approach favors and facilitates making broad enabling technologies and research uses of inventions widely available and accessible to the scientific community.” The policy distinguishes between diagnostic and therapeutic applications and cautions against exclusive licensing practices in some areas.
The report considers the following to be “genomic inventions”: cDNAs, ESTs, haplotypes, antisense molecules, small interfering RNAs (siRNA), full-length gene and expression products, methods, and instrumentation for the sequencing of genomes, quantification of nucleic acid molecules, detection of SNPs, and genetic modifications.
The best practice guidelines for seeking patent protection on genomic inventions depend on whether significant investment by the private sector is required to make the invention widely available. If significant investment is necessary, then patent protection should be sought. If significant investment is not necessary, however, as with many research material and research tool technologies and diagnostics, then patent protection rarely should be sought.
Regarding best practices in licensing research tools, the report recommends pursuing a nonexclusive licensing agreement whenever possible. If an exclusive license is necessary to encourage research and development by the private sector, however, then the license should be tailored to promote rapid development of as many aspects of the technology as possible. This may include limiting the field of use, specific indications, or territories the licensee has exclusive rights to develop. NIH also recommends that specific milestones be included in the licensing agreement to ensure that the technology is fully developed by the licensee. If the licensee does not meet these milestones and/or progress toward commercialization is deemed inadequate, NIH recommends that the license be modified or terminated. Additionally, whenever possible, a licensing agreement should include a provision allowing both the funding recipient and nonprofit institutions the right to use the licensed technology for research and educational purposes.
Concerns About Access to and Research Use of DNA-Based Diagnostic Tests
The area of research that has provoked the most concern, particularly among clinicians and clinical researchers, involves patents and licensing strategies for genes or partial genes associated with specific diseases. Understanding the genetics of rare and common diseases has been accelerated by the HGP, with the identification of the genes for hundreds of rare diseases, many in the past two to three years. But identifying the gene for a disease is like passing through a bottleneck. Scientists have to survey the landscape on the other side of the bottleneck to truly understand and capitalize on new opportunities for diagnosis, prevention, and treatment.
Scientists working on questions along the path between genes and function (Cho et al., 2003; Merz, 1999; Merz et al., 2002) have expressed concern about the potential restrictions that might be placed on their work if they encounter overly restrictive licensing practices by patent holders along the way. This concern is more acute in the area of diagnostic tests than it is in therapeutic product development, which clearly benefits from the protections the patent system offers during prolonged periods of research and development. The patent and licensing policies for genetic testing that followed the discoveries of the BRCA1 and 2 genes, the Canavan disease (CD) gene, and the Huntington’s disease (HD) gene illustrate the many complexities of intellectual property in the area of genomics. Each case is briefly described below.
The BRCA Story
At a scientific meeting in 1990, Mary-Claire King, then a professor at the University of California, Berkeley, announced the discovery that a small region of chromosome 17 could be linked to early-onset breast cancer. This discovery was based on 15 years of research by King as well as others in this field and fueled interest in the scientific community to find the gene responsible for the high incidence of breast cancer in some families. The Breast Cancer Linkage Consortium (BCLC), an international group of scientists interested in the genetic inheritance of breast and ovarian cancer, was formed, and by pooling resources and data, scientists in the consortium were able to make discoveries more quickly than by working alone. Nevertheless, the race to the discovery of the gene was a competitive one.
In 1994, a group of scientists working under the direction of Mark Skolnick at the University of Utah announced that they had identified the gene underlying hereditary breast cancer, and named the gene BRCA1. At the time, OncorMed also performed BRCA1 diagnostic testing based on its 1997 U.S. patent on the BRCA1 consensus sequence (U.S. Patent #5,654,155). Skolnick and the University of Utah applied for and eventually were granted a U.S. patent for the gene sequence of BRCA1. They licensed the exclusive rights to Myriad Genetics, a biopharmaceutical company founded in 1991 by Mark Skolnick and others.
The studies of the BCLC also uncovered evidence that there was at least one other breast cancer gene, based on the fact that only 45 percent of familial breast cancer cases showed linkage to chromosome 17. Soon a region on chromosome 13 was identified, and the gene was localized by two groups, one led by Skolnick and Myriad Genetics and a second at the Institute for Cancer Research in the United Kingdom. BRCA1 and BRCA2 are both tumor suppressor genes whose protein products interact to control cell growth and division. Certain mutations in the BRCA1 or BRCA2 gene disrupt the regulation of growth in mammary cells, a critical step on the path to tumor formation.
Both groups filed patents in December 1995 on the second gene, termed BRCA2. CRC Technology, the technology transfer office of the Institute for Cancer Research, filed for a patent in the United Kingdom on behalf of researcher Mike Stratton, and Myriad Genetics filed for a patent in the United States. CRC Technology was granted a patent in the United Kingdom for its discovery of BRCA2 (GB 2307477), and licensed exclusively the right to a BRCA2 diagnostic test to OncorMed, a U.S. company providing genetic testing services to clients. Myriad Genetics and OncorMed were in a legal dispute over both of these patents; in 1998, the dispute finally was settled with Myriad paying OncorMed for exclusive rights to the patents. Myriad, in essence, now had a monopoly over diagnostic testing for BRCA1 and 2 familial breast cancer in the United States.
Myriad Genetics began enforcing its patent claims against certain universities, a previously rare practice. In 1999, Arupa Ganguly, of the Clinical Genetics Laboratory at the University of Pennsylvania, received a patent infringement notification from Myriad Genetics. Ganguly had developed independently a test to screen for mutations in the BRCA genes and was charging a fee to her patients to perform this test. The University of Pennsylvania was advised to cease activities related to its testing for the BRCA genes for fear of litigation by Myriad Genetics.
To curb criticism from the academic community, in 2000, Myriad Genetics negotiated an agreement with NIH so that NIH-funded researchers would receive a discount on Myriad’s BRACAnalysis test as long as the test was used for research purposes. The negotiated price was $1,200 per test instead of the $2,580 Myriad normally charged. In exchange for this discount, Myriad would have access to resulting research data (Hollen, 2000).
Myriad also sought patent protection on BRCA1 and BRCA2 in the European Union. In 2001, three European Union patents on BRCA1 were granted to Myriad. Myriad’s first European Union patent, EP 699754, covered any methods of diagnosing a predisposition for breast and/or ovarian cancer using the normal sequence of the BRCA1 gene. Its second European patent, EP 705903, covered 34 specific mutations of the BRCA1 gene and diagnostic methods for detecting those mutations. The third European patent, EP 705902, covered the BRCA1 gene itself, the corresponding protein, therapeutic applications of the BRCA1 gene, and diagnostic kits. In 2003, Myriad was granted a European patent on BRCA2. This patent, EP 785216, covered materials and methods used to isolate and to detect BRCA2.
In 2001 and 2002, European researchers challenged Myriad’s three patents on BRCA1. When the BRCA2 patent was granted in 2003 by the European Patent Office, researchers challenged it as well. Institut Curie, a French cancer research center, and Belgian officials led the challenge along with other French and Italian research institutes. Most of the complaints fell under the categories of failure to show novelty, inventive step, industrial application, and disclosure.
In February 2004, Myriad’s patent on BRCA2 was struck down because CRC Technology had filed the claim for the gene first. The European Patent Office (EPO) granted Cancer Research U.K. patent EP 868467B1 for the BRCA2 discovery. Cancer Research U.K. plans to allow free public access to its patented gene sequence.
In May 2004, Myriad’s first European patent on BRCA1, EP 699754, was struck down, based on errors in the original sequence and lack of an inventive step. By the time errors were corrected by Myriad in subsequent U.S. patents, the sequence was in the public domain (Sheridan, 2004). In January 2005, two of Myriad’s other BRCA1 patents met similar fates in the EPO. The scope of Myriad’s claims in EP 705902 and EP 705903 was limited to probes on only the correct parts of Myriad’s originally filed sequence on BRCA1 and only for testing in Ashkenazic populations. Those claims are being reviewed and may be opposed as well. These rulings by the EPO were considered a victory for those supporting free access to BRCA1 testing in Europe (Vermij, 2005).
The Canavan Disease Story
Another aspect of the controversy around genetic testing addresses the rights of patients and families whose tissue donations enable the discovery of a disease gene and eventually the development of a specific test for its presence. The fight for control over the CD gene patent illustrates this debate. CD is a degenerative brain disease that irreversibly leads to loss of body control and death in children. It is a rare recessive disorder that affects about 200 children, mostly in Ashkenazi Jewish families.
Patient families were instrumental to the development of the CD genetic test. A Chicago couple with two afflicted children secured seed funding to develop a CD prenatal test, and more than 160 families provided the tissue and DNA samples that enabled discovery of the gene. Dr. Reuben Matalon at the Miami Children’s Hospital (MCH) led the team that discovered the CD gene (Kaul et al., 1993). Matalon and MCH received a patent on the gene in 1997 (U.S. Patent #5,679,635).
At the time the patent was issued, the Canavan Foundation was promoting CD testing. In 1998, the American College of Obstetricians and Gynecologists recommended screening for Ashkenazi Jewish couples. Soon afterwards, MCH began enforcing its patent rights. It charged a $12.50 per test royalty and, more significantly, limited the number of laboratories that could perform the test and the number of tests performed each year. It also sought to identify one laboratory that would become the market leader and ultimately receive an exclusive license (it later abandoned this attempt). MCH came under severe criticism from patient advocacy groups and from a CD screening consortium that had been banned from performing genetic testing. In 2000, MCH initiated negotiations that involved relaxing its licensing practices and offering funds from royalties for outreach and testing in exchange for ceasing public criticism. Consortium members rejected the terms.
Also in 2000, CD families and tissue donors sued MCH; they hoped to regain control of the gene patent, testing costs, and availability. In its 2003 decision, the U.S. District Court for the Southern District of Florida dismissed several of the plaintiffs’ claims, including lack of informed consent, breach of fiduciary duty, fraudulent concealment of the patent, and misappropriation of trade secrets. The judge did not dismiss the families’ claim that MCH had been unjustly enriched by the tissue donations, concluding that it should be litigated. MCH and the litigants decided to settle the claim; however, the settlement is sealed. Under terms of the confidential settlement, the Canavan Foundation and the families agreed to renounce any further challenges to MCH’s ownership and licensing of the CD gene patent, MCH would continue to license and collect royalty fees for CD clinical testing, and license-free use of the gene in research would be allowed.
The Huntington’s Disease Story
Treatment of intellectual property for the gene associated with Huntington’s disease (HD) represents a counterpoint to the BRCA1/2 and CD examples. HD is an autosomal dominant disease causing involuntary movements of all parts of the body, cognitive decline, and psychiatric disturbance. It is inevitably fatal over a 10- to 20-year course. In 1979, the Hereditary Disease Foundation (HDF) organized a workshop at NIH to discuss using DNA markers to find the HD gene. HDF subsequently funded a grant to David Housman, a molecular biologist from the Massachusetts Institute of Technology, and his graduate student James Gusella to develop and use restriction fragment length polymorphisms (RFLPs) for genetic linkage analyses to locate the HD gene. Gusella continued this research at the Massachusetts General Hospital (MGH), where he began collaborating with Nancy Wexler, then at the National Institute of Neurological Disorders and Stroke. Wexler and colleagues traveled to a region in Venezuela that is home to the largest number of HD kindreds in the world, more than 18,000 people, in order to collect pedigree information and DNA for these genetic studies. The collaboration was successful; in 1983 Gusella, Housman, and colleagues discovered an RFLP marker tightly linked to the HD gene on the short arm of chromosome 4 (Gusella et al., 1983). This discovery marked the first time a disease gene of unknown chromosomal locale had been localized using anonymous DNA markers.
It would take another decade to clone the HD gene itself. To this end, scientists who discovered the HD gene marker, as well as investigators who had developed technologies for cloning genes from linked markers, in 1983 organized the Huntington’s Disease Collaborative Research Group (HDCRG), under the auspices of HDF. After 10 years of collaboration by more than 50 scientists, the HD gene was isolated in 1993 (MacDonald et al., 1993). The HD gene encodes a protein called “huntingtin,” the function of which remains unknown. The protein normally contains a stretch of 7 to 34 glutamines in a row. In HD, a DNA expansion causes an increase in repeats of CAG, the nucleotides encoding glutamine. When 40 or more CAGs occur in a row, HD almost invariably appears, usually around 30 to 40 years of age. Juvenile onset results with 60 CAG repeats or more.
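The repeat ranges described above can be illustrated with a minimal Python sketch. It is offered only as an illustration and not as a clinical or laboratory method: the function names `longest_cag_run` and `classify_repeat_count` are hypothetical, the input is assumed to be a plain DNA string, and the categories use only the ranges stated in the text (7 to 34 repeats as normal, 40 or more as HD, 60 or more as juvenile onset). Counts that fall outside those stated ranges are deliberately left unclassified.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the number of repeats in the longest uninterrupted CAG run.

    Hypothetical helper for illustration; it assumes a plain DNA string.
    """
    # Find every maximal stretch of back-to-back CAG triplets.
    runs = re.findall(r"(?:CAG)+", dna.upper())
    # Each repeat is three nucleotides long; an empty input gives 0.
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_count(repeats: int) -> str:
    """Map a repeat count onto the ranges stated in the text.

    The labels are illustrative; counts the text does not describe
    (for example, 35 to 39) are reported as unclassified.
    """
    if repeats >= 60:
        return "juvenile-onset range"
    if repeats >= 40:
        return "HD range"
    if 7 <= repeats <= 34:
        return "normal range"
    return "not classified in the text"


if __name__ == "__main__":
    # Made-up example sequence: 42 CAG repeats with short flanking bases.
    example = "TT" + "CAG" * 42 + "CCG"
    count = longest_cag_run(example)
    print(count, classify_repeat_count(count))
```

The sketch counts triplets directly in a sequence string purely to make the thresholds concrete; it does not model how diagnostic laboratories measure repeat length.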
When linkage was discovered in 1983, it was evident that presymptomatic and prenatal testing would be possible. These tests carry the potential to bring devastating news to HD individuals and families. Accordingly, many groups collaborated to develop detailed guidelines for HD genetic testing. In all instances, the groups included HD family members, interdisciplinary health care professionals, and research scientists. The development of counseling guidelines, including pre-, during-, and post-test counseling sessions, accompanied the development of meticulous laboratory protocols. The biotechnology company Integrated Genetics (IG), whose founders included Housman and Gusella, introduced the first HD diagnostic test in 1986. IG had been selected by MGH to carry out genetic testing, and its leadership had been included in the advisory group developing guidelines. (IG was acquired by Genzyme in 1989 and continues to do HD genetic diagnostic testing.)
Treatment of intellectual property related to the HD gene and the genetic test has been largely shaped by patient and family concerns. MGH has been granted three patents related to the HD gene. The first (U.S. Patent #4,666,828) was issued in 1987 and claimed diagnostic uses of DNA markers to detect the HD gene; Gusella was listed as the sole inventor. The others (U.S. Patents #5,686,288 and #5,693,797) were issued in 1997 and claimed huntingtin nucleic acid and protein sequences, respectively; Gusella was one of four inventors from his own laboratory. The patent application from which the 1997 patents issued was broadly written and described diagnostic and therapeutic uses of the HD gene; however, MGH did not pursue further patents claiming these methods. When Gusella and MGH filed the first application, their main concern was that the marker not be used inappropriately; they believed they might use the patent to control the testing process. Similarly, discussion of patenting the huntingtin gene usually focused on using licenses as a means to enforce testing and counseling protocols.
To date, MGH has not exerted its own patent rights or licensed the patents to others for financial gain.
The HD test is now available from more than 50 academic and commercial laboratories in the United States. It also is offered in Canada, Europe, Australia, New Zealand, Mexico, and other Latin American countries. In each region, there is no centralized licensing or pricing. There are also no complaints of the test being prohibitively expensive. (The HD genetic test costs from $200 to $500 in the United States, considerably less than the BRCA tests. The pricing of both may reflect real costs as much as business decisions. The CAG repeat test for HD involves particular primers, PCR, and accurate counting of size bands. The Myriad tests for BRCA1 and BRCA2 involve complete sequencing of two extremely large genes.)
The broad availability of the HD test has important ramifications for patients. First, it allows verification of test results; sending a blood sample to two different laboratories independently avoids error, which can have dire consequences. Second, the relatively low cost allows at-risk individuals to pay out-of-pocket and thus to maintain privacy with respect to insurers and employers. In the United States, many presymptomatic individuals who choose to be tested elect to pay for the test themselves. In fact, part of the genetic counseling protocol involves warning people that if they are found to have the abnormal form of the HD gene, their medical and life insurance may be jeopardized. Even individuals who are symptomatic may not want their insurance to cover the test, because a positive test indicates that other family members are at genetic risk, and those relatives then become uninsurable.
Genetic testing for HD can have profoundly catastrophic and irrevocable repercussions, as there is no treatment, prevention, or cure. Since the test became available in 1986, and where it is offered, fewer than 20 percent of eligible candidates worldwide have chosen to be tested. The psychological burden, combined with the prospect of losing privacy and insurance and of facing employment discrimination, has often weighed against testing. Patent-related issues, however, have not been shown to inhibit prospects for testing, in contrast to BRCA1/2.
Once the HD gene was cloned, academic and commercial laboratories interested in testing took it upon themselves to develop the proper test methodology to ensure quality control. They shared test samples representing normal and variably sized expanded alleles in order to ascertain that all the laboratories were using the same techniques and getting comparable results. This cross-checking among laboratories was also done after genetic linkage was discovered as a quality control check, but accurately counting CAG repeat numbers requires more diligence and skill. Testing quality control by sending around test samples has been done periodically ever since.
Responsibility for monitoring quality control was assumed by the academic and commercial laboratories themselves since, at least for HD and many other genetic disease tests, quality control for test performance itself currently is not monitored by any federal or state agency.
The Clinical Laboratory Improvement Amendments (CLIA) were passed by Congress in 1988 to establish “quality standards for all laboratory testing to ensure the accuracy, reliability and timeliness of patient tests.” The Centers for Medicare and Medicaid Services regulates all laboratory testing (except research testing) performed on humans in the United States through CLIA regulations, but CLIA’s mandate extends only to examining the physical conditions and quality control procedures of the laboratory and not to the actual performance of tests (although proficiency testing is required for many standard tests).
If the genetic test involves using a commercially available kit, FDA regulates the kit as a device and must approve the manufacturer’s claims regarding the composition and performance of the kit before it can be marketed. If the test is considered a “home brew,” no matter how complicated, to date neither FDA nor CLIA has opted to regulate its performance and accuracy.
In order for academic and commercial laboratories to be able to deliver an HD test result to an individual, the laboratory must be CLIA certified. As noted above, this does not mean that CLIA certifies the accuracy of specific test results, only that the laboratories in which the tests are conducted meet CLIA requirements. Many laboratories, most particularly academic laboratories that are not CLIA certified, conduct the HD test for research purposes only and these test results cannot, by law, be given to individuals, nor should they be entered into the medical record.
Gene Patents and Diagnostics
Currently, more than 1,000 genetic diseases can be diagnosed through available tests. Although some of the associated genes are free of patents, most are not. The cases of the BRCA, CD, and HD genes represent the range of situations that might apply to such genes. In a number of instances, as with HD patents, the patent holder does not enforce patent claims to prohibit testing. In some other instances, patent holders offer nonexclusive licenses for reasonable fees, making the patented tests generally available. (Following litigation, this is now the case with the CD patent.) Some gene patents, however, have been licensed exclusively, and licensees (or the original patent holders, such as Myriad) either enforce their patent rights to prevent others from performing genetic testing, or present prohibitively expensive terms under which others may perform genetic testing. Anecdotally, there are instances in which hospital laboratories performing tests on genes covered by patents have continued to offer the test until subjected to very strong pressure from the patent holder or exclusive licensee. In numerous other cases, however, hospitals have stopped offering these tests or have decided against developing genetic tests due to fear of litigation (Henry et al., 2002; Merz and Cho, 1998). (Also see further discussion in Chapter 3.)
The rapid accumulation of data and information resulting from the HGP and its many spin-off projects is beginning to show movement toward clinical applications in the fields of diagnostics, therapeutics, and personalized medicine. Since the 1970s, the commercial potential of this information has been the driving force behind the growth and development of the biotechnology industry and realignment in the pharmaceutical sector. Because much of the intellectual capital in this area has resided in academic research institutions, the relationships among universities, government, and the private sector also have changed. Although these relationships have been highly beneficial, they also have generated debates about the relative roles of government and industry in supporting and promoting science, particularly with regard to open access to information and the sharing of research resources. Also at issue is what can or should be patented and whether there is an obligation to the public health to ensure that clinical and research access to valuable discoveries is not unduly restricted.