Models of Large-Scale Science

To further elaborate on the concept of large-scale biomedical science as defined in this report, this chapter provides an overview of several examples of past and current large-scale projects or strategies in biology and other fields. It begins with a summary of the Human Genome Project (HGP), the largest and most visible large-scale science project in biology to date. Many examples are drawn from NCI, in part because NCI has a longer history and more extensive experience with directed, large-scale projects compared to other branches of NIH, and also because a major focus of this report is on cancer research. Several initiatives recently launched by other branches of NIH are described in detail, followed by examples of National Science Foundation (NSF) programs, industry consortia, public-private collaborations, and initiatives sponsored by private foundations. The chapter concludes with an example of a nonbiology model of large-scale science for contrast: that of the Defense Advanced Research Projects Agency (DARPA). The DARPA model is commonly cited as a potential strategy for undertaking large-scale, high-risk, and goal-oriented research, but this model has rarely been replicated in biology. A review of federally funded large-scale research projects in nonbiology fields such as high-energy physics is provided in the Appendix.

The common theme among the examples described in this chapter is that they are all formal programs launched by funding agencies, foundations, or industry. There is certainly no shortage of other ideas for potential large-scale biomedical research projects among scientists. Without an

initiative by a funder, however, individual scientists may find it very difficult to obtain the funding necessary to launch an expensive, long-term, large-scale project because of the nature of traditional funding mechanisms (see Chapter 4). Another common thread among these projects is their dependence on new or developing technologies. Technical innovations drive scientific discovery and determine what can be accomplished in the field. The pace and variety of new innovations have increased greatly in recent years, in turn increasing the feasibility of and opportunities for large-scale projects in biology (see Box 3-1). For example, the advent of DNA arrays and the development of software for analyzing the data they generate have made it feasible to study the entire transcriptional profiles of cells in health and disease or under various conditions. However, such projects are not only much larger in scale, but also much more expensive to undertake.
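To make the array example concrete, the sketch below shows, in schematic form, the kind of comparison such analysis software performs: computing expression ratios between healthy and diseased samples and flagging genes whose transcription changes substantially. The gene labels, signal values, and twofold threshold are hypothetical illustrations, not data or conventions drawn from any particular study.

```python
import math

# Hypothetical mean array signals per gene: (healthy samples, diseased samples).
expression = {
    "GENE_A": (120.0, 480.0),
    "GENE_B": (200.0, 210.0),
    "GENE_C": (300.0, 75.0),
}

def log2_fold_change(healthy: float, diseased: float) -> float:
    """Log2 ratio of diseased to healthy signal (positive means higher in disease)."""
    return math.log2(diseased / healthy)

for gene, (healthy, diseased) in expression.items():
    lfc = log2_fold_change(healthy, diseased)
    # Illustrative convention: flag genes whose expression changes more than twofold.
    status = "up" if lfc > 1 else "down" if lfc < -1 else "unchanged"
    print(f"{gene}: log2 fold change {lfc:+.2f} ({status})")
```

Real array analyses involve many thousands of genes, replicate samples, normalization, and statistical testing, but the underlying comparison is of this form.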

THE HUMAN GENOME PROJECT

Ever since the discoveries of genetic inheritance and the chemical structure of DNA, there has been interest in "unlocking the secrets of life" by deciphering the information encoded in the genome. Initially, scientists concentrated on small pieces of the puzzle because they lacked the ability to investigate genetic material efficiently on a large scale. As technological advances were made, however,1 some molecular biologists began to discuss the feasibility and potential value of mapping and sequencing the entire human genome (see Figure 3-1). The first editorial published in a major scientific journal advocating a large-scale approach to sequence the human genome brought the concept to the scientific mainstream, with an emphasis on cancer research (Dulbecco, 1986). Nobel laureate Renato Dulbecco suggested that a project to map the human genome was the best way to make progress in the "war on cancer," which had been launched by the Nixon Administration in 1971. Dulbecco compared the significance of such a project to that of the U.S. space program, arguing that a genomic approach would facilitate a greater understanding of the genetic changes that lead to cancer, which would be essential in eradicating the disease. But he also noted that research on other diseases would certainly benefit as well.

At about the same time, a number of influential scientists were publicly discussing and advocating the possibility of sequencing the entire human genome (reviewed by Sulston and Ferry, 2002; Davies, 2001; Cook-Deegan, 1994; Kevles and Hood, 1992). In May 1985, Robert Sinsheimer, chancellor of the University of California Santa Cruz and a well-known molecular biologist, brought together a group of leading American and European molecular biologists to discuss the technical prospects for a human genome project. At this symposium on DNA sequencing, one of the strongest advocates for a large-scale HGP was Walter Gilbert, a Nobel laureate from Harvard, who had developed one of the first methods for sequencing DNA.

The following year, in early March 1986, Charles DeLisi, director of the Office of Health and Environmental Research at the U.S. Department of Energy (DOE), held a workshop to discuss the idea of undertaking an HGP under DOE. Although DOE may not have appeared to be the logical choice of a federal agency to oversee such a project, it had a long-standing research program on the effects of radiation on mutation rates, and the Life Sciences Division at Los Alamos National Laboratory had already established Genbank, a major database for DNA sequences, in 1983.

1 These technical advances included recombinant DNA methods, DNA sequencing methods, techniques for genetic mapping, and computer analysis.

May 1985 -- Robert Sinsheimer, UCSC chancellor, hosts a meeting to discuss the technical prospects of the HGP.
March 1986 -- Editorial by Renato Dulbecco suggests that the HGP is the best way to make progress in the War on Cancer.
March 1986 -- Charles DeLisi holds a workshop to discuss the possibility of a DOE-sponsored HGP.
May 1986 -- A molecular biology meeting at Cold Spring Harbor includes a special session to discuss the possibility of the HGP.
February 1988 -- A report from the U.S. National Research Council endorses the HGP.
April 1988 -- The Congressional Office of Technology Assessment endorses the HGP.
September 1988 -- NIH establishes the Office of Human Genome Research, with James Watson as its head.
October 1989 -- The new NIH office becomes the National Center for Human Genome Research (NCHGR).
April 1990 -- NIH and DOE publish a 5-year mapping and sequencing plan, with a projected budget of $200 million/year.
1991 -- NIH funds ~175 genome projects, with an average grant size of ~$300,000/year.
July 1991 -- Craig Venter, then at NIH, reveals that NIH has applied for patents on expressed sequence tags (ESTs) identified by his laboratory.
April 1992 -- Watson resigns as head of NCHGR. Francis Collins appointed as his replacement in 1993.
June 1992 -- Venter leaves NIH to set up The Institute for Genomic Research (TIGR), a non-profit devoted to identifying human genes using EST methods.
October 1993 -- NIH and DOE publish a revised 5-year plan, with full completion expected in 2005.
October 1993 -- The Wellcome Trust and the U.K. Medical Research Council open the Sanger Center to sequence the human genome and model organisms.
September 1994 -- French and American researchers publish a complete genetic linkage map of the human genome, one year ahead of schedule.
December 1995 -- Another group of American and French scientists publishes a physical map of the human genome containing 15,000 marker sequences.
February 1996 -- International HGP partners agree to release sequence data into public databases within 24 hours.
January 1997 -- NCHGR renamed as National Human Genome Research Institute (NHGRI).
October 1997 -- Only 3 percent of the human genome is sequenced in finished form by the projected midway point of the 15-year HGP.
1998 -- ABI PRISM 3700 automated sequencing machines enter the laboratory market.
May 1998 -- Craig Venter announces formation of a company, later named Celera, to sequence the human genome in 3 years, using the whole genome shotgun approach.
May 1998 -- The Wellcome Trust announces that it will double its support for the HGP.
May 1998 -- Collins redirects the bulk of available NHGRI funds to three sequencing centers.
October 1998 -- NIH and DOE publish new goals for 1998-2003, expecting a working draft of the genome by 2003, and a full sequence by 2005.
March 1999 -- NIH moves the expected date for release of a working draft ahead to spring of 2000.
March 2000 -- Celera and academic collaborators release a draft sequence of the fruit fly genome, obtained using the whole-genome shotgun method.
March 2000 -- Possibility for collaboration between Celera and the public HGP wanes. Disagreement over data access is a major obstacle.
June 2000 -- HGP and Celera jointly announce a working draft of the human genome sequence.

FIGURE 3-1 A timeline of the human genome project. SOURCE: Adapted from Macilwain (2000:983).

DOE was also accustomed to big-science projects involving sophisticated technologies. It tended to oversee big, bureaucratic, goal-oriented projects, in contrast to the smaller, hypothesis-driven research that was the standard at NIH. DeLisi, formerly chief of mathematical biology at NIH, had been exploring the feasibility of such a project, and in 1986 he proposed a plan for a 5-year DOE HGP that would comprise physical mapping, development of automated high-speed sequencing, and research into computer analysis of sequence data.

Soon after, in May 1986, a meeting on molecular biology hosted by James Watson at Cold Spring Harbor included a special session dedicated to discussing the possibility of an HGP. During this session, Walter Gilbert estimated the cost of sequencing the human genome at $3 billion (approximately $1 per base). Many scientists opposed the endeavor on the basis of cost, as they assumed it would take funding away from other projects. The project was also viewed by many as a forced transition away from hypothesis-driven science to a directed, hierarchical mode of big science. Many argued that sequencing efforts should focus on the genes rather than the entire genome, which included large areas of repetitive DNA of unknown function. Searching for and characterizing genes hypothesized to be associated with human diseases was thought by opponents of the project to be a more scientifically valid approach than "blindly sequencing the genome." However, advocates for the project argued that a large-scale HGP would be a less risky undertaking than big-science programs in space or physics. A failed space mission or particle accelerator would be extremely expensive and would be unlikely to yield partial benefits. In contrast, accomplishing even some of the goals of the HGP (e.g., an incomplete map or a partial sequence) would likely be very beneficial. Others suggested, however, that such a project would not advance medical science, because knowing the sequence of a gene does not necessarily foster progress in developing new treatments. For example, the single base-change mutation responsible for sickle cell anemia has been known for more than 20 years, but no therapies based on this knowledge have yet been developed. Many biologists also viewed DOE's efforts as a means of expanding its influence and involvement in biological research, as there were questions at the time about the future of the National Laboratories, given the volatility of national defense and energy policy since the 1970s (Cook-Deegan, 1994). They argued that a federally funded large-scale HGP, if undertaken at all, should be carried out through NIH.

One incentive for undertaking a federally funded HGP was to maintain a U.S. lead in biotechnology. In the late 1980s, genome efforts were gaining momentum in several other countries as well (reviewed by

Davies, 2001; Cook-Deegan, 1994; Kevles and Hood, 1992; Sulston and Ferry, 2002). In 1988, the European Community proposed the launch of a European Human Genome Project. A modified proposal was adopted in 1989, authorizing a 3-year commitment of 15 million euros, 7 percent of which would be devoted to ethical issues. Meanwhile, human genome programs at the national level were also prospering in Europe. For example, in 1989 the British government committed itself to a formal human genome program, funded at 11 million pounds per year for the first 3 years. In France, the Centre d'Etude du Polymorphisme Humain (CEPH), a key player in developing the genetic linkage map of the human genome, was founded by Nobelist Jean Dausset with funds from a scientific award and gifts from a private French donor. Through additional support from the Howard Hughes Medical Institute (HHMI), CEPH made clones of its DNA available to dozens of researchers in Europe, North America, and Africa. Japan, which had thus far been involved only marginally in biotechnology research, was also pushing hard to develop new automated sequencing technologies, with the objective of a major sequencing initiative. In 1988, the international Human Genome Organization (HUGO) was formed, primarily with funding from HHMI and the Imperial Cancer Research Fund in Great Britain (Kevles and Hood, 1992). Its goals were to help coordinate human genome research internationally; to foster exchanges of data, materials, and technologies; and to encourage genomic studies of organisms other than human beings, such as mice.

Because of the controversies surrounding the proposed U.S. HGP, the National Research Council (NRC) was commissioned to undertake a study to determine a strategy for the project. The NRC study, chaired by Bruce Alberts, generated a report (NRC, 1988) advocating an international program led by the United States and containing the following recommendations:

• Postponing large-scale sequencing until the necessary technology could be improved, thereby reducing the cost per base (estimated to be about a 5-year delay)
• Making technology development for sequencing a high priority
• Focusing first on mapping the human genome
• Characterizing the genomes of model organisms (e.g., mouse, fruit fly, yeast, bacteria)
• Providing $200 million in funding per year for up to 15 years

The report did not make a recommendation as to whether NIH or DOE should oversee the project. In 1988, however, NIH and DOE reached an agreement on their working relationship for the next 5 years: NIH would primarily map the chromosomes, while DOE would develop technologies and informatics, with collaboration occurring between the two

agencies in overlapping areas. In 1988, DeLisi submitted a budget from DOE of $12 million.

In the same year, NIH Director James Wyngaarden offered James Watson, Nobel laureate and codiscoverer of the helical structure of DNA, the position of associate director of human genome research. Watson built political support for the project, and made a commitment to devote about 5 percent of its budget to the study of the project's ethical, legal, and social implications.2 In October 1989, the unit became the National Center for Human Genome Research, with a budget of $60 million for fiscal year 1990 (Davies, 2001).

The HGP actually entailed three related endeavors: genetic mapping, physical mapping, and sequencing. Genetic mapping is accomplished by determining the order and approximate location of genetic markers, such as genes and polymorphisms, on each chromosome. Physical mapping involves breaking each chromosome into small, ordered, overlapping fragments and placing these fragments into vectors that can easily be stored and replicated. For the sequencing phase, fragments of each chromosome are processed to determine the base pair code.3

The U.S. HGP was inaugurated as a formal federal program in 1991, receiving about $135 million. Seven NIH centers were involved: five focused on human gene mapping, one focused on mouse gene mapping, and one focused on yeast chromosome sequencing. These centers were supported on a competitive, peer-reviewed basis. In 1991, the largest center budget was $4 million, divided among several research groups. The genome installations at DOE's National Laboratories were focused on developing technologies for mapping, sequencing, and informatics. Four additional projects, funded jointly by NIH and DOE, were engaged in large-scale sequencing efforts and innovations. In addition, dozens of smaller, investigator-initiated gene mapping and sequencing projects aimed at single disease-associated genes were funded by NIH in laboratories across the country. For example, in 1991 NIH funded about 175 different genome projects, with an average grant size of $312,000 a year (about 1.5 times the average grant size for basic research, and about equal to the average AIDS research grant). Thus, the HGP initially was characterized more by loose coordination, local freedom, and programmatic and institutional pluralism than by strong central management or external hierarchy (Kevles and Hood, 1992).

2 This commitment of NIH funds to ethical debate was unprecedented, as was making bioethics an integral part of an NIH biological research program.
3 The original plan called for carefully orchestrated sequencing of the fragments derived from physical mapping; more recently, however, a "shotgun" method has been used to sequence random fragments from a chromosome, followed by application of computer algorithms to determine the order of the sequence fragments.
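The shotgun strategy mentioned in footnote 3 can be illustrated with a small, purely hypothetical sketch: random overlapping reads are repeatedly merged by their longest suffix-prefix overlaps until a single assembled sequence remains. Production assemblers use far more sophisticated algorithms and must handle sequencing errors and repetitive DNA; the toy reads and greedy merging rule below are illustrative assumptions only.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a matching a prefix of b (at least min_len)."""
    start = 0
    while True:
        start = a.find(b[:min_len], start)  # candidate start of an overlap
        if start == -1:
            return 0
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1

def greedy_assemble(fragments: list[str]) -> str:
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_len, best_i, best_j = 0, 0, 1
        for i in range(len(frags)):
            for j in range(len(frags)):
                if i != j:
                    olen = overlap(frags[i], frags[j])
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags[0]

# Toy "reads" sampled from a short made-up sequence ATTAGACCTGCCGGAATAC.
reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
print(greedy_assemble(reads))  # reconstructs ATTAGACCTGCCGGAATAC
```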

Criticism of the program continued, however, especially with regard to funding priorities at NIH. During the late 1980s, the proportion of grants funded by NIH fell from 40 percent to less than 25 percent (Davis, 1990). For example, the National Institute of General Medical Sciences (NIGMS) awarded more than 900 new and competing renewal grants for projects unrelated to the genome in 1988; in 1990, it awarded only 550, a 43 percent decrease. Across NIH, the total number of grants had fallen from 6,000 to 4,600 a year (fewer than the number funded in 1981). This drop caused great consternation among biomedical scientists, and many assumed that it was due directly to the transfer of funds to the HGP, though close examination of concurrent changes in NIH funding patterns suggests that this was not the case. In the mid-1980s, the average grant period was extended from 3.3 to 4.3 years to provide greater stability for funded projects and reduce the frequency of grant applications; the average amount of funding per grant also increased significantly. But this in turn reduced the funds available for new awards or renewals. During the same period, the production of Ph.D. scientists in the field of biomedicine greatly increased, so more people were competing for grant money. Supporters of the HGP argued that the project was bringing appropriations to biomedical research that simply would not otherwise have been received. In any case, NIH expenditures on the project in 1991 accounted for only 1 percent of the agency's total budget of $8 billion (Kevles and Hood, 1992).

In addition, the project's deliberate emphasis on technological and methodological innovation was contrary to the tradition and preference of many in the biomedical research community. However, much progress in biomedical science has been fostered and accelerated by sophisticated tools and technologies, often those developed through work in other fields, such as the physical sciences (Varmus, 1999). Furthermore, unlike technologies in the field of high-energy physics, those in biology tend to become smaller, cheaper, and more widely obtainable and dispersed as they improve. Thus technology development in biology is more likely to benefit a large number of scientists in the long run, rather than making the field more exclusive.

The HGP faced a new challenge in 1992 when James Watson resigned. Earlier that year, a controversy had arisen regarding patent applications on gene fragments. J. Craig Venter, who was working at NIH at the time, had used a high-throughput technique for sequencing fragments of genes from cDNA libraries (known as expressed sequence tags, or ESTs). NIH applied for patents on hundreds of ESTs on Venter's behalf. The patents were eventually rejected by the Patent Office on the grounds that they did not meet the criteria of nonobviousness, novelty, and utility. Initial rejection of an

application is not unusual, and NIH had the option to appeal the decision, but in 1994 a decision was made to abandon the effort. These patent applications were widely criticized by the scientific community at large, and the issues surrounding DNA patents continue to be controversial.

Francis Collins was appointed in 1993 to be Watson's successor. Collins had been among the first to identify a human disease gene (for cystic fibrosis) through positional cloning, a technique that relies on genetic and physical mapping. By the time of his new appointment, he had also been involved in the discovery of several additional disease genes4 using similar methods.

The HGP soon faced new criticism. By 1997, the midpoint of the 15-year project, only 3 percent of the human genome had been sequenced in finished form, and there were many technical difficulties with the physical maps of the chromosomes (Rowen et al., 1997; Anderson, 1993). Although the first 6 years of the project had deliberately focused on smaller genomes and on the development of techniques that would allow for a more efficient and cost-effective approach to large-scale sequencing of the human genome, sequencing technologies had not yet been sufficiently improved to either dramatically speed the sequencing process or reduce the cost (Pennisi, 1998). As a result, there was concern about whether the project could be completed within the projected timeframe or budget.

In 1998, the technology of DNA sequencing took a major step forward when the Applied Biosystems Incorporated (ABI) PRISM 3700 entered the laboratory market (Davies, 2001; Wade, 2001). While not the first automated sequencer, the ABI PRISM was still an evolutionary advance over existing commercial automation because it provided increased capacity and throughput. It incorporated two major modifications to the original Sanger sequencing method: it used fluorescent dyes instead of radioactivity to label the DNA fragments, so that a laser detector and computer could identify and record each letter in the sequence as the DNA fragments were eluted; and it separated DNA fragments in ultrathin capillary tubes filled with a polymer solution, rather than the traditional polyacrylamide slab gels. These improvements were the inspiration of Michael Hunkapiller, and the machines were produced by ABI, originally an independent company that had been purchased by the scientific instrument maker Perkin-Elmer (PE) and now a subsidiary of Applera. As a result of these technological advances, DNA samples could be separated much more quickly, and several samples could be processed each day using very small volumes of reagents. The new machines required only about 15 minutes of human intervention every 24 hours, compared with 8 hours for the traditional machine.

4 The genes for neurofibromatosis 1 and Huntington's disease.

These changes cut sequencing time by 60 percent, reduced labor costs by 90 percent, and produced sequence about eight times faster (about 1 million bases a day) than traditional sequencing methods (Davies, 2001).

The new sequencing machines were used early on by Craig Venter, who had left NIH in 1992 to found The Institute for Genomic Research (TIGR), a nonprofit organization devoted initially to identifying expressed human genes using EST methods. The organization had since branched out into other areas of genomic research, such as sequencing the genomes of bacteria. It was also a major player in the federally funded HGP. TIGR was the first center to use and verify the effectiveness of the "shotgun" method for sequencing the relatively small, simple genomes of microbes. The advent of the new sequencing machines led Hunkapiller to consider the possibility of rapidly sequencing the entire human genome using a similar approach, and he brought the idea to Venter. In 1998, Venter left TIGR to found Celera, initially an independent subsidiary of PE Corporation and now a subsidiary of Applera (the same company that produced the ABI PRISM 3700 sequencing machines), with the goal of doing just that.

The feasibility of such a project was widely questioned in the scientific community. The PRISM sequencers were still largely untested, and the shotgun method had never been used on anything other than bacterial genomes. Many predicted that the final product would likely have many more gaps and errors than would result from the methodical approach of the public project because of the size, repetitiveness, and complexity of mammalian genomes as compared with microbial genomes. Venter and colleagues (1998) argued that these challenges could be overcome, and Celera launched a test project to sequence the genome of the fruit fly Drosophila, a complex eukaryote whose genome was about one-twentieth the size of the human genome. It took Celera 4 months to prepare a rough sequence draft of the Drosophila genome, suggesting that the human genome could be deciphered in this way as well (Loafer, 2000; Pennisi, 2000a).

To accomplish the goal of producing a complete rough draft of the human genome sequence by 2001 (4 years ahead of the public project's timetable), Celera purchased about 300 PRISM 3700 sequencers and a supercomputer for sequence analysis. The company also recruited a large number of people who specialized in developing algorithms and software for sifting through and organizing the huge amounts of data to be generated. Most notable was Gene Myers, who had already been working on shotgun assembly algorithms at the University of Arizona. Venter estimated that the total cost to sequence the human genome would be about $200-500 million.

By this time, $1.9 billion had already been invested in the publicly funded HGP, but questions were raised as to whether Celera's efforts would now make continuation of the public project redundant and unnecessary. On the other hand, supporters of the public project believed the new challenge from Celera was ample reason to accelerate the public effort. Some of the concern stemmed from the potential commercial exploitation of genomic data, although the company had announced that it would seek patents on only 100-300 genes. The Celera business plan entailed selling access to sequence analysis, such as information on gene identification, DNA variants, medical relevance, and comparisons with other species. Celera still planned to release raw sequence data free of charge, but only every 3 months, as opposed to every 24 hours as in the public project (Davies, 2001).

Shortly after the launch of Celera, the Wellcome Trust doubled support for the Sanger Center, Great Britain's main sequencing center in the public effort. Francis Collins also suggested producing a public rough draft of the sequence first, by 2001, to coincide with Celera's target date. The public consortium would then release a finished, "gold-standard" version by the original deadline in 2004, a goal that Celera had never established. To meet this new deadline, Collins redirected the bulk of available NIH funds to just three centers, announcing that these three centers would receive $80 million over 5 years. At about the same time, the Wellcome Trust announced that it would provide another $7 million to the Sanger Center. Thus the lion's share of the draft sequence would be produced by five major genome centers: Sanger, three centers funded by NIH (Whitehead Institute, Washington University, and Baylor College of Medicine), and DOE's Joint Genome Institute. To meet the new goal, hundreds of PRISM sequencers (or similar machines) were purchased by the publicly funded centers (Davies, 2001).

The competition and animosity between the public and private efforts to sequence the genome escalated (reviewed by Davies, 2001; Wade, 2001), but as the self-imposed deadline to finish the draft sequence approached, a compromise was brokered between the leaders of the two projects. On June 26, 2000, Craig Venter and Francis Collins came together for a White House press conference to formally announce completion of the draft sequence. The first publications on the draft sequences appeared about 7 months later in the journals Nature and Science (Lander et al., 2001; Venter et al., 2001). Science has been criticized for its decision to publish Celera's analysis because the company was allowed to post its data in its own database with some restrictions on its use, rather than depositing the sequence into a public database such as Genbank, as is usually required for publication. Leaders of the public project have also noted that Celera's analysis was dependent upon access to the public databases, suggesting that the company's shotgun method alone could

Recently, the SNP Consortium collaborated with the International Human Genome Sequencing Consortium40 to publish a paper in the journal Nature describing a map of 1.42 million validated SNPs distributed throughout the human genome (Sachidanandam et al., 2001). Using DNA from a diversified, representative panel of anonymous volunteers, the collaborators identified, on average, one SNP for every 1.9 kilobases of DNA. Such collaboration further demonstrates that public-private cooperation can be an efficient means of developing basic research tools.

In the case of SNP analysis, however, international cooperation was perhaps not as strong as it had been for the Human Genome Project. The SNP Consortium invited Japanese companies to participate in the project, but they declined the offer. Instead, 40 Japanese drug firms decided to provide a total of $10 million to university researchers in Japan to study SNPs in that country's population. They will establish their own database of SNPs, but these data will also be made freely available to other scientists (Sciencescope, 2000).

A new public-private consortium was recently established to further build on the work of both the SNP Consortium and the HGP. The $100 million HapMap project, with funds from six countries41 and several pharmaceutical companies, aims to map about 300,000 haplotypes from four populations in Africa, Asia, and the United States within 3 years (Couzin, 2002b; Adam, 2001b). Haplotypes are sets of genetic markers that are close enough together on a particular chromosome to be inherited together. Using SNPs alone to identify disease-associated genes can be difficult and expensive, partly because it is difficult to trace individual SNPs in a genome containing 3 billion base pairs. Haplotype analysis will reduce background noise and should make the search for genes easier and faster because the many individual markers are consolidated into more manageable clusters.

Scientists realized only recently that a haplotype map might be feasible when they discovered that relatively large blocks of DNA are inherited in this way.

40 This collaborative effort was funded by the National Human Genome Research Institute and the SNP Consortium. Three academic genome research centers (the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts; Washington University School of Medicine in St. Louis; and the Sanger Centre in Hinxton, United Kingdom) participated directly in this collaboration. The International Human Genome Sequencing Consortium includes scientists at 16 institutions in France, Germany, Japan, China, Great Britain, and the United States, with funding from government agencies and public charities in several countries.
41 Funders include NIH in the United States ($40 million) and the Wellcome Trust in the United Kingdom ($25 million).

Computer simulations had predicted that DNA haplotype blocks would be only about 10,000 bases or fewer. To their surprise, genome researchers have found that haplotype blocks tend to be much larger (up to 100,000 base pairs), and that many such blocks come in just a few different versions. For example, within some sequence stretches of 50,000 bases, only four or five patterns of SNPs, or haplotypes, might account for 80-90 percent of the population. It is not clear why this occurs, but some chromosome regions may be less likely than others to recombine during meiosis, leading to conservation of the DNA blocks (Helmuth, 2001).

Haplotypes are found by analyzing genotype data, so the new collaboration will essentially be a high-throughput genotyping effort. The work will be done by several biotechnology companies and public laboratories, including the Sanger Center and the Whitehead Institute, but decisions are still pending on such issues as how data collection will be standardized, how the map will be structured, and how the work will be divided. It is hoped that the new map will provide an invaluable tool to simplify the search for associations between DNA variations and complex diseases such as cancer, diabetes, and mental illness. However, many scientists, especially population geneticists, have questioned the value of generating a haplotype map at this time, arguing that there is too little information on the usefulness of such a map or how best to proceed (Couzin, 2002a).

There is also great interest in developing more efficient, cost-effective technologies for high-throughput analysis of SNPs (Chicurel, 2001). Without such improvements, screening large populations to search for disease- or therapy-associated genes could still be impractical. A number of investigators are attempting to improve on the current technology, but to date no coordinated effort has been made.

HUMAN PROTEOME ORGANIZATION

The Human Proteome Organization (HUPO) is an international alliance of industry, academic, and government scientists aimed at determining the structure and function of all proteins made by the human body (Kaiser, 2002; Abbott, 2001). The mission42 of HUPO is threefold: to consolidate national and regional proteome organizations; to engage in scientific and educational activities that encourage the spread of proteomics technologies, as well as the free dissemination of knowledge pertaining to the human proteome and that of model organisms; and to assist in the coordination of public proteome initiatives. The organization's formation was spurred by concerns that in the absence of such a coordinated effort,

42 http://www.hupo.org/.

individual companies would generate their own basic proteomics data and protect them through trade secrecy. The organizers hope to include more countries than participated in the HGP, and plan to generate funding contributions from companies, with matching government funds.

HUPO participants have proposed five initial research and technology development projects to garner interest from potential funders (see Box 3-2). Several companies have already offered financial support, and a number of countries are launching initiatives related to HUPO's goals. The NIGMS Alliance for Cellular Signaling is one such initiative, but a broader role for NIH in a global proteomics project remains unclear. Some U.S. proteomics experts have proposed establishing a few pilot large-scale centers to identify proteins en masse, with uniform standards, from healthy and diseased tissues and blood serum (Kaiser, 2002). But many others question the sensitivity and specificity of current mass spectrometers, suggesting that such an undertaking would be premature, and that it would be more useful to fund individual investigators to study small parts of large, complex protein networks (Check, 2002).

HOWARD HUGHES MEDICAL INSTITUTE

The Howard Hughes Medical Institute (HHMI) provides an example of an alternative strategy that could be used to undertake large-scale research projects. HHMI is a nonprofit medical research organization that employs more than 300 biomedical scientists across the United States at more than 70 universities, medical centers, and other research organizations. It also maintains a grants program aimed at enhancing science education at all levels. One of the world's largest philanthropies, HHMI had an endowment in mid-2000 of approximately $13 billion, and $600 million

was disbursed for medical research ($466 million), science education, and related activities. Created by Hughes in 1953, the Institute has always been committed to basic research, with the charge of probing "the genesis of life itself."43 The organization's charter states that "the primary purpose and objective of the Howard Hughes Medical Institute shall be the promotion of human knowledge within the field of the basic sciences (principally the field of medical research and medical education) and the effective application thereof for the benefit of mankind." The Institute draws a clear distinction between itself and other foundations that provide money for biomedical research in that it operates as an organization with investigators across the country. Hughes investigators are employed by the Institute but conduct their research in the laboratories of their host institutions.

The Institute's work has traditionally focused on five main areas of research: cell biology, genetics, immunology, neuroscience, and structural biology. More recently, clinical science programs have been added, as well as a new focus on bioinformatics. Investigators are free to pursue their own research interests without the burden of writing detailed proposals for each project, but their research progress is reviewed by HHMI every 5 years. Scientists who are not renewed as HHMI investigators are provided with additional phase-out funds for 2-3 years so they will have an opportunity to seek other funds or gradually scale back their activities. This approach also eases the strain on affected staff and trainees in the lab, who need time to seek other positions.

In what was perhaps the Institute's first foray into large-scale science (as defined in this report), HHMI held an Informational Forum on the Human Genome at NIH in 1986. Subsequently, HHMI played a role in the HGP by supporting several databases, including one at Yale University; one at the Centre d'Etude du Polymorphisme Humain in Paris; and one at the Jackson Laboratory in Bar Harbor, Maine (Cook-Deegan, 1994).

Recently, HHMI announced a novel research endeavor for the organization. This new 10-year, $500 million project44 may be viewed as another form of large-scale science funded by a nonprofit organization. HHMI plans to build a permanent biomedical research center that will develop advanced technology for biomedical scientists and provide a collaborative setting for the development of new research tools. Slated to open in 2005, the new center will have an annual operating budget of about $50 million (Kaiser, 2001). Research topics have not yet been fully defined, but are likely to focus on such areas as bioinformatics, proteomics, and imaging tools (e.g., electron microscopy).

43 http://www.hhmi.org/.
44 See .

Investigators are likely to include computational scientists, chemists, physicists, engineers, and biomedical scientists with cross-disciplinary expertise.

The center will provide laboratories for up to 24 investigators (who will not have tenure), plus their research staffs, for a total of 200-300 people. In addition, laboratories and other facilities will be built for visiting researchers and core scientific support resources. Visiting scientists will be able to stay for as little as a few weeks or may take a sabbatical year. Organizers hope this format will allow for rapid shifts into new areas that show unusual scientific promise and for quick adaptation of new discoveries for use in biological research and health-related sciences.

For collaborative research at the new center, HHMI will request proposals from the scientific community at large, as well as from its own investigators. The Institute will seek out proposals focused on cutting-edge scientific and technological goals, and will give preference to projects that bring together diverse individuals and expertise from different environments. To be successful, proposals will have to demonstrate originality, creativity, and a high degree of scientific risk taking. One goal of these collaborations is to ensure that all HHMI investigators, regardless of their home institution's facilities, can obtain access to expensive, high-technology tools and the expertise needed to run them (Kaiser, 2001).

HHMI leaders have acknowledged that the kind of research they are proposing for the center is more typically undertaken by biotechnology companies. The Institute will encourage patenting of discoveries made at the center, which may foster the launch of new startup companies. However, the generation of royalty revenues or new private businesses is not a stated goal of the Institute (Kaiser, 2001). Because this project is still in the very early stages of planning, predicting its effectiveness or impact on the broader scientific community is impossible. Nonetheless, it provides a novel model for consideration.

SYNCHROTRON RESOURCES AT THE NATIONAL LABORATORIES

Two institutes from NIH, NIGMS and NCI, are providing $23 million over three years to support the design and construction of a user facility at Argonne National Laboratory's Advanced Photon Source (APS), the newest and most advanced synchrotron in the country. After two years of planning, NIGMS and NCI, which represent two-thirds of the life-science synchrotron user community, finalized an agreement early in 2002 to increase synchrotron resources by constructing three new beam lines at Argonne's APS that will be fully operational by 2005. The facility is operated by the University of Chicago, but beam time will be administered by NIH.

Half of the beam time will be allocated to peer-reviewed research. NIGMS and NCI grantees will have access to the beam through a peer-review process for research grants. Twenty-five percent of the beam time will be divided between NIGMS and NCI for special projects, and the remaining beam time will be reserved for staff use and maintenance. The NIGMS/NCI facility will be fine-tuned to focus on the aspects of X-rays most useful for biological studies. Demand for beam time is increasing because of such projects as the NIGMS PSI. NCI is particularly interested in how the synchrotron facilities will advance the study of cancer-related molecules, because an understanding of detailed protein structure will help cancer researchers develop targeted drug therapies. NIGMS and NCI anticipate that information about molecular structures will help scientists develop new medicines and diagnostic techniques. Once construction is complete, operating costs for the beam line are estimated to be $4 million a year, of which NCI has committed $1 million annually (Cancer Letter, 2001; Softcheck, 2002).

DEFENSE ADVANCED RESEARCH PROJECTS AGENCY

The Defense Advanced Research Projects Agency (DARPA) provides another alternative strategy for undertaking large-scale research projects. DARPA is the central research and development organization for the Department of Defense. It manages and directs selected basic and applied research and development projects for the department, with a focus on projects in which the risk and potential payoff are both very high, and in which success could provide dramatic advances for traditional military roles and missions.

The agency was created in 1958 by President Eisenhower following the Soviet Union's surprise launch of Sputnik (Malakoff, 1999). An investigation blamed delays in the U.S. military satellite program on bureaucratic infighting and an unwillingness to take risks. Intent on keeping the United States at the forefront of technological innovation, Eisenhower ordered Pentagon planners to create an agency that would be completely different from the conventional military research and development structure and, in fact, would serve as a deliberate counterpoint to traditional thinking and approaches. The new agency relied on a small group of experts to look beyond near-term military needs and to fund areas offering great potential to revolutionize military capabilities. Today, the emphasis is still on seeking out and pursuing novel ideas. A list of the agency's founding principles, which are still followed, is provided in Box 3-3.

Best known for its role in developing the Internet (Norberg and O'Neill, 1996), DARPA has funded work focused primarily on computer

and software development, engineering, materials science, microelectronics, and robotics. The agency has had only a limited and very recent interest in basic molecular biology, and most of its biology research relates to just one function: protecting personnel against biological weapons. However, some of this work could potentially have broader implications for biological research, such as novel approaches for DNA sequencing (Alper, 1999) or sophisticated biosensors. Funding for research on this topic began in 1997, with contracts totaling about $50 million going to biotechnology ventures and nonprofit organizations. Although a panel of expert advisors provided some input in launching this program, it is run essentially the same as all other DARPA programs, with hands-on oversight by carefully selected program managers (Marshall, 1997).

With an annual budget of $2 billion, DARPA's small group of about 125 program managers have extensive power to direct high-risk projects that would not normally fare well in peer review. A DARPA program manager will typically spend as much as $40 million on contracts to industry, academic, and government laboratories for one or more projects.

The contracts call for defined deliverables and allow less-promising work to be canceled easily. The agency aims to complete 20 percent of ongoing projects each year, and renewals are not made, although projects are occasionally reformulated for a subsequent attempt. The funded researchers often attend team meetings, file frequent reports, and work cooperatively with other contractors.

Program managers are selected on the basis of their technical expertise and their aspiration to leave their mark on a field. They stay for an average of 4 years and often return to their primary field of research when their term is over. In addition to their technical expertise, they must demonstrate bureaucratic skills, as they must lobby for their portion of the DARPA budget, and be able to move established research communities in a particular direction or create new collaborations in disparate fields. Program managers identify opportunities in science or technology that appear promising, and then make decisions about whom to fund in pursuing the ideas. They may make the latter decisions by probing the network of experts in a field to identify the most appropriate researchers, or by using written specifications to invite experts in the field to apply for funds. Program managers have only two layers of supervision: an office director and the DARPA director, who reports to the Secretary of Defense. These supervisors monitor the performance of the managers and hold them accountable for advancing their fields, but a major criterion for success is positive peer assessment of the manager's performance.

This arrangement is in stark contrast to the current model at NIH, in which peer review is used to select proposals from a competitive pool of grant applications, rather than to assess the performance of program managers. NIH grant management staff generally have a comparatively passive role in project selection. It can also be difficult to determine whether the selected grant portfolios are actually meeting the goals of NIH programs.

Ultimately, the strength of DARPA has been in pursuing innovative research directions to create new fields, or in solving specific technical problems by fostering the development of new technologies. The agency is not responsible for sustaining fields in the long run, as is NIH. Thus, adopting a DARPA model of funding for all NIH programs would be unworkable. However, the addition of some DARPA-like programs to the traditional NIH portfolio might add valuable research that would not otherwise be undertaken.

Indeed, some leaders at NIH, including former director Harold Varmus, have recently expressed interest in adopting some DARPA-like programs at NIH to spark innovation (Malakoff, 1999). Under the leadership of NCI director Richard Klausner, NCI has even launched a pilot program modeled in part after DARPA, as well as other agencies, such as NASA.

The Unconventional Innovations Program (discussed earlier) emulates the DARPA approach by assembling interdisciplinary research teams and pressing them to share information, with the goal of producing breakthroughs in cancer detection technologies. NCI's traditional peer review panels still play a major role in selecting projects, but agency managers are more involved in program oversight than is usual. The program seeks input from and collaboration with investigators who have not traditionally been engaged in biomedical research.

Despite these new developments and the past successes of DARPA, however, such programs do not come without difficult challenges and criticism. One of the greatest challenges to undertaking DARPA-like programs may be the difficulty of recruiting effective managers. The DARPA model works best when the manager is an intellectual peer of the scientists being funded. But for biomedical scientists, a 4-year absence from the laboratory and the resultant lack of published scientific papers during that period could very well be disastrous from a long-range career perspective. In addition, university-based scientists in particular often feel uncomfortable with aggressive supervision and team-dominated research, and biomedical scientists have opposed most initiatives that involve strong external control in the past. Furthermore, it is not uncommon for DARPA-funded projects to fail in meeting their intended goals. This is to be expected, given the high-risk nature of the work, but it may not be a popular approach in other fields. And even when its projects have been successful, DARPA has had difficulty in moving some findings into the military venue or the marketplace (Malakoff, 1999). All of these issues need to be weighed carefully in attempting to emulate the DARPA program in other fields of research.

SUMMARY

As is clear from the examples described in this chapter, the characteristics of large-scale biomedical research projects can vary greatly, even when such research is defined relatively narrowly. However, the examples presented here share many common themes, characteristics, and issues. For example, most are dependent on technology in the sense that they require the use of expensive technologies, the development of novel technologies, refinements to current technologies, or standardization of the way technologies are used and how the information generated is interpreted and analyzed.

Another common feature of the examples described here is a great need for planning, organizational structure, and oversight. The capacity of a large-scale project to efficiently and effectively produce data and other end products that are novel and valuable to the scientific community

can be determined by its design and the skill of the individuals who oversee the work. Many of the large-scale projects described here are also quite collaborative and interdisciplinary in nature. For example, the needs for data assessment and technology development mandate the collaboration of scientists who may not have been involved traditionally in biological research, such as engineers, physicists, and computer scientists. This new approach to biology creates additional challenges in communication across disciplines, and can also lead to difficult questions regarding training and career advancement. If interdisciplinary scientists do not fit well into the traditional models of academic science departments, it may be difficult to assess their contributions and compensate them fairly with promotions and tenure. These issues are also relevant to managers of large-scale projects, who are crucial to the success of the effort, but often do not find themselves on traditional academic career paths, and may be given relatively little credit for the accomplishments of the project. These topics are covered in more detail in Chapters 4 through 6.

One issue common to all large-scale biomedical research projects that generate research tools or databases of information is that of accessibility. Concerns are often raised regarding intellectual property rights, open communication among researchers, and public dissemination of data and information. Such concerns may be especially pertinent when for-profit entities are involved in the undertaking. Most projects to date have adopted a policy of making data publicly available, at least in raw form. Research tools and reagents generated through large-scale projects funded by NIH are also often made available to other scientists at cost, but doing so requires a considerable commitment of NIH resources and infrastructure support. Clearly such matters need to be thoroughly addressed before a large-scale project is launched. Chapter 7 examines these issues in greater detail.

The issue of peer review also appears to be extremely important for large-scale projects in biology. Many of the early attempts by NCI to undertake large-scale, directed projects resulted in harsh criticism because of a lack of peer review, which has been fairly standard for NIH funding. Traditionally, NIH decisions about which projects and investigators to fund have been made following peer review of project proposals in grant applications. But peer review could also take other forms, such as reviewing the progress and achievements of grant recipients to determine whether funding should continue or whether the project's goals or objectives should be altered. Peer review might also focus on the performance of program managers who make decisions about which projects and people to fund, as is done under DARPA. Recently, NIH has developed some new large-scale programs that incorporate novel approaches to peer review, whereby steering and advisory committees whose members include

scientists not directly involved with the project assess progress and provide advice on future directions. It is still too early to determine how effective these mechanisms are, but thus far they appear to be acceptable to the scientific community. These topics are addressed in more detail in Chapter 4.