National Academies Press: OpenBook
Suggested Citation:"2 Assessment." National Research Council. 2008. Achievements of the National Plant Genome Initiative and New Horizons in Plant Biology. Washington, DC: The National Academies Press. doi: 10.17226/12054.


2 Assessment

WHAT IS PLANT BIOLOGY RESEARCH IN 2007?

The ultimate goal of plant biology research and of the National Plant Genome Initiative (NPGI) is to create the knowledge-based capability to breed or produce plants with specific performance characteristics (phenotypic traits). Most traits of economic interest are under strong to moderate genetic control and are variable across populations and environments both within a species and between species. Discovering the genetic processes that control trait expression requires deep experimental knowledge in a few model species, intersected with broad knowledge of how natural variation in crop species and their close relatives contributes to it. Of course, the assumption that the most closely related genes across species share function is not always true, but it is an excellent starting assumption that is typically testable.

Plant biologists aim to understand the “genetic wiring” of plants and of plant processes of basic, societal, or environmental interest. They aim to inform the breeding of plants with a defined genetic makeup, and to be able to predict with high certainty how these plants will perform in different environments and climate conditions. Examples of the traits that plant genome scientists would like to understand and control include disease resistance against a wide range of plant pathogens, nematodes, and insects, and tolerance to environmental stresses (for example, salt, toxic soil chemistries, drought, extreme temperatures, and soil nutrient utilization). Other important targets include modulation of plant growth and development (for example, useful alterations of plant size, shape, and chemistry and the ability to use less fertilizer) and improved control of flowering and of the amount and quality of fruits and seeds produced (see Chapter 1).

Achieving the goals of breeding or producing plants with specific performance characteristics requires extensive investment in data generation, data management, and analysis infrastructures, and human capacity-building to make effective use of the data. It also requires a daunting level of intellectual growth in biologists’ perception of how genetic networks control physiological traits, how natural genetic variability in important traits within and across plant species is manifested, how environmental signals are transduced into adaptive responses, and how evolutionary processes lead to network diversification, optimization, and creation of trait novelties.

SCIENTIFIC AND SOCIETAL IMPACTS OF NPGI

Impacts and Outcomes from NPGI-Funded Research

At the beginning of NPGI in 1998, there was little dedicated federal funding for plant genomics research beyond the then rapidly expanding Arabidopsis genome project and its associated research community, and various projects funded by ad hoc grants to principal investigators (PIs) from different research agencies. One exception was the U.S. Department of Agriculture’s (USDA) National Research Initiative, which awarded 86 grants in FY 1997 worth a total of about $11 million from its “Plant Genomics” grant panel. These ad hoc efforts were split among many plant species, which arguably inhibited deep strategic investment in plant biology as a whole and genomics-based crop improvement in particular. A fair assessment of NPGI, then, would address whether and how it has contributed to the building of strong and vibrant research communities linked by common interests.
If these research communities have indeed been built, have they invested in cutting-edge genomic technology, and have they performed well using those resources? The committee relied on three key documents that articulated the goals (NRC 2002; NSTC 1998, 2003) and on the advice, critiques, and summaries of discussions at a workshop featuring key academic and private sector plant genome scientists (see Appendix D for workshop agenda and speakers). The committee also used data collected from a questionnaire sent to all lead principal investigators and reviewed the yearly NPGI Progress Reports (NSTC 1998, 1999, 2000, 2001, 2003, 2004, 2005, 2006, 2007). The 5- and 10-year goals of NPGI were noted in Chapter 1 (see also NRC 2002; NSTC 1998, 2003). Some highlights of the research aimed toward those goals are emphasized in the following sections.

Capacity and Infrastructure Building

The committee views at least a significant part of the first nine years of NPGI as a capacity-building exercise, as also emphasized by the previous NRC report The National Plant Genome Initiative: Objectives for 2003–2008 (NRC 2002). The capacity-building exercise was not trivial for two important reasons. First, there are many plant species, each of which might provide unique biology of interest to society. Hence, the mission of “plant genomics” is much broader than the mission of “animal genomics,” which is nearly all driven by ultimate concerns for human health and, to a far lesser degree, by humans’ uses of domesticated animals. Second, traditional plant biology research on the large number of crop species took place in many institutions that, before NPGI began, had little exposure to either the mind frame or toolkit of genomics.

The committee addressed how NPGI has built human capacity and how it has contributed to the distribution of a broad technological platform serving a variety of institutions and plant species. NPGI has done very well by those metrics. First, the number of different PIs funded by NPGI grew about 13-fold over the first nine years (from 21 to 277; see Table E-1 in Appendix E). As is perhaps expected, many of these PIs had more than one grant funded in that period. In sum, these numbers suggest that a critical mass of plant genomics PIs is being recruited for future efforts.

Second, the committee noted what seems at first glance to be a rather low proportion of investment ($14 million, or about 2 percent of the total) in the emerging, and often expensive, instruments required to compete effectively in genomics research (Table E-2 in Appendix E).
The low investment in genomics instruments is partly a result of NPGI projects taking advantage of “sequencing for hire.” Because sequencing for hire has become much cheaper over the nine-year course of the program, it results in cost savings compared to investing in large-scale sequencing equipment. Nevertheless, NPGI needs to ensure that its projects have access to the ever-changing technological landscape that characterizes high-throughput biology. Technology access facilitates previously impossible experimentation and in fact drives creation of new technologies. The rationale for further investment in technology access and technology creation in the framework of NPGI is discussed in detail in Chapter 3. Human capacity-building is addressed in the Education section below.

Genome Sequence, Structure, and Organization

NPGI has contributed to revolutionary breakthroughs in plant genome sequencing. The initial priority in plant genomics research is to have a high-quality finished genome sequence of the relevant organisms. The first such sequence for any species is referred to as the “reference” sequence (see below). NPGI initially invested in the international sequencing consortium that accelerated finishing of the Arabidopsis thaliana reference sequence (The Arabidopsis Genome Initiative 2000), and then helped to build an international consortium for sequencing the rice genome (see below). The publications describing these genomes are citation classics.

Because pathogens and pests cause great losses in crop yield, the sequencing of plant pathogen genomes was included within the broader NPGI. Sequenced pathogens initially included bacterial pathogens (three strains of Pseudomonas syringae, three Xanthomonads, and several Xylella strains) and the fungal causative agent of rice blast, Magnaporthe grisea. The NPGI subsequently supported the sequencing of additional fungal and oomycete genomes, such as Phytophthora (three species that cause late blight of potato, root and stem rot of soybean, and sudden oak death syndrome; http://www.oomycete.org/). Three Fusarium species, three strains of Verticillium wilt, several powdery mildew and rust fungi, and the necrotrophic fungus Botrytis cinerea (Broad Institute 2007) were sequenced as part of a focus on fungi by the National Human Genome Research Institute (NHGRI). Genome sequences from additional pathogens, like Hyaloperonospora parasitica (the oomycete that causes downy mildew of Arabidopsis), are nearly finished. This first wave of plant pathogen genome sequences begins to cover the most economically critical plant pathogens, and it opens the door for comparative studies both across different isolates of one species and between species in the search for common mechanisms of virulence.

In addition to using “sequencing for hire” in some projects, NPGI has recently benefited from an extremely successful interaction with the Department of Energy’s (DOE) Joint Genome Institute (JGI) to accelerate high-throughput plant and pathogen genome sequencing.
That in-kind support to NPGI relies on a stringent external peer review by JGI that prioritizes projects on the basis of a mix of criteria, which include relevance to the DOE mission, organization and activity of the research community centered around candidate species, and evolutionary criteria aimed at maximizing the phylogenetic breadth of sampling. It is the committee’s view that the successful interaction of IWG with JGI, as the key (in fact, the only) major plant genome sequencing center, is critical to the future overall success of NPGI.

Comparative genomics is central to modern genetic approaches. Perhaps the most profound lesson of the Human Genome Project is that comparative analysis between closely and distantly related genomes provides a rapid and cost-effective way to extract information that can accelerate applied biomedical research and development. After the sequencing and analysis of the mouse and rat genomes (both model systems of direct relevance to biomedical research), it became evident that more sampling of diverse mammals would accelerate the identification and characterization of functional elements in the human genome through comparative analysis. That rationale led the NHGRI to sequence the genomes of not only model organisms like the chicken and dog, but also the opossum, platypus, elephant, armadillo, and squirrel. Nearly 20 mammalian reference genomes are either complete or in progress, totaling perhaps 60 billion base pairs. This rich comparative sequence landscape can lead to profound understanding of genome organization.

The development of plant comparative genomics has an important additional strength relative to the parallel comparative study of species related to humans. Humans, in essence, are the sole focus of biomedical research, and this consideration drives the selection of relevant genomes to sequence. In contrast, there are dozens of societally important crops and wild plant species that are far more distantly related to one another than mammals are to each other. In addition, many species of plants have already undergone hundreds to thousands of years of domestication and agronomic improvement. They provide snapshots of how traits important to humans can be modified by selection. The multiple-species focus of agriculture, therefore, places a premium on research approaches that can leverage generically useful genomics information for application across plant taxa. The committee is confident that comparative genomics within and between plant families will accelerate the definition of gene function in parallel to the way comparative mammalian genomics has accelerated human genomics in the last five years. The evolutionary conservation across plant genomes strengthens inferences made by comparative genomics methods.
Hence, genome comparisons will have many useful cross-family applications between legume, rosaceous, solanaceous, and cereal crops, as well as between wood and fiber crops in the Salicaceae (poplar, willow), Myrtaceae (eucalypts), and the diverse families of conifers (pines and spruces). In particular, synergistic use of Arabidopsis and rice genome sequences can often allow definition of candidates for conserved function in other species for which, for example, expressed sequence tags (ESTs) from specific organs exist. There is also substantial and useful genome conservation that extends to the evolutionarily ancient gymnosperms, which include pine and spruce. Full-length cDNA clone sequences are even more useful to understand gene function and evolution; complete collections of full-length cDNA clones are important tools for subsequent functional experimentation.

Completed and Ongoing Land Plant Reference Genome Projects

Table 2-1 includes known ongoing plant genome sequencing projects, many of which support the mission of NPGI through in-kind support from JGI. The table includes only those projects that are expected to release sequences publicly in the near future. A reference genome might have gaps and errors but captures greater than 90 to 95 percent of protein-coding gene content in highly accurate sequence (less than 1 error in 10,000 nucleotides), typically (but not always) anchored to physical and genetic maps. In some cases, targeted gap closure generates higher quality “finished” sequence. Resequencing projects that are aimed at characterizing variation relative to a reference sequence within a particular species, or efforts at finishing a nearly complete genome, are not included in this table. Some groups are still seeking funds to complete an ongoing sequencing project. Genome sizes are estimates of haploid content, given in billions of base pairs (Gb).

TABLE 2-1 Reference Plant Genomes Sequenced and in Progress

No. | Species | Common Name | Size (Gb) | Strategy | Estimated or Actual Date of Completion(a) | Group
1 | Arabidopsis thaliana | Thale cress | 0.2 | BAC | 2000 | International consortium
2 | Oryza sativa (x2) | Rice (indica and japonica) | 0.4 | BAC | 2005 | International consortium
3 | Populus trichocarpa | Black cottonwood | 0.5 | WGS | 2005 | Joint Genome Institute
4 | Vitis vinifera | Grape | 0.5 | WGS | 2007 | Genoscope
5 | Physcomitrella patens | Club moss | | WGS | 2006 | Joint Genome Institute
6 | Medicago truncatula | Barrel medic | 0.5 | BAC(b) | 2007 | International consortium
7 | Sorghum bicolor | Sorghum | 0.7 | WGS | 2007 | Joint Genome Institute
8 | Carica papaya | Papaya | 0.4 | WGS | 2007 | University of Hawaii
9 | Ricinus communis | Castor bean | 0.4 | WGS | 2007 | The Institute for Genomic Research
10 | Zea mays (x2) | Maize | 2.3 | BAC(c) | 2008 | Washington University Genome Center
11 | Arabidopsis lyrata | Rockcress | 0.2 | WGS | 2007 | Joint Genome Institute
12 | Selaginella moellendorffii | Spike moss | 0.2 | WGS | 2008 | Joint Genome Institute
13 | Mimulus guttatus | Monkeyflower | 0.5 | WGS | 2008 | Joint Genome Institute
14 | Glycine max | Soybean | 1.1 | WGS | 2009 | Joint Genome Institute
15 | Brachypodium distachyon | Purple false brome | 0.4 | WGS | 2008 | Joint Genome Institute
16 | Prunus persica | Peach | 0.3 | WGS | 2008 | Joint Genome Institute
17 | Solanum lycopersicum | Tomato | 1.0 | BAC(b) | 2010? | International consortium
18 | Brassica rapa | Chinese cabbage | 0.5 | BAC | 2009? | International consortium
19 | Capsella rubella | Shepherds purse | 0.2 | WGS | 2008 | Joint Genome Institute
20 | Setaria italica | Foxtail millet | 0.5 | WGS | 2009 | Joint Genome Institute
21 | Aquilegia formosa | Western columbine | 0.4 | WGS | 2009 | Joint Genome Institute
22 | Eucalyptus grandis | Eucalyptus | 0.6 | WGS | 2009 | Joint Genome Institute
23 | Lotus japonicus | Trefoil | 0.5 | BAC | 2010? | Kazusa DNA Research Institute

NOTE: The strategies used could be map-based sequencing using bacterial artificial chromosomes (BAC) or whole-genome shotgun sequencing (WGS).
(a) Several timelines in Table 2-1 are estimated from project websites or personal communication, and are hence approximate.
(b) BAC indicates only euchromatic BACs will be sequenced.
(c) In addition to the BAC-by-BAC maize project, a second maize inbred line is being sequenced using a whole-genome shotgun method by the Joint Genome Institute.

As noted above, the ultimate success of plant genomics is enriched by the knowledge of what genes are expressed in various cell types and organs, under different stress conditions, and over developmental time. The sampling of ESTs can give rise to a measure of the gene number (termed Unigene or, as in Table 2-2, a TIGR contig), and hence the deduced number of proteins, in an organism. Additional methods can sample the expression of the genome in specific tissues and cell types over developmental and environmentally altered conditions (transcriptomics). NPGI has contributed significantly to the collection of ESTs from various species, as shown in Table 2-2, and to the deployment of various transcriptomic tools. Despite the large numbers of EST sequences and the equally compelling numbers of different cDNAs represented by these ESTs for many species, the extent and functional relevance of splicing of primary RNA transcripts and other elements and of alternate transcriptional starts and stops in plants are largely unknown. Whole-genome analysis with tiling arrays using the Arabidopsis or rice genome sequences has made careful analysis in those important areas possible. Another calculation of the number of putative unique transcripts (PUTs) for these and other species can be found at the Plant Genome Database (Plant Genome Database 2007).

Gene Function, Expression, and Regulatory Networks

Genome sequence is the raw material for biological discovery. However, it is only one of the first steps toward understanding gene function, even at the biochemical level.
In fact, plant scientists claim to have functional knowledge of only about 40 percent of the genes in Arabidopsis, and that estimate is based on gene ontology (GO) functional inference, which arguably overestimates functional knowledge. Hence, one important metric of plant genomics progress is whether the genomics tools have been generated with which to perform functional analysis both in high-throughput “data factories” and in hypothesis-driven studies of detailed gene function, usually in the laboratories of single investigators who specialize in functional networks of genes that act in a particular process or who study specific classes of genes.

TABLE 2-2 Public Land Plant ESTs and Assembled Unigenes of the National Center for Biotechnology Information (NCBI) or Contig Sequences of the Institute for Genomic Research (TIGR) That Are Deposited in GenBank up to August 2007

Species | Common Name | Total ESTs | Unigenes (NCBI) | TIGR Plant Transcript Assemblies

Eurosids II (Brassicas, citrus, cotton)
Arabidopsis thaliana | Thale cress | 1,276,692 | 29,918 | 27,983
Brassica napus | Oilseed rape | 567,177 | 26,285 | 16,608
Gossypium hirsutum | Upland cotton | 177,182 | 16,367 | 24,797
Citrus sinensis | Sweet orange | 94,738 | 9,667 | 11,061
Gossypium raimondii | New world cotton | 63,577 | 3,279 | 8,665
Citrus clementina | Clementine orange | 62,250 | 6,106 | 5,222
Gossypium arboreum | Tree cotton | 39,232 | NA | 4,591
Brassica rapa | Field mustard | 33,316 | NA | 4,409
Brassica oleracea var. alboglabra | Wild cabbage | 30,759 | NA | 6,761
Poncirus trifoliata | Japanese hardy orange | 28,737 | NA | 5,083
Brassica oleracea | Wild cabbage | 26,692 | NA | See var. alboglabra above
Brassica rapa subsp. pekinensis | Chinese cabbage | 20,073 | NA | 4,409

Eurosids I (legumes, rosaceous plants, euphorbs, willows)
Glycine max | Soybean | 392,321 | 24,018 | 36,399
Malus x domestica | Apple tree | 255,097 | 16,903 | 26,757
Medicago truncatula | Barrel medic | 236,819 | 16,211 | 20,414
Lotus japonicus | Trefoil | 150,631 | 13,640 | 14,461
Populus trichocarpa | Black cottonwood | 89,943 | 14,059 | 12,687
Populus tremula x Populus tremuloides | Hybrid aspen | 76,160 | 7,519 | 11,593
Prunus persica | Peach | 70,972 | 6,306 | 6,596
Ricinus communis | Castor bean | 53,402 | NA | 4,524
Populus trichocarpa x Populus deltoides | Hybrid poplar | 53,208 | NA | 7,803
Euphorbia esula | Leafy spurge | 47,543 | NA | 9,905
Arachis hypogaea | Peanut | 40,627 | NA | 1,491
Trifolium pratense | Red clover | 38,109 | NA | 4,347
Populus tremula | European aspen | 37,313 | NA | 5,961
Manihot esculenta | Cassava | 36,120 | NA | 5,189
Phaseolus vulgaris | Common bean | 22,847 | NA | 2,941
Bruguiera gymnorrhiza | Burma mangrove | 20,373 | NA | 2,031
Populus trichocarpa x Populus nigra | Hybrid poplar | 20,130 | NA | 3,531
Phaseolus coccineus | Scarlet runner bean | 20,120 | NA | 2,315

Asterids
Solanum lycopersicum | Tomato | 257,093 | 16,945 | 21,523
Solanum tuberosum | Potato | 227,289 | 19,539 | 26,280
Helianthus annuus | Common sunflower | 94,111 | 7,955 | 10,219
Nicotiana tabacum | Common tobacco | 88,579 | 8,436 | 10,693
Lactuca sativa | Garden lettuce | 80,781 | 7,839 | 11,215
Ipomoea nil | Japanese morning glory | 62,282 | NA | 11,216
Coffea canephora | Robusta coffee | 55,692 | NA | 6,732
Lactuca serriola | Prickly lettuce | 55,490 | NA | 7,125
Cichorium intybus | Chicory | 41,747 | NA | 6,501
Nicotiana benthamiana | Tobacco | 41,440 | NA | 4,836
Taraxacum officinale | Dandelion | 41,296 | NA | 5,993
Helianthus tuberosus | Jerusalem artichoke | 40,362 | NA | 5,845
Helianthus exilis | Serpentine sunflower | 33,961 | NA | 5,187
Capsicum annuum | Pepper | 31,090 | NA | 4,189
Lactuca saligna | Willowleaf lettuce | 30,696 | NA | 4,999
Helianthus paradoxus | Paradox sunflower | 30,517 | NA | 3,864
Cichorium endivia | Endive | 30,171 | NA | 4,098
Lactuca virosa | Wild lettuce | 30,068 | NA | 4,912
Lactuca perennis | Wild lettuce | 29,125 | NA | 4,485
Helianthus petiolaris | Prairie sunflower | 27,484 | NA | 3,994
Antirrhinum majus | Snapdragon | 25,310 | NA | 4,221
Ocimum basilicum | Sweet basil | 23,260 | NA | 3,343
Helianthus ciliaris | Texas blueweed | 21,590 | NA | 3,070

Other eudicots
Vitis vinifera | Grape | 320,538 | 22,278 | 21,627
Aquilegia formosa x Aquilegia pubescens | Western columbine | 85,039 | 7,555 | 12,160
Mesembryanthemum crystallinum | Common ice plant, “basal” core eudicot | 27,348 | NA | 2,897
Beta vulgaris | Beet, “basal” core eudicot | 26,745 | NA | 3,868

Monocots (includes grasses)
Oryza sativa | Rice | 1,211,418 | 40,259 | 49,870
Zea mays | Maize | 1,159,264 | 57,447 | 64,601
Triticum aestivum | Wheat | 1,050,926 | 34,505 | 62,121
Hordeum vulgare + subsp. vulgare | Barley | 437,713 | 21,418 | 30,171
Saccharum officinarum | Sugarcane | 246,301 | 15,586 | 26,894
Sorghum bicolor | Sorghum | 204,308 | 13,547 | 20,714
Festuca arundinacea | Fescue | 41,869 | NA | 6,297
Zingiber officinale | Ginger | 38,139 | NA | 7,850
Hordeum vulgare subsp. spontaneum | Barley | 24,161 | NA | See subsp. vulgare above
Sorghum propinquum | Sorghum | 20,881 | NA | 3,402
Brachypodium distachyon | Purple false brome | 20,449 | NA | 2,785
Allium cepa | Onion | 20,159 | NA | 3,578

Gymnosperms
Pinus taeda | Loblolly pine | 328,628 | 18,859 | 28,060
Picea sitchensis | Sitka spruce | 139,569 | 15,683 | 11,551
Picea glauca | White spruce | 132,623 | 17,810 | 16,102
Picea engelmannii x Picea glauca | Hybrid spruce | 28,170 | NA | 5,767
Pinus pinaster | Maritime pine | 27,288 | NA | 3,901

Other land plants
Physcomitrella patens subsp. patens | Moss | 174,908 | 13,688 | 18,707
Marchantia polymorpha | Liverwort | 33,692 | NA | 3,874

NOTE: All plants with more than 20,000 ESTs are shown, as listed in dbEST.

NPGI has supported a wide range of investigations into gene regulatory mechanisms in model and crop plants. By virtue of the number of plant species funded by NPGI grants, the diversity represented by the funded projects is high. However, they generally fall into one or more of the following categories: defining gene function, defining regulatory genes and networks, understanding patterns of gene expression, comparative analysis of gene expression, gene expression resources and databases, and epigenetics and RNA-based regulation. This information is captured both in the published record and in various databases (see Appendix F). A brief summary of the many highlights includes the following:

Defining gene function. Large-scale insertional mutagenesis and TILLING resources, first deployed in Arabidopsis but now available in a variety of crop species, have revealed key functional and phenotypic knowledge and provided vital resources for further work (see Table 2-3). The ability to define gene function via loss-of-function mutation remains the bedrock of genomics, and methods to overcome genetic redundancy and other impeding factors are being developed further. Those methods include the engineering of artificial micro-RNAs capable of silencing several members of a gene family simultaneously.

Defining regulatory genes and networks. Several projects focused on identification of novel regulatory genes and features through genome-wide approaches. For example, regulatory networks and factors that control host-microbe interactions and disease resistance, largely identified by large-scale forward genetics in Arabidopsis, are now being exploited in rice, tomato, and legumes, among others, using forward and reverse genetics methods enabled by genome sequences.

Understanding patterns of gene expression. Functional genomic technologies were developed and applied to analyze gene expression patterns in different cell types, tissues, and organs, and in plants under stress and undergoing developmental transitions. Microarray and other high-throughput profiling tools have been used to identify and characterize important genes for root architecture, leaf form, and tomato fruit development, to name a few examples. Many of these projects have yielded publicly available expression atlases and searchable resources.

Comparative analysis of gene expression. Comparative analysis of expression patterns could be a major outcome of functional genomics applied to a wide variety of plant species. Some success has been realized in NPGI-funded analysis of genes involved in flower development across an evolutionary spectrum of plants. Genes that are regulated by the circadian clock, and by photoperiodic regulatory modules, are being revealed through comparative profiling and analysis in Arabidopsis, poplar, and rice.

Gene expression resources and databases. Several databases and online resources emerged from NPGI-funded projects (Table 2-3). Those resources include the MPSS database of transcript and small RNA expression data, and the PlexDB database for expression data. Many of those resources are used regularly by PIs of NPGI projects. (See the list of websites that NPGI PIs reported as their five most-used websites for their work in Appendix F.)

Epigenetics and RNA-based regulation. The diversity and functions of small RNAs (20–25 nt) that affect both genic and intergenic sequences have been revealed using innovative high-throughput sequencing technology in a variety of dicot and monocot models and crops. This work will enable a more subtle understanding of gene regulation and the evolution of developmental regulatory processes. NPGI-funded projects have contributed to the rapidly expanding field of epigenetics, which deals with heritable changes and patterns that occur without changes in DNA sequence. Epigenetic inheritance properties are controlled by the structure of chromatin as expressed in changes to histones and DNA methylation, which are affected by polyploidy, hybridization, and the expression of small RNAs. Genome-wide surveys and functional analysis of genes affecting epigenetic inheritance have been done in maize, Arabidopsis, and a few other species.

Shortcomings in Gene Function, Expression, and Regulatory Network Analyses

Not all progress that was envisioned five years ago (NRC 2002) has been realized. Integration of data across plant species remains a considerable challenge, partly because of the heterogeneity of datasets, disparate data standards, lack of sufficient experimental tools, and small number of groups funded to do database and experimental integration work. Data integration across heterogeneous platforms and metadata types presents a difficult challenge to efficient extraction of knowledge and translation to crop species.

TABLE 2-3 Examples of websites that are direct results of NPGI and IWG collaborations.
Title | Website | Estimated Number of Unique Visitors in a Month (a) | Estimated Number of Page Impressions in a Month (a)
The Arabidopsis information resource | http://www.arabidopsis.org/ | 33,795 | 939,272
The Barley Coordinated Agricultural Project | http://barleycap.org/ | 253 | 1,454
Dendrome: A Forest Tree Genomics Database | http://dendrome.ucdavis.edu/ | 3,600 | 30,000
The Prunus Genome Database: A Model for Rosaceae | http://www.bioinfo.wsu.edu/gdr// | |
GrainGenes: A Database for Triticeae and Avena (b) | http://wheat.pw.usda.gov/GG2/ | 32,256 | 419,632
Gramene | http://www.gramene.org/ | 607,966 | 4,786,295
The Legume Information System | http://www.comparative-legumes.org/ | 4,628 | 66,997
Maize Genetics and Genomics Database | http://www.maizegdb.org/ | 30,358 | 179,671
The Maize Tilling Project | http://genome.purdue.edu/maizetilling/ | 452 | 2,196
The Floral Genome Project | http://www.floralgenome.org/ | |
The Cotton Genome Database | http://www.cottondb.org/ | 2,111 | 287,857
Comparative cDNA Sequencing in Radish (Raphanus) | http://radish.plantbiology.msu.edu/ | 65 | 175
The Populus Genome Portal | http://genome.jgi-psf.org/Poptr1/Poptr1.home.html | |
PlexDB: A community resource for plant and plant-pathogen microarrays | http://www.plexdb.org/ | |
Rice Coordinated Agricultural Project | http://www.uark.edu/ua/ricecap/ | 4,044 | 40,365
Rice MPSS | http://mpss.udel.edu/rice/ | |
Soybase and the Soybean Breeders’ Toolbox | http://soybase.org, http://soybeanbreederstoolbox.org/, http://soybeanphysicalmap.org/ | 2,104 | 24,413
Wheat Coordinated Agricultural Project | http://maswheat.ucdavis.edu/ | 566 | 1,555
Potato Functional Genomics | http://www.potatogenome.org | 4,777 | 62,775
(a) The estimated numbers of unique visitors and page impressions do not include robot requests. They were provided by managers or principal investigators of the websites. Estimates were not available for all websites because the statistics were not kept for some or because the web managers or principal investigator did not respond to the request.
(b) The GrainGenes database has a mirror website (http://grain.jouy.inra.fr/GG2/index.shtml) that also is actively used. That website has an estimated 25,975 unique visitors and 269,323 page impressions, excluding robot requests.
SOURCE: Examples of websites were obtained from the Interagency Working Group on Plant Genomes.

Genetic redundancy, where multiple genes serve overlapping functions, has limited the functional analysis of genes in some high-throughput phenotype screens. The extent of genetic redundancy was not recognized in the early years of NPGI, and future projects need to take redundancy into account during experimental design. Further, it is now known that epigenetic regulation is masking considerable phenotype expression. Revealing the full genetic potential of plants, therefore, will require future studies that unmask hidden phenotypes under epigenetic control.

Finally, even for the most advanced plant models, technical limitations have yet to be overcome in a number of areas. For example, the technology to consistently and predictably perform gene replacement is not generally available. It is also not feasible to do large-scale capture of protein interactomes using large sets of epitope-tagged proteins. For nearly all species, the important tool of genetic transformation, namely the ability to generate large numbers (thousands or more) of traditional transformants, is expensive and not well developed.

Domestication, Diversity, and Natural Genetic Variation

The 2002 NRC report noted that while sequencing costs were too high for deep sequencing across the plant kingdom, EST sampling, development of mapping tools, and the coalescence of focused, community-supported projects of evolutionary and ecological interest were worthy of support. NPGI-funded studies of natural variation and crop domestication have led the way in the understanding of selection and in the dissection of complex traits. Results from those studies have been influential in human genetics as well (see example below).
Further, as noted above, JGI has been the focal point for communities of scientists to develop genomics-based programs to understand unique and important concepts in evolutionary and ecological genomics using Mimulus (monkey flower) and Aquilegia (columbine).

Important scientific results from NPGI funding include:
• Initiation of “association mapping” (population-based mapping of traits inherited in a complex or multigenic manner) in plants. Association mapping facilitates identification of the actual genes that underlie quantitative trait loci (QTL), taking mapping down to the gene level. The examples funded by NPGI are the first to explicitly use structured association mapping in any species.
• Development of leading statistical genetic algorithms for association mapping, which outperform many of those developed for human biomedical research and have been adopted by the human genetics community.
• Provision of detailed examples of the genetic events that led to crop domestication; these include molecular details, the dynamics of natural selection, and the ultimate effects of those genetic changes on the domesticated trait.
• Identification of reservoirs of natural diversity and core germplasm in a variety of crops.
• Creation of key resources for mapping complex traits, including the maize nested association mapping panel, which when released at the end of 2007 will be the largest complex trait dissection system for any species.
• Acceleration of positional cloning of QTL for many species.

Informatics, Modeling, and the Virtual Plant

Clearly, locating the information generated by a program as large and diverse as NPGI would be an overwhelming, time-consuming task without interfaces that are easy enough to use routinely. As a result of NPGI support, many websites and databases were expanded or newly developed (listed in Table 2-3). An important positive note is that several of these receive very large numbers of visits. Further, a key organizing principle is that many of these sites are sharing software, something that is desirable but considered by many to be difficult or impossible. Software reuse has the dual benefits of reducing costs (the software is developed only once) and improving quality (because users and informatics support staff from all sites sharing the software can critique and debug it). The Generic Model Organism Database (GMOD 2007) project, with funding contributed by NPGI, is one of the reasons the software reuse is possible. The aim of GMOD is to collaboratively design and build a database architecture and constituent software components for creating and managing genome-scale biological databases. A number of the projects listed in Table 2-3 are active GMOD contributors or users. Some share a common database schema and associated utilities, and others are using the same interface application (for example, cMAP).

An additional collaborative success is the direct, active involvement of GMOD contributors in developing semantic community standards. Through their efforts, shared anatomical, developmental, phenotypic, and other controlled vocabularies and ontologies are being developed and maintained. It is only with clear and accepted standardization that semantically driven mechanisms for data mining, integration, assimilation, knowledge discovery, and analysis become possible, and many of the projects in Table 2-3 are building this foundation.

The available plant genomics databases have already led to concrete benefits in the plant research community, as summarized above. From an analytical perspective, the gains are clear: Collection of large data sets from multiple sources has led to metadata analyses, though comparisons of data that span technology

or plant physiological platforms have to be done with caution. The utility of the databases for formulating testable experimental hypotheses is most advanced in the plant systems for which the most data exist. Here, genome-wide analyses can have predictive value, and therefore can be used to prioritize targets for future experimentation. For example, examining the joint probability of observing particular DNA regulatory sequence motifs (representative of binding sites for a particular transcription factor family) together with analysis of their expression under certain environmental stresses could lead to new insights into co-regulatory nodes. Grass genomes are genetically well aligned, and traits that are distinctive to each species can be selected for further studies. Comparative studies exploiting the basal plant genomes now available provide the basis for phylogenetic models of gene and genome structure. To enable these and other studies, the data resource projects typically offer downloadable datasets such as physical maps, genetic maps, computed gene predictions, microsatellite sequences, the database content (as tab-separated-value files or as database dumps), protein or nucleotide sequences, and curated, versioned gene models. The community has adopted these resources, and their use has dramatically increased over the years, as evidenced by the increase in database visits at the most heavily used websites. Currently, the two limiting factors are the problems presented by disparate, heterogeneous datasets that make comparisons across databases difficult, and the lack of students and postdoctoral fellows trained in computational skills and statistics. Without a skilled workforce to use them, the value of database resources and large datasets cannot be fully realized.
Efforts are being made by the databases that serve large communities (for example, TAIR and Gramene; see section on Education below) to educate users on access to the resources available at the database website.

The availability of genome-based information has set the stage for improved modeling and understanding of biological processes from the cellular level, to the organismal level, and ultimately to the level of entire plant communities. At present, in any given organism, the function of roughly 30 to 50 percent of the genes can be inferred, but these inferences are only guides for subsequent experimental proof of function. The function of the remainder is still unknown, and gaining that knowledge is a challenge for upcoming years. However, plant biologists can place the genomic knowledge gained into an integrated systems framework. Metabolic pathway information is available for both Arabidopsis (TAIR 2007) and rice (Gramene 2007). These are both implementations of bioCyc, which is another example of a GMOD component.

At the organismal level, sophisticated image analysis techniques of gene expression are being used to understand the roles of relevant genes and the environmental factors that influence the developing meristem in Arabidopsis (see Scientific Inference Systems Laboratory 2007). This project is pioneering techniques

for constructing cellular models of coordinated patterns of gene expression and will enable simulation of developmental processes under different conditions.

Translational Genomics

The 2002 NRC report recommended an expansion of NPGI research into areas that enable the translation of findings from reference species to related crop plants, known as Translational Agriculture. That newly proposed area of research was intended to empower both public and private scientists to use the basic discoveries from other research areas of NPGI for crop improvement. The report mentioned the following as critical enabling technologies and goals: the development of genetic maps, physical maps, transcript maps, and germplasm collections with molecular genotypes; the discovery of chromosomal intervals that possess genes conditioning important agronomic, compositional, and pest resistance traits; the design of “breeder-friendly” DNA markers; and the analysis of fungal genomes and the sequencing of the gene-rich regions of major crop plants.

This committee found important progress in almost all those areas for some crops. For example, major advances in the small-grain crops of wheat, barley, and rice include:
• The nationally coordinated application of DNA markers in publicly funded wheat breeding programs for selection of quality and pest resistance traits (University of California Davis 2007).
• The positional cloning (relying on DNA markers, genetic maps, large-insert DNA libraries, and a physical map of the critical genomic region) of the genes underlying vernalization in wheat and barley (Fu et al. 2005; Yan et al. 2006; Yan et al. 2004; Yan et al. 2003).
• The mapping of Gpc-B1, a gene from a wild wheat accession that increased grain protein content in cultivated wheat (Distelfeld et al. 2006).
• The cloning from barley of a gene for resistance to stem rust, a major disease of both wheat and barley (Brueggeman et al. 2002).
• The provision of industry-wide training by rice breeders in the application of marker-assisted selection technologies (University of Arkansas 2007).

In addition, the committee noted that NPGI discoveries have led to a variety of productive interactions with the private sector, including a handful of new, start-up companies. These are mentioned in Appendix J in the PIs’ responses about “Collaborations with Industry.”

DNA marker technologies have become widely employed in the improvement programs of major U.S. commodity crops. NPGI has enabled the development of more efficient DNA markers for that purpose. The NPGI funding has

supported the development of BAC libraries for several crops and research to obtain BAC-end sequence to arrange overlapping BACs in their proper location within a contig. That sequence information has been used as a template for the discovery of genome-wide single-nucleotide polymorphism (SNP) markers. The identification of the SNP markers for major commodity crops, along with their adaptability to high-throughput platforms and reduced cost per data point, has led to their rapid introgression into crop improvement programs and provides a clear example of the leveraging and translational power of the NPGI resource development. NPGI also has had a major impact on how crops are bred in the private sector. During the last five years it has become common practice for commercial crop breeders to employ DNA marker-assisted selection, particularly in the development of improved corn inbreds and soybean cultivars.

NPGI has recognized the need for translational genomics applicable to trees. For example, poplar serves as a model organism for genomic studies of tree and wood development and of bioenergy feedstocks because of its modest genome size (about 480 Mb) and facile capacity for transformation (Brunner et al. 2004). The poplar genome was sequenced by JGI and an international consortium of collaborators (Tuskan et al. 2006). The highly outbred genetic structure of many forest trees makes linkage disequilibrium blocks extremely small (Brown et al. 2004), increasing the accuracy of association genetic studies (Neale and Savolainen 2004). In order to understand the nature of natural variation in trees and to develop useful markers, more than 8,000 genes have already been resequenced and SNPs discovered in a major NPGI-funded study of loblolly pine wood properties. Earlier studies identified a null allele of the cinnamyl alcohol dehydrogenase gene (cad-n1) as the largest-effect major gene known for volume growth in loblolly pine (Yu and Buckler 2006). NPGI-funded translational projects oriented toward developing tools for ecological studies and restoration of disease-damaged wild trees include an effort to save the American chestnut by breeding or engineering varieties resistant to the introduced chestnut blight pathogen. Other research funded by NPGI grants has led to genetic markers that are useful for DNA fingerprinting of clones to aid management during breeding and propagation and that have served as useful markers for ecological studies of gene flow and community ecology (for example, Whitham et al. 2006).

The NPGI Literature Footprint

NPGI and the independent National Science Foundation’s (NSF) Arabidopsis 2010 Project are the engines of basic plant genomics discovery that power the intellectual and practical advancement of plant biology in both the public and private sectors. Hence, the committee also monitored scientific output of NPGI-funded grants (as provided by IWG) using the traditional metric of peer-reviewed publications and their impact on international plant science. The data in Appendix B were gathered by surveying the 40 most cultivated crop species by area harvested (FAOSTAT 2007). The committee assessed whether the investment made by NPGI had increased the knowledge footprint for each species on the basis of these data. For each species, the Web of Science (a database of about 8,700 journals) was queried for articles with each crop’s species name or common name in the title. To narrow down the list of articles to those relevant to genomics, only the articles that include either genomics, genomic, sequencing, or sequence as a keyword are included. These data are also compared with data for plants that serve as broad models for all of basic plant biology (bottom of the tables in Appendix B).

The number of laboratories contributing to publications using the model plants that were the focus of NPGI has, in most cases, grown considerably, as has the number of laboratories using most models, including Arabidopsis. Furthermore, the U.S. contribution to the top 10 most-cited papers for many species has remained high or increased. Finally, the percentage of total publications that included U.S. institutions has risen or stayed constant for all of the most important crop species except, notably, soybean.

The committee also measured the publication impact of NPGI over the nine-year period (see Appendix G). The 165 PIs who responded to the committee’s questionnaire (from 277 total; shown in Appendix C) cited 1,478 peer-reviewed publications that are included in the 2006 ISI Journal Citation Impacts. The committee noted that 317 of these publications (or 21 percent of the total) appeared in highly cited journals across all life sciences journals (citation impact of 9 or higher). This striking finding suggests that NPGI is generating novel, important, and topical science.
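The literature search described above (a title match on the species or common name, followed by a keyword filter) can be illustrated with a minimal Python sketch. The `Article` record, the helper names, the sample data, and the substring-based title match are all assumptions made for illustration; the sketch does not reproduce the Web of Science query interface or the committee's actual procedure.

```python
from dataclasses import dataclass, field
from typing import List

# Keywords named in the text; compared case-insensitively.
GENOMICS_KEYWORDS = {"genomics", "genomic", "sequencing", "sequence"}


@dataclass
class Article:
    """Hypothetical bibliographic record (not a Web of Science schema)."""
    title: str
    keywords: List[str] = field(default_factory=list)


def matches_species(article: Article, names: List[str]) -> bool:
    """True if any species or common name appears in the title."""
    title = article.title.lower()
    return any(name.lower() in title for name in names)


def is_genomics(article: Article) -> bool:
    """True if at least one article keyword is in the genomics set."""
    return any(k.strip().lower() in GENOMICS_KEYWORDS for k in article.keywords)


def genomics_articles(articles: List[Article], names: List[str]) -> List[Article]:
    """Apply the title filter, then the keyword filter."""
    return [a for a in articles if matches_species(a, names) and is_genomics(a)]


# Invented sample records, for illustration only.
records = [
    Article("QTL mapping in maize inbred lines", ["genomics", "QTL"]),
    Article("Maize yield under drought", ["agronomy"]),
    Article("Sequence diversity in Zea mays landraces", ["Sequence", "diversity"]),
    Article("Rice blast resistance", ["genomic"]),
]

hits = genomics_articles(records, ["Zea mays", "maize"])
for article in hits:
    print(article.title)
```

With these invented records, `hits` holds the two maize articles that carry a genomics-related keyword. Plain substring matching is crude (a short name such as "rice" would also match "price"), so a real query would likely need word-boundary matching.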
A total of 659 publications (or 45 percent of the total) appeared in journals with citation impact of 6 or higher. The committee concludes from these data that many NPGI PIs are producing highly competitive publications that are published in the most outstanding scientific journals; this is a major accomplishment for NPGI.

Education

In the 2003–2008 NPGI plan (NSTC 2003), the roles of education, training, and outreach in realizing the full potential of NPGI for plant sciences within the United States were emphasized. Five goals were outlined:
• Traineeships for undergraduates, graduate students, and postdoctoral researchers in plant genomics research.
• Informatics training for both established and young investigators.
• Mid-career training programs in plant genomics for university and college faculty and plant science professionals.
• Workshops to inform the broader research community about accessing and using the NPGI research resources.
• Outreach to the K-12 community.

Some aspects of these objectives have been partially met, but others have yet to be fulfilled. The following sections discuss progress to date.

Undergraduate, Graduate, and Postdoctoral Traineeships

There has been substantial investment in undergraduate, graduate, and postdoctoral stipends funded through NPGI (Tables 2-4, 2-5, and 2-6). There are no readily available data to indicate how many of these trainees are or were U.S. citizens, whether they stay in plant genomics after the completion of their training, or whether NPGI funding attracted previously uncommitted students into plant genomics. The lack of long-term follow-up data on the students’ career paths compromises the committee’s ability to assess the direct impact of NPGI education funding on the future of plant genomics. Nonetheless, NPGI is exposing an increasingly large number of students to the “excitement of scientific discovery in a field at the cutting edge of biology” (NSTC 2003).

TABLE 2-4 Number of Undergraduate Students Trained and the Sector in Which They Now Hold Positions, as Reported by the 165 NPGI-Funded Investigators Who Responded to the Committee’s Questionnaire
Sector | Number of Students
Academia, including undergraduate and graduate schools and other positions in academia | 438
Government | 8
Industry | 79
Still in investigators’ laboratory | 198
Other, including professional schools such as medical, dental, and law school | 115
Unknown | 658
Total | 1,496
NOTE: Number of undergraduate students among the 1,496 known to have left the country = 10.

TABLE 2-5 Number of Graduate Students Trained and the Sector in Which They Now Hold Positions, as Reported by the 165 NPGI-Funded Investigators Who Responded to the Committee’s Questionnaire
Sector | Number of Students
Academia, including undergraduate and graduate schools and other positions in academia | 214
Government | 25
Industry | 60
Still in investigators’ laboratory | 190
Other, including professional schools such as medical, dental, and law school | 25
Unknown | 60
Total | 574
NOTE: Number of graduate students among the 574 known to have left the country = 45.

TABLE 2-6 Number of Postdoctoral Researchers Trained and the Sector in Which They Now Hold Positions, as Reported by the 165 NPGI-Funded Investigators Who Responded to the Committee’s Questionnaire
Sector | Number of Students
Academia | 318
Government | 46
Industry | 58
Research in other sector (for example, nonprofit) | 17
Still in investigators’ laboratory | 193
Other | 21
Unknown | 64
Total | 717
NOTE: Number of postdoctoral researchers among the 717 known to have left the country = 129.

NPGI projects have supported large numbers of students and postdoctoral researchers. However, no specific plant genomics graduate or postdoctoral fellowship programs or “cross-over” fellowship programs to bring non-plant scientists (particularly those with quantitative or computational skills) into plant genomics were created or funded. Specialized individual fellowship programs provide incentive for the best students at any level to seek and win competitive fellowships and have proven to bring new talent into a field. For example, the NSF Plant Biology Post-doctoral Fellowship Program (1983–1994), which was not a part of NPGI, recruited 237 scientists into plant research, nearly all of whom were trained in other fields of biology.
This program, especially in the beginning, populated plant science with young researchers trained in the molecular biology of well-developed systems. Several members of this cohort are now members of the National Academy of Sciences; one was a Howard Hughes Medical Institute Fellow and is now the chief executive officer of a start-up company. Many more are productive faculty members or high-level administrators at institutions of all types, governmental science policy makers, and corporate scientists. Plant sciences have reached yet another turning point, where a similar infusion of scientists trained in quantitative disciplines like computer science, applied mathematics, statistics, biomedical engineering, quantitative genetics and plant breeding, and ecological or evolutionary biology would be a welcome addition to most plant genome projects.

Since 2005, NSF has funded three new graduate programs relevant to NPGI through its Integrative Graduate Education and Research Traineeship (IGERT) program. In FY 2005, the University of California, Riverside Center for Plant Cell Biology and the University of California, San Diego received IGERT awards for doctoral training programs in chemical genomics and in plant systems biology, respectively; both programs will bring together faculty in plant cell biology, chemistry, computational biology, and engineering (NSTC 2006). A new IGERT program at the University of Arizona focuses on evolutionary, functional, and computational genomics (University of Arizona 2007). Although the numbers of students involved in those IGERT programs are low, the three programs are the first of what could be an expanding set of steps toward addressing a key recommendation of the 2002 NRC report that “the plant biology community needs to expand training opportunities into disciplines that are not traditionally associated with plant biology and crop sciences, such as computer science, mathematics, chemistry, and engineering.”

At the undergraduate level, NPGI provides partial support for 12 summer internship programs through its Research Experiences for Undergraduates (REU) program (NSF 2006).
In each program, about six to ten students chosen through a competitive national search receive a stipend and housing for eight to ten weeks. In addition, at least 14 faculty members from primarily undergraduate institutions were hosted by NPGI-funded laboratories through NSF’s Research Opportunity Award program. Several opportunities also exist for high school students and science teachers to participate in NPGI-funded research through NSF Research Assistantships for High School Students or Research Experiences for Teachers supplements to individual laboratories. Boxes 2-1 and 2-2 provide additional examples of the NPGI contribution to undergraduate education.

BOX 2-1 Genome Consortium for Active Teaching

Partial funding from NPGI has supported the Genome Consortium for Active Teaching (GCAT) (Davidson College 2005) to bring functional genomics methods into undergraduate curricula by providing undergraduates with access to affordable microarray technology, including the arrays, scanning services, free software for data analysis, and faculty workshops. In the first seven years of the project, 5,000 microarrays provided by GCAT have been used by 141 faculty and 6,000 students on 134 campuses. By 2009, GCAT estimates that 9,480 undergraduates per year will have access to its microarrays (Campbell et al. 2007). NPGI supports best practices workshops on the use of the microarrays and software for faculty at primarily undergraduate and minority-serving institutions (NSTC 2006), and also enables GCAT to maintain a helpdesk, staffed by students, for users of the microarray data analysis software package developed by GCAT faculty and students (Campbell et al. 2007). It is too early to assess whether this program influenced subsequent career choices for these students. However, preterm and postterm surveys of student participants documented substantial gains in their knowledge about microarray experiment design, error, gene expression, clustering analyses, and interpreting microarray results. Surveys and open-ended responses from faculty members who attended the workshops also revealed positive impacts on their teaching programs (Campbell et al. 2007).

BOX 2-2 Reaching Out to Underrepresented Populations

Several NPGI-funded programs specifically target underrepresented populations of high school and college students. For example, Iowa State University and the USDA Agricultural Research Service (USDA-ARS) Station in Ames, Iowa, offer Native American students eight-week summer research internships that focus on genetic and bioinformatic investigations of diversity among plants with cultural and historical importance to Native Americans (NSTC 2007). The multi-institutional Research Experiences for Undergraduates (REU) programs led by North Carolina State University and by the University of Connecticut offer opportunities for diverse populations of students to gain hands-on experience in plant genomics research (NSTC 2005). An interdisciplinary team of scientists from the University of Wyoming, The Institute for Genomic Research (TIGR), and Cold Spring Harbor Laboratory provides enhanced educational and research opportunities through exchange visits, courses, and hands-on workshops for both faculty members and students from Little Big Horn College, a tribal community college in Crow Agency, Montana (NSTC 2007). Several NPGI-funded laboratories have entered into formal partnerships with historically black colleges and universities, in which African American master’s candidates and undergraduate students participate in the research at the host laboratories over the summer and take their projects back to their home institutions during the academic year (NSTC 2001). It is not clear whether these programs influenced subsequent career choices for these students; assessment methods need to be developed and deployed.

Informatics and Mid-career Training

In contrast with the growing opportunities for graduate training and the slate of undergraduate educational programs, the rate of progress in providing training

42 Achievements of the N at i o na l P l a n t G e n o m e I n i t i at i v e opportunities in bioinformatics for established and new investigators or in plant genomics for plant breeders and physiologists has been considerably slower. The 2002 NRC report proposed a national strategy for bioinformatics that included training, collaboration with large data centers, and bioinformatics-oriented re- search. Strategies to address this perceived gap were also presented as key objectives of the proposed Plant Cyberinfrastructure Center (Meyerowitz and Rhee 2006). Although the anticipated new generation of researchers specializing in plant genomics is emerging, there is a need for experienced plant physiologists and plant breeders who have acquired skills in genomic technologies. The lack of plant breed- ers who are well versed in genomic approaches is seen as a major impediment to translational plant genomics and to the future of plant improvement in the public and private sector in the United States. NPGI-supported workshops on marker-as- sisted selection for plant breeders are a good start to correcting this deficit (NSTC 2006). Outreach to plant breeders, and potentially to farmers, seems to be within the mandate of USDA and its extension arm, but it is unclear whether there has been a concerted effort in this regard. In the first year of the Wheat Coordinated Agricultural Project (CAP), USDA provided workshops or information sessions on marker-assisted selection at more than 40 field days and industry meetings, and mounted a symposium at the Crop Science Society of America meeting that reached more than 120 people (USDA-CSREES presentation to the committee, April 26, 2007, Workshop). NPGI has also sponsored workshops at the Plant and Animal Genome Conferences on specialized topics relevant to specific crops and on general subjects such as database construction, transcriptional profiling, and genomic computing (NSTC 2000). 
Informing the Broader Research Community

The 2002 NRC report called for organizers of community databases to improve the user skill level through short courses and exchange visits (NRC 2002). For example, the Arabidopsis Information Resource (TAIR) has offered one-hour to two-hour introductory and advanced workshops at Plant and Animal Genome Conferences, the International Conference on Arabidopsis Research, and the American Society of Plant Biologists Meeting. TAIR usage has increased steadily since the project was founded in 1999.

K-12 Education and Outreach

Some NPGI grantees have invested considerable energy in developing outreach efforts targeted towards K-12 students. Several held workshops, where K-12 teachers learn about genomics and biotechnology and develop their own curricular modules or lesson plans (see Appendix H). Recognizing that most precollege teachers are not trained in the practice of science as a process, at least two exemplary NPGI-funded programs provide six- to eight-week full-time mentored research internships through which teachers gain first-hand experience in a plant genomics laboratory, as well as education in current learning theory research, so that the teachers are well-equipped to develop research-based curricula. Other outreach efforts resulted in Internet-accessible activities and kits (including “Biotech in a Box” loaner equipment) designed and provided by the scientists or classroom visits by the researchers. The committee could not assess these programs because long-term tracking of their impact is not provided. NPGI researchers frequently tap into existing education and training programs on their campuses (NSTC 2001).

Although better public outreach is needed, many PIs are not trained in K-12 education, and they cannot devote much time to it because of the demanding schedule of the research profession. To resolve this issue, some NPGI-funded programs hired a full-time coordinator who provides cohesive leadership for all their outreach activities (NSTC 2004). The committee enthusiastically endorses this concept and concludes that NPGI has set an example for other federal programs by appointing a professorial or an affiliate faculty-level education coordinator for each of its Coordinated Agricultural Projects (Interagency Working Group on Plant Genomes, personal communication, September 18, 2007).

Some of the NPGI-associated outreach initiatives have been remarkably large scale. The Plant Genomics Research Experience for Teachers at the University of Missouri has trained 70 teachers over the last four years (NSTC 2007).
By creating educational software and online pedagogical materials, holding workshops, and providing equipment loans and ongoing support for teachers serving low-income, rural, and underrepresented minority students, the Partnership for Plant Genomics Education at the University of California, Davis, trained 52 teachers in FY 2005. They were expected to share their information with 772 other teachers and use activities and laboratories from the course with 8,600 students (NSTC 2006). Other K-12 outreach activities that could have broad impacts are listed in Appendix H.

Supplemental funding for the MaizeGDB enabled the creation of a central online repository that compiles links to outreach resources in one location, the Plant Genome Research Outreach Portal, or PGROP (PGROP 2006). The portal, which has pull-down menus, allows users to conduct searches by user type (high school teachers, undergraduate students, growers, public at large), plant species, topic (for example, proteomics), or resource type (for example, Web-accessible teaching materials and fellowships). One of the strengths of the gateway’s interface is its capacity for directors of individual outreach or educational programs to upload information about their own programs (Baran et al. 2004). Although navigation of the Web interface is straightforward, searches routinely yield an unwieldy number of marginally relevant “hits.” A search for “Resources for High School Teachers,” for example, returns links to 130 resources including many Web pages on single genera of plants that distract from the relatively few resources (such as animated tutorials for use in the classroom) that are truly targeted specifically to teachers. On the other hand, searches for graduate programs or summer internships in plant genomics yield incomplete lists and nonfunctional links. PGROP is a comprehensive resource with high potential impact that would benefit from more inclusive cataloging and more robust, discriminating search functions.

International Interactions

Another successful aspect of the NPGI-funded efforts is their collaboration with international partners. The coordination among researchers from six groups across three continents in the Arabidopsis sequencing project (The Arabidopsis Genome Initiative 2000) paved the way for subsequent multinational endeavors (Table 2-7). Participating non-U.S. scientists in each of these projects are supported by their respective national research funding programs. The projects are overseen and coordinated by an international committee of scientists, typically elected by the research community. Such projects leverage the resources, expertise, and facilities of many countries to achieve a much richer and more comprehensive set of genome datasets than could be obtained by any single national effort. The free exchange of information engendered by such collaboration maximizes efficiency and minimizes the duplication of efforts among teams of researchers. U.S.-funded projects, from Arabidopsis Genome and Arabidopsis 2010, through the entire spectrum of NPGI projects, have led the way in truly open access data deposition. Policy recommendations for U.S. funding must be fully self-contained, both intellectually and technically. While international collaboration is important, the success of NPGI and other U.S.
science cannot be reliant on access to data and resources that, to date, are often only available with intellectual property strings attached.

A prominent example of a successful NPGI-supported international collaborative effort is the International Rice Genome Sequencing Project (IRGSP), a consortium of publicly funded laboratories from the United States, Japan, China, Taiwan, India, the Republic of Korea, Brazil, Thailand, and the United Kingdom. Two companies, Monsanto and Syngenta, invested in rice genome sequencing independently and their willingness to release data publicly facilitated the completion of the draft sequence, which was announced in 2002 (IRGSP 2002). The sharing of data, materials, and technology between public and private sector players hastened the completion of the projected 10-year initiative by four years.

Building on the success of the rice genome sequencing project, an International Rice Functional Genomics Consortium was convened with leaders from 18 institutions representing 10 countries and two international agricultural research centers. The goals of the initiative are to work cooperatively to elucidate gene function, integrate databases, establish bilateral or multilateral partnerships, and enhance rice production (IRFGC 2007). NPGI-funded PIs are prominent on the project’s steering committee, and the USDA Cooperative State Research, Education, and Extension Service (USDA-CSREES) has facilitated participation by American students, postdoctoral fellows, and senior researchers.

TABLE 2-7 Examples of NPGI-funded Projects That Involve International Collaboration
Project Name | Website
International Barley Sequencing Consortium | http://www.public.iastate.edu/~imagefpc/IBSC%20Webpage/IBSC%20Template-home.html
International Brachypodium Initiative | http://www.brachypodium.org/
International Citrus Genome Consortium | http://int-citrusgenomics.org/
International Cotton Genome Initiative | http://icgi.tamu.edu/
International Grape Genome Program | http://www.vitaceae.org/
International Legume Database & Info System | http://www.ildis.org/
International Populus Genome Consortium | http://www.ornl.gov/sci/ipgc/
International Rice Functional Genomics Consortium | http://www.iris.irri.org:8080/IRFGC/
International Rice Genome Sequencing Project | http://rgp.dna.affrc.go.jp/IRGSP/
International Soybean Genome Consortium | http://genome.purdue.edu/isgc/index.shtml
International Tomato Sequencing Project | http://www.sgn.cornell.edu/about/tomato_sequencing.pl
International Wheat Genome Sequencing Consortium | http://www.wheatgenome.org/
The Multinational Coordinated Arabidopsis thaliana Functional Genomics Project | http://www.arabidopsis.org/ ; http://www.arabidopsis.org/portals/masc/index.jsp
Multinational Brassica Genome Project | http://www.brassica.info/
SOL Genomics Network | http://www.sgn.cornell.edu/
SOL (EU-SOL) | http://www.eu-sol.net/
SOL (Lat-SOL) | http://cnia.inta.gov.ar/lat-sol/
SOURCE: Interagency Working Group on Plant Genomes.

Other functional genomics projects also capitalize on the resources and expertise of an international scientific community. For example, the goal of the International Solanaceae Genomics Project is to develop a comparative framework for studying plant diversification and adaptation across the Solanaceae family (including the important crop plants tomato, potato, eggplant, and pepper). The SOL Genomics Networks form partnerships with laboratories in Latin America and in Europe to improve the nutritional value, taste, flavor, fragrance, shelf-life, starch composition, yield, and other traits important to consumers, producers, and processors of these staple fruits and vegetables (European Commission 2006).

The Developing Country Collaborations in Plant Genome Research (DCC-PGR) program was started as an NPGI activity in 2004 to support collaborative research involving researchers in the United States and scientists in developing countries. The goal is to facilitate the application of new tools and resources to solve agricultural, environmental, and energy problems of significance to the foreign researcher’s home country. Supplemental funding to an existing or a new NPGI award of up to $100,000 for two years enables joint research projects and long- or short-term reciprocal visits of students and senior investigators, which could lead to long-term partnerships (NSF 2007b). International collaborative NPGI projects that are targeted to directly benefit resource-poor farmers in developing countries include the following (NSTC 2004, 2005):

• Using the genome map of sorghum, an important staple cereal in Africa and India, to elucidate networks of genes that control drought tolerance.
• Developing cultivars of the African cowpea (a legume widely grown in Africa, Latin America, Southeast Asia, and the southern United States) that are resistant to the parasitic weed Striga.
• Establishing comparative markers to link the genetic maps of chickpea, cowpea, and pigeon pea to the Medicago genome sequence map, enabling breeders in India and Africa to identify disease resistance genes and develop improved cultivars of their local crops.
• Using proteomics technologies to develop improved oilseed cultivars in Nepal with enhanced processing and feed characteristics.
• Harnessing genetic variation in natural rice populations to introduce disease resistance and drought tolerance into improved cultivars.
• Developing new Bolivian cultivars of potato that are resistant to bacterial wilt, which causes serious crop losses each year.
• Investigating the genes that allow plants to produce seed without fertilization (apomixis), which can be used to breed desirable traits into land races of corn that are adapted to the diverse growing conditions across Mexico.

Two other major NPGI projects that involve substantial international collaborations and represent the next wave of genomics initiatives with applications to the developing world are the sequencing of cassava and the Generation Challenge Program. U.S. researchers and their partners at the International Center for Tropical Agriculture (a center of the Consultative Group on International Agricultural Research, CGIAR) are working with JGI to perform sample sequencing of the cassava (Manihot esculenta) genome. The tuber grows in diverse climates and in nutrient-poor soil and is an important source of food and biofuel for 1 billion people globally. As a staple for subsistence farmers, a cash crop for local markets, and a reliable source of food and animal feed in famines, M. esculenta is well positioned for nutritional improvement, but genome sequencing will also provide insights into starch and protein biosynthesis and stress controls (JGI 2006). The CGIAR Generation Challenge Program is dedicated to alleviating constraints in agricultural productivity that contribute to global poverty and hunger, with an emphasis on harnessing genomic technologies to make rapid progress in the area of drought tolerance (CGIAR Genomics Task Force 2006).

NPGI AND INTERAGENCY COOPERATION

Earlier sections in this chapter assessed various specific aspects of NPGI, but NPGI is not merely a funding mechanism. It is an interagency collaboration that coordinates activities in plant genomics. The committee also assessed the role of IWG in facilitating research, training, and outreach. In addition, perhaps the most important metric for the success of NPGI is whether, and to what extent, U.S. research and development agencies have reprioritized their mission-oriented, agency-specific research portfolios on the basis of NPGI research and discoveries.

Coordination of Programs

Although each member agency of IWG has its own mission, some agencies also have overlapping interests and goals. IWG member agencies have increasingly issued joint calls for proposals or co-funded programs of mutual interest. (See Appendix I for examples.) The joint programs reduce administrative burdens for principal investigators applying for funds and allow the agencies to jointly achieve common program goals.

Perhaps the most important metric for NPGI is whether the science funded to date has served as a springboard for agency-specific, mission-oriented programs that capitalize on either new funding from the public or on public-private partnerships. One of the greatest challenges that the nation faces in the 21st century is reducing dependence on foreign oil.
Top-quality basic genome-based research is necessary to achieve these goals, and that research will rely heavily on plant genomics and genetics for its progress. Biofuels and bio-based products are potentially sustainable solutions if conversion efficiency is improved dramatically (NRC 2005a). New investments in bioenergy research will leverage basic plant genomics discoveries made through NPGI (Table 2-8). The largest to date is the $500 million investment by BP America, Inc., in conjunction with the University of California, Berkeley, and Lawrence Berkeley National Laboratory to create the Energy Biosciences Institute (EBI). In addition, DOE has made plant genomics a linchpin in its Genomics: GTL portfolio for Bioenergy. Using the JGI sequencing platform as a departure point, DOE recently invested about $375 million into three bioenergy centers that are intended to accelerate basic research in the development of cellulosic ethanol and other biofuels (http://genomicsgtl.energy.gov/centers/). Further, a joint DOE and USDA-CSREES program recently announced $8.3 million in grants for improvement of feedstock (DOE 2007a).

Additional new programs that leverage the basic science of NPGI include a DOE Genomics: GTL program that is soliciting proposals for new analytical and imaging technologies for lignocellulosic material degradation and for multiplexed screening for mutant plant phenotypes. Also, the 2007 Farm Bill Title VII (H.R. 2419) passed by the House of Representatives has provisions of $50 million per year for an Agricultural Bioenergy and Biobased Products Research Initiative and $100 million per year for a Specialty Crop Research Initiative to develop and disseminate science-based tools, including plant breeding, genetics, and genomics, to address needs of specialty crops (Table 2-8). At this writing, the bill has been placed on the Senate calendar for consideration. If it is passed, the initiatives would provide vital new resources to expand IWG activities.

Other examples of refocusing and increased investment in agency mission-specific research include the conversion of the USDA-CSREES National Research Initiative (NRI) Plant Genome panel into a translational genomics program in recent years, with a different crop focus each year, and with a major emphasis on outreach and extension of research efforts. USDA-ARS has also refocused some of its internal programs to complement and support NPGI research.
The National Program 301 on Plant Genetic Resources, Genomics, and Genetics Improvement redirected its statement of purpose to support the new discoveries made by NPGI-funded research ($140 million in FY 2007; Table 2-8). As NPGI research generates valuable data, the need for database stewardship and informatics tools to use the data effectively becomes apparent. Therefore, the National Program 301 includes a component on crop informatics, genomics, and genetic analyses that addresses genome database stewardship and informatics development, structural comparison and analysis of crop genomes, and genetic analyses and mapping of important traits.

Likewise, the National Program 302 on plant biological and molecular processes has redirected its focus to applications of genomics to crop plants because of NPGI discoveries. As a result of NPGI-funded research on model plants, National Program 302 has refocused its objective to take advantage of the new genomic information and to advance it from model plants to crop plants ($40 million in FY 2007; Table 2-8). The goal is to translate plant genomics into crop improvement.

Applied mission-oriented, agency-based forest tree genomics programs have also been derived from basic discoveries made through NPGI. For example, a multimillion-dollar Coordinated Agricultural Project from USDA-CSREES and USDA Forest Service (USFS) on conifer genomics began in 2007 and will allow association genetic studies of trees in the major breeding programs throughout the United States. Each of the above examples is testament to the power of federal investment in competitive, peer-reviewed, curiosity-driven basic plant genomics research, and illustrates the return reaped in translation to agency-specific, mission-oriented applied plant genomics.

TABLE 2-8 Examples of Agency-specific and Mission-focused Programs That Have Spun Off of, or Benefit from, Results of NPGI Research
Programs | Program Budget (in millions)
Energy Biosciences Institute, UC Berkeley, BP, LNL | $500 over 10 years
Agricultural Bioenergy and Biobased Products Initiative (a) | $250 over 5 years
Barley Coordinated Agricultural Program | $5 over 4 years
Bioenergy Research Center (b) | $375 over 5 years
Conifer Coordinated Agricultural Program | $6 over 4 years
National Program 301: Plant Genetic Resources, Genomics, and Genetics Improvement | $140 in FY 2007
National Program 302: Plant Biological and Molecular Processes | $40 in FY 2007
Plant Feedstocks Genomics for Bioenergy | $8 over 3 years
Rice Coordinated Agricultural Program | $5 over 4 years
Specialty Crop Research Initiative (a) | $500 over 5 years
Wheat Coordinated Agricultural Program | $5 over 4 years
(a) The Agricultural Bioenergy and Biobased Products Initiative and the Specialty Crop Research Initiative were proposed in the 2007 Farm Bill, which had not been passed by Congress at the time this report was written.
(b) The program budget presented for the Bioenergy Research Center includes funding for plant and microbial research and technology development.
In-kind Support and Distribution of Resources

Although some IWG member agencies fund plant genomics research, all of them contribute to the goals of NPGI by providing in-kind support, distributing resources, and keeping each other abreast of the latest genomic technologies.

In-kind Support (provided largely by IWG member agencies)

USDA-ARS

USDA-ARS funding of $3.6 million in FY 2002 and $8.5 million in FY 2006 for plant bioinformatics includes support for the following projects:

• The maize genetics and genomics database. This project aims to synthesize, display, and provide access to maize genomics and genetics data for the research and user communities.
• Identification of functional sequence in plant genomes through bioinformatic, genomic, and genetic approaches. This project aims to provide resources to characterize, track, and identify sequence associated with agronomically important traits.
• An integrated database and bioinformatics resource for small grains. This project aims to integrate small grains genetic and genomic data within the GrainGenes database and link to relevant external databases. It also aims to develop software and interfaces to enhance utility for researchers.
• Curation and development of the Soybean Breeder’s Toolbox and its integration with other plant genome databases. This project aims to implement web-accessible computation and visualization tools to enable comparison and transfer of agronomically important genetic information among soybean and other related species. The project also involves the curation and enhancement of the SoyBase and the Soybean Breeder’s Toolbox and the coordination of the assembly and annotation of soybean whole-genome sequence.

DOE

In addition to funding individual research projects, DOE’s contributions to NPGI include sequencing of plant species through its Community Sequencing Program (CSP) or Laboratory Science Programs (LSP). Examples of plant genome sequencing by DOE’s Joint Genome Institute through CSP include Physcomitrella in 2005; Selaginella, sorghum, Arabidopsis lyrata, Capsella, Mimulus, and the chloroplast of Campanulales in 2006; and Brachypodium, Aquilegia, Gossypium, cassava, maize, soybean, and Eucalyptus in 2007. JGI was the lead organization in the sequencing of poplar (Table 2-1). JGI is committed to plant EST sampling as well.
For example, JGI agreed to produce ESTs for switchgrass and peach in 2007, as well as for eucalyptus, foxtail millet, and conifers (loblolly pine and 22 other species selected for their commercial and ecological importance or their ability to provide phylogenetic insight into conifer genome evolution) in 2008 (DOE 2007a) (see Table 2-1).

The committee notes that JGI’s contribution to plant genomics is unique and fundamental, and spans both explicitly energy-oriented projects and projects that broadly inform all of plant biology from evolution through comparative genomics. There is no other high-throughput sequencing facility interested in serving plant genomics that can match JGI’s power and consequent economy of scale. These points inform one of the committee’s most important recommendations (see Chapter 3).

USDA Forest Service

The USFS has 10 full-time-equivalent scientists who conduct genomics research. USFS has also provided technical support in tree genomics or molecular genetics in the form of competitive awards or cooperative agreements (see Appendix K). Compared to funding from USDA-CSREES NRI and NSF grant programs, USFS has to date made modest investments in plant genomics. Other in-kind products of USFS include:

• Maps of amplified fragment length polymorphism and simple sequence repeats for the American beech.
• Markers for the selection of butternut that is resistant to butternut canker.
• Markers for the improvement of black walnut.
• Multiplex sequencing capability on high-capacity sequencing platform.
• Specific markers for identifying rust-resistant loblolly pine.
• Neutral markers for QTL analyses of loblolly pine.

NHGRI

Although NHGRI’s primary focus is human genome sequencing, it plays a role in advancing NPGI’s objectives through its support for genome sequencing and the genomics infrastructure it has built. NHGRI has provided financial support for a number of large-scale sequencing centers over the years. Although NHGRI does not fund plant genome sequencing directly, parts of some plant genome sequencing projects have been done at one of the NHGRI-supported sequencing centers, and many of the fungal pathogen genome sequences noted above were done as part of the Broad Institute’s Fungal Genomics Program. NHGRI also supports the advancement of sequencing technology, development of bioinformatics tools, and identification of all functional elements in the human genome. As a member of NPGI, NHGRI can pass on the technologies and tools developed and lessons learned to the plant community swiftly. NHGRI continues to promote free and open data release and keeps NPGI updated on NHGRI’s policies.
In fact, NPGI has adopted the Bermuda accord, which requires rapid release of publicly funded sequence assemblies of 2 kb or larger, and the Fort Lauderdale accord, which defines a community-resource project. NHGRI periodically considers whether its data release policies are appropriate and keeps NPGI informed of those discussions.

Distribution of Resources

The National Plant Germplasm System (NPGS), managed and funded by USDA-ARS in partnership with agricultural experiment stations and land-grant universities, aids plant scientists by conserving the plants and seeds of nearly 10,000 species. To ensure that genes are available to NPGI-funded researchers, NPGS continues to acquire, preserve, evaluate, document, and distribute crop germplasm, much of which originates outside the United States (ARS 2005). NPGS distributed over 150,000 accessions in 2006, including 9,131 Triticum, 5,597 Oryza, 11,951 Zea mays, 19,349 Glycine, 9,729 Lycopersicon, and 5,073 Vitis. Among those distributions, some were mutants or cytogenetic stocks (based on information submitted to the committee by USDA-ARS on May 17, 2007).

Because of the increasing demand as a result of NPGI-funded research, stock centers were built or expanded. For example, the Maize Genetics Cooperation was expanded to provide long-term curation of maize mutant genetic stocks developed by NPGI awardees. The Genetic Stocks-Oryza Collection was established as a result of NPGI when the rice genome was sequenced and the need for a collection of rice seed mutant genetic stocks was recognized. Mutant seed genetic stocks of other plants developed by NPGI awardees are added to the working collections of other NPGS repositories. In addition to germplasm collections, many Websites and databases were developed or expanded as a result of NPGI (see Appendix F).

Life on Earth would be impossible without plants. Humans rely on plants for most clothing, furniture, food, as well as for many pharmaceuticals and other products. Plant genome sciences are essential to understanding how plants function and how to develop desirable plant characteristics. For example, plant genomic science can contribute to the development of plants that are drought-resistant, those that require less fertilizer, and those that are optimized for conversion to fuels such as ethanol and biodiesel. The National Plant Genome Initiative (NPGI) is a unique, cross-agency funding enterprise that has been funding and coordinating plant genome research successfully for nine years. Research breakthroughs from NPGI and the National Science Foundation (NSF) Arabidopsis 2010 Project, such as how the plant immune system controls pathogen defense, demonstrate that the plant genome science community is vibrant and capable of driving technological advancement. This book from the National Research Council concludes that these programs should continue so that applied programs on agriculture, bioenergy, and others will always be built on a strong foundation of fundamental plant biology research.
