tutes of Health have achieved quite a bit in terms of technology development, informatics, resource development. We are delighted to make available all the lessons we have learned."
How to apply these scientific tools—which information is most important to pursue, which techniques are best suited for certain types of studies, how research should be ordered and prioritized, and so on—is a more complicated issue, but it is an issue to which lessons from other genome work still apply. In this case, however, it does matter which genome efforts serve as models. The most appropriate lessons will come from genome work on multi-cellular organisms rather than bacteria and other single-celled creatures, because of the difference in the size and complexity of their genetic codes.
Microorganisms have relatively small genomes. The yeast Saccharomyces cerevisiae, for example, has 16 chromosomes with a total length of 12 million base-pairs (a base-pair being an individual letter in the genetic instructions that are encoded in an organism's DNA). Most of the bacteria whose genomes have been decoded have fewer than two million base-pairs. The genomes of plants and animals are, by comparison, huge. The common weed Arabidopsis thaliana, which is believed to have the smallest genome of any flowering plant, still boasts 120 million base-pairs. Corn's genome has 2.3 billion base-pairs, and wheat's is an oversized 16 billion, more than 100 times as large as that of Arabidopsis. Compared with wheat's, the human genome is modest, at only 3 billion base-pairs, but it is still a thousand times larger than that of the typical single-celled organism. The bigger a genome is, the more difficult it is to sequence, and not just because of the greater number of base-pairs. As the total amount of genetic material increases, the technical difficulties of handling and keeping track of all the DNA grow as well. For that reason, researchers looking to sequence the genomes of rice or corn or cows or pigs can learn most by gathering advice from their colleagues working on Arabidopsis and mice and humans.
That advice, as offered at the workshop, ranges from very broad and general to quite narrow and specific. On the broad end, there is general agreement about the right way to approach mapping and sequencing the genomes of multi-cellular organisms. The recommendations of Christopher Somerville, director of the Department of Plant Biology of the Carnegie Institution of Washington in Stanford, California, were typical: "When we began developing the Arabidopsis program, we realized the sequencing had to be preceded by other forms of information, particularly a very good map, because sequenced information is most useful when it is used in conjunction with mapping information on individual genes and mutants." In mapping a genome, researchers piece together a large-scale picture of where genes and larger chunks of the chromosomes are located. A sequence, on the other hand, is a letter-by-letter compendium of the genetic code, a listing of the base-pairs in the order in which they appear on a chromosome. (See Box 1.)