This new knowledge is revolutionizing the field of medical diagnostics and could yield a powerful arsenal of therapies that offer the promise of cures instead of just amelioration of symptoms. Precisely because of this potential, the rise of genomics and proteomics has generated numerous policy battles, of which disputes about intellectual property are but one.

This chapter provides background information on the science of genomics and proteomics and on their impact on the changing paradigm in genetic, or personalized, medicine. It also briefly describes some of the policy debates that have ensued regarding openness and access to genomic and proteomic data as they have affected the conduct of science. Chapter 3 focuses more specifically on intellectual property issues affecting these fields as they have entered the U.S. patent system and the courts.


After 1953, when Watson and Crick proposed the essentially correct model for the three-dimensional structure of the DNA double-stranded helix (Watson and Crick, 1953), it soon became evident that genetic information stored in DNA was both finite and discrete (or digital) in nature. Knowledge of the order of the four bases—adenine, guanine, cytosine, and thymine (A, G, C, and T)—within each DNA strand, or sequence, of an organism provides full knowledge of all the genetic information passed from one generation to the next. According to Crick, he and Watson speculated about determining the full sequence of human DNA early on but discarded the idea as one that would not reach fruition for centuries (Crick, 2004).

Astounding progress over the ensuing three decades in the discipline now known as molecular genetics, however, proved their pessimistic estimates incorrect. A DNA fragment from any organism can be inserted (or cloned) into the bacterium E. coli, which in turn can generate huge numbers of copies of the desired gene fragment for further study. In 1977, the Nobel laureate chemist Frederick Sanger developed efficient methods for using these amplified samples of genetic fragments to determine the sequence of the DNA bases and published the entire sequence of some small viral genomes (Sanger et al., 1977). By the mid-1980s, much of the molecular genetics research community was engaged in isolating and sequencing the DNA of individual genes of interest from particular organisms.

Open, facile access to this relatively limited amount of DNA sequence information became an important priority for molecular biologists and molecular geneticists alike. As a result, in 1979 GenBank was established as a nucleic acid sequence database at the Los Alamos National Laboratory and was funded by the National Institute of General Medical Sciences three years later. In 1988, the National Center for Biotechnology Information (NCBI) of the National Institutes of Health (NIH) was organized, and it took over the management of GenBank.

The GenBank database is designed to provide and encourage access to the
