most up-to-date and comprehensive DNA sequence information to members of the scientific community. Because protein primary structures are now determined mostly by complementary DNA (cDNA) sequence analysis, links between the nucleotide and protein sequence databases are common. GenBank belongs to an international collaboration of sequence databases, which also includes the European Molecular Biology Laboratory and the DNA Data Bank of Japan. Protein sequences are archived by another international consortium, the Universal Protein Resource (UNIPROT),4 which serves as a central repository of protein sequence and function.
NCBI places no restrictions on the use or distribution of the GenBank data. However, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. As of February 2004, GenBank held 37,893,844,733 bases in 32,549,400 sequence records.
In an effort to marshal these rapid advances, Robert L. Sinsheimer of the University of California, Santa Cruz, formally proposed in 1985 a concerted effort to sequence the human genome. In 1986, Renato Dulbecco, a Nobel laureate and a member of the Salk Institute, made a similar proposal in the pages of Science magazine, intended to provide the underpinning for the study of cancer (Dulbecco, 1986). Influential and widely circulated reports by the U.S. Department of Energy (DOE), the congressional Office of Technology Assessment (U.S. Congress, 1988), and the National Research Council (NRC, 1988) followed, and all recommended such a project. The NRC report recommended that the U.S. government financially support a project and presented an outline for a multistep research plan to accomplish the goal over 15 years. Soon thereafter, NIH and DOE signed a Memorandum of Understanding to “provide for the formal coordination” of their activities “to map and sequence the human genome.” In fiscal year 1988, Congress formally launched the Human Genome Project (HGP) by appropriating funds to both DOE and NIH for that specific purpose.
As envisioned in the NRC report, the HGP did not begin immediately with human sequencing. Instead, the program sought to build infrastructure through a variety of projects. These efforts included the exploration of alternative sequencing technologies, the adaptation of existing technologies to the simpler problem of sequencing smaller genomes of laboratory organisms, and the development of low-resolution maps of the human genome. Other countries, in particular Britain, France, and Japan, also initiated genome programs of their own, and indeed several early successes came from outside the United States.
Despite broad governmental support, the HGP generated considerable con-