Copyright © 2009. National Academy of Sciences. All rights reserved.
Implications for Medicine
and Science
MEDICAL USES
A Map of the Human Genome Will Greatly Facilitate the
Identification of Specific Disease Genes
Humankind is afflicted by more than 3,000 known different inherited
disorders. Taken together, these disorders affect every organ, system,
and tissue in the human body. Some cause disease even before birth,
whereas others are observed only in adulthood. Some are common,
others rare. Although their overall impact on human health is enor-
mous, until recently our understanding of the vast majority of these
disorders has been meager. Even today we have identified the
responsible gene in fewer than 3 percent of all known inherited
disorders. In nearly all of these cases the disease gene codes for a
known protein. For diseases in which the responsible protein has
been identified, it is now regularly possible, with recombinant DNA
methods, to clone the gene and begin to understand the genetic defect.
In this way we have learned much about conditions such as thalas-
semia, sickle-cell anemia, hemophilia, Tay-Sachs disease, and familial
hypercholesterolemia. However, most disorders result from mutations
in genes whose protein products have not been defined. In these
situations, identification of a DNA segment that is regularly altered
(either by deletion, rearrangement, or point mutation) in a given
disorder provides clues to identifying the disease gene. So far, the
genes for three disorders (Duchenne muscular dystrophy, retinoblastoma,
and chronic granulomatous disease) have been successfully
identified in this manner. This approach is also making possible an
ongoing search for the genes relevant to such conditions as cystic
fibrosis, Huntington's disease, and familial Alzheimer's disease. These
are but a small subset of the numerous Mendelian disorders for which
direct genetic analysis offers the best hope of identifying the respon-
sible genes.
The availability of various types of maps of the human genome
would greatly facilitate the search for genes related to specific inherited
diseases. A detailed genetic linkage map based on RFLPs would
permit rapid assignment of disease loci to subchromosomal regions,
perhaps at a resolution of 1 million nucleotides. The availability of
DNA clone collections and a restriction map of the genome would
then allow efficient comparative analysis of DNAs from normal and
affected individuals to pinpoint with higher resolution the area in
which the relevant gene resides. Finally, a DNA sequence of the
genome would allow all putative genes in the region to be identified
and would also provide a data base for evaluating sequences obtained
in samples of DNA from patients. Although more complicated in its
execution, similar approaches could be applied to the more common
multigenic disorders, i.e., those for which more than one gene may
be responsible. Examples include hypertension, some forms of cancer,
diabetes, schizophrenia, mental retardation, and neural tube defects.
Thus, the availability of a map and sequence would greatly accelerate
the identification of disease genes and permit investigators to focus
more rapidly on the nature of the gene products and their cellular
roles.
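The last step described above, evaluating a patient's sequence against a reference, can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the report: the function name `point_differences` and both sequences are hypothetical, and the sketch assumes the two sequences are already aligned and equal in length, so it reports only point substitutions and not deletions or rearrangements.

```python
def point_differences(reference: str, patient: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, patient base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(patient):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, p)
        for i, (r, p) in enumerate(zip(reference.upper(), patient.upper()))
        if r != p
    ]


# Hypothetical toy sequences, not real genomic data.
reference = "ACGTTGCAAGCTTAGGCT"
patient = "ACGTTGCAAGCATAGGCT"
print(point_differences(reference, patient))  # [(12, 'T', 'A')]
```

Detecting deletions or rearrangements would require an alignment step first, which is omitted here.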
Disease Genes Promise to Provide Important
Insights into Human Biology
An understanding of normal physiology and biochemistry has often
been gained through the study of single gene disorders for which
protein products have been characterized. For example, elucidation
of many pathways of intermediary metabolism resulted from the
examination of cells from patients in whom a single enzyme activity
was abolished. Similarly, the study of individual mutant genes encoding
uncharacterized products is certain to illuminate new biochemical and
cellular mechanisms related both to normal human physiology and to
the development of disease. The rapid identification of disease genes
will enable investigators to examine in detail the protein products of
such genes and their roles in cellular biology. When few clues to
pathophysiology exist (e.g., neurofibromatosis, polycystic kidney
MAPPING AND SEQUENCING THE HUMAN GENOME
disease, or retinitis pigmentosa), this strategy will provide new insights
into pathogenesis.
The implications of such research are likely to be extensive. In
many instances, examination of an apparently rare situation may lead
to a clearer understanding of normal mechanisms that may be adversely
affected in other ways in more common diseases. For instance, studies
of the recently isolated gene responsible for the rather rare childhood
tumor known as retinoblastoma should increase our understanding of
more common cancers (Dryja et al., 1986; Friend et al., 1986), and
studies of the genes involved in an apparently uncommon type of
Alzheimer's disease may explain more general features of aging (St
George-Hyslop et al., 1987).
Specific Medical Applications
An improved capacity to identify genes related to disease will have
an immediate impact on the diagnosis, treatment, and prevention of
genetic disorders. As more disease genes are isolated, DNA-based
diagnosis will become more common and the potential for somatic
cell gene therapy will increase. Furthermore, the availability of
molecular probes for specific gene loci will permit detection of the
carriers of disease-associated genes. This ability will enable parents
to identify the extent to which their offspring may be at risk for a
genetic defect. In addition, the identification and characterization of
disease genes will lead (and already has led for many genetic disorders)
to improved prenatal diagnosis of serious conditions by direct DNA
analysis. Finally, the ability to determine whether individuals are
carriers for specific gene defects will facilitate various epidemiological
investigations of the risks associated with specific environmental
factors, occupational settings, or drugs.
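As a worked illustration of the risk estimate that carrier detection makes possible, the sketch below enumerates the equally likely allele pairings at one locus (a Punnett square). It assumes the simplest case, a single-locus Mendelian trait with two alleles per parent; the function name `offspring_genotype_odds` and the allele labels `A` (normal) and `a` (defective) are hypothetical and are not drawn from the report.

```python
from fractions import Fraction
from itertools import product


def offspring_genotype_odds(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Enumerate equally likely allele pairings for one locus (a Punnett square)."""
    share = Fraction(1, len(parent1) * len(parent2))
    odds: dict[str, Fraction] = {}
    for a, b in product(parent1, parent2):
        genotype = "".join(sorted(a + b))  # "aA" and "Aa" are the same genotype
        odds[genotype] = odds.get(genotype, Fraction(0)) + share
    return odds


# Two carriers of a recessive defect: each child has a 1/4 chance of being "aa".
print(offspring_genotype_odds("Aa", "Aa")["aa"])  # 1/4
```

If only one parent is a carrier, the `aa` genotype never appears in the result, which corresponds to no affected offspring under this simplified recessive model.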
Toward an Understanding of Cancer
Cancer results from the unregulated growth of cells. What has been
learned over the past decade or so, largely through the application of
molecular genetic tools, is that deregulation of growth is caused by
specific genetic abnormalities, i.e., mutations in growth-related genes
that are either inherited or acquired during life. Inherited defects
generally confer increased susceptibility to a particular form of
cancer, for example, retinoblastoma, cancer of the colon, certain
kidney tumors, and malignant melanoma. Only in retinoblastoma has
the susceptibility gene been identified. The search for the responsible
genes in other instances is in its early stages and will be greatly
facilitated by detailed RFLP and DNA clone maps and the nucleotide
sequence. With the susceptibility genes in hand, it will be possible to
identify by testing an individual's DNA those who need special
surveillance for precancerous or early cancerous changes so that
appropriate treatment can be applied at an early stage of disease. It
may also become possible to counter the effects of inherited
susceptibility more directly once the physiological effects of the various
genes are understood.
In recent years much has been learned about acquired genetic
abnormalities related to cancer. During one's lifetime, the DNA in
somatic cells undergoes mutation, either spontaneously or as induced
by environmental mutagens. These mutations involve changes in
nucleotides, rearrangements, duplications, or deletions. Some of these
changes occur in genes that regulate growth. Several dozen genes are
now known that, when mutated in specific ways or overexpressed,
deregulate cell proliferation. Some of these abnormal genes (called
oncogenes) have been found in human cancer cells and seem to
contribute to their tumorigenic properties. In several instances the
proteins encoded by oncogenes have been shown to be altered forms
of cell growth stimulators or the cellular receptors for growth stimu-
lators. Other oncogenes encode proteins that are involved in the
intracellular response of cells to growth stimulators. As a result of
these findings, primary questions regarding cell growth and human
cancer have come into sharp focus: What normal human proteins are
involved in cell growth and how do they act? How do changes in one
or more of these proteins cause cells to grow into tumors and to
spread to distant organs? What genetic mechanisms underlie these
changes? What is the spectrum of oncogenes or metastasis genes
present in human tumors?
The availability of a map and sequence of the human genome and
of the genomes of simpler organisms will help answer these questions.
It will facilitate the isolation of genes that are homologous to known
growth-related genes and the identification of previously undiscovered
genes that play a role in cell growth and development. The charac-
terization of the genes and proteins that regulate cell growth and are
responsible for neoplasia and metastasis of tumor cells is likely to
lead to more sensitive diagnostic and prognostic tests and to new
approaches to the control of cancer.
IMPLICATIONS FOR BASIC BIOLOGY
What Aspects of Genome Organization Are
Important for Genome Function?
The principles of genome organization are poorly understood. The
human chromosome contains functional segments that are not genes.
Specific segments are essential for the duplication of the chromosomes
before cell division and for ensuring that the correct complement of
chromosomes segregates into the two daughter cells. The nature of
these segments within a chromosome and the mechanism by which
they carry out their functions are poorly understood in mammals. A
physical map of the human genome will provide the basis for exper-
imentation into the identity and role of these and other elements.
The study of genome organization, that is, the order in which genes
occur along a chromosome and their relations to various other
components, will be enhanced by the existence of a physical map.
For example, we do not know in most cases whether the order of
genes on a given chromosome is important to their function. Is there
a selective advantage to the organism to maintain the proximity of
genes that are expressed together? Limited studies comparing the
overall organization of genes in the chromosomes of humans and mice
suggest that the organization of large blocks of genes has often been
conserved, but it is not known whether this is important to their
function (Sawyer and Hozier, 1986). By comparing the physical maps
of a variety of organisms, it will become apparent which segments
are conserved in their gene order across species and therefore are
likely to have functional significance.
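The comparison of gene order across species can be sketched as follows. This is a simplified illustration under stated assumptions: each map is reduced to a list of gene names in chromosomal order, each name occurs once per list, and only runs in the same orientation are counted (inverted blocks are ignored). The function name `conserved_blocks` and the gene labels are hypothetical.

```python
def conserved_blocks(order_a: list[str], order_b: list[str], min_len: int = 2) -> list[list[str]]:
    """Return maximal runs of genes that are consecutive and in the same order in both lists."""
    position_b = {gene: i for i, gene in enumerate(order_b)}
    blocks: list[list[str]] = []
    current: list[str] = []
    for gene in order_a:
        in_b = gene in position_b
        # The run continues only if this gene directly follows the previous one in order_b.
        extends = in_b and bool(current) and position_b[gene] == position_b[current[-1]] + 1
        if extends:
            current.append(gene)
        else:
            if len(current) >= min_len:
                blocks.append(current)
            current = [gene] if in_b else []
    if len(current) >= min_len:
        blocks.append(current)
    return blocks


# Hypothetical gene orders on one human and one mouse chromosome.
human = ["g1", "g2", "g3", "g4", "g5", "g6", "g7"]
mouse = ["g5", "g6", "g7", "g4", "g1", "g2", "g3"]
print(conserved_blocks(human, mouse))  # [['g1', 'g2', 'g3'], ['g5', 'g6', 'g7']]
```

Finding a conserved block says nothing by itself about whether the order matters for function; as the text notes, that question would need comparisons across a variety of organisms.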
The detailed comparison of corresponding mouse and human DNA
sequences is likely to be of special importance. Sufficient time (an
estimated 70 million years) has elapsed since the divergence of mice
and humans from a common mammalian ancestor for those chromo-
somal regions whose nucleotide sequence is not crucial for the function
of the organism to differ extensively as a result of random events that
change nucleotide sequences. Thus, a comparison of mouse and
human sequences can reveal those regions of our chromosomes with
crucial functions that are reflected as conserved (i.e., common)
nucleotide sequences. Evolutionary biologists believe that changes in
most of these sequences have occurred at one time or another during
evolution, but because the changes were deleterious, the mutant
individuals who carried such changes were eliminated from the
population by natural selection. Included among the conserved se-
quences will be the exons of important proteins as well as the sequences
in genes that regulate gene expression. Other conserved sequences
whose function cannot be anticipated will no doubt be discovered in
this way; their identification should eventually provide many new
insights into the functions of both genes and genomes.
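One simple way to look for conserved regions of the kind described above is to slide a window along a pair of aligned sequences and record the fraction of identical positions. The sketch below is illustrative only: it assumes the mouse and human sequences have already been aligned (with `-` marking gaps), and the function names, window size, and threshold are hypothetical choices rather than values from the report.

```python
def window_identity(seq_a: str, seq_b: str, window: int) -> list[float]:
    """Fraction of identical positions in each sliding window of an existing alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not 0 < window <= len(seq_a):
        raise ValueError("window must be between 1 and the alignment length")
    # A gap aligned to a gap is not counted as a match.
    matches = [int(a == b and a != "-") for a, b in zip(seq_a.upper(), seq_b.upper())]
    count = sum(matches[:window])
    scores = [count / window]
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]  # slide the window by one position
        scores.append(count / window)
    return scores


def conserved_windows(seq_a: str, seq_b: str, window: int, threshold: float) -> list[int]:
    """0-based start positions of windows whose identity meets the threshold."""
    return [i for i, s in enumerate(window_identity(seq_a, seq_b, window)) if s >= threshold]


# Hypothetical aligned fragments: identical over the first 8 positions only.
human = "ACGTACGTTTTT"
mouse = "ACGTACGTAAAA"
print(conserved_windows(human, mouse, window=4, threshold=1.0))  # [0, 1, 2, 3, 4]
```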
Many New Human Genes and Proteins Will Be Identified
Only a small percentage of the human genes involved in normal
development and disease have been identified to date. Mapping and
sequencing the human genome will result in the identification of a
large number of new genes and their encoded proteins. As one benefit,
the physical map will help pinpoint the position of human genes that
have been mapped to specific chromosomal locations but have not
yet been isolated. Moreover, genetic studies of the mouse have
revealed mutations in many genes that cause interesting pathological
defects, but little is known about these genes except their location on
the genetic map of mice. By knowing the specific correspondence
between the physical maps of humans and mice, the corresponding
gene can be identified and studied in both organisms.
There are also computer-based methods for detecting genes when
the only information available is a long stretch of continuous nucleic
acid sequence (Staden and McLachlan, 1982). These methods have
been improving dramatically, and a human genome project will
stimulate further improvement in existing computer-based tools. At
present, the identification of genes and their protein products relies
on several methods. First, the exons within a DNA sequence can
often be predicted by identifying those segments that contain open
reading frames (regions of nucleotide sequence without the "stop
codons" that terminate protein synthesis) and also have codon usage
biases (the preferential use of one of several codons that specifies a
particular amino acid) that are consistent with other genes in that
organism. Moreover, there are conserved sequences that always flank
an intron.
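The two signals named above, open reading frames and codon usage, can be sketched as follows. This is a minimal illustration and not the Staden and McLachlan method: it scans only the forward strand, treats an open reading frame exactly as the text defines it (a stretch with no stop codon, not necessarily starting at ATG), and merely tabulates codon frequencies without scoring them against an organism's known bias. The function names and the `min_codons` parameter are hypothetical.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frames(dna: str, min_codons: int = 3) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for stop-free stretches on the forward strand.

    start and end are 0-based and end-exclusive; the stop codon is not included.
    """
    dna = dna.upper()
    found = []
    for frame in range(3):
        run_start = frame
        last = frame  # position just past the last complete codon read
        for pos in range(frame, len(dna) - 2, 3):
            last = pos + 3
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos - run_start) // 3 >= min_codons:
                    found.append((frame, run_start, pos))
                run_start = pos + 3
        if (last - run_start) // 3 >= min_codons:
            found.append((frame, run_start, last))
    return found


def codon_usage(coding: str) -> dict[str, float]:
    """Relative frequency of each codon in an in-frame coding sequence."""
    coding = coding.upper()
    codons = [coding[i:i + 3] for i in range(0, len(coding) - 2, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical toy sequence, not real genomic data.
dna = "ATGGCCGCCTAAGGG"
for frame, start, end in open_reading_frames(dna):
    print(frame, dna[start:end], codon_usage(dna[start:end]))
```

On a sequence this short every frame yields a stop-free stretch, which shows why the open-reading-frame test alone is weak and why the text pairs it with codon usage bias: in practice `min_codons` would be set far higher and the codon frequencies compared with those of known genes from the same organism.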
As a second approach, genes often share homologies with one
another on the basis of common evolutionary history; these homologies
have been successfully exploited in a number of areas, for example,
to identify related family members of lymphokines, to find new receptor
proteins for neurotransmitters, and to find genes that may play
important roles in pattern formation in development. Many sequence
motifs that encode protein domains with a similar function have been
identified, such as the common domain found in all protein kinases.
These have been useful in predicting the function of unidentified gene
products from their amino acid sequences. As increasing numbers of
new proteins are isolated and functionally characterized, the data
base available for such comparisons will be greatly increased. Many
proteins contain domains that have been used over and over again in
the construction of related proteins. Therefore, it should eventually
be possible to discover a great deal about the structure and function
of a protein from the amino acid sequence derived from its gene.
Because exons coincide in many instances with protein domains,
knowledge of the exon-intron structure of a gene can also provide
insights into both the structure and function of the protein.
How Do Organisms Evolve?
To gain a deep understanding of organisms we must understand
how they evolved, and much of the evolutionary history of humans
is present in our genomes. If we knew the complete DNA sequences
of humans and other organisms, we should be able to trace the origins
of most of our genes. However, because all mammals are constructed
from similar sets of proteins, the building blocks that are used to
construct a human and a whale are very much the same. The many
differences between mammalian species are therefore believed to
depend largely on differences in the regulatory signals that control
the timing, level, and cell specificity of gene expression. Thus, the
orderly development of the human embryo requires that specific gene
sets be activated at exactly the right place and time as new cell types
arise from multipotential cells. This process is controlled at least in
part by regulatory DNA sequences located near the genes. In many
cases, these sequences will be homologous among those genes that
are coactivated. The sequence analysis of the human genome, and its
comparison with the sequence of other mammalian genomes such as
the mouse, should allow us to identify a very large number of regulatory
DNA sequences. Moreover, one can hope to begin to understand not
only the rules that govern gene regulation but also the changes that
have occurred during evolution that have differentiated the human
organism from our mammalian relatives.
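The idea that coactivated genes share homologous regulatory sequences suggests a simple search: collect the DNA near each gene in a coactivated set and look for short stretches common to all of them. The sketch below does this with exact matching of fixed-length substrings, which is a deliberate simplification, since real regulatory elements tolerate mismatches and vary in length. The function name `shared_kmers` and the sequences are hypothetical.

```python
def shared_kmers(sequences: list[str], k: int) -> set[str]:
    """Return every length-k substring present in all of the given sequences."""
    if not sequences:
        return set()

    def kmers(seq: str) -> set[str]:
        seq = seq.upper()
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    common = kmers(sequences[0])
    for seq in sequences[1:]:
        common &= kmers(seq)  # keep only substrings seen in every sequence so far
    return common


# Hypothetical sequences flanking three coactivated genes.
upstream = ["TTGACGTCAAG", "CCTGACGTCATT", "GATGACGTCACC"]
print(shared_kmers(upstream, k=8))  # {'TGACGTCA'}
```

A shared stretch found this way is only a candidate; as the following paragraph notes, its function would still have to be tested experimentally, for example in transgenic mice.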
In summary, the acquisition of the map and sequence of the human
genome will expand our understanding of many basic questions in
biology. To maximize this impact, it will be necessary to pursue the
analysis of genomes of organisms that can be experimentally manip-
ulated. Thus, for example, the function of regulatory sequences
detected in humans can be tested by experiments in the mouse in
which transgenic animals can be constructed with appropriately
engineered genes. Because many crucial insights may be gained from
such comparative studies, experiments in several other organisms will
inevitably be required to test the function of potentially important
human genes.
REFERENCES
Dryja, T. P., J. M. Rapaport, J. M. Joyce, and R. A. Petersen. 1986. Molecular detection
of deletions involving band q14 of chromosome 13 in retinoblastomas. Proc. Natl.
Acad. Sci. U.S.A. 83:7391-7394.
Friend, S. H., R. R. Bernards, S. Rogelj, R. A. Weinberg, J. M. Rapaport, D. M. Albert,
and T. P. Dryja. 1986. A human DNA segment with properties of the gene that
predisposes to retinoblastoma and osteosarcoma. Nature 323:643-646.
Sawyer, J. R., and J. C. Hozier. 1986. High resolution of mouse chromosomes: Banding
conservation between man and mouse. Science 232:1632-1635.
Staden, R., and A. D. McLachlan. 1982. Codon preference and its use in identifying protein
coding regions in long DNA sequences. Nucleic Acids Res. 10:141-156.
St George-Hyslop, P. H., R. E. Tanzi, R. J. Polinsky, J. L. Haines, L. Nee, P. C. Watkins,
R. H. Myers, R. G. Feldman, D. Pollen, D. Drachman, J. Growdon, A. Bruni, J.-F.
Foncin, D. Salmon, P. Frommelt, L. Amaducci, S. Sorbi, S. Piacentini, G. D. Stewart,
W. J. Hobbs, P. M. Conneally, and J. F. Gusella. 1987. The genetic defect causing
familial Alzheimer's disease maps on chromosome 21. Science 235:885-890.