

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 154
10. Conclusions and Recommendations

CONCLUSIONS

Predicting macromolecular structure from fundamental chemical principles and information on primary structure is a challenging task. Understanding macromolecular function is even more demanding. Identifying important steps toward these goals is possible, however, and we have made considerable progress in various subtasks and specialized areas. There is every reason to believe that major breakthroughs can be expected over the next 10 years.

1. The tools of molecular mechanics and molecular dynamics have proved useful for exploring the conformational space of polypeptides, oligonucleotides, and oligosaccharides. In favorable cases, they identify the most stable conformers and quantitatively probe intermolecular interactions. Although these methods have not yet successfully predicted, a priori, the structures of molecules the size of small proteins, they play a major role in the refinement of experimentally derived tertiary structures of macromolecules. Some promising results have been obtained in predicting the structural and thermodynamic consequences of local changes in amino acid sequences. Exciting new techniques make it possible to calculate free energies directly by perturbation methods. The techniques can be applied to intermolecular interactions or the changes in free energy that accompany substitution of one amino acid for another and are readily applied to nucleic acid and polysaccharide problems as well.

2. The major limitations of current methods include: the quality of the potential functions and of their parameters, especially the electrostatic terms; methods for incorporating the solvent; and global search algorithms for solving the multiple-minima problem. Each of these areas has seen notable developments. While recently introduced procedures may produce solutions, we expect effective solutions to the multiple-minima problem to await new conceptual breakthroughs.

3. Heuristic modeling has been successful in the past, particularly in predicting the double helical structure of DNA, the alpha helix, and the beta-pleated sheet. When applied to globular proteins, this approach has yielded results which, although of relatively low resolution, have proved useful in guiding experiments in pursuit of more definitive data from crystallographic or nuclear magnetic resonance (NMR) techniques.

4. Experimental and theoretical methods can be usefully combined when the goal is to elucidate a new molecular structure based on a known one. When they are appropriate, modeling efforts based on the structural homology of one protein to another are currently the strongest line of attack.

5. Direct experimental approaches to macromolecular structure have been very successful; they cannot always be applied. They are limited by the need for significant quantities of highly purified material. Acquiring sufficient amounts of many interesting proteins, glycosylated proteins, and most nucleic acids is a challenging task. The powerful diffraction techniques all have an absolute requirement for crystals. NMR has molecular weight restrictions and some constraints on ultimate resolution. It takes at best months, and frequently a year or more, to deduce a structure through crystallography or NMR.

6. Recent progress in instrumentation for crystallography has included the development of area detectors, which are only now
being fully utilized. Synchrotron sources and new neutron sources offer improved data. Isotopic labeling techniques and improved magnet technology signal new directions for NMR. We expect equally important breakthroughs in crystallization techniques.

7. Even with these advances, the most likely situation in the next decade is a substantial but essentially linear growth in the number of three-dimensional molecular structures elucidated by empirical methods. We estimate from current rates that several thousand protein and nucleic acid structures will be known in 10 years.

8. The explosive growth in the number of known nucleic acid sequences and hence protein primary sequences will continue to accelerate with or without implementation of the human genome project in the United States. Even at current rates, it is reasonable to expect 100,000 protein sequences to be described in the next decade. The overwhelming majority of new protein sequences are likely to be identifiable as members of known families of proteins.

9. Currently, the inventory of three-dimensional protein structural descriptions underrepresents the general distribution of protein families. The opportunities for computer-assisted modeling are enormous and will grow proportionately as more new structures and sequences are determined. Estimates of the number of sequences to be reported in the next decade suggest that existing facilities and resources for structural analysis will be overwhelmed by the avalanche of new sequence data.

10. Effects of covalent modification on structure and function of proteins, nucleic acids, and carbohydrates are diverse and poorly understood. No theoretical basis for predicting these effects exists in many cases. Describing structural relationships and cooperative functional roles in supramolecular systems are embryonic research areas to which modeling methods will contribute. Substantial attention will be directed toward these areas in the coming decade.

11. Computer speed, availability, and storage capacity are important limitations on the types of modeling calculations that can be attempted. Existing equipment is frequently incapable of performing all the necessary control experiments and refining major approximations. A 10-fold increase in computer performance
capability is required for conducting many current projects of biological importance systematically and rigorously. A minimum of a 100-fold improvement is needed for exploring new time scales or studying molecules of greater structural complexity than small proteins. We expect supercomputers, specialized hardware, and personal supercomputers (PSCs) to be significantly more available in the next few years. Most promising is the development during the next decade of high-capacity parallel processors.

12. A national computer network, operating at high speed and linking major government, academic, and industrial research facilities, will be crucial to molecular computation in the coming years. The uses of the network include transmission of sequence and structural data as well as access to computational facilities.

13. Of immense applied potential is the design of ligands to interact preferentially with macromolecular receptors, and the design of receptors to cause alterations of structure and/or function. These programs are in the earliest stages of development, and many hurdles must be overcome on the way from the laboratory to full clinical or commercial utility.

14. The intellectual, practical, and economic benefits of improved understanding of protein folding, macromolecular interactions, and macromolecular function are substantial.

RECOMMENDATIONS

1. The burgeoning volume of new sequence data requires a radical new policy on data banking of protein and nucleic acid sequences. A permanent national facility should be put in place as soon as possible, and considerable attention should be given to developing a data storage format that facilitates data retrieval. There should be no direct charges to the user. The initiation of this new national resource should be undertaken only after a round of detailed proposals has been sought and reviewed.
A standing advisory committee of users should be appointed by a consortium drawn from the National Institutes of Health (NIH), National Science Foundation (NSF), and Department of Energy (DOE).

2. Whether the new facility should be allied with a national laboratory, such as Los Alamos, or with the National Library of
Medicine, or should be a completely new academic or commercial enterprise remains to be determined. Until the new unit is functioning, current facilities should be maintained to ensure an orderly transition.

3. Support for the archiving of coordinate and model-derived structures should continue. The Protein Data Bank at Brookhaven and the Cambridge Crystallographic File in England currently serve this need for the national and international community. Inclusion of data from new methods of structural analysis should be encouraged.

4. We recommend in the strongest terms expanding the supercomputer initiative, funding of computer networks, improving access by the scientific community to the existing supercomputer centers at the national laboratories, upgrading those centers, and providing individual research grants for purchasing PSCs. DOE should work closely with the supercomputer project managers at NSF to provide the broadest and most versatile computer network system on a national level. NIH should become more involved in direct support of scientific supercomputer centers.

5. Although the report does not specifically address this issue, the committee felt strongly that educational opportunities in structural biology and molecular modeling should be improved. Several mechanisms are available, such as expanding graduate programs through new training grants. We recommend that NSF and DOE increase graduate fellowship and postdoctoral fellow programs in this area. Workshops have been particularly effective for transferring information and skills. These include formal hands-on training programs in molecular dynamics and molecular graphics, and working meetings of independent investigators to address critical limiting aspects of a particular problem. Such workshops, which also promote crucial interdisciplinary approaches, could be funded by NIH, NSF, or DOE, acting together or independently.

6. Innovative and interdisciplinary research proposals in both theoretical and experimental aspects of structural biology should be directly encouraged through the use of existing mechanisms.

7. We see a special role for the national laboratories, which should interact at every level of these recommendations. The
national laboratories should compete for the National Sequence Data Bank. The national laboratories and DOE have leadership status in the national computer network. They should increase efforts to make supercomputers available to the scientific community. Research efforts are going forward in molecular calculations and structural biology, with major programs at a few locations. Strengthening these efforts will assist the department's Office of Health and Environmental Research to assess the potential health and environmental effects of chemicals involved in energy processes.

Each of our recommendations involves developing some centralized activity. The issues in each area are quite different, however, and our recommendations should not be taken as a general call for more biotechnology centers.
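Conclusion 1 refers to calculating free energies directly by perturbation methods without saying which formulation is meant. As an illustration only, the sketch below uses one standard perturbation identity, F_B - F_A = -kT ln <exp(-(U_B - U_A)/kT)>_A, where the average runs over configurations sampled in state A. The two one-dimensional harmonic potentials, the function names (`fep_free_energy`, `harmonic`, `demo`), and all parameter values are assumptions chosen because the exact answer, (kT/2) ln(k_B/k_A), is known for this toy case; none of them come from the report.

```python
import math
import random


def fep_free_energy(samples, u_a, u_b, kT=1.0):
    """Perturbation estimate of F_B - F_A from configurations sampled in state A."""
    # Exponent of each Boltzmann-weighted energy difference.
    exponents = [-(u_b(x) - u_a(x)) / kT for x in samples]
    # Average in log space (log-sum-exp) so large energy gaps do not overflow.
    m = max(exponents)
    log_mean = m + math.log(sum(math.exp(e - m) for e in exponents) / len(exponents))
    return -kT * log_mean


def harmonic(k):
    """Toy potential U(x) = k x^2 / 2 (an illustrative assumption)."""
    return lambda x: 0.5 * k * x * x


def demo(k_a=1.0, k_b=2.0, kT=1.0, n=100_000, seed=0):
    rng = random.Random(seed)
    # The Boltzmann distribution of a harmonic well is Gaussian
    # with variance kT / k, so state A can be sampled directly.
    sigma = math.sqrt(kT / k_a)
    samples = [rng.gauss(0.0, sigma) for _ in range(n)]
    estimate = fep_free_energy(samples, harmonic(k_a), harmonic(k_b), kT)
    exact = 0.5 * kT * math.log(k_b / k_a)  # analytic result for this toy case
    return estimate, exact


if __name__ == "__main__":
    estimate, exact = demo()
    print(f"perturbation estimate: {estimate:.4f}   exact: {exact:.4f}")
```

In a real molecular application the samples would come from a molecular dynamics or Monte Carlo simulation of state A rather than from a closed-form Gaussian, and the estimate becomes unreliable when the two states overlap poorly, which is one reason the quality of the potential functions noted in Conclusion 2 matters.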