National Academies Press: OpenBook
Suggested Citation:"A PRIMER ON PROTEIN STRUCTURE." National Research Council. 1995. Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology. Washington, DC: The National Academies Press. doi: 10.17226/2121.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

FOLDING THE SHEETS: USING COMPUTATIONAL METHODS TO PREDICT THE STRUCTURE OF PROTEINS

A PRIMER ON PROTEIN STRUCTURE

Proteins are constructed by the head-to-tail joining of amino acids, chosen from a 20-letter alphabet. The 20 natural amino acids have a common backbone, but a variable side chain or R-group. The R-groups may be large or small, charged or neutral, hydrophobic or hydrophilic, and conformationally restricted or flexible (see Figure 9.1). It is the physical properties of these R-groups that determine the diverse structures into which a given amino acid chain will fold.

Broadly speaking, proteins can adopt fibrous or globular shapes. Repetitive amino acid sequences adopt elongated periodic fibrous structures, with common examples including elastin (skin), collagen (cartilage), keratin (hair), and β-fibroin (silk). This chapter focuses on globular proteins.

Figure 9.1 Twenty amino acids: R-groups are shown clustered by functional types: aliphatic hydrophobic, aromatic hydrophobic, hydrophilic, negatively charged, positively charged, and conformationally special.
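As a rough illustration of the 20-letter alphabet and the six functional types named in the Figure 9.1 caption, the following Python sketch tallies the residues of a sequence by type. The assignment of one-letter codes to groups is an assumption made for this example (the figure itself is not reproduced here, and textbooks disagree on residues such as cysteine, histidine, and tyrosine); the names RESIDUE_CLASSES and class_composition are hypothetical.

```python
# Illustrative grouping of the 20 amino acids (one-letter codes) into the six
# functional types named in the Figure 9.1 caption. The membership of each
# group is an ASSUMPTION for this sketch; sources differ on Cys, His, and Tyr.
RESIDUE_CLASSES = {
    "aliphatic hydrophobic": "AVLIM",
    "aromatic hydrophobic": "FWY",
    "hydrophilic": "STNQC",
    "negatively charged": "DE",
    "positively charged": "KRH",
    "conformationally special": "GP",
}

# Reverse lookup: one-letter code -> functional type.
CLASS_OF = {aa: name for name, members in RESIDUE_CLASSES.items() for aa in members}


def class_composition(sequence):
    """Count how many residues of a sequence fall into each functional type."""
    counts = {name: 0 for name in RESIDUE_CLASSES}
    for aa in sequence.upper():
        if aa not in CLASS_OF:
            raise ValueError(f"unknown residue: {aa!r}")
        counts[CLASS_OF[aa]] += 1
    return counts
```

For example, class_composition("GAVDK") counts one conformationally special residue (G), two aliphatic hydrophobic residues (A, V), one negatively charged residue (D), and one positively charged residue (K).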

The enzyme ribonuclease, which catalyzes the breakdown of ribonucleic acid (RNA), provides a useful example. The sequence contains 124 amino acids. Under appropriate conditions, the amino acid chain is covalently cross-linked in four locations through disulfide bridges between cysteines in the protein chain. (The amino acid cysteine has a reactive sulfur atom that forms such bridges, which provide the only covalent bonds joining nonneighboring amino acids in the chain.)

In a classic series of experiments, Anfinsen et al. (1961) demonstrated that the amino acid sequence of ribonuclease contained enough information to code for the folded structure. Specifically, they showed that ribonuclease lost its enzymatic activity in the presence of a chemical denaturant (which disrupted the protein's structure) but spontaneously regained its activity when the denaturant was removed. Even when the disulfide pairings were scrambled after denaturation, renaturation could occur. Thus, without any outside assistance, the protein could refold. Independent of the starting conformation, the amino acid sequence contains sufficient information to direct the chain to the correct folded structure. Similar experiments have been repeated with many other proteins.

This work suggests that proteins follow an energy gradient from the denatured state to the native state. The free energy difference between these two states favors the folded state, and the height of the activation barrier along the folding pathway governs the rate of chain assembly (see Figure 9.2).

Recently, molecular biologists have discovered that some proteins can assist the folding process. These proteins, dubbed foldases, include the chaperonins (Kumamoto, 1991) that prevent proteins from assembling inside an undesirable cellular compartment, prolyl isomerases that increase the rate of the cis-trans isomerization of the amino acid proline (Fischer and Schmid, 1991), and protein disulfide isomerases (Freedman, 1989), which shuffle disulfide bridges. While it is conceivable that these foldases might take a protein to a kinetically trapped final state different from the state of lowest free energy, this seems unlikely. Instead, I imagine that these foldases simply lower the activation barrier to folding into the lowest energy state. In the absence of an appropriate foldase, the activation barrier might in some cases be so high that protein folding will not occur on a biologically sensible time scale.
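The two thermodynamic quantities in this picture, the free energy difference between the denatured and native states and the height of the activation barrier, can be made concrete with a small Python sketch. It assumes a simple two-state equilibrium and an Arrhenius-type dependence of the rate on barrier height; neither model is specified in the text, and the function names and the numerical values in the usage note are hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol K)


def fraction_folded(delta_g_kj, temperature_k=298.15):
    """Equilibrium fraction of chains in the native state (two-state ASSUMPTION).

    delta_g_kj is the amount by which the native state is more stable than
    the denatured state, in kJ/mol (positive values favor folding).
    """
    x = delta_g_kj * 1000.0 / (R * temperature_k)
    return 1.0 / (1.0 + math.exp(-x))


def rate_speedup(barrier_reduction_kj, temperature_k=298.15):
    """Factor by which folding speeds up when the activation barrier is
    lowered by barrier_reduction_kj (Arrhenius-type ASSUMPTION, with the
    pre-exponential factor unchanged)."""
    return math.exp(barrier_reduction_kj * 1000.0 / (R * temperature_k))
```

Under these assumptions, a stability of zero gives fraction_folded(0.0) == 0.5, and lowering the barrier by a hypothetical 10 kJ/mol at room temperature speeds folding by a factor of several tens. The sketch shows why a foldase that only lowers the barrier changes the rate without changing which state is favored at equilibrium.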

Figure 9.2 Thermodynamics of protein folding: the folding chain must surmount a free energy barrier to move from the denatured to the native state. The native state is more stable than the denatured state by free energy ∆G.

One reason for the tremendous interest in the protein folding problem is that it has become simple to determine the amino acid sequence of large numbers of proteins while it remains difficult to determine the structure of even a single protein. The first protein sequences were laboriously determined by classical biochemical methods (Konigsberg and Steinman, 1977). The proteins in question were isolated, purified to homogeneity, and enzymatically digested into smaller fragments. The amino acids in each fragment were then chemically cleaved, one residue at a time, starting from one end and proceeding through each successive residue. Automated methodologies and improved chemistry accelerated this process, but protein sequencing remained a tedious task until molecular biology supplied a different approach (Maxam and Gilbert, 1980). By determining the deoxyribonucleic acid (DNA) sequence of the gene encoding the protein (using methods that were quite rapid), one could infer the amino acid sequence of the protein by simply translating the DNA codons according to the genetic code. The approach is much faster and more reliable than direct protein sequencing. The advent of this technology has brought a flood of tens of thousands of protein sequences.
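The step of translating DNA codons according to the genetic code can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code, a coding-strand sequence that is already in frame, and one-letter amino acid codes; the function name translate and the example sequence are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame coding-strand DNA sequence into one-letter
    amino acid codes, stopping at the first stop codon.

    Trailing bases that do not fill a codon are ignored; characters other
    than A, C, G, T raise KeyError.
    """
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(translate("ATGGCCAAATAA"))  # prints "MAK"
```

Real genes add complications that this sketch ignores, such as choosing the reading frame and strand and, in many organisms, removing introns before translation.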

By comparison, the rate at which new protein structures are determined remains a trickle because structure determination remains a formidable experimental task. X-ray crystallography was the first technique used to determine the structure of proteins (Kendrew, 1963). One must first coax a protein to crystallize with sufficient regularity to diffract X-rays. Then the crystal must be bombarded with X-rays and the X-ray diffraction pattern collected, either on film or with an electronic detector system.

In principle, the X-ray diffraction pattern corresponds to a Fourier transform $\hat{D}$ of the electron density $D$ of the crystal, with the amplitude and phase of the signal at each point corresponding to the amplitude and phase of the corresponding complex Fourier coefficient. Unfortunately, detectors can record only the amplitude, not the phase. Solving an X-ray crystal structure thus involves determining the density $D$ from $|\hat{D}|$, which can be a formidable task. In general, the problem is underdetermined.

A mathematical approach is to add constraints (for example, $D$ must be everywhere positive, since it represents a density). An experimental approach is to use additional information from the X-ray diffraction pattern obtained when the protein is crystallized in the presence of a heavy atom (for example, mercury, uranium, or platinum) or anomalous scatterers (for example, selenium) bound to the protein in a covalent or non-covalent fashion. The difference between the original and modified patterns, or between the patterns as a function of X-ray wavelength, provides the missing phase information. Although the approach is very powerful, it requires that the protein architecture not be significantly changed by this molecular perturbation, and it is more successful when several derivatives are available for study (Blundell and Johnson, 1976).

Finally, one can start with a good guess at the protein structure. The Fourier transform of this structure yields a set of intensities and phases. The hypothetical structure is rotated and translated until the intensities match the experimental data. If the correlation between the hypothetical and actual structure is strong, then the structure determination can succeed without the need for heavy atom derivatives.

More recently, nuclear magnetic resonance (NMR) spectroscopy has been used to determine protein structure (Wuthrich, 1986). Pairs of hydrogen atoms (protons) produce resonances when they lie in neighboring positions in the protein chain or when they lie very close together in space. By determining the correspondence of resonances with
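The crystallographic phase problem described above, in which detectors record the amplitudes of the Fourier coefficients but not their phases, can be illustrated with a one-dimensional toy example. The Python sketch below is an assumption-laden illustration, not a crystallographic method: it uses a small discrete Fourier transform on eight hypothetical sample values and constructs a second, different "density" with exactly the same amplitudes by discarding the phases.

```python
import cmath


def dft(values):
    """Discrete Fourier transform of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * x / n) for x, v in enumerate(values))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse of dft()."""
    n = len(coeffs)
    return [
        sum(c * cmath.exp(2j * cmath.pi * k * x / n) for k, c in enumerate(coeffs)) / n
        for x in range(n)
    ]


def amplitudes(values):
    """What a detector would record: |coefficient| with the phase lost."""
    return [abs(c) for c in dft(values)]


# Hypothetical one-dimensional "electron density" on 8 grid points.
density = [0.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]

# Keep the measured amplitudes but set every phase to zero, then invert.
# For real input the amplitudes are symmetric, so the result is real up
# to rounding error.
zero_phase = [z.real for z in inverse_dft(amplitudes(density))]

# The two functions differ, yet their amplitudes agree.
largest_amplitude_gap = max(
    abs(a - b) for a, b in zip(amplitudes(density), amplitudes(zero_phase))
)
largest_density_gap = max(abs(a - b) for a, b in zip(density, zero_phase))
```

Here largest_amplitude_gap is zero up to rounding error while largest_density_gap is not, which is one way to see why amplitudes alone leave the problem underdetermined and why extra constraints or extra measurements are needed to recover the phases.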
