
Suggested Citation: "8 Structure and Function of Complex Carbohydrates." National Research Council. 1987. Computer Assisted Modeling: Contributions of Computational Approaches to Elucidating Macromolecular Structure and Function. Washington, DC: The National Academies Press. doi: 10.17226/1136.



8 Structure and Function of Complex Carbohydrates

Complex carbohydrates are very common in animals, plants, and bacteria. They are constituents of cell membranes as well as of subcellular materials. They are also found in physiological fluids such as blood, tears, milk, and urine. It was estimated recently that the covalent structures of between 4,000 and 6,000 natural carbohydrates have been determined (DOE, 1987).

Many complex carbohydrates are unsubstituted at their reducing ends and are referred to as polysaccharides; examples include the oligosaccharides of milk, the cellulose of plant cell walls, and storage forms such as starch and glycogen. Many other naturally occurring complex carbohydrates are covalently connected to other molecules, such as proteins or lipids, by glycosidic linkages of the sugar residues at their reducing ends to form glycoconjugates.

BIOLOGICAL FUNCTION

Glycoproteins have many functions in higher organisms. Collagen is an important structural element in the extracellular space and in cartilage, bone, and basement membranes. Mucins are significant as lubricants and protective agents in mucous secretions. Important immunological molecules of the glycoprotein class include the immunoglobulins, histocompatibility antigens, blood group antigens of the ABO and Lewis types, complement in the blood clotting mechanism, and interferon. Many human plasma proteins such as fetuin, transferrin, and ceruloplasmin are glycoproteins, as are several of the hormones such as chorionic gonadotropin and thyrotropin. Most of the animal and plant lectins are glycoproteins, as are the lysosomal enzymes. The recognition and binding of lysosomal enzymes to specific receptors in the Golgi apparatus and on the cell surface involves one or more phosphorylated mannose residues on N-linked oligosaccharide chains. Recognition sites on cell surfaces for binding and uptake of hormones and for interactions with other cells, viruses, and bacteria are also glycoproteins.

Many of the cell surface functions of glycoproteins have also been proposed for the neutral and acidic glycosphingolipids. In addition, certain glycosphingolipids of the ganglioside class have been found recently to inhibit the mitogenic response of cell growth factors by allosteric modulation of their cell surface receptors (Bremer et al., 1986). Oncogenic transformation by viral infection or chemical mutagens usually leads to alterations in the cell surface pattern of glycosphingolipids such that certain types increase greatly in quantity. In some cases, there are also qualitative differences due to the expression of genes that are silent in the differentiated normal cells. This is particularly important in tumor cells, where tumor-associated antigens may provide a basis for specific monoclonal antibody-based diagnostic assays and eventually, perhaps, treatment.

The binding between glycosaminoglycans and other extracellular macromolecules contributes significantly to the structural organization of connective tissue matrix. All of the glycosaminoglycans, except those that lack sulfate groups or carboxyl groups, bind electrostatically to collagen at neutral pH because of their remarkable anionic character.
Dermatan sulfate, which appears to be the major glycosaminoglycan synthesized by arterial smooth muscle cells, binds strongly to plasma lipoproteins, and heparin also interacts with several plasma proteins, including clotting factors IX and XI and antithrombin III. Interestingly, the 1:1 stoichiometric binding of heparin to Lys residues of antithrombin III is believed to induce a conformational change in antithrombin III that increases the binding of antithrombin III to thrombin. This binding inactivates the thrombin. Hyaluronic acid is deposited on the surface of Petri dishes by cells growing in tissue culture, giving them a substratum for attachment during growth. The proteoglycans have also been implicated in the regulation of cell growth, possibly through nuclear effects on chromatin structure and activation of DNA polymerase, and may mediate cell-cell communication and the shedding of cell surface receptors.

BIOSYNTHESIS OF N-LINKED GLYCOPROTEINS AND GLYCOSPHINGOLIPIDS

The role of carbohydrates in biological function poses a particularly challenging problem for the future. The synthesis of these glycoconjugates occurs during their intracellular transport from the site of initial assembly of a lipid-linked intermediate (glycoproteins) or ceramide (glycosphingolipids) in the endoplasmic reticulum, through the Golgi apparatus, to the cell surface, intracellular organelles, or extracellular space. Their synthesis requires a family of activated sugar donors called sugar nucleotides that are synthesized in the cytosolic fraction of cells from sugar phosphates and nucleoside triphosphates. An interesting exception is the sugar nucleotide of sialic acid, called cytidine monophosphate sialic acid (CMP-NeuAc), which is synthesized in the nucleus from free sialic acid and CTP. The enzymes involved in glycoconjugate biosynthesis are glycosyltransferases that catalyze the transfer of sugar residues from the sugar nucleotides to the nonreducing end of a growing carbohydrate chain.

The distinction between glycoconjugate biosynthesis and protein synthesis is key; the latter occurs on a template of messenger RNA and is therefore determined by the genetic code for a single structural gene.¹ In sharp contrast, glycoconjugate synthesis is accomplished by the stepwise addition of sugar units using a different enzyme for each step. Therefore, no single DNA sequence is involved in determining the primary structure of the complex carbohydrate, since the order in which sugars are added depends on the substrate specificities and kinetic characteristics of the different glycosyltransferases, each of which is coded by a different structural gene. It is clearly impossible to predict the primary structures of complex carbohydrates from DNA sequences. Therefore, the three-dimensional structures of glycoproteins, glycosphingolipids, and other complex carbohydrate-containing molecules can never be completely predicted without experimental structural analysis of the carbohydrates.

¹ Actually, it is more appropriate to refer to "one cistron-one polypeptide". This is no longer strictly accurate either, as more than one gene may contribute to the primary structure of a protein, e.g., immunoglobulins.

Snider (1984) reported that glycoproteins of the N-linked type are synthesized as a cotranslational event in the rough endoplasmic reticulum. While the polypeptide chain is being translated on a messenger RNA and concurrently passed through the endoplasmic reticulum membrane into the cisternal space (lumen), a single oligosaccharide is coordinately synthesized on a phosphorylated polyisoprenoid alcohol (dolichol in higher animals and smaller, similar substances in insects, yeast, and plants). The entire precursor oligosaccharide is then transferred to appropriate asparagine residues on the nascent polypeptide chain (probably before folding into a tertiary structure) according to rules of specificity that are not completely understood. Transfer requires an Asn-X-Ser or Asn-X-Thr sequence, but additional factors are involved as well. Accessibility of the Asn residue may be one such factor, and this possibility could be assessed by the predictive methods described in this report.

The second stage of N-linked glycoprotein synthesis involves extensive posttranslational modification of the protein-linked precursor oligosaccharide by the removal and addition of sugars. In many cases the protein moiety is also modified by partial proteolytic cleavages and/or the addition of function-modifying groups on specific amino acid residues. Posttranslational modification is initiated in the rough endoplasmic reticulum by the removal of the three glucose residues by two specific membrane-bound glucosidases. These glucose residues appear to have the sole function of enabling transfer of the oligosaccharide chain from dolichol pyrophosphate to nascent polypeptide chains.
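
The Asn-X-Ser / Asn-X-Thr requirement for oligosaccharide transfer noted above is easy to scan for in a protein sequence. The following is a minimal Python sketch, not a method from this report: the function name `find_sequons`, the demo sequence, and the optional `exclude_proline` flag (which reflects the commonly cited observation, not stated in this chapter, that proline at position X disfavors glycosylation) are all illustrative assumptions. A match marks only a candidate site; as the text stresses, additional factors such as accessibility of the Asn residue decide whether transfer actually occurs.

```python
import re


def find_sequons(sequence, exclude_proline=False):
    """Return 1-based positions of Asn residues that start an Asn-X-Ser/Thr tripeptide.

    sequence        -- protein sequence in one-letter code
    exclude_proline -- if True, skip sites where X is proline (an assumption
                       that goes beyond the rule as stated in the chapter)
    """
    x = "[^P]" if exclude_proline else "."
    # A zero-width lookahead reports overlapping candidates such as "NNSS".
    pattern = re.compile(f"(?=N{x}[ST])")
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    demo = "MKNLTAANPSGGNNSSQ"  # made-up sequence, for illustration only
    print(find_sequons(demo))                        # [3, 8, 13, 14]
    print(find_sequons(demo, exclude_proline=True))  # [3, 13, 14]
```

A scan of this kind only narrows the list of asparagines worth examining with structure-based predictions of accessibility.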
It will be interesting to determine from three-dimensional structures and predicted conformations how these glucose residues interact with the transferase enzyme involved at this step. Mature high mannose oligosaccharide chains are synthesized by the subsequent removal of up to four mannosyl residues from the three branches of the precursor structure. At least three different alpha-mannosidases in the Golgi apparatus are involved in this process. These enzymes and the two glucosidases are hydrolases like lysosomal glycosidases, but their activities are greatest at neutral pH, in contrast with lysosomal enzymes, which have their greatest catalytic activity at an acid pH.

FIGURE 8-1 Intermediate partially processed asparagine-linked carbohydrate chain of a glycoprotein. [Diagram not reproduced: a five-mannose structure with branches I, II, and III, linked through GlcNAc-GlcNAc(Fuc) to Asn.]

In eukaryotic cells, the high mannose oligosaccharide with five mannose units (see Figure 8-1) is the direct precursor of complex and hybrid structures. The initial step in the Golgi apparatus is the addition of an N-acetylglucosamine residue to the last remaining Man on branch I (*), after which the remaining two Man residues on branches II and III can be removed by alpha-mannosidases that are almost certainly different from those involved in earlier steps. Additional branches may be made at this point to produce tri- and tetra-antennary structures, and the final stages of processing are carried out by the addition of galactose, N-acetylglucosamine, sialic acid, and fucose residues to give mature, complex, N-linked chains. An interesting N-acetylglucosaminyltransferase may add a beta-1,4-linked GlcNAc residue to the branched beta-linked mannose residue of the inner core region (°) to give a "bisected structure." This step has been the subject of intensive study by Carver and coworkers, who have been interested in the structural specificity of the enzyme with different conformations of the precursor oligosaccharides (Carver and Brisson, 1984).

It is likely that predictive methods will be employed in studies of processing pathways and the extent of processing of oligosaccharide chains. If control arises from an enzyme specificity for a particular three-dimensional structure of the substrate, it may be possible to determine these preferences and, from predictions of the distributions of three-dimensional structures of the oligosaccharide attached to the glycoprotein substrate, predict how far the carbohydrate chain will be processed.

Lysosomal enzymes contain one or more phosphate groups on mannose residues of the high mannose type oligosaccharide chains. The mannose-6-phosphate groups are specific recognition markers that are involved in the transport of lysosomal enzymes from the Golgi apparatus or outside the cells into lysosomes. Two membrane-bound mannose-6-phosphate receptors have been discovered in the plasma membrane; at least one of them also resides in the Golgi membranes. Although their binding specificities have been probed in some detail, other aspects have not been determined: the nature of the interaction of the phosphorylated mannose residues with the receptors and the three-dimensional structures of the lysosomal enzyme-receptor complexes.

Another interesting aspect of lysosomal enzyme synthesis involves the determination of structural domains on the folded proteins recognized by the enzyme that initiates phosphorylation of mannose residues, which is an N-acetylglucosamine-phosphotransferase (GlcNAc-P transferase) in the Golgi apparatus. This is the mechanism by which only lysosomal enzyme proteins are selected for phosphorylation. It is especially important because one form of a genetic lysosomal storage disorder, called mucolipidosis II, results from a defect in the binding domain of the GlcNAc-P transferase for lysosomal enzyme proteins. Perhaps this problem can be solved only by computer modeling to predict the three-dimensional structures of both proteins.
Glycosphingolipids are synthesized in an analogous manner, except that ceramide serves the function served by dolichol for glycoproteins and transfer occurs directly from a sugar nucleotide to the acceptor glycolipid. Ceramide is an acceptor for either glucose (from UDP-Glc) or galactose (from UDP-Gal), giving glucosylceramide or galactosylceramide. These simple glycosphingolipids predominate in human plasma and the brain, respectively, and also serve as precursors for more complex glycosphingolipids. In most organs, including the brain, the major pathways involve conversion of glucosylceramide to lactosylceramide, Gal-beta-1,4-Glc-Cer. Lactosylceramide is the substrate for several glycosyltransferases, the products of which are the first intermediates in the synthesis of related glycosphingolipids that may be classified according to their general structural characteristics. More than 100 different glycosphingolipids have already been characterized, and new compounds are still being discovered. Although some of the glycosphingolipids may contain between 15 and 35 or more sugar residues, most of the commonly occurring types have between 4 and 10 residues in the oligosaccharide chain.

ANALYSIS OF PRIMARY AND TERTIARY STRUCTURE

A complete understanding of the interactions between carbohydrates and proteins (enzymes, lectins, antibodies, and cell surface receptors) will depend on the determination of accurate three-dimensional structures of both kinds of molecules. As was noted, the primary structures of the oligosaccharide chains of complex carbohydrates cannot be deduced from DNA sequences and so must be determined by chemical and spectroscopic analysis. Modern chromatographic methods of separation, along with mass spectrometry and nuclear magnetic resonance (NMR), allow us to carry out complete analysis of a primary structure on a one micromole sample. The features to be determined are composition; arrangement of sugar residues; ring size; positions of glycosidic linkages and their anomerity; and the location and the chemical nature of noncarbohydrate substituents such as lipids, sulfate, and phosphate groups.

Three-dimensional structures of carbohydrates represent the spatial arrangements of the individual sugar residues. Most commonly occurring mammalian complex carbohydrates consist of sugar residues that exist in the pyranose ring form, the most stable and rigid conformations of which are the chair forms.
When two sugar residues are joined together covalently in a glycosidic linkage, they are free to rotate about the two bonds to the glycosidic oxygen atom between the two rings, and the resulting disaccharide can therefore assume a number of different conformations corresponding to the rotations about these two bonds. It is customary to designate the dihedral angles at the glycosidic linkage (see Figure 8-2) by the Greek symbols phi (φ) and psi (ψ), where the initial conformation (φ = 0°, ψ = 0°) is that conformer where the C-1-H-1 bond eclipses O-C'-X' and C-1-O eclipses C'-X'-H-X'.

FIGURE 8-2 Dihedral angles determining the spatial relationship of two sugar residues in a disaccharide. [Diagram not reproduced.]

The relative orientations of adjacent sugar residues in an oligosaccharide chain are described by specifying the rotational angles (φ, ψ) at each glycosidic oxygen atom. When these angles are the same at each linkage, the chain has a helical conformation with n residues per turn and h unit translation along the helical axis. If n and h are available from x-ray data, then φ and ψ can be computed and vice versa. If φ and ψ are different among glycosidic linkages in an oligosaccharide chain, the three-dimensional structure becomes non-periodic and, for extreme variations, assumes a random coil conformation. Information about perturbations can be obtained by light-scattering, viscosity, sedimentation, and diffusion measurements.

X-RAY ANALYSIS OF CRYSTAL STRUCTURES OF CARBOHYDRATES

Of the three major classes of complex biological molecules, we have the least structural information at atomic resolution about carbohydrates. This is because they have not been crystallized, and consequently there is no relevant crystal structure data base other than that of the simple monomers to trimers upon which to model classical or semiempirical quantum mechanical calculations. The blood group-specific oligosaccharides, cord factors, and lipids A and X are typical examples. Exceptions are the cyclodextrins, which crystallize well but are conformationally a separate class. Structures derived from the fiber patterns of polysaccharides are model-dependent and do not constitute a source of definitive structural data. Stachyose, an oligosaccharide consisting of four sugar residues, is the largest noncyclic oligosaccharide for which there is a crystal structure analysis, but even in this case, the associated water structure has not been determined.

The crystallinity problem is only partially intrinsic. Carbohydrates do not solvate the same way as proteins, oligonucleotides, or nucleic acids. However, fewer efforts have been made to obtain the significant amounts of configurationally homogeneous material needed to conduct crystallization experiments than were made for proteins and nucleic acids. Another aspect of the crystallography of glycoconjugates is that the electron density for the oligosaccharide portion of glycoproteins has rarely been interpreted, even though several crystalline glycoproteins have been studied. This is because the standard refinement programs cannot handle the oligosaccharides, or there is microheterogeneity at the site of glycosylation, and so the oligosaccharide is left out of the model. Thus, a potentially valuable source of information is not being exploited for lack of appropriate program development or strategic approaches to deal with microheterogeneity.

Steric considerations about the minimum approach distances between atoms, derived from observed nonbonded distances in various crystal structures, can be used to predict allowed conformations. This "hard spheres" approach, which was originally developed by V.S.R. Rao in the mid-1970s, is a rudimentary method of theoretical calculation that ignores electrostatic effects (hydrogen bonding), but does give a qualitative prediction of structure.
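
The hard-sphere criterion described above reduces to a simple test: a conformation is allowed only if no pair of nonbonded atoms comes closer than a minimum contact distance. The Python sketch below illustrates that test under stated assumptions. The contact limits in `MIN_CONTACT` are illustrative placeholder values in ångströms, not parameters taken from this chapter or from Rao's work, and the function name `is_allowed` and the `excluded_pairs` argument are hypothetical.

```python
import itertools
import math

# Illustrative minimum contact distances in angstroms, keyed by element pair.
# These numbers are assumptions for the sketch, not values from the chapter.
MIN_CONTACT = {
    frozenset(["C"]): 3.0,        # C...C
    frozenset(["C", "O"]): 2.7,   # C...O
    frozenset(["C", "H"]): 2.2,   # C...H
    frozenset(["O"]): 2.7,        # O...O
    frozenset(["O", "H"]): 2.2,   # O...H
    frozenset(["H"]): 1.9,        # H...H
}


def is_allowed(atoms, excluded_pairs=frozenset()):
    """Return True if no nonbonded atom pair violates its minimum contact distance.

    atoms          -- list of (element, (x, y, z)) tuples, coordinates in angstroms
    excluded_pairs -- set of frozenset({i, j}) index pairs to skip, e.g. atoms
                      that are covalently bonded or share a bonded neighbor
    """
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in itertools.combinations(
        enumerate(atoms), 2
    ):
        if frozenset((i, j)) in excluded_pairs:
            continue
        if math.dist(xyz_i, xyz_j) < MIN_CONTACT[frozenset((el_i, el_j))]:
            return False  # steric clash: conformation is disallowed
    return True


if __name__ == "__main__":
    clash = [("H", (0.0, 0.0, 0.0)), ("H", (1.5, 0.0, 0.0))]
    clear = [("H", (0.0, 0.0, 0.0)), ("H", (2.5, 0.0, 0.0))]
    print(is_allowed(clash), is_allowed(clear))
```

Applying such a test to disaccharide coordinates generated over a grid of the two glycosidic dihedral angles would give a map of sterically allowed regions. Generating those coordinates requires a geometry builder that is outside the scope of this sketch.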
The hard-sphere approach was subsequently extended by adapting energy calculations originally used for peptides, where the potential energy is divided into functions that describe discrete contributions such as van der Waals energies, electrostatic interactions, torsional energy, hydrogen bond energy, and bond and angle deformations (Bock, 1983). The data are presented in the form of computer-generated energy contour maps.

In much of the recent literature, conformational energy calculations have been made using a form of Rao's parameters with an added torsional potential about one of the glycosidic bonds (exoanomeric effect). This approach, which goes by the name HSEA (hard-sphere exoanomeric) method (Bock, 1983), has been used with success by Lemieux and Bock (1983), Carver and Brisson (1984), and others, although it contains a number of untested assumptions. The addition of a hydrogen bond potential (HEAH method) yields energy minima, and hence geometries, that differ from those obtained by the HSEA method.

NMR SOLUTION STRUCTURES OF CARBOHYDRATES

Proton NMR methods provide detailed experimental data from which three-dimensional structures can be determined and compared with conformations arrived at by potential energy calculations. Carver and Cumming (1987) have generated contour maps of computed NOEs of various high mannose oligosaccharides as a function of the torsional angles φ and ψ. They then related them to experimental results as well as to minimum energy conformations estimated by various potential energy calculations (Carver and Cumming, in press). Brisson and Carver (1983) evaluated the utility of this approach using two biantennary complex type glycopeptides (see Figure 8-3). Since the NOE-derived conformations were within a range centered on the minimum energy conformations derived from potential energy calculations, it was concluded "that motional averaging is confined to a narrow range about one stable conformation" (Brisson and Carver, 1983).

It now appears, however, that it is meaningless to seek a single NOE-derived conformation that satisfies a single potential energy minimum, because the molecules in fact may occupy such minima for a very small proportion of the time in solution. "Conformational flexibility must be incorporated into the theoretical treatment" (Carver and Cumming, 1987), and the calculation of energy surfaces becomes extremely important. The latest studies by Cumming and Carver indicate that NOE-determined three-dimensional structures may differ significantly from any minimum energy conformation.
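
A short numerical sketch may help show why an NOE-derived structure can differ from every minimum energy conformation. It rests on two simplifying assumptions that are not stated in the text: that NOE intensity scales as the inverse sixth power of the interproton distance, and that the observed NOE is a simple population-weighted average over conformers (the appropriate averaging in practice depends on the time scale of the motion). The function name and the distances below are hypothetical.

```python
def noe_effective_distance(distances, populations):
    """Apparent interproton distance implied by a population-averaged NOE.

    distances   -- interproton distance in each conformer (angstroms)
    populations -- relative population of each conformer (any positive weights)

    Assumes NOE intensity is proportional to r**-6 and that conformers
    contribute in proportion to their populations (a simplification).
    """
    total = sum(populations)
    mean_inv_r6 = sum(p * r ** -6 for r, p in zip(distances, populations)) / total
    return mean_inv_r6 ** (-1.0 / 6.0)


if __name__ == "__main__":
    # Two hypothetical, equally populated conformers with the proton pair
    # 2.2 and 4.0 angstroms apart.
    r_eff = noe_effective_distance([2.2, 4.0], [0.5, 0.5])
    print(f"apparent distance: {r_eff:.2f} angstroms")
```

Under these assumptions the apparent distance falls between the two true distances and is weighted strongly toward the shorter one, so it matches neither conformer nor their arithmetic mean. A structure built to satisfy it therefore need not correspond to any conformation the molecule actually populates.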
Cumming and Carver have concluded that NOE-derived conformations that differ from every minimum energy conformation might correspond to "virtual" conformations, defined by Jardetzky (1980) as computed structures that few if any molecules in solution actually adopt.

Scarsdale et al. (in press) have employed a molecular mechanics-based program in an effort to model conformational averaging of NMR data. Conformations were calculated using a combination of molecular potentials and NMR data for the oligosaccharide moiety of an erythrocyte glycolipid composed of three neutral sugars and an amino sugar. The lowest energy conformer closely resembled a structure proposed earlier. However, fits to data could be improved when two equilibrating conformers were considered. Thus, it may be possible to determine solution conformations of the complex carbohydrates, even in nonrigid cases, using a combination of calculations and constraints imposed from experimental NMR data.

FIGURE 8-3 Structures of two partially processed asparagine-linked carbohydrate chains. The bisecting β1,4GlcNAc of B causes a conformational difference from that of A. [Structure diagrams A and B not reproduced.]

Despite the questions raised about the interpretation of NMR results and the value of potential energy minimizations, some important information has been collected about interactions of carbohydrate antigens with antibodies (Lemieux et al., 1985), oligosaccharides with lectins such as concanavalin A (Sekharudu et al., 1986), and oligosaccharides with glycosyltransferase enzymes (Carver and Cumming, 1987). Further refinements will depend upon the development of an agreed-on set of potential energy functions, which can be used with experimentally determined NOE-derived three-dimensional structures to evaluate whether a given molecule is distributed among several low energy conformations or occupies a particular subset of them. Tvaroska and Perez (1986) have recently compared several conformational energy calculations and proposed a general strategy for oligosaccharides.

Computer time and access to appropriate parallel-processing array processors are important considerations in determining the level of support of research in this area at the present time. The availability of machines to calculate interatomic distances and van der Waals contributions extremely fast is a question that should be addressed by funding agencies. Interestingly, the several supercomputers currently operating on campuses have not been used to their capacity; perhaps efforts should be directed by appropriate advisory groups at these centers toward developing necessary software for these computers and establishing a policy that would direct a portion of their time to computer modeling of three-dimensional structures.

SUPRAMOLECULAR STRUCTURE

Structures that consist of more than one macromolecule interact as a unit in biological phenomena such as catalysis by many enzymes, binding at a cell surface, and signal transduction across cell membranes. Any enzyme that consists of more than one subunit should be thought of as a supramolecular structure. When large numbers of subunits are involved, and perhaps carry out more than one function, special consideration may have to be given to their relative spatial orientations. Examples are the replication of DNA by DNA polymerases, where complexes containing 10 or 12 proteins (called primosomes) are required to initiate replication. Ribosomes are even more complex, requiring at least 75 proteins to translate messenger RNA.

Surfaces that consist of more than one macromolecule often behave as a functional unit. For example, the uptake of cholesterol by many cells requires the interaction of a specific cell surface receptor with a polypeptide surface of a complex supramolecular structure called low density lipoprotein (LDL), which consists of protein, cholesterol, phospholipids, and triacylglycerol. Alteration of the LDL protein by acetylation of a Lys residue blocks the binding of LDL to its receptor and uptake of cholesterol by the cell. Several hormones, including norepinephrine and epidermal growth factor (EGF), and other signals such as light (with rhodopsin) induce protein phosphorylation.
EGF stimulates the growth of normal fibroblasts by binding to a specific transmembrane protein receptor on the cell surface. The hormone signal in this case is transduced by self-phosphorylation of the receptor on the intracellular side after the hormone binds, followed by other kinase-catalyzed phosphorylations of proteins, internalization of the EGF-EGF receptor complex, and a complex set of consequences in the nucleus and elsewhere in preparation for cell division. Bremer et al. (1986) recently found that GM3 ganglioside inhibits this process in an allosteric fashion by preventing the self-phosphorylation of EGF receptor after EGF binding. To accomplish this, GM3 in the outer half of the cell membrane must interact with a domain of the polypeptide chain of EGF receptor, probably causing a conformational change that prevents phosphorylation. A similar situation involving a lipid membrane is found with a mitochondrial enzyme, beta-hydroxybutyric dehydrogenase, which is catalytically active only when incorporated into a lipid bilayer composed of certain phospholipids. Computer-assisted mathematical modeling of such supramolecular structures will be necessary to gain a deeper understanding of the organization of biological materials for complex functions.
