Genome sequencing and annotation have enabled the development of genome-scale constraint-based metabolic models for hundreds of microbes. These models have been used to characterize and predict the metabolic potential and behavior of a diverse collection of bacteria, eukaryotes, and archaea, including those with medical, biotechnological, and environmental applications. Initial models were built to study individual microbes grown in monoculture; however, over the past 10 years, modeling efforts have been extended to study metabolic interactions between microbes in synthetic and natural microbiomes. The remaining sections describe how constraint-based models are built from genomic information, and how these models have been used to answer qualitative and quantitative questions regarding cellular metabolism for individual species and microbial communities.
RECONSTRUCTING METABOLIC NETWORKS AND BUILDING CONSTRAINT-BASED MODELS
Constraint-based metabolic models are built from an organism’s genome-scale metabolic network reconstruction. A metabolic reconstruction details the enzymatic and transport reactions that an organism can catalyze and the genes responsible for these reactions. An organism’s genome annotation is one of the primary sources of information used to reconstruct a metabolic network. Metabolic and transport genes are identified, and elementally balanced and charge-balanced reactions associated with these genes are included in the reconstruction. Because many reactions commonly occur across species, a variety of metabolic databases and tools can be used to facilitate reconstructing metabolic networks (Hamilton and Reed, 2014). Databases such as KEGG (Kanehisa and Goto, 2000), MetaCyc (Krieger et al., 2004), and Model SEED (Henry et al., 2010) can be used to translate genome annotations into draft metabolic reconstructions. These reconstructions may contain metabolic gaps due to missing reactions, which occur spontaneously or are associated with genes that are incorrectly or incompletely annotated. These metabolic gaps can be identified and resolved by converting these metabolic reconstructions into constraint-based metabolic models.
a Department of Chemical and Biological Engineering, University of Wisconsin–Madison.
* Corresponding Author: firstname.lastname@example.org.
Constraint-based metabolic models calculate intracellular flux distributions that satisfy three fundamental types of constraints (Price et al., 2004). The first type of constraint is a steady-state mass-balance constraint, which sets the total production and consumption rates for each metabolite to be equal. This ensures that there is no net accumulation or depletion of intracellular metabolites. These mass-balance constraints can be used when metabolism is in a steady or a quasi-steady state. The second type of constraint is associated with reaction reversibility and ensures that irreversible reactions can only operate in the appropriate directions. This reversibility constraint was traditionally derived from biochemical and physiological data, but more recently it has also been determined using thermodynamic estimates of the change in Gibbs energy due to a reaction (Henry et al., 2007; Fleming et al., 2009). The third type of constraint is an enzyme capacity constraint. For a subset of reactions where the flux capacities are known or measured, upper and lower bounds for fluxes can be imposed. In most cases, capacity constraints limit a small number of fluxes that can easily be measured experimentally, such as growth rates, nutrient uptake rates, or product secretion rates. Together these three types of constraints define a solution space of possible intracellular flux distributions. Since there are often multiple solutions to constraint-based models, optimization can be used to identify optimal flux distributions, including those that maximize biomass yields, minimize enzyme usage or total flux, and minimize flux changes (Orth et al., 2010). In the vast majority of constraint-based models, kinetic parameters and regulatory effects are not included, but such constraints can be included if this information is available (Covert et al., 2004; Yizhak et al., 2010; Cotten and Reed, 2013).
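The three constraint types and the optimization step described above can be written compactly as a linear program. The notation below is an assumption made for illustration and is not taken from the text: S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c selects the objective (for example, a single biomass reaction), I is the set of irreversible reactions, M is the set of reactions with known or measured capacities, and alpha_j and beta_j are their lower and upper bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad
& S v = 0 && \text{(steady-state mass balance)} \\
& v_j \ge 0 \quad \forall j \in \mathcal{I} && \text{(reaction reversibility)} \\
& \alpha_j \le v_j \le \beta_j \quad \forall j \in \mathcal{M} && \text{(enzyme capacity)}
\end{aligned}
```

Other objectives mentioned above, such as minimizing total flux, would replace the objective function while leaving the constraint set unchanged.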
Constraint-based models were initially built for individual species, and hundreds of such models exist for various bacteria, eukaryotes, and archaea (see Systems Biology Research Group, 2017, for a maintained list of models that have been validated against experimental data). Recently, Magnusdottir and colleagues reported the development of a semiautomated pipeline that was used to build 773 individual metabolic models for microbes found in the human gut microbiome (Magnusdottir et al., 2016). Multispecies models have also been developed over the past 10 years. In most of these multispecies models, the reactions and metabolites in each species are accounted for separately, meaning that metabolite production and consumption rates in each species are balanced. These multispecies models also allow for the exchange of metabolites between species by introducing an additional compartment to the model representing the media or shared environment. By modeling the shared environment explicitly, the relative or absolute abundance of different species can be predicted and accounted for. To date, most multispecies models have been developed for synthetic and natural communities containing just a few species, although this is likely to be expanded in the coming years.
While constraint-based models are inherently quantitative, meaning they provide numerical values for all fluxes in the metabolic network, they can be used to answer qualitative and quantitative questions about the metabolic behavior of an organism or microbial community. Qualitative predictions typically require less physiological information because the results are qualitatively insensitive to the enzyme capacity constraints imposed. However, if quantitative predictions are desired, then more physiological data are needed to constrain the metabolic models.
QUALITATIVE PREDICTIONS FOR SINGLE- AND MULTISPECIES CONSTRAINT-BASED MODELS
Constraint-based models can be used to answer a variety of qualitative questions regarding cellular metabolism. These models can be used to predict nutrient utilization, minimal medium requirements, product secretion, pathway utilization, gene essentiality, synthetic lethality, and missing reactions from network reconstructions. The amount of data needed to generate qualitative predictions is typically lower than that for quantitative predictions; however, the types of data needed depend on the questions being asked. To answer qualitative questions related to growth or cellular fitness, the metabolic reconstruction and a list of biomass components are needed. Here, the biomass components include the chemicals that must be produced to generate new cells, including amino acids, nucleic acids, lipids, and cofactors; what is typically not needed for qualitative predictions are measurements of biomass composition or uptake and secretion rates. While the model-predicted fluxes will depend on the enzyme capacity constraints imposed, the qualitative output of the model will not change if the capacity constraints are scaled up or down.
Individual species models have been successfully used to predict which genes and nutrients are essential for growth. In this case, the models determine whether nutrients present in the media or environment can be converted by the metabolic reactions into all biomass components. In the case of gene deletion simulations, reactions associated with these genes are removed by constraining the associated fluxes to zero. Metabolic models of Escherichia coli have been used to predict which carbon, nitrogen, phosphorus, and sulfur sources can be used to support growth (Orth et al., 2011), while models for Mycoplasma genitalium (Suthers et al., 2009) and Bacteroides caccae (Magnusdottir et al., 2016) have been used to define medium components necessary for growth. Gene essentiality has also been predicted and compared to experimental results for a number of different species, including E. coli (Feist et al., 2007), Saccharomyces cerevisiae (Zomorrodi and Maranas, 2010), and Bacillus subtilis (Henry et al., 2009), with accuracies of 92%, 83%, and 95%, respectively.
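The gene deletion simulation described above can be sketched on a toy network. Everything in this example is hypothetical and chosen only for illustration: the five reactions, the gene names, the bounds, and the one-gene-per-reaction mapping (real models use Boolean gene-protein-reaction rules). The sketch assumes NumPy and SciPy are available and uses `scipy.optimize.linprog` to maximize flux through a biomass reaction after fixing the deleted reactions' fluxes to zero.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from any published model).
# Columns: uptake (-> A), r1 (A -> B), r2 (A -> C), r3 (C -> B), biomass (B + C ->)
REACTIONS = ["uptake", "r1", "r2", "r3", "biomass"]
S = np.array([
    [1, -1, -1,  0,  0],   # metabolite A
    [0,  1,  0,  1, -1],   # metabolite B
    [0,  0,  1, -1, -1],   # metabolite C
], dtype=float)

# All reactions are irreversible; only uptake has a capacity limit (assumed value).
BOUNDS = [(0, 10), (0, 1000), (0, 1000), (0, 1000), (0, 1000)]

# Simplified gene-to-reaction map: one hypothetical gene per reaction.
GENE_TO_REACTIONS = {"gT": ["uptake"], "g1": ["r1"], "g2": ["r2"], "g3": ["r3"]}


def max_growth(deleted_reactions=()):
    """Maximize biomass flux subject to S v = 0 and the flux bounds."""
    bounds = list(BOUNDS)
    for name in deleted_reactions:
        # A deletion is simulated by constraining the reaction flux to zero.
        bounds[REACTIONS.index(name)] = (0.0, 0.0)
    c = np.zeros(len(REACTIONS))
    c[REACTIONS.index("biomass")] = -1.0  # linprog minimizes, so negate
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


def essential_genes(threshold=1e-6):
    """Genes whose deletion drops the maximum biomass flux to (near) zero."""
    return sorted(gene for gene, rxns in GENE_TO_REACTIONS.items()
                  if max_growth(rxns) <= threshold)
```

In this toy network, metabolite B can be made by two routes, so deleting either route alone does not abolish growth, whereas metabolite C has a single source. Calling `essential_genes()` reports the genes for which no alternative route exists.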
Discrepancies between qualitative model predictions and experimental results can be used to improve the metabolic models and refine genome annotations (Orth and Palsson, 2010). As noted earlier, constraint-based models can be used to help identify and fill gaps in draft metabolic models in a process referred to as gap filling. Missing reactions and isozymes in draft models can be identified by resolving discrepancies where the models predict no growth, but cells grow experimentally. Previously, mispredictions associated with carbon source utilization, gene essentiality, and synthetic lethality have been used to add reactions and genes to the metabolic models (Reed et al., 2006; Henry et al., 2009; Zomorrodi and Maranas, 2010). Similarly, reactions and genes can be removed from the models to resolve discrepancies where the models predict growth but the cells do not grow experimentally (Kumar and Maranas, 2009). With the development of high-throughput mutant fitness experiments like TnSeq (van Opijnen and Camilli, 2013) and BarSeq (Wetmore et al., 2015), these gene essentiality comparisons will become more readily available to help probe and improve constraint-based models for a variety of organisms.
Multispecies models have been used to predict the types of interactions that might exist between microbes in a community (Heinken and Thiele, 2015; Magnusdottir et al., 2016). These predictions were made based on how predicted growth rates change for each organism between monoculture and co-culture conditions. For monoculture simulations, the individual growth rate was maximized, while the sum of both microbes’ growth rates was maximized for co-culture simulations. Magnusdottir and colleagues recently predicted all pairwise interactions between 773 microbes found in the human gut microbiome under four different dietary conditions (Magnusdottir et al., 2016). Microbial growth was predicted to increase or decrease in co-culture if the growth rate in co-culture was more than 10% higher or lower than the growth rate in monoculture, respectively. Most interactions were predicted to be parasitic (38-41%) or commensal (22-30%), where either one microbe’s growth rate increases while the other’s decreases or one microbe’s increased growth does not affect the other in co-culture. The types of interactions between pairs of microbes depended on diet and oxygen conditions. For example, a low- versus high-fiber diet impacted the numbers of commensal interactions, while aerobic versus anaerobic conditions mostly impacted the numbers of mutualistic and amensal interactions (Magnusdottir et al., 2016).
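The pairwise classification described above can be sketched as a small function. This is a minimal illustration, not the cited studies' code: the function names are hypothetical, the 10% threshold is interpreted as a relative change from the monoculture growth rate, and the labels "competition" and "neutralism" follow common ecological usage rather than anything stated in the text.

```python
def growth_effect(mono_rate, co_rate, threshold=0.10):
    """Return +1, -1, or 0 for increased, decreased, or unchanged growth."""
    if mono_rate <= 0:
        # Simplification: the relative change is undefined without monoculture growth.
        raise ValueError("monoculture growth rate must be positive")
    relative_change = (co_rate - mono_rate) / mono_rate
    if relative_change > threshold:
        return 1
    if relative_change < -threshold:
        return -1
    return 0


# Keys are (larger effect, smaller effect), so the order of the two microbes is irrelevant.
INTERACTION_LABELS = {
    (1, 1): "mutualism",
    (1, 0): "commensalism",
    (1, -1): "parasitism",
    (0, 0): "neutralism",      # assumed label, standard ecological usage
    (0, -1): "amensalism",
    (-1, -1): "competition",   # assumed label, standard ecological usage
}


def classify_interaction(mono_a, co_a, mono_b, co_b, threshold=0.10):
    """Classify a pairwise interaction from monoculture and co-culture growth rates."""
    effects = sorted(
        (growth_effect(mono_a, co_a, threshold),
         growth_effect(mono_b, co_b, threshold)),
        reverse=True,
    )
    return INTERACTION_LABELS[tuple(effects)]
```

For example, `classify_interaction(1.0, 1.5, 1.0, 0.5)` describes a pair in which one microbe grows faster and the other slower in co-culture, which the text calls parasitic.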
QUANTITATIVE PREDICTIONS FOR SINGLE- AND MULTISPECIES CONSTRAINT-BASED MODELS
To answer quantitative metabolic questions, more information is typically needed to constrain the genome-scale metabolic models. This information can include measurements of biomass component composition of cells, uptake and secretion rates, growth rates, and kinetic parameters. Such information can be used by the models to predict uptake and secretion rates, intracellular fluxes in metabolic pathways, metabolite concentrations, growth rates, interspecies fluxes, and community composition.
Individual species models have been frequently used to predict metabolic fluxes in response to genetic or environmental changes. A number of constraint-based methods have been developed specifically for this purpose; these methods identify flux distributions that minimize flux differences between perturbed and unperturbed states (Segre et al., 2002; Shlomi et al., 2005; Kim and Reed, 2012). These methods have been used to successfully predict central metabolic fluxes and growth rates for a variety of gene knockout mutants (including mutants of E. coli, S. cerevisiae, and B. subtilis) or growth conditions (Kim and Reed, 2012). The accuracy of these tools has enabled the use of metabolic models for metabolic engineering strain design purposes. For example, combinations of metabolic additions and deletions that couple growth and product formation (Burgard et al., 2003; Kim et al., 2011) or that maximize productivity (Patil et al., 2005) can be identified. These tools have been used to design strains that produce polymer precursors (Fong et al., 2005), nutraceuticals (Lee et al., 2007; Park et al., 2007), and commodity chemicals (Kim and Reed, 2010).
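One way to express "minimizing flux differences between perturbed and unperturbed states" is the quadratic program below, which follows the Euclidean-distance idea of Segre et al. (2002). The notation is assumed for illustration: v^ref is a reference (unperturbed) flux distribution, D is the set of reactions disabled by the perturbation, and S, alpha_j, and beta_j are the stoichiometric matrix and flux bounds of the model. The other cited methods use different measures of flux change, so this is a sketch of one formulation rather than a description of all three.

```latex
\begin{aligned}
\min_{v}\quad & \sum_{j} \left( v_j - v_j^{\mathrm{ref}} \right)^{2} \\
\text{subject to}\quad
& S v = 0 \\
& \alpha_j \le v_j \le \beta_j \quad \forall j \\
& v_k = 0 \quad \forall k \in \mathcal{D}
\end{aligned}
```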
A number of studies have used multispecies models to predict community-, interspecies-, and intraspecies-level fluxes in microbial communities. One of the first community models was developed for a syntrophic community containing the sulfate reducer Desulfovibrio vulgaris and methanogen Methanococcus maripaludis. Stolyar and colleagues used community measurements of lactate and hydrogen fluxes to predict acetate, methane, carbon dioxide, and biomass production rates (Stolyar et al., 2007). Wintermute and Silver used metabolic models to predict how pairs of E. coli auxotrophs would grow, and found a strain’s growth in co-culture was correlated with the ratio of the growth benefit for acquiring that strain’s essential nutrients to the cost of producing those nutrients by the other partner strain (Wintermute and Silver, 2010). Dynamic multispecies models have been developed that can capture changes in community composition over time when species have different growth rates. Such models have been developed for communities that degrade cellulose (Salimi et al., 2010), co-utilize glucose and xylose (Hanly and Henson, 2011), cross-feed amino acids (Zhang and Reed, 2014), and reduce uranium (Zhuang et al., 2011). Zhuang and colleagues developed a dynamic community model of Rhodoferax ferrireducens and Geobacter sulfurreducens that included kinetic parameters for nutrient uptake rates. The model accurately predicted changes in G. sulfurreducens abundance in response to acetate amendment as a function of ammonium availability (Zhuang et al., 2011). In many multispecies models, either individual species are assumed to maximize their own growth rate or the combined growth rates of all species in the community are maximized. In contrast, the OptCom modeling framework was developed to allow both community- and species-level objective functions to be optimized (Zomorrodi and Maranas, 2012). Application of OptCom to different synthetic and natural communities found that some microbes in phototrophic microbial mats reduce their species-level biomass production to increase community-level biomass production, while microbes in a synthetic community representing a subsurface anaerobic environment maximize community and individual species biomass production (Zomorrodi and Maranas, 2012).
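The dynamic multispecies models mentioned above are commonly built by coupling each species' constraint-based model to differential equations for biomass and extracellular metabolites. The equations below are a generic sketch of that idea and are not taken from any of the cited studies. The symbols are assumptions: X_k is the biomass concentration of species k, mu_k is its growth rate, C_i is the concentration of extracellular metabolite i, and v_{i,k} is the exchange flux of metabolite i for species k (positive for secretion). A Michaelis-Menten form with hypothetical parameters V^max_{i,k} and K_{i,k} is used as one possible way to include kinetic parameters for nutrient uptake.

```latex
\begin{aligned}
\frac{dX_k}{dt} &= \mu_k X_k \\
\frac{dC_i}{dt} &= \sum_{k} v_{i,k}\, X_k \\
-\,v_{i,k} &\le \frac{V^{\max}_{i,k}\, C_i}{K_{i,k} + C_i}
\end{aligned}
```

At each time step, mu_k and v_{i,k} would be obtained by solving species k's constraint-based model with uptake bounds set by the current concentrations, after which the concentrations and biomasses are updated.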
CHALLENGES AND FUTURE DIRECTIONS
While modeling microbiomes is an exciting and expanding area of research, a number of experimental and computational challenges need to be overcome before the field can more accurately predict the qualitative and quantitative behaviors of microbial communities. A current limitation for modeling microbial communities is a lack of experimental inter- and intraspecies flux measurements, which are needed to evaluate and improve model predictions. Monoculture measurements of intracellular and extracellular fluxes have been invaluable for the development of modeling approaches and the identification of objective functions that best predict monoculture behaviors (Burgard and Maranas, 2003; Schuetz et al., 2007); however, analogous co-culture flux measurements are more difficult to acquire. Extracellular fluxes for individual species are more challenging to measure for metabolites that are produced or consumed by multiple community members. Recent advances using carbon-13 labeling experiments have been able to resolve intracellular fluxes in two-species communities (Gebreselassie and Antoniewicz, 2015). Improvements in experimental techniques to measure fluxes in microbial communities will enable the development of constraint-based modeling approaches to more accurately predict fluxes in microbial communities by identifying appropriate species- and community-level objective functions. Individual models have been successfully used to design genetic and environmental perturbations to achieve desired phenotypes; however, extending such approaches to design microbial communities and manipulate their behaviors will require knowledge of which objective functions accurately predict intracellular and extracellular fluxes in co-culture.
Another challenge deals with building metabolic models from genomic and metagenomic data. With a few exceptions, constraint-based models are built using genomic data from individual organisms. Metagenomic sequencing, in contrast, identifies metabolic genes in community members but lacks complete details on which microbe these genes belong to. As a result, it is difficult to predict which metabolic genes and reactions should go into different species’ models. Biggs and Papin have recently developed a new approach to try to address this issue (Biggs and Papin, 2016). Another challenge with using genomic and metagenomic annotations to build models is predicting what metabolites can be taken up and secreted by different organisms from sequencing data alone. While transporter mechanisms can be predicted based on sequence information, it is more difficult to predict which specific metabolites are being taken up or excreted by these transporters. Thus, improving transporter annotations and their experimental characterization will help improve predictions of nutrient uptake, product secretion, and metabolite exchange in microbial communities.
Most current constraint-based modeling studies of communities have not accounted for spatial variation in microbial communities. However, spatial chemical gradients will develop in many natural communities where good mixing does not occur. Since cellular behaviors depend on chemical concentrations in the local environment, future microbiome models should also include approaches to predict concentration gradients in response to flow, diffusion, and microbial metabolism.
Constraint-based models can be used to study a diverse range of organisms and microbial communities, including synthetic and natural communities associated with ocean, marine, and human environments. To date, multispecies models have mostly been used to study low-diversity communities with two or three members. As tools for building, refining, and simulating multispecies models improve, the numbers of microbiomes being modeled and their applications to describe, predict, and design the chemistries being performed by communities will increase.
ACKNOWLEDGMENTS
This work was funded by the Office of Science (BER), the U.S. Department of Energy (DE-SC0008103), the U.S. Department of Energy Great Lakes Bioenergy Research Center (DOE BER Office of Science DE-FC0207ER64494), and the National Science Foundation (NSF 1053712).
REFERENCES
Biggs, M. B., and J. A. Papin. 2016. Metabolic network-guided binning of metagenomic sequence fragments. Bioinformatics 32(6):867-874.
Burgard, A. P., and C. D. Maranas. 2003. Optimization-based framework for inferring and testing hypothesized metabolic objective functions. Biotechnol Bioeng 82(6):670-677.
Burgard, A. P., P. Pharkya, and C. D. Maranas. 2003. Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol Bioeng 84(6):647-657.
Cotten, C., and J. L. Reed. 2013. Mechanistic analysis of multi-omics datasets to generate kinetic parameters for constraint-based metabolic models. BMC Bioinformatics 14:32.
Covert, M. W., E. M. Knight, J. L. Reed, M. J. Herrgard, and B. Ø. Palsson. 2004. Integrating high-throughput and computational data elucidates bacterial networks. Nature 429(6987):92-96.
Feist, A. M., C. S. Henry, J. L. Reed, M. Krummenacker, A. R. Joyce, P. D. Karp, L. J. Broadbelt, V. Hatzimanikatis, and B. Ø. Palsson. 2007. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol 3:121.
Fleming, R. M., I. Thiele, and H. P. Nasheuer. 2009. Quantitative assignment of reaction directionality in constraint-based models of metabolism: Application to Escherichia coli. Biophys Chem 145(2-3):47-56.
Fong, S. S., A. P. Burgard, C. D. Herring, E. M. Knight, F. R. Blattner, C. D. Maranas, and B. Ø. Palsson. 2005. In silico design and adaptive evolution of Escherichia coli for production of lactic acid. Biotechnol Bioeng 91(5):643-648.
Gebreselassie, N. A., and M. R. Antoniewicz. 2015. (13)C-metabolic flux analysis of co-cultures: A novel approach. Metab Eng 31:132-139.
Hamilton, J. J., and J. L. Reed. 2014. Software platforms to facilitate reconstructing genome-scale metabolic networks. Environ Microbiol 16(1):49-59.
Hanly, T. J., and M. A. Henson. 2011. Dynamic flux balance modeling of microbial co-cultures for efficient batch fermentation of glucose and xylose mixtures. Biotechnol Bioeng 108(2):376-385.
Heinken, A., and I. Thiele. 2015. Anoxic conditions promote species-specific mutualism between gut microbes in silico. Appl Environ Microbiol 81(12):4049-4061.
Henry, C. S., L. J. Broadbelt, and V. Hatzimanikatis. 2007. Thermodynamics-based metabolic flux analysis. Biophys J 92(5):1792-1805.
Henry, C. S., J. F. Zinner, M. P. Cohoon, and R. L. Stevens. 2009. iBsu1103: A new genome-scale metabolic model of Bacillus subtilis based on SEED annotations. Genome Biol 10(6):R69.
Henry, C. S., M. DeJongh, A. A. Best, P. M. Frybarger, B. Linsay, and R. L. Stevens. 2010. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol 28(9):977-982.
Kanehisa, M., and S. Goto. 2000. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28(1):27-30.
Kim, J., and J. L. Reed. 2010. OptORF: Optimal metabolic and regulatory perturbations for metabolic engineering of microbial strains. BMC Syst Biol 4:53.
Kim, J., and J. L. Reed. 2012. RELATCH: Relative optimality in metabolic networks explains robust metabolic and regulatory responses to perturbations. Genome Biol 13(9):R78.
Kim, J., J. L. Reed, and C. T. Maravelias. 2011. Large-scale bi-level strain design approaches and mixed-integer programming solution techniques. PLoS One 6(9):e24162.
Krieger, C. J., P. Zhang, L. A. Mueller, A. Wang, S. Paley, M. Arnaud, J. Pick, S. Y. Rhee, and P. D. Karp. 2004. MetaCyc: A multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res 32(Database issue):D438-D442.
Kumar, V. S., and C. D. Maranas. 2009. GrowMatch: An automated method for reconciling in silico/in vivo growth predictions. PLoS Comput Biol 5(3):e1000308.
Lee, K. H., J. H. Park, T. Y. Kim, H. U. Kim, and S. Y. Lee. 2007. Systems metabolic engineering of Escherichia coli for L-threonine production. Mol Syst Biol 3:149.
Magnusdottir, S., A. Heinken, L. Kutt, D. A. Ravcheev, E. Bauer, A. Noronha, K. Greenhalgh, C. Jäger, J. Baginska, P. Wilmes, R. M. Fleming, and I. Thiele. 2016. Generation of genome-scale metabolic reconstructions for 773 members of the human gut microbiota. Nat Biotechnol 35:81-89.
Orth, J. D., and B. Ø. Palsson. 2010. Systematizing the generation of missing metabolic knowledge. Biotechnol Bioeng 107(3):403-412.
Orth, J. D., I. Thiele, and B. Ø. Palsson. 2010. What is flux balance analysis? Nat Biotechnol 28(3):245-248.
Orth, J. D., T. M. Conrad, J. Na, J. A. Lerman, H. Nam, A. M. Feist, and B. Ø. Palsson. 2011. A comprehensive genome-scale reconstruction of Escherichia coli metabolism—2011. Mol Syst Biol 7:535.
Park, J. H., K. H. Lee, T. Y. Kim, and S. Y. Lee. 2007. Metabolic engineering of Escherichia coli for the production of L-valine based on transcriptome analysis and in silico gene knockout simulation. Proc Natl Acad Sci USA 104(19):7797-7802.
Patil, K. R., I. Rocha, J. Förster, and J. Nielsen. 2005. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics 6:308.
Price, N. D., J. L. Reed, and B. Ø. Palsson. 2004. Genome-scale models of microbial cells: Evaluating the consequences of constraints. Nat Rev Microbiol 2(11):886-897.
Reed, J. L., T. R. Patel, K. H. Chen, A. R. Joyce, M. K. Applebee, C. D. Herring, O. T. Bui, E. M. Knight, S. S. Fong, and B. Ø. Palsson. 2006. Systems approach to refining genome annotation. Proc Natl Acad Sci USA 103(46):17480-17484.
Salimi, F., K. Zhuang, and R. Mahadevan. 2010. Genome-scale metabolic modeling of a clostridial co-culture for consolidated bioprocessing. Biotechnol J 5(7):726-738.
Schuetz, R., L. Kuepfer, and U. Sauer. 2007. Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol Syst Biol 3:119.
Segre, D., D. Vitkup, and G. M. Church. 2002. Analysis of optimality in natural and perturbed metabolic networks. Proc Natl Acad Sci USA 99(23):15112-15117.
Shlomi, T., O. Berkman, and E. Ruppin. 2005. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc Natl Acad Sci USA 102(21):7695-7700.
Stolyar, S., S. Van Dien, K. L. Hillesland, N. Pinel, T. J. Lie, J. A. Leigh, and D. A. Stahl. 2007. Metabolic modeling of a mutualistic microbial community. Mol Syst Biol 3:92.
Suthers, P. F., M. S. Dasika, V. S. Kumar, G. Denisov, J. I. Glass, and C. D. Maranas. 2009. A genome-scale metabolic reconstruction of Mycoplasma genitalium, iPS189. PLoS Comput Biol 5(2):e1000285.
Systems Biology Research Group. 2017. Other Organisms. http://systemsbiology.ucsd.edu/InSilicoOrganisms/OtherOrganisms (accessed February 2, 2017).
van Opijnen, T., and A. Camilli. 2013. Transposon insertion sequencing: A new tool for systems-level analysis of microorganisms. Nat Rev Microbiol 11(7):435-442.
Wetmore, K. M., M. N. Price, R. J. Waters, J. S. Lamson, J. He, C. A. Hoover, M. J. Blow, J. Bristow, G. Butland, A. P. Arkin, and A. Deutschbauer. 2015. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons. mBio 6(3):e00306-e00315.
Wintermute, E. H., and P. A. Silver. 2010. Emergent cooperation in microbial metabolism. Mol Syst Biol 6:407.
Yizhak, K., T. Benyamini, W. Liebermeister, E. Ruppin, and T. Shlomi. 2010. Integrating quantitative proteomics and metabolomics with a genome-scale metabolic network model. Bioinformatics 26(12):i255-i260.
Zhang, X., and J. L. Reed. 2014. Adaptive evolution of synthetic cooperating communities improves growth performance. PLoS One 9(10):e108297.
Zhuang, K., M. Izallalen, P. Mouser, H. Richter, C. Risso, R. Mahadevan, and D. R. Lovley. 2011. Genome-scale dynamic modeling of the competition between Rhodoferax and Geobacter in anoxic subsurface environments. ISME J 5(2):305-316.
Zomorrodi, A. R., and C. D. Maranas. 2010. Improving the iMM904 S. cerevisiae metabolic model using essentiality and synthetic lethality data. BMC Syst Biol 4:178.
Zomorrodi, A. R., and C. D. Maranas. 2012. OptCom: A multi-level optimization framework for the metabolic modeling and analysis of microbial communities. PLoS Comput Biol 8(2):e1002363.