8
Digitizing the Chemistry Associated with Microbes: Importance, Current Status, and Opportunities
Since the discovery of penicillin to treat individual infections, immunological disorders such as asthma, diabetes, and Crohn’s disease have risen sharply (Bach, 2002). Less than a decade ago, few connections had been made between these disorders and the microbes that live on and within the human body. However, since the Human Microbiome Project (HMP), which ran from 2007 to 2012 and took an inventory of those microbes, associations between aberrant microbial communities and the above-mentioned immunological diseases have been made (Alivisatos et al., 2015; Gilbert et al., 2016; NIH, 2017). Since the HMP, the microbiome field has continued to take inventories and uncover the associations of microbial communities with many other diseases, including cancer, autism, depression, and obesity. Currently, the field is in transition from only taking inventories of the genetic material that is present (which will continue to be important) to understanding the functional roles that the microbes play; understanding the molecules associated with the microbiome will play a key role in this endeavor. This lecture summary on the chemistry of the human microbiome highlights the status of, and opportunities for, mass spectrometry–based chemical analysis of the microbiome.
The chemical makeup of the human microbiome and its ecology is very diverse. The chemical environment and the microbes’ chemistry define the community a specific niche can support. For example, early colonization with Bifidobacterium theta and B. longum enables the processing of food sugars, such as complex carbohydrates; alters immune homeostasis, as reflected in prostaglandin E2 levels that increase by orders of magnitude; and also affects the amount of bile acids that are produced (Phelan et al., 2011). Furthermore, gut microbes actively modify bile acids and, therefore, early gut colonization impacts not only lipid and cholesterol transport but also pathogen colonization (Donia and Fischbach, 2015). There are three sources of chemicals associated with any ecological niche: the external niche chemistry imposed by the host, food, exposome, medications, and personal care derived molecules; microbially modified molecules; and microbiome genome encoded molecules. This latter group of molecules can be further subdivided into common metabolites and metabolic pathways detailed in biochemistry textbooks; an estimated 35% of the protein-encoding genome is dedicated to the production of these common metabolites (Phelan et al., 2011). There are also specialized metabolites to which 5-30% of the genome is dedicated, including secondary metabolites, virulence factors, natural products, and metabolic exchange factors. The genes that make such molecules often cluster on the genome and are referred to as biosynthetic gene clusters. A recent inventory found 3,118 such clusters in the HMP metagenomics inventory. From a scientific standpoint, because of the stringent cutoff values the authors chose, this is very much an underestimate. If one considers similar gene clusters from bacteria in other environments, one quickly realizes that these molecules must play significant roles in shaping the human microbiome. The activities of the molecules they produce speak volumes: immunosuppression, antimicrobials, protease inhibitors, and kinase inhibitors are representative activities. All of us are familiar with molecules that are produced by such gene clusters: penicillin, vancomycin, rapamycin, and Taxol are just some examples. It is remarkable that similar gene clusters are found in the human microbiome. Very few of their products have been characterized, but those that have been show functions that could indeed shape microbial communities. Indeed, some such molecules isolated from human microbiome–derived organisms have antimicrobial activities and other properties (Kang and Brady, 2013; Donia and Fischbach, 2015). While these molecules are isolated from human microbiome samples or obtained through heterologous expression of the gene clusters, such molecules are usually not directly detected from human-derived samples (such as skin, feces, and saliva), making it difficult to truly assess the functions of these molecules and their role in defining the human ecosystem. While other methods exist for the characterization of chemicals, such as nuclear magnetic resonance and infrared spectroscopies, engineered strains, or enzymatic systems that detect specific molecules, the focus of this presentation was mass spectrometry and the challenges and opportunities within this field to characterize the chemistry of the microbiome.
___________________
a Department of Pharmacology and Pediatrics, University of California, San Diego.
* Corresponding Author: pdorrestein@ucsd.edu.
Right now, we do not know the 10 most common microbial molecules found in the gut or the 10 most influential molecules that shape microbial community composition. It is, therefore, difficult to formulate theories linking molecules to microbiome health. Mass spectrometry can identify short chain fatty acids, trimethylamine oxide, and other microbe-associated molecules that are often of interest to the microbiome community; these are typically analyzed in a targeted fashion. However, this is akin to looking under a lamppost in the dark, and it does not enable the discovery of other molecules that may also be important. Such discovery is enabled through untargeted metabolomics; however, the challenge in detecting microbial molecules from human samples is three-pronged:
- Detecting the molecules is challenging because it is not known when and where they are produced, and there are few instances within a human body where the microbial biomass dominates the samples.
- Even if detected by mass spectrometry, it is unlikely they could be recognized as microbial molecules given that the reference data for microbial metabolites are virtually absent from metabolomics reference data collections (Scalbert et al., 2014; Johnson and Lange, 2015; Wang et al., 2016). Of these databases, the Global Natural Products Social Molecular Networking (GNPS) community created the largest number of microbial reference metabolites.
- Many specialized metabolites may not be detected from a human-derived sample.
As mass spectrometry equipment becomes more sensitive with higher throughput and improved detection of such molecules, the last limitation will continue to diminish. However, many opportunities remain, some of which are indicated next.
To take control of the human microbiome, it is necessary to understand its function and how the communities are shaped. From a mass spectrometry standpoint, there are several areas that would significantly improve its utility toward creating a functional understanding of microbial ecology. It is commonly accepted that the development of infrastructure to deposit gene sequences and the accompanying analysis infrastructure has revolutionized the life sciences. For the discovery of molecules, this knowledge sharing and the resulting ability to compare experimental data is in its infancy. When new molecules are uncovered, their associated data are deposited in supporting information, essentially rendering the data inaccessible. If we are fortunate, the structure, without its data, will be found in a database such as ChemSpider or PubChem. This is remarkable because structure elucidation is not easy to do and costs a significant amount of time and financial resources, which was the argument for sharing sequence data in the first place. Some reports list costs in the range of $25,000 to $86,000 to solve the structure; however, there have been molecules where that analysis is estimated to cost millions of dollars for a single molecule (Nguyen et al., 2016).
Data accessibility and standardization are, therefore, major opportunities to make it easier for the community to develop tools that simplify structure elucidation and allow the continued use of expensive data. For mass spectrometry, not only are the reference data inaccessible, but when a study relies on metabolomics data, the knowledge and the data themselves are not reused in the way that sequence information is reused. Imagine a scenario where one could not compare individually characterized gene sequences or genomes; this is, by and large, the status in mass spectrometry. To date, and despite a decade of discussion on the importance of reusing it, there is only one study that has reused the raw metabolomics data of a previous study. However, mass spectrometry repositories (such as MetaboLights, XCMS Online, Metabolomics Workbench, and GNPS) have emerged in the last few years and are the first step to capture the data associated with microbiome chemical information (Gowda et al., 2014; da Silva et al., 2015; Kale et al., 2016). The real opportunity, however, is not only the capture of knowledge from the scientific community so that it can be parsed by computers, but also the development of analysis tools that enable the reuse of that captured knowledge; such tools have enhanced the sequencing communities. This includes capturing and enabling the analysis of spectral information associated with the chemicals of the human microbiome, improving accessibility to this information, bolstering data standards, and including the history of annotations; currently, we can annotate, on average, 2% of metabolomics data (Biteen et al., 2016; Wang et al., 2016). Already we are uncovering the key roles associated with microbiome chemistry; imagine what could be done if this is improved to 20%.
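To make the annotation figure above concrete, the following is a minimal sketch of how an untargeted spectrum might be annotated by comparison with a shared reference library, and how an annotation rate could then be computed. It is an illustration under stated assumptions, not the method used by GNPS or any other repository: the function names, the m/z tolerance, the score threshold, and all peak lists are hypothetical, and real tools typically add further steps such as precursor-mass filtering and a minimum number of matched peaks.

```python
from math import sqrt

# A spectrum is a list of (m/z, intensity) peaks. All values are made up.

def cosine_score(spec_a, spec_b, tol=0.02):
    """Greedy cosine similarity between two peak lists.

    Peaks are paired when their m/z values differ by at most `tol`
    (a hypothetical tolerance); each peak is used at most once.
    """
    candidates = [
        (ia * ib, i, j)
        for i, (mza, ia) in enumerate(spec_a)
        for j, (mzb, ib) in enumerate(spec_b)
        if abs(mza - mzb) <= tol
    ]
    candidates.sort(reverse=True)  # strongest intensity products first
    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product
    norm_a = sqrt(sum(inten * inten for _, inten in spec_a))
    norm_b = sqrt(sum(inten * inten for _, inten in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def annotate(queries, library, min_score=0.7, tol=0.02):
    """Map each query id to its best-scoring library entry, or None."""
    annotations = {}
    for query_id, spectrum in queries.items():
        best_name, best_score = None, min_score
        for name, reference in library.items():
            score = cosine_score(spectrum, reference, tol)
            if score >= best_score:
                best_name, best_score = name, score
        annotations[query_id] = best_name
    return annotations

def annotation_rate(annotations):
    """Fraction of query spectra that received any annotation."""
    if not annotations:
        return 0.0
    hits = sum(1 for name in annotations.values() if name is not None)
    return hits / len(annotations)

if __name__ == "__main__":
    # Toy library with a single reference spectrum (hypothetical peaks).
    library = {"ref_A": [(100.0, 1.0), (150.0, 0.5)]}
    queries = {
        "q1": [(100.01, 0.9), (150.0, 0.6)],  # resembles ref_A
        "q2": [(200.0, 1.0), (250.0, 0.3)],   # no reference available
    }
    result = annotate(queries, library)
    print(result)
    print(annotation_rate(result))
```

In this toy case the second spectrum stays unannotated simply because no matching reference exists in the library, which is the situation the text describes for most microbial molecules: the rate can only rise as more reference spectra are deposited and shared.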
Once we capture this knowledge, make the knowledge accessible, identify the cornerstone molecules that drive microbial communities, and enable their visualization, methods can be designed for in vivo microrobotic monitoring, remote sensing, and enabling the crowdsourcing of analysis, just to name a few (Chu et al., 2016). All of these will require the development of proper cyberinfrastructure so that the information can be mined from a larger knowledge base. Furthermore, if one wants to test the activities of these molecules, we need to have access to them. When the molecules of interest are not commercially available, the main ways to gain access to the molecules are via purification or synthesis, which will continue to play important roles; however, genetic engineering methods, such as the heterologous expression of entire gene clusters, are also emerging to enable the production of these molecules (Kembel et al., 2014).
If we want to understand a human being from a microbial standpoint, we must first collect their biochemical information. It is, therefore, important to develop tools and standards that enable the analysis and visualization of the complexity associated with their chemistry. Although not discussed in detail, if we want to consider the microbiome associated with humans, we must also consider microbes that are present within interior microbiomes and the relationships between building materials, indoor air, and surface reactions.
REFERENCES
Alivisatos, A. P., M. J. Blaser, E. L. Brodie, M. Chun, J. L. Dangl, T. J. Donohue, P. C. Dorrestein, J. A. Gilbert, J. L. Green, J. K. Jansson, R. Knight, M. E. Maxon, M. J. McFall-Ngai, J. F. Miller, K. S. Pollard, E. G. Ruby, and S. A. Taha. 2015. MICROBIOME. A unified initiative to harness Earth’s microbiomes. Science 350(6260):507-508. doi: 10.1126/science.aac8480.
Bach, J. F. 2002. The effect of infections on susceptibility to autoimmune and allergic diseases. N Engl J Med 347(12):911-920.
Biteen, J. S., P. C. Blainey, Z. G. Cardon, M. Chun, G. M. Church, P. C. Dorrestein, S. E. Fraser, J. A. Gilbert, J. K. Jansson, R. Knight, J. F. Miller, A. Ozcan, K. A. Prather, S. R. Quake, E. G. Ruby, P. A. Silver, S. Taha, G. van den Engh, P. S. Weiss, G. C. Wong, A. T. Wright, and T. D. Young. 2016. Tools for the microbiome: Nano and beyond. ACS Nano 10(1):6-37. doi: 10.1021/acsnano.5b07826.
Chu, J., X. Vila-Farres, D. Inoyama, M. Ternei, L. J. Cohen, E. A. Gordon, B. V. Reddy, Z. Charlop-Powers, H. A. Zebroski, R. Gallardo-Macias, M. Jaskowski, S. Satish, S. Park, D. S. Perlin, J. S. Freundlich, and S. F. Brady. 2016. Discovery of MRSA active antibiotics using primary sequence from the human microbiome. Nat Chem Biol 12(12):1004-1006. doi: 10.1038/nchembio.2207.
da Silva, R. R., P. C. Dorrestein, and R. A. Quinn. 2015. Illuminating the dark matter in metabolomics. Proc Natl Acad Sci USA 112(41):12549-12550. doi: 10.1073/pnas.1516878112.
Donia, M. S., and M. A. Fischbach. 2015. HUMAN MICROBIOTA. Small molecules from the human microbiota. Science 349(6246):1254766. doi: 10.1126/science.1254766.
Gilbert, J. A., R. A. Quinn, J. Debelius, Z. Z. Xu, J. Morton, N. Garg, J. K. Jansson, P. C. Dorrestein, and R. Knight. 2016. Microbiome-wide association studies link dynamic microbial consortia to disease. Nature 535(7610):94-103. doi: 10.1038/nature18850.
Gowda, H., J. Ivanisevic, C. H. Johnson, M. E. Kurczy, H. P. Benton, D. Rinehart, T. Nguyen, J. Ray, J. Kuehl, B. Arevalo, P. D. Westenskow, J. Wang, A. P. Arkin, A. M. Deutschbauer, G. J. Patti, and G. Siuzdak. 2014. Interactive XCMS Online: Simplifying advanced metabolomics data processing and subsequent statistical analyses. Anal Chem 86(14):6931-6939. doi: 10.1021/ac500734c.
Johnson, S. R., and B. M. Lange. 2015. Open-access metabolomics databases for natural product research: Present capabilities and future potential. Front Bioeng Biotechnol 3:22. doi: 10.3389/fbioe.2015.00022.
Kale, N. S., K. Haug, P. Conesa, K. Jayseelan, P. Moreno, P. Rocca-Serra, V. C. Nainala, R. A. Spicer, M. Williams, X. Li, R. M. Salek, J. L. Griffin, and C. Steinbeck. 2016. MetaboLights: An open-access database repository for metabolomics data. Curr Protoc Bioinformatics 53:14.13.1-18. doi: 10.1002/0471250953.bi1413s53.
Kang, H. S., and S. F. Brady. 2013. Arimetamycin A: Improving clinically relevant families of natural products through sequence-guided screening of soil metagenomes. Angew Chem Int Ed Engl 52(42):11063-11067. doi: 10.1002/anie.201305109.
Kembel, S. W., J. F. Meadow, T. K. O’Connor, G. Mhuireach, D. Northcutt, J. Kline, M. Moriyama, G. Z. Brown, B. J. Bohannan, and J. L. Green. 2014. Architectural design drives the biogeography of indoor bacterial communities. PLoS One 9(1):e87093. doi:10.1371/journal.pone.0087093.
Nguyen, D. D., A. V. Melnik, N. Koyama, X. Lu, M. Schorn, J. Fang, K. Aguinaldo, T. L. Lincecum Jr, M. G. Ghequire, V. J. Carrion, T. L. Cheng, B. M. Duggan, J. G. Malone, T. H. Mauchline, L. M. Sanchez, A. M. Kilpatrick, J. M. Raaijmakers, R. Mot, B. S. Moore, M. H. Medema, and P. C. Dorrestein. 2016. Indexing the Pseudomonas specialized metabolome enabled the discovery of poaeamide B and the bananamides. Nat Microbiol 2:16197. doi: 10.1038/nmicrobiol.2016.197.
NIH (National Institutes of Health). 2017. NIH Human Microbiome Project (HMP) Roadmap Project. https://www.ncbi.nlm.nih.gov/bioproject/43021 (accessed February 2, 2017).
Phelan, V. V., W. T. Liu, K. Pogliano, and P. C. Dorrestein. 2011. Microbial metabolic exchange—the chemotype-to-phenotype link. Nat Chem Biol 8(1):26-35. doi: 10.1038/nchembio.739.
Scalbert, A., L. Brennan, C. Manach, C. Andres-Lacueva, L. O. Dragsted, J. Draper, S. M. Rappaport, J. J. van der Hooft, and D. S. Wishart. 2014. The food metabolome: A window over dietary exposure. Am J Clin Nutr 99(6):1286-1308. doi: 10.3945/ajcn.113.076133.
Wang, M., J. J. Carver, V. V. Phelan, L. M. Sanchez, N. Garg, Y. Peng, D. D. Nguyen, J. Watrous, C. A. Kapono, T. Luzzatto-Knaan, C. Porto, A. Bouslimani, A. V. Melnik, M. J. Meehan, W. T. Liu, M. Crüsemann, P. D. Boudreau, E. Esquenazi, M. Sandoval-Calderón, R. D. Kersten, L. A. Pace, R. A. Quinn, K. R. Duncan, C. C. Hsu, D. J. Floros, R. G. Gavilan, K. Kleigrewe, T. Northen, R. J. Dutton, D. Parrot, E. E. Carlson, B. Aigle, C. F. Michelsen, L. Jelsbak, C. Sohlenkamp, P. Pevzner, A. Edlund, J. McLean, J. Piel, B. T. Murphy, L. Gerwick, C. C. Liaw, Y. L. Yang, H. U. Humpf, M. Maansson, R. A. Keyzers, A. C. Sims, A. R. Johnson, A. M. Sidebottom, B. E. Sedio, A. Klitgaard, C. B. Larson, C. A. Boya, P. D. Torres-Mendoza, D. J. Gonzalez, D. B. Silva, L. M. Marques, D. P. Demarque, E. Pociute, E. C. O’Neill, E. Briand, E. J. Helfrich, E. A. Granatosky, E. Glukhov, F. Ryffel, H. Houson, H. Mohimani, J. J. Kharbush, Y. Zeng, J. A. Vorholt, K. L. Kurita, P. Charusanti, K. L. McPhail, K. F. Nielsen, L. Vuong, M. Elfeki, M. F. Traxler, N. Engene, N. Koyama, O. B. Vining, R. Baric, R. R. Silva, S. J. Mascuch, S. Tomasi, S. Jenkins, V. Macherla, T. Hoffman, V. Agarwal, P. G. Williams, J. Dai, R. Neupane, J. Gurr, A. M. Rodríguez, A. Lamsa, C. Zhang, K. Dorrestein, B. M. Duggan, J. Almaliti, P. M. Allard, P. Phapale, L. F. Nothias, T. Alexandrov, M. Litaudon, J. L. Wolfender, J. E. Kyle, T. O. Metz, T. Peryea, D. T. Nguyen, D. VanLeer, P. Shinn, A. Jadhav, R. Müller, K. M. Waters, W. Shi, X. Liu, L. Zhang, R. Knight, P. R. Jensen, B. Ø. Palsson, K. Pogliano, R. G. Linington, M. Gutiérrez, N. P. Lopes, W. H. Gerwick, B. S. Moore, P. C. Dorrestein, and N. Bandeira. 2016. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat Biotechnol 34(8):828-837. doi: 10.1038/nbt.3597.