Information extracted from microbiome molecular measurement tools is intended to reconstruct and ultimately predict critical microbiome features that include relative and absolute abundances of microbial taxa, persistence, impact on human health, and transmissibility. Application of the measurement tools is driven by access to observational data and intentional hypothesis-driven experiments. Given its central role in elucidating the microbiome world, tools for observational experimentation are the primary focus of this appendix (Gilbert et al., 2016; Wang and Jia, 2016). The key objective of these efforts is to determine site-specific microbial content across space and time, along with the corresponding building and environment conditions. Critical to these efforts is the collection of relevant metadata and data sharing to support interpretation of measurement results and accurate reproducible microbiome models. The development of mathematical and computational models for microbiome dynamics, the refinement of statistical techniques that guide sampling design, and the linkage of models to data are important parts of the toolkit of microbiome researchers. But in this appendix, the focus is on the basic data that are needed to parameterize and test such models in the first place.
1 Mention of commercial products or organizations does not imply endorsement by the National Academies of Sciences, Engineering, and Medicine or by members of the Committee on Microbiomes of the Built Environment: From Research to Application.
Tools for controlled experiments are more commonly applied at the macro rather than the molecular level and are not the focus of this appendix. Nonetheless, molecular manipulation could emerge in the future as an
important tool (Biteen et al., 2016). Intentional molecular interventions could include introduction of a specific organism, gene, community composition, or chemical product into the environment to test a hypothesis or to achieve a desired impact. As a result of the vast diversity and largely uncultivated status of built environment microorganisms, analyzing the structure, functions, activities, and dynamics of microbial communities, especially under natural settings, remains an enormous challenge and focus of existing research. Over the last few decades, to meet this challenge, a variety of open-format (e.g., high-throughput sequencing, mass spectrometry–based proteomic and metabolomic approaches) and closed-format (e.g., functional gene arrays, protein and metabolite arrays) detection technologies have been developed and used to address questions about microbial ecology at the frontier of knowledge (Roh et al., 2010; Vieites et al., 2009; Zhou et al., 2015). Substantial biological insights have been obtained from a variety of ecosystems important to human health (e.g., Alivisatos et al., 2015), as well as from analyses of foodstuffs, systems subject to climate changes (Xue et al., 2016), industrial settings, and agriculture, and more broadly across the environmental sciences (Long et al., 2016).
The open- and closed-format technologies are fundamentally different in sample preparation, quality control, data processing and analysis, performance, and applications (Zhou et al., 2015). Recently, numerous studies have examined the performance of various types of technologies, but most of these studies are related to high-throughput sequencing and microarrays. High-throughput sequencing includes primarily polymerase chain reaction (PCR) amplification-based target gene sequencing (TGS) with phylogenetic (e.g., 16S rRNA) or functional (e.g., amoA and nifH) targets, and shotgun metagenome sequencing. The discussion that follows is focused mainly on the performance of these types of high-throughput technologies in terms of key performance issues, such as specificity, resolution, sensitivity, biologic activities, quantification, and reproducibility, within the context of microbial communities, particularly the microbiomes of the built environment.
EVALUATING THE CAPABILITIES OF CURRENT TOOLS
Table A-1 presents a summary of characterization tools, organized by input type, tool type, and detection format (open or closed). This summary shows the diverse set of molecular measurement tools that are available and distills their strengths and weaknesses with respect to analytical certainty and interpretive power. In many cases a tool may include multiple measurement types. For example, there are both targeted and untargeted metabolite measurement tools.
It is important to consider the ability of molecular measurement tools to recover information relevant to the following features: relative and absolute abundances of microbial taxa, persistence, potential health impacts, and transmissibility. To assess these capabilities, the following criteria are considered in the sections that follow: specificity, taxonomic resolution, sensitivity and organism coverage, organism viability, biologic activity, functional coverage, toxicologic potential, quantification, and reproducibility.
Table A-1 is meant to reflect the positive attributes of each tool; the text identifies limitations and caveats. The table highlights the fact that there are a number of new and emerging molecular measurement tools designed for different types of measurements, with each tool having varying capabilities. Each criterion is discussed in greater detail in the following sections, along with the rationale for the qualitative assessments shown in the table. These qualitative metrics are meant to provide a consensus perspective on microbiome measurement tools through a profile of their strengths and gaps in meeting the goals of recovering microbiome information from built environments. It should be noted that it is difficult, if not impossible, to make straightforward, point-by-point direct comparisons among different technologies because of their broad diversity and distinct characteristics. Therefore, an attempt has been made to highlight the major differences among the various technologies at very coarse levels. The following discussion focuses on the issues important to microbial ecology rather than reviewing the various technologies comprehensively.
Open- and Closed-Format Tools
“Open-format” refers to “technologies whose potential experimental results cannot be anticipated prior to performing the analysis, and thus, the experimental outcome is considered open” (Zhou et al., 2015, p. 2). In contrast, “closed-format” refers to “detection technologies whose range of potential experimental results is defined prior to performing the analysis, and thus, the experimental outcome is considered closed” (Zhou et al., 2015, p. 2).
Closed-format nucleic acid–based tools are more adept at identifying known organisms, but they provide limited information on biologic identity. In general, fundamental questions of reproducibility remain for all of the high-throughput measurement tools. This limitation is due, in part, to their exquisite sensitivity and thus potential for high reporting variability, but also to a lack of common benchmarks and the reliance on an evolving collection of bioinformatics tools and databases. In metagenomics, “open” formats can also be called “untargeted,” and closed formats can be called “targeted.” Open formats such as shotgun sequencing show considerable potential to elucidate specific functions and capture unknown microbial material; however, the sampling efforts and costs to use open-format tools in indoor environments continue to prevent their use in cases where large numbers of samples are needed.
| Tools | Input | Format | Specificity | Taxonomic Resolution | Sensitivity | Organism Coverage |
| Amplicon sequencing of phylogenetic markers (e.g., 16S) | DNA/RNA | Open | Targeted | Genus/family | Rare detection | Conserved primers |
| Amplicon sequencing of functional markers | DNA/RNA | Open | Targeted | Species/strains | Rare detection | Conserved primers |
| Whole-community shotgun sequencing | DNA/RNA | Open | Off targets | Species/strains | Variable | Untargeted |
| Phylogenetic gene arrays | DNA/RNA | Closed | Targeted | Genus/family | Rare detection | High multiplex |
| Functional gene arrays | DNA/RNA | Closed | Targeted | Species/strains | Rare detection | High multiplex |
| Mass spectrometry–based proteomics | Proteins | Open | Off targets | Genus | No | Untargeted |
| Array-based proteomics | Proteins | Closed | Targeted | Genus | Rare detection | High multiplex |
| Mass spectrometry–based metabolomics | Small molecules | Open | Off targets | None | Variable | Untargeted |
| Array-based metabolomics | Small molecules | Closed | Targeted | None | Variable | High multiplex |
| Organism Viability | Biologic Activity | Functional Coverage | Toxicologic Potential | Quantification | Reproducibility |
NOTES: Because various technologies have different characteristics, it is difficult to make straightforward, point-by-point direct comparisons. Therefore, this table attempts to highlight the major differences among various technologies at a coarse level for general comparison; accuracy may be lost in some cases in this simple table. A hyphen (-) indicates tools that exhibit less variability and may not include replicates in reporting. “High multiplex” indicates a tool that can track thousands of distinct targets but requires prior knowledge of the target.
Specificity
Specificity refers to the fraction of recovered biologic material belonging to a target microbial community of interest. For example, if shotgun sequencing is undertaken on an air filter that is intended to monitor indoor microbial communities but has also collected pollen particles, a large portion of the recovered genomic material may come from plant genomes (Be et al., 2015). The specificity metric is particularly important for analyzing environmental samples because there could be numerous homologous sequences for each gene present in a sample. Various technologies, such as target gene sequencing, shotgun metagenome sequencing, and gene arrays, are commonly used for detecting specific organisms of interest. The specificity of target gene sequencing is determined primarily by means of the primers used for PCR amplification of the target genes. After amplification, single nucleotide differences can be resolved by subsequent high-throughput sequencing. Thus, theoretically, highly specific detection of different phylogenetic groups, species, strains, ecotypes, populations, genes, and/or single nucleotide polymorphisms can be achieved, depending on the target genes and sequencing depths, as well as the complexity of communities examined.
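As a rough illustration of the definition above, specificity can be computed as the share of classified reads that fall within the target community. The following is a minimal sketch only; the read counts and the taxon labels are hypothetical values chosen for the example, not data from any study cited here.

```python
def specificity(read_counts, target_labels):
    """Fraction of recovered reads that belong to the target community.

    read_counts: dict mapping a taxonomic label to a number of reads.
    target_labels: set of labels that make up the community of interest.
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads were recovered")
    on_target = sum(n for label, n in read_counts.items()
                    if label in target_labels)
    return on_target / total


# Hypothetical shotgun read counts from an air filter loaded with pollen.
counts = {
    "Bacteria": 1200,
    "Fungi": 300,
    "Viridiplantae": 8000,   # plant (pollen) reads, off target
    "Homo sapiens": 500,     # occupant DNA, off target
}
spec = specificity(counts, {"Bacteria", "Fungi"})
print(f"specificity = {spec:.2f}")
```

In this made-up case only 15 percent of the reads are on target, which is the situation the pollen example describes.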
To detect broader groups of organisms, highly conserved degenerate primers generally are designed, such as those used for amplifying 16S rRNA genes for bacteria and archaea, the internal transcribed spacer (ITS) for fungi, and 18S and 28S for eukaryotes in general (Cole et al., 2007; Fischer et al., 2016; Tedersoo et al., 2015). On the one hand, the higher the degree of conservation of the primers among different organisms, the broader is the phylogenetic scope of the organisms that can be detected. On the other hand, the acquired data are potentially less specific to the target genes/organisms of interest. Also, various primer sets can be designed for adopting next-generation sequencing (NGS) for phylogenetic marker genes, but their specificity and hence detection broadness vary greatly among different primer sets (Cole et al., 2007). Appropriate selection of primer sets for amplification also depends on a variety of factors such as research questions and objectives, sequencing platforms, and community composition. For instance, the primers for amplifying V3–V4 regions of 16S rRNA genes (280 bp fragment) have been used for Illumina sequencing platforms. Computationally, these primer sets should be able to amplify both bacteria and archaea, but in practice, archaea generally are poorly recovered. Thus, further development is needed for the detection of various types of archaea across different environments.
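The trade-off between primer conservation and specificity can be sketched with IUPAC degenerate bases: each degenerate position widens the set of templates a primer can bind. The code below is an illustrative sketch under simplifying assumptions (exact matching, no mismatch tolerance, one strand only); the primer and template sequences are hypothetical and are not real 16S rRNA primers.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


def binds(primer, template):
    """True if the primer matches the template exactly somewhere."""
    return primer_regex(primer).search(template.upper()) is not None


# Hypothetical primers and templates, for illustration only.
strict_primer = "ACGCTAGA"
degenerate_primer = "ACGYTNGW"      # Y = C/T, N = any base, W = A/T
template_1 = "TTACGCTAGATT"
template_2 = "TTACGTTCGTTT"

print(degeneracy(strict_primer), degeneracy(degenerate_primer))
print(binds(strict_primer, template_2), binds(degenerate_primer, template_2))
```

The degenerate primer encodes 16 sequences and matches both templates, whereas the nondegenerate primer matches only the first: broader phylogenetic scope at the cost of specificity.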
Although some earlier studies showed that detection specific to individual microbial species or strains could be achieved based on 16S rRNA genes (Loy et al., 2002; Rudi et al., 2000; Urakawa et al., 2002), detection specificity is still problematic (Zhou and Thompson, 2002). Analytic technologies based on functional genes and other noncoding sequences have advantages in specifically detecting individual species or strains (Zhou, 2003). For example, both target sequencing of functional genes and shotgun sequencing of whole communities are capable of providing highly specific information at the level of nucleotide differences on both known and novel genes and pathways (Hess et al., 2011; Mackelprang et al., 2011; Qin et al., 2012; Tringe et al., 2005; Venter et al., 2004). However, PCR amplification biases (Engelbrektson et al., 2010; Kunin et al., 2010; Lemos et al., 2012; Schloss et al., 2011), sequencing errors, and chimeric sequences (Edgar, 2013; Pinto and Raskin, 2012; Schloss et al., 2011) are inherent in sequencing technology, and they will have considerable impacts on detection specificity. The existence of nontarget contaminant DNAs in sequencing libraries will greatly affect detection specificity for the shotgun sequencing approach, which is a particular problem for host-associated microbiome studies in which sequence data may be predominantly from the host (Gevers et al., 2012; Kuczynski et al., 2012; Zhou et al., 2015). In addition, the uncertainty involved in selecting various bioinformatic tools for data processing could have significant impacts on detection specificity (Nayfach and Pollard, 2016; Sinclair et al., 2015). Contamination continues to present informatics challenges in metagenomics, with results showing errors in distinguishing a live organism of interest from nucleic acids isolated from laboratory contaminants (Merchant et al., 2014; Salter et al., 2014).
High specificity can be achieved under stringent hybridization conditions (e.g., 45°C plus 40 percent formamide for GeoChip 4.0, and 67°C plus 10 percent formamide for GeoChip 5.0) with GeoChip-based functional gene arrays for detecting specific taxa in analyses of environmental samples (Tu et al., 2014; Wang et al., 2014). Various controlled studies have demonstrated that such hybridization stringencies could differentiate sequences with <90–92 percent identity (Zhou, 2009). Also, unlike with sequencing technologies, contaminating nontarget DNA should have less impact on detection specificity (Zhou et al., 2015). However, low-level cross-hybridization to nontarget genes/strains always occurs. The challenge is to resolve true hybridization signals from nonspecific noise without ambiguity (Zhou et al., 2015). In addition, although array hybridization with probes is quite specific, detection is defined by the probe sets on the arrays, so, depending on the hybridization stringencies used, novel genes and highly divergent genes are not detected (Zhou et al., 2015). Consequently, array hybridization–based detection is not suitable for discovery of novel organisms (Zhou et al., 2015).
Taxonomic Resolution
Taxonomic resolution defines the amount of taxonomic information that is recoverable from genetic variation for each microbe in the target community (Hanson et al., 2012), with maximal resolution being an individual clone or strain and coarser resolution at higher phylogenetic levels such as genus and family. Taxonomic resolution is a critical issue for achieving appropriate detection specificity, sensitivity, and quantification to address questions related to microbial distributions, biogeography, activities, functions, and dynamic succession in response to treatments and environmental changes (Zhou et al., 2015). Although the use of phylogenetically accurate markers (e.g., 16S, 18S, 28S rRNA genes, ITS) in surveys radically changed the view of microbial diversity, distribution, and evolution, most studies are still based on information from short segments of these phylogenetic markers (Uyaguari-Diaz et al., 2016), typically 200–300 bp. Because of low rates of molecular evolution, it can be difficult to obtain fine-scale resolution at the desired species/strain level with phylogenetic markers such as 16S rRNA genes. In addition, because of sequencing errors, chimeras, and the lack of sufficiently accurate reference sequences, resolving classification at fine-scale taxonomic resolution (e.g., species) is even more difficult using limited short segments of phylogenetic marker genes (Jovel et al., 2016). Consequently, most bioinformatics tools provide only annotated information to the level of genera or above (Ritari et al., 2015), and hence the majority of microbial ecology studies are restricted to analyses at coarse taxonomic levels, such as differences by phylum or class (Jovel et al., 2016; Singer et al., 2016).
Recently, various approaches have been developed based on single-molecule sequencing using third-generation sequencing technologies such as PacBio (Singer et al., 2016) and Nanopore technology (Benitez-Paez et al., 2016) to obtain full-length sequences of phylogenetic markers, which potentially provide more accurate and finer taxonomic resolution of microbial communities and better prediction of metabolic potential. However, the experimental cost associated with such technologies remains quite high, although it is decreasing rapidly.
Recent informatics work shows that increased resolution of operational taxonomic units (OTUs) is possible with similarity metric alternatives to traditional sequence identity cutoffs applied to read clustering (Eren et al., 2016; Nguyen et al., 2016). New algorithms are being developed to use longer gene regions from new long-read sequencing technology (Singer et al., 2016), which improves differentiation of closely related species within a single microbial community profile. In addition, differing use of reference-based and reference-free read clustering can impact the reported community structure (He et al., 2015).
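For orientation, the traditional identity-cutoff clustering that these newer methods are compared against can be sketched as a greedy, centroid-based procedure. This is a deliberately simplified illustration, not the algorithm of any cited tool: it assumes pre-aligned reads of equal length, measures identity as the fraction of matching positions, and uses synthetic sequences.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches at or above
    the identity threshold; otherwise it founds a new OTU."""
    centroids, assignment = [], []
    for seq in seqs:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                assignment.append(i)
                break
        else:
            centroids.append(seq)
            assignment.append(len(centroids) - 1)
    return centroids, assignment


# Synthetic 100-base "reads": a reference, a 98%-identical variant,
# and a 96%-identical variant.
base = "ACGT" * 25
near = "TT" + base[2:]       # 2 mismatches relative to base
far = "TTTTT" + base[5:]     # 4 mismatches relative to base

_, otus_97 = greedy_otus([base, near, far], threshold=0.97)
_, otus_99 = greedy_otus([base, near, far], threshold=0.99)
print(otus_97, otus_99)
```

With a 97 percent cutoff the 98 percent variant is merged into the reference OTU, while a 99 percent cutoff separates all three sequences. The reported community structure therefore depends on the clustering choice as well as on the sample.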
Compared with phylogenetic gene markers, functional gene markers such as nifH genes and other coding sequences have higher taxonomic resolution (Scholz et al., 2016; Zhou et al., 2003, 2016). As a result, technologies based on functional genes, such as shotgun sequencing and GeoChip-based functional gene arrays (He et al., 2010a; Tu et al., 2014), could resolve organismal differences at the species/strain level. High taxonomic resolution is important for gauging such treatment effects as experimental warming (Xue et al., 2016), examining fine-scale biogeographic patterns (Liang et al., 2015; Zhou et al., 2008), and understanding microbial evolution (Kashtan et al., 2014; Nayfach and Pollard, 2016; Shapiro et al., 2012). For instance, DNA-based microarrays have demonstrated species/strain-level resolution in a wide range of environmental conditions (Be et al., 2013; Devault et al., 2014; Liebich et al., 2006; Tiquia et al., 2004; Zhou et al., 2010). It is important to note that, despite the advantages in providing greater taxonomic resolution relative to rRNA genes, functional gene biomarkers are more vulnerable to the effects of horizontal gene transfer (HGT); this is particularly the case for those that are frequently plasmid-borne, such as those involved in metal resistance and organic contaminant degradation (Zhou et al., 2008).
Analytic tools for shotgun sequencing continue to be actively developed. Genomic data can theoretically come from any part of each microbial community member’s genome. Thus, the information, which extends beyond an isolated marker gene, can increase taxonomic resolution, moving from differentiating genera and species to tracking individual genetically differentiated populations (e.g., bacteria within a strain), and documenting evolutionary changes at even short time scales (Greenblum et al., 2015). The two principal informatics research tracks are read binning and metagenomic assembly. Metagenomic assembly holds the greatest promise for maximizing accurate community characterization since it attempts to reconstruct each organism’s genome for maximal taxonomic resolution. It will also be important to develop common reference materials to support analysis. However, there are fundamental challenges to assembly that currently preclude its use as the sole technical approach (Ghurye et al., 2016). One barrier to assembly is lack of sequencing depth for each organism, a roadblock that is driven by limits in sequencing fidelity and throughput. Two open analysis challenges remain: disentangling individual genomes within a population of closely related organisms and resolving genome complexity, particularly in the case of microbial eukaryotes (Sangwan et al., 2016). These two problems could become easier to solve as read length increases and error rates are reduced with emerging sequencing technologies (Beitel et al., 2014; White et al., 2016). For microbial communities of the built environment, and with the use of current sequencing technology, the two practical features for determining informatics success in complete metagenomic assembly are ensuring sufficient biomass to retrieve representative sampling of DNA and limited community complexity to ensure adequate sequencing depth of each organism. Progress recently has been made in overcoming read depth limitations by developing new assembly approaches that bin and assemble reads that exhibit covariation patterns across multiple related microbiome samples (Imelfort et al., 2014; Nielsen et al., 2014). An open challenge remains in developing robust informatics tools to resolve mobile genetic elements, which are important for determining whether genes or pathways proliferate to multiple kinds of organisms (Jørgensen et al., 2014). Similarly, phage is a potential source of altering community composition (Koskella and Brockhurst, 2014) and stimulating lateral gene transfer, yet phage diversity and abundance are rarely addressed in existing microbiome studies (Rosario and Breitbart, 2011).
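The dependence of assembly on per-organism sequencing depth can be made concrete with a back-of-the-envelope estimate. The sketch below assumes that reads are drawn in proportion to each organism's share of the community DNA and uses the standard expected-coverage relation (depth = reads × read length × DNA fraction / genome size). The run size, genome size, abundances, and the 20-fold target depth are all illustrative assumptions rather than recommendations from the sources cited here.

```python
import math


def expected_depth(total_reads, read_length, dna_fraction, genome_size):
    """Expected average coverage of one genome in a shotgun metagenome."""
    return total_reads * read_length * dna_fraction / genome_size


def reads_needed(target_depth, read_length, dna_fraction, genome_size):
    """Total reads required for one genome to reach a target coverage."""
    return math.ceil(target_depth * genome_size / (read_length * dna_fraction))


# Illustrative values: 10 million 150-base reads, 5-Mbp bacterial genome.
depth_common = expected_depth(10_000_000, 150, 0.01, 5_000_000)   # 1% of DNA
depth_rare = expected_depth(10_000_000, 150, 0.001, 5_000_000)    # 0.1% of DNA
reads_for_20x = reads_needed(20, 150, 0.01, 5_000_000)

print(f"{depth_common:.1f}x, {depth_rare:.1f}x, {reads_for_20x:,} reads")
```

Under these assumptions an organism making up 1 percent of the DNA receives about 3-fold coverage and one at 0.1 percent about 0.3-fold. This is why limited community complexity and sufficient biomass matter so much for complete assembly.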
Sensitivity and Organism Coverage
Sensitivity measures the fraction of known taxa (for convenience, “species”) present in the microbiome that are detected even when the organisms are present at low abundance. Organism coverage defines the fraction of organisms detected regardless of whether they are known a priori or not. Thus, a targeted sequencing method can be of high sensitivity by amplifying and sequencing a single gene copy, even if occurring at ultra-low abundance. Yet, the same method would exhibit poor organism coverage since it would be designed to target a gene that likely can occur only in a limited number of organisms. The open-detection formats are best suited to maximizing organism coverage; however, even these tools are typically limited by access to reference databases needed to make a species assignment.
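The distinction between the two metrics can be illustrated with simple set arithmetic. This is a sketch under the definitions given above; the taxon names and the detection results are hypothetical.

```python
def sensitivity(detected, present, known):
    """Fraction of known taxa present in the sample that were detected."""
    known_present = present & known
    if not known_present:
        raise ValueError("no known taxa are present in the sample")
    return len(detected & known_present) / len(known_present)


def organism_coverage(detected, present):
    """Fraction of all taxa present that were detected, known or not."""
    if not present:
        raise ValueError("the sample contains no taxa")
    return len(detected & present) / len(present)


# Hypothetical community: three taxa known a priori and two novel ones.
present = {"taxon_A", "taxon_B", "taxon_C", "novel_1", "novel_2"}
known = {"taxon_A", "taxon_B", "taxon_C"}

# A targeted (closed) assay finds every known taxon, even rare ones,
# but cannot report organisms it was not designed for.
targeted = {"taxon_A", "taxon_B", "taxon_C"}
# An untargeted (open) assay recovers novel organisms but misses a rare one.
untargeted = {"taxon_A", "taxon_B", "novel_1", "novel_2"}

for name, detected in (("targeted", targeted), ("untargeted", untargeted)):
    print(name,
          round(sensitivity(detected, present, known), 2),
          round(organism_coverage(detected, present), 2))
```

In this toy case the targeted assay has perfect sensitivity but covers only 60 percent of the organisms present. The untargeted assay covers 80 percent while missing one known, low-abundance taxon.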
Sensitivity is a critical parameter for detection, particularly for complex environmental samples in which many populations exist in low abundances (Rhee et al., 2004; Wu et al., 2001, 2006). Sensitivity generally can be assessed based on the absolute amounts of template materials (e.g., DNA and RNA) needed for analyses and on the lowest percentage of a population within a community that can be detected. Although the former has been reported in various studies, information on the latter is sparse because it requires carefully designed and specially implemented experiments.
Because target gene sequencing involves PCR amplification, typically with 25–35 cycles, its detection sensitivity is generally expected to be as high as that of PCR amplification. Highly sensitive detection can be achieved with NGS (de Boer et al., 2015; Leonard et al., 2015). In general, 1–10 ng of DNA is used for library preparation for target sequencing with the Illumina platform. Shotgun sequencing typically requires 1 μg DNA for library preparation with sonication for DNA fragmentation without PCR amplification. A small number of PCR amplification cycles (typically 6) could also be used in library preparation for shotgun sequencing with the Illumina platform, depending on the amount of starting material. If biomass is extremely low (e.g., 1 ng), a higher number of PCR amplification cycles (e.g., 18 cycles) can be used in library preparation for shotgun sequencing. However, this approach is sensitive to dominant populations in the sample, which can be oversampled. Consequently, it might be difficult to detect rare taxa (Zhou et al., 2015). Further studies might be necessary to provide explicit evidence for the lowest abundance of a population in a complex community that can be detected using these technologies.
Sensitive detection can also be obtained with functional gene arrays. With the current version of GeoChip fabricated by Agilent printing technology and updated protocols, 0.2–1.0 μg of community genomic DNA is needed for direct labeling and hybridization, a sample size that can suffice for analyzing environmental samples from many habitats, such as soils, marine sediments, bioreactors, and wastewater treatment plants (Zhou, 2009; Zhou et al., 2015). Nucleic acids (1–500 ng) can also be representatively amplified using whole-community genome DNA (1 ng) amplification (WCGA) (Wu et al., 2006) or whole-community RNA amplification (WCRA) (500 ng) (Gao et al., 2007) if biomass is extremely low. Very low concentrations of DNA (~10 fg, ~2 bacterial cells) can be detected using a modified amplification method (Wu et al., 2006). Furthermore, recent studies showed that the Agilent-based functional gene arrays are highly sensitive, with a detection limit as low as 5 × 10⁻⁴ to 5 × 10⁻⁵ proportion of populations within a complex soil community in terms of DNA concentration (μg). The phylogenetic arrays (e.g., PhyloChip) exhibit a detection limit of 10⁷ copies or 0.01 percent of nucleotides hybridized to the array (Brodie et al., 2007; DeAngelis et al., 2011). Such detection sensitivity is comparable to that of PCR amplification-based target gene and shotgun sequencing.
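The correspondence between DNA mass and cell number quoted above (about 10 fg for about 2 bacterial cells) can be checked with a short calculation. The sketch below assumes one genome copy per cell, a genome of 5 million base pairs, and an average mass of 650 g/mol per base pair of double-stranded DNA. These are common rule-of-thumb values introduced for illustration and are not taken from the cited studies.

```python
AVOGADRO = 6.022e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650   # assumed average mass of one base pair (dsDNA)


def genome_copies(dna_mass_g, genome_size_bp):
    """Number of genome copies contained in a given mass of DNA."""
    grams_per_genome = genome_size_bp * GRAMS_PER_MOL_PER_BP / AVOGADRO
    return dna_mass_g / grams_per_genome


# 10 fg of DNA and an assumed 5-Mbp bacterial genome.
copies_in_10_fg = genome_copies(10e-15, 5_000_000)
# 1 ng of DNA, the low-biomass library input mentioned earlier.
copies_in_1_ng = genome_copies(1e-9, 5_000_000)

print(f"{copies_in_10_fg:.2f} genome copies in 10 fg")
print(f"{copies_in_1_ng:.2e} genome copies in 1 ng")
```

With these assumptions, 10 fg corresponds to just under two genome copies, consistent with the figure of roughly two cells. A smaller or larger assumed genome would shift the estimate proportionally.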
Microbial biomass is generally very low for built environment samples, and hence sensitivity is a critical issue for built environment microbiome studies. At this stage, it is not clear whether the available sequencing- and array-based detection technologies are sensitive enough for built environment microbiome studies. Rigorous systematic examinations of this issue within the context of the built environment are needed.
Despite progress in increasing the accuracy of metagenomic assembly, read binning software tools remain an important complement to improve sensitive organism detection in metagenomic shotgun sequencing since there are fewer restrictions on minimum read depth. This feature enables identification of low-abundance species, which can be important for accurate community profiling (Segata et al., 2013). Rare taxa, if present, can potentially explode in numbers if environmental conditions change. Unsupervised binning uses short sequence frequency profiles to group sequencer reads with similar profiles and can be useful both for characterizing novel organisms (Liao et al., 2014) and for identifying novel organisms common across
multiple samples (Alneberg et al., 2014). Supervised approaches are widely used and match reads against a reference database for taxonomic identification. The use of multilevel hash tables, suffix arrays, de Bruijn graphs, and other related searchable data structures offers the potential to explore large metagenomic datasets against a comprehensive genome database (Ames et al., 2015). A clear challenge is the recognition that reference databases are dynamic and must be updated regularly to reflect the increasing number of available sequenced organisms.
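The supervised, reference-based approach can be sketched with a hash table that maps short subsequences (k-mers) to the reference taxa that contain them. The example is a toy illustration of the general idea and not the method of any cited tool: the reference "genomes," the reads, the k-mer length, and the minimum-hit rule are all hypothetical choices. Real classifiers use far longer k-mers and much larger databases.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer in a sequence."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(references, k):
    """Hash table from k-mer to the set of reference taxa containing it."""
    index = defaultdict(set)
    for taxon, genome in references.items():
        for kmer in kmers(genome, k):
            index[kmer].add(taxon)
    return index


def classify(read, index, k, min_hits=2):
    """Assign a read to the taxon with the most taxon-specific k-mer hits,
    or return None if the evidence is insufficient."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer)
        if taxa and len(taxa) == 1:      # ignore k-mers shared across taxa
            votes[next(iter(taxa))] += 1
    if not votes:
        return None
    taxon, hits = votes.most_common(1)[0]
    return taxon if hits >= min_hits else None


# Hypothetical reference sequences and reads.
references = {
    "taxon_A": "ACGTACGGTCCATGCAAGTC",
    "taxon_B": "TTGACCGTAGGCTTAACGGA",
}
K = 5
index = build_index(references, K)

read_1 = classify("GGTCCATGCA", index, K)   # drawn from taxon_A
read_2 = classify("CGTAGGCTTA", index, K)   # drawn from taxon_B
read_3 = classify("AAAAAAAAAA", index, K)   # absent from the database
print(read_1, read_2, read_3)
```

The unclassified third read illustrates the dependence on database content: an organism missing from the reference collection cannot be assigned, however abundant it is in the sample.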
Organism Viability
Viability refers to the measurable ability of a microorganism to replicate under artificial (engineered) or natural conditions. Because culture-based analyses can assess viability, such analyses historically have been preferred over non-culture-based methods, which with many classical protocols cannot assess viability. Culture-based approaches are valid for assessing air and surface samples for some infectious pathogens, for example. In the context of the built environment, however, adverse health effects include more than infectious diseases that depend on viability for disease transmission. Evidence suggests that adverse health effects can be caused not only by inhalation of viable airborne pathogens but also by inactive microorganisms and their fragments and component parts (Miller, 1992). There is evidence suggesting that nonviable fungi, their spores, and their fragments all can cause respiratory illness and chronic systemic illness (Burge, 1990; Flannigan et al., 1991; Sorenson et al., 1987; Su et al., 1992). The same is true for some species of airborne bacteria (Flannigan, 1992). Organisms found in indoor bioaerosols and on surfaces include those that are pathogenic (e.g., Aspergillus niger), those that are toxigenic (e.g., Stachybotrys atra), and those that are allergenic (e.g., Aspergillus versicolor).
Inhalation exposure to microbial toxins and allergens contained in/on spores or inactive microbial fragments is not detected by viability-based culturing methods. To better understand the ecology of bioaerosols and to study the effects of engineering controls on the indoor environment, methods are needed to differentiate between metabolically competent and nonviable airborne microorganisms as they exist in situ. In short, a dead microbe is not an irrelevant dust particle, but may have serious health consequences.
Conventional culture-based approaches, of course, have intrinsic limitations for characterizing airborne microorganisms. Standard plate counts underestimate the true quantity and diversity of airborne microbes, as this method is incapable of investigating the fate of slow-growing, unculturable, or inactive microbes and their fragmented parts, and culturing techniques do not span the range of environments that prompt microbial metabolic activity and reproduction. PCR has recently been adapted to characterize airborne microorganisms; however, PCR also has intrinsic limitations, particularly in gauging activity. While promising, genetic amplification methods have been reported to estimate microbial biomass and diversity inaccurately. PCR is as yet incapable of assessing in situ activity; provides no measure of microbial fractions; and is susceptible to some ubiquitous environmental interferences, particularly trace concentrations of heavy metals (Alvarez et al., 1995; MacNeil et al., 1995).
Fluorochrome enumeration has been adapted to measure airborne bacterial concentrations in indoor environments using the DNA intercalating agents acridine orange or ethidium bromide (Griffiths et al., 1996; Moschandreas et al., 1996; Palmgren et al., 1986; Terzieva et al., 1996). In a bench-scale chamber study (550 cm³), Terzieva and colleagues (1996) directly enumerated Pseudomonas fluorescens captured in impingers using proprietary membrane integrity dyes to assess viability. In many ecological and environmental studies, fluorochromes have been coupled with various heterocyclic tetrazolium dyes and nalidixic acid to determine not only total microorganism numbers but also the fraction of metabolically active microorganisms (Hernandez et al., 1999; Maki and Remsen, 1981; McFeters, 1995; Rodriguez et al., 1992; Tabor and Neihof, 1982; Trevors, 1985). However, in most indoor aerosol and surface studies, the detection and quantification of metabolically active microorganisms have, until recently, been limited to agar plate count methods, where sampling methods, nutritional requirements, and culturability bias the results (Burge, 1990; Buttner et al., 1993; Flannigan, 1993; Hinds, 1982; Jensen et al., 1992; Marchand et al., 1995; Pillai et al., 1996; Teltsch and Katzenelson, 1978).
Biologic Activity
“Biologic activity” refers to any enzymatically mediated metabolic function, or fraction thereof, that contributes to the persistence of a microorganism as a living entity, regardless of its ability to propagate. DNA-based metagenomics provides a “snapshot” of the diversity and functional potential of various microbial taxa with potentially functional populations in a community, but it is not clear whether the populations of these taxa are actively engaged in metabolic activity and reproduction (which is required for a population to persist). To address such questions, other “omics” approaches, including metatranscriptomics for assessing mRNAs, metaproteomics for measuring proteins, and metametabolomics for monitoring metabolites, as well as stable isotope probing, are more appropriate (Zhou et al., 2015).
Metatranscriptomics typically involves random sequencing of microbial community mRNA (DeLong, 2009; Moran, 2009; Moran et al., 2012; Shi et al., 2009; Sorek and Cossart, 2010). Total RNA is first extracted from a microbial community, generally with rRNA removal. Then, mRNA is amplified, converted into cDNA, and sequenced. Because mRNA makes up only a small fraction of total cellular RNA (e.g., 1–5 percent) and prokaryotic mRNA lacks poly(A) tails, rRNA is often removed and mRNA is amplified before sequencing to improve detection sensitivity (He et al., 2010b; Sorek and Cossart, 2010; Stewart et al., 2010). Effectively removing rRNA can be labor-intensive, time-consuming, and challenging.
Metatranscriptomics has been widely used for characterizing microbial communities from different habitats, such as soil (Urich et al., 2008), seawater (Frias-Lopez et al., 2008; Poretsky et al., 2009; Stewart et al., 2012), human/animal microbiomes (Giannoukos et al., 2012), and activated sludge (Yu and Zhang, 2012). Results from such studies demonstrate that metatranscriptomics provides a powerful approach to functionally characterizing microbial communities. Although metatranscriptomics is attractive, the challenge is to obtain total community RNA of sufficiently high quality and to completely remove rRNA prior to sequencing. Obtaining sufficient community RNA for metatranscriptomics could be particularly difficult for microbial communities associated with the built environment, given the challenges of low biomass in samples.
Analysis of community RNA can also be performed via functional gene arrays (Xue et al., 2016). The main advantage of the array-based approach is that total community RNAs can be used for direct hybridization without the need for removing rRNAs, so the activity of less abundant taxa and populations can be more easily discerned, and hence the results can be more quantitative. However, many key genes of known functions and novel genes can be missed if the probes on the arrays are not representative of the diversity of the community examined; thus, the activity information may be constrained to known taxa and functional populations.
The general belief is that only mRNA, proteins, and metabolite levels can be used to assess activities of individual taxa and populations (Nawy, 2013). However, in many cases, the activities of functional genes/populations can be inferred based on changes in DNA abundances, particularly if time-series data are available. That is, community members or functions that are more (or less) active in certain conditions would have increased (or decreased) abundances reflected in higher (or lower) abundance of associated DNA. Thus, DNA-based measurements can be appropriate for signifying changes in functional activities in these cases (e.g., a growing microbial population), potentially providing a good alternative to mRNA and protein measurements. Because the half-life of mRNA is generally very short, it might in any case be less suitable for comparing functional activities of genes/populations at ecological time scales (e.g., months, seasons, years). It is also more challenging to process RNA appropriately in the field, so the development of alternative approaches could broaden the situations in which microbiomes can be assessed in the built environment. Consequently, in many ecological studies, DNA-based abundance changes have been used for assessing the activities of functional genes/populations (Van Nostrand et al., 2011). However, it is important to be cautious because assessing functional activities in natural settings can be complicated. For instance, active genes could be coupled to nonactive genes within a single genome (e.g., presence of a nonexpressed nif operon), making the discrimination between active and nonactive genes/populations impossible. In this case, DNA-based abundance changes would be unsuitable for measuring functional activities. Thus, ideally, a combination of both DNA- and mRNA-based measurements, as well as protein- and metabolite-based measurements, would be used to assess the presence and activity of genes/populations in a complementary and mutually reinforcing fashion.
To optically determine microbial activity in situ, redox dyes that serve as nonspecific substrates for microbial respiration have been employed to stain actively respiring microorganisms within aerosol samples. A tetrazolium analog that intracellularly reduces to a fluorescing formazan is now commercially available in high purity (Rodriguez et al., 1992). These compounds provide for concurrent fluorescence determinations of bacterial numbers and metabolic activity regardless of propagation potential. Fungal conidial and hyphal viability have been quantitatively assessed using fluorochromes that mark fungal esterase activity. Fluorescein diacetate is a colorless substrate that is intracellularly cleaved to brightly fluorescing free fluorescein by active fungal esterase enzymes. It has been found to assess the viability of fungal hyphae and conidia accurately and precisely compared with standard germination tests (Firstencel et al., 1990; Jensen and Lysek, 1991). This compound provides for concurrent epifluorescent determinations of fungal mass and activity.
Functional coverage is the amount of gene function potential recovered from the microbiome sample. This remains a particular challenge, because even recovery of the complete complement of proteins, genes, or metabolites does not automatically yield an accurate functional assessment since the function may remain unknown.
Metagenomic-based gene annotation gives a preliminary description of biochemical process potential in microbial communities, with transcriptionally active pathways identified through RNA-based metagenomic sequencing. Functional gene annotation operates primarily by identifying genes through sequence homology search using translated amino acid query sequences. There have been a few efforts to identify compact genetic signatures unique to gene families of interest (e.g., Kaminski et al., 2015), and these have been applied to tracking antibiotic resistance genes in indoor environments (Hartmann et al., 2016). However, most gene identification pipelines continue to benefit from longer or assembled reads to recover the correct gene families (Carr and Borenstein, 2014). Functional annotation relies on the completeness of reference gene databases, such as KEGG (Kanehisa et al., 2014) and eggNOG (Huerta-Cepas et al., 2016). Gene function profiling is commonly separated from the organism, and abundance quantification of functional categories is calculated by measuring the numbers of reads that map to a set of reference protein sequences in a family or profile hidden Markov model (Eddy, 1998). Gene pathway abundance can then be inferred from gene family abundance. In addition to direct gene measurements, 16S data have been used as a proxy to retrieve reference genomes thought to be closely related to the 16S OTUs (Langille et al., 2013). However, particular attention needs to be paid to the potential for OTUs to match directly with the correct sequenced genomes and the potential for intraspecies and strain gene gain and loss, which would lead to incorrect taxonomic identification.
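The read-counting step described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited pipeline: the gene family identifiers, family lengths, and pathway membership table are hypothetical, and normalizing counts by family length and averaging over member families are assumptions made for the example.

```python
from collections import defaultdict

def family_abundance(read_hits, family_length_bp):
    """Count reads assigned to each gene family and normalize by
    family length, giving reads per kilobase."""
    counts = defaultdict(int)
    for family in read_hits:
        counts[family] += 1
    return {f: counts[f] / (family_length_bp[f] / 1000.0) for f in counts}

def pathway_abundance(fam_abund, pathway_members):
    """Infer pathway abundance as the mean abundance of member families.
    Families with no mapped reads contribute zero (an assumption)."""
    return {
        pathway: sum(fam_abund.get(f, 0.0) for f in members) / len(members)
        for pathway, members in pathway_members.items()
    }

# Hypothetical inputs: one family label per mapped read.
read_hits = ["famA", "famA", "famB", "famA", "famC", "famB"]
family_length_bp = {"famA": 1500, "famB": 1000, "famC": 500}
pathway_members = {"pathway1": ["famA", "famB"], "pathway2": ["famC", "famD"]}

fam = family_abundance(read_hits, family_length_bp)
path = pathway_abundance(fam, pathway_members)
```

In this toy input, the second pathway scores lower only because one of its member families (famD) recruited no reads, which shows how an incomplete reference or shallow sequencing can depress an inferred pathway abundance.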
Integrating molecular data from multiple sources (genome, transcript, metabolite, and protein) presents an emerging opportunity to identify more accurately the biologic processes that explain how the diverse elements of the microbiome persist in a specific environment (Jansson and Baker, 2016; Quinn et al., 2016). For metaproteomics, however, determining the completeness of the proteins recovered remains a challenge, and applying an accurate protein database search strategy, ideally informed by sample-specific metagenomic data, influences the observable microbiome protein content (Armengaud, 2016; Herbst et al., 2016). Complete recovery of the biochemical structures directly from mass spectrometers has yet to be fully addressed, in part because of the lack of well-annotated databases, with only an estimated 1.8 percent of spectra annotated from untargeted mass spectrometry experiments (da Silva et al., 2015; Wang et al., 2016). Considerable efforts continue on developing databases of efficiently encoded spectral data to support a more complete recognition of unlabeled measured chemical compounds (Quinn et al., 2017). Recent efforts have demonstrated the utility of topic modeling, which illustrates the potential to improve chemical substructure feature encoding and also improve sensitivity in automated biochemical annotation (van der Hooft et al., 2016). Such approaches will also benefit from being integrated with molecular networking (Watrous et al., 2012) and genome mining strategies (Cimermancic et al., 2014; Donia et al., 2014). Genomic data, however, continue to serve as a convenient information source, and previously annotated metabolic pathways have been used to infer metabolic profiles using the relative abundance of annotated enzymes with some comparisons with direct measurements of metabolites (Noecker et al., 2016). These methods have the potential to predict metabolic processes that originate from individual genes and organisms and to enter into mechanistic models for community dynamics (Mendes-Soares et al., 2016). The models have so far been restricted to limited curated metabolic pathway data and rarely take multiorganism interactions such as competition or facilitation into account (Henry et al., 2016).
Of fundamental importance are tools that use multiple microbiome samples to build models that predict and explain the persistence of microbial communities and their functional characteristics. Co-association networks are a common approach used to predict pairwise organism interactions, including cooperation and competition, based on co-occurrence patterns in multiple samples (Faust et al., 2015). Inferring organism interaction networks remains challenging for several reasons, in part because compositional data present underlying dependencies between organisms, with normalization procedures that can confound some correlation methods (McMurdie and Holmes, 2014), and large-dimensional feature space can lead to sparse feature representation and model overfitting (Cardona et al., 2016). Multi-organism dependencies (beyond pairwise) may be important but require additional samples to support rigorous statistical confidence. Recent work has employed the use of synthetic community data to show how the choice of method can impact the inferences about any organism relationships detected (Weiss et al., 2016). General extensions from organism interaction networks to gene interaction networks have not been as well developed but could yield new predictive models of important molecular processes required for successful microbial communities (Boon et al., 2014).
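A co-association network of the kind described above might be sketched as follows. This is an illustrative simplification under stated assumptions, not the method of any cited study: the count table and taxon names are hypothetical, the centered log-ratio (CLR) transform with a pseudocount is one common way to reduce compositional effects, and the correlation threshold of 0.8 is arbitrary.

```python
import math
from itertools import combinations

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def co_association_edges(samples, taxa, threshold=0.8):
    """Return taxon pairs whose CLR-transformed abundances are strongly
    correlated (positively or negatively) across samples."""
    transformed = [clr(s) for s in samples]
    edges = []
    for i, j in combinations(range(len(taxa)), 2):
        xi = [row[i] for row in transformed]
        xj = [row[j] for row in transformed]
        r = pearson(xi, xj)
        if abs(r) >= threshold:
            edges.append((taxa[i], taxa[j], round(r, 3)))
    return edges

# Hypothetical count table: rows are samples, columns are taxa.
taxa = ["taxonA", "taxonB", "taxonC", "taxonD"]
samples = [
    [120, 60, 5, 30],
    [200, 110, 2, 25],
    [80, 35, 9, 40],
    [150, 70, 4, 28],
    [60, 30, 12, 45],
]
edges = co_association_edges(samples, taxa)
```

With only five samples and four taxa, any edges found here carry little statistical weight; the sketch is meant to show the mechanics, and real analyses would need many more samples and a correction for multiple comparisons.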
Toxicologic potential reflects the ability of an individual chemical agent, or combination of chemical agents, to alter the normal metabolic functions or replication of a cell such that its life cycle is shortened, or its functioning is compromised relative to the condition where such exposure did not occur. Toxicity is classically realized in a dose-response scenario beyond a distinct threshold dose in biologic systems.
Indoor pollutants include airborne microscopic particulate matter composed in whole or in part of biogenic materials, which are often termed “bioaerosol.” By this definition, bioaerosols include all airborne microorganisms regardless of viability or ability to be recovered by culture; additionally, the term encompasses their fractions, other biopolymers, and products from all varieties of living things (ACGIH, 1999). Bioaerosols originate from occupants (e.g., humans, pets, houseplants), but they can also drift in from external sources, as well as originate from building materials that are in high-moisture environments and/or have experienced water damage, even after structures have been considered refurbished by modern construction practices.
Numerous publications on indoor air quality report that airborne biologic particles can range in aerodynamic diameter from 0.01 μm to 100 μm (ACGIH, 1999); in many environments, airborne bacteria, fungi, their fragments, and other biopolymeric materials may fall into a size range that can penetrate into human lungs (<3 μm) (Górny et al., 2002; Reponen et al., 2001). While only intact microorganisms can be infectious, and culturable numbers of airborne bacteria have been positively correlated with adverse respiratory symptoms (Björnsson et al., 1995), toxic, hypersensitive, and allergic reactions can also be caused by microorganism fragments or their biochemical by-products (Burrell, 1991; WHO, 1990). The health “penumbra” of the microbiome extends well beyond living, reproducing microbes. Well-known examples of potent biogenic fractions, which are collectively referred to in this context as “biomarkers,” include endotoxin, a compound found in the outer membranes of Gram-negative bacterial cell walls (ACGIH, 1999); many peptides from bacterial and fungal cell walls and metabolic products (ACGIH, 1999; Miller, 1992); β-(1-3)-D-glucans, found in fungal cell walls (ACGIH, 1999); and mycotoxins, products of fungal metabolism (Robbins et al., 2000). The exocellular toxins produced by the airborne bacterium responsible for whooping cough, Bordetella pertussis, serve as an unfortunate example of a reemerging toxigenic disease, the agents for which have never been recovered from ambient aerosol by conventional culture techniques.
While air quality indices and recommended threshold exposure levels are well defined in terms of certain chemical compounds and particulate matter masses, they are inadequately defined regarding airborne or surface-borne contaminants of biologic origin. In contrast with the scientific grounding for wastewater and drinking water regulation, bona fide toxicologic characterization of aerosols is only beginning to emerge in the aerosol and built environment community (Brook et al., 2010; Li et al., 2003), making it difficult to devise effective building regulations.
Only the culturable portion of bacteria in the atmosphere has been studied in detail (Hernandez et al., 1999; Moschandreas et al., 1996; Tong and Lighthart, 1999), and it is clear that air quality regulations have been biased by analytic reliance on culture over the last generation (Flannigan, 1997; Heidelberg et al., 1997; Henningson et al., 1997; MacNaughton et al., 1997). It is now known that many microorganisms (>99.9 percent in some environments) are not readily cultured using routine media and growth conditions (Amann et al., 1995; Pace, 1997). The most basic genetic characterizations are only beginning to be applied to the atmospheric environment (Womack et al., 2010), and toxicology assays have not been systematically adapted to determine the relationship between the amount of ambient (bio)aerosols and the likelihood of inducing stress responses in accepted cellular exposure models (Douwes et al., 2003).
Those limited bioaerosol regulations that do exist for indoor environments are in the form of guidelines based on culturable airborne microbe concentrations from grab samples, without taking into account other assays that could better indicate the potential for impacts on human and ecosystem health (Rao et al., 1996). There is also little recognition that transmission depends upon host traits as well, which in addition to such factors as age, health condition, and genetic variation in susceptibility, could in the built environment reflect behavioral patterns of space occupancy and use. Organizations such as the North Atlantic Treaty Organization (NATO) and the World Health Organization (WHO) have concurred that there is a pressing need to develop more accurate and robust methods for characterizing the biologic contributions to total exposure (aerosol) loads (Maroni et al., 1995; WHO, 1990), yet only in the past few years have basic toxicology perspectives emerged in the indoor air quality arena beyond the characterization of occupational exposures.
Quantification refers to the basic definable unit of biologic measurement, which can be unambiguous and identified by a referenced and accepted definition (e.g., colony-forming unit). Determination of the different microbial taxa that make up indoor bioaerosols and are surface associated is particularly important because gross abundance can serve as a meaningful indicator of exposure to airborne respiratory health risks and allergens. Fluorochromes are now available that are specific for different groups of microorganisms and thus allow for representative estimates of the range of populations that make up the microbiological fraction of indoor aerosols. Since they are used in a direct visualization technique, stains considerably reduce the potential for inaccuracies in estimating microorganism numbers in environmental samples regardless of the medium (aerosol or surface). The fluorochrome stains acridine orange (AO), 5-(4,6-dichlorotriazinyl) amino-fluorescein (DTAF), 4′,6-diamidino-2-phenylindole (DAPI), and calcofluor M2R target different biomolecules and have been used successfully in (built) environment samples for total microorganism counts and size measurements (Bloem et al., 1995; Hernandez et al., 1999; Hobbie et al., 1977; Wagner et al., 1994). Well-tested fluorochrome stains have been used to directly enumerate (by means of microscopy) three major microbial taxa that make up airborne and surface populations, and reviews on the subject relevant to the indoor environment are available (Peccia and Hernandez, 2006).
Various quantitative parameters (e.g., absolute abundance, relative abundance, and copy numbers) are used to capture different biologic properties of a taxon or gene in a community (Nayfach and Pollard, 2016). Ideally, absolute abundances of individual taxa or genes in a community are desired for subsequent statistical analysis and model prediction. However, obtaining accurate absolute abundance data using high-throughput sequencing technologies is challenging. Because of inherently high variations in experimental protocols and bioinformatics, measurements can be quite different among different samples even under identical conditions. Therefore, most microbial ecology analyses with sequencing data are based on relative rather than absolute abundance. This is an important impediment in relating microbiome data to fundamental models in population, community, and ecosystem ecology, where absolute numbers matter. Moreover, even relative abundance calculations for universal marker genes are limited by amplification bias (Brooks et al., 2015). When unbiased, such data provide relative measures of community composition, which need not correspond with absolute abundance. Careful attention needs to be paid to using statistical methods that account for compositional bias to avoid false correlations across multiple samples with systematic differences in sample characteristics, such as sequencing depth (Kurtz et al., 2015). Computationally, metagenomic sequencing provides additional information, with the potential to measure copies of recovered genes and genomes irrespective of the organisms present. Relying on universal marker genes remains the most common approach for abundance profiling (Manor and Borenstein, 2015; Nayfach et al., 2016).
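The point that relative measures need not track absolute abundance can be shown with a small numerical sketch. The taxon names and cell counts below are hypothetical and chosen only for illustration.

```python
def relative_abundance(counts):
    """Convert per-taxon counts to fractions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical absolute cell counts: only taxonB changes between samples.
sample1 = {"taxonA": 100, "taxonB": 100}
sample2 = {"taxonA": 100, "taxonB": 300}

rel1 = relative_abundance(sample1)
rel2 = relative_abundance(sample2)

# taxonA is unchanged in absolute terms, yet its relative abundance
# falls from 0.5 to 0.25 because taxonB grew.
change_in_taxonA = rel2["taxonA"] - rel1["taxonA"]
```

An analysis based only on the relative values would suggest that taxonA declined, which is the kind of spurious signal that compositionally aware statistical methods are meant to guard against.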
There are additional efforts to apply statistically rigorous abundance quantification in terms of genome abundance; however, more work is needed to evaluate existing datasets on real conditions of microbiomes in built environments (McLoughlin, 2016). Nevertheless, modeling abundance using information beyond a selected set of marker genes could improve flexibility, permitting one to quantify abundance in cases where marker genes are inaccessible. A promising recent development is the peak-to-trough ratio (Korem et al., 2015) and a related extension (Brown et al., 2016). These methods observe that bacterial replication begins at a distinct replication site, and a replication rate can be inferred by measuring the change in sequencing depth for each genome, starting from the origin of replication to the distal portion of the genome. The replication rates are used to infer bacterial growth rates from a single microbiome snapshot and present an important emerging informatics technique for estimating replication rates for microbial communities in the built environment.
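The idea behind the peak-to-trough ratio can be sketched as follows. This is a deliberately simplified illustration and not the published implementation of Korem et al. (2015) or Brown et al. (2016): it assumes a single circular genome, coverage already summarized into equal-width bins (hypothetical values), and a plain circular moving average for smoothing.

```python
def moving_average_circular(values, window=3):
    """Smooth binned coverage on a circular genome.
    The effective window is 2 * (window // 2) + 1 bins."""
    n = len(values)
    half = window // 2
    width = 2 * half + 1
    return [
        sum(values[(i + k) % n] for k in range(-half, half + 1)) / width
        for i in range(n)
    ]

def peak_to_trough_ratio(bin_coverage, window=3):
    """Ratio of highest to lowest smoothed coverage along the genome.
    A higher ratio suggests more cells are actively replicating."""
    smoothed = moving_average_circular(bin_coverage, window)
    trough = min(smoothed)
    if trough <= 0:
        raise ValueError("coverage too low to estimate a ratio")
    return max(smoothed) / trough

# Hypothetical coverage: highest near the origin of replication (bin 0),
# lowest near the terminus (bin 5).
coverage = [40, 36, 32, 28, 24, 20, 24, 28, 32, 36]
ptr = peak_to_trough_ratio(coverage)
flat = peak_to_trough_ratio([10] * 8)  # non-replicating population
```

A population that is not replicating would show flat coverage and a ratio of 1, while the sloped hypothetical profile gives a ratio above 1. In low-biomass built environment samples, coverage may be too sparse for the smoothing step to be reliable, which is one reason further evaluation on real datasets is needed.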
Because traditional PCR amplification is involved in amplicon-based target sequencing, it appears that target gene sequencing is not quantitative in complex communities, as previously demonstrated (Pinto and Raskin, 2012; Tremblay et al., 2015; Zhou et al., 2011). This limitation is consistent with the previous observations about pyrotag sequencing studies (Engelbrektson et al., 2010) and with the general consensus that traditional PCR amplification is not quantitative (Qiu et al., 2001; Suzuki and Giovannoni, 1996). Various strategies have been proposed for alleviating the amplification biases on quantification, including combining several amplifications (>3) and using fewer cycle numbers (e.g., 25 or no more than 30 cycles) to avoid PCR product saturation. However, some studies have shown that quantitative estimation of organismal abundance can be obtained with deep amplicon sequencing of 16S rRNA genes in a simple microbial community (Avramenko et al., 2015; de Boer et al., 2015). Nevertheless, great caution is necessary in drawing quantitative inferences about microbial community diversity, and in particular absolute abundances, in comparative studies based on amplicon sequencing data.
There are two general approaches for quantitatively assessing the abundance of organisms and functional genes in a shotgun metagenome approach. One is to classify sequencing reads according to reference databases of genes and/or genomes via alignment-based homology analyses, followed by counting the classified reads to estimate the abundance of taxonomic groups and gene families (Nayfach and Pollard, 2016). The complementary approach is to perform de novo analyses by grouping shotgun sequencing reads and then annotating the resulting OTUs and gene families based on their homology to known genes. Theoretically, it is believed that shotgun sequencing could be quantitative (Nayfach and Pollard, 2016; Zhou et al., 2015) because shotgun sequencing of whole communities does not require amplification prior to sequencing if template DNA is sufficient, and hence it avoids many of the biases encountered in amplicon sequencing. However, quantifying organisms and functional genes in a shotgun metagenome also presents several unique challenges (Nayfach and Pollard, 2016).
First, reference databases used to date do not represent the vast majority of microbial diversity at low taxonomic levels. Consequently, it is difficult to assign taxa or genes with high confidence to estimate their abundance (Nayfach and Pollard, 2016). Also, high inherent variation in experimental protocols and uncertainty in selecting bioinformatics tools for analysis challenge the quantitative estimation of the abundances of individual taxa and genes (Clooney et al., 2016; Kerepesi and Grolmusz, 2016; Nayfach and Pollard, 2016). In addition, the massive size of short-read metagenome data makes it very difficult to compare taxa and genes across different samples. As a result, it might be impossible to obtain an absolute abundance estimation based on shotgun sequencing data alone (Nayfach and Pollard, 2016). By combining sequencing with other techniques, such as density measurements and quantitative PCR, it might be feasible to obtain estimates of absolute abundance (Nayfach and Pollard, 2016). Finally, it has recently been concluded that cellular relative abundance and average genomic copy number are the most meaningful biologic parameters that can be quantified based on shotgun sequencing reads (Nayfach and Pollard, 2016).
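One way to combine sequencing with quantitative PCR, as suggested above, is to scale sequencing-derived relative abundances by an independently measured total. The sketch below assumes hypothetical values throughout and ignores differences in marker gene copy number among taxa, which a real analysis would need to consider.

```python
def absolute_abundance(relative, total_copies):
    """Scale relative abundances (fractions summing to 1) by a total
    marker gene copy number measured separately, e.g., by qPCR."""
    return {taxon: frac * total_copies for taxon, frac in relative.items()}

# Hypothetical relative abundances from sequencing.
relative = {"taxonA": 0.6, "taxonB": 0.3, "taxonC": 0.1}

# Hypothetical qPCR result: total gene copies per cubic meter of air.
total_copies_per_m3 = 2.0e5

absolute = absolute_abundance(relative, total_copies_per_m3)
```

The result inherits the errors of both measurements, so uncertainty in the qPCR total propagates proportionally to every taxon.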
In contrast to sequencing-based approaches, absolute abundance of taxa and genes can be estimated based on the signal intensity from array hybridization. This is because signal intensity is derived from the extent of actual hybridization, and the hybridization signals reflect the absolute abundance for the amounts of DNAs used for hybridization. During the past decade or so, numerous studies have demonstrated strong correlations between target DNA or RNA concentrations and GeoChip hybridization signal intensities using pure cultures, mixed cultures, and environmental samples without amplification (Brodie et al., 2007; Gao et al., 2007; He et al., 2010a; Tiquia et al., 2004; Wu et al., 2006). Very good correlations were also observed between PhyloChip signal intensities and quantitative PCR copy numbers spanning five orders of magnitude (Brodie et al., 2007; Lemon et al., 2010). These results suggest that the array-based approaches are highly quantitative with environmental DNAs and could provide one avenue toward refined abundance estimation for microbial assemblages.
Reproducibility needs to address both technical and biologic variations. In a deterministic world, if one could measure a sample nondestructively, then by repeatedly applying the same protocol to the same sample, one would be able to repeat one’s results. More practically, if the same sample were split into multiple aliquots and sent to different labs to recover the microbiome contents, one would consider results “reproducible” if all labs returned the same answer when using the same measurement tools. A possible starting point could be to send the same raw data (rather than a starting sample) to different labs to compare the analytic results obtained. Because of inherent stochasticity, there will be sampling error, which needs to be taken into account. Moreover, the potentially dynamic nature of the microbes and the high degree of community complexity mean that natural biologic variation can prevent two labs from producing the same results even after controlling for technical variation. Nevertheless, developing an understanding of the conditions under which accurate and reproducible microbiome measurements in built environments can be made is a foundational requirement for moving investigative research toward practical building design applications that are subject to greater oversight by a broad community of stakeholders.
Reproducibility is a big concern both scientifically and ethically (AAM, 2016). Part of the irreproducibility is due to technologies themselves, because of measurement errors and biases and sampling processes. Such issues have not been appropriately recognized and addressed in microbial ecology until recently (Zhou et al., 2013). A few years ago, it was first noticed that amplicon-based sequencing approaches have very low reproducibility, with <15 percent OTU overlap among three technical replicates (Zhou et al., 2011), which is well below the theoretical expectation of 100 percent overlap among technical replicates in which the same DNA from the same samples was amplified and sequenced three times. This phenomenon has recently been well established experimentally across different laboratories (Flores et al., 2012; Ge et al., 2014; Palmer and Horn, 2012; Peng et al., 2013; Pinto and Raskin, 2012; Sinclair et al., 2015; Talley and Fodor, 2011; Vishnivetskaya et al., 2014; Wen et al., 2017; Xu et al., 2011; Zhan et al., 2014; Zhou et al., 2011), although discrepant results have been observed (Bartram et al., 2011; Kauserud et al., 2012; Mao et al., 2011; Pilloni et al., 2012). Part of the reason for such discrepancies could be the complexity of the ecosystems examined (for instance, because of microscale variation in microbial community composition [Kauserud et al., 2012; Pinto and Raskin, 2012; Zhou et al., 2011]), but part could be lab-based, such as differences in sequencing depths (Bartram et al., 2011; Lemos et al., 2012; Zhou et al., 2011) and/or variations in sequencing and sequence preprocessing approaches (Pinto and Raskin, 2012; Schloss et al., 2011). Based on random sampling theory, mathematical simulation explicitly demonstrated that low technical reproducibility for amplicon sequencing is most likely due to the artifacts associated with random sampling processes inherent in PCR amplification and sequencing (Zhou et al., 2013, 2015), which would contribute to variations because of inherent spatial and temporal heterogeneity in the microbiome itself. To achieve high technical reproducibility, several orders of magnitude more sequencing effort is needed (Zhou et al., 2013). This makes extensive, quantitatively rich sampling methodologically challenging. Similar challenges also exist for other “omics” technologies, such as proteomics and metabolomics.
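The random sampling argument can be illustrated with a small simulation. This is a sketch under stated assumptions and does not reproduce the simulations of Zhou et al. (2013): the community of 1,000 OTUs with abundances proportional to 1/rank, the two sequencing depths, and the fixed random seed are all hypothetical choices.

```python
import random

def simulate_replicate(community, depth, rng):
    """Draw `depth` reads at random according to OTU abundances and
    return the set of OTUs detected in this technical replicate."""
    otus = list(community)
    weights = [community[o] for o in otus]
    return set(rng.choices(otus, weights=weights, k=depth))

def overlap_fraction(replicates):
    """Fraction of all detected OTUs that appear in every replicate."""
    shared = set.intersection(*replicates)
    union = set.union(*replicates)
    return len(shared) / len(union)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical long-tailed community: a few dominant OTUs, many rare ones.
community = {f"otu{rank}": 1.0 / rank for rank in range(1, 1001)}

# Three technical replicates of the same community at two depths.
shallow = overlap_fraction(
    [simulate_replicate(community, 500, rng) for _ in range(3)]
)
deep = overlap_fraction(
    [simulate_replicate(community, 50000, rng) for _ in range(3)]
)
```

Even though every replicate samples the identical community, overlap is low at shallow depth because rare OTUs are detected by chance in some replicates and not others; overlap approaches 1 only when depth is increased substantially.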
Although shotgun sequencing avoids many of the biases encountered in amplicon sequencing, the reproducibility problem could be more severe with shotgun sequencing than with amplicon-based target sequencing. This is because sampling processes via shotgun sequencing from a community with thousands and up to hundreds of thousands of species and thousands of genes from each species are likely much more random than those via amplicon sequencing. However, no experimental evidence is available as yet to support such speculations.
In contrast, the array-based closed format is less susceptible to random sampling artifacts than the sequencing-based open format (Zhou et al., 2015). Because the number of detected taxa or genes is defined by the probe sets on the array, the overlap among technical replicates is less dependent on the level of sampling effort. Thus, high technical reproducibility would be expected, as demonstrated by mathematical simulation (Zhou et al., 2015).
Because the technical variations associated with random sampling processes could greatly overestimate microbial β-diversity, high reproducibility is critical for comparative studies to be reliable across different spatial and temporal scales and environmental gradients. Great caution is needed in quantifying and interpreting β-diversity for microbial community analysis using high-throughput metagenomics technologies, particularly next-generation sequencing. Null model approaches (such as those used in community ecology [Gotelli, 2001]) could be a valuable tool for assessing the degree of reproducibility in empirical microbiome studies. In addition, reproducibility depends on software, versions of software packages, predefined parameters of the software, and databases.
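A null model check of this kind might look like the following sketch. It is a simple permutation-style null written for illustration, not the specific approach of Gotelli (2001): reads from two samples are pooled and randomly repartitioned at the original depths, and the observed Bray-Curtis dissimilarity is compared with the resulting null distribution. The sample counts, number of iterations, and seed are hypothetical.

```python
import random
from collections import Counter

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count dictionaries."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    return 1.0 - 2.0 * shared / (sum(a.values()) + sum(b.values()))

def null_dissimilarities(a, b, n_iter=200, seed=0):
    """Dissimilarities expected if both samples were random draws from
    the same pooled community, keeping each sample's read depth."""
    rng = random.Random(seed)
    pooled = list((Counter(a) + Counter(b)).elements())
    depth_a = sum(a.values())
    results = []
    for _ in range(n_iter):
        rng.shuffle(pooled)
        results.append(
            bray_curtis(Counter(pooled[:depth_a]), Counter(pooled[depth_a:]))
        )
    return results

# Hypothetical read counts for two samples.
sample_a = {"taxonA": 50, "taxonB": 30, "taxonC": 20}
sample_b = {"taxonA": 20, "taxonB": 30, "taxonC": 50}

observed = bray_curtis(sample_a, sample_b)
null = null_dissimilarities(sample_a, sample_b)
# Share of null draws at least as dissimilar as the observed pair.
p_value = sum(d >= observed for d in null) / len(null)
```

If the observed dissimilarity falls well within the null distribution, the apparent β-diversity could be explained by random sampling alone; the null draws also show that two subsamples of one community are rarely identical.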
There have been several benchmarking efforts designed to demonstrate reproducibility on multiple aspects of the problem from sample collection to data analysis, yet complete, accurate, and reproducible recovery of complex communities remains challenging. For example, comparison of different DNA extraction and sequencing library preparation protocols has shown that the observed microbial community can differ dramatically (Brooks et al., 2015; Hart et al., 2015; Jones et al., 2015). In addition, there have been several published efforts to develop mock community resources. A “mock” community is a defined synthetic mixture of microbial cells or nucleic acids designed to simulate a microbial community (Highlander, 2015). To date, the vast majority of resources focus either on defining 16S-based reference material (Bokulich et al., 2016), with limited microbial diversity (Morgan et al., 2010; Tanca et al., 2013), or exclusively on the human microbiome (HMP Consortium, 2012; Sinha et al., 2015). Benchmarking studies can be expensive and challenging to do well, and it may be difficult to obtain funding for such studies.
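Because the composition of a mock community is known, a measured profile can be scored against it directly. The sketch below is illustrative only: the taxon names and proportions are hypothetical, and reporting missed taxa, unexpected taxa, and log2 ratios of observed to expected abundance is one reasonable choice of summary, not a standard taken from the cited studies.

```python
import math

def mock_community_report(expected, observed):
    """Compare an observed profile with the known mock composition.
    Returns missed taxa, unexpected taxa, and the log2 ratio of
    observed to expected abundance for each recovered taxon."""
    missed = sorted(t for t in expected if observed.get(t, 0.0) == 0.0)
    unexpected = sorted(
        t for t in observed if t not in expected and observed[t] > 0.0
    )
    log2_ratio = {
        t: math.log2(observed[t] / expected[t])
        for t in expected
        if observed.get(t, 0.0) > 0.0
    }
    return missed, unexpected, log2_ratio

# Hypothetical even mock community and a measured profile of it.
expected = {"taxonA": 0.25, "taxonB": 0.25, "taxonC": 0.25, "taxonD": 0.25}
observed = {"taxonA": 0.5, "taxonB": 0.25, "taxonC": 0.125, "taxonE": 0.125}

missed, unexpected, log2_ratio = mock_community_report(expected, observed)
```

A log2 ratio of 0 means a taxon was recovered at its expected proportion, while positive and negative values indicate over- and under-representation, for example from extraction or amplification bias.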
In addition to published reports, multiple consortia have begun to form to help organize future efforts in designing reference material. Beyond physical mock communities, there are efforts to develop simulated datasets with which to benchmark computational tools, which have demonstrated that different descriptions of community structure may be found for the same input data depending on the choice of analysis tool and choice of parameters (Lindgreen et al., 2016; Randle-Boggis et al., 2016; Weiss et al., 2016; see http://www.cami-challenge.org).
Mock communities in the context of human microbiomes helped establish sequencing protocols for sequencing centers, which are working to expand their sequencing capability to support microbial community profiling (Gohl et al., 2016). Mock communities have also been used to evaluate the impact of analytic parameters on functional annotation (Nayfach et al., 2015). However, considerably less attention has been given to designing mock communities and benchmarking standards that better reflect indoor built environments. How best to use existing benchmarking efforts to guide future validation work in the built environment remains an open question. In addition, opportunities to build reference material that better captures living biologic material in a controlled environment would further enhance existing reference material resources (Ling et al., 2015).
Data Sharing and Metadata on Buildings and Building Systems
A key component to support reproducibility and the maturation of the microbiome modeling field is ensuring that experimental data and software are accessible to the research community. This requires publicly accessible data repositories such as the Sequence Read Archive, where raw genomic data can be housed, as well as data-sharing standards to ensure sufficiently complete and accurate descriptions of the experimental conditions used to generate new microbiome data (Leinonen et al., 2011).
Studies of indoor microbial communities and the factors that affect them must characterize the buildings in which the studies are conducted. This information is essential for understanding how the features of buildings and building systems influence these communities, so that buildings can be designed and operated to reduce the likelihood of detrimental health outcomes and to increase the potential for beneficial communities in the future.
LOOKING TO THE FUTURE
Various high-throughput technologies of both open and closed formats have been developed and used for the analysis of microbial communities. Each has advantages and disadvantages in terms of specificity, resolution, sensitivity, activity measurement, quantification, and reproducibility. Generally speaking, closed-format technologies have advantages for hypothesis-driven comparative studies, whereas open-format technologies are excellent for exploratory discovery studies (Zhou et al., 2015). The two can be integrated in a complementary fashion to address complex biologic questions and objectives (Zhou et al., 2015). Careful experimental design is as important as the selection of “omics” technologies. Because all high-throughput technologies have inherently high noise, increasing the number of biologic replicates is critical for reducing the impact of technical variation on biologic conclusions. The number of biologic replicates needed depends on the biologic systems examined, the research questions, the objectives, and the magnitude of biologic variation. In addition, because of various potential technical difficulties associated with specificity, sensitivity, quantification, resolution, and/or reproducibility, it is extremely helpful to apply high-throughput “omics” technologies for relative comparisons, that is, comparisons between communities (e.g., between treatment and control samples), typically expressed as a ratio (treatment/control). This is especially important when dealing with environmental samples of unknown composition (He et al., 2007). Because the signal ratios (based on sequencing reads or hybridization intensity) of treatment samples to control samples are used, the effects of errors, biases, sampling processes, and bioinformatics uncertainty can potentially cancel out if the biases and errors are more or less similar between the treatment and control samples (He et al., 2007; Zhou et al., 2015). This could be the best use of “omics” data to address biologic questions of interest.
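The cancellation argument can be illustrated with a small Python sketch. It assumes a purely multiplicative, taxon-specific measurement bias applied to raw signals; the abundances, bias factors, and function names are hypothetical. When the same bias affects treatment and control, the treatment/control log ratio recovers the true ratio; when the bias differs between the two samples, it does not.

```python
import math


def apply_bias(true_abundance, bias):
    """Simulate a measured signal as true abundance times a per-taxon bias."""
    return {t: true_abundance[t] * bias[t] for t in true_abundance}


def log2_ratios(treatment, control):
    """log2(treatment/control) for taxa with positive signal in both samples."""
    return {
        t: math.log2(treatment[t] / control[t])
        for t in treatment
        if t in control and treatment[t] > 0 and control[t] > 0
    }


# Hypothetical true abundances (illustrative, not real data).
true_treatment = {"taxon_A": 40.0, "taxon_B": 10.0}
true_control = {"taxon_A": 10.0, "taxon_B": 10.0}

# Assumed taxon-specific bias shared by both samples
# (e.g., extraction or amplification efficiency).
shared_bias = {"taxon_A": 0.5, "taxon_B": 3.0}

true_ratios = log2_ratios(true_treatment, true_control)
same_bias_ratios = log2_ratios(
    apply_bias(true_treatment, shared_bias),
    apply_bias(true_control, shared_bias),
)

# If the control is measured with a different bias, cancellation fails.
other_bias = {"taxon_A": 1.0, "taxon_B": 3.0}
mixed_bias_ratios = log2_ratios(
    apply_bias(true_treatment, shared_bias),
    apply_bias(true_control, other_bias),
)

print(true_ratios, same_bias_ratios, mixed_bias_ratios)
```

The sketch uses raw signals. If signals are first converted to relative abundances, the shared bias still cancels taxon by taxon, but every ratio is additionally shifted by one sample-wide constant, so comparisons among taxa remain interpretable while absolute ratios do not.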
REFERENCES
AAM (American Academy of Microbiology). 2016. Promoting responsible scientific research. Washington, DC: AAM. https://www.asm.org/images/Colloquia-report/Promoting_Responsible_Scientific_Research.pdf (accessed July 8, 2017).
ACGIH (American Conference of Governmental Industrial Hygienists). 1999. Bioaerosols: Assessment and control, 1st ed., edited by J. M. Macher. Cincinnati, OH: ACGIH.
Alivisatos, A. P., M. J. Blaser, E. L. Brodie, M. Chun, J. L. Dangl, T. J. Donohue, P. C. Dorrestein, J. A. Gilbert, J. L. Green, J. K. Jansson, R. Knight, M. E. Maxon, M. J. McFall-Ngai, J. F. Miller, K. S. Pollard, E. G. Ruby, and S. A. Taha. 2015. A unified initiative to harness Earth’s microbiomes. Science 350(6260):507-508.
Alneberg, J., B. S. Bjarnason, I. de Bruijn, M. Schirmer, J. Quick, U. Z. Ijaz, L. Lahti, N. J. Loman, A. F. Andersson, and C. Quince. 2014. Binning metagenomic contigs by coverage and composition. Nature Methods 11(11):1144-1146.
Alvarez, A. J., M. P. Buttner, and L. D. Stetzenbach. 1995. PCR for bioaerosol monitoring: Sensitivity and environmental interference. Applied and Environmental Microbiology 61:3639-3644.
Amann, R. I., W. Ludwig, and K. H. Schleifer. 1995. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiological Reviews 59(1):143-169.
Ames, S. K., S. N. Gardner, J. M. Marti, T. R. Slezak, M. B. Gokhale, and J. E. Allen. 2015. Using populations of human and microbial genomes for organism detection in metagenomes. Genome Research 25:1056-1067.
Armengaud, J. 2016. Next-generation proteomics faces new challenges in environmental biotechnology. Current Opinion in Biotechnology 38:174-182.
Avramenko, R. W., E. M. Redman, R. Lewis, T. A. Yazwinski, J. D. Wasmuth, and J. S. Gilleard. 2015. Exploring the gastrointestinal “nemabiome”: Deep amplicon sequencing to quantify the species composition of parasitic nematode communities. PLOS ONE 10(12):e0143559.
Bartram, A. K., M. D. J. Lynch, J. C. Stearns, G. Moreno-Hagelsieb, and J. D. Neufeld. 2011. Generation of multimillion-sequence 16S rRNA gene libraries from complex microbial communities by assembling paired-end Illumina reads. Applied and Environmental Microbiology 77(11):3846-3852.
Be, N. A., J. B. Thissen, S. N. Gardner, K. S. McLoughlin, V. Y. Fofanov, H. Koshinsky, S. R. Ellingson, T. S. Brettin, P. J. Jackson, and C. J. Jaing. 2013. Detection of Bacillus anthracis DNA in complex soil and air samples using next-generation sequencing. PLOS ONE 8(9):e73455.
Be, N. A., J. B. Thissen, V. Y. Fofanov, J. E. Allen, M. Rojas, G. Golovko, Y. Fofanov, H. Koshinsky, and C. J. Jaing. 2015. Metagenomic analysis of the airborne environment in urban spaces. Microbial Ecology 69(2):346-355.
Beitel, C. W., L. Froenicke, J. M. Lang, I. F. Korf, R. W. Michelmore, J. A. Eisen, and A. E. Darling. 2014. Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products. PeerJ 2:e415.
Benitez-Paez, A., K. J. Portune, and Y. Sanz. 2016. Species-level resolution of 16S rRNA gene amplicons sequenced through the MinION™ portable nanopore sequencer. GigaScience 5:4. doi:10.1186/s13742-016-0111-z.
Biteen, J. S., P. C. Blainey, Z. G. Cardon, M. Chun, G. M. Church, P. C. Dorrestein, S. E. Fraser, J. A. Gilbert, J. K. Jansson, R. Knight, J. F. Miller, A. Ozcan, K. A. Prather, S. R. Quake, E. G. Ruby, P. A. Silver, S. Taha, G. van den Engh, P. S. Weiss, G. C. Wong, A. T. Wright, and T. D. Young. 2016. Tools for the microbiome: Nano and beyond. ACS Nano 10(1):6-37.
Björnsson, E., D. Norback, C. Janson, J. Widstrom, U. Palmgren, G. Strom, and G. Boman. 1995. Asthmatic symptoms and indoor levels of micro-organisms and house dust mites. Clinical & Experimental Allergy 25(4):423-431.
Bloem, J., M. Veninga, and J. Shepherd. 1995. Fully automatic determination of soil bacterium numbers, cell volumes, and frequencies of dividing cells by confocal laser scanning microscopy and image analysis. Applied and Environmental Microbiology 61:926-936.
Bokulich, N. A., J. R. Rideout, W. G. Mercurio, A. Shiffer, B. Wolfe, C. F. Maurice, R. J. Dutton, P. J. Turnbaugh, R. Knight, and J. G. Caporaso. 2016. Mockrobiota: A public resource for microbiome bioinformatics benchmarking. mSystems 1(5):e00062-16. doi:10.1128/mSystems.00062-16.
Boon, E., C. J. Meehan, C. Whidden, D. H.-J. Wong, M. G. Langille, and R. G. Beiko. 2014. Interactions in the microbiome: Communities of organisms and communities of genes. FEMS Microbiology Reviews 38(1):90-118.
Brodie, E. L., T. Z. DeSantis, J. P. M. Parker, I. X. Zubietta, Y. M. Piceno, and G. L. Andersen. 2007. Urban aerosols harbor diverse and dynamic bacterial populations. Proceedings of the National Academy of Sciences of the United States of America 104(1):299-304.
Brook, R. D., S. Rajagopalan, C. A. Pope, III, J. R. Brook, A. Bhatnagar, A. V. Diez-Roux, F. Holguin, Y. Hong, R. V. Luepker, M. A. Mittleman, A. Peters, D. Siscovick, S. C. Smith, Jr., L. Whitsel, and J. D. Kaufman. 2010. Particulate matter air pollution and cardiovascular disease: An update to the scientific statement from the American Heart Association. Circulation 121(21):2331-2378.
Brooks, J. P., D. J. Edwards, M. D. Harwich, M. C. Rivera, J. M. Fettweis, M. G. Serrano, R. A. Reris, N. U. Sheth, B. Huang, P. Girerd, J. F. Strauss, III, K. K. Jefferson, and G. A. Buck. 2015. The truth about metagenomics: Quantifying and counteracting bias in 16S rRNA studies. BMC Microbiology 15:66.
Brown, C. T., M. R. Olm, B. C. Thomas, and J. F. Banfield. 2016. Measurement of bacterial replication rates in microbial communities. Nature Biotechnology 34:1256-1263.
Burge, H. A. 1990. Bioaerosols: Prevalence and health effects in the indoor environment. Journal of Allergy and Clinical Immunology 86:687-701.
Burrell, R. 1991. Microbiological agents as health risks in indoor air. Environmental Health Perspectives 95:29-34.
Buttner, M. P., P. V. Scarpino, and C. S. Clark. 1993. Monitoring airborne fungal spores in an experimental indoor environment to evaluate sampling methods and the effects of human activity on air sampling. Applied and Environmental Microbiology 59:219-226.
Cardona, C., P. Weisenhorn, C. Henry, and J. A. Gilbert. 2016. Network-based metabolic analysis and microbial community modeling. Current Opinion in Microbiology 31:124-131.
Carr, R., and E. Borenstein. 2014. Comparative analysis of functional metagenomic annotation and the mappability of short reads. PLOS ONE 9(8):e105776.
Cimermancic, P., M. H. Medema, J. Claesen, K. Kurita, L. C. Wieland Brown, K. Mavrommatis, A. Pati, P. A. Godfrey, M. Koehrsen, J. Clardy, B. W. Birren, E. Takano, A. Sali, R. G. Linington, and M. A. Fischbach. 2014. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell 158(2):412-421.
Clooney, A. G., F. Fouhy, R. D. Sleator, A. O’Driscoll, C. Stanton, P. D. Cotter, and M. J. Claesson. 2016. Comparing apples and oranges?: Next generation sequencing and its impact on microbiome analysis. PLOS ONE 11(2):e0148028.
Cole, J. R., B. Chai, R. J. Farris, Q. Wang, A. S. Kulam-Syed-Mohideen, D. M. McGarrell, A. M. Bandela, E. Cardenas, G. M. Garrity, and J. M. Tiedje. 2007. The Ribosomal Database Project (RDP-II): Introducing myRDP space and quality controlled public data. Nucleic Acids Research 35:D169-D172.
da Silva, R. R., P. C. Dorrestein, and R. A. Quinn. 2015. Illuminating the dark matter in metabolomics. Proceedings of the National Academy of Sciences of the United States of America 112(41):12549-12550.
de Boer, P., M. Caspers, J. W. Sanders, R. Kemperman, J. Wijman, G. Lommerse, G. Roeselers, R. Montijn, T. Abee, and R. Kort. 2015. Amplicon sequencing for the quantification of spoilage microbiota in complex foods including bacterial spores. Microbiome 3:30.
DeAngelis, K. M., C. H. Wu, H. R. Beller, E. L. Brodie, R. Chakraborty, T. Z. DeSantis, J. L. Fortney, T. C. Hazen, S. R. Osman, M. E. Singer, L. M. Tom, and G. L. Andersen. 2011. PCR amplification-independent methods for detection of microbial communities by the high-density microarray PhyloChip. Applied and Environmental Microbiology 77(18):6313-6322.
DeLong, E. F. 2009. The microbial ocean from genomes to biomes. Nature 459(7244):200-206.
Devault, A. M., K. McLoughlin, C. Jaing, S. Gardner, T. M. Porter, J. M. Enk, J. Thissen, J. Allen, M. Borucki, and S. N. DeWitte. 2014. Ancient pathogen DNA in archaeological samples detected with a Microbial Detection Array. Scientific Reports 4:4245. doi:10.1038/srep04245.
Donia, M. S., P. Cimermancic, C. J. Schulze, L. C. Wieland Brown, J. Martin, M. Mitreva, J. Clardy, R. G. Linington, and M. A. Fischbach. 2014. A systematic analysis of biosynthetic gene clusters in the human microbiome reveals a common family of antibiotics. Cell 158(6):1402-1414.
Douwes, J., P. Thorne, N. Pearce, and D. Heederik. 2003. Bioaerosol health effects and exposure assessment: Progress and prospects. Annals of Occupational Hygiene 47(3):187-200.
Eddy, S. R. 1998. Profile hidden Markov models. Bioinformatics 14(9):755-763.
Edgar, R. C. 2013. UPARSE: Highly accurate OTU sequences from microbial amplicon reads. Nature Methods 10(10):996-998.
Engelbrektson, A., V. Kunin, K. C. Wrighton, N. Zvenigorodsky, F. Chen, H. Ochman, and P. Hugenholtz. 2010. Experimental factors affecting PCR-based estimates of microbial species richness and evenness. The ISME Journal 4(5):642-647.
Eren, A. M., M. L. Sogin, and L. Maignien. 2016. Editorial: New insights into microbial ecology through subtle nucleotide variation. Frontiers in Microbiology 7:1318.
Faust, K., G. Lima-Mendez, J.-S. Lerat, J. F. Sathirapongsasuti, R. Knight, C. Huttenhower, T. Lenaerts, and J. Raes. 2015. Cross-biome comparison of microbial association networks. Frontiers in Microbiology 6:1200.
Firstencel, H., T. M. Butt, and R. I. Carruthers. 1990. A fluorescence microscopy method for determining the viability of entomophthoralean fungal spores. Journal of Invertebrate Pathology 55(2):258-264.
Fischer, M. A., S. Gullert, S. C. Neulinger, W. R. Streit, and R. A. Schmitz. 2016. Evaluation of 16S rRNA gene primer pairs for monitoring microbial community structures showed high reproducibility within and low comparability between datasets generated with multiple archaeal and bacterial primer pairs. Frontiers in Microbiology 7:1297.
Flannigan, B. 1992. Indoor microbial pollutants: Sources, species, characterisation and evaluation. In Chemical, microbiological, health and comfort aspects of indoor air quality: State of the art in SBS, edited by H. Knöppel and P. Wolkoff. Kluwer: Dordrecht, The Netherlands. Pp. 73-98.
Flannigan, B. 1993. Approaches to assessment of the microbial flora of buildings. In ASHRAE IAQ 1992, Environments for healthy people. Atlanta, GA: ASHRAE. Pp. 139-145.
Flannigan, B. 1997. Air sampling for fungi in indoor environments. Journal of Aerosol Science 28(3):381-392.
Flannigan, B., E. M. McCabe, and F. McCarry. 1991. Allergenic and toxigenic microorganisms in houses. Journal of Applied Bacteriology 70(Symposium Suppl.):61S-73S.
Flores, R., J. Shi, M. H. Gail, P. Gajer, J. Ravel, and J. J. Goedert. 2012. Assessment of the human faecal microbiota: II. Reproducibility and associations of 16S rRNA pyrosequences. European Journal of Clinical Investigation 42(8):855-863.
Frias-Lopez, J., Y. Shi, G. W. Tyson, M. L. Coleman, S. C. Schuster, and S. W. Chisholm. 2008. Microbial community gene expression in ocean surface waters. Proceedings of the National Academy of Sciences of the United States of America 105(10):3805-3810.
Gao, H., Z. K. Yang, T. J. Gentry, L. Wu, C. W. Schadt, and J. Zhou. 2007. Microarray-based analysis of microbial community RNAs by whole-community RNA amplification. Applied and Environmental Microbiology 73(2):563-571.
Ge, Y., J. P. Schimel, and P. A. Holden. 2014. Analysis of run-to-run variation of bar-coded pyrosequencing for evaluating bacterial community shifts and individual taxa dynamics. PLOS ONE 9(6):e99414.
Gevers, D., M. Pop, P. D. Schloss, and C. Huttenhower. 2012. Bioinformatics for the Human Microbiome Project. PLOS Computational Biology 8(11):e1002779.
Ghurye, J. S., V. Cepeda-Espinoza, and M. Pop. 2016. Metagenomic assembly: Overview, challenges and applications. Yale Journal of Biology and Medicine 89(3):353-362.
Giannoukos, G., D. Ciulla, K. Huang, B. Haas, J. Izard, J. Levin, J. Livny, A. Earl, D. Gevers, D. Ward, C. Nusbaum, B. Birren, and A. Gnirke. 2012. Efficient and robust RNA-seq process for cultured bacteria and complex community transcriptomes. Genome Biology 13(3):r23.
Gilbert, J. A., R. A. Quinn, J. Debelius, Z. Z. Xu, J. Morton, N. Garg, J. K. Jansson, P. C. Dorrestein, and R. Knight. 2016. Microbiome-wide association studies link dynamic microbial consortia to disease. Nature 535:94-103.
Gohl, D. M., P. Vangay, J. Garbe, A. MacLean, A. Hauge, A. Becker, T. J. Gould, J. B. Clayton, T. J. Johnson, R. Hunter, D. Knights, and K. B. Beckman. 2016. Systematic improvement of amplicon marker gene methods for increased accuracy in microbiome studies. Nature Biotechnology 34:942-949.
Górny, R. L., T. Reponen, K. Willeke, D. Schmechel, E. Robine, M. Boissier, and S. A. Grinshpun. 2002. Fungal fragments as indoor air biocontaminants. Applied and Environmental Microbiology 68(7):3522-3531.
Gotelli, N. J. 2001. Research frontiers in null model analysis. Global Ecology and Biogeography 10(4):337-343.
Greenblum, S., R. Carr, and E. Borenstein. 2015. Extensive strain-level copy-number variation across human gut microbiome species. Cell 160(4):583-594.
Griffiths, W. D., I. W. Stewart, A. R. Reading, and S. J. Futter. 1996. Effect of aerosolization, growth phase and residence time in spray and collection fluids on the culturability of cells and spores. Journal of Aerosol Science 27(5):803-820.
Hanson, C. A., J. A. Fuhrman, M. C. Horner-Devine, and J. B. H. Martiny. 2012. Beyond biogeographic patterns: Processes shaping the microbial landscape. Nature Reviews Microbiology 10(7):497-506.
Hart, M. L., A. Meyer, P. J. Johnson, and A. C. Ericsson. 2015. Comparative evaluation of DNA extraction methods from feces of multiple host species for downstream next-generation sequencing. PLOS ONE 10(11):e0143334.
Hartmann, E. M., R. Hickey, T. Hsu, C. M. Betancourt Román, J. Chen, R. Schwager, J. Kline, G. Z. Brown, R. U. Halden, C. Huttenhower, and J. L. Green. 2016. Antimicrobial chemicals are associated with elevated antibiotic resistance genes in the indoor dust microbiome. Environmental Science & Technology 50(18):9807-9815.
He, Z., T. J. Gentry, C. W. Schadt, L. Wu, J. Liebich, S. C. Chong, Z. Huang, W. Wu, B. Gu, P. Jardine, C. Criddle, and J. Zhou. 2007. GeoChip: A comprehensive microarray for investigating biogeochemical, ecological and environmental processes. The ISME Journal 1(1):67-77.
He, Z., Y. Deng, J. D. Van Nostrand, Q. Tu, M. Xu, C. L. Hemme, X. Li, L. Wu, T. J. Gentry, Y. Yin, J. Liebich, T. C. Hazen, and J. Zhou. 2010a. GeoChip 3.0 as a high-throughput tool for analyzing microbial community composition, structure and functional activity. The ISME Journal 4(9):1167-1179.
He, S., O. Wurtzel, K. Singh, J. L. Froula, S. Yilmaz, S. G. Tringe, Z. Wang, F. Chen, E. A. Lindquist, R. Sorek, and P. Hugenholtz. 2010b. Validation of two ribosomal RNA removal methods for microbial metatranscriptomics. Nature Methods 7(10):807-812.
He, Y., J. G. Caporaso, X. T. Jiang, H. F. Sheng, S. M. Huse, and J. R. Rideout. 2015. Stability of operational taxonomic units: An important but neglected property for analyzing microbial diversity. Microbiome 3:20.
Heidelberg, J. F., M. Shahamat, M. Levin, I. Rahman, G. Stelma, C. Grim, and R. R. Colwell. 1997. Effect of aerosolization on culturability and viability of gram-negative bacteria. Applied and Environmental Microbiology 63(9):3585-3588.
Henningson, E. W., M. Lundquist, E. Larsson, G. Sandstrom, and M. Forsman. 1997. A comparative study of different methods to determine the total number and the survival ratio of bacteria in aerobiological samples. Journal of Aerosol Science 28(3):459-469.
Henry, C. S., H. C. Bernstein, P. Weisenhorn, R. C. Taylor, J.-Y. Lee, J. Zucker, and H.-S. Song. 2016. Microbial community metabolic modeling: A community data-driven network reconstruction. Journal of Cellular Physiology 231(11):2339-2345.
Herbst, F.-A., V. Lünsmann, H. Kjeldal, N. Jehmlich, A. Tholey, M. Bergen, J. L. Nielsen, R. L. Hettich, J. Seifert, and P. H. Nielsen. 2016. Enhancing metaproteomics—The value of models and defined environmental microbial systems. Proteomics 16(5):783-798.
Hernandez, M., S. L. Miller, D. W. Landfear, and J. M. Macher. 1999. A combined fluorochrome method for quantitation of metabolically active and inactive airborne bacteria. Aerosol Science and Technology 30(2):145-160.
Hess, M., A. Sczyrba, R. Egan, T.-W. Kim, H. Chokhawala, G. Schroth, S. Luo, D. S. Clark, F. Chen, T. Zhang, R. I. Mackie, L. A. Pennacchio, S. G. Tringe, A. Visel, T. Woyke, Z. Wang, and E. M. Rubin. 2011. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 331(6016):463-467.
Highlander, S. 2015. Mock community analysis. Encyclopedia of Metagenomics: Genes, Genomes and Metagenomes: Basics, Methods, Databases and Tools 497-503.
Hinds, W. C. 1982. Aerosol Technology: Properties, Behavior, and Measurement of Airborne Particles. New York: John Wiley and Sons.
Hobbie, J. E., R. J. Daley, and S. Jasper. 1977. Use of Nuclepore filters for counting bacteria by fluorescence microscopy. Applied and Environmental Microbiology 33:1225-1228.
HMP (Human Microbiome Project) Consortium. 2012. A framework for human microbiome research. Nature 486:215-221.
Huerta-Cepas, J., D. Szklarczyk, K. Forslund, H. Cook, D. Heller, M. C. Walter, T. Rattei, D. R. Mende, S. Sunagawa, M. Kuhn, L. J. Jensen, C. von Mering, and P. Bork. 2016. eggNOG 4.5: A hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Research 44(D1):D286-D293.
Imelfort, M., D. Parks, B. J. Woodcroft, P. Dennis, P. Hugenholtz, and G. W. Tyson. 2014. GroopM: An automated tool for the recovery of population genomes from related metagenomes. PeerJ 2:e603.
Jansson, J. K., and E. S. Baker. 2016. A multi-omic future for microbiome studies. Nature Microbiology 1(5):16049.
Jensen, C., and G. Lysek. 1991. Direct observation of trapping activities of nematode-destroying fungi in the soil using fluorescence microscopy. FEMS Microbiology Letters 85(3):207-210.
Jensen, P. A., W. F. Todd, G. N. Davis, and P. V. Scarpino. 1992. Evaluation of eight bioaerosol samplers challenged with aerosols of free bacteria. Journal of the American Industrial Hygiene Association 53:660-667.
Jones, M. B., S. K. Highlander, E. L. Anderson, W. Li, M. Dayrit, N. Klitgord, M. M. Fabani, V. Seguritan, J. Green, D. T. Pride, S. Yooseph, W. Biggs, K. E. Nelson, and J. C. Venter. 2015. Library preparation methodology can influence genomic and functional predictions in human microbiome research. Proceedings of the National Academy of Sciences of the United States of America 112(45):14024-14029.
Jørgensen, T. S., A. S. Kiil, M. A. Hansen, S. J. Sørensen, and L. H. Hansen. 2014. Current strategies for mobilome research. Frontiers in Microbiology 5:750.
Jovel, J., J. Patterson, W. Wang, N. Hotte, S. O’Keefe, T. Mitchel, T. Perry, D. Kao, A. L. Mason, K. L. Madsen, and G. K. S. Wong. 2016. Characterization of the gut microbiome using 16S or shotgun metagenomics. Frontiers in Microbiology 7:459.
Kaminski, J., M. K. Gibson, E. A. Franzosa, N. Segata, G. Dantas, and C. Huttenhower. 2015. High-specificity targeted functional profiling in microbial communities with ShortBRED. PLOS Computational Biology 11(12):1-22.
Kanehisa, M., S. Goto, Y. Sato, M. Kawashima, M. Furumichi, and M. Tanabe. 2014. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Research 42:D199-D205.
Kashtan, N., S. E. Roggensack, S. Rodrigue, J. W. Thompson, S. J. Biller, A. Coe, H. Ding, P. Marttinen, R. R. Malmstrom, R. Stocker, M. J. Follows, R. Stepanauskas, and S. W. Chisholm. 2014. Single-cell genomics reveals hundreds of coexisting subpopulations in wild Prochlorococcus. Science 344(6182):416-420.
Kauserud, H., S. Kumar, A. K. Brysting, J. Norden, and T. Carlsen. 2012. High consistency between replicate 454 pyrosequencing analyses of ectomycorrhizal plant root samples. Mycorrhiza 22(4):309-315.
Kerepesi, C., and V. Grolmusz. 2016. Evaluating the quantitative capabilities of metagenomic analysis software. Current Microbiology 72(5):612-616.
Korem, T., D. Zeevi, J. Suez, A. Weinberger, T. Avnit-Sagi, M. Pompan-Lotan, E. Matot, G. Jona, A. Harmelin, N. Cohen, A. Sirota-Madi, C. A. Thaiss, M. Pevsner-Fischer, R. Sorek, R. J. Xavier, E. Elinav, and E. Segal. 2015. Growth dynamics of gut microbiota in health and disease inferred from single metagenomic samples. Science 349(6252):1101-1106.
Koskella, B., and M. A. Brockhurst. 2014. Bacteria–phage coevolution as a driver of ecological and evolutionary processes in microbial communities. FEMS Microbiology Reviews 38(5):916-931.
Kuczynski, J., C. L. Lauber, W. A. Walters, L. W. Parfrey, J. C. Clemente, D. Gevers, and R. Knight. 2012. Experimental and analytical tools for studying the human microbiome. Nature Reviews Genetics 13(1):47-58.
Kunin, V., A. Engelbrektson, H. Ochman, and P. Hugenholtz. 2010. Wrinkles in the rare biosphere: Pyrosequencing errors can lead to artificial inflation of diversity estimates. Environmental Microbiology 12(1):118-123.
Kurtz, Z. D., C. L. Müller, E. R. Miraldi, D. R. Littman, M. J. Blaser, and R. A. Bonneau. 2015. Sparse and compositionally robust inference of microbial ecological networks. PLOS Computational Biology 11(5):e1004226.
Langille, M. G. I., J. Zaneveld, J. G. Caporaso, D. McDonald, D. Knights, J. A. Reyes, J. C. Clemente, D. E. Burkepile, R. L. Vega Thurber, R. Knight, R. G. Beiko, and C. Huttenhower. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnology 31:814-821.
Leinonen, R., H. Sugawara, and M. Shumway. 2011. The sequence read archive. Nucleic Acids Research 39:D19-D21.
Lemon, K. P., V. Klepac-Ceraj, H. K. Schiffer, E. L. Brodie, S. V. Lynch, and R. Kolter. 2010. Comparative analyses of the bacterial microbiota of the human nostril and oropharynx. mBio 1(3):e00129-10. doi:10.1128/mBio.00129-10.
Lemos, L. N., R. R. Fulthorpe, and L. F. Roesch. 2012. Low sequencing efforts bias analyses of shared taxa in microbial communities. Folia Microbiologica 57(5):409-413.
Leonard, S. R., M. K. Mammel, D. W. Lacher, and C. A. Elkins. 2015. Application of metagenomic sequencing to food safety: Detection of Shiga toxin-producing Escherichia coli on fresh bagged spinach. Applied and Environmental Microbiology 81(23):8183-8191.
Li, N., M. Hao, R. F. Phalen, W. C. Hinds, and A. E. Nel. 2003. Particulate air pollutants and asthma: A paradigm for the role of oxidative stress in PM-induced adverse health effects. Clinical Immunology 109(3):250-265.
Liang, Y., L. Wu, I. M. Clark, K. Xue, Y. Yang, J. D. Van Nostrand, Y. Deng, Z. He, S. McGrath, J. Storkey, P. R. Hirsch, B. Sun, and J. Zhou. 2015. Over 150 years of long-term fertilization alters spatial scaling of microbial biodiversity. mBio 6(2):e00240-15. doi:10.1128/mBio.00240-15.
Liao, R., R. Zhang, J. Guan, and S. Zhou. 2014. A new unsupervised binning approach for metagenomic sequences based on N-grams and automatic feature weighting. IEEE/ACM Transactions on Computational Biology and Bioinformatics 11(1):42-54.
Liebich, J., C. W. Schadt, S. C. Chong, Z. L. He, S. K. Rhee, and J. Z. Zhou. 2006. Improvement of oligonucleotide probe design criteria for functional gene microarrays in environmental applications. Applied and Environmental Microbiology 72(2):1688-1691.
Lindgreen, S., K. L. Adair, and P. P. Gardner. 2016. An evaluation of the accuracy and speed of metagenome analysis tools. Scientific Reports 6:19233.
Ling, L. L., T. Schneider, A. J. Peoples, A. L. Spoering, I. Engels, B. P. Conlon, A. Mueller, T. F. Schaberle, D. E. Hughes, S. Epstein, M. Jones, L. Lazarides, V. A. Steadman, D. R. Cohen, C. R. Felix, K. A. Fetterman, W. P. Millett, A. G. Nitti, A. M. Zullo, C. Chen, and K. Lewis. 2015. A new antibiotic kills pathogens without detectable resistance. Nature 517(7535):455-459.
Long, P. E., K. H. Williams, S. S. Hubbard, and J. F. Banfield. 2016. Microbial metagenomics reveals climate-relevant subsurface biogeochemical processes. Trends in Microbiology 24(8):600-610.
Loy, A., A. Lehner, N. Lee, J. Adamczyk, H. Meier, J. Ernst, K. H. Schleifer, and M. Wagner. 2002. Oligonucleotide microarray for 16S rRNA gene-based detection of all recognized lineages of sulfate-reducing prokaryotes in the environment. Applied and Environmental Microbiology 68(10):5064-5081.
Mackelprang, R., M. P. Waldrop, K. M. DeAngelis, M. M. David, K. L. Chavarria, S. J. Blazewicz, E. M. Rubin, and J. K. Jansson. 2011. Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw. Nature 480(7377):368-371.
MacNaughton, S. J., T. L. Jenkins, S. Alugupalli, and D. C. White. 1997. Quantitative sampling of indoor air biomass by signature lipid biomarker analysis: Feasibility studies in a model system. American Industrial Hygiene Association Journal 58(4):270-277.
MacNeil, L., T. Kauri, and W. Robertson. 1995. Molecular techniques and their potential application in monitoring the microbiological quality of indoor air. Canadian Journal of Microbiology 41:657-665.
Maki, J. S., and C. C. Remsen. 1981. Comparison of two direct count methods for determining metabolizing bacteria in freshwater. Applied and Environmental Microbiology 41:1132-1138.
Manor, O., and E. Borenstein. 2015. MUSiCC: A marker genes based framework for metagenomic normalization and accurate profiling of gene abundances in the microbiome. Genome Biology 16:53.
Mao, Y., A. C. Yannarell, and R. I. Mackie. 2011. Changes in N-transforming archaea and bacteria in soil during the establishment of bioenergy crops. PLOS ONE 6(9):e24750.
Marchand, G., J. Lavoie, and L. Lazure. 1995. Evaluation of bioaerosols in a municipal solid waste recycling and composting plant. Journal of the Air and Waste Management Association 45:778-781.
Maroni, M., R. Axelrad, and A. Bacaloni. 1995. NATO efforts to set indoor air quality guidelines and standards. American Industrial Hygiene Association Journal 56(5):499-508.
McFeters, G. A., P. Y. Feipeng, B. H. Pyle, and P. S. Stewart. 1995. Physiological assessment of bacteria using fluorochromes. Journal of Microbiological Methods 21:1-13.
McLoughlin, K. 2016. Benchmarking for quasispecies abundance inference with confidence intervals from metagenomic sequence data. Technical report. Livermore, CA: Lawrence Livermore National Laboratory.
McMurdie, P. J., and S. Holmes. 2014. Waste not, want not: Why rarefying microbiome data is inadmissible. PLOS Computational Biology 10:e1003531.
Mendes-Soares, H., M. Mundy, L. M. Soares, and N. Chia. 2016. MMinte: An application for predicting metabolic interactions among the microbial species in a community. BMC Bioinformatics 17(1):343.
Merchant, S., D. E. Wood, and S. L. Salzberg. 2014. Unexpected cross-species contamination in genome sequencing projects. PeerJ 2:e675.
Miller, D. J. 1992. Fungi as contaminants in indoor air. Atmospheric Environment 26A(12):2163-2172.
Moran, M. A. 2009. Metatranscriptomics: eavesdropping on complex microbial communities. Microbe 4(7):329-335.
Moran, M. A., B. Satinsky, S. M. Gifford, H. Luo, A. Rivers, L.-K. Chan, J. Meng, B. P. Durham, C. Shen, V. A. Varaljay, C. B. Smith, P. L. Yager, and B. M. Hopkinson. 2012. Sizing up metatranscriptomics. The ISME Journal 7(2):237-243.
Morgan, J. L., A. E. Darling, and J. A. Eisen. 2010. Metagenomic sequencing of an in vitro-simulated microbial community. PLOS ONE 5(4):e10209.
Moschandreas, D. J., D. K. Cha, and J. Qian. 1996. Measurement of indoor bioaerosol levels by a direct counting method. Journal of Environmental Engineering 122(5):374-378.
Nawy, T. 2013. Probing microbiome function. Nature Methods 10:35. doi:10.1038/nmeth.2293.
Nayfach, S., and K. S. Pollard. 2016. Toward accurate and quantitative comparative metagenomics. Cell 166(5):1103-1116.
Nayfach, S., P. H. Bradley, S. K. Wyman, T. J. Laurent, A. Williams, J. A. Eisen, K. S. Pollard, and T. J. Sharpton. 2015. Automated and accurate estimation of gene family abundance from shotgun metagenomes. PLOS Computational Biology 11:e1004573.
Nayfach, S., B. Rodriguez-Mueller, and K. S. Pollard. 2016. An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography. Genome Research 26:1-14. doi:10.1101/gr.201863.115.
Nguyen, N.-P., T. Warnow, M. Pop, and B. White. 2016. A perspective on 16S rRNA operational taxonomic unit clustering using sequence similarity. npj Biofilms and Microbiomes 2:16004. doi:10.1038/npjbiofilms.2016.4.
Nielsen, H. B., M. Almeida, A. S. Juncker, S. Rasmussen, J. Li, S. Sunagawa, D. R. Plichta, L. Gautier, A. G. Pedersen, E. Le Chatelier, E. Pelletier, I. Bonde, T. Nielsen, C. Manichanh, M. Arumugam, J.-M. Batto, M. B. Quintanilha dos Santos, N. Blom, N. Borruel, K. S. Burgdorf, F. Boumezbeur, F. Casellas, J. Doré, P. Dworzynski, F. Guarner, T. Hansen, F. Hildebrand, R. S. Kaas, S. Kennedy, K. Kristiansen, J. Roat Kultima, P. Léonard, F. Levenez, O. Lund, B. Moumen, D. Le Paslier, N. Pons, O. Pedersen, E. Prifti, J. Qin, J. Raes, S. Sørensen, J. Tap, S. Tims, D. W. Ussery, T. Yamada, P. Renault, T. Sicheritz-Ponten, P. Bork, J. Wang, S. Brunak, and S. D. Ehrlich. 2014. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nature Biotechnology 32(8):822-828.
Noecker, C., A. Eng, S. Srinivasan, C. M. Theriot, V. B. Young, J. K. Jansson, D. N. Fredricks, and E. Borenstein. 2016. Metabolic model-based integration of microbiome taxonomic and metabolomic profiles elucidates mechanistic links between ecological and metabolic variation. mSystems 1.
Pace, N. R. 1997. A molecular view of microbial diversity and the biosphere. Science 276(5313):734-740.
Palmer, K., and M. A. Horn. 2012. Actinobacterial nitrate reducers and proteobacterial denitrifiers are abundant in N2O-metabolizing palsa peat. Applied and Environmental Microbiology 78(16):5584-5596.
Palmgren, U., G. Strom, P. Malmberg, and G. Blomquist. 1986. The Nuclepore filter method: A technique for enumeration of viable and nonviable airborne microorganisms. American Journal of Industrial Medicine 10:325-327.
Peccia, J., and M. Hernandez. 2006. Incorporating polymerase chain reaction-based identification, population characterization, and quantification of microorganisms into aerosol science: A review. Atmospheric Environment 40:3941-3961.
Peng, X., K.-Q. Yu, G.-H. Deng, Y.-X. Jiang, Y. Wang, G.-X. Zhang, and H.-W. Zhou. 2013. Comparison of direct boiling method with commercial kits for extracting fecal microbiome DNA by Illumina sequencing of 16S rRNA tags. Journal of Microbiological Methods 95(3):455-462.
Pillai, S. D., K. W. Widmer, S. E. Down, and S. C. Ricke. 1996. Occurrence of airborne bacteria and pathogen indicators during land application of sewage sludge. Applied and Environmental Microbiology 62(1):296-299.
Pilloni, G., M. S. Granitsiotis, M. Engel, and T. Lueders. 2012. Testing the limits of 454 pyrotag sequencing: Reproducibility, quantitative assessment and comparison to T-RFLP fingerprinting of aquifer microbes. PLOS ONE 7(7):e40467.
Pinto, A. J., and L. Raskin. 2012. PCR biases distort bacterial and archaeal community structure in pyrosequencing datasets. PLOS ONE 7(8):e43093.
Poretsky, R. S., I. Hewson, S. Sun, A. E. Allen, J. P. Zehr, and M. A. Moran. 2009. Comparative day/night metatranscriptomic analysis of microbial communities in the North Pacific subtropical gyre. Environmental Microbiology 11(6):1358-1375.
Qin, J., Y. Li, Z. Cai, S. Li, J. Zhu, F. Zhang, S. Liang, W. Zhang, Y. Guan, D. Shen, Y. Peng, D. Zhang, Z. Jie, W. Wu, Y. Qin, W. Xue, J. Li, L. Han, D. Lu, P. Wu, Y. Dai, X. Sun, Z. Li, A. Tang, S. Zhong, X. Li, W. Chen, R. Xu, M. Wang, Q. Feng, M. Gong, J. Yu, Y. Zhang, M. Zhang, T. Hansen, G. Sanchez, J. Raes, G. Falony, S. Okuda, M. Almeida, E. Le Chatelier, P. Renault, N. Pons, J.-M. Batto, Z. Zhang, H. Chen, R. Yang, W. Zheng, S. Li, H. Yang, J. Wang, S. D. Ehrlich, R. Nielsen, O. Pedersen, K. Kristiansen, and J. Wang. 2012. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490(7418):55-60.
Qiu, X. Y., L. Y. Wu, H. S. Huang, P. E. McDonel, A. V. Palumbo, J. M. Tiedje, and J. Z. Zhou. 2001. Evaluation of PCR-generated chimeras, mutations, and heteroduplexes with 16S rRNA gene-based cloning. Applied and Environmental Microbiology 67(2):880-887.
Quinn, R. A., J. A. Navas-Molina, E. R. Hyde, S. J. Song, Y. Vázquez-Baeza, G. Humphrey, J. Gaffney, J. J. Minich, A. V. Melnik, J. Herschend, J. DeReus, A. Durant, R. J. Dutton, M. Khosroheidari, C. Green, R. da Silva, P. C. Dorrestein, and R. Knight. 2016. From sample to multi-omics conclusions in under 48 hours. mSystems 1.
Quinn, R. A., L.-F. Nothias, O. Vining, M. Meehan, E. Esquenazi, and P. C. Dorrestein. 2017. Molecular networking as a drug discovery, drug metabolism, and precision medicine strategy. Trends in Pharmacological Sciences 38(2):143-154.
Randle-Boggis, R. J., T. Helgason, M. Sapp, and P. D. Ashton. 2016. Evaluating techniques for metagenome annotation using simulated sequence data. FEMS Microbiology Ecology 92(7):fiw095. doi:10.1093/femsec/fiw095.
Rao, C., H. A. Burge, and J. C. S. Chang. 1996. Review of quantitative standards and guidelines for fungi in indoor air. Journal of the Air & Waste Management Association 46(9):899-908.
Reponen, T., S. A. Grinshpun, K. L. Conwell, J. Wiest, and M. Anderson. 2001. Aerodynamic versus physical size of spores: Measurement and implication on respiratory deposition. Grana 40:119-125.
Rhee, S.-K., X. Liu, L. Wu, S. C. Chong, X. Wan, and J. Zhou. 2004. Detection of genes involved in biodegradation and biotransformation in microbial communities by using 50-mer oligonucleotide microarrays. Applied and Environmental Microbiology 70(7):4303-4317.
Ritari, J., J. Salojärvi, L. Lahti, and W. M. de Vos. 2015. Improved taxonomic assignment of human intestinal 16S rRNA sequences by a dedicated reference database. BMC Genomics 16:1056.
Robbins, C. A., L. J. Swenson, M. L. Nealley, R. E. Gots, and B. J. Kelman. 2000. Health effects of mycotoxin in indoor air: A critical review. Applied Occupational and Environmental Hygiene 15(10):773-784.
Rodriguez, G. G., D. Phipps, K. Ishiguro, and H. F. Ridgway. 1992. Use of a fluorescent redox probe for direct visualization of actively respiring bacteria. Applied and Environmental Microbiology 58:1801-1808.
Roh, S. W., G. C. J. Abell, K.-H. Kim, Y.-D. Nam, and J.-W. Bae. 2010. Comparing microarrays and next-generation sequencing technologies for microbial ecology research. Trends in Biotechnology 28(6):291-299.
Rosario, K., and M. Breitbart. 2011. Exploring the viral world through metagenomics. Current Opinion in Virology 1(4):289-297.
Rudi, K., O. M. Skulberg, R. Skulberg, and K. S. Jakobsen. 2000. Application of sequence-specific labeled 16S rRNA gene oligonucleotide probes for genetic profiling of cyanobacterial abundance and diversity by array hybridization. Applied and Environmental Microbiology 66(9):4004-4011.
Salter, S. J., M. J. Cox, E. M. Turek, S. T. Calus, W. O. Cookson, M. F. Moffatt, P. Turner, J. Parkhill, N. J. Loman, and A. W. Walker. 2014. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biology 12:87.
Sangwan, N., F. Xia, and J. A. Gilbert. 2016. Recovering complete and draft population genomes from metagenome datasets. Microbiome 4:8.
Schloss, P. D., D. Gevers, and S. L. Westcott. 2011. Reducing the effects of PCR amplification and sequencing artifacts on 16S rRNA-based studies. PLOS ONE 6(12):e27310.
Scholz, M., D. V. Ward, E. Pasolli, T. Tolio, M. Zolfo, F. Asnicar, D. T. Truong, A. Tett, A. L. Morrow, and N. Segata. 2016. Strain-level microbial epidemiology and population genomics from shotgun metagenomics. Nature Methods 13(5):435-438.
Segata, N., D. Boernigen, T. L. Tickle, X. C. Morgan, W. S. Garrett, and C. Huttenhower. 2013. Computational meta’omics for microbial community studies. Molecular Systems Biology 9:666.
Shapiro, B. J., J. Friedman, O. X. Cordero, S. P. Preheim, S. C. Timberlake, G. Szabo, M. F. Polz, and E. J. Alm. 2012. Population genomics of early events in the ecological differentiation of bacteria. Science 336(6077):48-51.
Shi, Y., G. W. Tyson, and E. F. DeLong. 2009. Metatranscriptomics reveals unique microbial small RNAs in the ocean’s water column. Nature 459(7244):266-269.
Sinclair, L., O. A. Osman, S. Bertilsson, and A. Eiler. 2015. Microbial community composition and diversity via 16S rRNA gene amplicons: Evaluating the Illumina platform. PLOS ONE 10(2):e0116955.
Singer, E., B. Bushnell, D. Coleman-Derr, B. Bowman, R. M. Bowers, A. Levy, E. A. Gies, J. F. Cheng, A. Copeland, H. P. Klenk, S. J. Hallam, P. Hugenholtz, S. G. Tringe, and T. Woyke. 2016. High-resolution phylogenetic microbial community profiling. The ISME Journal 10(8):2020-2032.
Sinha, R., C. Abnet, O. White, R. Knight, and C. Huttenhower. 2015. The microbiome quality control project: Baseline study design and future directions. Genome Biology 16:276.
Sorek, R., and P. Cossart. 2010. Prokaryotic transcriptomics: A new view on regulation, physiology and pathogenicity. Nature Reviews Genetics 11(1):9-16.
Sorenson, W. G., D. G. Frazer, B. B. Jarvis, J. Simpson, and V. A. Robinson. 1987. Trichothecene mycotoxins in aerosolized conidia of Stachybotrys atra. Applied and Environmental Microbiology 53(6):1370-1375.
Stewart, F. J., E. A. Ottesen, and E. F. DeLong. 2010. Development and quantitative analyses of a universal rRNA-subtraction protocol for microbial metatranscriptomics. The ISME Journal 4(7):896-907.
Stewart, F. J., O. Ulloa, and E. F. DeLong. 2012. Microbial metatranscriptomics in a permanent marine oxygen minimum zone. Environmental Microbiology 14(1):23-40.
Su, H. J., A. Rotnitzky, H. A. Burge, and J. D. Spengler. 1992. Examination of fungi in domestic interiors by using factor analysis—correlations and associations with home factors. Applied and Environmental Microbiology 58:181-186.
Suzuki, M. T., and S. J. Giovannoni. 1996. Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Applied and Environmental Microbiology 62(2):625-630.
Tabor, P. S., and R. A. Neihof. 1982. Improved method for determination of respiring individual microorganisms in natural waters. Applied and Environmental Microbiology 43:1249-1255.
Talley, N. J., and A. A. Fodor. 2011. Bugs, stool, and the irritable bowel syndrome: Too much is as bad as too little? Gastroenterology 141(5):1555-1559.
Tanca, A., A. Palomba, M. Deligios, T. Cubeddu, C. Fraumene, G. Biosa, D. Pagnozzi, M. F. Addis, and S. Uzzau. 2013. Evaluating the impact of different sequence databases on metaproteome analysis: Insights from a lab-assembled microbial mixture. PLOS ONE 8(12):e82981.
Tedersoo, L., S. Anslan, M. Bahram, S. Põlme, T. Riit, I. Liiv, U. Kõljalg, V. Kisand, R. H. Nilsson, F. Hildebrand, P. Bork, and K. Abarenkov. 2015. Shotgun metagenomes and multiple primer pair-barcode combinations of amplicons reveal biases in metabarcoding analyses of fungi. MycoKeys 10:1-43.
Teltsch, B., and E. Katzenelson. 1978. Airborne enteric bacteria and viruses from spray irrigation with wastewater. Applied and Environmental Microbiology 35(2):290-296.
Terzieva, S., J. Donnelly, V. Ulevicius, S. A. Grinshpun, D. Willeke, G. N. Stelma, and K. B. Brenner. 1996. Comparison of methods for detection and enumeration of airborne microorganisms collected by liquid impingement. Applied and Environmental Microbiology 62:2264-2272.
Tiquia, S. M., L. Wu, S. C. Chong, S. Passovets, D. Xu, Y. Xu, and J. Zhou. 2004. Evaluation of 50-mer oligonucleotide arrays for detecting microbial populations in environmental samples. Biotechniques 36(4):664-670.
Tong, Y., and B. Lighthart. 1999. Diurnal distribution of total and culturable atmospheric bacteria at a rural site. Aerosol Science and Technology 30(2):246-254.
Tremblay, J., K. Singh, A. Fern, E. S. Kirton, S. M. He, T. Woyke, J. Lee, F. Chen, J. L. Dangl, and S. G. Tringe. 2015. Primer and platform effects on 16S rRNA tag sequencing. Frontiers in Microbiology 6:771.
Trevors, J. T. 1985. Effect of temperature on selected microbial activities in aerobic and anaerobically incubated sediment. Hydrobiologia 126(2):189-192.
Tringe, S. G., C. von Mering, A. Kobayashi, A. A. Salamov, K. Chen, H. W. Chang, M. Podar, J. M. Short, E. J. Mathur, J. C. Detter, P. Bork, P. Hugenholtz, and E. M. Rubin. 2005. Comparative metagenomics of microbial communities. Science 308(5721):554-557.
Tu, Q., H. Yu, Z. He, Y. Deng, L. Wu, J. D. Van Nostrand, A. Zhou, J. Voordeckers, Y. J. Lee, Y. Qin, C. L. Hemme, Z. Shi, K. Xue, T. Yuan, A. Wang, and J. Zhou. 2014. GeoChip 4: A functional gene arrays-based high throughput environmental technology for microbial community analysis. Molecular Ecology Resources 14(5):914-928.
Urakawa, H., P. A. Noble, S. El Fantroussi, J. J. Kelly, and D. A. Stahl. 2002. Single-base-pair discrimination of terminal mismatches by using oligonucleotide microarrays and neural network analyses. Applied and Environmental Microbiology 68(1):235-244.
Urich, T., A. Lanzén, J. Qi, D. H. Huson, C. Schleper, and S. C. Schuster. 2008. Simultaneous assessment of soil microbial community structure and function through analysis of the meta-transcriptome. PLOS ONE 3(6):e2527.
Uyaguari-Diaz, M. I., M. Chan, B. L. Chaban, M. A. Croxen, J. F. Finke, J. E. Hill, M. A. Peabody, T. Van Rossum, C. A. Suttle, F. S. L. Brinkman, J. Isaac-Renton, N. A. Prystajecky, and P. Tang. 2016. A comprehensive method for amplicon-based and metagenomic characterization of viruses, bacteria, and eukaryotes in freshwater samples. Microbiome 4.
van der Hooft, J. J. J., J. Wandy, M. P. Barrett, K. E. V. Burgess, and S. Rogers. 2016. Topic modeling for untargeted substructure exploration in metabolomics. Proceedings of the National Academy of Sciences of the United States of America 113(48):13738-13743.
Van Nostrand, J. D., L. Wu, W.-M. Wu, Z. Huang, T. J. Gentry, Y. Deng, J. Carley, S. Carroll, Z. He, B. Gu, J. Luo, C. S. Criddle, D. B. Watson, P. M. Jardine, T. L. Marsh, J. M. Tiedje, T. C. Hazen, and J. Zhou. 2011. Dynamics of microbial community composition and function during in situ bioremediation of a uranium-contaminated aquifer. Applied and Environmental Microbiology 77(11):3860-3869.
Venter, J. C., K. Remington, J. F. Heidelberg, A. L. Halpern, D. Rusch, J. A. Eisen, D. Wu, I. Paulsen, K. E. Nelson, W. Nelson, D. E. Fouts, S. Levy, A. H. Knap, M. W. Lomas, K. Nealson, O. White, J. Peterson, J. Hoffman, R. Parsons, H. Baden-Tillson, C. Pfannkoch, Y.-H. Rogers, and H. O. Smith. 2004. Environmental genome shotgun sequencing of the Sargasso Sea. Science 304(5667):66-74.
Vieites, J. M., M.-E. Guazzaroni, A. Beloqui, P. N. Golyshin, and M. Ferrer. 2009. Metagenomics approaches in systems microbiology. FEMS Microbiology Reviews 33(1):236-255.
Vishnivetskaya, T. A., A. C. Layton, M. C. Y. Lau, A. Chauhan, K. R. Cheng, A. J. Meyers, J. R. Murphy, A. W. Rogers, G. S. Saarunya, D. E. Williams, S. M. Pfiffner, J. P. Biggerstaff, B. T. Stackhouse, T. J. Phelps, L. Whyte, G. S. Sayler, and T. C. Onstott. 2014. Commercial DNA extraction kits impact observed microbial community composition in permafrost samples. FEMS Microbiology Ecology 87(1):217-230.
Wagner, M., B. Assmus, A. Hartmann, P. Hutzler, and R. Amann. 1994. In situ analysis of microbial consortia in activated sludge using fluorescently labelled, rRNA-targeted oligonucleotide probes and confocal scanning laser microscopy. Journal of Microscopy 176(3):181-187.
Wang, C., X. Wang, D. Liu, H. Wu, X. Lu, Y. Fang, W. Cheng, W. Luo, P. Jiang, J. Shi, H. Yin, J. Zhou, X. Han, and E. Bai. 2014. Aridity threshold in controlling ecosystem nitrogen cycling in arid and semi-arid grasslands. Nature Communications 5:4799.
Wang, J., and H. Jia. 2016. Metagenome-wide association studies: Fine-mining the microbiome. Nature Reviews Microbiology 14(8):508-522.
Wang, M., J. J. Carver, V. V. Phelan, L. M. Sanchez, N. Garg, Y. Peng, D. D. Nguyen, J. Watrous, C. A. Kapono, T. Luzzatto-Knaan, C. Porto, A. Bouslimani, A. V. Melnik, M. J. Meehan, W. T. Liu, M. Crüsemann, P. D. Boudreau, E. Esquenazi, M. Sandoval-Calderón, R. D. Kersten, L. A. Pace, R. A. Quinn, K. R. Duncan, C. C. Hsu, D. J. Floros, R. G. Gavilan, K. Kleigrewe, T. Northen, R. J. Dutton, D. Parrot, E. E. Carlson, B. Aigle, C. F. Michelsen, L. Jelsbak, C. Sohlenkamp, P. Pevzner, A. Edlund, J. McLean, J. Piel, B. T. Murphy, L. Gerwick, C. C. Liaw, Y. L. Yang, H. U. Humpf, M. Maansson, R. A. Keyzers, A. C. Sims, A. R. Johnson, A. M. Sidebottom, B. E. Sedio, A. Klitgaard, C. B. Larson, P. C. A. Boya, D. Torres-Mendoza, D. J. Gonzalez, D. B. Silva, L. M. Marques, D. P. Demarque, E. Pociute, E. C. O’Neill, E. Briand, E. J. Helfrich, E. A. Granatosky, E. Glukhov, F. Ryffel, H. Houson, H. Mohimani, J. J. Kharbush, Y. Zeng, J. A. Vorholt, K. L. Kurita, P. Charusanti, K. L. McPhail, K. F. Nielsen, L. Vuong, M. Elfeki, M. F. Traxler, N. Engene, N. Koyama, O. B. Vining, R. Baric, R. R. Silva, S. J. Mascuch, S. Tomasi, S. Jenkins, V. Macherla, T. Hoffman, V. Agarwal, P. G. Williams, J. Dai, R. Neupane, J. Gurr, A. M. Rodríguez, A. Lamsa, C. Zhang, K. Dorrestein, B. M. Duggan, J. Almaliti, P. M. Allard, P. Phapale, L. F. Nothias, T. Alexandrov, M. Litaudon, J. L. Wolfender, J. E. Kyle, T. O. Metz, T. Peryea, D. T. Nguyen, D. VanLeer, P. Shinn, A. Jadhav, R. Müller, K. M. Waters, W. Shi, X. Liu, L. Zhang, R. Knight, P. R. Jensen, B. Ø. Palsson, K. Pogliano, R. G. Linington, M. Gutiérrez, N. P. Lopes, W. H. Gerwick, B. S. Moore, P. C. Dorrestein, and N. Bandeira. 2016. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nature Biotechnology 34(8):828-837.
Watrous, J., P. Roach, T. Alexandrov, B. S. Heath, J. Y. Yang, R. D. Kersten, M. van der Voort, K. Pogliano, H. Gross, J. M. Raaijmakers, B. S. Moore, J. Laskin, N. Bandeira, and P. C. Dorrestein. 2012. Mass spectral molecular networking of living microbial colonies. Proceedings of the National Academy of Sciences of the United States of America 109(26):E1743-E1752.
Weiss, S., W. Van Treuren, C. Lozupone, K. Faust, J. Friedman, Y. Deng, L. C. Xia, Z. Z. Xu, L. Ursell, E. J. Alm, A. Birmingham, J. A. Cram, J. A. Fuhrman, J. Raes, F. Sun, J. Zhou, and R. Knight. 2016. Correlation detection strategies in microbial data sets vary widely in sensitivity and precision. The ISME Journal 10(7):1669-1681.
Wen, C., L. Wu, Y. Qin, J. D. Van Nostrand, B. Sun, K. Xue, F. Liu, Y. Deng, and J.-Z. Zhou. 2017. Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform. PLOS ONE 12(4):e0176716.
White, R. A., E. M. Bottos, T. Roy Chowdhury, J. D. Zucker, C. J. Brislawn, C. D. Nicora, S. J. Fansler, K. R. Glaesemann, K. Glass, and J. K. Jansson. 2016. Moleculo long-read sequencing facilitates assembly and genomic binning from complex soil metagenomes. mSystems 1.
WHO (World Health Organization). 1990. Indoor air quality: Biological contaminants. WHO Regional Publications. European Series 31:1-67.
Womack, A. M., B. J. M. Bohannan, and J. L. Green. 2010. Biodiversity and biogeography of the atmosphere. Philosophical Transactions of the Royal Society B 365(1558):3645-3653.
Wu, L., D. K. Thompson, G. Li, R. A. Hurt, J. M. Tiedje, and J. Zhou. 2001. Development and evaluation of functional gene arrays for detection of selected genes in the environment. Applied and Environmental Microbiology 67(12):5780-5790.
Wu, L., X. Liu, C. W. Schadt, and J. Zhou. 2006. Microarray-based analysis of subnanogram quantities of microbial community DNAs by using whole-community genome amplification. Applied and Environmental Microbiology 72(7):4931-4941.
Xu, L. H., S. Ravnskov, J. Larsen, and M. Nicolaisen. 2011. Influence of DNA extraction and PCR amplification on studies of soil fungal communities based on amplicon sequencing. Canadian Journal of Microbiology 57(12):1062-1066.
Xue, K., M. M. Yuan, Z. J. Shi, Y. J. Qin, Y. Deng, L. Cheng, L. Y. Wu, Z. L. He, J. D. Van Nostrand, R. Bracho, S. Natali, E. A. G. Schuur, C. W. Luo, K. T. Konstantinidis, Q. Wang, J. R. Cole, J. M. Tiedje, Y. Q. Luo, and J. Z. Zhou. 2016. Tundra soil carbon is vulnerable to rapid microbial decomposition under climate warming. Nature Climate Change 6(6):595-600.
Yu, K., and T. Zhang. 2012. Metagenomic and metatranscriptomic analysis of microbial community structure and gene expression of activated sludge. PLOS ONE 7(5):e38183.
Zhan, A., W. Xiong, S. He, and H. J. MacIsaac. 2014. Influence of artifact removal on rare species recovery in natural complex communities using high-throughput sequencing. PLOS ONE 9(5):e96928. doi:10.1371/journal.pone.0096928.
Zhou, J. Z. 2003. Microarrays for bacterial detection and microbial community analysis. Current Opinion in Microbiology 6(3):288-294.
Zhou, J. Z. 2009. Predictive microbial ecology. Microbial Biotechnology 2(2):154-156.
Zhou, J. Z., and D. K. Thompson. 2002. Challenges in applying microarrays to environmental studies. Current Opinion in Biotechnology 13(3):204-207.
Zhou, J. Z., B. C. Xia, H. S. Huang, D. S. Treves, L. J. Hauser, R. J. Mural, A. V. Palumbo, and J. M. Tiedje. 2003. Bacterial phylogenetic diversity and a novel candidate division of two humid region, sandy surface soils. Soil Biology & Biochemistry 35(7):915-924.
Zhou, J. Z., S. Kang, C. W. Schadt, and C. T. Garten, Jr. 2008. Spatial scaling of functional gene diversity across various microbial taxa. Proceedings of the National Academy of Sciences of the United States of America 105(22):7768-7773.
Zhou, J. Z., Z. He, J. D. Van Nostrand, L. Wu, and Y. Deng. 2010. Applying GeoChip analysis to disparate microbial communities. Microbe 5(2):60-65.
Zhou, J. Z., L. Wu, Y. Deng, X. Zhi, Y.-H. Jiang, Q. Tu, J. Xie, J. D. Van Nostrand, Z. He, and Y. Yang. 2011. Reproducibility and quantitation of amplicon sequencing-based detection. The ISME Journal 5(8):1303-1313.
Zhou, J. Z., Y. H. Jiang, Y. Deng, Z. Shi, B. Y. Zhou, K. Xue, L. Y. Wu, Z. L. He, and Y. F. Yang. 2013. Random sampling process leads to overestimation of beta-diversity of microbial communities. mBio 4(3):e00324-13. doi:10.1128/mBio.00324-13.
Zhou, J. Z., Z. He, Y. Yang, Y. Deng, S. G. Tringe, and L. Alvarez-Cohen. 2015. High-throughput metagenomic technologies for complex microbial community analysis: Open and closed formats. mBio 6(1):e02288-14. doi:10.1128/mBio.02288-14.
Zhou, J. Z., Y. Deng, L. N. Shen, C. Q. Wen, Q. Y. Yan, D. L. Ning, Y. J. Qin, K. Xue, L. Y. Wu, Z. L. He, J. W. Voordeckers, J. D. Van Nostrand, V. Buzzard, S. T. Michaletz, B. J. Enquist, M. D. Weiser, M. Kaspari, R. Waide, Y. F. Yang, and J. H. Brown. 2016. Temperature mediates continental-scale diversity of microbes in forest soils. Nature Communications 7:12083.