I am going to use myself as an example to illustrate where the microbial commons can be useful. Box 22–2 lists some areas of microbiology in which people in the categories previously described could be interested.
Areas of Interest in Microbiology
• Public health and pandemics
– Analysis of outbreaks
– Evaluation of drugs and vaccines
• Food security
– Evaluation of products of food biotechnology
• Bioremediation
– Evaluation of microorganisms used for cleanup
• Biofuel and bioproducts
– Evaluation of microorganisms used to make biocatalysts, enzymes
– Evaluation of microorganisms used to make fuels
– Evaluation of microorganisms used to make chemical substances
Specifically, bioremediation and biofuels or bioproducts are the products and processes with which I am closely involved. The items in these categories are examples of products or services provided by microorganisms that are subject to oversight by my organization. You can see that there is a wide range of potential commercial uses for microbiological data made accessible through a commons. I want to discuss the kinds of data and information that we deal with on a routine basis and that could be made more accessible to us if the commons existed and were in operation.
One of the things that we constantly have to deal with is knowing exactly which organism is being worked with when a submitter provides us with information on an organism. Has the submitter obtained an accurate species identification using the tools available? More often than not, commercial organisms belong to that collection of open-genome organisms in which a broad range of entities falls within a genus or a species, with a great deal of apparent gene exchange and a consequently diverse gene pool. These taxa appear to have tiny core genomes compared with many genomes in less diverse genera, and they often carry many mobile genetic elements. Because of this diversity, and especially if the determinants used for identification reside on these elements, identifying the species of such an organism is a challenge. But since much of the pan-genome gene pool is sharable, the gene pool can at least tell us the range of potential functions that may be expressed, regardless of the species name applied to the strain. Knowledge of the content of this gene pool is something we can work from.
We understand the utility of metadata: how it enables us to know where an organism came from, trace it back to its origins, and figure out what it did, or at least what its precursor did, in the natural environment. Because we deal with health and safety, environmental effects, and similar concerns, different types of information are useful to us: Where is the organism from? Was it part of an outbreak? Is it known to be relatively safe when it or its precursors are used commercially? What else could the organism be used for besides what we are being told it might be used for?
We get our data from a variety of sources: the open literature, grey literature, company files, public data banks, and other Web resources. We are interested in various