in complexity from single macromolecules to pathways, organisms, populations, and ecosystems. Successes in prediction and design at each successive level of biological complexity are the relevant milestones to watch for before we can predict confidently, from genome sequence analysis, how a designed organism would replicate, interact with a host, evade a host immune system, and spread in a population to cause disease.
The committee’s view is that for the specific purposes of the Select Agent Regulations, those general biological milestones should be passively monitored, not actively sought. A narrow focus on such milestones for the sole purpose of being able to predict what makes Select Agents dangerous may be a distortion of priorities in biology, and may also raise concerns about dual use. The ability to predict pathogenicity from genome sequence automatically confers the ability to design genome sequences of pathogens.
However, the committee is not satisfied with answering its charge narrowly and in the negative. The rapidly expanding capabilities of automated gene synthesis and of synthetic genomics to synthesize and “boot” complete Select Agent genomes mean that the Select Agent Regulations do need to be defined in terms of genome sequence analysis, not by the phenotypic properties of an encoded agent. A Select Agent genome is covered by the Select Agent Regulations whether or not it is ever “booted” into a living agent whose phenotype can be assayed. A DNA synthesis company needs to be able to tell, unambiguously and by sequence alone, if it is being asked by a customer to synthesize the genome of a Select Agent.
That determination would not be a problem if each Select Agent had a unique genome sequence. However, discrete taxonomic nomenclature in biology is already challenged by the great diversity and continuum of organisms observed in natural wild isolates, and the rapidly expanding ability of synthetic biology to create highly modified variants and chimeras of naturally occurring genomes poses an even greater challenge to taxonomic naming systems. Select Agent pathogens, like any biological organism, are not defined by a single DNA sequence. Given natural wild variation and the conceivable range of tolerable synthetic variation, a “cloud” of related sequences with similar biological properties is assigned the same taxonomic name. There may be sequences that are just as closely related but are not Select Agents, including vaccine strains and attenuated research strains that the U.S. government explicitly wants to avoid encumbering with the Select Agent Regulations.
In its deliberations, the committee found that it was useful and important to distinguish sequence-based prediction of biological properties from sequence-based classification. A regulatory system based on prediction must be able to recognize that an entirely novel genome sequence (unrelated to any known sequence) encodes a pathogen that should be assigned Select Agent status. A regulatory system based on classification “merely” tries to decide