| ||||||||||||
| Copyright © 2009. National Academy of Sciences. All rights reserved. Terms of Use and Privacy Statement |
Below are the first 10 and last 10 pages of uncorrected machine-read text (when available) of this chapter, followed by the top 30 algorithmically extracted key phrases from the chapter as a whole.
Intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text on the opening pages of each chapter.
Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.
Do not use for reproduction, copying, pasting, or reading; exclusively for search engines.
OCR for page 103
BTOMOLECULAR COMPUTING
OCR for page 104
OCR for page 105
DNA Computing by Self-Assembly
ERIK WINFREE
Departments of Computer Science and Computation
and Neural Systems
California Institute of Technology
Pasadena, California
INFORMATION AND ALGORITHMS IN BIOCHEMISTRY
Information and algorithms appear to be central to biological organization
and processes, from the storage and reproduction of genetic information to the
control of developmental processes to the sophisticated computations performed
by the nervous system. Much as human technology uses electronic microproces-
sors to control electromechanical devices, biological organisms use biochemical
circuits to control molecular and chemical events. The engineering and pro-
gramming of biochemical circuits, in vivo and in vitro, would transform indus-
tries that use chemical and nanostructured materials. Although the construction
of biochemical circuits has been explored theoretically since the birth of molecu-
lar biology, our practical experience with the capabilities and possible program-
ming of biochemical algorithms is still very young.
In this paper, I will review a simple form of biochemical algorithm based on
the molecular self-assembly of heterogeneous crystals that illustrates some as-
pects of programming in vitro biochemical systems and their potential applica-
tions. There are two complementary perspectives on molecular computation:
(1) using the astounding parallelism of chemistry to solve mathematical prob-
lems, such as combinatorial search problems; and (2) using biochemical algo-
rithms to direct and control molecular processes, such as complex fabrication
tasks. The latter currently appears to be the more promising of the two.
Some major theoretical issues are common to both approaches how algo-
rithms can be encoded efficiently in molecules with programmable binding in-
teractions and how these algorithms can be shown to be robust to asynchronous
and unreliable molecular processes. Proof-of-principle has been experimentally
105
OCR for page 106
106
FRONTIERS OF ENGINEERING
demonstrated using synthetic DNA molecules; how well these techniques scale
remains to be seen.
ALGORITHMIC SELF-ASSEMBLY AS
GENERALIZED CRYSTAL GROWTH
The idea of algorithmic self-assembly arose from the combination of DNA
computing (Adleman, 1994), the theory of tilings (Grunbaum and Sheppard,
1986), and DNA nanotechnology (Seeman, 1982; Seeman, 2003~. Conceptually,
algorithmic self-assembly naturally spans the range between maximal simplicity
(crystals) and arbitrarily complex information processing. Furthermore, it is
amenable to experimental investigation, so we can rigorously probe our under-
standing of the physical phenomena involved. This understanding may eventu-
ally result in new nanostructured materials and devices.
DNA Computing
Leonard Adleman's original paper on DNA computing contained the seed
of the idea we'll pursue here that the programmability of DNA hybridization
reactions can be used to direct self-assembly according to simple rules. In the
first combinatorial-generation step of Adleman's procedure, DNA molecules rep-
resenting all possible paths through the target graph were assembled by DNA
hybridization in a single step. The basic idea (Figure 1) is for a set of molecules
with unique sequences to represent the vertices and edges of the graph, thus
governing which vertices can follow which other vertices. Each possible se-
quence of hybridization reactions, occurring spontaneously in any order, pro-
duces a double-stranded DNA molecule whose sequence encodes a valid path
through the graph. By thus generalizing one-dimensional polymerization to in-
clude programmable binding, Adleman coaxed the DNA to generate patterns
that follow certain mathematical rules. This is an elegant idea and it works!
The problem is that only simple computations can be performed with linear self-
assembly. Paths through graphs correspond to regular languages, which have
the complexity of finite-state machines thus more sophisticated aspects of com-
putation cannot be reached by this technique.
Tiling Theory
A tiling is an arrangement of a few basic shapes (called tiles) that fit to-
gether perfectly in the infinite plane. For each tiling, the set of shapes must be
finite; for example, the tile set could consist of an octagon and a square, both
with unit-length sides. One motivation for studying tiling is that the tiles corre-
spond to the periodic arrangement of atoms in crystals. A remarkable result
is that all possible periodic arrangements can be classified according to their
OCR for page 107
107
lo
l
O. ,
<,
a.
~ o
o
C)
-( ¢
~ '}
o.
O.
O. at_
{ W~ ~ ~
a, aim ~ ~
~C1 ~ ~ O ~
<~m 3
:O, ,
aft, ,
1~} 'I'm
{o:= ~ ~
~C, , - ~ ~ ·C~
~C, , ,-~¢ =.~
°{=E° =~=
~CI , -: ~ ~
~C, , ,-~¢ =~
{Poe= ~ ~
-1 } ~ ~
Tic l in 0
{oh= ~o
~C1 , ,-_) O ~
-~¢ ° =3
a, , ~ ~ ~
a, , ~ S~
. -
·=
i of i g
~ .~3 ~- ~ a
A ~ ~
a, ~ .o = :d
Od ~ ic ~004
~C' ~ #=
· ~ ~
OCR for page 108
108
FRONTIERS OF ENGINEERING
fundamental symmetries; in three dimensions there are 230 symmetries, and in
two dimensions there are 17 symmetries. This suggests that, given a finite set of
polygonal tiles, one should be able to determine whether they can be arranged
according to one of the known symmetries, or whether there is no way to arrange
them on the plane.
This is what Hao Wang thought in the 1960s, but when he looked into the
question, known as the tiling problem, he discovered that it is provably unsolv-
able (Wang, 1963~! That is to say, aperiodic tilings are also possible. In addi-
tion, it can be incredibly difficult to determine whether a given set of tiles can
tile the plane aperiodically or whether every attempt will ultimately fail. To
prove this result, Wang developed a way to create a set of tiles that fit together
uniquely to reproduce the space-time history of any chosen Turing1 machine, in
such a way that, if the Turing machine halts (with an output), then the attempted
tiling has to get stuck; if the Turing machine continues computing forever, then a
consistent global tiling is possible.
Thus, the tiling problem reduces to the halting problem, the first problem
proved to be formally undecidable. This result shows that tiling is theoretically
as powerful as general-purpose computers. In fact, the tiles Wang used were all
essentially square, distinguished only by labels on their sides that had to match
up when the tiles were juxtaposed. Thus, the complexity arises from the logical
constraints in how the tiles fit together, rather than from the tiles themselves.
Given the intimate relation between crystals and tiling theory, it is natural to
ask if crystal growth has the potential to compute as powerfully. To answer this
question, we need two things: (1) the ability to design molecular Wang tiles; and
(2) precise rules for crystal growth that can be implemented reliably.
DNA Nanotechnology
We now turn to DNA nanotechnology, the brainchild of Nadrian Seeman's
vision of using DNA as an architectural element (Seeman, 1982~. Like RNA,
DNA can make structures other than the usual double helix. These other struc-
tures include hairpins and three- and four-way branch points, which are impor-
tant for biological function. Seeman, however, pictured these structures as hinges
and joints, bolts and braces that could be programmed to fold and bind to each
other by careful design of the DNA base sequence. Seeman and his students
constructed a wide variety of amazing nanostructures: a wire-frame cube and
truncated octahedron; single-stranded DNA and RNA knots, including the tre-
1Turing machines, invented by Alan Turing in 1936, are extremely simple computers that consist
of a finite-state compute head that can move back and forth on an infinite one-dimensional memory
tape. Turing showed that these machines are universal in the sense that they can perform any
computation that can be performed by any other mechanical device there is no fundamental need to
use a more complicated kind of computer!
OCR for page 109
DNA COMPUTING BY SELF-ASSEMBLY
109
foil, the figure-eight, and Borromean rings; and rigid building-block structures,
such as triangles and four-armed "bricks" known as double-crossover (DX) mol-
ecules; and more (Seeman, 2003~.
The idea, then, is to use these "bricks" as molecular Wang tiles (Winfree et
al., 1998a). The four arms of the DX molecules can be given sequences corre-
sponding to the labels on the four sides of the Wang tiles. Thus, any chosen
Wang tile can be implemented as a DNA molecule. Appropriate design of the
molecule will encourage assembly into two-dimensional sheets.
The problem, then, is to ensure that the growth process results in tile ar-
rangements in which all tiles match with their neighbors. It is easy, however, to
envision ways of putting the tiles together so that the tiles match at each step but
soon create a configuration for which there is no way to proceed without creating
a mismatch or having to remove offending tiles. This situation is analogous to
the distinction between uncontrolled precipitation, which occurs rapidly when
there is a strong thermodynamic advantage to aggregation, and quality crystal
growth, which occurs slowly when there is a slight thermodynamic advantage
for molecules that bind in the preferred orientation, but other possible ways to
bind are disadvantageous.
A formalization of this notion for Wang tiles, the Tile Assembly Model,
supposes that each label on a Wang tile binds with a certain strength (typically,
0, 1, or 2) and that tiles will only stick to a growing assembly if they bind
(possibly via multiple bonds) with a total strength greater than some threshold ~
(typically 1 or 2~; tiles that bind with a weaker strength immediately fall off
(Winfree, 1998~. Under these rules, growth from a "seed tile" can result in a
unique, well defined pattern. Because Turing machines and cellular automata
can be simulated by this process, the Turing-universality of tiling is retained.
As an example, consider the seven tiles shown in Figure 2 assembling at ~ =
2. These tiles perform a simple computation they count in binary. Starting
with the seed tile, labeled S. the tiles with strength-2 bonds polymerize to form a
V-shaped boundary for the computation. There is a unique tile that can fit into
the nook of the V; because it makes two strength-1 bonds, it can in fact be added.
Two new nooks are created, and again a unique tile can be added in each loca-
tion. The assembly thus grows forever, counting and counting with unabated
madness.
Tiles can be added in any order, but the resulting pattern is the same. The
same basic self-assembly mechanisms used here are sufficient to perform more
sophisticated computations. No new ideas or mechanisms are necessary to ob-
tain fully programmable Turing-universal behavior.
EXPERIMENTAL ADVANCES
The first demonstration of these ideas two-dimensional, periodic arrays of
DNA tiles could hardly be called "algorithmic," but it did show that the se-
OCR for page 110
110
FRONTIERS OF ENGINEERING
bit =0
bit = 1
no rollover
'...
..~ .~ ~
.'.. .1 .:: .... ..;.::...'..
.~ ~ :
rollover
. .. ~ ALL
~ -$ ~
....... .......
`'...'.'.'.::::'..'.' 14
, ~, ,,i,, ,i, ,,~
~ ~ ..::... ~ ..;:.:;... ~ ...:::... ~
r W ......
...: .':'::'..,:: .'::::'...:, 1.'... 1...: 14
~ ~ ~ ^ ~
~ '.:: ::' '.: .'.;:.:; .1 : 1 ~ ::'. ~
...... . ~ . ~ ....... l . ~ ...... ~
~ .. .':':: '..' .'::::'. ." .'::"' '' "':':" ' 1 ''' "::::" " 1 ~
A T ~ ~ ~ ~ ~ ~
..' ..'.::'::.. '..' ..::'. ..'. .'.;:.:;.. "' ":::" "'' "::'::" '" 1. ~ ';
, - , ..... ) , ...... , id, . ,T, . ,i, ..... l,, ...... , ~
~ ' ' :::: ' :: ,.,' ':':" '':: "':':: ''"" '::::' "" "::"' '"' 1 '':: 1 ~
r : . ~ ~ + ) . ~ ~ ~
''..' ..:::.... ..;:.:;...." ':::' "': "::'::" '':: ":::" '"' ';:.:;' "' ":::" ": 1 ~ ':::' ~
.f ~ ~ ~ I. I act . ~ ~
,,. ,,.., ,,. ,...-... ,. .,.-... , ., . - ' -: '"" :::: "' 1 ~
r ''A" 'a" "A' 'A'' ''A" 'a" "A' 'A''
~ ~ ~ ~ ~ ~ ~ ~ ~ IS
FIGURE 2 A set of seven tiles that implement a binary counter when started with the
seed tile S. Strength-2 bonds are indicated by tile sides with two projections (or indenta-
tions); other bonds have strength 1. Arrows indicate sites where a tile may be added at
= 2. Source: Adapted with permission from Winfree, 2000.
quences given to the tiles' sticky ends could be used to program different peri-
odic arrangements of tiles (Winfree et al., 1998a). The encoding of tiles as DNA
DX molecules is illustrated in Figure 3; Figure 4 shows small crystals of DX
molecules adsorbed on mica, as they appear in the atomic force microscope.
Subsequent studies have shown that DNA tiles can be made from a variety of
different molecular structures. Thus, the principle that the arrangement of two-
dimensional tiles can be directed by programmable, sticky-end interactions ap-
pears to be quite robust.
The goal of creating three-dimensional, periodic arrays of DNA tiles, origi-
nally formulated by Seeman more than 20 years ago, remains an open problem
in the field. Once solved, it will allow for more sophisticated information-
processing techniques in algorithmic self-assembly, roughly analogous to the
increase in power from one-dimensional to two-dimensional cellular automata or
Turing machines.
For the time being, experimental demonstration of algorithmic self-
assembly has been confined to one- and two-dimensional assemblies. The first
use of one-dimensional algorithmic self-assembly appeared as the first step in
Adleman's original DNA-based computing demonstration; this process formally
corresponds to the generation of languages by finite-state machines. Further-
OCR for page 111
DNA COMPUTING BY SELF-ASSEMBLY
~ C ~ ~ C ~ ~ C ~ ~ C
Or
111
:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:
~ ,a~ . A
~~ ~~_~ ~:~ ~~ ~~ ~~
25 nanometers
of
I' ~_~A^~ ~~:~ ~~
FIGURE 3 DNA double-crossover molecules can implement abstract Wang tiles, pro-
ducing a two-dimensional lattice of DNA with binding interactions dictated by the DNA
sticky ends. Source: Adapted with permission from Winfree, 2000.
more, using one-dimensional, tile-based assembly, it is possible to read an input
string (encoded as a one-dimensional tile assembly) and generate an output
string consisting of the cumulative2 exclusive-OR (XOR) of the input string
(Mao et al., 2000~; this formally corresponds to a finite-state transducer.
The first two-dimensional, algorithmic self-assembly process to be expert
mentally demonstrated with DNA is a generalization of the one-dimensional
XOR example (Rothemund and Winfree, in preparation). Beginning with an
input row consisting of a single 1 in a sea of 0's, the next layer grows by placing
a 0 where both neighbors in the layer below are the same and a 1 where they are
different. This process, an example of a one-dimensional cellular automaton,
generates a fractal pattern known as the Sierpinski gasket.
In addition to the DNA required to construct the input, only four DNA tiles
are required (in principle) to grow arbitrarily large Sierpinski triangles. Experi-
mentally, error-free Sierpinski triangles as large as 8 x 16 have been observed by
atomic force microscopy. However, error rates (the frequency with which the
2The nth bit of the cumulative XOR gives the parity of the first n bits of the input sequence.
OCR for page 112
2
FRONTIERS OF ENGINEERING
FIGURE 4 Atomic force microscope image of DNA double-crossover crystals. Stripes
are spaced at 25 nm; individual 2 x 4 x 13 nm tiles are visible. Source: Image taken by
Nick Papadal~is, Winfree Lab.
wrong tile was incorporated into the crystal) ranged from 1 to 10 percent, and
many fragments appeared to have grown independently of the input structure. It
is clear that controlling nucleation and finding mechanisms to reduce the error
rates are critical challenges for making algorithmic self-assembly practical.
POTENTIAL TECHNOLOGICAL APPLICATIONS
Combinatorial Optimization Problems
Solving combinatorial optimization problems, in the spirit of Adleman's origi-
nal paper, was the first application considered for algorithmic self-assembly.
Adleman's essential insight is based on the fact that a class of hard computa-
tional problems, the NP-complete problems, share a common generate-and-test
form does a sequence exist that satisfies easy-to-check properties X, Y. . . ..
and Z. All known algorithms for NP-complete problems require exponential3
time or exponential parallelism. The basic idea is to use combinatorial chemis-
try techniques to simultaneously generate all potential solutions and then to filter
them, based on chemical properties related to the information they encode, leav-
ing at the end possibly only a single molecule that has all of the desired proper-
ties. If the final solution to the problem is defined by satisfying a small number
of simple properties as is the case for all NP-complete problems then this
approach can be used to find the solution in a short amount of time, if the
3Exponen~ial in the length of the problem description, in bits.
OCR for page 113
DNA COMPUTING BY SELF-ASSEMBLY
113
parallelism is sufficient. That a single cc of DNA in solution at reasonable
concentrations can contain 260 bits of information which can be acted on si-
multaneously by chemical operations gives us hope that the parallelism could
be sufficient.
By exploiting the situation in which multiple different tiles could be added
at a given location much like Adleman's assembly step that produced all pos-
sible paths through a graph self-assembly can generate a combinatorial set of
possible assemblies and then continue growing according to a process that tests
the information to see if it has the desired properties. Theoretical schemes have
been worked out that use a single self-assembly step to solve the Hamiltonian
path problem (HPP) (Winfree et al., 1998b), solve the Boolean formula
satisfiability problem (SAT) (Lagoudakis and LaBean, 2000), and perform other
math calculations (Reif, 1997~. How much computation could be done this way?
If assembly were to proceed with few errors, solving a 40-variable SAT problem
would require 30 milliliters of DNA at a tile concentration of 1 micromolar and
might be completed in a few hours. This "best possible" estimate corresponds to
10~2 bit operations per second not bad for chemistry but still low compared
to electronic computers.
The sheer speed and flexibility of silicon-based electronic computers make
them preferable to DNA computing, even if self-assembly were to proceed with-
out errors. We can conclude, then, that the low-hanging fruit are not to be found
in the field of combinatorial search. But the ability of self-assembly to perform
sophisticated computations suggests that we are making progress toward our
goal of understanding (and potentially exploiting) autonomous biochemical al-
gorithms. A more promising application is suggested by examining how self-
assembly is used in biology.
Programmable Nanofabrication
Biology uses algorithmically controlled growth processes to produce nano-
scale and hierarchically structured materials with properties far beyond the
capability of today's human technology. Does DNA-based algorithmic self-
assembly give us access to new and useful technological capabilities? The sim-
plest applications would make use of self-assembled DNA as a template or scaf-
fold for arranging other molecular components into a desired pattern. This could
be used for biochemical assays, novel materials, or devices. Seeman has envi-
sioned, for example, using periodic three-dimensional DNA lattices to assist
with difficult protein crystallization or to direct construction of molecular elec-
tronic components into a memory (Robinson and Seeman, 1987~.
The potential of self-assembly for fabricating molecular electronic circuits
is intriguing, given the limitations of conventional silicon-circuit fabrication tech-
niques. Photolithography is unable to create features significantly smaller than
the wavelength of light, and even if it could, for several-nanometer line widths
OCR for page 120
120
FRONTIERS OF ENGINEERING
Existing directed-evolution processes are intended to improve biological se-
quences (genes, genomes) rapidly for complex but specific performance charac-
teristics in environments that closely resemble the conditions of final applica-
tion. All of the host genes other than the proteins) or genesis) being evolved are
kept constant so that all of the computational power of the population (typically
a molecular library created in vitro) is focused on optimizing the target protein
for specific performance criteria and in a specific environment. This kind of
laboratory-directed molecular computation, called screening, is used to measure
the sum of the performance alterations obtained by multiple mechanisms, in-
cluding DNA transcription; mRNA folding; translation and degradation; and
protein expression, folding, interaction, special distribution, specific activity, and
degradation.
Although molecular computation is widely regarded as futuristic, the natural
computation methods used in directed evolution of biological molecules have
proven to be practical and have generated strong commercial support. Laboratory-
directed derivatives of the natural design process have been used to create a variety
of valuable commercial products and product candidates in the areas of pharma-
ceuticals and vaccines (Maxygen, Inc.), chemicals, chemical processes and fer-
mentation strains (Codexis, Inc.), and agriculture (Verdict, Inc.~. These high-value
examples are important for the widespread recognition of molecular computation
as a powerful and commercially proven design process.
Natural computing and evolutionary optimization can, in principle, be ex-
tended to the design of a wide variety of nonbiological objects. Developing
better methods of universal (i.e., numerical) molecular computation is the goal
of most scientists in the field. However, nonnumerical applications of molecular
computation may, in aggregate, prove to be just as useful and can be realized
more immediately. Until there are practical formats for numerical molecular
computation, these nonnumerical formats will serve as examples of the power
promised by molecular computation.
OCR for page 121
Challenges and Opportunities in
Programming Living Cells
RON WEISS
Department of Electrical Engineering
Princeton University
Princeton, New Jersey
With recent advances in our understanding of cellular processes and DNA
synthesis methods, we can now regard cells as programmable matter. Cells
naturally process internal and environmental information in complex fashions
and interact with neighboring cells to achieve coordinated behavior. Through
genetic engineering, we can now equip cells with new sophisticated capabilities
for gene regulation, information processing, and communication. These new
capabilities serve as catalysts for synthetic biology, an emerging engineering
discipline to program cell behaviors as easily as we program computers.
Synthetic biology will benefit a wide variety of existing fields and enable us
to harness cells for applications that are not feasible today. Applications include
tissue engineering, molecular fabrication of biomaterials and nanostructures, syn-
thesis of pharmaceutical products, biosensing, and will surely lead to quantita-
tive insights into the operating principles that govern living organisms. Re-
search so far has focused on building an enabling infrastructure for synthetic
biology applications. A particular emphasis has been on constructing prototype
synthetic gene networks that perform digital computation, analog computation,
signal processing, and communications. In this paper, I will describe the build-
ing blocks for these genetic circuits, several intracellular and intercellular proto-
type genetic circuits that have been implemented recently, some of the chal-
lenges in designing such circuits, and the long-term significance of this work.
SYNTHETIC GENE NETWORKS
Genetic circuits are collections of basic elements that interact to produce a
particular behavior. These elements include: DNA regions where RNA poly-
121
OCR for page 122
22
FRONTIERS OF ENGINEERING
merase molecules bind and initiate transcription of DNA into messenger RNA
(mRNA); mRNA sequences where ribosomal RNA molecules bind and initiate
translation of mRNA into proteins; proteins that regulate the activity and pro-
duction of other proteins; DNA regions that terminate transcription; and motifs
that determine protein and mRNA stability. Each element serves a particular
function that helps accomplish the overall behavior of the genetic circuit.
The basic elements can be grouped into units (or "devices") that perform
logic operations. For example, Figure 1 shows a logic unit that implements the
digital NOT operation using the biochemical reactions of transcription, transla-
tion, and protein decay. Based on this mechanism, we have constructed and
tested synthetic gene networks with units that implement the NOT, AND, and IM-
PMES logic functions (Weiss, 2001; Weiss and Knight, 2000~. In addition, we
also proposed and modeled a biochemical NAND gate that consists of two NOT
gates whose outputs are connected with a wire-oR (Weiss et al., l999~. Theoreti-
cally, any arbitrary digital logic function can be implemented with genetic cir-
cuits using these gates.
By constructing biochemical logic circuits and embedding them in cells, one
can extend or modify the behavior of cells. Consider how computer designers
program the behavior of computers or robots by fabricating silicon-based logic
circuits. To program cells in an analogous way, the genetic-circuit designer
constructs a DNA sequence that encodes a particular circuit and then embeds
this DNA molecule in cells (a process called transformation). Typically, the
purpose of this network is to regulate precisely the production of proteins. Be-
cause proteins essentially perform all the "work" and information processing in
the cell, cell behavior can be programmed by controlling when and under what
conditions proteins are produced and degraded in the cell.
To date, several small synthetic gene networks have been built that accom-
plish specific genetic regulatory functions in vivo: the autorepressor (Figure 2a),
in which a repressor regulates its own production to reduce noise in gene expres-
sion (Becskei and Serrano, 2000~; the toggle-switch (Figure 2b), in which two
repressors inhibit each other's production to achieve a bistable system (Gardner
et al., 2000~; the repressilator (Figure 2c), in which three repressors are con-
nected in a ring topology to produce repeated oscillation (Elowitz and Leibler,
2000~; the genetic clock and toggle switch constructed from transcriptional re-
pressors and activators (Atkinson et al., 2003~; and our synthetic gene networks
used for engineering digital logic gates and circuits in cells (Weiss and Basu,
2002; Weiss et al., 1999~.
Despite their logically simple functions, the difficulties in building these
networks revealed that many challenges will be faced in building circuits of
increasing complexity. For example, designers of genetic circuits will have to
cope with the significant noise inherent in gene expression (Becskei and Serrano,
2000; Elowitz et al., 2002; McAdams and Arkin, 1997) and will have to match
carefully the "impedances" of constituent devices to achieve compound circuits
OCR for page 123
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 123
X=0
Y=1
-
output protein Y
/Transcription/
/ Translation
Y=0
input protein X
(repressor)
V
FIGURE 1 A simplified view of the two cases for a biochemical inverter, a logic device
whose output value is always the inverse of its input value. The concentration of two
particular proteins represent the input and output logic signals. In the first case (left),
input protein X is absent, and the cell transcribes the coding sequence for output protein
Y (labeled CDS). In the second case (right), input protein X is present and binds specifi-
cally to the gene at the promoter site (labeled P), preventing the cell from expressing
output protein Y.
that operate properly and reliably (Weiss and Basu, 2002~. Because these cir-
cuits operate in the context of a living organism that already has an existing
"program," understanding the interactions of the exogenous circuit with the en-
dogenous elements will be critical. When designing these circuits, such interac-
tions will have to be minimized to avoid undesired cross talk and interference
with normal cellular processes. The exception to this design rule applies when
control over endogenous cell behavior is intentional and desired (e.g., control-
ling cellular metabolism or growth).
COMMUNICATION AND SIGNAL PROCESSING
Intercellular communication allows individual cells to coordinate their be-
havior and accomplish sophisticated tasks they simply cannot perform alone. In
higher level organisms, such as mammals, eukaryotic cells send and receive
signals to perform a wide range of activities, from differentiation and growth
during embryogenesis to immune and stress responses during adult life. How-
ever, cell-to-cell communication is not exclusive to higher level organisms. Bac-
terial cells, for example, are known to regulate gene expression based on their
OCR for page 124
124
(a)
(b)
1 Rex
It I2
FRONTIERS OF ENGINEERING
(2a)
AWL 1
(2b)
fluorescence (A.U.)
AL
FIGURE 2 An overview of three synthetic gene networks, their corresponding logic
circuits, and experimental behavior. These networks have been implemented to demon-
strate specific genetic regulatory tasks. 2a. The auto-repressor circuit reduces variations
in gene expression. Source: Becskei and Serrano, 2000. Reprinted with permission. 2b.
The toggle switch circuit is a bistable system. Source: Gardner et al., 2000. Reprinted
with permission. 2c. The repressilator circuit exhibits periodic oscillation in gene ex-
pression. Source: Elowitz end Leibler, 2000. Reprinted with permission.
OCR for page 125
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 125
own cell density by secreting and then detecting concentrations of unique bio-
chemical signals in a behavior known as quorum sensing (Bassler, 1999~. This
coordinated behavior gives bacterial cells a competitive advantage both as patho-
gens and in symbiotic relationships with their hosts.
To explore potential applications that would benefit from coordinated, mul-
ticellular behavior, we have begun to integrate communication capabilities with
various synthetic genetic regulatory and information-processing networks. To-
ward this end, we first programmed Escherichia cold cells to communicate with
each other by connecting several transcriptional regulatory elements with previ-
ously unrelated signaling elements from the marine bacterium Vibrio fischeri
(Weiss and Knight, 2000~. In the experimental system, we externally induced
"sender" cells to synthesize the production of a small molecule (30C6HSL) that
then diffused outside the cell membrane and entered the cytoplasm of nearby
"receiver" cells. The receiver cells responded to this chemical message by ex-
pressing a green fluorescent protein that was visible under a microscope (Figure
3~. We are now extending this work to achieve programmed, two-way commu-
nications by constructing new circuits that use multiple signal synthesis and
response elements from various bacterial sources. Two important challenges in
engineering sophisticated and robust, multisignal, cell-to-cell communication
networks will be matching response sensitivities and reducing cross talk between
the signals.
Cells naturally analyze cell-to-cell communication and various environmen-
tal conditions with elaborate signal-processing circuitry that includes both digital
and analog components. The ability of cells to detect and subsequently react to
environmental and internal signals is a principal characteristic of many biologi-
cal phenomena. Examples include the movement of bacteria towards higher
concentrations of nutrients through chemotaxis, the release of fuel molecules in
response to hormones that signal hunger, and cell differentiation during embryo-
genesis based on chemical gradients. To extend a cell's ability to respond to
40 minutes
5:00 hours
7:30 hours
FIGURE 3 Time-series fluorescence images illustrating the response of colonies of
engineered receiver cells to communication from nearby sender cells on a petri dish.
OCR for page 126
26
FRONTIERS OF ENGINEERING
internal and environmental stimuli beyond the digital realm, we began to engi-
neer analog signal-processing capabilities using synthetic gene networks. In our
laboratory, we have built genetic circuits that enable cells to detect various chemi-
cal concentrations (below and above prespecified thresholds or only within cer-
tain ranges) and other circuits that allow cells to respond to multiple environ-
mental signals (Basu et al., 2003; Weiss et al., 2003~.
As a prototype for exploring issues in engineering signal-processing capa-
bilities, we designed a new genetic chemical-source pinpointing circuit. The
circuit is able to detect the presence of a particular extracellular molecule and
then distinguish between various chemical concentrations of the molecule (Basu
et al., 2003~. Consider a toxin analyte whose location or even presence in the
environment is unknown. The analyte is secreted from a particular pathogen and
forms a chemical gradient centered on the source. A potential method of deter-
mining the location of the pathogen is to spread engineered sentinel cells in the
suspected environment that can detect prespecified chemical concentrations. For
a given concentration range, these cells will fluoresce in a ring pattern around
the source. If the cells are engineered to detect multiple ranges distinguishable
by different fluorescent colors, a bullseye pattern will emerge with several con-
centric rings around the analyte source (Figure 4~.
The source-pinpointing circuit consists of several components that first detect
the external analyte concentration and then determine whether the concentration
falls above a prespecified low threshold and below a high threshold. Figure 5
shows two separate experiments of "sentinel" bacterial cells that have been pro-
grammed to respond to either low or high concentration thresholds of a biochemi-
cal secreted from nearby sender cells (Basu et al., 2003~. By combining and
inverting the low and high threshold outputs, one can realize a circuit that responds
with a high fluorescent output only when the analyte concentration is within the
prespecified range. With the aid of simulation tools, we have also been able to fine
tune the threshold responses of these circuits by modifying the DNA sequences of
chosen genetic elements. An important long-term goal for this type of research is
to be able to tune the responses of genetic circuits with the same predictability and
reliability as we can when we design electrical devices. By combining this type of
analog information processing with digital computation and programmed cell-to-
cell communication, we may be able to create a flexible and powerful engineering
discipline for programmed cell behaviors.
KEY CHALLENGES
The initial modeling and experimental efforts in this field have generated a
great deal of excitement but have also revealed many challenges that must be
faced. Some of the significant challenges are described below.
Programming complex behavior will require the assembly of a large, well
characterized library of intracellular and intercellular components. The library
OCR for page 127
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 127
a ~ ana~yte
"sentinel" ceil
FIGURE 4 4a. Chemical-source pinpointing can be accomplished using cells with a
genetic circuit that expresses a green fluorescent protein only when a given analyte con-
centration falls within a particular range. Because the chemical concentration is highest
at the source and forms a gradient as the distance from the chemical source increases,
cells with this circuit will form a fluorescent ring around the source. 4b. Given a small
library of circuits that can detect different chemical concentration ranges with unique
fluorescent colors, cells will form separate rings centered on the source that result in a
bullseye pattern.
should include elements for regulating transcription and translation, as well as
elements for regulating protein-to-protein interactions, such as phosphorylation-
based signal-transduction cascades. Most of the library elements must exhibit
minimal cross talk and must not affect the host's behavior. However, an impor-
tant subset of this component library should be devoted to interfacing with the
host to control desired cellular functions, such as the production of specific
enzymes during predefined conditions, control over cell replication, and pro-
grammed secretion of various chemicals.
Designing operational and efficient genetic circuits will require models and
simulation tools that can provide accurate quantitative predictions of circuit
behavior. Engineered genetic circuits operate within a highly complex environ-
OCR for page 128
128
FRONTIERS OF ENGINEERING
FIGURE 5 Fluorescence microscope images of "sentinel" bacterial cells programmed
to detect whether the concentration of a chemical secreted by another type of bacteria is
above or below particular thresholds. The bacteria that secrete the biochemical are fluo-
rescing in red, while the sentinel bacteria scattered throughout the entire viewing area are
fluorescing in yellow when the prespecified detection conditions are satisfied. 5a. Con-
centration above a high threshold. 5b. Concentration below a low threshold.
ment that is not well characterized and whose effects on the operation of the
circuit have not been quantified. Models are still unable to predict the precise
concentration averages and population statistics of the molecular components of
even relatively simple systems. Solutions to this challenge will require more
precise kinetic rates for describing the relevant biochemical reactions, models
that incorporate additional cellular state (e.g., accurate RNA polymerase and
ribosomal RNA levels), accurate predictions of noise in the biochemical reac-
tions, a physical model of the cells and their surroundings, and potentially com-
pletely different modeling techniques. Furthermore, as the complexity of engi-
neered circuits increases, they begin placing a significant metabolic burden on
OCR for page 129
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 129
the host. Molecular interactions between the exogenous components and the
endogenous cellular circuitry can also affect host behavior. Effects on the host's
environment must be understood and modeled to design complex synthetic
circuits.
Genetic circuits must integrate specific components or network motifs that
make them robust to fluctuations in the kinetics of biochemical reactions. Gene
expression tends to be noisy because of the stochastic nature of the constituent
biochemical reactions (Elowitz et al., 2002; McAdams and Arkin, 1997~. In
addition, fluctuations in environmental conditions, such as temperature and nu-
trient levels, affect cellular metabolism and consequently the operation of ge-
netic circuits. Circuits that achieve reproducible, reliable behavior must do so
despite components whose behavior fluctuates considerably. Mitigating the ef-
fects of gene expression noise will probably require a solution that incorporates
positive and negative feedback loops.
A major bottleneck in genetic-circuit engineering is the difficulty of synthe-
sizing DNA constructs. Anecdotal evidence suggests that the construction of
new strands of DNA consumes a significant portion, if not most, of the time
spent in circuit engineering. Currently, several ongoing efforts are trying to
remove this bottleneck by using various de novo DNA-synthesis methods.
Genetic-circuit design will require novel approaches that may be fundamen-
tally different from existing computer-circuit design methodologies. In con-
structing our circuits, we have used "rational" design in which detailed models
are used to guide the circuit engineering, helping to select appropriate compo-
nents and suggesting how to mutate them, when necessary, to achieve correct
circuit behavior (Weiss and Basu, 2002~. We have also used another technique
called directed evolution to optimize circuit behavior (Yokobayashi et al., 2002~.
Building on nature's fundamental principle of evolution, this unique process
directs cells to mutate their own DNA until they find gene network configura-
tions that exhibit the desired system characteristics. Because our understanding
of cellular processes is incomplete, efficient circuit design will most likely re-
quire a combination of rational design and directed evolution, or perhaps some
other completely different approach.
~ ~ ~ ~ . ~ ~
LONG-TERM SIGNIFICANCE
The field of synthetic gene networks is still in its infancy. Researchers are
currently trying to build small genetic-regulatory systems that exhibit a particu-
lar behavior reliably and predictably. Solving the existing challenges in building
complex genetic circuits will take considerable effort. However, once the major
challenges are solved, the construction of synthetic gene networks will enable us
to direct cells to perform sophisticated digital and analog functions, both as
individual entities and as part of larger cell communities. An engineering disci-
pline to program cell behaviors and its associated tools will advance the capabili-
OCR for page 130
130
FRONTIERS OF ENGINEERING
ties of genetic engineering and allow us to harness cells for a myriad of new
applications. As a result, someday we will be able to modify and extend the
behavior of cells in almost arbitrary ways. Because this technology will allow us
to modify the instructions that govern the capabilities of organisms that live
around us, as well as our own body's instructions, this technology has tremen-
dous potential for improving control over the environment, our surroundings,
and our quality of life.
REFERENCES
Atkinson, M.R., M.A. Savageau, J.T. Myers, and A.J. Ninfa. 2003. Development of genetic cir-
cuitry exhibiting toggle switch or oscillatory behavior in Escherichia coli. Cell 113(5): 597-
607.
Bassler, B.L. 1999. How bacteria talk to each other: regulation of gene expression by quorum
sensing. Current Opinion in Microbiology 2(6): 582-587.
Basu, S., D. Karig, and R. Weiss. 2003. Engineered signal processing of cells: towards molecular
concentration band detection. Natural Computing, an International Journal 2:4(463-478).
Becskei, A., and L. Serrano. 2000. Engineering stability in gene networks by autoregulation.
Nature 405(6786): 590-593.
Elowitz, M.B., and S. Leibler. 2000. A synthetic oscillatory network of transcriptional regulators.
Nature 403(6767): 335-338.
Elowitz, M.B., A.J. Levine, E.D. Siggia, and P. Swain. 2002. Stochastic gene expression in a single
cell. Science 297(5584): 1183-1186.
Gardner, T.S., C.R. Cantor, and J.J. Collins. 2000. Construction of a genetic toggle switch in
Escherichia coli. Nature 403(6767): 339-342.
McAdams, H.H., and A. Arkin. 1997. Stochastic mechanisms in gene expression. Proceedings of
the National Academy of Sciences 94(3): 814-819.
Weiss, R. 2001. Cellular Computation and Communications Using Engineered Genetic Regulatory
Networks. Ph.D. thesis, Massachusetts Institute of Technology, September 2001.
Weiss, R., and S. Basu. 2002. The device physics of cellular logic gates. Pp. 54-61 in NSC-1: The
First Workshop of Non-Silicon Computing, Boston, Massachusetts, February 2002. Available
online at: .
Weiss, R., S. Basu, A. Kalmbach, S. Hooshangi, D. Karig, R. Mehreja, and I. Netravali. 2003.
Genetic circuit building blocks for cellular computation, communications, and signal process-
ing. Natural Computing, an International Journal 2(1): 47-84.
Weiss, R., G. E. Homsy, and T. F. Knight, Jr.. 1999. Pp. 275-295 in Dimacs Workshop on
Evolution as Computation. Princeton, N.J.: Springer Verlag.
Weiss, R., and T.F. Knight, Jr. 2000. Engineered Communications for Microbial Robotics. Pp. 1-
16 in DNA6: Sixth International Workshop on DNA-Based Computers, DNA2000, Leiden,
The Netherlands, June 2000. New York: Springer-Verlag.
Yokobayashi, Y., R. Weiss, and F.H. Arnold. 2002. Directed evolution of a genetic circuit. Pro-
ceedings of the National Academy of Sciences 99(26): 16587-16591.
Representative terms from entire chapter:
biochemical circuits