Click for next page ( 104


The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
Terms of Use and Privacy Statement



Below are the first 10 and last 10 pages of uncorrected machine-read text (when available) of this chapter, followed by the top 30 algorithmically extracted key phrases from the chapter as a whole.
Intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text on the opening pages of each chapter. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Do not use for reproduction, copying, pasting, or reading; exclusively for search engines.

OCR for page 103
BTOMOLECULAR COMPUTING

OCR for page 103

OCR for page 103
DNA Computing by Self-Assembly ERIK WINFREE Departments of Computer Science and Computation and Neural Systems California Institute of Technology Pasadena, California INFORMATION AND ALGORITHMS IN BIOCHEMISTRY Information and algorithms appear to be central to biological organization and processes, from the storage and reproduction of genetic information to the control of developmental processes to the sophisticated computations performed by the nervous system. Much as human technology uses electronic microproces- sors to control electromechanical devices, biological organisms use biochemical circuits to control molecular and chemical events. The engineering and pro- gramming of biochemical circuits, in vivo and in vitro, would transform indus- tries that use chemical and nanostructured materials. Although the construction of biochemical circuits has been explored theoretically since the birth of molecu- lar biology, our practical experience with the capabilities and possible program- ming of biochemical algorithms is still very young. In this paper, I will review a simple form of biochemical algorithm based on the molecular self-assembly of heterogeneous crystals that illustrates some as- pects of programming in vitro biochemical systems and their potential applica- tions. There are two complementary perspectives on molecular computation: (1) using the astounding parallelism of chemistry to solve mathematical prob- lems, such as combinatorial search problems; and (2) using biochemical algo- rithms to direct and control molecular processes, such as complex fabrication tasks. The latter currently appears to be the more promising of the two. Some major theoretical issues are common to both approaches how algo- rithms can be encoded efficiently in molecules with programmable binding in- teractions and how these algorithms can be shown to be robust to asynchronous and unreliable molecular processes. Proof-of-principle has been experimentally 105

OCR for page 103
106 FRONTIERS OF ENGINEERING demonstrated using synthetic DNA molecules; how well these techniques scale remains to be seen. ALGORITHMIC SELF-ASSEMBLY AS GENERALIZED CRYSTAL GROWTH The idea of algorithmic self-assembly arose from the combination of DNA computing (Adleman, 1994), the theory of tilings (Grunbaum and Sheppard, 1986), and DNA nanotechnology (Seeman, 1982; Seeman, 2003~. Conceptually, algorithmic self-assembly naturally spans the range between maximal simplicity (crystals) and arbitrarily complex information processing. Furthermore, it is amenable to experimental investigation, so we can rigorously probe our under- standing of the physical phenomena involved. This understanding may eventu- ally result in new nanostructured materials and devices. DNA Computing Leonard Adleman's original paper on DNA computing contained the seed of the idea we'll pursue here that the programmability of DNA hybridization reactions can be used to direct self-assembly according to simple rules. In the first combinatorial-generation step of Adleman's procedure, DNA molecules rep- resenting all possible paths through the target graph were assembled by DNA hybridization in a single step. The basic idea (Figure 1) is for a set of molecules with unique sequences to represent the vertices and edges of the graph, thus governing which vertices can follow which other vertices. Each possible se- quence of hybridization reactions, occurring spontaneously in any order, pro- duces a double-stranded DNA molecule whose sequence encodes a valid path through the graph. By thus generalizing one-dimensional polymerization to in- clude programmable binding, Adleman coaxed the DNA to generate patterns that follow certain mathematical rules. This is an elegant idea and it works! The problem is that only simple computations can be performed with linear self- assembly. Paths through graphs correspond to regular languages, which have the complexity of finite-state machines thus more sophisticated aspects of com- putation cannot be reached by this technique. Tiling Theory A tiling is an arrangement of a few basic shapes (called tiles) that fit to- gether perfectly in the infinite plane. For each tiling, the set of shapes must be finite; for example, the tile set could consist of an octagon and a square, both with unit-length sides. One motivation for studying tiling is that the tiles corre- spond to the periodic arrangement of atoms in crystals. A remarkable result is that all possible periodic arrangements can be classified according to their

OCR for page 103
107 lo l O. , <, a. ~ o o C) -( ~ '} o. O. O. at_ { W~ ~ ~ a, aim ~ ~ ~C1 ~ ~ O ~ <~m 3 :O, , aft, , 1~} 'I'm {o:= ~ ~ ~C, , - ~ ~ C~ ~C, , ,-~ =.~ {=E =~= ~CI , -: ~ ~ ~C, , ,-~ =~ {Poe= ~ ~ -1 } ~ ~ Tic l in 0 {oh= ~o ~C1 , ,-_) O ~ -~ =3 a, , ~ ~ ~ a, , ~ S~ . - = i of i g ~ .~3 ~- ~ a A ~ ~ a, ~ .o = :d Od ~ ic ~004 ~C' ~ #= ~ ~

OCR for page 103
108 FRONTIERS OF ENGINEERING fundamental symmetries; in three dimensions there are 230 symmetries, and in two dimensions there are 17 symmetries. This suggests that, given a finite set of polygonal tiles, one should be able to determine whether they can be arranged according to one of the known symmetries, or whether there is no way to arrange them on the plane. This is what Hao Wang thought in the 1960s, but when he looked into the question, known as the tiling problem, he discovered that it is provably unsolv- able (Wang, 1963~! That is to say, aperiodic tilings are also possible. In addi- tion, it can be incredibly difficult to determine whether a given set of tiles can tile the plane aperiodically or whether every attempt will ultimately fail. To prove this result, Wang developed a way to create a set of tiles that fit together uniquely to reproduce the space-time history of any chosen Turing1 machine, in such a way that, if the Turing machine halts (with an output), then the attempted tiling has to get stuck; if the Turing machine continues computing forever, then a consistent global tiling is possible. Thus, the tiling problem reduces to the halting problem, the first problem proved to be formally undecidable. This result shows that tiling is theoretically as powerful as general-purpose computers. In fact, the tiles Wang used were all essentially square, distinguished only by labels on their sides that had to match up when the tiles were juxtaposed. Thus, the complexity arises from the logical constraints in how the tiles fit together, rather than from the tiles themselves. Given the intimate relation between crystals and tiling theory, it is natural to ask if crystal growth has the potential to compute as powerfully. To answer this question, we need two things: (1) the ability to design molecular Wang tiles; and (2) precise rules for crystal growth that can be implemented reliably. DNA Nanotechnology We now turn to DNA nanotechnology, the brainchild of Nadrian Seeman's vision of using DNA as an architectural element (Seeman, 1982~. Like RNA, DNA can make structures other than the usual double helix. These other struc- tures include hairpins and three- and four-way branch points, which are impor- tant for biological function. Seeman, however, pictured these structures as hinges and joints, bolts and braces that could be programmed to fold and bind to each other by careful design of the DNA base sequence. Seeman and his students constructed a wide variety of amazing nanostructures: a wire-frame cube and truncated octahedron; single-stranded DNA and RNA knots, including the tre- 1Turing machines, invented by Alan Turing in 1936, are extremely simple computers that consist of a finite-state compute head that can move back and forth on an infinite one-dimensional memory tape. Turing showed that these machines are universal in the sense that they can perform any computation that can be performed by any other mechanical device there is no fundamental need to use a more complicated kind of computer!

OCR for page 103
DNA COMPUTING BY SELF-ASSEMBLY 109 foil, the figure-eight, and Borromean rings; and rigid building-block structures, such as triangles and four-armed "bricks" known as double-crossover (DX) mol- ecules; and more (Seeman, 2003~. The idea, then, is to use these "bricks" as molecular Wang tiles (Winfree et al., 1998a). The four arms of the DX molecules can be given sequences corre- sponding to the labels on the four sides of the Wang tiles. Thus, any chosen Wang tile can be implemented as a DNA molecule. Appropriate design of the molecule will encourage assembly into two-dimensional sheets. The problem, then, is to ensure that the growth process results in tile ar- rangements in which all tiles match with their neighbors. It is easy, however, to envision ways of putting the tiles together so that the tiles match at each step but soon create a configuration for which there is no way to proceed without creating a mismatch or having to remove offending tiles. This situation is analogous to the distinction between uncontrolled precipitation, which occurs rapidly when there is a strong thermodynamic advantage to aggregation, and quality crystal growth, which occurs slowly when there is a slight thermodynamic advantage for molecules that bind in the preferred orientation, but other possible ways to bind are disadvantageous. A formalization of this notion for Wang tiles, the Tile Assembly Model, supposes that each label on a Wang tile binds with a certain strength (typically, 0, 1, or 2) and that tiles will only stick to a growing assembly if they bind (possibly via multiple bonds) with a total strength greater than some threshold ~ (typically 1 or 2~; tiles that bind with a weaker strength immediately fall off (Winfree, 1998~. Under these rules, growth from a "seed tile" can result in a unique, well defined pattern. Because Turing machines and cellular automata can be simulated by this process, the Turing-universality of tiling is retained. As an example, consider the seven tiles shown in Figure 2 assembling at ~ = 2. These tiles perform a simple computation they count in binary. Starting with the seed tile, labeled S. the tiles with strength-2 bonds polymerize to form a V-shaped boundary for the computation. There is a unique tile that can fit into the nook of the V; because it makes two strength-1 bonds, it can in fact be added. Two new nooks are created, and again a unique tile can be added in each loca- tion. The assembly thus grows forever, counting and counting with unabated madness. Tiles can be added in any order, but the resulting pattern is the same. The same basic self-assembly mechanisms used here are sufficient to perform more sophisticated computations. No new ideas or mechanisms are necessary to ob- tain fully programmable Turing-universal behavior. EXPERIMENTAL ADVANCES The first demonstration of these ideas two-dimensional, periodic arrays of DNA tiles could hardly be called "algorithmic," but it did show that the se-

OCR for page 103
110 FRONTIERS OF ENGINEERING bit =0 bit = 1 no rollover '... ..~ .~ ~ .'.. .1 .:: .... ..;.::...'.. .~ ~ : rollover . .. ~ ALL ~ -$ ~ ....... ....... `'...'.'.'.::::'..'.' 14 , ~, ,,i,, ,i, ,,~ ~ ~ ..::... ~ ..;:.:;... ~ ...:::... ~ r W ...... ...: .':'::'..,:: .'::::'...:, 1.'... 1...: 14 ~ ~ ~ ^ ~ ~ '.:: ::' '.: .'.;:.:; .1 : 1 ~ ::'. ~ ...... . ~ . ~ ....... l . ~ ...... ~ ~ .. .':':: '..' .'::::'. ." .'::"' '' "':':" ' 1 ''' "::::" " 1 ~ A T ~ ~ ~ ~ ~ ~ ..' ..'.::'::.. '..' ..::'. ..'. .'.;:.:;.. "' ":::" "'' "::'::" '" 1. ~ '; , - , ..... ) , ...... , id, . ,T, . ,i, ..... l,, ...... , ~ ~ ' ' :::: ' :: ,.,' ':':" '':: "':':: ''"" '::::' "" "::"' '"' 1 '':: 1 ~ r : . ~ ~ + ) . ~ ~ ~ ''..' ..:::.... ..;:.:;...." ':::' "': "::'::" '':: ":::" '"' ';:.:;' "' ":::" ": 1 ~ ':::' ~ .f ~ ~ ~ I. I act . ~ ~ ,,. ,,.., ,,. ,...-... ,. .,.-... , ., . - ' -: '"" :::: "' 1 ~ r ''A" 'a" "A' 'A'' ''A" 'a" "A' 'A'' ~ ~ ~ ~ ~ ~ ~ ~ ~ IS FIGURE 2 A set of seven tiles that implement a binary counter when started with the seed tile S. Strength-2 bonds are indicated by tile sides with two projections (or indenta- tions); other bonds have strength 1. Arrows indicate sites where a tile may be added at = 2. Source: Adapted with permission from Winfree, 2000. quences given to the tiles' sticky ends could be used to program different peri- odic arrangements of tiles (Winfree et al., 1998a). The encoding of tiles as DNA DX molecules is illustrated in Figure 3; Figure 4 shows small crystals of DX molecules adsorbed on mica, as they appear in the atomic force microscope. Subsequent studies have shown that DNA tiles can be made from a variety of different molecular structures. Thus, the principle that the arrangement of two- dimensional tiles can be directed by programmable, sticky-end interactions ap- pears to be quite robust. The goal of creating three-dimensional, periodic arrays of DNA tiles, origi- nally formulated by Seeman more than 20 years ago, remains an open problem in the field. Once solved, it will allow for more sophisticated information- processing techniques in algorithmic self-assembly, roughly analogous to the increase in power from one-dimensional to two-dimensional cellular automata or Turing machines. For the time being, experimental demonstration of algorithmic self- assembly has been confined to one- and two-dimensional assemblies. The first use of one-dimensional algorithmic self-assembly appeared as the first step in Adleman's original DNA-based computing demonstration; this process formally corresponds to the generation of languages by finite-state machines. Further-

OCR for page 103
DNA COMPUTING BY SELF-ASSEMBLY ~ C ~ ~ C ~ ~ C ~ ~ C Or 111 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~: ~ ,a~ . A ~~ ~~_~ ~:~ ~~ ~~ ~~ 25 nanometers of I' ~_~A^~ ~~:~ ~~ FIGURE 3 DNA double-crossover molecules can implement abstract Wang tiles, pro- ducing a two-dimensional lattice of DNA with binding interactions dictated by the DNA sticky ends. Source: Adapted with permission from Winfree, 2000. more, using one-dimensional, tile-based assembly, it is possible to read an input string (encoded as a one-dimensional tile assembly) and generate an output string consisting of the cumulative2 exclusive-OR (XOR) of the input string (Mao et al., 2000~; this formally corresponds to a finite-state transducer. The first two-dimensional, algorithmic self-assembly process to be expert mentally demonstrated with DNA is a generalization of the one-dimensional XOR example (Rothemund and Winfree, in preparation). Beginning with an input row consisting of a single 1 in a sea of 0's, the next layer grows by placing a 0 where both neighbors in the layer below are the same and a 1 where they are different. This process, an example of a one-dimensional cellular automaton, generates a fractal pattern known as the Sierpinski gasket. In addition to the DNA required to construct the input, only four DNA tiles are required (in principle) to grow arbitrarily large Sierpinski triangles. Experi- mentally, error-free Sierpinski triangles as large as 8 x 16 have been observed by atomic force microscopy. However, error rates (the frequency with which the 2The nth bit of the cumulative XOR gives the parity of the first n bits of the input sequence.

OCR for page 103
2 FRONTIERS OF ENGINEERING FIGURE 4 Atomic force microscope image of DNA double-crossover crystals. Stripes are spaced at 25 nm; individual 2 x 4 x 13 nm tiles are visible. Source: Image taken by Nick Papadal~is, Winfree Lab. wrong tile was incorporated into the crystal) ranged from 1 to 10 percent, and many fragments appeared to have grown independently of the input structure. It is clear that controlling nucleation and finding mechanisms to reduce the error rates are critical challenges for making algorithmic self-assembly practical. POTENTIAL TECHNOLOGICAL APPLICATIONS Combinatorial Optimization Problems Solving combinatorial optimization problems, in the spirit of Adleman's origi- nal paper, was the first application considered for algorithmic self-assembly. Adleman's essential insight is based on the fact that a class of hard computa- tional problems, the NP-complete problems, share a common generate-and-test form does a sequence exist that satisfies easy-to-check properties X, Y. . . .. and Z. All known algorithms for NP-complete problems require exponential3 time or exponential parallelism. The basic idea is to use combinatorial chemis- try techniques to simultaneously generate all potential solutions and then to filter them, based on chemical properties related to the information they encode, leav- ing at the end possibly only a single molecule that has all of the desired proper- ties. If the final solution to the problem is defined by satisfying a small number of simple properties as is the case for all NP-complete problems then this approach can be used to find the solution in a short amount of time, if the 3Exponen~ial in the length of the problem description, in bits.

OCR for page 103
DNA COMPUTING BY SELF-ASSEMBLY 113 parallelism is sufficient. That a single cc of DNA in solution at reasonable concentrations can contain 260 bits of information which can be acted on si- multaneously by chemical operations gives us hope that the parallelism could be sufficient. By exploiting the situation in which multiple different tiles could be added at a given location much like Adleman's assembly step that produced all pos- sible paths through a graph self-assembly can generate a combinatorial set of possible assemblies and then continue growing according to a process that tests the information to see if it has the desired properties. Theoretical schemes have been worked out that use a single self-assembly step to solve the Hamiltonian path problem (HPP) (Winfree et al., 1998b), solve the Boolean formula satisfiability problem (SAT) (Lagoudakis and LaBean, 2000), and perform other math calculations (Reif, 1997~. How much computation could be done this way? If assembly were to proceed with few errors, solving a 40-variable SAT problem would require 30 milliliters of DNA at a tile concentration of 1 micromolar and might be completed in a few hours. This "best possible" estimate corresponds to 10~2 bit operations per second not bad for chemistry but still low compared to electronic computers. The sheer speed and flexibility of silicon-based electronic computers make them preferable to DNA computing, even if self-assembly were to proceed with- out errors. We can conclude, then, that the low-hanging fruit are not to be found in the field of combinatorial search. But the ability of self-assembly to perform sophisticated computations suggests that we are making progress toward our goal of understanding (and potentially exploiting) autonomous biochemical al- gorithms. A more promising application is suggested by examining how self- assembly is used in biology. Programmable Nanofabrication Biology uses algorithmically controlled growth processes to produce nano- scale and hierarchically structured materials with properties far beyond the capability of today's human technology. Does DNA-based algorithmic self- assembly give us access to new and useful technological capabilities? The sim- plest applications would make use of self-assembled DNA as a template or scaf- fold for arranging other molecular components into a desired pattern. This could be used for biochemical assays, novel materials, or devices. Seeman has envi- sioned, for example, using periodic three-dimensional DNA lattices to assist with difficult protein crystallization or to direct construction of molecular elec- tronic components into a memory (Robinson and Seeman, 1987~. The potential of self-assembly for fabricating molecular electronic circuits is intriguing, given the limitations of conventional silicon-circuit fabrication tech- niques. Photolithography is unable to create features significantly smaller than the wavelength of light, and even if it could, for several-nanometer line widths

OCR for page 103
120 FRONTIERS OF ENGINEERING Existing directed-evolution processes are intended to improve biological se- quences (genes, genomes) rapidly for complex but specific performance charac- teristics in environments that closely resemble the conditions of final applica- tion. All of the host genes other than the proteins) or genesis) being evolved are kept constant so that all of the computational power of the population (typically a molecular library created in vitro) is focused on optimizing the target protein for specific performance criteria and in a specific environment. This kind of laboratory-directed molecular computation, called screening, is used to measure the sum of the performance alterations obtained by multiple mechanisms, in- cluding DNA transcription; mRNA folding; translation and degradation; and protein expression, folding, interaction, special distribution, specific activity, and degradation. Although molecular computation is widely regarded as futuristic, the natural computation methods used in directed evolution of biological molecules have proven to be practical and have generated strong commercial support. Laboratory- directed derivatives of the natural design process have been used to create a variety of valuable commercial products and product candidates in the areas of pharma- ceuticals and vaccines (Maxygen, Inc.), chemicals, chemical processes and fer- mentation strains (Codexis, Inc.), and agriculture (Verdict, Inc.~. These high-value examples are important for the widespread recognition of molecular computation as a powerful and commercially proven design process. Natural computing and evolutionary optimization can, in principle, be ex- tended to the design of a wide variety of nonbiological objects. Developing better methods of universal (i.e., numerical) molecular computation is the goal of most scientists in the field. However, nonnumerical applications of molecular computation may, in aggregate, prove to be just as useful and can be realized more immediately. Until there are practical formats for numerical molecular computation, these nonnumerical formats will serve as examples of the power promised by molecular computation.

OCR for page 103
Challenges and Opportunities in Programming Living Cells RON WEISS Department of Electrical Engineering Princeton University Princeton, New Jersey With recent advances in our understanding of cellular processes and DNA synthesis methods, we can now regard cells as programmable matter. Cells naturally process internal and environmental information in complex fashions and interact with neighboring cells to achieve coordinated behavior. Through genetic engineering, we can now equip cells with new sophisticated capabilities for gene regulation, information processing, and communication. These new capabilities serve as catalysts for synthetic biology, an emerging engineering discipline to program cell behaviors as easily as we program computers. Synthetic biology will benefit a wide variety of existing fields and enable us to harness cells for applications that are not feasible today. Applications include tissue engineering, molecular fabrication of biomaterials and nanostructures, syn- thesis of pharmaceutical products, biosensing, and will surely lead to quantita- tive insights into the operating principles that govern living organisms. Re- search so far has focused on building an enabling infrastructure for synthetic biology applications. A particular emphasis has been on constructing prototype synthetic gene networks that perform digital computation, analog computation, signal processing, and communications. In this paper, I will describe the build- ing blocks for these genetic circuits, several intracellular and intercellular proto- type genetic circuits that have been implemented recently, some of the chal- lenges in designing such circuits, and the long-term significance of this work. SYNTHETIC GENE NETWORKS Genetic circuits are collections of basic elements that interact to produce a particular behavior. These elements include: DNA regions where RNA poly- 121

OCR for page 103
22 FRONTIERS OF ENGINEERING merase molecules bind and initiate transcription of DNA into messenger RNA (mRNA); mRNA sequences where ribosomal RNA molecules bind and initiate translation of mRNA into proteins; proteins that regulate the activity and pro- duction of other proteins; DNA regions that terminate transcription; and motifs that determine protein and mRNA stability. Each element serves a particular function that helps accomplish the overall behavior of the genetic circuit. The basic elements can be grouped into units (or "devices") that perform logic operations. For example, Figure 1 shows a logic unit that implements the digital NOT operation using the biochemical reactions of transcription, transla- tion, and protein decay. Based on this mechanism, we have constructed and tested synthetic gene networks with units that implement the NOT, AND, and IM- PMES logic functions (Weiss, 2001; Weiss and Knight, 2000~. In addition, we also proposed and modeled a biochemical NAND gate that consists of two NOT gates whose outputs are connected with a wire-oR (Weiss et al., l999~. Theoreti- cally, any arbitrary digital logic function can be implemented with genetic cir- cuits using these gates. By constructing biochemical logic circuits and embedding them in cells, one can extend or modify the behavior of cells. Consider how computer designers program the behavior of computers or robots by fabricating silicon-based logic circuits. To program cells in an analogous way, the genetic-circuit designer constructs a DNA sequence that encodes a particular circuit and then embeds this DNA molecule in cells (a process called transformation). Typically, the purpose of this network is to regulate precisely the production of proteins. Be- cause proteins essentially perform all the "work" and information processing in the cell, cell behavior can be programmed by controlling when and under what conditions proteins are produced and degraded in the cell. To date, several small synthetic gene networks have been built that accom- plish specific genetic regulatory functions in vivo: the autorepressor (Figure 2a), in which a repressor regulates its own production to reduce noise in gene expres- sion (Becskei and Serrano, 2000~; the toggle-switch (Figure 2b), in which two repressors inhibit each other's production to achieve a bistable system (Gardner et al., 2000~; the repressilator (Figure 2c), in which three repressors are con- nected in a ring topology to produce repeated oscillation (Elowitz and Leibler, 2000~; the genetic clock and toggle switch constructed from transcriptional re- pressors and activators (Atkinson et al., 2003~; and our synthetic gene networks used for engineering digital logic gates and circuits in cells (Weiss and Basu, 2002; Weiss et al., 1999~. Despite their logically simple functions, the difficulties in building these networks revealed that many challenges will be faced in building circuits of increasing complexity. For example, designers of genetic circuits will have to cope with the significant noise inherent in gene expression (Becskei and Serrano, 2000; Elowitz et al., 2002; McAdams and Arkin, 1997) and will have to match carefully the "impedances" of constituent devices to achieve compound circuits

OCR for page 103
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 123 X=0 Y=1 - output protein Y /Transcription/ / Translation Y=0 input protein X (repressor) V FIGURE 1 A simplified view of the two cases for a biochemical inverter, a logic device whose output value is always the inverse of its input value. The concentration of two particular proteins represent the input and output logic signals. In the first case (left), input protein X is absent, and the cell transcribes the coding sequence for output protein Y (labeled CDS). In the second case (right), input protein X is present and binds specifi- cally to the gene at the promoter site (labeled P), preventing the cell from expressing output protein Y. that operate properly and reliably (Weiss and Basu, 2002~. Because these cir- cuits operate in the context of a living organism that already has an existing "program," understanding the interactions of the exogenous circuit with the en- dogenous elements will be critical. When designing these circuits, such interac- tions will have to be minimized to avoid undesired cross talk and interference with normal cellular processes. The exception to this design rule applies when control over endogenous cell behavior is intentional and desired (e.g., control- ling cellular metabolism or growth). COMMUNICATION AND SIGNAL PROCESSING Intercellular communication allows individual cells to coordinate their be- havior and accomplish sophisticated tasks they simply cannot perform alone. In higher level organisms, such as mammals, eukaryotic cells send and receive signals to perform a wide range of activities, from differentiation and growth during embryogenesis to immune and stress responses during adult life. How- ever, cell-to-cell communication is not exclusive to higher level organisms. Bac- terial cells, for example, are known to regulate gene expression based on their

OCR for page 103
124 (a) (b) 1 Rex It I2 FRONTIERS OF ENGINEERING (2a) AWL 1 (2b) fluorescence (A.U.) AL FIGURE 2 An overview of three synthetic gene networks, their corresponding logic circuits, and experimental behavior. These networks have been implemented to demon- strate specific genetic regulatory tasks. 2a. The auto-repressor circuit reduces variations in gene expression. Source: Becskei and Serrano, 2000. Reprinted with permission. 2b. The toggle switch circuit is a bistable system. Source: Gardner et al., 2000. Reprinted with permission. 2c. The repressilator circuit exhibits periodic oscillation in gene ex- pression. Source: Elowitz end Leibler, 2000. Reprinted with permission.

OCR for page 103
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 125 own cell density by secreting and then detecting concentrations of unique bio- chemical signals in a behavior known as quorum sensing (Bassler, 1999~. This coordinated behavior gives bacterial cells a competitive advantage both as patho- gens and in symbiotic relationships with their hosts. To explore potential applications that would benefit from coordinated, mul- ticellular behavior, we have begun to integrate communication capabilities with various synthetic genetic regulatory and information-processing networks. To- ward this end, we first programmed Escherichia cold cells to communicate with each other by connecting several transcriptional regulatory elements with previ- ously unrelated signaling elements from the marine bacterium Vibrio fischeri (Weiss and Knight, 2000~. In the experimental system, we externally induced "sender" cells to synthesize the production of a small molecule (30C6HSL) that then diffused outside the cell membrane and entered the cytoplasm of nearby "receiver" cells. The receiver cells responded to this chemical message by ex- pressing a green fluorescent protein that was visible under a microscope (Figure 3~. We are now extending this work to achieve programmed, two-way commu- nications by constructing new circuits that use multiple signal synthesis and response elements from various bacterial sources. Two important challenges in engineering sophisticated and robust, multisignal, cell-to-cell communication networks will be matching response sensitivities and reducing cross talk between the signals. Cells naturally analyze cell-to-cell communication and various environmen- tal conditions with elaborate signal-processing circuitry that includes both digital and analog components. The ability of cells to detect and subsequently react to environmental and internal signals is a principal characteristic of many biologi- cal phenomena. Examples include the movement of bacteria towards higher concentrations of nutrients through chemotaxis, the release of fuel molecules in response to hormones that signal hunger, and cell differentiation during embryo- genesis based on chemical gradients. To extend a cell's ability to respond to 40 minutes 5:00 hours 7:30 hours FIGURE 3 Time-series fluorescence images illustrating the response of colonies of engineered receiver cells to communication from nearby sender cells on a petri dish.

OCR for page 103
26 FRONTIERS OF ENGINEERING internal and environmental stimuli beyond the digital realm, we began to engi- neer analog signal-processing capabilities using synthetic gene networks. In our laboratory, we have built genetic circuits that enable cells to detect various chemi- cal concentrations (below and above prespecified thresholds or only within cer- tain ranges) and other circuits that allow cells to respond to multiple environ- mental signals (Basu et al., 2003; Weiss et al., 2003~. As a prototype for exploring issues in engineering signal-processing capa- bilities, we designed a new genetic chemical-source pinpointing circuit. The circuit is able to detect the presence of a particular extracellular molecule and then distinguish between various chemical concentrations of the molecule (Basu et al., 2003~. Consider a toxin analyte whose location or even presence in the environment is unknown. The analyte is secreted from a particular pathogen and forms a chemical gradient centered on the source. A potential method of deter- mining the location of the pathogen is to spread engineered sentinel cells in the suspected environment that can detect prespecified chemical concentrations. For a given concentration range, these cells will fluoresce in a ring pattern around the source. If the cells are engineered to detect multiple ranges distinguishable by different fluorescent colors, a bullseye pattern will emerge with several con- centric rings around the analyte source (Figure 4~. The source-pinpointing circuit consists of several components that first detect the external analyte concentration and then determine whether the concentration falls above a prespecified low threshold and below a high threshold. Figure 5 shows two separate experiments of "sentinel" bacterial cells that have been pro- grammed to respond to either low or high concentration thresholds of a biochemi- cal secreted from nearby sender cells (Basu et al., 2003~. By combining and inverting the low and high threshold outputs, one can realize a circuit that responds with a high fluorescent output only when the analyte concentration is within the prespecified range. With the aid of simulation tools, we have also been able to fine tune the threshold responses of these circuits by modifying the DNA sequences of chosen genetic elements. An important long-term goal for this type of research is to be able to tune the responses of genetic circuits with the same predictability and reliability as we can when we design electrical devices. By combining this type of analog information processing with digital computation and programmed cell-to- cell communication, we may be able to create a flexible and powerful engineering discipline for programmed cell behaviors. KEY CHALLENGES The initial modeling and experimental efforts in this field have generated a great deal of excitement but have also revealed many challenges that must be faced. Some of the significant challenges are described below. Programming complex behavior will require the assembly of a large, well characterized library of intracellular and intercellular components. The library

OCR for page 103
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 127 a ~ ana~yte "sentinel" ceil FIGURE 4 4a. Chemical-source pinpointing can be accomplished using cells with a genetic circuit that expresses a green fluorescent protein only when a given analyte con- centration falls within a particular range. Because the chemical concentration is highest at the source and forms a gradient as the distance from the chemical source increases, cells with this circuit will form a fluorescent ring around the source. 4b. Given a small library of circuits that can detect different chemical concentration ranges with unique fluorescent colors, cells will form separate rings centered on the source that result in a bullseye pattern. should include elements for regulating transcription and translation, as well as elements for regulating protein-to-protein interactions, such as phosphorylation- based signal-transduction cascades. Most of the library elements must exhibit minimal cross talk and must not affect the host's behavior. However, an impor- tant subset of this component library should be devoted to interfacing with the host to control desired cellular functions, such as the production of specific enzymes during predefined conditions, control over cell replication, and pro- grammed secretion of various chemicals. Designing operational and efficient genetic circuits will require models and simulation tools that can provide accurate quantitative predictions of circuit behavior. Engineered genetic circuits operate within a highly complex environ-

OCR for page 103
128 FRONTIERS OF ENGINEERING FIGURE 5 Fluorescence microscope images of "sentinel" bacterial cells programmed to detect whether the concentration of a chemical secreted by another type of bacteria is above or below particular thresholds. The bacteria that secrete the biochemical are fluo- rescing in red, while the sentinel bacteria scattered throughout the entire viewing area are fluorescing in yellow when the prespecified detection conditions are satisfied. 5a. Con- centration above a high threshold. 5b. Concentration below a low threshold. ment that is not well characterized and whose effects on the operation of the circuit have not been quantified. Models are still unable to predict the precise concentration averages and population statistics of the molecular components of even relatively simple systems. Solutions to this challenge will require more precise kinetic rates for describing the relevant biochemical reactions, models that incorporate additional cellular state (e.g., accurate RNA polymerase and ribosomal RNA levels), accurate predictions of noise in the biochemical reac- tions, a physical model of the cells and their surroundings, and potentially com- pletely different modeling techniques. Furthermore, as the complexity of engi- neered circuits increases, they begin placing a significant metabolic burden on

OCR for page 103
CHALLENGES AND OPPORTUNITIES IN PROGRAMMING LIVING CELLS 129 the host. Molecular interactions between the exogenous components and the endogenous cellular circuitry can also affect host behavior. Effects on the host's environment must be understood and modeled to design complex synthetic circuits. Genetic circuits must integrate specific components or network motifs that make them robust to fluctuations in the kinetics of biochemical reactions. Gene expression tends to be noisy because of the stochastic nature of the constituent biochemical reactions (Elowitz et al., 2002; McAdams and Arkin, 1997~. In addition, fluctuations in environmental conditions, such as temperature and nu- trient levels, affect cellular metabolism and consequently the operation of ge- netic circuits. Circuits that achieve reproducible, reliable behavior must do so despite components whose behavior fluctuates considerably. Mitigating the ef- fects of gene expression noise will probably require a solution that incorporates positive and negative feedback loops. A major bottleneck in genetic-circuit engineering is the difficulty of synthe- sizing DNA constructs. Anecdotal evidence suggests that the construction of new strands of DNA consumes a significant portion, if not most, of the time spent in circuit engineering. Currently, several ongoing efforts are trying to remove this bottleneck by using various de novo DNA-synthesis methods. Genetic-circuit design will require novel approaches that may be fundamen- tally different from existing computer-circuit design methodologies. In con- structing our circuits, we have used "rational" design in which detailed models are used to guide the circuit engineering, helping to select appropriate compo- nents and suggesting how to mutate them, when necessary, to achieve correct circuit behavior (Weiss and Basu, 2002~. We have also used another technique called directed evolution to optimize circuit behavior (Yokobayashi et al., 2002~. Building on nature's fundamental principle of evolution, this unique process directs cells to mutate their own DNA until they find gene network configura- tions that exhibit the desired system characteristics. Because our understanding of cellular processes is incomplete, efficient circuit design will most likely re- quire a combination of rational design and directed evolution, or perhaps some other completely different approach. ~ ~ ~ ~ . ~ ~ LONG-TERM SIGNIFICANCE The field of synthetic gene networks is still in its infancy. Researchers are currently trying to build small genetic-regulatory systems that exhibit a particu- lar behavior reliably and predictably. Solving the existing challenges in building complex genetic circuits will take considerable effort. However, once the major challenges are solved, the construction of synthetic gene networks will enable us to direct cells to perform sophisticated digital and analog functions, both as individual entities and as part of larger cell communities. An engineering disci- pline to program cell behaviors and its associated tools will advance the capabili-

OCR for page 103
130 FRONTIERS OF ENGINEERING ties of genetic engineering and allow us to harness cells for a myriad of new applications. As a result, someday we will be able to modify and extend the behavior of cells in almost arbitrary ways. Because this technology will allow us to modify the instructions that govern the capabilities of organisms that live around us, as well as our own body's instructions, this technology has tremen- dous potential for improving control over the environment, our surroundings, and our quality of life. REFERENCES Atkinson, M.R., M.A. Savageau, J.T. Myers, and A.J. Ninfa. 2003. Development of genetic cir- cuitry exhibiting toggle switch or oscillatory behavior in Escherichia coli. Cell 113(5): 597- 607. Bassler, B.L. 1999. How bacteria talk to each other: regulation of gene expression by quorum sensing. Current Opinion in Microbiology 2(6): 582-587. Basu, S., D. Karig, and R. Weiss. 2003. Engineered signal processing of cells: towards molecular concentration band detection. Natural Computing, an International Journal 2:4(463-478). Becskei, A., and L. Serrano. 2000. Engineering stability in gene networks by autoregulation. Nature 405(6786): 590-593. Elowitz, M.B., and S. Leibler. 2000. A synthetic oscillatory network of transcriptional regulators. Nature 403(6767): 335-338. Elowitz, M.B., A.J. Levine, E.D. Siggia, and P. Swain. 2002. Stochastic gene expression in a single cell. Science 297(5584): 1183-1186. Gardner, T.S., C.R. Cantor, and J.J. Collins. 2000. Construction of a genetic toggle switch in Escherichia coli. Nature 403(6767): 339-342. McAdams, H.H., and A. Arkin. 1997. Stochastic mechanisms in gene expression. Proceedings of the National Academy of Sciences 94(3): 814-819. Weiss, R. 2001. Cellular Computation and Communications Using Engineered Genetic Regulatory Networks. Ph.D. thesis, Massachusetts Institute of Technology, September 2001. Weiss, R., and S. Basu. 2002. The device physics of cellular logic gates. Pp. 54-61 in NSC-1: The First Workshop of Non-Silicon Computing, Boston, Massachusetts, February 2002. Available online at: . Weiss, R., S. Basu, A. Kalmbach, S. Hooshangi, D. Karig, R. Mehreja, and I. Netravali. 2003. Genetic circuit building blocks for cellular computation, communications, and signal process- ing. Natural Computing, an International Journal 2(1): 47-84. Weiss, R., G. E. Homsy, and T. F. Knight, Jr.. 1999. Pp. 275-295 in Dimacs Workshop on Evolution as Computation. Princeton, N.J.: Springer Verlag. Weiss, R., and T.F. Knight, Jr. 2000. Engineered Communications for Microbial Robotics. Pp. 1- 16 in DNA6: Sixth International Workshop on DNA-Based Computers, DNA2000, Leiden, The Netherlands, June 2000. New York: Springer-Verlag. Yokobayashi, Y., R. Weiss, and F.H. Arnold. 2002. Directed evolution of a genetic circuit. Pro- ceedings of the National Academy of Sciences 99(26): 16587-16591.