Overview
Overview Abstract
The current paradigm of part creation, characterization, and documentation is rate-limiting for scientific discovery. The 2019 Stanford iGEM team envisions an alternative model for facile part creation where final protein performance necessarily conforms to initial design specifications. To make this future a reality, we focused on developing self-selecting systems (SSS): directed evolution platforms that selectively amplify the genotypes corresponding to desirable phenotypes.
Specifically, we developed Directed Chassis-agnostic Evolution, or DiCE, a novel, easy-to-implement selection-based directed evolution platform built on Qbeta replicase, an RNA-dependent RNA polymerase. We worked towards demonstrating DiCE’s ability to evolve proteins in both E. coli and cell-free environments (projects “DiCE In Vivo” and “DiCE In Vitro”).
Furthermore, we generated standard selection schema compatible with PREDCEL (Heidelberg 2017) to expand the range of synthetic biological parts that can be created by any SSS (projects “PREDCEL-Plus Selection Schema” and “AcrIIA4 Evolution”).
Taken together, our work on SSS presents a foundational advance towards a future where part creation is easier, faster, and more accessible.
Overview Background and Significance
Ever washed your laundry on cold? Ever wondered what research went into the enzymes that make your laundry come out feeling fresh and clean? The same technology used to develop these laundry enzymes, a technology called directed evolution, has been used across industries to produce everything from printer paper to modern processed oil (Gurung 2013). The 2018 Nobel Prize in Chemistry was even awarded for directed evolution advances, one of which led to the 21 billion dollar drug Humira (Informa 2018).
So what is directed evolution? To better understand directed evolution (DE), let’s trace through the DE biological process of developing a protease typically used in laundry detergent.
The enzymes active in a Tide Pod, such as proteases, were isolated from living organisms, and thus were originally optimized to clean most efficiently at higher temperatures (STAT 2019). While these proteins have already evolved to operate in their biological environment of roughly 37ºC, these same natural enzymatic powerhouses are rendered almost inert when just 10ºC lower, the normal temperature for cold laundry cycles.
To develop mutant enzyme versions that remain active in colder wash water, researchers turned to directed evolution.
For our protease example, researchers took the starting sequence of the enzyme (“Natural Gene” in Figure 1) and began by mutagenizing the DNA to create a diverse gene library. The resulting protein library was then meticulously screened for variants that functioned slightly more effectively at lower temperatures. The gene corresponding to the improved enzyme was then fed back into the start of the cycle, continuing the process of gradual improvement over time.
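The mutate, screen, and re-enter cycle described above can be sketched in a few lines of Python. This is a toy illustration only: the `toy_activity` score (fraction of positions matching a made-up target sequence) is a hypothetical stand-in for a real cold-activity assay, and the library size, mutation rate, and number of rounds are arbitrary assumptions rather than values from any real protease campaign.

```python
import random

ALPHABET = "ACGT"


def mutate(gene, rate, rng):
    """Return a copy of gene with each base substituted with probability `rate`."""
    return "".join(
        rng.choice([b for b in ALPHABET if b != base]) if rng.random() < rate else base
        for base in gene
    )


def toy_activity(gene, target):
    """Hypothetical assay: fraction of positions that match a made-up target."""
    return sum(a == b for a, b in zip(gene, target)) / len(target)


def directed_evolution(start, target, rounds=20, library_size=200, rate=0.02, seed=0):
    """Run mutate -> screen -> pick-best cycles and record the best score per round."""
    rng = random.Random(seed)
    parent = start
    history = [toy_activity(parent, target)]
    for _ in range(rounds):
        # 1. Mutagenize the current gene into a diverse library.
        library = [mutate(parent, rate, rng) for _ in range(library_size)]
        library.append(parent)  # keep the parent so the best score never drops
        # 2. Screen every library member (the laborious step when done by hand).
        parent = max(library, key=lambda g: toy_activity(g, target))
        # 3. The best variant re-enters the cycle as the next parent.
        history.append(toy_activity(parent, target))
    return parent, history
```

In this sketch the screening step is the `max(...)` call over the whole library, which is exactly the part that must be performed manually in screening-based directed evolution.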
This screening-based directed evolution is the current standard across industry, and is a massive expense that largely contributes to the multi-billion dollar R&D pipelines of large chemical companies (Pratap 2019). Though directed evolution aims to mimic the evolutionary process that is so powerful in nature, the continuity of the cycle is broken by the laborious screening stage that is generally done by hand.
More recently, self-selecting directed evolution techniques (self-selecting systems, or SSS) have been developed in an attempt to close the gap in the continuous loop caused by hand screening and fully actualize the immense power of the natural evolution cycle in a laboratory setting. One such example is PACE, or Phage-Assisted Continuous Evolution (Esvelt 2011). Developed in the Liu lab in 2011, PACE takes advantage of the error-prone DNA replication of bacteriophages and, most importantly, uses a synthetic biological circuit to couple the functionality of a mutant protein with its propagation. Thus, optimal library members are selectively amplified and time-consuming manual screening processes are bypassed.
Though PACE has been implemented by select labs, it still has significant barriers and limitations. At the start of the summer our team planned to apply an SSS DE method to evolve novel proteins, but we ended up pivoting from this focus due to the following:
- Challenging Set-Up
First and foremost, current self-selecting directed evolution systems generally rely on a complex custom-made bioreactor for continuous bacterial culture. This challenge has been documented by numerous researchers and has greatly decreased accessibility to SSS (Heidelberg 2017). Our team was excited by the advances of PADE and PREDCEL, which aim to mitigate this challenge by discretizing PACE into individual reaction steps. While these discretized SSS methods have been shown to be effective for some proteins, continuous growth DE methods have been consistently shown to confer the advantages of “shorter experiment duration, [as well as] greater constancy of selective pressure and population size throughout evolution,” thereby increasing the likelihood of success (Suzuki 2017).
- Phage Toxicity
An additional challenge is posed by the use of bacteriophages. While M13 does not kill E. coli, infection significantly reduces growth rate. Phage proteins can also be toxic to the cell when overexpressed, as our team found out the hard way over the summer when we tried to implement PREDCEL. To reduce toxicity, we had to troubleshoot the expression of those proteins by using a weak RBS and alternate start codons. Other researchers such as the Liu Lab have aimed to bypass this toxicity by implementing more challenging
These challenges led our team to shift focus from applying already developed directed evolution methods towards instead trying to foundationally improve the technique. We focused in on SSS directed evolution methods and did a deep dive into the literature. In addition to the above challenges that we wanted to address, we also came across the following aspects of SSS methods that had room for improvement:
- Organism-Specific Constraints
Self-selecting directed evolution techniques use a chassis organism to amplify, mutate, and select for optimal gene/protein function. This reliance on in vivo protein development inherently confines evolution to organism-specific environmental constraints. For example, in recent efforts to develop terminal deoxynucleotidyl transferase (TdT), researchers have struggled to apply directed evolution because TdT is toxic for E. coli to produce (Perkel 2019). With our DiCE system, however, these organismal constraints would be lifted.
- Phage Washout
A common problem with phage-based directed evolution techniques is phage washout. This occurs when the selection pressure exceeds the protein's ability to evolve: phage carrying insufficiently active variants cannot propagate fast enough to persist, the population is lost, and the protein fails to evolve.
- Lack of Standard Selection Schema
For each directed evolution system, the user has to creatively design genetic circuits that link protein functionality to cell survival. This is a significant challenge, and often a limitation, when trying to apply self-selecting directed evolution systems.
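The phage washout problem listed above can be made concrete with a very simple population model. The sketch below assumes a continuously diluted culture in which phage titer grows exponentially at some propagation rate and is removed at a constant dilution rate; both rates, the starting titer, and the function names are hypothetical illustration values, not measurements from PACE, PREDCEL, or our own experiments.

```python
import math


def phage_titer(n0, growth_rate, dilution_rate, hours):
    """Titer after `hours`, assuming exponential propagation minus continuous dilution.

    growth_rate and dilution_rate are both in units of 1/hour (assumed values).
    """
    return n0 * math.exp((growth_rate - dilution_rate) * hours)


def washes_out(growth_rate, dilution_rate):
    """True when dilution outpaces propagation, so the population decays toward zero."""
    return growth_rate < dilution_rate


if __name__ == "__main__":
    # A weakly active variant (0.5/h) under strong dilution (1.0/h) is lost,
    # while a variant that keeps pace with dilution (1.0/h) holds its titer.
    for rate in (0.5, 1.0):
        print(rate, washes_out(rate, 1.0), phage_titer(1e8, rate, 1.0, 48))
```

Under these assumptions, lowering the selection stringency corresponds to raising the effective propagation rate of mediocre variants so that the population survives long enough to find improving mutations.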
With these challenges and areas for improvement in mind, we converged on the 2019 Stanford iGEM project.
First and foremost, we developed the foundations for a novel self-selecting directed evolution method. We termed this method, detailed in the next overview section, Directed Chassis-agnostic Evolution, or DiCE. We worked to demonstrate that the fundamentals necessary for directed evolution could be achieved with our novel system. Furthermore, we worked to show that DiCE has the potential to be applied both within E. coli and in cell-free extract. Hence both the “chassis-agnostic” name of our new method and the division of our DiCE project sections into DiCE In Vivo and DiCE In Vitro.
Additionally, we worked to advance SSS directed evolution methods by developing standard selection schema, termed PREDCEL-Plus, that could be applied both to prior directed evolution techniques such as PREDCEL and PACE and to our novel DiCE system. Finally, we designed a DiCE self-selecting gene circuit to evolve a novel anti-CRISPR protein, starting from AcrIIA4.
NOTE on Project Website Navigation:
For each project section (Description, Design, Experiments, and Results), we break down each web page into an overview followed by the respective information for our four subprojects. For example, readers have the option of either reading all of the subproject blocks for Description before continuing on to Design, or they can choose to trace one subproject through all project sections (Description→Design→Experiments→Results) by using the forward arrows on each content block. Our four subprojects, together serving as a foundational advance for self-selecting directed evolution, are as follows:
- DiCE In Vivo
- DiCE In Vitro
- Novel Selection Schema (PREDCEL-Plus)
- Novel DE Application (AcrIIA4 evolution)
DiCE in vivo
The vision of accessible chassis-agnostic directed evolution
More than fifty years ago, a peculiar polymerase was isolated from the single-stranded RNA coliphage Qβ [1]. Around the same time, thoughts of synthesizing evolution were emerging. The two events were not unrelated. In fact, the scientific histories of directed evolution and Qβ replicase have been entangled from the beginning, since Spiegelman’s Monster experiment [2]. Today, Qβ’s RNA-dependent RNA polymerase is a well-characterized protein, and directed evolution earned Frances Arnold a share of the 2018 Nobel Prize in Chemistry for her contribution to the developing field of protein engineering. Despite these two subjects’ proximity, their relationship has never been exploited to realize the full potential of a directed evolution system.
The 2019 Stanford iGEM team saw an opportunity in the narrative above. Beginning our exploration for project ideas, we familiarized ourselves with the current hot topic of protein engineering via directed evolution. We researched many methodologies, from past iGEM projects such as PADE by Heidelberg’s 2017 team to more established methods such as OrthoRep by Chang Liu and PACE, among many others. During our investigation, we noted the drawbacks that most methods share: time inefficiency, dependence upon the chassis organism, cost, and the resulting inaccessibility [3]. These difficulties did not deter us; rather, we saw the disadvantages of current methodologies as an opportunity to improve such a foundational tool. Thus, we chose directed evolution as the platform on which to work. Our motivation is not only to offer another way of creating new proteins but also to increase the accessibility of doing so.
This is where Qβ comes in. This interesting replicase appeared to us as a potential solution to some common problems in directed evolution. Every method has its advantages and disadvantages: some have inherent biases in SNP formation, while others are non-specific in nature, creating SNPs throughout a vector or an entire genome. But Qβ’s unique properties, such as showing no mutagenic bias between transitions and transversions or between A/T and G/C substitutions, its ability to bypass the need for a DNA template, and its relatively high error rate of 1.5 × 10⁻³ SNPs/kb, made it an excellent target for improving directed evolution methods [4, 5]. Our team was fascinated by these and many other features of this replicase, such as its being RNA-based. This facet, in particular, is unique among mutagenesis methodologies, and we thought it to be quite advantageous. As RNA is universal to all living organisms, we hypothesized that by using an RNA-based mechanism we could, one day, achieve an organism-agnostic system. After letting the idea simmer for several weeks, we thought we could take it a step further and make the system as easy as an RNA transfection. In a self-selecting system, the need for manual selection could hypothetically be bypassed, reducing protein engineering largely to an initial transfection of a target gene. With a continuing stream of ideas, growing enthusiasm, and a mounting desire to lessen the burdens of protein engineering, we became committed to creating the DiCE system.
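To give a feel for what an error rate like the one quoted above means in practice, the following sketch converts a per-kilobase rate into an expected mutation count. It reads the quoted figure as 1.5 × 10⁻³ SNPs per kb per copying event, which is our interpretation of the text rather than a confirmed unit, and it assumes mutations are rare, independent events (a Poisson approximation). The function names and the example gene lengths are hypothetical.

```python
import math

# Figure quoted in the text, read as SNPs per kb per copying event (assumption).
ERROR_RATE_PER_KB = 1.5e-3


def expected_snps(gene_length_bp, copies=1, rate_per_kb=ERROR_RATE_PER_KB):
    """Expected number of SNPs accumulated over `copies` successive copying events."""
    return rate_per_kb * (gene_length_bp / 1000) * copies


def prob_at_least_one_snp(gene_length_bp, copies=1, rate_per_kb=ERROR_RATE_PER_KB):
    """Poisson estimate of the chance that a lineage carries one or more SNPs."""
    return 1 - math.exp(-expected_snps(gene_length_bp, copies, rate_per_kb))


if __name__ == "__main__":
    for copies in (1, 10, 100):
        print(copies, expected_snps(1000, copies), prob_at_least_one_snp(1000, copies))
```

Because the rate is a parameter, the same two functions can be reused if the error rate turns out to be expressed per nucleotide or per generation instead.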
DiCE in vitro
Motivation
Although directed evolution can be a powerful bioengineering technique, harnessing the same forces of selection and random mutation that resulted in the diversity and ingenuity of life today, its application in practice can be limited by several factors. On top of accessibility issues such as a lack of specialized equipment, experience, and financial resources, many methods come with their own caveats, including the need to prepare target genes for evolution, construct design and cloning, or even built-in limitations on the scope of the evolutions that are possible.
Since most self-selecting directed evolution techniques utilize a host organism that houses the evolving protein, any evolution is subject to the biological constraints of that host, which prevents the evolution of proteins that are toxic to it. Being hosted inside an organism also means researchers have limited control over the environment a target protein is exposed to during the course of its evolution, which could result in the loss of old functionality or the gain of new characteristics from the different selective pressures inside the host organism. This could have unintended consequences, such as a eukaryotic protein that no longer functions well inside its original environment after undergoing evolution inside a prokaryotic host. Finally, and most significantly, the iterative design process for most organismal directed evolution techniques is difficult, tedious, and time consuming. Preparing and troubleshooting constructs for a given evolution can take from weeks to months, and changing experimental designs or testing modified parameters for an evolution may mean another several months of construct design before seeing data again.
Our team ran into numerous challenges when trying to set up and execute selection schema for PREDCEL+ and DiCE in vivo, including the need to clone multiple constructs before even beginning to undertake the work of implementing an evolution, which itself can take a significant amount of time and troubleshooting. Despite improvements to previous in vivo techniques, there is still a significant barrier to the application of directed evolution in cells. As a result of the challenges with cellular directed evolution we noticed and experienced, keeping with our overall project theme of reducing financial, capital, and temporal burdens for utilizing directed evolution, we decided to develop and implement a cell free directed evolution technique.
Background
Cell free systems, also referred to as in vitro systems, are cell-extract derived systems that recapitulate transcription and translation (TX-TL) in a test tube as opposed to inside cells. Containing all of the machinery for protein expression, with the simple addition of DNA, which can be a plasmid or given as a simple linear fragment, the system is able to begin protein production.
Cell free systems can be surprisingly cheap: as low as $0.01 per 10μL TX-TL reaction. They can be ordered from suppliers or created in house with common lab equipment and materials, and even modified to fit the specific needs of each researcher. Since such systems are nonliving, they can be used to express toxic proteins or proteins that put metabolic strain on a cell, opening up new candidates for evolution. Furthermore, since these systems lack cell membranes, researchers have complete access to and control over the state of the cell free system at all times, allowing easy monitoring and addition of substrates to the media. Finally, and most importantly, in vitro systems allow for incredibly fast and accessible prototyping. Because they can use plasmid DNA or even linear fragments of DNA fresh from a PCR, this flexibility in template format allows for turnaround from ideation to implementation in the span of hours and days instead of weeks or months. Researchers can prototype biological circuits within a day’s time. A cell free based self selecting system would, thus, display a key speed advantage over PREDCEL+ and DiCE in vivo.
Goal
In order to achieve a working self selective system in vitro, we first had to find a replicative system that we could utilize. Having decided to apply the Qbeta RNA-dependent RNA polymerase (replicase) system in vitro, we first needed to demonstrate that it was capable of self-replication and RNA amplification in cell free systems. We did this using a construct in which Qbeta is flanked by MDV regions for self amplification.
Once we were able to demonstrate that Qbeta functioned in cell free and was capable of self replication, we wanted to identify the relationship between the MDV regions and the replicability and amplification of RNA in the Qbeta system. We did this by comparing the rates of increase in fluorescence for the MDV-Qbeta-Spinach construct and a Qbeta-Spinach construct without MDVs.
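One simple way to compare rates of fluorescence increase between two constructs is to fit a straight line to each time course and take the ratio of the slopes. The sketch below does this with an ordinary least-squares slope; it is an illustrative assumption about how such a comparison could be set up, not a description of the analysis pipeline we actually used, and the readings in the usage example are placeholder numbers, not measured data.

```python
def slope(times, values):
    """Ordinary least-squares slope of `values` against `times`."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


def fold_change_in_rate(times, with_mdv, without_mdv):
    """Ratio of fluorescence slopes: construct with MDV flanks over construct without."""
    return slope(times, with_mdv) / slope(times, without_mdv)


if __name__ == "__main__":
    # Placeholder readings (arbitrary fluorescence units), NOT experimental data.
    hours = [0, 1, 2, 3]
    mdv_qbeta_spinach = [10, 16, 22, 28]
    qbeta_spinach_only = [10, 12, 14, 16]
    print(fold_change_in_rate(hours, mdv_qbeta_spinach, qbeta_spinach_only))
```

A linear fit is only appropriate over the window where the signal rises roughly linearly; an exponential amplification phase would call for fitting the logarithm of the signal instead.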
Next, we needed to demonstrate its ability to replicate and amplify other arbitrary genes and RNA. We attempted to do this through GFP constructs flanked with MDV regions.
After we were able to demonstrate that Qbeta was capable of self replication, and that this replication could be connected to arbitrary genes, we still had to address the chief challenge to developing a self selective system in vitro: amplifying only desirable mutants by linking “genotype to phenotype”. Each protein in a cell free extract is free to move throughout the solution and interact with any other component in the media, making it hard to selectively amplify “good” mutants.
We solve this problem in our system using water-in-oil droplets. Tawfik and Griffiths (1998) showed that vortexing non-polar and polar solvents forms emulsions and that evolution can take place within the resulting droplets. More recently, Ichihashi et al. (2013) were able to use a cell free system to evolve a Qβ replicase encoded by the RNA it replicates. Inspired by this breakthrough, we hoped to harness Qβ replicase to conduct arbitrary evolutions in vitro. Our envisioned selection schema is described in the diagram below.
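Separately from the diagram, the reason compartments matter can be illustrated numerically. The sketch below compares one round of amplification in bulk solution, where every template is copied according to the pooled average activity, against amplification in droplets, where each template is copied only according to the activity of its own product. It assumes at most one template per droplet, and the activity values and starting fraction are hypothetical.

```python
def enrich(good_fraction, good_activity, bad_activity, compartmentalized):
    """Fraction of 'good' templates after one round of amplification.

    Assumes at most one template per droplet when compartmentalized.
    """
    good, bad = good_fraction, 1 - good_fraction
    if compartmentalized:
        # Each template is copied in proportion to the activity of its own product.
        good_out = good * good_activity
        bad_out = bad * bad_activity
    else:
        # Products diffuse freely, so every template sees the same pooled activity.
        pooled = good * good_activity + bad * bad_activity
        good_out = good * pooled
        bad_out = bad * pooled
    return good_out / (good_out + bad_out)


if __name__ == "__main__":
    # Hypothetical numbers: 1% good mutants that are 10x more active.
    print("bulk:    ", enrich(0.01, 10.0, 1.0, compartmentalized=False))
    print("droplets:", enrich(0.01, 10.0, 1.0, compartmentalized=True))
```

In bulk the good fraction is unchanged, because good and bad templates benefit equally from the pooled activity, while in droplets the good fraction rises each round.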
Novel Selection Schema
One of the earliest engineered self-selecting systems (SSS), phage-assisted continuous evolution (PACE) [1], has been used to evolve modified proteases [2], improved Bt toxins [3], Cas9 variants [4], and soluble antibodies [5]. However, due to the difficulty of setting up a continuous system, PACE has commonly been implemented as a discretized system [6, 7, 8], which has proved simpler to assemble and produces results on timescales similar to PACE [7]. Overall, we hoped to implement and improve this discretized system, which we term PREDCEL+ in homage to the 2017 Heidelberg iGEM team, in order to support a wider array of standardized selection schema that can be used to evolve a wider range of proteins.
Novel DE Application
This video details the DiCE in vivo genetic circuit we designed to evolve a novel anti-CRISPR, AcrIIA4. We recommend first reading the Description section for project motivation and background!
Motivation
In the last 5 years, upward of 450 thousand genetically modified (GM) mosquitoes were released in Jacobina, Bahia, Brazil [2]. These mosquitoes carried a powerful and deadly gene drive system that has a synthetic dominant inheritance pattern designed to wipe out large portions of the mosquito population in the cities where they are released. With a huge potential for these GM mosquitoes to help control malaria and other mosquito-borne diseases around the world, these gene drive technologies are rapidly being developed in labs across the country, with mass implementation on the horizon.
However, in an ominous recent publication, genetic evidence emerged that the genetic modifications of the engineered species have spread to mosquitoes beyond the intended target population [3]. Indeed, since the technology's inception, environmental experts have warned that "this is an experimental technology which could have devastating impacts [4]," and this recent publication with unexpected genetic consequences is one of the first examples of the perils of this technology.
This is where AcrIIA4 comes in: anti-CRISPRs have the potential to provide an “off-switch” to gene drives (see Background below), and thus give scientists a safety net when releasing organisms with this lethal technology. To safeguard against aberrant and unintended consequences of “gene drives gone wrong,” our team aimed to develop a novel anti-CRISPR that could provide an “off-switch” to xCas9-based gene drives. Thus, we hope to further the powerful potential of these disease-eradicating gene drives with a foundational advance that could increase safety.
Background
The CRISPR-Cas9 system presents exciting possibilities in the fields of bioengineering and synthetic biology, due to its DNA specificity and versatility. It has functioned as the cornerstone for several previous iGEM projects, from site-directed mutagenesis to DNA detection systems.
However, Cas9’s gene-editing applications are limited by its dependency on a specific protospacer adjacent motif (PAM) next to the targeted nucleotide sequence, the least restrictive of which was formerly NGG. In 2018, Hu et al. used phage-assisted continuous evolution (PACE) to evolve a Cas9 variant (xCas9) with a less specific PAM requirement [5]. Given its increased effectiveness and versatility, this novel Cas9 variant has continued to grow in popularity.
This new xCas9, however, has not been shown to have anti-CRISPR proteins that can effectively deactivate it. Anti-CRISPRs are small proteins that can block the Cas9 protein from binding to DNA and thereby prevent genetic modification or downregulation typically caused by Cas9. Anti-CRISPRs are important for designing novel genetic circuits involving CRISPR, and most importantly for our project, they provide a safety net for CRISPR-based gene drives.
While we will not get into the genetic details of CRISPR-based gene drives (we recommend that article though!), Cas9 is the critical protein that leads to the engineered dominant inheritance pattern. Anti-CRISPRs provide a way to block Cas9 and thereby shut down the gene drive. Hence, when designing a genetic circuit for a CRISPR-based gene drive, scientists can put in an anti-CRISPR under an inducible promoter as a countermeasure to activate if a gene drive does not function as expected [6].
Thus, as xCas9 becomes more popular and a likely candidate for future gene drives, it is imperative that we develop an anti-CRISPR that can effectively shut down this gene drive if something goes wrong.
Since xCas9 is similar in structure and sequence to Cas9, and AcrIIA4 is the anti-CRISPR associated with Cas9 deactivation, we decided AcrIIA4 would be a good starting point for our DiCE evolution in order to eventually evolve a novel anti-CRISPR that could block xCas9.