Design | 2019 iGEM Team:Fudan-TSI


Design

We present a toolbox for the continuous construction of mutation libraries in vivo. R-Evolution can mutate both coding and regulatory sequences, which enables the evolution of individual proteins or of multiple targets at a time, promotes high-throughput research, and serves as a foundational advance for synthetic biology.

Moloney murine leukemia virus (MMLV) reverse transcriptase

Why MMLV?

Reverse transcriptase (RT) is one of the most crucial parts of our system. We chose the RT of Moloney murine leukemia virus (MMLV) for five reasons.

  • An enhanced version of MMLV-RT is commonly used for in vitro reverse transcription, so the part is safe and well characterized.
  • Unlike the reverse transcriptase of HIV, MMLV-RT acts as a monomer, which makes it easier to produce in a heterologous host.
  • MMLV is a eukaryotic virus, which means it is orthogonal to prokaryotic species. This orthogonality underlies the host adaptability of our system.
  • MMLV-RT has higher processivity than the RTs of other viruses such as HIV and AMV. This gives our system a greater range of mutagenesis length.
  • MMLV-RT has low primer specificity: if we change the sequence of its primer and the corresponding primer binding site (PBS), the change will not be reverted in subsequent reverse transcription, unlike in HIV. This lets us customize the sequence of the tRNA primer to match the target sequence using our software.

Although RT carries out reverse transcription, what is expressed in the cell is its polyprotein form. The gag-pol polyprotein has four parts: capsid protein, protease (separated by a stop codon), reverse transcriptase, and integrase. We deleted the integrase from the polyprotein to eliminate the possibility of interference with the genome. The protease contains a UAG stop codon at its 5th amino acid site (Yoshinaka et al.), which is read through as glutamine at 5% efficiency in its native host cells, producing the 20:1 ratio between capsid and reverse transcriptase protein. Because the readthrough efficiency is much lower in E. coli cells, and studies have shown that lower efficiency greatly damages the activity of reverse transcriptase (Csibra et al.), we mutated the UAG codon to CAG so that the polyprotein is always read through completely. This slightly decreases the activity of RT, but within an acceptable range.
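The link between readthrough efficiency and the capsid-to-RT ratio can be checked with a short calculation. This is a minimal sketch, assuming that every translation event produces one capsid copy and that only the fraction that reads through the stop codon also produces one RT copy. The function name is hypothetical and chosen for illustration.

```python
from fractions import Fraction

def capsid_per_rt(readthrough):
    """Capsid copies made per RT copy for a given readthrough fraction.

    Assumption: each translation event yields one capsid; a fraction
    `readthrough` of events continues past the stop codon and yields one RT.
    """
    p = Fraction(readthrough)
    if not 0 < p <= 1:
        raise ValueError("readthrough must be in the interval (0, 1]")
    return 1 / p

print(capsid_per_rt("1/20"))  # 5% readthrough of UAG
print(capsid_per_rt(1))       # UAG mutated to CAG: every ribosome reads through
```

Under this assumption, 5% readthrough gives 20 capsid copies per RT copy, and the CAG mutant gives one capsid copy per RT copy.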

Figure 1. Crystal structure of MMLV reverse transcriptase (PDB:4MH8).

The capsid protein is necessary because it has been found to promote the annealing of the tRNA primer to the primer binding site (PBS) in MMLV, and it plays an important role in the two subsequent strand transfer steps (Mak et al., Gonsky et al.). To be certain of this design, we consulted Prof. Alper by email and received his confirmation that the capsid protein is necessary.

To increase the mutation rate of our RT, we built a mutant version (Y1245F). This mutation has been shown to increase the mutation level fivefold (Zhang et al.).

The gag-pol polyprotein is placed under the control of the lac operon, and its expression is induced by IPTG.

tRNA primer

The initiation of reverse transcription requires a cognate tRNA primer. For MMLV-RT, the primer is tRNAPro(AGG) (Harada et al.). Comparing its sequence with the endogenous tRNA sequences of E. coli, we found very little homology, which again supports the orthogonality of our system to prokaryotic systems. However, this also raises problems for its post-transcriptional processing.

Figure 2. Schematic diagram of reverse transcription process.

To make sure that the exogenous tRNA is expressed in E. coli cells and processed into its mature form, we placed it under two promoters: a T7 promoter and the pGlnW promoter, which is the endogenous promoter for GlnW and its downstream MetU. A CCA motif is added to the 3’ end, followed by the intervening sequence found between two tRNAGln genes and a T7Te terminator.

Besides the native tRNAPro, which requires an additional PBS outside the target sequence, we have built a software tool that designs novel tRNAs matching the target sequence.
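To illustrate what such a design step involves: the PBS is the stretch of template RNA that base-pairs with the 3’ end of the tRNA primer, so a primer matched to a chosen site in the target must end in the reverse complement of that site. The sketch below is not our software. The function names, the 18-nt PBS length used as a default, and the example sequence are assumptions for illustration only.

```python
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}

def to_rna(seq):
    """Convert a DNA-alphabet string to the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")

def primer_3prime_end(target, pbs_start, pbs_len=18):
    """Return the primer 3' end (written 5'->3') that anneals to the
    window target[pbs_start : pbs_start + pbs_len] of the transcript."""
    site = to_rna(target)[pbs_start:pbs_start + pbs_len]
    if len(site) < pbs_len:
        raise ValueError("PBS window runs past the end of the target")
    # Reverse complement: read the site backwards and pair each base.
    return "".join(COMPLEMENT[base] for base in reversed(site))

# Made-up target sequence, used only to show the call.
example_target = "ATGGCTAGCAAAGGAGAAGAACTTTTCACT"
print(primer_3prime_end(example_target, pbs_start=0))
```

A real design tool would also have to keep the tRNA foldable and processable in the host, which this sketch does not attempt.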

Flanking sequences

The reverse transcriptase has three functions: 1) RNA-dependent DNA polymerase; 2) DNA-dependent DNA polymerase; 3) RNaseH activity.

In a full reverse transcription process, the target strand anneals with the tRNA primer and then goes through two strand transfers. To make sure that all three stages are completed successfully, noncoding sequences from the MMLV genome have been cloned to flank the target sequence. The flanking sequences include the following components, which are transcribed together with the target and help to accomplish reverse transcription.

  • The primer binding site (PBS) recognizes and anneals with the tRNA (Lund et al.), which initiates reverse transcription.
  • The 5’ noncoding sequence (U5) has been shown to affect the replication ability of RT, and its alteration affects the synthesis of minus-strand DNA (Kulpa et al.). It is also linked to the efficiency of the first strand transfer.
  • The polypurine tract (PPT) is resistant to RNaseH digestion, and the RNA that remains serves as the primer for plus-strand initiation (Finston et al., Kelleher et al.). We also kept several nucleotides near the PPT because they have been shown to affect its function (Robson et al.).

Cre recombinase

Cre recombinase is derived from bacteriophage P1 and initiates recombination between two loxP sites. Cre binds to the palindromic sequence in loxP and, after forming a tetramer (two Cre molecules per loxP site), its active site initiates different recombination processes depending on the orientation of the two loxP sites (Stark et al.). Two loxP sites in the same direction lead to self-splicing: the sequence in between is excised and forms a circular loop. If the two loxP sites are in opposite orientations, the sequence in between is inverted.

Figure 3. Crystal structure of Cre recombinase bound to a loxP Holliday junction (PDB:3MGV).

If the two loxP sites are placed on different sequences, the two sequences are exchanged into each other’s place, but at a lower efficiency than the other two events. Our system uses this strand-transfer recombination activity to insert the mutated target into its original place, replacing the unmutated version and allowing another round of mutation.
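The orientation rules for two lox sites on one molecule can be written as a small simulation. This is a sketch for illustration: a construct is a list of named parts, `">"` and `"<"` stand for lox sites in the two orientations, and the function name and representation are our own assumptions. Exchange between two molecules is not modeled here.

```python
def recombine(construct):
    """Apply one Cre event to a construct containing exactly two lox sites.

    Same orientation    -> excision: the inner parts leave as a circle
                           that carries one lox site.
    Opposite orientation -> inversion: the inner parts are flipped in place.
    """
    sites = [i for i, part in enumerate(construct) if part in (">", "<")]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    i, j = sites
    inner = construct[i + 1:j]
    if construct[i] == construct[j]:
        return {
            "event": "excision",
            "product": construct[:i + 1] + construct[j + 1:],
            "circle": inner + [construct[j]],
        }
    flipped = [part + "(rev)" for part in reversed(inner)]
    return {
        "event": "inversion",
        "product": construct[:i + 1] + flipped + construct[j:],
        "circle": None,
    }

print(recombine(["promoter", ">", "mCherry", ">", "terminator"]))
print(recombine(["promoter", ">", "mCherry", "<", "terminator"]))
```

In the first call the mCherry part ends up on the excised circle, which is why a PCR across the region would give a shorter product after excision.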

The expression of Cre recombinase is placed under a different operon, with anhydrotetracycline (aTc) as its inducer. The reasons for placing RT and Cre under separate control are elaborated in our modeling. Besides the two proteins needing different final concentrations for the system to function optimally, putting them under different promoters also gives better control over the system’s status.

Sequences flanking loxP

Cre recognizes loxP sites and, after forming a tetramer, initiates the recombination process (Stark et al.). The two loxP sites are placed at the far 5’ and 3’ ends of the target sequence, respectively. Since we need the recombination activity but must eliminate self-splicing, each of the two loxP sites needs to be compatible with itself but incompatible with the other. We chose lox511, lox5171 and lox2272 to pair with wild-type loxP sites (Hoess et al., Lee et al.), and examined their incompatibility with wild-type loxP.
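A first-pass way to compare lox variants is to count differences in the 8-bp spacer, since the variants named above differ from wild-type loxP in that region. The sketch below is an illustration, not a predictor of actual recombination. The spacer sequences are the commonly listed ones and should be checked against the cited papers before use, and the rule "identical spacers are compatible" is a simplifying assumption that the compatibility test is meant to verify experimentally.

```python
# Spacer sequences as commonly listed (assumption: verify before use).
LOX_SPACERS = {
    "loxP":    "ATGTATGC",
    "lox511":  "ATGTATAC",
    "lox5171": "ATGTGTAC",
    "lox2272": "AAGTATCC",
}

def spacer_mismatches(site_a, site_b):
    """Number of positions at which two lox spacers differ."""
    a, b = LOX_SPACERS[site_a], LOX_SPACERS[site_b]
    return sum(x != y for x, y in zip(a, b))

def predicted_compatible(site_a, site_b):
    """Simplified rule: only sites with identical spacers are expected to pair."""
    return spacer_mismatches(site_a, site_b) == 0

for name in LOX_SPACERS:
    print(name, "mismatches vs loxP:", spacer_mismatches("loxP", name))
```

Variants with a single spacer mismatch may still cross-react with loxP at a low level, which is one reason to test each pair rather than rely on sequence alone.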

Cre mutation and alternative versions

We found that even though Cre is under a controllable promoter, its slight leaky expression is already enough to initiate self-splicing between two wild-type loxP sites (see Results for elaboration). This is undesirable because uncontrolled recombination could greatly damage the reliability of our results: the desired sequence could be recombined and expressed for a time, and then be lost as recombination continues. Our modeling results also show that Cre expression needs to stay at a low level for a higher recombination rate.

To bring Cre expression under more stringent control, we made several different versions of Cre.

  • First, we mutated some of the codons to rare codons, hoping that this would hinder translation and thus lower the expression level.
  • Then, we added different degradation tags after the Cre sequence. The tags use the endogenous degradation system of E. coli (Karzai et al.). We tested five tags, (YA)LAA, LVA, LAV, LVV, and (WV)LAA, based on the literature (Landry et al., Janssen et al.). Supported by our modeling, we found that the WVLAA tag best suits our needs, with a moderate steady-state level and fast degradation dynamics.

Figure 4. Different versions of Cre recombinase.

  • As an alternative approach, we also tested the split-Cre construct (Jullien et al.). Cre recombinase is split between its 59th and 60th amino acids, and the N- and C-terminal fragments are fused separately to FRB and FKBP. When the inducer rapamycin is absent, the two fragments do not dimerize and no Cre activity is detectable. After rapamycin is added to the culture, FKBP and FRB dimerize, bringing the two Cre fragments into contact and restoring recombination activity.

Experiments

Successful induction and processing of reverse transcriptase

EGFP is cloned in place of the reverse transcriptase, and its fluorescence level is measured. Expression of the gag-pol polyprotein is induced, and SDS-PAGE is run to check whether it has been successfully processed by the protease.

Figure 5. Schematic diagram of the system testing Cre recombination.

Reverse transcription completion

When Cre is not present in the cell, the output of the system would be reverse-transcribed double-stranded cDNA. After lysing the cells and extracting their DNA, the cDNA product would be amplified by PCR. After electrophoresis, we would expect a brighter band when RT expression is induced than in cells without RT. For more quantitative analysis, qPCR would be performed.
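A common way to turn qPCR threshold cycles (Ct) into a fold difference is the comparative Ct calculation. The sketch below assumes perfect doubling per cycle by default and an optional reference amplicon for normalization. The function name and all Ct values are hypothetical, not measurements from this project.

```python
def fold_change(ct_target_induced, ct_target_uninduced,
                ct_ref_induced=0.0, ct_ref_uninduced=0.0,
                efficiency=2.0):
    """Fold difference of the cDNA amplicon, induced vs. uninduced.

    Each Ct is first normalized to a reference amplicon (delta Ct);
    the difference between conditions is then converted to a fold
    change, assuming `efficiency`-fold amplification per cycle.
    """
    delta_induced = ct_target_induced - ct_ref_induced
    delta_uninduced = ct_target_uninduced - ct_ref_uninduced
    return efficiency ** (delta_uninduced - delta_induced)

# Hypothetical values: the target crosses threshold 3 cycles earlier
# when RT is induced.
print(fold_change(20.0, 23.0))
# Same target values, normalized to a hypothetical reference amplicon.
print(fold_change(20.0, 23.0, ct_ref_induced=15.0, ct_ref_uninduced=16.0))
```

A lower Ct in induced cells corresponds to more template, so a fold change above 1 would support successful reverse transcription.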

Testing

Cre expression

EGFP is put under the ptetR promoter in place of Cre, and its fluorescence level is measured.

Cre is co-transformed with another plasmid carrying a stably expressed mCherry flanked by two wild-type loxP sites. The whole region is then amplified by PCR. If Cre is active in the system, mCherry is excised from its plasmid and the amplification product is a short sequence, whereas the unexcised mCherry region is amplified in full.

Compatibility test of loxP sites

Cre is co-transformed with another plasmid carrying a stably expressed mCherry flanked by two different loxP sites. The whole region is amplified by PCR and run on a gel to see whether a self-splicing event has occurred.

Degradation tag dynamic

EGFP constructs carrying each of the five tags have been cloned, and their fluorescence levels measured.
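One simple way to interpret such measurements is a first-order model in which the protein is made at a constant rate and removed by tag-dependent degradation plus dilution from growth. This is a generic sketch and not the team's model. The function names and every rate value below are hypothetical.

```python
import math

def steady_state(k_syn, k_deg, mu=0.0):
    """Steady-state level for dP/dt = k_syn - (k_deg + mu) * P."""
    return k_syn / (k_deg + mu)

def half_time(k_deg, mu=0.0):
    """Time to cover half the distance to the steady state."""
    return math.log(2) / (k_deg + mu)

def level(t, k_syn, k_deg, mu=0.0, p0=0.0):
    """Protein level at time t, starting from p0."""
    k = k_deg + mu
    p_ss = k_syn / k
    return p_ss + (p0 - p_ss) * math.exp(-k * t)

# Hypothetical rates (per minute); labels are placeholders, not real tags.
for label, k_deg in [("untagged", 0.0), ("weak tag", 0.02), ("strong tag", 0.2)]:
    print(label,
          round(steady_state(k_syn=1.0, k_deg=k_deg, mu=0.02), 1),
          round(half_time(k_deg=k_deg, mu=0.02), 1))
```

In this model a stronger tag lowers the steady-state level and shortens the response time together, so a tag with moderate level and fast dynamics is a compromise between the two.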

Recombination event test

Cre is co-transformed with two plasmids. One plasmid contains a stably expressed full-length mCherry flanked by two different loxP sites; the other carries a truncated version of mCherry flanked by loxP sites that match those on the first plasmid. The truncated version is known to be inactive. If recombination succeeds, we would detect the first plasmid carrying the truncated version by PCR amplification and agarose gel analysis. The fluorescence level would also decrease.

System verification

We have constructed plasmids carrying a chloramphenicol resistance gene with different stop-codon mutations. The gene is thus inactive and acts as the target sequence. Two plasmids, one carrying the target and the other carrying the remaining components needed for mutation, are co-transformed.

After incubation, the reverse transcriptase would be induced first, then Cre recombinase. The target sequence would be randomly mutated in the system, and if a mutation happens to revert the stop codon to its original coding sequence, the cell would grow on solid medium containing chloramphenicol. By counting the colonies that grow, we could estimate the efficiency of single-site mutation.
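The logic of this reversion assay can be made concrete with a short sketch. It enumerates the single-nucleotide substitutions of a stop codon and counts how many restore a given original codon, and it computes a reversion frequency from colony counts. The example codons, the function names, and the idea of dividing by a viable cell count from non-selective plates are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def single_substitutions(codon):
    """All codons that differ from `codon` at exactly one position."""
    return [codon[:i] + base + codon[i + 1:]
            for i in range(3) for base in "ACGT" if base != codon[i]]

def reversion_summary(stop_codon, original_codon):
    """Count the single substitutions that restore the original codon,
    and those that give any sense codon (which may or may not restore
    protein function)."""
    variants = single_substitutions(stop_codon)
    return {
        "total": len(variants),
        "exact_revertants": variants.count(original_codon),
        "sense_codons": sum(v not in STOP_CODONS for v in variants),
    }

def reversion_frequency(resistant_colonies, viable_cells_plated):
    """Resistant colonies per viable cell plated."""
    if viable_cells_plated <= 0:
        raise ValueError("viable_cells_plated must be positive")
    return resistant_colonies / viable_cells_plated

# Hypothetical example: a CAG codon that was mutated to the TAG stop codon.
print(reversion_summary("TAG", "CAG"))
# -> {'total': 9, 'exact_revertants': 1, 'sense_codons': 8}
```

Only one of the nine possible single substitutions restores the original codon, so the colony count reflects a specific single-site event rather than mutation in general.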

Figure 6. Schematic diagram of the system verification experiment design.

Sequence specificity

The surviving cells would be cultured and their plasmids extracted. The plasmids would be sequenced to show that sequences unrelated to the target are not mutated. They would also be retransformed into bacteria that have never encountered the system, which would then be plated on medium containing chloramphenicol. If these cells still grow normally, the resistance is carried by the plasmid, and we could assume that the bacterial genome is not mutated in our mutagenesis process.

Host adaptability

Since we use parts that are orthogonal to native bacterial systems, our system is expected to work independently in different species. The system is tested in E. coli first and will then be extended to species including cyanobacteria and lactic acid bacteria.

References

  1. Yoshinaka, Y., Katoh, I., Copeland, T. D. & Oroszlan, S. Murine leukemia virus protease is encoded by the gag-pol gene and is synthesized through suppression of an amber termination codon. Proc. Natl. Acad. Sci. U. S. A. 82, 1618–1622 (1985).
  2. Csibra, E., Brierley, I. & Irigoyen, N. Modulation of Stop Codon Read-Through Efficiency and Its Effect on the Replication of Murine Leukemia Virus. J. Virol. 88, 10364 (2014).
  3. Mak, J. & Kleiman, L. Primer tRNAs for reverse transcription. J. Virol. 71, 8087–8095 (1997).
  4. Gonsky, J., Bacharach, E. & Goff, S. P. Identification of residues of the Moloney murine leukemia virus nucleocapsid critical for viral DNA synthesis in vivo. J. Virol. 75, 2616–2626 (2001).
  5. Zhang, W.-H., Svarovskaia, E. S., Barr, R. & Pathak, V. K. Y586F mutation in murine leukemia virus reverse transcriptase decreases fidelity of DNA synthesis in regions associated with adenine-thymine tracts. Proc. Natl. Acad. Sci. U. S. A. 99, 10090–10095 (2002).
  6. Harada, F., Peters, G. G. & Dahlberg, J. E. The primer tRNA for Moloney murine leukemia virus DNA synthesis. Nucleotide sequence and aminoacylation of tRNAPro. J. Biol. Chem. 254, 10979–10985 (1979).
  7. Lund, A. H., Duch, M., Lovmand, J., Jørgensen, P. & Pedersen, F. S. Complementation of a primer binding site-impaired murine leukemia virus-derived retroviral vector by a genetically engineered tRNA-like primer. J. Virol. 71, 1191–1195 (1997).
  8. Kulpa, D., Topping, R. & Telesnitsky, A. Determination of the site of first strand transfer during Moloney murine leukemia virus reverse transcription and identification of strand transfer-associated reverse transcriptase errors. EMBO J. 16, 856–865 (1997).
  9. Finston, W. I. & Champoux, J. J. RNA-primed initiation of Moloney murine leukemia virus plus strands by reverse transcriptase in vitro. J. Virol. 51, 26–33 (1984).
  10. Kelleher, C. D. & Champoux, J. J. RNA Degradation and Primer Selection by Moloney Murine Leukemia Virus Reverse Transcriptase Contribute to the Accuracy of Plus Strand Initiation. J. Biol. Chem. 275, 13061–13070 (2000).
  11. Robson, N. D. & Telesnitsky, A. Selection of optimal polypurine tract region sequences during Moloney murine leukemia virus replication. J. Virol. 74, 10293–10303 (2000).
  12. Stark, W. M. The Serine Recombinases. Microbiol. Spectr. 2, (2014).
  13. Hoess, R. H., Wierzbicki, A. & Abremski, K. The role of the loxP spacer region in P1 site-specific recombination. Nucleic Acids Res. 14, 2287–2300 (1986).
  14. Lee, G. & Saito, I. Role of nucleotide sequences of loxP spacer region in Cre-mediated recombination. Gene 216, 55–65 (1998).
  15. Karzai, A. W., Roche, E. D. & Sauer, R. T. The SsrA–SmpB system for protein tagging, directed degradation and ribosome rescue. Nat. Struct. Biol. 7, 449–455 (2000).
  16. Landry, B. P., Stöckel, J. & Pakrasi, H. B. Use of degradation tags to control protein levels in the Cyanobacterium Synechocystis sp. Strain PCC 6803. Appl. Environ. Microbiol. 79, 2833–2835 (2013).
  17. Janssen, B. D. & Hayes, C. S. The tmRNA ribosome-rescue system. Adv. Protein Chem. Struct. Biol. 86, 151–191 (2012).
  18. Jullien, N., Sampieri, F., Enjalbert, A. & Herman, J.-P. Regulation of Cre recombinase by ligand-induced complementation of inactive fragments. Nucleic Acids Res. 31, e131 (2003).