Difference between revisions of "Team:Fudan-TSI/Model"

(Undo revision 256623 by Ehtele (talk))
Line 1,366: Line 1,366:
 
  <rect id="rect2" x="0" y="0" width="1600" height="600" stroke="none" fill="url(#grad2)"></rect>
 
  <rect id="rect2" x="0" y="0" width="1600" height="600" stroke="none" fill="url(#grad2)"></rect>
 
</svg>
 
</svg>
<div id="demoCover"><img id="coverPic" src="https://static.igem.org/mediawiki/2019/c/c4/T--Fudan-TSI--coverModel.gif"></div>
+
<div id="demoCover"><img id="coverPic" src="https://static.igem.org/mediawiki/2019/9/97/T--Fudan-TSI--coverSafety.gif"></div>
 
</div>
 
</div>
 
<style>
 
<style>
Line 1,384: Line 1,384:
 
}
 
}
 
#coverPic{
 
#coverPic{
width:550px;
+
width:50vw;
 
margin:15vh auto;
 
margin:15vh auto;
 
}
 
}
Line 1,416: Line 1,416:
 
}
 
}
 
#coverPic{
 
#coverPic{
width:500px;
+
width:50vw;
 
margin:7vh auto;
 
margin:7vh auto;
 
}
 
}
Line 1,428: Line 1,428:
 
}
 
}
 
#coverPic{
 
#coverPic{
width:500px;
+
width:55vw;
 
margin:6vh auto;
 
margin:6vh auto;
 
}
 
}
Line 1,440: Line 1,440:
 
}
 
}
 
#coverPic{
 
#coverPic{
width:400px;
+
width:70vw;
margin:8vh auto;
+
}
+
}
+
@media only screen and (max-width:500px){
+
#coverPic{
+
width:200px;
+
 
margin:8vh auto;
 
margin:8vh auto;
 
}
 
}
Line 1,788: Line 1,782:
 
<ul class="leftNav" style="margin:0;padding:0;">
 
<ul class="leftNav" style="margin:0;padding:0;">
 
 
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle1">Overview</a>
+
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle1">Motivation</a></li>
</li>
+
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle2">Theoretical basis</a></li>
 
+
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle3">User guidelines</a></li>
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle2">Part I: Yield of recombined P<sub>target</sub></a>
+
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle4">References</a></li>
<ul class="leftNavUl2">
+
<li class="leftNavLi2"><a class="leftNavA2" href="#mainTitle2_1">Induced expression model</a></li>
+
<li class="leftNavLi2"><a class="leftNavA2" href="#mainTitle2_2">Reverse Transcription model</a></li>
+
<li class="leftNavLi2"><a class="leftNavA2" href="#mainTitle2_3">Cre Recombination Model</a></li>
+
<li class="leftNavLi2" style="display:none;"><a class="leftNavA2" href="#mainTitle2_4"></a></li>
+
</ul>
+
</li>
+
 
+
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle3">Part II: Times of occurrence of recombined P<sub>target</sub></a>
+
</li>
+
 
+
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle4">Part III: Optimal induction time</a>
+
</li>
+
 
+
<li class="leftNavLi"><a class="leftNavA" href="#mainTitle5">Reference</a>
+
</li>
+
 
+
 
</ul>
 
</ul>
 
 
Line 1,913: Line 1,890:
 
<div class="container" id="containerWithLeftNav">
 
<div class="container" id="containerWithLeftNav">
 
<div class="row">
 
<div class="row">
 +
 
 
 
<div class="row title2" id="mainTitle1">
 
<div class="row title2" id="mainTitle1">
<div class="col">Overview</div>
+
<div class="col">Motivation</div>
 
</div>
 
</div>
 
 
 
<div class="row para1">
 
<div class="row para1">
 
<div class="row">
 
<div class="row">
<div class="col col-lg-12">
+
<div class="col">
Our mutagenesis system uses the BL21 (DE3) <i>E. coli</i> strain transformed with two plasmids, a stringent plasmid named P<sub>target</sub> carrying the target sequence that we want to mutate, and a relaxed plasmid named P<sub>mutant</sub>, carrying the gene encoding the tools necessary for mutagenesis, i.e. reverse transcriptase (RT) and Cre. <br /><br />
+
Previous studies have shown that a tRNA primer is required for the initiation of reverse transcription <a>(Dahlberg et al.)</a>. In our system, we express the tRNA primer in <i>E. coli</i> by cloning it into the plasmid that encodes the tools for mutation, i.e. P<sub>mutant</sub>. However, designing a primer sequence by hand for each target sequence is time-consuming and takes many adjustments to reach a perfect match. This motivated us to build a software tool for tRNA primer design.
As we are designing a brand-new mutagenesis system inside <i>E. coli</i>, we want to demonstrate whether and under what condition it can work, so we turn to modelling to answer these questions. Our modelling work is comprised of 3 parts. 1) We used 3 deterministic models to describe the 3 reaction steps of our system—induced expression, reverse transcription and recombination. This allows us to compute and maximize the yield of the recombined P<sub>target</sub> which in turn, contributes to the optimization of our experimental setup. 2) We simulated the recombination process stochastically and calculated the number of recombined products that occurred during one replication cycle of <i>E. coli</i>. 3) We combined the 3 reaction steps together using deterministic model and found that the two kinds of inducers can be added at the same time to achieve optimal recombination efficiency within one life-cycle of <i>E. coli</i>.
+
 
+
 
</div>
 
</div>
 
</div>
 
</div>
Line 1,929: Line 1,905:
 
 
 
<div class="row title2" id="mainTitle2">
 
<div class="row title2" id="mainTitle2">
<div class="col">Part I: Deterministic model to compute the yield of recombined P<sub>target</sub></div>
+
<div class="col"><i>Theoretical basis</i></div>
 
</div>
 
</div>
 
+
 
<div class="row para1">
 
<div class="row para1">
<div class="col">
+
<div class="row">
When we were constructing the plasmid, we encountered a dilemma concerning how RT and Cre should be expressed. Firstly, we thought of putting them both under a same Lac operon so that their expression can be easily induced merely by one kind of inducer—IPTG. Meanwhile, we also considered using different inducers to achieve a more modular design which would be easier to control. As it would take a long time to test which induced expression scheme is better through experiments, we used modelling to test the two constructs. We modelled all the reactions involved and computed the yield of the desired product, i.e. recombined P_target. Through comparison of the yield acquired using these two induced expression schemes, we decided that the latter scheme should be employed for our system to perform better. <br /><br />
+
<div class="col">
By common knowledge we can assume that, if the amount of RT and Cre needs to be different to achieve optimal yield, we should choose the second scheme and put them under different operons. On the contrary, if the yield reaches the maximum under the maximum amount of RT and Cre, the first scheme should be chosen. <br /><br />  
+
Previous studies have reported that the interactions among the tRNA primer, the mRNA template and the reverse transcriptase are crucial in directing subsequent cDNA synthesis <a>(James E. Dahlberg et al.)</a>. Specifically, in the model of reverse transcription proposed by Kulpa et al., reverse transcription comprises 5 steps (Fig. 1). Among them, the annealing of the tRNA primer to the primer binding site (PBS) on the mRNA template is crucial for the synthesis of minus-strand strong-stop DNA (–ssDNA) and for the cDNA synthesis that follows. <br /><br />
In our initial attempt, we found that modelling all the reactions involved is rather difficult, as the reactions are in such a large number and all mixed together. This circumstance makes inspection of the reasonability of our models and parameters impossible. To overcome this issue, we decided to separate these reactions into three minor models and use the steady-state concentration of the substances derived from the previous model as the input of the next model. The three minor models are: <b>induced expression model, reverse transcription model and Cre recombination model,</b> corresponding to the 3 reaction steps in R-Evolution. The schematic diagram is shown in Fig. 1.
+
Many researchers have studied the reverse transcription process in viruses, from which we find two critical properties in the annealing process of tRNA primer and PBS that should be taken into consideration when building the tRNA primer designer. <br /><br />
</div>
+
The first property is that the 3' terminus of the tRNA primer should be complementary to the PBS on the mRNA template <a>(Kosloff et al.)</a>. The second is that different viruses prefer a specific type of tRNA primer for reverse transcription <a>(Kulpa et al.,</a><a> Kosloff et al.)</a>. Note also that the length of the PBS and the type of tRNA primer differ between viruses. The PBS lengths and the preferred tRNA types of the 3 most thoroughly studied retroviruses are listed in Table I.<br /><br />
</div>
+
These findings serve as the theoretical basis for our tRNA primer designer. Its function is to change the tRNA template to suit the basic properties of the reverse transcriptase (MMLV RT/ HIV-1 RT/ RSV RT) selected by the user, and to replace 17 or 18 nucleotides at the 3' terminus of the tRNA template so that they match the nucleotides at the 5' terminus of the GOI that the user inputs. In addition, to confirm that the resulting RNA sequence is still a tRNA sequence, its secondary structure has to be predicted. We do this with a tRNA secondary structure prediction scheme similar to the one implemented in the open-source software tRNAfinder <a>(Kurokawa et al.)</a>.<br /><br />
+
Studies have shown that the primary factor guiding the selection of the tRNA primer for MMLV RT is the PBS sequence rather than an inherent property of the reverse transcriptase <a>(A. H. Lund et al.,</a><a> S. P. Goff et al.)</a>. By mutating both the PBS and the tRNA sequence, the researchers found that reverse transcription could still take place without greatly affecting the virus titer. Moreover, after several cycles of replication, the mutated sequence did not revert to its original version <a>(Pedersen et al., 1997)</a>. Although the primer requirement for MMLV is not stringent, studies have revealed that the tRNA-like structure is necessary. One study reported that including a single non-Watson-Crick base pair between the PBS and the tRNA primer improves replication efficiency (F. S. Pedersen et al., 1993), but we did not adopt this construct because the mismatch is often changed to the fully complementary version after the first cycle of replication <a>(Pedersen et al., 1997)</a>, making this addition unnecessary.
<div class="row title3" id="mainTitle2_1">
+
<div class="col"><i>Induced expression model</i></div>
+
</div>
+
+
<div class="row para1">
+
<div class="col">
+
We first assumed that both genes encoding RT and Cre are placed together under a lac operon (Fig 2a). The repressor protein LacI is stably expressed in the cell, 2 molecules of LacI will form a dimer which binds to LacO DNA fragment and represses the expression of RT and Cre. When IPTG is added and transported into the cell, IPTG molecules will bind with LacI and inhibit its binding to LacO. In this way, RT and Cre can be rescued from suppression (Nikos et al.). Details of the substance names, parameter names and mathematical equations can be found in the appendix.<br /><br />
+
According to our modelling result, the amount of target protein (RT and Cre) will be extremely low when IPTG is not added (Fig. 2). The origin point represents the time when an E. coli comes into being through reproduction. As a result, the lac operon is not fully repressed by LacI dimer, causing a leakage expression of target protein (from 0 min to 1 min, Fig. 2b). After that, due to slow degradation rate of the target protein’s mRNA as well as the target protein itself, the amount of target protein will continue to accumulate to a certain amount (from 1 min to 5 min, Fig. 2b) after the lac operon is fully repressed. Finally, the degradation process removes target protein from the system (from 5 min to 50 min, Fig. 2b). When IPTG is added, we find that the concentration of protein product quickly rises (Fig. 2c). The steady-state concentration of target protein is 1.63 μM. This number will be used for further analysis.
+
  
</div>
+
</div>
</div>
+
</div>
+
<div class="row title3" id="mainTitle2_2">
+
<div class="col"><i>Reverse Transcription model</i></div>
+
</div>
+
+
<div class="row para1">
+
<div class="col">
+
From the first model, the concentration of both RT and Cre are acquired. The concentration of RT serves as input to the reverse transcription model. As the schematic diagram depicts (Fig. 3a), tRNA primer first binds with reverse transcriptase. When this complex binds with a certain fragment on the target sequence, which is called primer binding site (PBS), the reverse transcription will start and cDNA will be synthesized.<br /><br />
+
Although a more elaborate model of reverse transcription has been proposed by Kulpa et al, it includes many reactions whose kinetic properties are not well characterized. As a result, we simplified that model and came up with our own. Details of the substance names, parameter names and mathematical equations we used can be found in the appendix.<br /><br />
+
The modelling result is shown in Fig. 3b. It shows that the concentration of cDNA will accumulate at the presence of RT (whose initial concentration is 1.63 μM, computed by the induced expression model) and finally reach a steady-state of 66.5 nM. This number will be used for further analysis.
+
</div>
+
</div>
+
+
<div class="row title3" id="mainTitle2_3">
+
<div class="col"><i>Cre Recombination Model</i></div>
+
</div>
+
+
+
<div class="row para1">
+
<div class="col">
+
Our first assumption is that the genes encoding RT and Cre are both placed under lac operon and thus be expressed in the same amount. So now we are about to compute the yield of our desired product to identify whether this experimental setup is feasible. The model of the recombination process has been clearly described by Ehrlich et al. We made some changes to it according to our own experimental design. The schematic diagram is shown in Fig. 4a. Details of the substance names, parameter names and mathematical equations can be found in the appendix.<br /><br />
+
As is shown in the diagram, 2 Cre molecules bind with 1 loxP site successively, either on cDNA or P_target. Four Cre molecules will form a Holliday junction, and thus starting the recombination reaction. Two pairs of loxP will work together and complete the strand exchange between cDNA and P_target. After that, the recombined product is produced. What we are interested in is the percentage of recombined P_target among all P_targets in one E. coli. So, we turn to compute that percentage based on the model that we have established.<br /><br />
+
Unfortunately, we found that the amounts of the substances are very small. For example, the concentration of P_target is only 10 nM, which means there are only about 5 molecules of P_target in one cell. These small numbers caused computational problems in MATLAB when we used its ODE solver (ode15s). To address this, we converted the units of the amounts of the substances from moles per liter (M) to molecules, and converted the units of the kinetic parameters accordingly. The necessity of these conversions is explained in the appendix.<br /><br />
+
Now the recombination step is modeled under the initial condition of 5 molecules of non-mutated P_target, 785 molecules of Cre and 31 molecules of cDNA (Fig 4b). The last two numbers are the outputs of previous models after going through some unit conversion steps.
+
</div>
+
</div>
+
<div class="row para1">
+
<div class="col">
+
The result is disappointing. After a long period of reaction, no recombined P_target showed up. It is because there are too many Cre molecules so P_targets are all bounded by them and remain in the intermediate form. What’s more, P_target can't bind with T7 RNA polymerase and be transcribed as a consequence of Cre occupation. This leads to the system’s inability of undergoing further reverse transcription process, stopping cDNA’s production, resulting in a stop of the system, and rendering mutation accumulation impossible (Fig. 4c).<br /><br />
+
This result tells us that the number of Cre molecules needs to be much lower for the system to function. We then set out to determine how many Cre is optimal. After we fed the recombination model with cDNA and Cre at different concentrations, the problem seems to be clear as the yield of recombined P_target varies greatly responding to different numbers of cDNA and Cre (Fig. 4d). When cDNA is confined to 31 molecules, we will get no yield at all in the period of E. coli's replication cycle if the concentration of Cre is greater than 80 nanomoles. Instead, the yield is maximized when the final Cre concentration is around 27 molecules (Fig 4e).
+
</div>
+
</div>
+
<div class="row para1">
+
<div class="col">
+
Now we use the optimized number of Cre as the input to our third model. The result is shown in Fig. 4f, which is satisfactory. The recombined P_target finally occurs and P_target has a chance to bind with T7 RNA polymerase, which means mutated gene of interest could be transcribed and further mutated, thus making the accumulation of mutations possible (Fig 4g).
+
</div>
+
</div>
+
<div class="row para1">
+
<div class="col">
+
There is still something that is not well explained in our current model. The final percentage of recombined P_target is around 2.5%. The unit of the substance is molecules, so it means there is 0.125 recombined P_target in one cell, which is unrealistic. This problem reflects that converting the unit of substance into molecule when doing deterministic modelling cannot offer a precise description of the system’s status.<br /><br />
+
We then used stochastic modelling techniques to determine whether and how many recombined P_targets will show up in one replication cycle of E. coli.
+
 
+
</div>
+
</div>
+
+
<div class="row title3" id="mainTitle2_4">
+
 
</div>
 
</div>
 
 
 
<div class="row title2" id="mainTitle3">
 
<div class="row title2" id="mainTitle3">
<div class="col">Part II: Stochastic model to compute times of occurrence of recombined P<sub>target</sub></div>
+
<div class="col"><i>User guidelines</i></div>
 
</div>
 
</div>
 
+
 
<div class="row para1">
 
<div class="row para1">
<div class="col">
+
<div class="row">
We use the Gillespie algorithm for stochastic modelling. A detailed description of this technique is given in the appendix. Although the algorithm is rather simple, basic mathematical skills are required to understand its theoretical basis. The result is shown in Fig. 5.<br /><br />
+
<div class="col">
The result demonstrates that recombined P_targets do occur and that two rounds of reverse transcription and recombination can take place in one replication cycle of E. coli (1200 s) (Fig 5a). In contrast, no recombined P_target appears within that period if the initial cDNA is 31 molecules and the initial Cre is 785 molecules (Fig 5b). This again demonstrates the necessity of putting RT and Cre under different induction setups. The number of recombined P_targets fluctuates because of the backward reaction: Cre can rebind to a recombined P_target and reverse the recombination, so that the algorithm no longer counts it as recombined P_target.
+
Our tRNA primer designer is a web tool for potential users of our mutagenesis system to design their own tRNA primers according to their experimental setups. Here we provide a step-by-step guide to using this software.<br /><br />
</div>
+
Step 1. Input the DNA sequence that you want to mutate. The last 18 nucleotides of the sequence are selected as the PBS. Note that the sequence must be longer than 18 nucleotides and must not contain any characters other than A/T/C/G.<br /><br />
 +
Step 2. Choose the type of reverse transcriptase that you want to use based on your experimental design. Note that this software only allows you to choose from MMLV RT/ HIV-1 RT/ RSV RT.<br /><br />
 +
Step 3. Click on the "DESIGN FOR ME!!!" button to see the result, which has two parts. The first part shows the secondary structure of the template tRNA that you will be using, together with the designed tRNA primer; the fragment that anneals to the PBS of the input DNA sequence is shown in red. The second part gives the DNA sequence encoding the designed tRNA primer, which you can copy and use elsewhere.
 +
</div>
 +
</div>
 
</div>
 
</div>
 
 
 
<div class="row title2" id="mainTitle4">
 
<div class="row title2" id="mainTitle4">
<div class="col">Part III: Deterministic model to determine optimal induction time</div>
+
<div class="col"><i>References</i></div>
 
</div>
 
</div>
 
+
 
<div class="row para1">
 
<div class="row para1">
<div class="col">
+
<div class="row">
In the first part, we demonstrated models which separate the 3 reaction steps and use the output from the preceding model as the input of the subsequent one. The previous setup successfully provided us with a clear insight into the reactions and dynamic changes of substances that underlie our mutagenesis system. However, this simplification doesn’t match real reaction situations. For example, when RT and Cre are expressed simultaneously upon induction, cDNA would bind with Cre and undergo recombination as soon as it is synthesized. This fact contradicts with our model assumption that recombination only takes place after cDNA has reached its steady-state concentration. To overcome this problem, we employed deterministic model to combine the separate steps together into one and better simulate real intracellular circumstances.<br /><br />
+
<div class="col">
The first part of our model presents to us the optimal amount of Cre that should exist in the system, but leaves us with a problem concerning when Cre should be induced to achieve the greatest recombination efficiency. We first asked ourselves: can Cre function after cDNA accumulates to its steady-state, just as our previous model assumes? After inspecting the time required for the cDNA accumulation step, we found that this isn’t the case. The time needed for cDNA accumulation is close to the time length of a single E. coli replication cycle (1200 s). So if recombination happens only after cDNA reached its steady-state concentration, it does not happen at all. This can be explained by the substance division process when 1 parent E. coli reproduces into 2 child E. coli cells. As a result, when cDNA nearly reaches steady-state concentration in the parent E. coli, its concentration will consecutively be reduced by half in child E. coli, which breaks the steady-state again. <br /><br />
+
<ul class="paraUl" style="list-style:none;">
Having realized that recombination cannot wait until cDNA reaches its steady-state concentration, we faced a second question: when in one E. coli replication cycle should Cre be induced to maximize the percentage of recombined P_target? One possible answer is to induce Cre at the same time as RT, using a different inducer, aTc (anhydrotetracycline). With this approach, recombination can occur throughout the E. coli replication cycle and thus has the longest possible duration. Adding the two inducers simultaneously also reduces the labor of applying R-Evolution in a real experimental setup. However, in the initial stages, when the RT concentration is low and the rate of cDNA synthesis (reverse transcription) is therefore low, the cDNA concentration is minimal and so is the recombination efficiency. We therefore set out to find whether there is a time point that maximizes the recombination efficiency within one E. coli replication cycle by allowing sufficient time for recombination as well as a moderate amount of initial reverse transcription.<br /><br />
+
<li>[1]: Peters G , Dahlberg J E . RNA-directed DNA synthesis in Moloney murine leukemia virus: interaction between the primer tRNA and the genome RNA.[J]. Journal of Virology, 1979, 31(2):398-407.</li>
By combining previous models (Part I. induced expression model, reverse transcription model, recombination model) and using the aTc induction model proposed by Steel et al. to simulate the Cre induced expression process (the schematic diagram of this process is shown in Fig 6a. Details of the substance names, parameter names and mathematical equations can be found in the appendix), we confirm that the optimal recombination efficiency will be achieved when expression of RT and Cre is induced at the same time point (the origin point represents the moment when IPTG is added to initiate RT expression, with 50 μM IPTG dosage and 1.75 μM aTc dosage), characterized by the maximized percentage of recombined product at the 20th minute (Fig 6 b-d, modeling different moment of aTc induction—the 5th min, 10th min, 15th min in b &amp; c).
+
<li>[2]: Kulpa, D. Determination of the site of first strand transfer during Moloney murine leukemia virus reverse transcription and identification of strand transfer-associated reverse transcriptase errors[J]. EMBO (European Molecular Biology Organization) Journal, 1997, 16(4):856-865.</li>
</div>
+
<li>[3]: Palmer M T , Kirkman R , Kosloff B R , et al. tRNA Isoacceptor Preference prior to Retrovirus Gag-Pol Junction Links Primer Selection and Viral Translation[J]. Journal of Virology, 2007, 81(9):4397-4404.</li>
</div>
+
<li>[4]: Kinouchi M , Kurokawa K . [Special Issue: Fact Databases and Freewares] tRNAfinder: A Software System To Find All tRNA Genes in the DNA Sequence Based on the Cloverleaf Secondary Structure[J]. Journal of Computer Aided Chemistry, 2006, 7:116-124.</li>
<div class="row para1">
+
<li>[5]: Lund, Anders H. et al. “Mutated primer binding sites interacting with different tRNAs allow efficient murine leukemia virus replication.” Journal of virology, 67 12 (1993): 7125-30.</li>
<div class="col">
+
</ul>
In the deterministic model, we combined the three minor models proposed previously and assessed the mutagenesis system as a whole. Through this addition, we achieved a better simulation of the real intracellular reactions and answered the question of when Cre should be induced for the highest recombination efficiency to be obtained.
+
</div>
</div>
+
</div>
 
</div>
 
</div>
 
 
 
 
<div class="row title3" id="mainTitle5">
+
</div>
<div class="col">References</div>
+
</div>
+
 
 
<div class="row para1">
 
<div class="col">
 
<ul class="paraUl" style="list-style:none;">
 
<li>[1]: Stamatakis M, Mantzaris N V. Comparison of Deterministic and Stochastic Models of the lac Operon Genetic Network[J]. Biophysical Journal, 2009, 96(3):887-906.</li>
 
<li>[2]: Kulpa, D. Determination of the site of first strand transfer during Moloney murine leukemia virus reverse transcription and identification of strand transfer-associated reverse transcriptase errors[J]. EMBO (European Molecular Biology Organization) Journal, 1997, 16(4):856-865.</li>
 
<li>[3]: Lanchy J M, Ehresmann C, Le Grice S F, et al. Binding and kinetic properties of HIV-1 reverse transcriptase markedly differ during initiation and elongation of reverse transcription.[J]. The EMBO Journal, 1996, 15(24):7178-7187.</li>
 
<li>[4]: Kati W M, Johnson K A, Jerva L F, et al. Mechanism and fidelity of HIV reverse transcriptase[J]. Journal of Biological Chemistry, 1993, 267(36):25988-25997.</li>
 
<li>[5]: Ringrose L, Lounnas V, Ehrlich L, et al. Comparative kinetic analysis of FLP and cre recombinases: mathematical models for DNA binding and recombination[J]. Journal of Molecular Biology, 1998, 284(2):0-384.</li>
 
<li>[6]: Harris A W K, Kelly C L, Steel H, et al. The autorepressor: A case study of the importance of model selection[C]. Decision &amp; Control. IEEE, 2018.</li>
 
</ul>
 
</div>
 
</div>
 
  
+
 
</div>
+
 
</div>
 
</div>
 
 
Line 2,305: Line 2,216:
  
  
 
 
 
 
 

Revision as of 10:25, 14 October 2019

Motivation
Previous studies have shown that a tRNA primer is required for the initiation of reverse transcription (Dahlberg et al.). In our system, we express the tRNA primer in E. coli by cloning it into the plasmid that encodes the tools for mutation, i.e. P_mutant. However, designing a primer sequence by hand for each target sequence is time-consuming and takes many adjustments to reach a perfect match. This motivated us to build a software tool for tRNA primer design.
Theoretical basis
Previous studies have reported that the interactions among the tRNA primer, the mRNA template and the reverse transcriptase are crucial in directing subsequent cDNA synthesis (James E. Dahlberg et al.). Specifically, in the model of reverse transcription proposed by Kulpa et al., reverse transcription comprises 5 steps (Fig. 1). Among them, the annealing of the tRNA primer to the primer binding site (PBS) on the mRNA template is crucial for the synthesis of minus-strand strong-stop DNA (–ssDNA) and for the cDNA synthesis that follows.

Many researchers have studied the reverse transcription process in viruses, from which we find two critical properties in the annealing process of tRNA primer and PBS that should be taken into consideration when building the tRNA primer designer.

The first property is that the 3' terminus of the tRNA primer should be complementary to the PBS on the mRNA template (Kosloff et al.). The second is that different viruses prefer a specific type of tRNA primer for reverse transcription (Kulpa et al., Kosloff et al.). Note also that the length of the PBS and the type of tRNA primer differ between viruses. The PBS lengths and the preferred tRNA types of the 3 most thoroughly studied retroviruses are listed in Table I.

These findings serve as the theoretical basis for our tRNA primer designer. Its function is to change the tRNA template to suit the basic properties of the reverse transcriptase (MMLV RT/ HIV-1 RT/ RSV RT) selected by the user, and to replace 17 or 18 nucleotides at the 3' terminus of the tRNA template so that they match the nucleotides at the 5' terminus of the GOI that the user inputs. In addition, to confirm that the resulting RNA sequence is still a tRNA sequence, its secondary structure has to be predicted. We do this with a tRNA secondary structure prediction scheme similar to the one implemented in the open-source software tRNAfinder (Kurokawa et al.).
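
The 3'-end replacement described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the designer's actual source code: the function names and the toy sequences are hypothetical, and the sketch assumes that the annealing segment is simply the reverse complement of the PBS, with both the tRNA template and the PBS given as DNA strings over A/T/C/G. Secondary structure prediction is not covered here.

```python
# Hypothetical sketch of the 3'-end replacement step (not the tool's real code).
# Assumption: the primer's annealing segment is the reverse complement of the PBS.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string made of A/C/G/T."""
    return dna.translate(COMPLEMENT)[::-1]


def replace_three_prime_end(template_trna_dna: str, pbs_dna: str) -> str:
    """Swap the last len(pbs_dna) nucleotides of the tRNA template for the
    reverse complement of the PBS, so that the 3' end can anneal to the PBS."""
    n = len(pbs_dna)
    if n >= len(template_trna_dna):
        raise ValueError("PBS must be shorter than the tRNA template")
    return template_trna_dna[:-n] + reverse_complement(pbs_dna)


if __name__ == "__main__":
    # Toy sequences for illustration only; real PBS segments are 17 or 18 nt.
    print(replace_three_prime_end("GGGGGGGGAAAA", "AACG"))
```

The PBS length is taken from the input rather than hard-coded, because the text gives it as 17 or 18 nucleotides depending on the reverse transcriptase.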

Studies have shown that the primary factor guiding the selection of the tRNA primer for MMLV RT is the PBS sequence rather than an inherent property of the reverse transcriptase (A. H. Lund et al., S. P. Goff et al.). By mutating both the PBS and the tRNA sequence, the researchers found that reverse transcription could still take place without greatly affecting the virus titer. Moreover, after several cycles of replication, the mutated sequence did not revert to its original version (Pedersen et al., 1997). Although the primer requirement for MMLV is not stringent, studies have revealed that the tRNA-like structure is necessary. One study reported that including a single non-Watson-Crick base pair between the PBS and the tRNA primer improves replication efficiency (F. S. Pedersen et al., 1993), but we did not adopt this construct because the mismatch is often changed to the fully complementary version after the first cycle of replication (Pedersen et al., 1997), making this addition unnecessary.
User guidelines
Our tRNA primer designer is a web tool for potential users of our mutagenesis system to design their own tRNA primers according to their experimental setups. Here we provide a step-by-step guide to using this software.

Step 1. Input the DNA sequence that you want to mutate. The last 18 nucleotides of the sequence are selected as the PBS. Note that the sequence must be longer than 18 nucleotides and must not contain any characters other than A/T/C/G.

Step 2. Choose the type of reverse transcriptase that you want to use based on your experimental design. Note that this software only allows you to choose from MMLV RT/ HIV-1 RT/ RSV RT.

Step 3. Click on the "DESIGN FOR ME!!!" button to see the result, which has two parts. The first part shows the secondary structure of the template tRNA that you will be using, together with the designed tRNA primer; the fragment that anneals to the PBS of the input DNA sequence is shown in red. The second part gives the DNA sequence encoding the designed tRNA primer, which you can copy and use elsewhere.
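
The input rules of Steps 1 and 2 can be sketched as a small validation routine. This is an assumed illustration of the checks described above, not the web tool's own code: the function name `select_pbs`, the constant names and the error messages are hypothetical, and the sketch treats the rules strictly (upper-case A/T/C/G only, no whitespace).

```python
# Hypothetical sketch of the Step 1 / Step 2 input checks (not the tool's real code).

ALLOWED_BASES = frozenset("ATCG")
SUPPORTED_RTS = ("MMLV RT", "HIV-1 RT", "RSV RT")
PBS_LENGTH = 18  # Step 1: the last 18 nucleotides are selected as the PBS


def select_pbs(sequence: str, rt_name: str) -> str:
    """Validate the user input and return the last 18 nucleotides as the PBS."""
    if rt_name not in SUPPORTED_RTS:
        raise ValueError(f"unsupported reverse transcriptase: {rt_name!r}")
    if len(sequence) <= PBS_LENGTH:
        raise ValueError("sequence must be longer than 18 nucleotides")
    invalid = set(sequence) - ALLOWED_BASES
    if invalid:
        raise ValueError(f"invalid characters: {sorted(invalid)}")
    return sequence[-PBS_LENGTH:]


if __name__ == "__main__":
    # Toy input for illustration only.
    print(select_pbs("AAAAA" + "ACGTACGTACGTACGTAC", "MMLV RT"))
```

Rejecting bad input before any design step keeps the later stages simple, since they can assume a clean A/T/C/G string of sufficient length.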
References
  • [1]: Peters G, Dahlberg J E. RNA-directed DNA synthesis in Moloney murine leukemia virus: interaction between the primer tRNA and the genome RNA[J]. Journal of Virology, 1979, 31(2):398-407.
  • [2]: Kulpa D. Determination of the site of first strand transfer during Moloney murine leukemia virus reverse transcription and identification of strand transfer-associated reverse transcriptase errors[J]. The EMBO Journal, 1997, 16(4):856-865.
  • [3]: Palmer M T, Kirkman R, Kosloff B R, et al. tRNA Isoacceptor Preference prior to Retrovirus Gag-Pol Junction Links Primer Selection and Viral Translation[J]. Journal of Virology, 2007, 81(9):4397-4404.
  • [4]: Kinouchi M, Kurokawa K. [Special Issue: Fact Databases and Freewares] tRNAfinder: A Software System To Find All tRNA Genes in the DNA Sequence Based on the Cloverleaf Secondary Structure[J]. Journal of Computer Aided Chemistry, 2006, 7:116-124.
  • [5]: Lund A H, et al. Mutated primer binding sites interacting with different tRNAs allow efficient murine leukemia virus replication[J]. Journal of Virology, 1993, 67(12):7125-7130.