Team:NAU-CHINA/PES Model

NAU-iGEM

peptide expression stochastic model


Nothing is impossible to model.

Find Out More

Q & A

Q: Why do we establish this model?
A: To search for the optimal combination of RBS and promoters to maximize the quantity of anti-malarial peptides.

Q: Why do we believe that there exist optimal combinations?
A: By analyzing the experimental data of the fluorescence secretion of HasA system, we learned that the secretion rate reaches an upper bound. It indicates that the rate of secreted protein(TEVp & fusion protein) is constant, and so there exists an optimal ratio of TEVp and fusion protein, which is determined by the combination of their RBS and promoters.

Q: Why can we simulate the expression of peptides of the whole flora by only simulating one single bacteria?
A: The literature[1] shows that after blood-meal the population of the bacteria in mosquitoes growth rapidly to a high and stable stage, which will last for 3 days. This stage is the key period to kill Plasmodium. In this stage, we believe that every single bacteria can produce peptides steadily.

Abstract

By analyzing the experimental data of the fluorescence secretion of HasA system, we learned that the secretion rate reaches an upper bound. So, a peptide expression stochastic model was established to optimize the gene circuit design. We divided the pathway into two parts: extracellular specific cleavage, intracellular protein constitutive expression. Firstly, according to the principle of enzymatic reaction kinetics, the reaction rate of extracellular cleavage was determined by the Michaelis-Menten equation. Then a random cleavage model was established. The model simulates the reaction to seek the relation between the quantity of protein of interest(POI) and secretion ratio. With this model we obtained the optimal secretion ratio to maximize the quantity of POI under different secretion amounts. Secondly, based on the Gillespie algorithm we established a stochastic model of intracellular gene expression. By adjusting the strength of promoter and RBS we explored the effects of the strength on the quantity of intracellular protein expression.
Finally, with these two models, we obtained the optimal promoter combination of TEVp and fusion protein in HasA secretion system. And we made a deduction for our experiment, which is consists with the experimental results.


1. Background

The fusion protein effector protein and Tobacco Etch Virus protease(TEVp) are two key substances to produce anti-malarial peptides. The fusion protein consists of 9 anti-malarial peptides, but it has no activity to Plasmodium unless being cut into 9 single peptides. TEVp is a kind of enzyme which specifically recognizes and cuts the certain sites between every two peptides. Considering the secretion rate will reach an upper bound, we need to find an optimal ratio of TEVp and fusion protein in the process of secretion which makes the quantity of anti-malarial peptides highest. The ratio in the secretion is determined by the intracellular ratio. And the intracellular ratio is determined by the combination of promoters and RBSs. So the whole anti-malarial peptide producing process can be divided into two parts.

The Extracellular Part

After being transported to the extracellular, TEVp begin to fold, and then it has the ability to specifically recognizes the sites and cleaves the fusion protein.

The Intracellular Part

The genes of the fusion protein and TEVp, are unique sequences constructed in the same plasmid. The promoters are selected from a constitutive promoter family[9], companied with different RBS, controlling the expression of these two genes respectively.

The HasA secretion system bridges these two parts by secreting the intracellular fusion protein and TEVp to the extracellular.


2. Model Design

Based on the two parts above, we established two models. And then combined them.

For The Intracellular Part, we established a Stochastic Simulation Algrithm Model(SSA)[2] to simulate the expression of the fusion protein and TEVp. By changing the strength of promoter and RBS, we get the quantity of these two protein in a cell.

For The Extracellular Part, based on Michaelis-Menten equation we established a cleavage model. Setting the secretion rate in a range, at different fixed secretion rate we change the ratio of fusion protein and TEVp, and then get the quantity of POI. Then we can find the optimal ratio of fusion protein and TEVp for different secretion rates.

Finally, based on the assumption that the ratio of the intracellular fusion protein and TEVp is equal to the secretion ratio of them, these two models were combined. Ultimately, we can search the optimal ratio of promoter and RBS strength to satisfy the results of The Extracellular Part.


3. Symbols & Explanation

Variables Explanation Value Units
Part 1 Extracellular
ETEVp Extracellular TEVp Molecule
Efusion protein Extracellular fusion protein Molecule
EHTEVp Half life period of Extracellular TEVp 360* s
Hpeptidei Half life period of peptidei 180[3] s
Kcat Constant of Michaelis-Menten equation relates to the Enzymatic reaction rate 0.19[4] s-1
Km Michaelis constant 0.041[4] mmol·L-1
vt Reaction rate of TEVp cleave fusion protein at time t s-1
Secretion
S Total quantity of molecule of TEVp and fusion protein secreted out of a cell per second when secretion reaches the upper bound Molecule
r Proportion of fusion protein in S
Part2 Intracellular
Nplasmid Average copy number of plasmids per cell 500[5] copies
mRNAfusion protein Molecules of mRNA of TEVp per cell Molecule
mRNATEVp Molecules of mRNA of fusion protein per cell Molecule
ITEVp Intracellular molecules of TEVp per cell Molecule
Ifusion protein Intracellular molecules of fusion protein per cell Molecule
KTEVp Transcription rate of the DNA of TEVp s-1
Kfusion protein Transcription rate of the DNA of fusion protein s-1
PTEVp Translation rate of the mRNA of TEVp 0.044[6] s-1
Pfusion protein Translation rate of the mRNA of fusion protein 0.029[6] s-1
DmTEVp Degradation rate of the mRNA of TEVp 0.0041[7] s-1
Dmfusion protein Degradation rate of the mRNA of fusion protein 0.0041[7] s-1
DTEVp Degradation rate of intracellular TEVp s-1
Dfusion protein Degradation rate of intracellular fusion protein s-1

4. Assumptions

The intracellular part

1. The degradation rate of mRNA is a constant.
2. The mRNA is generated at a constant constitutive transcription rate.
3. The copy number of plasmids is kept as a constant.
4. Other species such as RNAP polymerases and ribosomes are kept constant as well.
5. Reactions are at equilibrium, steady state, or quasi-steady state[8].
6. Every kind of substance in the fusion protein and TEVp expression process is evenly distributed in the cell.
7. Each RBS part has a native strength irrespective of the promoter and protein-coding part it can be used with, and the translation rate with same RBS has a linear relationship with the mRNA length[8].

The extracellular part

1. TEVp and fusion protein are fully mixed in the extracellular.
2. The extracellular is in a stable state during the enzymatic reaction
3. We ignore the difference of the possibility of different sizes of substrate integrating with enzyme.
4. TEVp will recognize the sites of fusion protein with same possibility


5. Model Establishment

5.1 Part1 Extracellular

Michaelis-Menten equation has been widely use to simulate the enzymatic reaction. The process that TEVp cleaves the fusion protein is typical enzymatic reaction. TEVp is the enzyme, fusion protein is the substrate, the reaction rate can be calculated by Michaelis-Menten equation.

5.1.1 Reactant & Product

Denote the initial fusion protein with 9 kinds of peptide as [1,2,3,4,5,6,7,8,9,10] where number 10 is not the anti-malarial peptide. Except the number 10 each number corresponds to the anti-malarial peptide on the fusion protein, between each two numbers these is a cutting site. During the reaction the initial fusion protein will be cleaved. The cleaved fusion protein can be divided into 9 categories according to the number of cutting sites.

Example:
fusion protein with 1 cutting site:
[1,2][2,3][3,4][4,5][5,6][6,7][7,8][9,8][9,10]
fusion protein with 2 cutting sites:
[1,2,3][2,3,4][3,4,5][4,5,6][5,6,7][6,7,8][7,8,9][8,9,10]
...
fusion protein with 9 cutting sites:
[1,2,3,4,5,6,7,8,9,10]

Totally there are 45 kinds of fusion protein with cutting sites.
The product are 9 kinds of peptide:[1],[2],[3],[4],[5],[6],[7],[8],[9]

5.1.2 Enzymatic reaction process

1.reaction rate

Since the volume of fusion protein and TEVp is far small than the space between them, the difference of probability that different sizes of fusion protein contact TEVp can be ignored. Then the 45 kinds of fusion protein can be considered as the same reactant when it comes to computing reaction rate.Denote [S]tas the molecules of reactant [K]t as the molecules of TEVp at the time t. According to the Michaelis-Menten equation the reaction rate vt at time t can be computed as follows:

2.Random Cleavage

Since the cutting sites on the fusion protein are same and the assumption that the difference of cleavage possibility of different cutting sites led by the structure of fusion protein is ignored, we can use a stochastic simulation to describe the cleavage process.

Step1:

Set Δt as 1 second by the reaction rate formula and the substance at time t compute the total fusion protein vt × Δt between time t and t + Δt will be cleaved. Then for each kind of fusion protein compute the number of the cleaved between time t and t + Δt according to their proportion in the total fusion protein.

Step2:

Denote the smallest and largest number of a fusion protein as fmin and fmax. For every fusion protein cleaved between time t and t + Δt generate a random integer c where c satisfies fmin+1<=c<=fmax.

Step3:

Reduce the molecule of the fusion protein cleaved by one and add the molecule of the fusion protein or the product peptide the cleavage generates [fmin,..c-1] and [c,..fmax] by one.

Example of Step3:

To the fusion protein[3,4,5,6,7] ,generate a random integer 5 which is between 4 and 7. Then reduce the molecule of fusion protein [3,4,5,6,7] by one and add the molecule of fusion protein [3,4] and [5,6,7] by one.

Step4:

After the cleavage between t and t + Δt finished, set t as t + Δt and then go back to step1.

5.1.3 Degradation

Since the degradation information of the related substances got in literature are their half-life, the degradation rate at time t can be derived from the half-life formula as follows:



Where Nt is the molecule of substance,N0 is the initial molecule of the substance,τ is the half-time of the substance and ΔN/Δt the degradation rate when the molecule of the substance is Nt.

5.1.4 Model1 test

Base on the formula and the algorithm above we create a difference equation model. Set the total secretion per second S as 1000, the proportion of fusion protein in the secretion as 0.5,0.7,0.9, we get the expression of peptides and mRNA shows in the three figures below.


5.1.5 Analysis

That the molecules of 9 kinds of peptides change over time turns out to seem like five curves but actually there are 9 curves. At the early stage, these curves have a relatively big variant, as the reaction goes they get closer, and finally they merge into one curve. From the curve of the molecules of TEVp we can see that the quantity of TEVp has a negative correlation with r (the proportion of fusion protein in the secretion).
These phenomena can be explained as follows:

1. When the reaction rate at a low level, most fusion protein are not thoroughly cleaved. So the difference in quantity of different peptide is related to their positions at the initial fusion protein. The peptide at the end of fusion protein has the highest probability to be become product peptide while the middle is the lowest.

2. When the quantity of TEVp gets stable, the degradation rate of TEVp is equal to the secretion rate of TEVp r×S. Since the degradation rate of TEVp is constant, the quantity of TEVp has a negative correlation with r.

3. A smaller difference of the 5 curves reflects a more thorough cleavage which requires more TEVp. A larger quantity of the product peptides means a better anti-malarial effect which requires more fusion protein. The proportion of fusion protein in the secretionr is decisive to both that means there exists an optimal r.

5.1.6 Conclusion

From the fig.test1(r = 0.5), fig.test2(r = 0.7), fig.test3(r = 0.9) we can see that when the proportion of fusion protein in the secretion is low(r = 0.5) the five molecule curves of the peptide has a lesser difference. However, the quantity of product peptide is relatively low. When the proportion of fusion protein in the secretion is high(r = 0.9) the five molecule curves of the peptide have a greater difference and the quantity of TEVp and product is relatively low. When the proportion of fusion protein in the secretion is in a medium interval the result has the best performance. That means there is optimal rop where the product peptide has the highest quantity of molecules.

5.2 Part2 Intracellular

The Stochastic Model is widely used to simulate biological reactions which has some advantages compared to traditional deterministic model.
Instead of making the assumption that the parameter set is constant, stochastic model introduces Stochastic Simulation Algorithms(SSA) to simulate the intrinsic random factors of real biological systems.
In the early stage of the intracellular reaction, the reactants such as mRNA has a relatively low quantity of molecules. In that stage, the intrinsic randomness may has a great impact on the reaction result. So, to better simulate the reaction we use stochastic model and solve that model by the famous SSA called Gillespie Algorithm.

5.2.1 Reactant & Product

In the Stochastic Model the molecules of the reactants and products involved in the reaction at time t can be denote as a vector
Xt [ [X1t], [X2t], [X3t], ... , [Xkt], .. , [Xnt] ] Where [Xkt] is molecules of substance Xk at time t. In our model the involve reactants and products can be denoted as
X[ [DNAt], [mRNAfusion proteint], [mRNATEVpt], [Ifusion proteint], [ITEVpt] ]

5.2.2 Reactions

In our stochastic model the reactions involved are as follows:



All these reactions can be denote as a state-change vector with the same length of substance vector Xt.
Example:
For reaction Rk:



if the vector of all substance isXt [ [S1t], [S2t], [S3t], [S4t], [S5t] ], the vector of Rk is vk[ -c1, -c2, c3, c4, 0]. The components of the vector correspond to the reaction coefficients of the equation. The positive sign of the components represents that the corresponding substance is increased in the reaction while the negative sign represents decrease. The zeroes in the vector corresponds to the substances not involved in the reaction Rk.
So state-change vectors of the 10 reactions above are as follows:



The Efusion protein and ETEVp are not considered in the intracellular reaction model. So the reaction R9 and R10 can be denote as:



5.2.3 Propensity function

In stochastic model when the substance vector is Xt [ [X1t], [X2t], [X3t], ... , [Xkt], ... , [Xnt] ] at time t, there is a sufficiently small interval Δt, in which only one reaction happens. It is randomly selected that which reaction happens in that interval, according to the value of their propensity function. The propensity function of one reaction is related to the molecule quantity of the substances in the reaction and the reaction rate. So the propensity function value of one reaction at time t is determined by substance vectorXt and parameter set {p}. We denote the propensity function of Rk at time t as Fk( Xt, {p} ). The values of propensity functions in our intracellular stochastic model can be computed as follows:

R1 & R2: Based on the assumption that the average copy number of plasmids Nplasmid is kept as a constant, the [DNAfusion protein] and [DNATEVp] are constants during the reaction. So the propensity function of R1 and R2 F1 and F2 are determined by Kfusion protein, KTEVp.



R3 & R4: The translation process of mRNA in a cell is an enzymatic reaction where mRNA is substrate. Since the quantity of enzyme of translation is relatively stable we assume it as a constant and the propensity function of R3 and R4 are:



R5 & R6: The degradation rate of protein can be computed by our degradation formula derived from half-life formular. So the propensity function of R4 and R5 are :




Where τ1 and τ2 are the half-life of intracellular fusion protein and TEVp respectively.

R7 & R8: It can be find in literature that the degradation rate of intracellular mRNA is almost a constant so the propensity function of F7 and F8 are as follows:



R9 & R10: To simplify our model we make the assumption that the secretion rate is a constant when the molecules of intracellular protein reaches a threshold, before reaching the threshold the secretion rate is zero. So the propensity function of R9 and R10 are as follows:



5.2.4 Solving Algorithm

By using Gillespies Algorithm we solve our stochastic model and the steps are as follows:

Step1: Set initial substance vector X0[ 500 0 0 0 0 ], t = 0

Step2: Compute the propensity function Fi ( i = 1,2, ... ,10 ) record their value as a(i).

Step3: Generate a random number a1 ∈[ 0, 1 ] then compute the Δt by the formula below to define when the next reaction will happen.



Step4: Generate another random number a2 ∈[ 0, 1 ] then define which reaction Fk happens in [ t , t + Δt ] while the value of Fk satisfies the condition as follows:



Step5: Add the state-change vector vk to substance vector Xt, update t as t + Δt, then go back to Step2

5.2.5 Model2 Test

In order to validate our model, we determined the parameters from the literature[6], and then adjust our parameters in an appropriate range, setting the timespan as 3 hours, to observe the results of the model. The result shows as follows



Under our simulation, the results show that the molecules of intracellular mRNA are stable at 1500-2500, and the intracellular protein reaches a level of 105 molecules. This is consistent with the data in the literature[6]. By adjusting the strength of promoters and RBS and the degradation rate of mRNA and protein by 10%, we can see that the results do not vary significantly, which shows that our model is robust.


6. Results & Analysis

To Completely simulate the producing of peptide we need to combine these two models of Extracellular Part1 and Intracellular Part2. The key point of the connection is the secretion rate. However the secretion rate of our HasA secretion system is difficult to get. So we apply Gradient Test to give the best combination of promoters and RBS at different secretion rates S.

6.1 RBS & Promoter Sensitivity Analysis

1. The promoter strength sensitivity analysis.
Base on the initial condition we introduce a coefficient k1 which ranges from 0.1 to 1 with step 0.1. The results show as below.

Then set k1 ranging from 1 to 4 with step 1, and the results show as below

2. The RBS strength sensitivity analysis.
Base on the initial condition we introduce a coefficient k2 which ranges from 0.1 to 1 with step 0.1. The results show as below.



Then set k2 ranging from 1 to 5 with step 1, and the results show as below.



Aimed to secrete more anti-malarial peptides, we need an efficient circuit design. The possible improvment would be focus on the strength of RBS and promoters. By the sensitivity analyses above we find that the quantity of protein is similarly sensitive with the strength of RBS and promoters. Since it is easier to change the promoters in paslmids, and the efficient design can be realized by only changing the strength of promoters when the RBS is fixed.

6.2 optimal ratio

Since the secretion rate has not detected by now, we give the optimal promoter combination by Grid search of r(proportion of fusion protein in secretion) and S(total molecules secreted per second) in different levels. Then for each S we can find the interval of r where the quantity of peptide is highest. With that result, as long as we estimated the secretion upper bound of our HasA, we can find the appropriate combination of promotors for a better circuit design.

In Fig 6.2.1, we set S ranging from 100 to 1000 with setp 100 and r range from 0.1 to 0.9 with step 0.1, and then we get total molecules of 9 peptides, when cleavage reaches a steady stage.

Fig 6.2.1

From the figure above we can give the optimal r with different S. We also find that optimal r increases with S.

6.3 Intracelluar ratio with Promoter

We set two coefficients A and B to the strength of promoters of fusion protein and TEVp respectively. By setting the range of A and B from 0.1 to 1 with step 0.1 and from 1 to 10 with step 1, we give the intracellular ratio of fusion protein and TEVp in diffierent promoter strength combinations. The result of the molecule ratio of the intracellular fusion protein and TEVp is shown in Fig 6.2.1 and Fig 6.2.2



Fig 6.3.1 (left) The Ratio of intracelluar protein (fusion protein/TEVp) with different A and B combination where the ratio > 1
Fig 6.3.2 (right) The Ratio of intracelluar protein (fusion protein/TEVp) with different A and B combination where the ratio < 1



6.4 optimal promoter combination

Based on the assumption that the ratio of intracellular fusion protein and TEVp is equal to the ratio of those in secretion, we combine these two figures above, and then for every secretion level we give optimal promoter strength ratio and the possible promoter combinations from the promoter series, are shown in the Table 2.2.5

S optimal r Promoter strength proportion range Potential promoter* combinations of fusion protein & TEVp
100 0.5-0.6 1.1667-1.0526 J23111 & J23108
200 0.5-0.6 1.1667-1.0526 J23111 & J23108
300 0.5-0.6 1.1667 - 1.0526 J23111 & J23108
400 0.6-0.7 1.1765-1.5 J23101 & J23111
500 0.6-0.7 1.1765-1.5 J23101 & J23111
600 0.6-0.7 1.1765-1.5 J23101 & J23111
700 0.7-0.8 1.5-1.6667 J23108 & J23110
800 0.7-0.8 1.5-1.6667 MJ23108 & J23110
900 0.8-0.9 1.75-1.80 J23100 & J23118
1000 0.8-0.9 1.75-1.80 J23100 & J23118

Table 6.3.1 Potential promoter combinations of fusion protein & TEVp simulation result. * Promoters are selected from the constitutive promoter family(parts J23100 through J23118)[9] in igem parts.


7. Integration

According to Table 6.3.1, it can be inferred that when the RBS strength of fusion protein is equivalent to that of TEVp, the optimal ratio of the promoter strength is about 1.5. Limited by the experimental conditions, RBS has been selected and is difficult to change. Since the RBS strength of the fusion protein used in our experiment is weaker than that of TEVp, we deduce that the optimal ratio of the promoter strength under our experimental conditions should be greater than 1.5. Therefore, we select three promoter combinations(J23100 & J23111, J23100 & J23106,J23100 & J23110), of which the strength ratios are around 1.7, 2.1, 3.5. The experimental resultsshow that the performance of the ratio of 2.1 is better than the ratio of 1.7, which is consistent with our deduction. However, the performance of the ratio of 3.5 is poorer than the former two. Through analysis, we deduce that the optimal ratio is less than 3.5.


References

[1] Wang S , Dos-Santos, André L. A, Huang W , et al. Driving mosquito refractoriness to Plasmodium falciparum with engineered symbiotic bacteria[J]. Science, 2017, 357(6358):1399-1402.

[2] Gillespie D T . Gillespie Algorithm for Biochemical Reaction Simulation[J]. 2015.

[3] https://web.expasy.org/cgi-bin/protparam/protparam
[4] Kapust, R.B. & Tözsér, József & Fox, J.D. & Anderson, D.E. & Cherry, S & Copeland, T.D. & Waugh, David. (2002). Tbacco etch virus protease: Mechanism of autolysis and rational design of stable mutants with wild-type catalytic proficiency. Protein engineering. 14. 993-1000. 10.1093/protein/14.12.993.

[5] http://wolfson.huji.ac.il/expression/vector/origin.html

[6] R. Milo and R. Phillips. Cell Biology by the Numbers. First edition, 2015. ISBN9780815345374.

[7] Boada Y , Vignoni A , Pico J . Engineered control of genetic variability reveals interplay between quorum sensing, feedback regulation and biochemical noise[J]. ACS Synthetic Biology, 2017, 6(10).

[8] Zheng Y , Sriram G . Mathematical Modeling: Bridging the Gap between Concept and Realization in Synthetic Biology[J]. Journal of Biomedicine and Biotechnology, 2010, 2010(1110-7243):541609.

[9] http://parts.igem.org/Part:BBa_J23100

Download Source Code

Copyright © 2019 - NAU - CHINA