Team:ZJU-China/RPAPCR


Part 1. Modelling & Optimization

Not only in iGEM but in scientific research generally, the rational use of mathematical models can greatly reduce experimental costs. To build a simple model, we must abstract the reaction process under a set of assumptions. That abstraction can leave the model over-idealized, and this part discusses how we dealt with that problem.

A member of our wet lab team once proposed estimating the reaction time of RPA. Our initial plan was to treat RPA like PCR and build an exponential model based on the one designed by the Munich iGEM team in 2017.

$$N_t=N_0{(1+P)^t}$$

Here N_0 is the initial copy number, N_t is the copy number at time t, and P is the amplification rate per unit time. The reaction stops when the set time has elapsed, and up to that point the copy number is assumed to grow exponentially throughout. The fitted curve should therefore follow the pattern shown below.
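The exponential model above can be evaluated directly. The following is a minimal sketch, not the team's code: the function names `copies` and `time_to_fold` are hypothetical, and the second function simply inverts the formula to estimate how many time units a given fold amplification would take if growth stayed exponential.

```python
import math


def copies(n0: float, p: float, t: float) -> float:
    """Copy number after t time units under N_t = N_0 * (1 + P)^t."""
    return n0 * (1.0 + p) ** t


def time_to_fold(fold: float, p: float) -> float:
    """Time units needed for a given fold amplification.

    Obtained by solving (1 + P)^t = fold for t.
    """
    if fold <= 0 or p <= 0:
        raise ValueError("fold and p must be positive")
    return math.log(fold) / math.log(1.0 + p)


if __name__ == "__main__":
    # Illustrative values only: 100 starting copies, doubling every time unit.
    print(copies(100, 1.0, 10))
    print(time_to_fold(1024, 1.0))
```

Because the estimate assumes exponential growth for the whole reaction, it gives a lower bound on the time actually needed once growth slows down.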

However, real-time fluorescence measurements reported in the literature from the Wellcome Trust Sanger Institute[7] did not show the pattern we envisioned: RPA did not sustain exponential growth. Growth was exponential only briefly at the start, then slowed down, and finally became approximately linear. The purely exponential estimate is therefore very rough, and even the concavity of the curve is wrong. Many aspects of the overall reaction process deserve more attention than they have received so far.

This figure is taken from the published literature[7].

We then analysed the reaction process and the constraints of RPA/PCR in more depth. The reaction should in fact consist of two phases. In the first phase it follows the expected exponential amplification. In the second phase, once the reaction has progressed far enough, the amount of substrate exceeds the enzyme content and the growth becomes approximately linear. Therefore the derivative of the amplified copy number with respect to the number of rounds n should be a piecewise function, with a correction term in the second phase[6].

$$\frac{dN_{amp}}{dn}=f(n)= \begin{cases} N_{targ}(1+P)^n \ln(1+P) & n < n_1 \\ \frac{1}{2}\mathrm{CenZ}\cdot U\cdot P\cdot \left(1-\frac{\mathrm{Time}(2n-n_1)}{2e^{n+b\cdot \mathrm{Temp}}}\right) & n\geq n_1 \end{cases}$$

The critical point n1 is the round at which growth switches from the exponential trend to the approximately linear trend. With this correction, the shape of the curve is much closer to the observed one than the purely exponential curve was.
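As a consistency check, the first branch of the piecewise function is the derivative of the original exponential model. The short derivation below assumes that the round number n is treated as a continuous variable and that N_targ plays the role of the initial copy number N_0:

$$N_{amp}(n)=N_{targ}(1+P)^n=N_{targ}\,e^{n\ln(1+P)} \quad\Longrightarrow\quad \frac{dN_{amp}}{dn}=N_{targ}\,e^{n\ln(1+P)}\ln(1+P)=N_{targ}(1+P)^n\ln(1+P)$$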

The work above is based on existing literature and wiki content. For predicting the reaction time and estimating the amplification factor, this improvement appears to be sufficient. Compared with the actual reaction, however, it still leaves room for improvement: for example, the current curve does not reflect the smoothness of the natural RPA reaction process. Our team therefore developed a two-step refinement of our own, which brings the model closer to the actual reaction without adding much complexity.

Optimization 1: Smoothness Optimization

First, let us address the smoothness problem just mentioned. From the point of view of mathematical analysis, the kink at n1 arises because the model does not guarantee that the first derivative is continuous there. The most straightforward fix is to add a “buffer” around the turning point, within which the derivative changes continuously from its first-phase value to its second-phase value. The first solution we considered is a linear function on the buffer interval from n1 to n2, which changes the expression to the following form.

$$\frac{dN_{amp}}{dn}=f(n)= \begin{cases} N_{targ}(1+P)^n \ln(1+P) & n < n_1 \\ \\ N_{targ}(1+P)^{n_1} \ln(1+P)\frac{n_2 - n}{n_2 - n_1} + \frac{1}{2}\mathrm{CenZ}\cdot U\cdot P\cdot \left(1-\frac{\mathrm{Time}(2n_2-n_1)}{2e^{n_2+b\cdot \mathrm{Temp}}}\right)\frac{n - n_1}{n_2 - n_1} & n_1\leq n < n_2 \\ \\ \frac{1}{2}\mathrm{CenZ}\cdot U\cdot P\cdot \left(1-\frac{\mathrm{Time}(2n-n_1)}{2e^{n+b\cdot \mathrm{Temp}}}\right) & n \geq n_2 \\ \end{cases}$$

Correspondingly, the reaction kinetic curve is corrected to the red curve shown in the figure below. The curve is considerably smoother than under the previous scheme.
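The linear buffer can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the team's implementation: the names `exp_rate` and `buffered_rate` are hypothetical, and because the text gives no values for CenZ, U, Time, b, or Temp, the second-phase rate is passed in as an arbitrary function (the demo uses a made-up stand-in).

```python
import math
from typing import Callable

Rate = Callable[[float], float]


def exp_rate(n_targ: float, p: float) -> Rate:
    """Phase-1 rate: N_targ * (1 + P)^n * ln(1 + P)."""
    return lambda n: n_targ * (1.0 + p) ** n * math.log(1.0 + p)


def buffered_rate(phase1: Rate, phase2: Rate, n1: float, n2: float) -> Rate:
    """Piecewise rate with a linear buffer on the interval [n1, n2)."""
    if not n1 < n2:
        raise ValueError("n1 must be smaller than n2")
    # Endpoint values are held fixed inside the buffer, as in the formula.
    left, right = phase1(n1), phase2(n2)

    def rate(n: float) -> float:
        if n < n1:
            return phase1(n)
        if n < n2:
            w = (n - n1) / (n2 - n1)  # fraction of the buffer already crossed
            return left * (1.0 - w) + right * w
        return phase2(n)

    return rate


if __name__ == "__main__":
    phase1 = exp_rate(n_targ=1.0, p=1.0)
    # Hypothetical stand-in for the enzyme-limited second branch.
    phase2 = lambda n: 300.0 + 5.0 * n
    f = buffered_rate(phase1, phase2, n1=8.0, n2=10.0)
    for n in (7.999, 8.0, 9.0, 9.999, 10.0):
        print(n, round(f(n), 3))
```

The rate takes the same value on both sides of n1 and of n2, so the rate itself is continuous. Its slope still jumps at both ends of the buffer.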

Of course, this correction is not perfect: for example, it does not guarantee continuity of the second derivative. A better solution is to use the explicit form of the logistic function for the transition, which changes the equation to the form shown below. Because the overall dynamics do not differ visibly from the linear version, we do not plot the two curves separately.

$$\frac{dN_{amp}}{dn}=f(n)= \begin{cases} N_{targ}(1+P)^n \ln(1+P) & n < n_1 \\ \\ N^{(1)}(n_1)\cdot\frac{N^{(1)}(n_2)-N^{(1)}(n_1)}{1+e^{-a\frac{2n-(n_1+n_2)}{2(n_2-n_1)}}}+N^{(1)}(n_2)\cdot\left(1-\frac{N^{(1)}(n_2)-N^{(1)}(n_1)}{1+e^{-a\frac{2n-(n_1+n_2)}{2(n_2-n_1)}}}\right) & n_1\leq n < n_2 \\ \\ \frac{1}{2}\mathrm{CenZ}\cdot U\cdot P\cdot \left(1-\frac{\mathrm{Time}(2n-n_1)}{2e^{n+b\cdot \mathrm{Temp}}}\right) & n \geq n_2 \\ \end{cases}$$
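The idea of a logistic transition can be illustrated with a short Python sketch. It is not a literal transcription of the expression above: it assumes a plain sigmoid weight between 0 and 1, built from the same exponent, and uses it to blend the two endpoint rates. The function names and the steepness value `a = 20` are hypothetical.

```python
import math


def logistic_weight(n: float, n1: float, n2: float, a: float) -> float:
    """Sigmoid weight centred on the midpoint of the buffer [n1, n2].

    The exponent a * (2n - (n1 + n2)) / (2 * (n2 - n1)) runs from -a/2 at n1
    to +a/2 at n2, so the weight rises from near 0 to near 1.
    """
    x = a * (2.0 * n - (n1 + n2)) / (2.0 * (n2 - n1))
    return 1.0 / (1.0 + math.exp(-x))


def logistic_blend(left: float, right: float, n: float,
                   n1: float, n2: float, a: float) -> float:
    """Blend the phase-1 endpoint rate (left) into the phase-2 one (right)."""
    w = logistic_weight(n, n1, n2, a)
    return left * (1.0 - w) + right * w


if __name__ == "__main__":
    # Illustrative endpoint rates and buffer limits only.
    for n in (8.0, 8.5, 9.0, 9.5, 10.0):
        print(n, round(logistic_blend(177.4, 350.0, n, 8.0, 10.0, a=20.0), 3))
```

A sigmoid never reaches exactly 0 or 1, so the blend only approaches the endpoint rates at n1 and n2. A larger `a` shrinks the residual jump but makes the transition steeper around the midpoint.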

All of the above addresses the problem from a mathematical point of view, but does the correction have a corresponding biological meaning? When modelling biological reactions, this question should not be avoided. After all, modelling is our tool for describing biological processes so that we can better understand how the reaction proceeds.

For the correction above, the key purpose of smoothing is to downplay the concept of a "cycle" in RPA. RPA runs at a constant temperature, so its copy number changes continuously over time, unlike PCR, where the copy number changes in discrete cycles. Near the turning point, the molecules do not all switch from the first phase to the second at the same moment; in reality they make the transition gradually. The "linear connection" therefore represents the changing proportion of molecules that have entered the second reaction phase during the buffer interval, a proportion that is assumed to change linearly over time. The logistic function has the same interpretation in principle. In this part we have thus completed a model optimization that is consistent with the biology, using both mathematical and biological reasoning.

Optimization 2: Valid Copy Number

Besides smoothness, the other optimization we made is to introduce the concept of valid copy number. Because the RPA reaction mechanism is different, this optimization applies only to PCR.

For a template that carries extra sequence at both ends of the target sequence, the first round of amplification does not produce the target sequence itself, only a fragment that still carries additional sequence. The target sequence is obtained only after a further round. The reaction process is shown in the figure below.

The exponential growth phase of the modified kinetic equation is changed to:

$$\frac{dN_{amp}}{dn}=N_{targ}(1+P_1)^n \ln(1+P_1)+N_{targ}(1+P_2)^n \ln(1+P_2)$$

Here P1 and P2 are the characteristic roots obtained by applying elimination to the discrete recurrence that describes the reaction above. The corrected plot below shows that the number of valid fragments is lower than in the ideal case.
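The gap between valid and ideal copy numbers can be illustrated with a textbook strand-counting recurrence for PCR. This sketch is not the team's recurrence and does not reproduce P1 and P2. It assumes that copying an original strand yields a "long" strand that still carries extra sequence at one end, that copying a long or a target strand yields a target-length strand, and that every strand is copied with the same per-cycle efficiency `p`. The function name `strand_counts` is hypothetical.

```python
from typing import List, Tuple


def strand_counts(templates: float, p: float,
                  cycles: int) -> List[Tuple[float, float, float]]:
    """Return (original, long, target) strand counts after each PCR cycle.

    Assumed recurrence, with per-cycle efficiency p:
      long[n+1]   = long[n]   + p * original
      target[n+1] = target[n] + p * (long[n] + target[n])
    """
    original = 2.0 * templates  # two strands per double-stranded template
    long_, target = 0.0, 0.0
    history = [(original, long_, target)]
    for _ in range(cycles):
        new_long = p * original            # copies of original strands
        new_target = p * (long_ + target)  # copies of long and target strands
        long_ += new_long
        target += new_target
        history.append((original, long_, target))
    return history


if __name__ == "__main__":
    # One template, perfect efficiency: compare valid strands with all strands.
    for n, (orig, lng, tgt) in enumerate(strand_counts(1, 1.0, 6)):
        print(n, "valid:", tgt, "all strands:", orig + lng + tgt)
```

Under these assumptions no target-length strand exists after the first cycle, and the valid count stays below the total strand count in every later cycle.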

Part 2. Contribution & Proposal

In this modelling module we have shown how a model that once appeared in iGEM can be optimized step by step. Over the years of iGEM, many excellent teams have designed many wonderful models. Although each team designs models around its own project, many of these models are highly portable. The PCR/RPA model in this example, for instance, is of value to most teams.

Through this work we would like to propose that a unified mechanism for managing and evolving the models designed in iGEM be established, perhaps with a corresponding relational database to improve their reusability, so that future teams can study and improve their mathematical models more conveniently on the basis of their predecessors' work.