Team:SUSTech Shenzhen/Predictive and Control System

Load Tools

Back to Top



Predictive and Control System



Derivation of ODEs characterizing gene expression process

Figure 1. Light-induced gene expression process
Figure 1. Light-induced gene expression process


To quantitatively describe the concentration of molecular components in a single cell underlying process in light-inducible gene expression, we developed 5 ordinary differential equations based on enzyme kinetics model and relevant biochemical reactions.


Figure 2. Schematic representation of dimerization and binding of GAVPO
Figure 2. Schematic representation of dimerization and binding of GAVPO


After light activation, the blue light-dependent transactivator GAVPO can homodimerize to interact with UASG elements (5xUASG). This process can be described with Michaelis–Menten kinetics since the formation of GAVPO(dimerized)-5xUAS complex is saturating for the high concentration of monomeric GAVPO. And based on the Quasi-steady state, we assumed that the total amount of GAVPO is constant. Thus, we replace the concentration of monomeric GAVPO with GAVPO(total) - GAVPO(dimerized) - GAVPO(5xUAS). The GAVPO expression is under the control of ubiquitous CAG promoter, and is considered to remain constant concentration. And the concentration of GAVPO(5xUAS) complex depends on the binding affinity of GAVPO(dimerized) to the promoter. By adding a linear degradation rate kdeg, DX*, we obtain the first ODE :



Depending on the light condition GAVPO can activate the target promoter. We describe the activation of the mRNA with Monod kinetics:



The basal production rate Kbasal,mRNA is due to the leakiness of the promoter. The second term describes the activation process depending on the recruitment of transcriptional initiation complex by bound transactivator GAVPO. Therefore, another Michaelis–Menten kinetics is used here. Ktc,mRNA is the maximum transcription rate and Kmtc,mRNA is the concentration of GAVPO(5xUAS) on reaching half of the maximum transcription rate. Including a linear term to model mRNA degradation we obtain:



However, there is a significant decrease in mRNA level observed in experimental data which cannot be explained by the above mechanism. Thus we hypothesize that it may due to the phototoxicity which attenuates the transcription rate by affecting the metabolic activity of cells. And we design and perform experiments to test this hypothesis( Results - for more details ). Therefore, we set up another pathway ‘light-dependent effects on cell’ to describe the decrease of transcription rate caused by phototoxicity.


Figure 3. Biological network of light-controlled gene expression (Left: Light-dependent effects on cell ; Right: Light-induced gene expression)
Figure 3. Biological network of light-controlled gene expression (Left: Light-dependent effects on cell ; Right: Light-induced gene expression)


We introduce an equation for the metabolic activity(MA) of engineered cells by applying the one-phase exponential model according to the experimental data:



This means all parameters which are related to the transcription rate should be modified with the metabolic activity. Thus we obtain the final ODE of mRNA:



And the output on the protein level is the cytokine(IL10). The dynamics of the measured IL10 concentration showed a time delay relative to the corresponding kinetics of the mRNA concentration. To account for this effect, we modeled protein synthesis in two steps:


mRNA → ProteinPRE → Protein


ProteinPRE can be explained as the protein itself which first has to pass a preliminary compartment, e.g. the endoplasmic reticulum (ER) before it is available for measurements. Its production is proportional to the mRNA concentration. Also, the protein amount which is leaving the preliminary compartment can be described by the term Kout[Ppre], thus leading to equations:



The gain in the protein concentration in the cell culture medium can be influenced by the amount of [Ppre] leaving in the ER. Also, including a linear degradation term we can obtain the concentration of cytokine:



Finally, this leads us to the following system of coupled ordinary differential equations:


Parameter fitting


To simulate the biology system by solving ODE equations, parameter fitting is a crucial part.


The 5 ordinary equations in our system represent 4 discrete steps of the biological process, which are dimers, metabolic activity, mRNA, and protein. Our task is to solve for the functions and derive the last function – protein. 15 parameters are involved in this system, some of which are phenomenological. However, they are playing different roles to influence the biological process. By comparing the function values we solve in our ODE system with our experimental data, we can obtain the final parameters that govern the trend of function. It is important to specify that our data are the amount of cck8 and protein changing with time. The time range is 0-50 h, divided into 10 intervals. (5 hours for each)


Fig:注释
Figure 1:The five-equation system


In order to find the most proper parameter array, the algorithm we choose should be robust enough. Genetic Algorithm can avoid local minima for the fitting, and then find the global optimized parameters. By iterating several hundred generations, we identified a parameter set that fits the best among the entire parameter groups to the experimental data.


Genetic Algorithm

Genetic Algorithm (GA) is a global optimization algorithm, which can avoid the local minima of the fitting.

GA can mimic the process of natural selection out of large parameter groups, which are called “populations”. The change of population will happen within the maximum generations we set initially. Finally, a binary conversion should be considered in the algorithm, where the data is contained inside an array called “chromosome”. By calculating the fitness – the inverse of error, the algorithm can finally find the best population, which is the final parameter group we choose.

We developed an inhouse code to do the GA process. Parameters are stored as binary arrays in certain groups called “population”, which is actually a concept in ecology. Then the selection of the populations in our system will take place. This process will select the best populations and do “mutation” and “crossover” on these groups. New populations will be then put into our fitness calculation code. If the fitness worked out exceeds our setting, the related population will become the final parameter group we choose. It is worth mentioning that the process will take place for about 100 ~ 300 generations.


Fig:注释
Figure 2: The process of Genetic Algorithm


Given a group of protein concentration data, we can find parameters by applying GA to our ODE system. After generations of runs, we get the parameter and plot the function we find, which will be then compared to the experimental data.


It is worth mentioning that we did the parameter fitting “layer by layer”. For example, the governing ODE of cck8 is extracted out to do the fitting process independently. It shows that metabolic activity is a crucial part of the biology system we study. In our ODE system, variable v represents the concentration of metabolic activity, which will be then fitted by an independent equation. We have shown the derivative steps in the last part. We now get an explicit expression:


Fig:注释
Figure 3: Explicit expression of metabolic activity


This expression erases two of our parameters, which are k7and k9.

Now we got a four-variable ODE system, in which v can be given as the above expression:


Fig:注释
Figure 4: The four-equation system


Fig:注释
Figure 5: Dynamic change of DX


Fig:注释
Figure 6: Dynamic change of mRNA


Fig:注释
Figure 7: Dynamic change of Protein-Pre


The function trend we got in this process indicates that our ODE system does work. We then will be confident to do fine adjustments to our parameters. GA gives us the initial parameter group, which can primarily help us get the trend of the function. However, a sensitivity test should be applied to the parameters we got. Thus, an order of influence can be given, through which we can get the final parameters of the ODE system.

Sensitivity Test

The best parameter we got performs well in our model. Now we want to do some sensitivity test to determine the most important parameter in our model, which will give us more insight into the synthetic process. Sensitivity test can be considered both before and after parameter fitting. Before fitting, we can rank the parameters in order. Thus we can get the most proper initial parameters. After fitting, we can use the sensitivity test to know what the most “important” parameters are in the biological system. We used a global sensitivity analysis toolbox in MATLAB called Generic Global Sensitivity Analysis (GSA) to do the test. Suppose that the five functions in our ODE system have become stable, we set the dY/dt = 0:


In the ODE system sensitivity test, we need to fix the function value and change time t for different values. Then we substitute every variable by q, the last dependent variable, which represents the amount of protein. The function we got is the one that will serve as the output function in the calculation of sensitivity.


Fig:注释
Figure 8: Protein expressed as a function of k and t


For different given t, we can obtain different sensitivity output. What we care about is the order of influence of these parameters. It is obvious that k12,k13,k14,k15are the most influential parameters in the test since they are exactly in the equations that shows how protein changes. Thus, we plot the variation of sensitivity offset from t=0 for the remaining parameters: (k1,k2,k3,k4,k5,k6,k8,k10,k11)


Fig:注释
Figure 9: Variation of sensitivity offset from t=0


It can be noticed from the figure that k8 is a crucial parameter in this system, which can dramatically influence the amount of protein. K8 is one of the parameters in the ordinary differential equation governing the amount of mRNA. It shows that we can do the parameter fitting process to the second equation in our system independently.


After getting the initial parameters by the GA method and the order of the sensitivity of the parameters, we can determine the initial value of parameters that will be then put into “fmincon” method to get the final value of parameters. “Fmincon” is a parameter fitting method that requires a group of initial parameters. Compared to genetic algorism, “fmincon” cannot avoid local minima. However, we can get the final results in a minute, which is much lower than that of genetic algorism. (GA fitting takes several hours) We plot the final function with the parameters we finally got together with our experimental data. Here is our final result, which is accurate enough to fit the real case:


Fig:注释
Figure 10: Protein data & function

Simulation and Control Test

Based on the model of secretion we established on the biological process, we hope to accurately regulate the secretion. To achieve this goal, we divide our controlling part into four steps. 


At the very beginning, we need to check the basic shape of the simulation in our models. As shown in Fig 1, we have generated a smooth curve which indicates the secretion induced by a continuous illumination.


Figure 1. Continuous illumination


The simulation curve fits in with experiment data. However, this simulation curve begins to decrease after reaching the peak, which suggests that the curve might not be a plateau-like "S"-shape curve. Also, since there is no plateau in secretion, we have to create a control system to make the secretion amount fluctuate within a certain range.


To explore the vibration mode and establish the basic rule for controlling, we decide to try a manual control at first. Cells were illuminated for 20 hours and 15 hours respectively to detect the amount of protein and mRNA. And Fig 2 and Fig 3 were outputted from our simulation algorithm.


Figure 2. Simulation of switch-off kinetics of interleukin 10 mRNA


Figure 3. Simulation of switch-off kinetics of interleukin 10


To verify the results of simulation, We perform switch-off experiment and further compare the data between simulation and experiment curves as Fig 4 and Fig 5 shown.


Fig:注释
Figure 4. Switch-off kinetics of interleukin 10 mRNA


Fig:注释
Figure 5. Switch-off kinetics of interleukin 10


These two curves are quite similar both in peak value and trend. However, this method is hard for computer to find a precise control sequence.

Since the control mode can be translated into a series of vibration within a certain range, we design a better strategy to calculate the control sequence automatically. Because our goal is to keep the secretion amount stay in a certain range, the dichotomy method comes into our mind. Dichotomy can be used to calculate the numerical solution according to constrained conditions. The control system can also be used to control the secretion amount. And we set the control domain as (a,b), which is completely below the maximum value of secretion. Then, we apply the dichotomy method to generate the control sequences. As Fig 6 shows, our control algorithm is effective and robust in adjusting both the amplititude and trendency of secretion curve.


Figure 6. Simulation of different ranges


To further test the accuracy of our model, we design another experiment as shown in Fig 7.

Figure 7. Regulation test


It can be observed that the secretion amount was exactly in the control domain. Besides, these data fit in well with experiment data. However, there are still some problems. In the first 50 hours, the data significantly deviates from our simulation curve. Through analyzing the parameters of the curve, we found the increase rate of secretion should be a little bit slower in the first rising period, which means our model needs optimizing by modifying the minimum step in our control system or adjusting the basic mathematics function in our predicitive system.