Revision as of 12:59, 19 October 2019
Modeling
We applied modeling and quantitative methods in all aspects of our project, from the micro scale to the macro scale, and from mechanisms to applications.
First, we designed a model to predict how the bi-regulation switch (arabinose and IPTG) influences the concentration of the effective dCas9-sgRNA complex in a bacterial cell. Second, we designed a model to explain how our system can slow down bacterial replication. Third, we designed a model to explain how our system can control plasmid copy number. Fourth, we modeled the elongation of our cells. Fifth, we applied a model to explain how our bacteria can improve the productivity of some specific bio-products. Sixth, we designed a model to explain how our system, coupled with a quorum sensing system, can automatically regulate the population. Finally, in line with our future plan, we designed a model to discuss our system's potential for reducing intracellular gene expression noise.
Our team emphasizes the importance of quantitative methods and mathematical description of our project.
Regulatory Part
Introduction
In this section, we apply a deterministic model to predict the behavior of a regulatory part in our project.
Originally, we did not include the regulation of sgRNA. In that design, the sgRNA is constitutively transcribed (by a J23119 promoter, see Project Description). However, we found reports that dCas9 itself can change the cell's growth rate, so we have to show that our system works in the expected way. Additionally, we know that promoters, even when regulated by an inducer or an inhibitor, may show "leaky expression", hence we need an additional means of gaining better control of the system. Both of these motivated us to design a second switch for sgRNA.
Our regulatory part includes an IPTG-activated pLac-promoted sgRNA and an Arabinose-activated pBAD-promoted dCas9. The mechanics are described in Pic 1. The formation of the effective dCas9-sgRNA complex depends on both IPTG and Arabinose inducers.
Mono-regulation
We started from the simpler case, in which there is only one regulated variable, e.g. dCas9 regulated by arabinose. Since the expression of dCas9 is driven by the pBAD promoter and regulated by arabinose, it is reasonable to describe this process by the Hill equation
in which the quantities are the production rate of dCas9 mRNA, the Hill coefficient of the pBAD promoter (indicating the sharpness of induction), and the half-activation constant, i.e. the arabinose concentration at which the production rate reaches half of its maximum.
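The Hill-type induction described above can be sketched as a small Python function. This is only an illustration, not the team's code: the maximum rate of 0.0011 nM·min⁻¹ is the value quoted in this document for dCas9 mRNA, while the Hill coefficient and the half-activation concentration are placeholder values chosen for demonstration, and the name `hill_production_rate` is hypothetical.

```python
# Value quoted in this document for the maximum dCas9 mRNA expression rate.
MAX_MRNA_RATE = 0.0011  # nM / min

# Placeholder values (assumptions, not taken from the source).
ASSUMED_HALF_CONC = 1.0   # inducer concentration giving half-maximal rate
ASSUMED_HILL_COEFF = 2.0  # sharpness of induction


def hill_production_rate(inducer,
                         half_conc=ASSUMED_HALF_CONC,
                         hill_coeff=ASSUMED_HILL_COEFF,
                         max_rate=MAX_MRNA_RATE):
    """Production rate under a Hill-type activating promoter."""
    if inducer <= 0.0:
        return 0.0
    ratio = (inducer / half_conc) ** hill_coeff
    return max_rate * ratio / (1.0 + ratio)


if __name__ == "__main__":
    for arabinose in (0.1, 1.0, 10.0):
        print(arabinose, hill_production_rate(arabinose))
```

At an inducer level equal to the half-activation concentration the function returns exactly half of the maximum rate, which is the defining property of the half-activation constant.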
Bi-regulation

We introduced regulation of sgRNA expression by changing the promoter of sgRNA from J23119 to T7. The activation of this T7 promoter is in turn regulated by a pTac promoter. Therefore, we obtain a second Hill function for this promoter. Besides the transcription of sgRNA and dCas9 mRNA, we also consider the translation of dCas9, the binding of dCas9 to sgRNA, and the degradation of all the species mentioned above. We describe this system by a system of ordinary differential equations:
whose state variables are the concentrations of dCas9 mRNA, dCas9 protein, sgRNA and the dCas9-sgRNA complex. The parameters are taken from both published work and our own experiments and are listed as follows:
| Symbol | Explanation | Value | Source |
|---|---|---|---|
| | Hill coefficient of pBAD promoter | | |
| | Half-activation arabinose concentration of pBAD promoter | | |
| | Maximum dCas9 mRNA expression rate | 0.0011 nM·min⁻¹ | |
| | Hill coefficient of pTac promoter | | |
| | Half-activation IPTG concentration of pTac promoter | | |
| | Maximum sgRNA expression rate | 0.0011 nM·min⁻¹ | |
| | dCas9 protein synthesis rate from dCas9 mRNA | 0.0057 protein·transcript⁻¹·min⁻¹ | |
| | dCas9 degradation rate | 5.6408×10⁻⁴ min⁻¹ | |
| | Rate of dimerization of dCas9 and sgRNA | | |
| | Rate of dissociation of dCas9-sgRNA complex | | |
| | sgRNA degradation rate | | |
| | dCas9-sgRNA degradation rate | 5.6408×10⁻⁴ min⁻¹ | |
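The four-species system described above (dCas9 mRNA, dCas9, sgRNA, and the complex) can be sketched with a simple forward-Euler integration. The exact equations are not reproduced in this text, so the form below is an assumption built from the listed processes: Hill-type transcription, translation, reversible binding, and first-order degradation. The four values marked "from the table" are quoted from the parameter table; every other number, including the mRNA degradation rate, is a placeholder chosen only to make the sketch run.

```python
# Values quoted from the parameter table (units: nM and minutes).
V_MRNA = 0.0011        # maximum dCas9 mRNA expression rate
V_SGRNA = 0.0011       # maximum sgRNA expression rate
K_TL = 0.0057          # dCas9 synthesis rate per transcript
D_CAS9 = 5.6408e-4     # dCas9 degradation rate
D_COMPLEX = 5.6408e-4  # dCas9-sgRNA degradation rate

# Placeholder values (assumptions, NOT from the source).
N_BAD, K_BAD = 2.0, 1.0   # Hill coefficient / half-activation, pBAD
N_TAC, K_TAC = 2.0, 1.0   # Hill coefficient / half-activation, pTac
D_MRNA = 0.1              # dCas9 mRNA degradation rate
D_SGRNA = 0.1             # sgRNA degradation rate
K_ON, K_OFF = 1.0, 0.01   # binding / dissociation rates


def hill(x, half_conc, coeff):
    """Fractional promoter activation (0..1)."""
    ratio = (x / half_conc) ** coeff
    return ratio / (1.0 + ratio)


def simulate_complex(arabinose, iptg, t_end=20000.0, dt=0.5):
    """Integrate the assumed ODE system and return the final complex level."""
    mrna = cas9 = sgrna = cplx = 0.0
    prod_mrna = V_MRNA * hill(arabinose, K_BAD, N_BAD)
    prod_sgrna = V_SGRNA * hill(iptg, K_TAC, N_TAC)
    for _ in range(int(t_end / dt)):
        net_binding = K_ON * cas9 * sgrna - K_OFF * cplx
        d_mrna = prod_mrna - D_MRNA * mrna
        d_cas9 = K_TL * mrna - D_CAS9 * cas9 - net_binding
        d_sgrna = prod_sgrna - D_SGRNA * sgrna - net_binding
        d_cplx = net_binding - D_COMPLEX * cplx
        mrna += dt * d_mrna
        cas9 += dt * d_cas9
        sgrna += dt * d_sgrna
        cplx += dt * d_cplx
    return cplx


if __name__ == "__main__":
    for ara in (0.01, 100.0):
        for iptg in (0.01, 100.0):
            print(ara, iptg, simulate_complex(ara, iptg))
```

Under these assumed parameters the complex accumulates appreciably only when both inducers are high, which is the AND-like behavior the section describes. With different placeholder values the contrast between the four conditions would change in size, so the numbers themselves should not be read as predictions.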
To study the steady state of the system, we solve the equations under the condition that no species' concentration varies with time (i.e. the left-hand sides of the ODE system equal zero).
Since the effective molecule in our system is the dCas9-sgRNA complex, we are interested in its steady-state concentration.
From the system above, we obtain
where
This quadratic equation has two positive solutions. However, only the smaller solution is reasonable. Consider the limit in which the dCas9 mRNA degradation rate becomes very large: the smaller solution tends to 0, while the larger one remains a positive real number.
Biologically, when this degradation rate is extremely large, mRNA degrades so quickly that there is almost no dCas9 mRNA, hence no dCas9, and consequently no dCas9-sgRNA complex, so the zero solution is the one corresponding to the real biological situation.
Thus, we get the final solution
where
This equation gives the relation between the concentration of dCas9-sgRNA and the amounts of inducers we add. Applying the given parameters, we can show this relationship as a 3D surface plot. Intuitively, the expression output follows AND logic: the concentration of dCas9-sgRNA is appreciable only when the concentrations of both IPTG and arabinose are high enough.

The relationship between the expression of dCas9 and the cell's growth rate
In this section we show that our system delays bacterial replication. Specifically, the average duration of a bacterium's cell cycle depends linearly on the dCas9-sgRNA copy number (concentration). This is deduced from a coarse-grained model that includes the binding of dCas9 and DnaA to the OriC site and the cell's replication. Here we assume that
- The binding of both dCas9 and DnaA is reversible
- Genome DNA replication can start only when the OriC is bound to DnaA
A continuous-time Markov chain is built, with the hypothesis that replication initiation follows a Poisson process (i.e. the waiting times are exponentially distributed). A cell has three possible states at any given time
- OriC is bound to a dCas9 molecule
- OriC is naked
- The bacterium starts replication
A cell can move from state 1 to state 2, from state 2 to state 1, and from state 2 to state 3. An infinitesimal transition matrix is written to describe the transitions among these states:
where
- The binding rate is the frequency at which an exposed OriC is bound by a dCas9-sgRNA molecule (state 2 → state 1). This value is proportional to the concentration of dCas9-sgRNA.
- The unbinding rate is the frequency at which a dCas9-sgRNA molecule decouples from OriC (state 1 → state 2).
- The initiation rate is the frequency at which an exposed OriC is bound by replication initiators and replication starts (state 2 → state 3).

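Suppose that initially the bacterium's OriC is not bound by dCas9-sgRNA (i.e. the initial state distribution is (0, 1, 0)). We calculate the distribution function of the waiting time before replication. To do this, we only need to compute the state distribution at a given time, obtained by propagating the initial distribution with the matrix exponential of the transition matrix;
the third element of this distribution is the probability we want.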
It is not hard to get an explicit expression of :
where
The average waiting time is:
This average waiting time is linearly related to the concentration of dCas9-sgRNA.
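The linear dependence can be checked with a short sketch of the three-state chain. The closed form used here follows from a first-step argument on the chain as described (naked OriC either gets bound by dCas9-sgRNA or initiates replication; a bound OriC can only unbind): mean waiting time = (1 + binding rate / unbinding rate) / initiation rate. All rate values and function names below are hypothetical and serve only to illustrate the structure; they are not fitted parameters.

```python
import random


def mean_waiting_time(complex_conc, k_bind, k_unbind, k_init):
    """Analytic mean time from naked OriC (state 2) to replication (state 3).

    The binding rate is assumed proportional to the complex concentration.
    """
    bind_rate = k_bind * complex_conc
    return (1.0 + bind_rate / k_unbind) / k_init


def sample_waiting_time(complex_conc, k_bind, k_unbind, k_init, rng):
    """Draw one waiting time by simulating the continuous-time chain."""
    bind_rate = k_bind * complex_conc
    elapsed = 0.0
    bound = False  # start with a naked OriC
    while True:
        if bound:
            # State 1: the only possible event is unbinding.
            elapsed += rng.expovariate(k_unbind)
            bound = False
        else:
            # State 2: binding competes with replication initiation.
            total = bind_rate + k_init
            elapsed += rng.expovariate(total)
            if rng.random() < k_init / total:
                return elapsed
            bound = True


def estimate_mean(complex_conc, k_bind, k_unbind, k_init,
                  n_samples=20000, seed=0):
    """Monte Carlo estimate of the mean waiting time."""
    rng = random.Random(seed)
    total = sum(sample_waiting_time(complex_conc, k_bind, k_unbind, k_init, rng)
                for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    for conc in (0.0, 1.0, 2.0):
        print(conc,
              mean_waiting_time(conc, 1.0, 1.0, 1.0),
              estimate_mean(conc, 1.0, 1.0, 1.0))
```

The simulated means should track the analytic line within sampling error, and the slope of that line is the ratio of the binding constant to the product of the unbinding and initiation rates.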
Plasmid Copy Number Hacking
Introduction
In this section we discuss how our system can control the plasmid copy number (see Project Description). Typically, plasmids replicate during the cell cycle. After a cell divides, the plasmids are distributed equally between the two daughter cells. Here we discuss the behavior of a cell line over time.



In an unbiased cell division, the plasmids in a cell are distributed equally between the two daughter cells. Therefore, when tracing a cell line, we see that a cell loses half of its plasmids at each division. It is reasonable to model this as a random process in which each plasmid has probability 1/2 of "disappearing" (actually entering the sibling cell) and probability 1/2 of remaining in the cell line we are interested in.
Besides being partitioned, a plasmid may also replicate and bind to or unbind from dCas9. These processes, together with the "1/2 disappearance" at division, make up all the plasmid behaviors we are interested in. When bound to dCas9, a plasmid cannot replicate. Here we introduce two parameters: the binding frequency, at which a plasmid binds to a dCas9 molecule, and the unbinding frequency, at which a plasmid and dCas9 decouple.
Another factor taken into account is that plasmid replication is also regulated by the intracellular environment, which ensures that the plasmid number does not increase without limit. Hence we assume that during a cell cycle, the growth of the cell's plasmid number is a logistic process. Precisely, the frequency at which a plasmid replicates is
determined by the numbers of plasmids unbound and bound to dCas9, the maximum number of plasmids allowed in a single cell, and a constant replication rate.
Since the cell cycle length is relatively invariant in a stable environment, we use a fixed cell cycle length and the corresponding cell replication frequency. At each division time, the cell's plasmid number undergoes a rapid change
where
Putting all these together, we can perform a modified Gillespie simulation over time. We are especially interested in the binding frequency, which varies with both the affinity of the binding box and the concentration of dCas9. Therefore, we run the simulation with different values of this parameter while keeping the other parameters unchanged. The simulation results show that as the binding frequency increases, the fraction of plasmids bound to dCas9 tends to be greater and the cell line runs out of plasmids in fewer generations (see pic plasmid_time_series, plasmid_clearance_generation.png).


Gene Expression Noise Control
In this section we build a model to illustrate how our system can control gene expression noise in a cell. Gene expression noise is explained as the fluctuation of "very low copy numbers of many components" leading to "large amounts of cell-cell variation observed in isogenic populations". This noise can be either intrinsic or extrinsic. Extrinsic noise includes most environmental factors, such as nutrients and antibiotics, gene expression regulation by inhibitors or enhancers, and also the gene copy number in the cell.
In a fast-growing cell, the copy number of OriC can exceed ten. In such cases, genes near OriC may be expressed more than genes far from OriC. Ting Lu deduced from the Helmstetter-Cooper model a relation between a gene's location relative to OriC and its copy number:
which relates the gene's copy number to the cell's growth rate and the gene's location relative to OriC; the other parameters are constants. However, both Helmstetter and Cooper's model and Ting Lu's model are deterministic and cannot be used to analyze random factors. Daniel L. Jones et al. included gene copy number variance as a source of noise, but their model was coarse-grained and considered only genes with a single replication fork. To fully expose the "noisy" nature of intracellular gene copy number variance, we describe the genome's replication as a stochastic process. We introduce two parameters: the frequency at which a single OriC site forms a new replication fork, and the time it takes for the replication complex to replicate the whole genome. We deduce that, as a function of the gene's location relative to OriC, this process is a Yule-Furry process.
In our system, genome or plasmid DNA replication is blocked by dCas9. This can prevent the genome from forming new replication forks. Specifically, our system decreases the frequency of replication fork formation.

Hypothesis
- Each cell contains only one genome, which may contain multiple replication forks, so the copy number of each gene in the genome may differ. The cell divides immediately after genome replication. (Our experimental results show that some cells actually contain more than one genome. These cells are very long, and the genomes in each cell are distant from each other. We treat this kind of cell as a chain of multiple cells.)
- The cell containing the studied genome has been replicating exponentially for several generations in a stable environment, so the distribution of replication forks on the genome is steady (i.e. sampled from a fixed distribution).
- Replication forks form only at OriC, and the formations of replication forks are independent of each other.
- The frequency at which a single OriC site forms a new replication fork and the time it takes for the replication complex to replicate the whole genome are constant.
- Our system works by decreasing the frequency of replication fork formation. This process is a fast first-order process.
Analysis of gene copy number and its fluctuation
To study the copy number of each gene in a given branched genome, we first need to know how this genome and its replication forks are formed. Consider a newly formed genome. One genome replication time ago, the replication fork that produced the current genome had just formed, and the OriC number of this branch was 1. During the following period, DNA replication complexes bind to OriC to form new replication forks, each creating a new OriC for this branch. According to hypotheses 3 and 4, the increase of the OriC copy number over this period is a Yule-Furry process (a stochastic counterpart to the deterministic exponential growth model) whose parameter is the fork formation frequency. According to the theory of the Yule-Furry process, the probability that the number of OriC takes a given value at a given time is:
(Note: according to hypothesis 1, the initial number of OriC is 1, therefore we use the formula for an initial count of 1.)
Furthermore, the DNA replication complex moves from OriC to Ter at a constant speed after forming the replication fork at OriC. Suppose we know the number of OriC at some time during this period; the genome we are studying forms once the remaining time has elapsed. During this remaining time, all the DNA replication complexes move forward by a corresponding distance along the genome. Therefore, in the final genome, the copy number of a gene at that distance from OriC (or, equivalently, at the complementary distance from Ter) equals the number of OriC at that earlier time.

We introduce a site's relative distance to Ter in the genome (0 corresponding to Ter, 1 corresponding to OriC), and then we get
This equation determines the copy number distribution of all the genes at all the sites. We can further deduce
and
The nearer the gene is to OriC, the greater its relative distance to Ter, and the greater both the mean and the variance of its copy number. At Ter the mean is 1 and the variance is 0, which corresponds to the model hypothesis that the copy number of Ter is invariantly 1.
Variance by itself is not sufficient to indicate the intensity of fluctuation, because the average expression also increases. We use the CV value, the ratio of the standard deviation to the average, to describe the intensity of fluctuation more precisely:
The CV is 0 at Ter, indicating that the copy number of Ter is invariant. The CV increases with the relative distance to Ter, indicating that the copy numbers of genes near OriC tend to vary more than those of genes far from OriC.
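The statements above about the mean, variance and CV can be illustrated with the standard results for a Yule-Furry process started from a single individual, whose count after an elapsed time is geometrically distributed. The sketch assumes that the elapsed time for a site equals its relative distance to Ter multiplied by the genome replication time; the argument names `fork_rate` and `genome_time` are hypothetical stand-ins for the two model parameters, and the numerical values are placeholders.

```python
import math


def oric_count_pmf(count, fork_rate, elapsed):
    """P(count) for a Yule-Furry process that starts from one OriC."""
    if count < 1:
        return 0.0
    q = math.exp(-fork_rate * elapsed)
    return q * (1.0 - q) ** (count - 1)


def copy_number_stats(rel_dist_to_ter, fork_rate, genome_time):
    """Mean, variance and CV of the copy number at a given site.

    rel_dist_to_ter runs from 0 (Ter) to 1 (OriC).
    """
    growth = math.exp(fork_rate * genome_time * rel_dist_to_ter)
    mean = growth
    variance = growth * (growth - 1.0)
    cv = math.sqrt(variance) / mean  # equals sqrt(1 - 1/growth)
    return mean, variance, cv


if __name__ == "__main__":
    # Placeholder parameters: one fork per 30 min, 40 min to copy the genome.
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(x, copy_number_stats(x, 1.0 / 30.0, 40.0))
```

In this form, lowering `fork_rate` (the quantity the system is meant to decrease) reduces the mean, the variance and the CV at every site other than Ter.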
Analysis of dCas9 binding process
Productivity
In this section we provide a model to explain why our system is able to increase the cells' productivity for multiple types of products, such as GFP and indigo. It is somewhat counterintuitive that a decline in microbial growth rate may increase microbial productivity. Therefore, it is necessary to examine this issue carefully with a quantitative method.
The unsatisfactory productivity of microbes prevents this technology from being more widely applicable. Therefore, how to improve microbial productivity, or how to make wild-type or engineered cells turn more of the substrates we feed them into the bio-products we want, is a constant concern.
A non-negligible factor that keeps microbes from producing our bio-product is the growth of the microbes themselves. All the cell products we want to obtain from the microbes are products of the microbes' own metabolism, which also forms their own biomass. Therefore, their production is under strict regulation by the microbes. In most engineered microbes, the genes we introduce are foreign genes unrelated to the cells' growth and normal metabolism, and the products of these genes are undesirable for the microbes themselves. The engineered microbes, stressed by these products, will activate all possible pathways to change their gene expression pattern and reallocate their nutrients and enzymes to produce more "necessities" for themselves. This is particularly true for microbes growing at a steady rate, whose nutrients are mainly used for growth.
In fact, many studies concern the relationship between microbial growth and the way microbes allocate their resources. Terence Hwa et al. studied E. coli metabolism under different nutrient and antibiotic conditions by dividing the cells' proteins into different sectors. In particular, in this work a sector defined as "Unnecessary Expression" (corresponding to the bio-products we want) was found to be negatively related to the growth rate. The authors described this relation semi-quantitatively as
which relates the ratio of "Unnecessary" protein mass to the total protein mass of a cell to the growth rate corresponding to that ratio. Ting Lu et al. built a coarse-grained whole-cell model including the production and functions of proteins of different sectors and further validated Hwa's model. This model includes the regulation performed by ppGpp, which provides an explanation of the phenomenological result in Hwa's work. ppGpp is an important regulator of gene expression that responds to various types of environmental stress and to abnormal cell physiology. One of its functions is to down-regulate the expression of ribosomal RNA, ribosomal proteins and expression-affiliated proteins while up-regulating the expression of some enzymes related to the cell's core metabolism, shifting the cell from a fast-growing state to a slow-growing state. Some optimization models that do not explicitly include ppGpp also find a similarity between cells' optimal resource allocation strategies and the ppGpp regulation strategies.
All the works mentioned above regard the growth rate as the result or equivalent of the microbes' protein accumulation. In our system, however, the growth rate is hacked. Because of this, the causal relationship between cell physiology (growth rate) and nutrient reallocation is different from that in these works. A better explanation of the mechanism of our system is that the hacked growth rate frees the cells from producing too many growth-related proteins and enables them to produce the bio-product we want. We emphasize that our model is still a phenomenological model in which the regulation of resource allocation happens in a "black box" and is not explicitly related to the regulation of any single pathway.

In our model, there is a single nutrient source. An engineered cell takes up the nutrient and uses it both for its own growth and for the production of the bio-product. A fraction of the total nutrient taken up is used for the cell's own growth, and the remaining fraction is used for bio-product production. Thus, a growth fraction of 0 corresponds to a "production-only" state and a growth fraction of 1 corresponds to a "growth-only" state. This process is written as an ODE system
whose quantities are the mass of nutrient, the biomass, the mass of the bio-product, the cell's nutrient uptake rate, the maximum growth rate (when the nutrient is sufficient and all the nutrient taken up is used for the cell's own growth), the maximum production rate (when the nutrient is sufficient and all the nutrient taken up is used for bio-product production), the nutrient level corresponding to the half-maximum uptake rate, and the above-mentioned allocation ratio.
Note that for a given allocation ratio, when the nutrient is sufficient, the second ODE can be rewritten as
This corresponds to exponential growth, so the equation explicitly includes the growth rate.

The ODE system is difficult to solve, but we can analyze its steady state.
Starting from given initial amounts of nutrient, biomass and bio-product, note that
thus we know that two conserved quantities remain constant throughout the whole dynamic process. Given the initial condition, we can deduce that
holds throughout the growth-and-production process. Eventually, the nutrient is exhausted and the microbes stop both growing and producing. Denoting the amounts of all the materials at the final state accordingly, the above-mentioned equation still holds:
Moreover, at the final stage the nutrient is exhausted, so the final nutrient mass is 0. Then we can solve
Note that the final amount of bio-product is negatively and linearly related to the growth rate. This result is consistent with the empirical equation provided by Terence Hwa.
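The conservation argument can be illustrated numerically. The equations themselves are not reproduced in this text, so the sketch below assumes one plausible form consistent with the description: Monod-type uptake shared by all three equations, biomass growing at `alpha * mu_max` and product forming at `(1 - alpha) * p_max` times the saturation term. Under that assumption the final product is `p0 + (1 - alpha) * p_max * n0 / uptake`, which is linear and decreasing in the growth rate `alpha * mu_max`. All names and values are hypothetical.

```python
def run_to_exhaustion(alpha, n0=10.0, b0=0.1, p0=0.0, uptake=1.0,
                      mu_max=0.5, p_max=0.3, k_half=0.5,
                      t_end=300.0, dt=0.02):
    """Forward-Euler integration of the assumed nutrient/biomass/product ODEs.

    alpha is the fraction of nutrient allocated to growth (0..1).
    Returns the final (nutrient, biomass, product).
    """
    nutrient, biomass, product = n0, b0, p0
    for _ in range(int(t_end / dt)):
        saturation = nutrient / (k_half + nutrient)
        d_nutrient = -uptake * saturation * biomass
        d_biomass = alpha * mu_max * saturation * biomass
        d_product = (1.0 - alpha) * p_max * saturation * biomass
        nutrient += dt * d_nutrient
        biomass += dt * d_biomass
        product += dt * d_product
    return nutrient, biomass, product


def predicted_final_product(alpha, n0=10.0, p0=0.0, uptake=1.0, p_max=0.3):
    """Closed form from the conserved quantity, once the nutrient is gone."""
    return p0 + (1.0 - alpha) * p_max * n0 / uptake


if __name__ == "__main__":
    for alpha in (0.2, 0.5, 0.8):
        growth_rate = alpha * 0.5  # alpha * mu_max at saturating nutrient
        print(growth_rate, run_to_exhaustion(alpha)[2],
              predicted_final_product(alpha))
```

Forward Euler is used here because it preserves the linear conserved quantities exactly, so the simulated end point can be compared directly with the closed form.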
Quorum Sensing
In this section we build a model to illustrate how our system works when coupled with a quorum sensing system. We want our system to realize auto-regulation of the population: the cells stop growing fast when they sense many other cells crowding around them. We realize this by applying the LuxI-LuxR system, in which a small molecule called AHL acts as the signal. This system was first discovered in V. fischeri as a means of intercellular communication. The cells produce an excess of AHL, which can move either into or out of the cells. When the population is large or dense in a region, the local AHL concentration increases, and the cells sense this high concentration and respond by up- or down-regulating the expression of some genes. In our system, the sensing of AHL results in an increase in dCas9, and consequently a down-regulation of the growth or replication rate (see design for the design and experiments).
A Simulation of Donor-Receptor Experiment
Our first series of experiments tests the aforementioned logic. We separately introduced the AHL-producing parts and the AHL-sensing parts into two strains of E. coli. The former is called the "donor" and the latter the "receptor". The receptor cells are evenly spread onto the solid medium and the donor cells are dropped at the center of the medium. It is expected that the donors produce AHL, which diffuses outward and inhibits the growth of the receptors around them (see design).
Here we use a simple diffusion model and a visualized simulation to describe this process. We suppose that the solid medium is a 2D plane, the receptors are uniformly distributed on the plane, and the donors are dense at the center of the medium (following a normal distribution), so that AHL production follows a normal distribution centered at the center of the plate. A partial differential equation (PDE) can be derived from these hypotheses:
where the spatial variables are the coordinates on the plane, one parameter represents the radius of the donor colony, and the diffusion constant is 2.0 μM/(L·cm²). The randomly distributed cells stop growing when the local concentration reaches 10⁻⁶ μM/L.
Here we numerically solve the equation with the finite difference method. We draw an animation to visualize this dynamic process. Furthermore, we randomly place 400 receptors on the plane to show their colony formation. It can be seen that the receptors distant from the center grow into colonies, while most of the receptors near the center grow into relatively small colonies or fail to form colonies.
