Team:SJTU-BioX-Shanghai/Model/General Model

   


   


Team-iGEM SJTU BioX 201

Background

The RNA-programmable Cas9 endonuclease cleaves double-stranded DNA at sites complementary to a 20-base-pair guide RNA. The Cas9 system has been used to modify genomes in multiple cells and organisms, demonstrating its potential as a facile genome-engineering tool.

However, as viruses evolve in response to the selective pressure imposed by the CRISPR-Cas immune system, the host is in turn under pressure to attack slightly mutated target sequences in addition to the original target. It is therefore not surprising that Cas nucleases exhibit considerable off-target activity on sequences similar to the intended target. Such off-targeting presents a severe problem for therapeutics, as DNA breaks introduced at the wrong site could lead to loss-of-function mutations in a well-functioning gene or to the improper repair of a disease-causing gene.

The cleavage probability

The probability to cleave a target site once the substrate is bound, $P_{clv}$, is equivalent to the fixation probability of a birth-death process whose absorbing states are the unbound and post-cleavage states (Nowak, 2006). We now derive the formula for this cleavage probability.

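Suppose the RGN currently holds an R-loop of length n. Let $P_{clv,n}$ denote the probability to cleave, starting from this state, before the R-loop is reduced to a length of n-1. Counting all paths that take us from n to the post-cleavage state N+1 without passing through n-1, we can construct a recursion relation for $P_{clv,n}$. Each such path consists of m failed excursions (a step forward to n+1 followed by a return to n without cleavage) and one final step forward that ends in cleavage: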

\begin{equation} \begin{aligned} P_{\mathrm{clv}, n} &=\sum_{m=0}^{\infty}\left(\frac{k_{\mathrm{f}}(n)}{k_{\mathrm{b}}(n)+k_{\mathrm{f}}(n)}\left(1-P_{\mathrm{clv}, n+1}\right)\right)^{m} \frac{k_{\mathrm{f}}(n)}{k_{\mathrm{b}}(n)+k_{\mathrm{f}}(n)} P_{\mathrm{clv}, n+1} \\ &=\frac{P_{\mathrm{clv}, n+1}}{\gamma_{n}+P_{\mathrm{clv}, n+1}}, \quad \gamma_{n}=\frac{k_{\mathrm{b}}(n)}{k_{\mathrm{f}}(n)} \end{aligned} \end{equation}
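The step from the infinite sum to the closed form is a geometric series. As a sketch of the intermediate algebra, write $p_{n}=k_{\mathrm{f}}(n)/\left(k_{\mathrm{b}}(n)+k_{\mathrm{f}}(n)\right)=1/\left(1+\gamma_{n}\right)$ for the probability that the next step from state n goes forward (the shorthand $p_n$ is introduced here only for illustration and is not part of the original notation): \begin{equation*} \begin{aligned} \sum_{m=0}^{\infty}\left(p_{n}\left(1-P_{\mathrm{clv}, n+1}\right)\right)^{m} p_{n} P_{\mathrm{clv}, n+1} &=\frac{p_{n} P_{\mathrm{clv}, n+1}}{1-p_{n}\left(1-P_{\mathrm{clv}, n+1}\right)} \\ &=\frac{P_{\mathrm{clv}, n+1}}{\frac{1}{p_{n}}-1+P_{\mathrm{clv}, n+1}}=\frac{P_{\mathrm{clv}, n+1}}{\gamma_{n}+P_{\mathrm{clv}, n+1}} \end{aligned} \end{equation*} The last equality uses $1/p_{n}-1=\gamma_{n}$. The series converges because $p_{n}\left(1-P_{\mathrm{clv}, n+1}\right)<1$ whenever $k_{\mathrm{b}}(n)>0$.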

or equivalently: \begin{equation} \frac{1}{P_{\mathrm{clv}, n}}=1+\frac{\gamma_{n}}{P_{\mathrm{clv}, n+1}} \end{equation}

The boundary probability $P_{\mathrm{clv}, N}$, representing the probability to cleave starting with a full R-loop and without reducing the R-loop’s length, is given by a simple splitting probability: \begin{equation} P_{\mathrm{clv}, N}=\frac{k_{\mathrm{f}}(N)}{k_{\mathrm{f}}(N)+k_{\mathrm{b}}(N)}=\frac{1}{1+\gamma_{N}} \end{equation}
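The recursion and its boundary condition can be evaluated numerically by starting at n = N and working backwards to n = 0. The following is a minimal sketch; the function name `cleavage_probabilities` and the example values of $\gamma_n$ are illustrative assumptions, not fitted parameters.

```python
def cleavage_probabilities(gammas):
    """Return [P_clv,0, ..., P_clv,N] given gamma_n = k_b(n) / k_f(n).

    gammas[n] is the backward/forward rate ratio of state n (n = 0..N).
    """
    N = len(gammas) - 1
    p = [0.0] * (N + 1)
    # Boundary condition (Equation 3): cleave directly from the full R-loop.
    p[N] = 1.0 / (1.0 + gammas[N])
    # Backward recursion (Equation 2): 1/P_n = 1 + gamma_n / P_{n+1}.
    for n in range(N - 1, -1, -1):
        p[n] = 1.0 / (1.0 + gammas[n] / p[n + 1])
    return p


# Hypothetical rate ratios for a toy system with N = 3.
gammas = [2.0, 0.5, 0.5, 1.0]
probs = cleavage_probabilities(gammas)
print(probs[0])  # overall cleavage probability P_clv = P_clv,0
```

The backward sweep costs one division per state, so the full vector of $P_{\mathrm{clv}, n}$ is obtained in a single pass over the states.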

From Equations 2 and 3 we have: \begin{equation} \begin{aligned} \frac{1}{P_{\mathrm{clv}, 0}} &=1+\gamma_{0} \frac{1}{P_{\mathrm{clv}, 1}} \\ &=1+\gamma_{0}+\gamma_{0} \gamma_{1} \frac{1}{P_{\mathrm{clv}, 2}} \\ &=1+\gamma_{0}+\gamma_{0} \gamma_{1}+\gamma_{0} \gamma_{1} \gamma_{2} \frac{1}{P_{\mathrm{clv}, 3}} \\ &=\ldots \\ &=1+\sum_{n=0}^{N} \prod_{i=0}^{n} \gamma_{i} \end{aligned} \end{equation}

From Equation 4 we obtain $P_{clv,0}$, which is the overall cleavage probability $P_{clv}$: \begin{equation} P_{\mathrm{clv}} \equiv P_{\mathrm{clv}, 0}=\frac{1}{1+\sum_{n=0}^{N} \prod_{i=0}^{n} \gamma_{i}} \end{equation}
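As a sanity check, the closed form can be compared with a direct simulation of the birth-death process. The sketch below assumes a toy system with hypothetical $\gamma_n$ values. Only the jump chain is simulated (which neighbouring state is visited next), since waiting times do not affect which absorbing state is reached. The function names are illustrative.

```python
import random


def p_clv_closed_form(gammas):
    """Equation 5: 1 / (1 + sum_n prod_{i<=n} gamma_i)."""
    total, running_product = 1.0, 1.0
    for g in gammas:
        running_product *= g
        total += running_product
    return 1.0 / total


def p_clv_simulated(gammas, trials=20000, seed=0):
    """Estimate P_clv by simulating random walks started in state 0.

    State -1 (unbound) and state N+1 (cleaved) are absorbing.
    """
    rng = random.Random(seed)
    N = len(gammas) - 1
    cleaved = 0
    for _ in range(trials):
        n = 0
        while 0 <= n <= N:
            # Probability that the next step is forward: k_f / (k_f + k_b).
            p_forward = 1.0 / (1.0 + gammas[n])
            n += 1 if rng.random() < p_forward else -1
        if n == N + 1:
            cleaved += 1
    return cleaved / trials


gammas = [2.0, 0.5, 0.5, 1.0]  # hypothetical rate ratios
exact = p_clv_closed_form(gammas)
estimate = p_clv_simulated(gammas)
print(exact, estimate)
```

With 20000 trials the statistical error of the estimate is a few tenths of a percent, so the two numbers should agree to about two decimal places.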

Transition Site

We assign a free energy $F_i$ to each metastable state $i\in[0, N]$, and a transition-state energy $T_i$ to the highest free-energy point on the reaction path from i to i+1, for $i\in[−1, N]$. All energies are expressed in units of $k_{\mathrm{B}}T$, so that the exponents below are dimensionless. Introducing the attempt rate $k_0$, we write the associated forward and backward rates as follows: \begin{equation} k_{\mathrm{f}}(i)=k_{0} \exp \left(-\left(T_{i}-F_{i}\right)\right), \quad k_{\mathrm{b}}(i)=k_{0} \exp \left(-\left(T_{i-1}-F_{i}\right)\right) \quad \Rightarrow \quad \gamma_{i}=\exp \left(-\Delta_{i}\right), \quad \Delta_{i}=T_{i-1}-T_{i} \end{equation}
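The cancellation of $F_i$ and $k_0$ in the ratio $\gamma_i$ can be checked numerically. The sketch below uses made-up energies (in units of $k_{\mathrm{B}}T$). The list `T` is stored with an offset of one so that `T[0]` holds $T_{-1}$; this storage convention and the function names are assumptions of the example.

```python
import math


def rates(F, T, k0=1.0):
    """Forward and backward rates for the metastable states i = 0..N.

    F[i]     : free energy of metastable state i
    T[i + 1] : transition-state energy T_i (so T[0] holds T_{-1})
    """
    N = len(F) - 1
    kf = [k0 * math.exp(-(T[i + 1] - F[i])) for i in range(N + 1)]
    kb = [k0 * math.exp(-(T[i] - F[i])) for i in range(N + 1)]
    return kf, kb


def gammas(F, T, k0=1.0):
    """gamma_i = k_b(i) / k_f(i) for i = 0..N."""
    kf, kb = rates(F, T, k0)
    return [b / f for f, b in zip(kf, kb)]


# Hypothetical landscape with N = 2: T_{-1}, T_0, T_1, T_2.
T = [1.0, 2.0, 0.5, 1.5]
F_a = [0.0, -1.0, -2.0]
F_b = [0.3, 4.0, -7.0]  # very different metastable-state energies

g_a = gammas(F_a, T)
g_b = gammas(F_b, T, k0=10.0)  # different attempt rate as well
print(g_a)
print(g_b)  # same ratios: F_i and k0 cancel in gamma_i
```

Both calls return the same ratios up to floating-point rounding, because each $\gamma_i$ depends only on the difference $T_{i-1}-T_{i}$.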

In terms of transition-state free energies we can write Equation 5 as: \begin{equation} P_{\mathrm{clv}}=\frac{1}{1+\sum_{n=0}^{N} \exp \left(-\sum_{i=0}^{n} \Delta_{i}\right)} \equiv \frac{1}{1+\sum_{n=0}^{N} \exp \left(-\Delta T_{n}\right)}, \quad \Delta T_{n}=\sum_{i=0}^{n} \Delta_{i} \end{equation}
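It may help to note that the sum defining $\Delta T_{n}$ telescopes. Using $\Delta_{i}=T_{i-1}-T_{i}$: \begin{equation*} \Delta T_{n}=\sum_{i=0}^{n}\left(T_{i-1}-T_{i}\right)=\left(T_{-1}-T_{0}\right)+\left(T_{0}-T_{1}\right)+\ldots+\left(T_{n-1}-T_{n}\right)=T_{-1}-T_{n} \end{equation*} Each term $\exp \left(-\Delta T_{n}\right)=\exp \left(T_{n}-T_{-1}\right)$ therefore compares the barrier for moving from n to n+1 with the first barrier $T_{-1}$, and the largest contribution to the sum comes from the highest transition state along the path.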

From the above it is clear that the cleavage probability depends only on the transition-state energies, and not on the free energies of the metastable states. If we assume that a single term dominates the sum, namely the one with the minimal bias $\Delta T_{n}$ at some $n = n^{*}$, then Equation 7 can be approximated as: \begin{equation} P_{\mathrm{clv}} \approx \frac{1}{1+\exp \left(-\Delta T_{n^{*}}\right)} \end{equation}
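To get a feeling for the quality of this approximation, the exact sum and the single dominant term can be compared on a toy landscape. The $\Delta T_n$ values below are hypothetical and the function names are illustrative; the sketch only shows that the approximation is close when one $\Delta T_n$ lies well below the others.

```python
import math


def p_clv_exact(delta_T):
    """Equation 7: 1 / (1 + sum_n exp(-Delta T_n))."""
    return 1.0 / (1.0 + sum(math.exp(-d) for d in delta_T))


def p_clv_dominant(delta_T):
    """Equation 8: keep only the term with the minimal Delta T_n."""
    return 1.0 / (1.0 + math.exp(-min(delta_T)))


# Hypothetical biases in units of k_B T; n* = 2 dominates.
delta_T = [3.0, 1.0, -4.0, 2.0]
exact = p_clv_exact(delta_T)
approx = p_clv_dominant(delta_T)
rel_error = (approx - exact) / exact
print(exact, approx, rel_error)
```

Because the approximation drops positive terms from the denominator, it never underestimates the exact cleavage probability. The error grows when several $\Delta T_n$ lie within about one $k_{\mathrm{B}}T$ of the minimum.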

We now have a formula that predicts the cleavage probability, although it relies on several approximations. Fortunately, this approximation works well for the single-mismatch case. We will therefore first derive the analytic solution for a single mismatch, and then refine the approximation and give a computational solution for the multi-mismatch case.

Summary

To conclude, under several reasonable approximations, we obtain a simple relationship that predicts the cleavage probability. In numerical tests this approximation works well for the single-mismatch case; the fitted figure is shown in the next chapter. We will therefore first derive the analytic solution for a single mismatch, and then refine the approximation and give a computational solution for the multi-mismatch case.

SJTU-BioX-Shanghai

Contact us: sjtuigem@gmail.com

Bio-X Institute, Shanghai Jiao Tong University, Dongchuan Rd. 800


© 2019 SJTU-BioX-Shanghai