Difference between revisions of "Team:TUDelft/ModelTest"

Line 4: Line 4:
  
 
<html>
 
<html>
<head>
+
    <head>
  <style>
+
        <style>
  #Overview {color: #000000; background-color: #f8fcfe;}
+
            #Overview {color: #000000; background-color: #f8fcfe;}
  #iFFL {color: #000000; background-color: #f8fcfe;}
+
            #iFFL {color: #000000; background-color: #f8fcfe;}
  #Kinetics {color: #000000; background-color:#f8fcfe;}
+
            #Kinetics {color: #000000; background-color:#f8fcfe;}
  #PlasmidCopyNumber {color: #000000; background-color: #f8fcfe;}
+
            #PlasmidCopyNumber {color: #000000; background-color: #f8fcfe;}
  #TranscriptionalVariations {color: #000000; background-color: #f8fcfe;}
+
            #TranscriptionalVariations {color: #000000; background-color: #f8fcfe;}
  #TranslationalVariations {color: #000000; background-color:#f8fcfe;}
+
            #TranslationalVariations {color: #000000; background-color:#f8fcfe;}
  #CodonUsage {color: #000000; background-color: #f8fcfe;}
+
            #Insulation{color: #000000; background-color: #f8fcfe;}
  #Insulations{color: #000000; background-color: #f8fcfe;}
+
            #CodonUsage {color: #000000; background-color: #f8fcfe;}
 +
           
  
  .inner{
+
            .inner{
    border: 1px solid #888888;
+
                border: 1px solid #888888;
    box-shadow: 2.5px 5px rgba(136, 136, 136,0.8);
+
                box-shadow: 2.5px 5px rgba(136, 136, 136,0.8);
    border-top-style:none;
+
                border-top-style:none;
  }
+
            }
  .column {
+
            .column {
    box-sizing: border-box;
+
                box-sizing: border-box;
    display: table-cell;
+
                display: table-cell;
    vertical-align: middle;
+
                vertical-align: middle;
  }
+
            }
  .column:after {
+
            .column:after {
    content: "";
+
                content: "";
    display: table;
+
                display: table;
    clear:both;
+
                clear:both;
  }
+
            }
  .left {
+
            .left {
    padding-top:10px;
+
                padding-top:10px;
    width: 15%;
+
                width: 15%;
    justify-content: center;
+
                justify-content: center;
  }
+
            }
  .right {
+
            .right {
    width: 85%;
+
                width: 85%;
    padding:10px;
+
                padding:10px;
    padding-right: 40px;
+
                padding-right: 40px;
  }
+
            }
  .centermodel {
+
            .centermodel {
    display: block;
+
                display: block;
    margin-left: auto;
+
                margin-left: auto;
    margin-right: auto;
+
                margin-right: auto;
    width: 50%;
+
                width: 50%;
  }
+
            }
  
  .row {  
+
            .row {  
    display: table;
+
                display: table;
  }
+
            }
  img.sponsorpage {
+
            img.sponsorpage {
    display: block;
+
                display: block;
    margin:auto;
+
                margin:auto;
    width:75%;
+
                width:75%;
  }
+
            }
  .row:after {
+
            .row:after {
    content: "";
+
                content: "";
    display: table;
+
                display: table;
    clear: both;
+
                clear: both;
  }
+
            }
  @media screen and (max-width: 800px) {
+
            @media screen and (max-width: 800px) {
    .row {
+
                .row {
    display: flex;
+
                    display: flex;
    flex-wrap: wrap;
+
                    flex-wrap: wrap;
    }
+
                }
    .left,
+
                .left,
    .right {
+
                .right {
    flex: 0 0 100%;
+
                    flex: 0 0 100%;
    max-width: 100%;
+
                    max-width: 100%;
    width:100%;
+
                    width:100%;
    padding:25px;
+
                    padding:25px;
    }  
+
                }  
    img.sponsorpage {
+
                img.sponsorpage {
    display: block;
+
                    display: block;
    margin:auto;
+
                    margin:auto;
    width:40%;
+
                    width:40%;
    }
+
                }
  }
+
            }
  </style>
+
        </style>
</head>
+
    </head>
  
<body data-spy="scroll" data-target="#myScrollspy" data-offset="180">
+
    <body data-spy="scroll" data-target="#myScrollspy" data-offset="180">
  <div id="Overview">  
+
        <div id="Overview">  
  <div class="Banner container-fluid text-center mb-0 align-items-center ">
+
            <div class="Banner container-fluid text-center mb-0 align-items-center ">
    <div class="display-2 mb-0">  
+
                <div class="display-2 mb-0">  
    <br>
+
                    <br>
    <br>
+
                    <br>
    <br>
+
                    <br>
    <br>
+
                    <br>
    <img src = "https://static.igem.org/mediawiki/2019/0/03/T--TUDelft--Modeling_logo.png" alt="Modeling" style="width:60%;">  
+
                    <img src = "https://static.igem.org/mediawiki/2019/0/03/T--TUDelft--Modeling_logo.png" alt="Modeling" style="width:60%;">  
  
    </div>   
+
                </div>   
  </div>  
+
            </div>  
  
  <div class="centerjustify2">
+
            <div class="centerjustify2">
    <nav class="col-sm-3" id="myScrollspy">
+
                <nav class="col-sm-3" id="myScrollspy">
    <ul class=" nav nav-pills nav-stacked accordion" data-spy= "affix" data-offset-top="180">
+
                    <ul class=" nav nav-pills nav-stacked accordion" data-spy= "affix" data-offset-top="180">
      <li><a class="jump" href="#Overview">Overview</a></li>
+
                        <li><a class="jump" href="#Overview">Overview</a></li>
      <li><a class="jump" href="#iFFL">iFFL</a></li>
+
                        <li><a class="jump" href="#iFFL">iFFL</a></li>
      <li><a class="jump" href="#Kinetics">Reaction Kinetics</a></li>
+
                        <li><a class="jump" href="#Kinetics">Reaction Kinetics</a></li>
      <li><a class="jump" href="#PlasmidCopyNumber">Plasmid Copy Number</a></li>
+
                        <li><a class="jump" href="#PlasmidCopyNumber">Plasmid Copy Number</a></li>
      <li><a class="jump" href="#TranscriptionalVariations">Transcriptional Variations</a></li>
+
                        <li><a class="jump" href="#TranscriptionalVariations">Transcriptional Variations</a></li>
      <li><a class="jump" href="#TranslationalVariations">Translational Variations</a></li>
+
                        <li><a class="jump" href="#TranslationalVariations">Translational Variations</a></li>
      <li><a class="jump" href="#CodonUsage">Codon Usage</a></li>
+
                        <li><a class="jump" href="#CodonUsage">Codon Usage</a></li>
      <li><a class="jump" href="#Insulations">Importance of Insulation</a></li>
+
                        <li><a class="jump" href="#Insulations">Importance of Insulation</a></li>
    </ul>
+
                    </ul>
    </nav>
+
                </nav>
  </div>
+
            </div>
  <div class= "centerjustify2">
+
            <div class= "centerjustify2">
    <h2>Overview</h2>
+
                <h2>Overview</h2>
    <p> With our modeling, we aimed to apply a control systems approach to achieve stability of gene expression across bacterial species. To make expression host-independent, we included an incoherent feed-forward loop (iFFL) in our design. An iFFL can be used to make the output of a system independent of the input. Our analytical steady-state solution of this loop showed that expression was completely independent of plasmid copy number and transcriptional-translational rates. We verified this analytical solution by the implementation of a full kinetic model.  
+
                <p> With our modeling, we aimed to apply a control systems approach to achieve stable gene expression across bacterial species. The behavior of genetic circuits depends on many variables, most of which change when a circuit is transferred to a different organism. To make expression host-independent, we included an incoherent feed-forward loop (iFFL) in our design. An iFFL can be used to make the output of a system independent of the input. The input of a genetic circuit can be many variables, such as plasmid copy number and transcriptional and translational rates. We therefore wanted to apply the iFFL to make our genetic circuit independent of plasmid copy number and of transcriptional and translational rates. <br> <br> We built a mathematical model of a genetic implementation of the iFFL and derived its steady-state solution analytically. This solution showed that expression was completely independent of plasmid copy number and transcriptional-translational rates. We verified the analytical solution by implementing a full kinetic model.  
    <br>
+
                    <br>
    <br>  
+
                    <br>  
  
    The key variables in the design of genetic circuits are plasmid copy number and transcriptional-translational rates. These variables determine the steady-state levels of gene expression. However, when transferring genetic circuits between organisms, these variables change in unpredictable ways.  
+
                    The key variables in the design of genetic circuits are plasmid copy number and transcriptional-translational rates. These variables determine the steady-state levels of gene expression. However, when transferring genetic circuits between organisms, these variables change in unpredictable ways.  
    <div class="row">
+
                <div class="row">
    <div class="column left">
+
                    <div class="column left">
      <img src = "https://static.igem.org/mediawiki/2019/4/4b/T--TUDelft--promotermodel.png" alt="promoter SOBL" style="width:90%;">
+
                        <img src = "https://static.igem.org/mediawiki/2019/4/4b/T--TUDelft--promotermodel.png" alt="promoter SOBL" style="width:90%;">
    </div>
+
                    </div>
    <div class="column right">
+
                    <div class="column right">
      <p>Promoters have different strengths in different organisms. Some promoters only work in a very narrow range of bacterial species <cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29061047"> (Yang et al., 2018)</a></cite>. To circumvent host-related changes, we designed our system to be orthogonal to the host. We implement orthogonality by using T7 RNA polymerase. However, orthogonal transcription might not behave the same way in every biological context. Through our modeling, we show that gene expression levels remain the same across biological contexts when using our genetic circuit implementation of an iFFL. </p>
+
                        <p>Promoters have different strengths in different organisms. Some promoters only work in a very narrow range of bacterial species <cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29061047"> (Yang et al., 2018)</a></cite>. To circumvent host-related changes, we designed our system to be orthogonal to the host. We implement orthogonality by using T7 RNA polymerase. However, orthogonal transcription might not behave the same way in every biological context. Through our modeling, we show that gene expression levels remain the same across biological contexts when using our genetic circuit implementation of an iFFL. </p>
    </div>
+
                    </div>
    </div>
+
                </div>
    <br>
+
                <br>
  
    <div class="row">
+
                <div class="row">
    <div class="column left">
+
                    <div class="column left">
      <img src = "https://static.igem.org/mediawiki/2019/a/a6/T--TUDelft--RBS.png" alt="RBS SOBL" style="width:90%;">
+
                        <img src = "https://static.igem.org/mediawiki/2019/a/a6/T--TUDelft--RBS.png" alt="RBS SOBL" style="width:90%;">
    </div>
+
                    </div>
    <div class="column right">
+
                    <div class="column right">
      <p> Ribosome binding sites contain the Shine-Dalgarno sequence, where the 16S rRNA of the ribosome binds. However, this sequence varies across species, and ribosome binding sites are often extremely inefficient when applied in phylogenetically distant species <cite><a href="https://www.nature.com/articles/nbt.1568"> (Salis et al., 2009)</a></cite>. Assuming translation elongation is similar across species, our model shows that similar expression levels across organisms can be maintained when all genes in our genetic circuit contain the same ribosome binding site. Nevertheless, translation elongation is influenced by codon usage, which differs per organism. We therefore developed a software tool that determines a coding sequence with similar codon usage across different species. Similar codon usage minimizes the chance of different translation elongation rates between organisms. </p>
+
                        <p> Ribosome binding sites contain the Shine-Dalgarno sequence, where the 16S rRNA of the ribosome binds. However, this sequence varies across species, and ribosome binding sites are often extremely inefficient when applied in phylogenetically distant species <cite><a href="https://www.nature.com/articles/nbt.1568"> (Salis et al., 2009)</a></cite>. Assuming translation elongation is similar across species, our model shows that similar expression levels across organisms can be maintained when all genes in our genetic circuit contain the same ribosome binding site. Nevertheless, translation elongation is influenced by codon usage, which differs per organism. We therefore developed a software tool that determines a coding sequence with similar codon usage across different species. Similar codon usage minimizes the chance of different translation elongation rates between organisms. </p>
    </div>
+
                    </div>
    </div>
+
                </div>
    <br>
+
                <br>
  </div>
+
            </div>
  <!--   
+
            <!--   
 
<div class="row">
 
<div class="row">
 
<div class="column left">
 
<div class="column left">
Line 145: Line 146:
 
</div>
 
</div>
 
-->
 
-->
  <div class="centerjustify2">
+
            <div class="centerjustify2">
    <div id="iFFL">  
+
                <div id="iFFL">  
    <h2>The core of our design - Incoherent feed-forward loop </h2>
+
                    <h2>The core of our design - The incoherent feed-forward loop </h2>
    <p>
+
                    <p>
      We implemented an incoherent feed-forward loop (iFFL) in a genetic circuit. In an iFFL, the input signal regulates both the activator and the repressor of the system's output in the same way. The iFFL results in perfect adaptation to the input when the binding of the repressor is fully non-cooperative (one repressor binds at a time) <cite><a href="https://www.nature.com/articles/nbt.4111">(Segall-Shapiro et al., 2018)</a></cite>.
+
                        We implemented an incoherent feed-forward loop (iFFL) in a genetic circuit. In an iFFL, the input signal regulates both the activator and the repressor of the system's output in the same way (figure 1). The iFFL results in perfect adaptation to the input when the binding of the repressor is fully non-cooperative (one repressor binds at a time) <cite><a href="https://www.nature.com/articles/nbt.4111">(Segall-Shapiro et al., 2018)</a></cite>.
      In our case, the input is the plasmid copy number of the DNA template, and the output is the steady-state expression of a gene of interest (GOI). In our system, we use a transcription activator-like effector (TALE) protein as the repressor. TALE proteins recognize DNA by a simple DNA-binding mechanism (Doyle 2013) and have been shown to bind fully non-cooperatively <cite><a href="https://www.nature.com/articles/nbt.4111">(Segall-Shapiro et al., 2018)</a></cite>. The promoter controlling the GOI has been engineered to contain a binding site for the TALE protein. When the TALE protein is bound to the promoter, expression of the GOI is repressed, as demonstrated by <cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Talei">2018 iGEM Thessaloniki</a></cite>.  
+
                        In our case, the input is the plasmid copy number of the DNA template, and the output is the steady-state expression of a gene of interest (GOI). In our system, we use a transcription activator-like effector (TALE) protein as the repressor. TALE proteins recognize DNA by a simple DNA-binding mechanism (Doyle 2013) and have been shown to bind fully non-cooperatively (figure 2) <cite><a href="https://www.nature.com/articles/nbt.4111">(Segall-Shapiro et al., 2018)</a></cite>. The promoter controlling the GOI has been engineered to contain a binding site for the TALE protein. When the TALE protein is bound to the promoter, expression of the GOI is repressed, as demonstrated by <cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite>.  
      <br>
+
                        <br>
    </p>
+
                    </p>
    <figure>
+
                    <figure>
      <img src="https://static.igem.org/mediawiki/2019/5/5a/T--TUDelft--iFFL.png" style="width:70%" class="centermodel"
+
                        <img src="https://static.igem.org/mediawiki/2019/5/5a/T--TUDelft--iFFL.png" style="width:70%" class="centermodel"
        alt="iFFL" >
+
                            alt="iFFL" >
      <figcaption class="centermodel"><br><b>Figure 1: Scheme of the incoherent feed-forward loop.</b> Red indicates how the output normally increases linearly with the input. Green depicts the addition of a repressor, which makes the output independent of the input. </figcaption>
+
                        <figcaption class="centermodel"><br><b>Figure 1: Scheme of the incoherent feed-forward loop.</b> Red indicates how the output normally increases linearly with the input. Green depicts the addition of a repressor, which makes the output independent of the input. </figcaption>
    </figure>
+
                    </figure>
    <br>
+
                    <br>
    <figure>
+
                    <figure>
      <img src="https://static.igem.org/mediawiki/2019/8/82/T--TUDelft--TALEanimation.gif" style="width:100%;border:1px solid #00a6d6;" class="centermodel"
+
                        <img src="https://static.igem.org/mediawiki/2019/8/82/T--TUDelft--TALEanimation.gif" style="width:100%;border:1px solid #00a6d6;" class="centermodel"
        alt="iFFL" >
+
                            alt="iFFL" >
      <figcaption class="centermodel"><br><b>Figure 2: Animation of TALE protein binding to the promoter of a GOI.</b> The binding of the TALE protein represses the expression of a GOI. </figcaption>
+
                        <figcaption class="centermodel"><br><b>Figure 2: Animation of TALE protein binding to the promoter of a GOI.</b> The binding of the TALE protein represses the expression of a GOI. </figcaption>
    </figure>
+
                    </figure>
    We have modeled this genetic system. An analytical steady-state solution showed that the steady-state expression level of a GOI is completely independent of plasmid copy number and can be independent of transcriptional and translational rates when the right design choices are made. After further verification with a full ordinary differential equation (ODE) model, we designed experiments to test the independence from these variables. Key design choices were identified by modeling. These consist of:
+
                    We have modeled this genetic system. An analytical steady-state solution showed that the steady-state expression level of a GOI is completely independent of plasmid copy number and can be independent of transcriptional and translational rates when the right design choices are made. After further verification with a full ordinary differential equation (ODE) model, we designed experiments to test the independence from these variables. Using our model, we identified some key design choices for our project. These consist of:
    <ul>
+
                    <ul>
      <li>
+
                        <li>
      The need for good insulation of the genes.
+
                            The need for good insulation of the genes.
      </li>
+
                        </li>
      <li>
+
                        <li>
      The promoter strengths of the TALE protein and the GOI need to maintain the same ratio.  
+
                            The promoter strengths of the TALE protein and the GOI need to maintain the same ratio.  
      </li>
+
                        </li>
      <li>
+
                        <li>
      The ribosome binding site strengths of the TALE protein and the GOI need to maintain the same ratio.
+
                            The ribosome binding site strengths of the TALE protein and the GOI need to maintain the same ratio.
      </li>
+
                        </li>
    </ul>  
+
                    </ul>  
    <br>  
+
                    <br>  
    </div>
+
                </div>
  
    <div id="Kinetics">  
+
                <div id="Kinetics">  
    <h2>The kinetics</h2>
+
                    <h2>The kinetics</h2>
    <p>In this section, we explain the kinetics of our iFFL and derive a system of ordinary differential equations to describe the interactions within the genetic circuit. We derive a steady-state solution from the system of equations and describe the properties of the system. In the next sections, we use the steady-state solution and ODE model to describe how our circuit can be used to transfer genetic circuits between prokaryotes.  
+
                    <p>In this section, we explain the kinetics of our iFFL and derive a system of ordinary differential equations to describe the interactions within the genetic circuit. We derive a steady-state solution from the system of equations and describe the properties of the system. In the next sections, we use the steady-state solution and ODE model to describe how our circuit can be used to transfer genetic circuits between prokaryotes.  
      <br><br>
+
                        <br><br>
      Figure 3 depicts all interactions considered in our system. </p>
+
                        Figure 3 depicts all interactions considered in our system. </p>
  
    <figure>
+
                    <figure>
      <img src="https://static.igem.org/mediawiki/2019/8/85/T--TUDelft--TALE_system.png" style="width:100%;border:1px solid #00a6d6;" class="centermodel"
+
                        <img src="https://static.igem.org/mediawiki/2019/8/85/T--TUDelft--TALE_system.png" style="width:100%;border:1px solid #00a6d6;" class="centermodel"
        alt="TALE system">
+
                            alt="TALE system">
      <figcaption class="centermodel"><br><b>Figure 3</b>: Scheme of genetic circuit interactions developed by <cite><a href="https://www.nature.com/articles/nbt.4111">Segall-Shapiro et al. (2018)</a></cite></figcaption>
+
                        <figcaption class="centermodel"><br><b>Figure 3</b>: Scheme of genetic circuit interactions developed by <cite><a href="https://www.nature.com/articles/nbt.4111">Segall-Shapiro et al. (2018)</a></cite></figcaption>
    </figure>
+
                    </figure>
    <br>
+
                    <br>
    <ul class="accordion">
+
                    <ul class="accordion">
      <li>
+
                        <li>
      <a class="toggle " href="javascript:void(0);" ><b>Click here to find out more about the details of the kinetic model</b><span style="float:right;"><b>&#xfe40;</b></span></a>
+
                            <a class="toggle " href="javascript:void(0);" ><b>Click here to find out more about the details of the kinetic model</b><span style="float:right;"><b>&#xfe40;</b></span></a>
      <ul class="inner accordion">
+
                            <ul class="inner accordion">
        <p>  
+
                                <p>  
        <p>From these interactions we can derive the following system of ordinary differential equations: <br> </p> <br>
+
                                <p>From these interactions we can derive the following system of ordinary differential equations: <br> </p> <br>
        <ul>  
+
                                <ul>  
        <il>${dm_T \over dt} = {c \cdot a_T - y_m \cdot m_T}$ </il> <br> <br>
+
                                    <il>${dm_T \over dt} = {c \cdot a_T - y_m \cdot m_T}$ </il> <br> <br>
        <il> $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T - n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}$</il> <br><br>
+
                                    <il> $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T - n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}$</il> <br><br>
        <il>$\frac{dP_G}{dt} = k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T}$</il><br><br>
+
                                    <il>$\frac{dP_G}{dt} = k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T}$</il><br><br>
        <il>$\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T} - n \cdot y_T \cdot P_{G.T} $ </il><br><br>
+
                                    <il>$\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T} - n \cdot y_T \cdot P_{G.T} $ </il><br><br>
        <il>$\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T} - y_m \cdot m_G $</il> <br><br>
+
                                    <il>$\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T} - y_m \cdot m_G $</il> <br><br>
        <il>$\frac{dG}{dt} = b_G \cdot m_G - y_G \cdot G $</il><br><br>
+
                                    <il>$\frac{dG}{dt} = b_G \cdot m_G - y_G \cdot G $</il><br><br>
        </ul>
+
                                </ul>
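                                <p>As a rough illustration, the equations above can be set to zero and solved for the steady state. The sketch below is not the team's implementation. It assumes $n = 1$ and $a_{Gmin} = 0$ (as in the parameter table), assumes that the total GOI promoter concentration $P_G + P_{G.T}$ equals the copy number $c$ times a hypothetical conversion factor <code>nM_per_copy</code>, and assumes a GOI degradation rate $y_G$ equal to $y_T$. The names <code>PARAMS</code>, <code>derivatives</code> and <code>steady_state</code> are illustrative only.</p>

```python
import math

# Parameter values from the parameter table (n = 1, a_Gmin = 0).
PARAMS = {
    "a_T": 1.03,             # TALE transcription rate
    "y_m": math.log(2) / 5,  # mRNA degradation rate (1/min)
    "b_T": 0.44,             # TALE translation rate (1/min)
    "y_T": 0.0347,           # TALE degradation rate (1/min)
    "k_on": 9.85,            # TALE binding to promoter (1/(nM*min))
    "k_off": 2.19,           # TALE unbinding from promoter (1/min)
    "a_Gmax": 3.78,          # maximum GOI transcription rate
    "b_G": 3.65,             # GOI translation rate
    "y_G": 0.0347,           # ASSUMPTION: GOI protein degrades like the TALE
}


def derivatives(s, c, p=PARAMS):
    """Right-hand sides of the six ODEs for n = 1 and a_Gmin = 0."""
    bind = p["k_on"] * s["T"] * s["P_G"]
    dm_T = c * p["a_T"] - p["y_m"] * s["m_T"]
    dT = p["b_T"] * s["m_T"] - p["y_T"] * s["T"] - bind + p["k_off"] * s["P_GT"]
    dP_G = p["k_off"] * s["P_GT"] - bind + p["y_T"] * s["P_GT"]
    dP_GT = -dP_G
    dm_G = p["a_Gmax"] * s["P_G"] - p["y_m"] * s["m_G"]
    dG = p["b_G"] * s["m_G"] - p["y_G"] * s["G"]
    return [dm_T, dT, dP_G, dP_GT, dm_G, dG]


def steady_state(c, p=PARAMS, nM_per_copy=1.0):
    """Steady state for copy number c, assuming P_G + P_GT = c * nM_per_copy."""
    p_tot = c * nM_per_copy                    # total GOI promoter (ASSUMPTION)
    m_T = c * p["a_T"] / p["y_m"]              # from dm_T/dt = 0
    S = p["b_T"] * m_T / p["y_T"]              # dT/dt = 0 gives T + P_GT = S
    K = (p["k_off"] + p["y_T"]) / p["k_on"]    # dP_GT/dt = 0 gives T * P_G = K * P_GT
    # Eliminating P_GT: T^2 + (p_tot - S + K) * T - K * S = 0; take the positive root.
    b = p_tot - S + K
    T = (-b + math.sqrt(b * b + 4.0 * K * S)) / 2.0
    P_G = p_tot * K / (K + T)                  # free promoter
    m_G = p["a_Gmax"] * P_G / p["y_m"]         # from dm_G/dt = 0
    G = p["b_G"] * m_G / p["y_G"]              # from dG/dt = 0
    return {"m_T": m_T, "T": T, "P_G": P_G, "P_GT": p_tot - P_G, "m_G": m_G, "G": G}


if __name__ == "__main__":
    for copies in (1, 10, 100):
        print(copies, steady_state(copies)["G"])
```

                                <p>Under these assumptions, the printed steady-state level of $G$ should stay nearly constant as the copy number changes, because the TALE level grows with $c$ while the fraction of free promoter shrinks by almost the same factor.</p>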
        <br>
+
                                <br>
        <table id="tabletu">
+
                                <table id="tabletu">
        <tr>
+
                                    <tr>
          <th>Parameter</th>
+
                                        <th>Parameter</th>
          <th> Value </th>
+
                                        <th> Value </th>
          <th>Unit</th>
+
                                        <th>Unit</th>
          <th>Explanation</th>
+
                                        <th>Explanation</th>
          <th> Source </th>
+
                                        <th> Source </th>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$a_T$</td>
+
                                        <td>$a_T$</td>
          <td>1.03 </td>
+
                                        <td>1.03 </td>
          <td> nM/min </td>
+
                                        <td> nM/min </td>
          <td>Transcription rate TALE</td>
+
                                        <td>Transcription rate TALE</td>
          <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Talei">2018 iGEM Thessaloniki</a></cite> </td>
+
                                        <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Talei">2018 iGEM Thessaloniki</a></cite> </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$y_m$</td>
+
                                        <td>$y_m$</td>
          <td>log(2)/5 </td>
+
                                        <td>log(2)/5 </td>
          <td>1/min </td>
+
                                        <td>1/min </td>
          <td>degradation rate mRNA</td>
+
                                        <td>degradation rate mRNA</td>
          <td><cite><a href="https://www.nature.com/articles/ncomms8832">Kushwaha and Salis 2015<</a></cite> </td>
+
                                        <td><cite><a href="https://www.nature.com/articles/ncomms8832">Kushwaha and Salis 2015</a></cite> </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$b_T$</td>
+
                                        <td>$b_T$</td>
          <td> 0.44 </td>
+
                                        <td> 0.44 </td>
          <td> 1/min </td>
+
                                        <td> 1/min </td>
          <td>Translation rate TALE</td>
+
                                        <td>Translation rate TALE</td>
          <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki<</a></cite>. </td>
+
                                        <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite>. </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$y_T$</td>
+
                                        <td>$y_T$</td>
          <td> 0.0347 </td>
+
                                        <td> 0.0347 </td>
          <td> 1/min </td>
+
                                        <td> 1/min </td>
          <td>Degradation rate TALE</td>
+
                                        <td>Degradation rate TALE</td>
          <td>Assuming degradation depends only on growth rate (doubling time of 20 min, i.e. log(2)/20)</td>
+
                                        <td>Assuming degradation depends only on growth rate (doubling time of 20 min, i.e. log(2)/20)</td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>n</td>
+
                                        <td>n</td>
          <td>1</td>
+
                                        <td>1</td>
          <td>-</td>
+
                                        <td>-</td>
          <td>Cooperativity of binding</td>
+
                                        <td>Cooperativity of binding</td>
          <td><cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29553576">Segall-Shapiro et al.</a></cite> </td>
+
                                        <td><cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29553576">Segall-Shapiro et al.</a></cite> </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$k_{on}$</td>
+
                                        <td>$k_{on}$</td>
          <td>9.85 </td>
+
                                        <td>9.85 </td>
          <td>1/(nM*min) </td>
+
                                        <td>1/(nM*min) </td>
          <td>Binding of TALE to promoter</td>
+
                                        <td>Binding of TALE to promoter</td>
          <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite>. </td>
+
                                        <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite>. </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$k_{off}$</td>
+
                                        <td>$k_{off}$</td>
          <td>2.19</td>
+
                                        <td>2.19</td>
          <td>1/min</td>
+
                                        <td>1/min</td>
          <td>Unbinding of TALE from promoter</td>
+
                                        <td>Unbinding of TALE from promoter</td>
          <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite>. </td>
+
                                        <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite>. </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$a_{Gmax}$</td>
+
                                        <td>$a_{Gmax}$</td>
          <td> 3.78 </td>
+
                                        <td> 3.78 </td>
          <td>1/(nM*min) </td>
+
                                        <td>1/(nM*min) </td>
          <td>Maximum transcription of GFP</td>
+
                                        <td>Maximum transcription of GFP</td>
          <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite> </td>
+
                                        <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite> </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$a_{Gmin}$</td>
+
                                        <td>$a_{Gmin}$</td>
          <td> 0 </td>
+
                                        <td> 0 </td>
          <td> 1/(nM*min) </td>
+
                                        <td> 1/(nM*min) </td>
          <td>Minimum transcription of GFP</td>
+
                                        <td>Minimum transcription of GFP</td>
          <td><cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29553576">Segall-Shapiro et al.</a></cite></td>
+
                                        <td><cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29553576">Segall-Shapiro et al.</a></cite></td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$b_G$</td>
+
                                        <td>$b_G$</td>
          <td>3.65 </td>
+
                                        <td>3.65 </td>
          <td> 1/min</td>
+
                                        <td> 1/min</td>
          <td>Translation rate GFP</td>
+
                                        <td>Translation rate GFP</td>
          <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite></td>
+
                                        <td><cite><a href="https://2018.igem.org/Team:Thessaloniki/Model/Tale">2018 iGEM Thessaloniki</a></cite></td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$y_G$</td>
+
                                        <td>$y_G$</td>
          <td> 0.0347 </td>
+
                                        <td> 0.0347 </td>
          <td> 1/min </td>
+
                                        <td> 1/min </td>
          <td>Degradation rate GFP</td>
+
                                        <td>Degradation rate GFP</td>
          <td>Assuming loss of GFP is due only to dilution by growth (doubling time 20 min, so $y_G = \ln 2 / 20$) </td>
+
                                        <td>Assuming loss of GFP is due only to dilution by growth (doubling time 20 min, so $y_G = \ln 2 / 20$) </td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$c$</td>
+
                                        <td>$c$</td>
          <td> variable </td>
+
                                        <td> variable </td>
          <td> Unitless </td>
+
                                        <td> Unitless </td>
          <td> Copy number of the plasmid </td>
+
                                        <td> Copy number of the plasmid </td>
          <td> </td>
+
                                        <td> </td>
        </tr>
+
                                    </tr>
        </table>
+
                                </table>
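                                <p>As a quick check of the $y_G$ entry above: if a stable protein is lost only by dilution, its first-order loss rate is $\ln 2$ divided by the doubling time. The sketch below reproduces the tabulated value from the 20 min stated in the table; the function name <code>dilution_rate</code> is ours and purely illustrative.</p>

```python
import math


def dilution_rate(doubling_time_min):
    """First-order loss rate (1/min) of a stable protein that is only diluted by growth."""
    return math.log(2) / doubling_time_min


# 20 min is the value quoted in the parameter table for GFP.
y_G = dilution_rate(20.0)
print(round(y_G, 4))  # 0.0347
```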
        <br>
+
                                <br>
        <table id="tabletu">
+
                                <table id="tabletu">
        <tr>
+
                                    <tr>
          <th>Variable</th>
+
                                        <th>Variable</th>
          <th>Explanation</th>
+
                                        <th>Explanation</th>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$m_T$</td>
+
                                        <td>$m_T$</td>
          <td>concentration of TALE mRNA</td>
+
                                        <td>concentration of TALE mRNA</td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>T</td>
+
                                        <td>T</td>
          <td>concentration of TALE</td>
+
                                        <td>concentration of TALE</td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$P_G$</td>
+
                                        <td>$P_G$</td>
          <td>Promoter GFP</td>
+
                                        <td>Promoter GFP</td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$P_{G.T}$</td>
+
                                        <td>$P_{G.T}$</td>
          <td>Promoter GFP with TALE bound</td>
+
                                        <td>Promoter GFP with TALE bound</td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>$m_G$</td>
+
                                        <td>$m_G$</td>
          <td>concentration of mRNA GFP</td>
+
                                        <td>concentration of mRNA GFP</td>
        </tr>
+
                                    </tr>
        <tr>
+
                                    <tr>
          <td>G</td>
+
                                        <td>G</td>
          <td>concentration of GFP</td>
+
                                        <td>concentration of GFP</td>
        </tr>
+
                                    </tr>
        </table>
+
                                </table>
        <br>
+
                                <br>
        <h4>Simplification of the system</h4>
+
                                <h4>Simplification of the system</h4>
        <p>
+
                                <p>
        This system can be simplified by making a few assumptions (Segall-Shapiro et al., 2018):</p>
+
                                    This system can be simplified by making a few assumptions (Segall-Shapiro et al., 2018):</p>
        <ol>
+
                                <ol>
        <li>The amount of TALE protein is much larger than the number of binding sites for TALE.</li>
+
                                    <li>The amount of TALE protein is much larger than the number of binding sites for TALE.</li>
        <li>TALE binding and unbinding occur much more rapidly than protein production and degradation.</li>
+
                                    <li>TALE binding and unbinding occur much more rapidly than protein production and degradation.</li>
        <li>When the promoter is repressed, the expression level is negligible.</li>
+
                                    <li>When the promoter is repressed, the expression level is negligible.</li>
        </ol>
+
                                </ol>
        <br>
+
                                <br>
        Using these assumptions, we can analytically derive the following steady-state solution for this system: <br> <br>
+
                                Using these assumptions, we can analytically derive the following steady-state solution for this system: <br> <br>
        $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_T^nb_T^ny_Gy_m}\right)$$ <br>
+
                                $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_T^nb_T^ny_Gy_m}\right)$$ <br>
        <br>
+
                                <br>
  
        <li>
+
                                <li>
        <a class="toggle " href="javascript:void(0);" ><b>Click here to see the full derivation</b><span style="float:right;"><b>&#xfe40;</b></span></a>
+
                                    <a class="toggle " href="javascript:void(0);" ><b>Click here to see the full derivation</b><span style="float:right;"><b>&#xfe40;</b></span></a>
        <div class="inner accordion">  
+
                                    <div class="inner accordion">  
  
          <p>  
+
                                        <p>  
          By making use of assumption 2, we can assume a quasi-steady state for $\frac{dP_G}{dt}$ and $\frac{dP_{G.T}}{dt}$. Quasi-steady state means that these species reach their steady state much faster than the other variables in the model, so their rate of change can be set to zero. This results in the following system of equations. <br>
+
                                            By making use of assumption 2, we can assume a quasi-steady state for $\frac{dP_G}{dt}$ and $\frac{dP_{G.T}}{dt}$. Quasi-steady state means that these species reach their steady state much faster than the other variables in the model, so their rate of change can be set to zero. This results in the following system of equations. <br>
          </p>
+
                                        </p>
          <br>
+
                                        <br>
          <ol>  
+
                                        <ol>  
          <li>${dm_T \over dt} = {c \cdot a_T - y_m \cdot m_T}$ </li> <br>
+
                                            <li>${dm_T \over dt} = {c \cdot a_T - y_m \cdot m_T}$ </li> <br>
          <li>$\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T - n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T} + n \cdot (n-1)\cdot y_T \cdot P_{G.T}$</li> <br>
+
                                            <li>$\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T - n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T} + n \cdot (n-1)\cdot y_T \cdot P_{G.T}$</li> <br>
          <li>$\frac{dP_G}{dt} = k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$ </li><br>
+
                                            <li>$\frac{dP_G}{dt} = k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$ </li><br>
          <li>$\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T} - n \cdot y_T \cdot P_{G.T} = 0$ </li><br>
+
                                            <li>$\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T} - n \cdot y_T \cdot P_{G.T} = 0$ </li><br>
          <li>$\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T} - y_m \cdot m_G $</li> <br>
+
                                            <li>$\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T} - y_m \cdot m_G $</li> <br>
          <li>$\frac{dG}{dt} = b_G \cdot m_G - y_G \cdot G $</li><br>
+
                                            <li>$\frac{dG}{dt} = b_G \cdot m_G - y_G \cdot G $</li><br>
          </ol>
+
                                        </ol>
          <br>
+
                                        <br>
          <p>We can now use equation 3 to simplify the system:</p> <br>
+
                                        <p>We can now use equation 3 to simplify the system:</p> <br>
          <ul>
+
                                        <ul>
          <li>$\color{red}{k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G} + n \cdot y_T \cdot P_{G.T} = 0$ </li>
+
                                            <li>$\color{red}{k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G} + n \cdot y_T \cdot P_{G.T} = 0$ </li>
          <li>The red part can be taken to one side of the equation: </li>
+
                                            <li>The red part can be taken to one side of the equation: </li>
          <li>$\color{red}{n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T}} = n \cdot y_T \cdot P_{G.T}$ </li>  
+
                                            <li>$\color{red}{n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T}} = n \cdot y_T \cdot P_{G.T}$ </li>  
          <li>Then we substitute that expression in equation 2: </li>
+
                                            <li>Then we substitute that expression in equation 2: </li>
          <li>$\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T \color{red}{- n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}}+ n \cdot (n-1)\cdot y_T \cdot P_{G.T}$</li>
+
                                            <li>$\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T \color{red}{- n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}}+ n \cdot (n-1)\cdot y_T \cdot P_{G.T}$</li>
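          <li>Which becomes: <br> $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T -(n \cdot y_T \cdot P_{G.T}) + n \cdot (n-1)\cdot y_T \cdot P_{G.T} \approx b_T \cdot m_T - y_T \cdot T$, since by assumption 1 $P_{G.T} \le c \ll T$, so the remaining $P_{G.T}$ terms are negligible compared to $y_T \cdot T$. </li>
+
                                            <li>Which becomes: <br> $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T -(n \cdot y_T \cdot P_{G.T}) + n \cdot (n-1)\cdot y_T \cdot P_{G.T} \approx b_T \cdot m_T - y_T \cdot T$, since by assumption 1 $P_{G.T} \le c \ll T$, so the remaining $P_{G.T}$ terms are negligible compared to $y_T \cdot T$. </li>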
          </ul>
+
                                        </ul>
          <br>
+
                                        <br>
          <p>Furthermore, we can use equation 3 to get an expression for $P_G$</p>
+
                                        <p>Furthermore, we can use equation 3 to get an expression for $P_G$</p>
          <br>
+
                                        <br>
          <ul>
+
                                        <ul>
          <li>$k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$</li>
+
                                            <li>$k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$</li>
          <li>$P_G = \frac{k_{off}}{n \cdot k_{on} \cdot T^n} \cdot P_{G.T} + \frac{y_T}{k_{on} \cdot T^n} \cdot P_{G.T}$</li>
+
                                            <li>$P_G = \frac{k_{off}}{n \cdot k_{on} \cdot T^n} \cdot P_{G.T} + \frac{y_T}{k_{on} \cdot T^n} \cdot P_{G.T}$</li>
          </ul>
+
                                        </ul>
          <br>
+
                                        <br>
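          <p>Using assumption 1, we can say $T \gg c$. It follows that the amount of free repressor barely changes when some of the repressors bind to $P_G$, meaning $T \approx T + nP_{G.T}$. By assumption 2, unbinding is much faster than degradation ($k_{off} \gg n \cdot y_T$), so the degradation term in the expression for $P_G$ can be neglected:</p>
+
                                        <p>Using assumption 1, we can say $T \gg c$. It follows that the amount of free repressor barely changes when some of the repressors bind to $P_G$, meaning $T \approx T + nP_{G.T}$. By assumption 2, unbinding is much faster than degradation ($k_{off} \gg n \cdot y_T$), so the degradation term in the expression for $P_G$ can be neglected:</p>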
          <br>
+
                                        <br>
          <ul>
+
                                        <ul>
          <li>$P_G = \frac{K_D}{T^n} \cdot P_{G.T}$, where $K_D = \frac{k_{off}}{n \cdot k_{on}}$ (equal to $\frac{k_{off}}{k_{on}}$ for $n = 1$)</li>
+
                                            <li>$P_G = \frac{K_D}{T^n} \cdot P_{G.T}$, where $K_D = \frac{k_{off}}{n \cdot k_{on}}$ (equal to $\frac{k_{off}}{k_{on}}$ for $n = 1$)</li>
          <li>Using $c = P_G + P_{G.T}$, we get the following: </li>
+
                                            <li>Using $c = P_G + P_{G.T}$, we get the following: </li>
          <li>$P_G = c \cdot \frac{K_D}{K_D + T^n}$</li>
+
                                            <li>$P_G = c \cdot \frac{K_D}{K_D + T^n}$</li>
          <li>Plugging this into equation 5 and again making use of $c = P_G + P_{G.T}$, equation 5 becomes: </li>
+
                                            <li>Plugging this into equation 5 and again making use of $c = P_G + P_{G.T}$, equation 5 becomes: </li>
          <li>$\frac{dm_G}{dt} = c \cdot (a_{Gmin} + (a_{Gmax} - a_{Gmin})[\frac{K_D}{K_D + T^n}]) - y_m \cdot m_G$</li>
+
                                            <li>$\frac{dm_G}{dt} = c \cdot (a_{Gmin} + (a_{Gmax} - a_{Gmin})[\frac{K_D}{K_D + T^n}]) - y_m \cdot m_G$</li>
          </ul>
+
                                        </ul>
          <br>
+
                                        <br>
          <p>Assumption 3 tells us that we can assume $a_{Gmin} \approx 0$.
+
                                        <p>Assumption 3 tells us that we can assume $a_{Gmin} \approx 0$.
          Again using assumption 1, we can say $T^n \gg K_D$, resulting in $T^n + K_D \approx T^n$. Using these two assumptions, equation 5 can be further simplified to: </p>  
+
                                            Again using assumption 1, we can say $T^n \gg K_D$, resulting in $T^n + K_D \approx T^n$. Using these two assumptions, equation 5 can be further simplified to: </p>  
          <br>
+
                                        <br>
          <ul>
+
                                        <ul>
          <li>$\frac{dm_G}{dt} = c \cdot \frac{a_{G}}{T^n} - y_m \cdot m_G$, where $a_{G} = a_{Gmax} \cdot K_D$</li>
+
                                            <li>$\frac{dm_G}{dt} = c \cdot \frac{a_{G}}{T^n} - y_m \cdot m_G$, where $a_{G} = a_{Gmax} \cdot K_D$</li>
          </ul>
+
                                        </ul>
          <br>
+
                                        <br>
          <p>Using this reduced system of equations we can now derive the steady-state solution for the GOI. </p>
+
                                        <p>Using this reduced system of equations we can now derive the steady-state solution for the GOI. </p>
          <br>
+
                                        <br>
          <ul>  
+
                                        <ul>  
          <li>$ c \cdot a_T - y_m \cdot m_T = 0$ $\rightarrow$ $ m_T = c \frac{a_T}{y_m} $</li>
+
                                            <li>$ c \cdot a_T - y_m \cdot m_T = 0$ $\rightarrow$ $ m_T = c \frac{a_T}{y_m} $</li>
          <li>$ b_T \cdot m_{T} - y_T \cdot T = 0$ $\rightarrow$ $ T = \frac{b_T \cdot m_{T}}{y_T}$ </li>
+
                                            <li>$ b_T \cdot m_{T} - y_T \cdot T = 0$ $\rightarrow$ $ T = \frac{b_T \cdot m_{T}}{y_T}$ </li>
          <li>$ c \cdot \frac{a_{G}}{T^n} - y_m \cdot m_G = 0$ $\rightarrow$ $ m_G = c \frac{a_{G}}{y_m \cdot T^n}$ </li>
+
                                            <li>$ c \cdot \frac{a_{G}}{T^n} - y_m \cdot m_G = 0$ $\rightarrow$ $ m_G = c \frac{a_{G}}{y_m \cdot T^n}$ </li>
          <li>$ b_G \cdot m_{G} - y_G \cdot G = 0$ $\rightarrow$ $ G = \frac{b_G \cdot m_{G}}{y_G} $ </li>  
+
                                            <li>$ b_G \cdot m_{G} - y_G \cdot G = 0$ $\rightarrow$ $ G = \frac{b_G \cdot m_{G}}{y_G} $ </li>  
          </ul> <br>
+
                                        </ul> <br>
  
          Plugging everything into the last equation gives:
+
                                        Plugging everything into the last equation gives:
          $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_T^nb_T^ny_Gy_m}\right)$$ <br>
+
                                        $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_T^nb_T^ny_Gy_m}\right)$$ <br>
        </div>
+
                                    </div>
        </li>
+
                                </li>
  
  
        <br>
+
                                <br>
        <p>According to our analytical solution, the level of the protein of interest depends only on the plasmid copy number and on the ratios of the transcription and translation rates of the genes in the circuit. In the next sections, we use this steady-state solution to show how genetic circuits can be transferred between organisms. Furthermore, we solve the full system of ordinary differential equations in MATLAB to gain insight into the kinetics of the system. </p>
+
                                <p>According to our analytical solution, the level of the protein of interest depends only on the plasmid copy number and on the ratios of the transcription and translation rates of the genes in the circuit. In the next sections, we use this steady-state solution to show how genetic circuits can be transferred between organisms. Furthermore, we solve the full system of ordinary differential equations in MATLAB to gain insight into the kinetics of the system. </p>
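                                <p>The steady-state chain from the derivation ($m_T \rightarrow T \rightarrow m_G \rightarrow G$) can be evaluated numerically. The following is a minimal Python sketch, not the team's MATLAB code. The values of $k_{on}$, $k_{off}$, $a_{Gmax}$, $b_G$ and $y_G$ are taken from the parameter table; the values used for $a_T$, $b_T$, $y_T$ and $y_m$ are placeholders chosen only for illustration, and the function name <code>steady_state_gfp</code> is ours.</p>

```python
def steady_state_gfp(c, n, a_G, b_G, y_G, a_T, b_T, y_T, y_m):
    """Steady-state GFP level of the reduced model, following the derivation step by step."""
    m_T = c * a_T / y_m               # TALE mRNA
    T = b_T * m_T / y_T               # TALE protein
    m_G = c * a_G / (y_m * T ** n)    # GFP mRNA, with a_G = a_Gmax * K_D
    return b_G * m_G / y_G            # GFP protein


K_D = 2.19 / 9.85  # k_off / k_on from the parameter table (n = 1)
params = dict(
    a_G=3.78 * K_D, b_G=3.65, y_G=0.0347,       # from the parameter table
    a_T=3.78, b_T=3.65, y_T=0.0347, y_m=0.2,    # placeholder values (assumptions)
)

for c in (1, 10, 600):
    g_n1 = steady_state_gfp(c, 1, **params)
    g_n2 = steady_state_gfp(c, 2, **params)
    print(f"c = {c:3d}  G(n=1) = {g_n1:.4g}  G(n=2) = {g_n2:.4g}")
```

                                <p>Under these assumptions the $n = 1$ column does not change with $c$, while the $n = 2$ column falls as $1/c$, which is the behavior the factor $c/c^n$ describes.</p>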
        <br>
+
                                <br>
      </ul>
+
                            </ul>
      </li>  
+
                        </li>  
    </ul>
+
                    </ul>
    </div>
+
                </div>
  
  
    <div id="PlasmidCopyNumber">  
+
                <div id="PlasmidCopyNumber">  
    <h2>Plasmid copy number</h2>
+
                    <h2>Plasmid copy number</h2>
    <p>The expression levels in a genetic circuit are strongly correlated with the plasmid copy number of the DNA template <cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29553576">Segall-Shapiro et al. (2018)</a></cite>. The plasmid copy number can change when transferred between organisms. Therefore, there is a need for consistent expression across a wide range of plasmid copy numbers if the same genetic circuit is used in different organisms.  
+
                    <p>The expression levels in a genetic circuit are strongly correlated with the plasmid copy number of the DNA template <cite><a href="https://www.ncbi.nlm.nih.gov/pubmed/29553576">Segall-Shapiro et al. (2018)</a></cite>. The plasmid copy number can change when the plasmid is transferred between organisms. Therefore, there is a need for expression levels independent of plasmid copy number if the same genetic circuit is used in different organisms.  
      The steady-state solution of our model tells us that when our repressor binding is fully non-cooperative, n = 1, we have complete independence of plasmid copy number:
+
                        The steady-state solution of our model tells us that when our repressor binding is fully non-cooperative, n = 1, we have complete independence of plasmid copy number:
    </p> <br>
+
                    </p> <br>
    $$G = \left(\frac{\color{red}c}{\color{red}c^\color{red}n}\right)_{\color{red}n\color{red}=\color{red}1} \left(\frac{a_Gb_Gy_T^ny_m^n}{a_T^nb_T^ny_Gy_m}\right)$$  
+
                    $$G = \left(\frac{\color{red}c}{\color{red}c^\color{red}n}\right)_{\color{red}n\color{red}=\color{red}1} \left(\frac{a_Gb_Gy_T^ny_m^n}{a_T^nb_T^ny_Gy_m}\right)$$  
    <br>
+
                    <br>
    <br>
+
                    <br>
    <p>This formula, however, relies on the simplifying assumptions listed above. To see how the system behaves without these assumptions, we implemented the full system of ordinary differential equations (Figure 4). </p>
+
                    <p>This formula, however, relies on the simplifying assumptions listed above. To see how the system behaves without these assumptions, we implemented the full system of ordinary differential equations (Figure 4). </p>
    <img src="https://static.igem.org/mediawiki/2019/e/e4/T--TUDelft--copynumber.svg" style="width:80%;border:1px solid #00a6d6;" class="centermodel"
+
                    <img src="https://static.igem.org/mediawiki/2019/e/e4/T--TUDelft--copynumber.svg" style="width:80%;border:1px solid #00a6d6;" class="centermodel"
      alt="TALE system">
+
                        alt="TALE system">
    <figcaption class="centermodel"> <b>Figure 4</b>: Steady-state GOI production for plasmid copy numbers from 1 to 600 (from genome integration to a high-copy-number plasmid). </figcaption> <br>
+
                    <figcaption class="centermodel"> <b>Figure 4</b>: Steady-state GOI production for plasmid copy numbers from 1 to 600 (from genome integration to a high-copy-number plasmid). </figcaption> <br>
    <p>The full model, without the simplifying assumptions, gives the same expression level regardless of plasmid copy number (Figure 4). We can therefore transfer our circuit between organisms and expect the expression of the GOI to be independent of changes in the copy number of our orthogonal plasmid. </p> <br>
+
                    <p>The full model, without the simplifying assumptions, gives the same expression level regardless of plasmid copy number (Figure 4). We can therefore transfer our circuit between organisms and expect the expression of the GOI to be independent of changes in the copy number of our orthogonal plasmid. </p> <br>
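                    <p>To illustrate the kind of calculation involved, here is a minimal Python sketch that integrates a <em>reduced</em> version of the model in time with a forward Euler scheme. It is not the team's MATLAB implementation and not the full system: promoter binding is replaced by the quasi-steady-state term $K_D / (K_D + T^n)$, with $a_{Gmin} = 0$. Values of $a_T$, $b_T$, $y_T$ and $y_m$ are placeholders chosen for illustration; the other parameters come from the table. The name <code>simulate_gfp</code> is ours.</p>

```python
def simulate_gfp(c, n=1, t_end=1000.0, dt=0.01):
    """Integrate the reduced iFFL model with forward Euler and return GFP at t_end (min)."""
    # Values from the parameter table.
    k_on, k_off = 9.85, 2.19
    a_Gmax, b_G, y_G = 3.78, 3.65, 0.0347
    # Placeholder values (assumptions, not taken from the table).
    a_T, b_T, y_T, y_m = 3.78, 3.65, 0.0347, 0.2

    K_D = k_off / k_on
    m_T = T = m_G = G = 0.0
    for _ in range(round(t_end / dt)):
        dm_T = c * a_T - y_m * m_T
        dT = b_T * m_T - y_T * T
        # Quasi-steady-state promoter occupancy with a_Gmin = 0.
        dm_G = c * a_Gmax * K_D / (K_D + T ** n) - y_m * m_G
        dG = b_G * m_G - y_G * G
        m_T += dt * dm_T
        T += dt * dT
        m_G += dt * dm_G
        G += dt * dG
    return G


for c in (1, 10, 100, 600):
    print(f"c = {c:3d}  G(t_end) = {simulate_gfp(c):.4f}")
```

                    <p>With these placeholder values the final GFP level is nearly identical across copy numbers for $n = 1$, in line with the analytical result; a stiff solver would be the more robust choice for the full system including the binding reactions.</p>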
    <h3>Wet lab</h3>
+
                    <h3>Wet lab</h3>
    <p>
+
                    <p>
      We tested the prediction of plasmid copy number independence by implementing the iFFL system, with GFP as the output. We cloned the system in backbones containing different origins of replication. As a control, we also cloned GFP into the same backbones to demonstrate different levels of expression.
+
                        We tested the prediction of plasmid copy number independence by implementing the iFFL system, with GFP as the output. We cloned the system in backbones containing different origins of replication. As a control, we also cloned GFP into the same backbones to demonstrate different levels of expression.
      More info can be found here.  
+
                        More info can be found here.  
    </p>
+
                    </p>
    </div>
+
                </div>
  
    <div id="TranscriptionalVariations">  
+
                <div id="TranscriptionalVariations">  
    <h2>Transcriptional variations</h2>
+
                    <h2>Transcriptional variations</h2>
    <p>  
+
                    <p>  
      A promoter can differ in strength between organisms <cite><a href="https://academic.oup.com/femsle/article/348/2/87/731695">(Yang et al., 2018)</a></cite>. Thus, using the same promoter in different organisms can give unpredictable behavior. The steady-state solution of our model tells us that the steady-state expression level of the GOI depends only on the ratio of the transcription rates of the GOI and TALE.  
+
                        A promoter can differ in strength between organisms <cite><a href="https://academic.oup.com/femsle/article/348/2/87/731695">(Yang et al., 2018)</a></cite>. Thus, using the same promoter in different organisms can give unpredictable behavior. The steady-state solution of our model tells us that the steady-state expression level of the GOI depends only on the ratio of the transcription rates of the GOI and TALE.  
    </p> <br>
+
                    </p> <br>
  
    $$G = \left(\frac{c}{c^n}\right) \left(\frac{\color{red}a_\color{red}G \color{black} b_Gy_T^ny_m^n}{\color{red}a_\color{red}T^\color{red}n \color{black} b_T^ny_Gy_m}\right)$$  
+
                    $$G = \left(\frac{c}{c^n}\right) \left(\frac{\color{red}a_\color{red}G \color{black} b_Gy_T^ny_m^n}{\color{red}a_\color{red}T^\color{red}n \color{black} b_T^ny_Gy_m}\right)$$  
    <br>
+
                    <br>
  
    In our full kinetic model we vary both transcription parameters. In figure 5 we plot the resulting steady-state solutions as a function of the transcription rate of TALE and GFP.
+
                    In our full kinetic model we vary both transcription parameters. In figure 5 we plot the resulting steady-state solutions as a function of the transcription rate of TALE and GFP.
    <br>
+
                    <br>
    <img src="https://static.igem.org/mediawiki/2019/f/f8/T--TUDelft--transcriptionvariation.svg" style="width:85%;border:1px solid #00a6d6;" class="centermodel"
+
                    <img src="https://static.igem.org/mediawiki/2019/f/f8/T--TUDelft--transcriptionvariation.svg" style="width:85%;border:1px solid #00a6d6;" class="centermodel"
      alt="TALE system">
+
                        alt="TALE system">
    <figcaption class="centermodel"><b>Figure 5</b>: Steady-state GFP production while transcription rates of both TALE and GOI are changed. The lines indicate a constant ratio of transcription rates. </figcaption>
+
                    <figcaption class="centermodel"><b>Figure 5</b>: Steady-state GFP production while transcription rates of both TALE and GOI are changed (aT/aG = constant). The lines indicate a constant ratio of transcription rates. </figcaption>
    <br>
+
                    <br>
    <p>The full kinetic model shows that the expression level of GFP stays the same when the ratio of the transcription rates of TALE and GFP remains constant (figure 5). To achieve a constant ratio of transcription rates in our genetic circuit, we use the orthogonal T7 transcription system, whose promoters are transcribed by its own RNA polymerase. We implemented T7 promoters of varying strengths relative to the wild type, developed by Komura et al. (2018). More information can be found <a href="https://2019.igem.org/Team:TUDelft/Design">here</a>.  
+
                    <p>The full kinetic model shows that the expression level of GFP stays the same when the ratio of the transcription rates of TALE and GFP remains constant (figure 5). To achieve a constant ratio of transcription rates in our genetic circuit, we use the orthogonal T7 transcription system, whose promoters are transcribed by its own RNA polymerase. We implemented T7 promoters of varying strengths relative to the wild type, developed by Komura et al. (2018). More information can be found <a href="https://2019.igem.org/Team:TUDelft/Design">here</a>.  
    </p>
+
                    </p>
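                    <p>The ratio argument can be checked with a few lines of Python. This is a sketch under stated assumptions rather than the model used for Figure 5: it uses the quasi-steady-state form of the reduced model with $n = 1$, table values for $k_{on}$, $k_{off}$, $b_G$ and $y_G$, and placeholder values for $b_T$, $y_T$, $y_m$ and the copy number $c$. The function name <code>gfp_steady_state</code> is ours.</p>

```python
def gfp_steady_state(a_T, a_Gmax, c=10.0, n=1):
    """Steady-state GFP of the reduced model for given TALE and GFP transcription rates."""
    k_on, k_off = 9.85, 2.19             # from the parameter table
    b_G, y_G = 3.65, 0.0347              # from the parameter table
    b_T, y_T, y_m = 3.65, 0.0347, 0.2    # placeholder values (assumptions)

    K_D = k_off / k_on
    T = c * a_T * b_T / (y_m * y_T)                   # steady-state TALE
    m_G = c * a_Gmax * K_D / (K_D + T ** n) / y_m     # steady-state GFP mRNA
    return b_G * m_G / y_G


base = gfp_steady_state(a_T=3.78, a_Gmax=3.78)         # both promoters at full strength
both_half = gfp_steady_state(a_T=1.89, a_Gmax=1.89)    # both promoters at 50 % strength
only_gfp_half = gfp_steady_state(a_T=3.78, a_Gmax=1.89)  # only the GFP promoter weakened

print(f"base          = {base:.4f}")
print(f"both halved   = {both_half:.4f}")
print(f"GFP halved    = {only_gfp_half:.4f}")
```

                    <p>Halving both rates leaves the output essentially unchanged because their ratio is preserved, whereas halving only the GFP transcription rate halves the output.</p>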
    <h3>Wet lab</h3>
+
                    <h3>Wet lab</h3>
    <p>  
+
                    <p>  
  
      We confirmed the predicted independence of transcription rate when the ratio of the transcription rates of both genes is maintained. <br> <br>
+
                        <b>We have successfully demonstrated the predicted independence of transcription rate when the ratio of the transcription rates of both genes is maintained</b>. <br> <br>
  
      We made variations of the system where we changed the promoters of both genes to a 50% strength version of that same promoter (figure 6). As a control, we also cloned GFP without repression under the control of these same promoters.<br>  
+
                        We made variations of the system where we changed the promoters of both genes to a 50% strength version of that same promoter (figure 6). As a control, we also cloned GFP without repression under the control of these same promoters.<br>  
        <img src="https://static.igem.org/mediawiki/2019/5/5c/T--TUDelft--promoter_variation_wetlab_model.svg" style="width:60%;border:1px solid #00a6d6;" class="centermodel"
+
                        <img src="https://static.igem.org/mediawiki/2019/5/5c/T--TUDelft--promoter_variation_wetlab_model.svg" style="width:60%;border:1px solid #00a6d6;" class="centermodel"
        alt="TALE system">
+
                            alt="TALE system">
          <figcaption class="centermodel"><b>Figure 6</b>: Steady-state GFP fluorescence measurement of promoter variation using FACS. The graph depicts T7 and 0.5 T7 iFFL systems, expected to give the same fluorescence according to the model. As a control, GFP under control of an unrepressed T7 promoter was used. </figcaption>
+
                        <figcaption class="centermodel"><b>Figure 6</b>: Steady-state GFP fluorescence measurement of promoter variation using FACS. The graph depicts T7 and medium T7 iFFL systems, expected to give the same fluorescence according to the model. As a control, GFP under control of an unrepressed T7 promoter was used. </figcaption>
        <br>
+
                        <br>
      Furthermore, we also tested independence to transcriptional variation by using different IPTG concentrations (figure 7). Again, as a control, we cloned GFP without repression under the control of these same promoters.
+
                        Furthermore, we also tested independence to transcriptional variation by using different IPTG concentrations (figure 7). Again, as a control, we cloned GFP without repression under the control of these same promoters.
      <br> <br>
+
                        <br> <br>
<img src="https://static.igem.org/mediawiki/2019/a/a9/T--TUDelft--IPTGtitration.svg" style="width:60%;border:1px solid #00a6d6;" class="centermodel"
+
                        <img src="https://static.igem.org/mediawiki/2019/a/a9/T--TUDelft--IPTGtitration.svg" style="width:60%;border:1px solid #00a6d6;" class="centermodel"
        alt="TALE system">
+
                            alt="TALE system">
        <figcaption class="centermodel"><b>Figure 7</b>: Steady-state GFP fluorescence measurement of IPTG titration using FACS. The graph depicts a T7 iFFL system induced using different levels of IPTG, which according to the model should give the same result. As a control GFP under control of an unrepressed T7 promoter was used. </figcaption>
+
                        <figcaption class="centermodel"><b>Figure 7</b>: Steady-state GFP fluorescence measurement of IPTG titration using FACS. The graph depicts a T7 iFFL system induced using different levels of IPTG, which according to the model should give the same result. As a control GFP under control of an unrepressed T7 promoter was used. </figcaption>
        <br>
+
                        <br>
      <br>
+
                        <br>
      More information can be found <a href="https://2019.igem.org/Team:TUDelft/Results">here</a>.
+
                     
    </p>
+
                        More information can be found <a href="https://2019.igem.org/Team:TUDelft/Results">here</a>.
    </div>
+
                        <br><br>
 +
                       
 +
                        <a href="https://static.igem.org/mediawiki/2019/4/48/T--TUDelft--iFFL_ODE_transcription.m" >Download our code here.</a>
 +
                    </p>
 +
                </div>
  
    <div id="TranslationalVariations">  
+
                <div id="TranslationalVariations">  
    <h2>Translational variations</h2>
+
                    <h2>Translational variations</h2>
    <p>  
+
                    <p>  
      As with transcription, our model's steady-state solution tells us that the steady-state expression level of the GOI depends on translation only through the translation rates of the GOI and TALE (highlighted in red):
+
                        As with transcription, our model's steady-state solution tells us that the steady-state expression level of the GOI depends on translation only through the translation rates of the GOI and TALE (highlighted in red):
    </p> <br>
+
                    </p> <br>
    $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_G \color{red}b_\color{red}Gy_T^ny_m^n}{a_T \color{red}b_\color{red}Ty_Gy_m}\right)$$  
+
                    $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_G \color{red}b_\color{red}Gy_T^ny_m^n}{a_T \color{red}b_\color{red}Ty_Gy_m}\right)$$  
    <br>
+
                    <br>
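<p>The ratio argument can be checked numerically. The following is a minimal sketch that evaluates the steady-state expression exactly as written above, assuming that b_G and b_T are the translation rates highlighted in red. The function name and all parameter values are illustrative placeholders, not fitted values from our model.</p>

```python
import math


def steady_state_goi(c, n, a_G, b_G, a_T, b_T, y_G, y_T, y_m):
    """Evaluate the steady-state GOI expression as written in the equation above."""
    return (c / c**n) * (a_G * b_G * y_T**n * y_m**n) / (a_T * b_T * y_G * y_m)


# Placeholder parameters (assumed for illustration only).
params = dict(c=10.0, n=1.0, a_G=2.0, a_T=1.0, y_G=0.02, y_T=0.02, y_m=0.2)

# Same RBS on both genes: both translation rates change by the same factor.
base = steady_state_goi(b_G=5.0, b_T=5.0, **params)
scaled = steady_state_goi(b_G=15.0, b_T=15.0, **params)

# Only the GOI translation rate changes: the ratio b_G / b_T is no longer constant.
unbalanced = steady_state_goi(b_G=15.0, b_T=5.0, **params)

print(math.isclose(base, scaled), math.isclose(base, unbalanced))
```

<p>In this sketch the output is unchanged when both translation rates are scaled together, and changes when only one of them is scaled.</p>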
  
    <p>In figure 8 we plot the resulting steady-state solutions as a function of the translation rates of TALE and GOI, using the full kinetic model to see how the system behaves without the simplifying assumptions.
+
                    <p>In figure 8 we plot the resulting steady-state solutions as a function of the translation rates of TALE and GOI, using the full kinetic model to see how the system behaves without the simplifying assumptions.
    </p>
+
                    </p>
    <br>
+
                    <br>
  
    <img src="https://static.igem.org/mediawiki/2019/4/4c/T--TUDelft--translationvariation.svg" style="width:85%;border:1px solid #00a6d6;" class="centermodel"
+
                    <img src="https://static.igem.org/mediawiki/2019/4/4c/T--TUDelft--translationvariation.svg" style="width:85%;border:1px solid #00a6d6;" class="centermodel"
      alt="TALE system">
+
                        alt="TALE system">
    <figcaption class="centermodel"><b>Figure 8</b>: Steady-state GOI production while the translation rates of both TALE and GOI are varied. The lines indicate constant ratios of the two translation rates. </figcaption> <br>
+
                    <figcaption class="centermodel"><b>Figure 8</b>: Steady-state GOI production while the translation rates of both TALE and GOI are varied. The lines indicate constant ratios of the two translation rates. </figcaption> <br>
    <p>As can be seen in figure 8, the full kinetic model maintains the same level of GFP expression when the translation rates of both genes remain in a constant ratio. To keep this ratio constant across organisms, we used the same ribosome binding site (RBS) for both genes. Using the same RBS ensures that translation initiation for both genes changes in a similar manner <cite><a href="https://www.nature.com/articles/nbt.1568"> (Salis et al., 2009)</a></cite>. More on the design choices can be found <a href="https://2019.igem.org/Team:TUDelft/Results">here</a>. </p>
+
                    <p>As can be seen in figure 8, the full kinetic model maintains the same level of GFP expression when the translation rates of both genes remain in a constant ratio. To keep this ratio constant across organisms, we used the same ribosome binding site (RBS) for both genes. Using the same RBS ensures that translation initiation for both genes changes in a similar manner <cite><a href="https://www.nature.com/articles/nbt.1568"> (Salis et al., 2009)</a></cite>. More on the design choices can be found <a href="https://2019.igem.org/Team:TUDelft/Results">here</a>. </p>
  
    <h3>Wet lab</h3>
+
                    <h3>Wet lab</h3>
    <p>
+
                    <p>
      We tested the predicted independence from variation in translation rates by implementing the iFFL system with GFP as the output. We made variations of the system in which we changed both ribosome binding sites in the same way. As a control, we also cloned GFP without repression under the control of these ribosome binding sites. More information can be found <a href="https://2019.igem.org/Team:TUDelft/Results">here</a>.
+
                        We tested the predicted independence from variation in translation rates by implementing the iFFL system with GFP as the output. We made variations of the system in which we changed both ribosome binding sites in the same way. As a control, we also cloned GFP without repression under the control of these ribosome binding sites. More information can be found <a href="https://2019.igem.org/Team:TUDelft/Results">here</a>.
    </p>
+
                    </p>
    </div>
+
                </div>
 +
<div id="Insulations">
 +
                    <h2>Importance of insulation </h2>
 +
                    <p>
 +
                        In our model solutions so far, we assumed the promoter of the GOI to be completely insulated from transcription of the TALE gene. In reality, however, when two transcription units are placed in series, leaky expression of the second gene can occur because the terminator of the first gene is not perfectly efficient <cite><a href="https://www.nature.com/articles/nmeth.2515">(Chen et al., 2013)</a></cite>. The iFFL system originally developed by <cite><a href="https://www.nature.com/articles/nbt.4111">Segall-Shapiro et al. (2018)</a></cite> uses the ECK120029600 terminator for the TALE protein. This terminator has a reported efficiency of 1/612, meaning that for every 612 TALE proteins produced, 1 protein of the GOI is made <cite><a href="https://www.nature.com/articles/nmeth.2515">(Chen et al., 2013)</a></cite>. We incorporated this efficiency into our model and solved again for the steady-state GOI expression level to see the effect of terminator efficiency on plasmid copy number independence.
 +
                <br>
 +
                <br>
 +
                <img src="https://static.igem.org/mediawiki/2019/0/0e/T--TUDelft--leakyterminator.svg" style="width:70%;border:1px solid #00a6d6;" class="centermodel"
 +
                    alt="TALE system">
 +
                <figcaption class="centermodel"><b>Figure 11</b>: Comparison of the effect of a perfect terminator and a leaky terminator on the expression level at different plasmid copy numbers. </figcaption>
 +
                <br>
 +
                <p>The model shows that leaky expression negatively impacts the system's ability to adapt to plasmid copy number. We therefore designed our system so that the transcriptional unit of TALE is in a different orientation than the transcriptional unit of the GOI.</p>
 +
                <br>
 +
                <br>
 +
                <img src="https://static.igem.org/mediawiki/2019/b/bf/T--TUDelft--TALEsystemimprove.png" style="width:70%;border:1px solid #00a6d6;" class="centermodel"
 +
                    alt="TALE system">
 +
                <figcaption class="centermodel"><b>Figure 12</b>: Design of genetic circuit, to circumvent the issue of leaky expression </figcaption>
  
    <div id="CodonUsage">  
+
                <br>
    <h2>Codon Usage - Cross-species codon harmonization</h2>
+
            </div>
 +
                <div id="CodonUsage">  
 +
                    <h2>Codon Usage - Cross-species codon harmonization</h2>
  
    <p>When transferring genetic circuits across different organisms, translation does not depend only on translation initiation; it also depends on codon usage. Heterologous proteins expressed in new bacterial host cells have shown altered protein levels compared to those in the original microorganism. One of the reasons for a lower expression level is the variance in codon usage between the original organism and the new host cell <a href="#Angov2008">(Angov et al., 2008)</a>.
+
                    <p>When transferring genetic circuits across different organisms, translation does not depend only on translation initiation; it also depends on codon usage. Heterologous proteins expressed in new bacterial host cells have shown altered protein levels compared to those in the original microorganism. One of the reasons for a lower expression level is the variance in codon usage between the original organism and the new host cell <a href="#Angov2008">(Angov et al., 2008)</a>.
      The foundation of the variance in codon usage is written in the DNA sequence itself.
+
                        The foundation of the variance in codon usage is written in the DNA sequence itself.
      Protein structures are dependent on the DNA sequence, which is translated into a functional protein through two subsequent cellular processes: transcription and translation <a href="#Angov2008">(Angov et al., 2008)</a>. <br>
+
                        Protein structures are dependent on the DNA sequence, which is translated into a functional protein through two subsequent cellular processes: transcription and translation <a href="#Angov2008">(Angov et al., 2008)</a>. <br>
      <br>
+
                        <br>
      In general, bacterial cells use 20 different amino acids encoded by 61 codons (64 codons minus the 3 stop codons), which results in a phenomenon called synonymous codon usage. Synonymous codon usage means that most of the 20 amino acids are encoded by more than one codon <a href="#Nascimento2018"> (Nascimento et al., 2018)</a>. Nascimento et al. (2018) reported that cells make extensive use of the codon choice that this offers, since codon usage directly affects both the level of mRNA copies and the translation rate. They showed that proteins expressed at high levels have more mRNA copies and contain more frequently used codons in order to speed up the translation rate <a href="#Nascimento2018"> (Nascimento et al., 2018)</a>.
+
                        In general, bacterial cells use 20 different amino acids encoded by 61 codons (64 codons minus the 3 stop codons), which results in a phenomenon called synonymous codon usage. Synonymous codon usage means that most of the 20 amino acids are encoded by more than one codon <a href="#Nascimento2018"> (Nascimento et al., 2018)</a>. Nascimento et al. (2018) reported that cells make extensive use of the codon choice that this offers, since codon usage directly affects both the level of mRNA copies and the translation rate. They showed that proteins expressed at high levels have more mRNA copies and contain more frequently used codons in order to speed up the translation rate <a href="#Nascimento2018"> (Nascimento et al., 2018)</a>.
      In order to increase the expression level of the heterologous protein in the host cell, new codon optimization tools were developed. However, it remains difficult to predict which tool will design the optimal sequences <a href="#Mignon2008">(Mignon et al., 2008)</a>. The codon optimization tools available now can be divided into two main groups based on how the tool's algorithm functions: <br>
+
                        In order to increase the expression level of the heterologous protein in the host cell, new codon optimization tools were developed. However, it remains difficult to predict which tool will design the optimal sequences <a href="#Mignon2008">(Mignon et al., 2008)</a>. The codon optimization tools available now can be divided into two main groups based on how the tool's algorithm functions: <br>
  
      <br>
+
                        <br>
      <br>
+
                        <br>
  
  
  
  
    <ol>
+
                    <ol>
      <li> <b><u>Codon optimization tools</u></b>: The basic idea of a codon optimization tool is to achieve the highest translation rate possible. The translation rate is increased by substituting each codon with the codon that is used most frequently for the corresponding amino acid and by keeping the ribosome binding site (RBS) freely accessible to the ribosomal subunit by avoiding hairpin formation at the translation initiation site <a href="#Puigbo2008"> (Puigbò et al., 2018)</a>. <br> The relative codon frequencies are calculated as in the Codon Adaptation Index (CAI), as shown in the following equation. In this equation, $w_i$ is the relative adaptiveness of codon $i$ used in the CAI, $f_i$ is the frequency of that codon, and $max(f_i)$ is the frequency of the codon that is used most frequently for the corresponding amino acid. </li>
+
                        <li> <b><u>Codon optimization tools</u></b>: The basic idea of a codon optimization tool is to achieve the highest translation rate possible. The translation rate is increased by substituting each codon with the codon that is used most frequently for the corresponding amino acid and by keeping the ribosome binding site (RBS) freely accessible to the ribosomal subunit by avoiding hairpin formation at the translation initiation site <a href="#Puigbo2008"> (Puigbò et al., 2018)</a>. <br> The relative codon frequencies are calculated as in the Codon Adaptation Index (CAI), as shown in the following equation. In this equation, $w_i$ is the relative adaptiveness of codon $i$ used in the CAI, $f_i$ is the frequency of that codon, and $max(f_i)$ is the frequency of the codon that is used most frequently for the corresponding amino acid. </li>
      <p>$$w_i = \frac{f_i}{max( f_i )}$$</p>  
+
                        <p>$$w_i = \frac{f_i}{max( f_i )}$$</p>  
      <li> <b><u>Codon harmonization tools</u></b>: The basic idea of a codon harmonization tool is to mimic the native translation rate in the host organism by using rare codons at specific places and avoiding hairpin formation as much as possible. The codon usage of the original microorganism functions as the reference point <a href="#Athey2017">(Athey et al., 2017)</a>. This approach allows pre-folding of the protein during translation in order to reduce the chance of the protein misfolding as much as possible (<b>Figure 9</b>). </li>
+
                        <li> <b><u>Codon harmonization tools</u></b>: The basic idea of a codon harmonization tool is to mimic the native translation rate in the host organism by using rare codons at specific places and avoiding hairpin formation as much as possible. The codon usage of the original microorganism functions as the reference point <a href="#Athey2017">(Athey et al., 2017)</a>. This approach allows pre-folding of the protein during translation in order to reduce the chance of the protein misfolding as much as possible (<b>Figure 9</b>). </li>
      <figure>
+
                        <figure>
      <img src="https://static.igem.org/mediawiki/2019/0/0d/T--TUDelft--Ozzy.png" style="width:100%"
+
                            <img src="https://static.igem.org/mediawiki/2019/0/0d/T--TUDelft--Ozzy.png" style="width:100%"
        alt="Translation rate">
+
                                alt="Translation rate">
      <figcaption><b>Figure 9</b>: Schematic representation of protein translation. The green parts are encoded with high frequency codons in order to speed up the translation rate. The red parts are encoded by rare codons in order to slow down the translation rate, which limits misfolding of proteins by creating a small time window for protein pre-folding. Codon harmonization aims to create the same codon usage pattern as the native host in order to increase the amount of functional protein.</figcaption>
+
                            <figcaption><b>Figure 9</b>: Schematic representation of protein translation. The green parts are encoded with high frequency codons in order to speed up the translation rate. The red parts are encoded by rare codons in order to slow down the translation rate, which limits misfolding of proteins by creating a small time window for protein pre-folding. Codon harmonization aims to create the same codon usage pattern as the native host in order to increase the amount of functional protein.</figcaption>
      </figure>  
+
                        </figure>  
    </ol>
+
                    </ol>
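<p>The relative adaptiveness $w_i$ from the equation in the list above can be computed from a codon usage table. The sketch below is illustrative only: the function name is hypothetical and the codon counts are made up, not taken from any real organism.</p>

```python
from collections import defaultdict


def relative_adaptiveness(codon_counts, codon_to_aa):
    """Return w_i = f_i / max(f_i), with the maximum taken over synonymous codons."""
    max_per_aa = defaultdict(int)
    for codon, count in codon_counts.items():
        aa = codon_to_aa[codon]
        max_per_aa[aa] = max(max_per_aa[aa], count)
    weights = {}
    for codon, count in codon_counts.items():
        best = max_per_aa[codon_to_aa[codon]]
        # Guard against amino acids that never occur in the usage table.
        weights[codon] = count / best if best else 0.0
    return weights


# Made-up counts for two amino acids (lysine and phenylalanine).
codon_to_aa = {"AAA": "K", "AAG": "K", "TTT": "F", "TTC": "F"}
codon_counts = {"AAA": 30, "AAG": 10, "TTT": 20, "TTC": 20}

w = relative_adaptiveness(codon_counts, codon_to_aa)
print(w)
```

<p>Because $w_i$ is a ratio within one amino acid, raw counts and normalized frequencies give the same result. An optimization tool in the sense described above would pick, for every amino acid, the codon with $w_i = 1$.</p>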
  
    <br>
+
                    <br>
    <br>
+
                    <br>
    <p> A current limitation of both kinds of codon adaptation tools is that they cannot design one single DNA part that has been adapted to function across multiple species. We boosted our project by creating the first cross-species codon harmonization tool. This harmonization tool provides the user with a single DNA coding sequence that is designed to yield the same protein expression level in different bacterial host cells. The codon harmonization approach as explained above forms the core of our algorithm. We modified this algorithm by making use of the statistical method of least-squares variance. Furthermore, we made our tool BioBrick RFC compatible by removing standard type II restriction sites. </p>
+
                    <p> A current limitation of both kinds of codon adaptation tools is that they cannot design one single DNA part that has been adapted to function across multiple species. We boosted our project by creating the first cross-species codon harmonization tool. This harmonization tool provides the user with a single DNA coding sequence that is designed to yield the same protein expression level in different bacterial host cells. The codon harmonization approach as explained above forms the core of our algorithm. We modified this algorithm by making use of the statistical method of least-squares variance. Furthermore, we made our tool BioBrick RFC compatible by removing standard type II restriction sites. </p>
    <br>
+
                    <br>
  
    <ul class="accordion">
+
                    <ul class="accordion">
      <li>
+
                        <li>
      <a class="toggle" href="javascript:void(0);" ><b>Click here to find out more about the codon harmonization</b><span style="float:right;"><b>&#xfe40;</b></span></a>
+
                            <a class="toggle" href="javascript:void(0);" ><b>Click here to find out more about the codon harmonization</b><span style="float:right;"><b>&#xfe40;</b></span></a>
      <ul class="inner accordion">
+
                            <ul class="inner accordion">
        <h4>Preselection data</h4>
+
                                <h4>Preselection data</h4>
  
        <p>Our harmonization tool is based on a database containing the codon usage of 152903 microorganisms, obtained from a paper published by <a href="#Athey2017">Athey et al. (2017)</a>. Since working with such a big database slows down the tool, we designed a MATLAB script that preselects the data of the microorganisms of interest. This MATLAB script is available in the supplementary list below. <br>
+
                                <p>Our harmonization tool is based on a database containing the codon usage of 152903 microorganisms, obtained from a paper published by <a href="#Athey2017">Athey et al. (2017)</a>. Since working with such a big database slows down the tool, we designed a MATLAB script that preselects the data of the microorganisms of interest. This MATLAB script is available in the supplementary list below. <br>
  
        The taxonomy identification number (taxid) is used as input for the preselection in order to reduce the chance of a mistake caused by a typing error in the name. Each organism name corresponds to a specific taxid, which makes it easier to search for a specific organism or strain in the NCBI Taxonomy database <a href="#Federhen2011"> (Federhen et al., 2011)</a>. <br>
+
                                    The taxonomy identification number (taxid) is used as input for the preselection in order to reduce the chance of a mistake caused by a typing error in the name. Each organism name corresponds to a specific taxid, which makes it easier to search for a specific organism or strain in the NCBI Taxonomy database <a href="#Federhen2011"> (Federhen et al., 2011)</a>. <br>
        As mentioned before, the codon harmonization approach functions as the core of our code, which means that we create the same codon usage pattern as the native host in order to increase the amount of functional protein expression. The native organism's own codon usage functions as the reference point for the tool.
+
                                    As mentioned before, the codon harmonization approach functions as the core of our code, which means that we create the same codon usage pattern as the native host in order to increase the amount of functional protein expression. The native organism's own codon usage functions as the reference point for the tool.
        <br>
+
                                    <br>
  
        In our case, we used eGFP as the heterologous protein and <i>Escherichia coli</i> strain BL21(DE3) as native host (taxid 469008). The actual native organism for the eGFP protein is the jellyfish <i>Aequorea victoria</i> (taxid 6100). However, we first designed our tool to function in bacterial hosts before expanding the tool further to eukaryotic cells. <br>
+
                                    In our case, we used eGFP as the heterologous protein and <i>Escherichia coli</i> strain BL21(DE3) as native host (taxid 469008). The actual native organism for the eGFP protein is the jellyfish <i>Aequorea victoria</i> (taxid 6100). However, we first designed our tool to function in bacterial hosts before expanding the tool further to eukaryotic cells. <br>
        When generating the filtered database containing data of only the organisms of interest, we designed the code in such a way that the first input row will contain data of the reference organism and the rows below contain data of the potential new host organisms. Each new row is a new potential organism. A schematic representation of the preselection is shown in <b>Figure 10</b>. </p>  
+
                                    When generating the filtered database containing data of only the organisms of interest, we designed the code in such a way that the first input row will contain data of the reference organism and the rows below contain data of the potential new host organisms. Each new row is a new potential organism. A schematic representation of the preselection is shown in <b>Figure 10</b>. </p>  
  
        <figure>
+
                                <figure>
        <img src="https://static.igem.org/mediawiki/2019/f/fb/T--TUDelft--data_filter.gif" style="width:100%"
+
                                    <img src="https://static.igem.org/mediawiki/2019/f/fb/T--TUDelft--data_filter.gif" style="width:100%"
          alt="Translation rate">
+
                                        alt="Translation rate">
        <figcaption><b>Figure 10:</b> A schematic representation of the preselection code that obtains data of only the organisms of interest from the main database. The top row corresponds to the codon usage of the reference organism, while each following row corresponds to the codon usage of one of the organisms of interest. In the preselection code, the taxid of each organism of interest is used as a query to find the right row in the main database. The selected rows are combined to form a filtered database containing the codon usage of only the reference organism and the hosts of interest. </figcaption>
+
                                    <figcaption><b>Figure 10:</b> A schematic representation of the preselection code that obtains data of only the organisms of interest from the main database. The top row corresponds to the codon usage of the reference organism, while each following row corresponds to the codon usage of one of the organisms of interest. In the preselection code, the taxid of each organism of interest is used as a query to find the right row in the main database. The selected rows are combined to form a filtered database containing the codon usage of only the reference organism and the hosts of interest. </figcaption>
        </figure>  
+
                                </figure>  
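<p>The preselection step can be sketched as follows. Our own script is written in MATLAB. This Python version only illustrates the idea of looking up rows by taxid and placing the reference organism first. The function name, the row layout, the host taxids 111 and 222, and all codon counts are hypothetical.</p>

```python
def preselect(database, reference_taxid, host_taxids):
    """Return the reference row first, followed by one row per host of interest."""
    by_taxid = {row["taxid"]: row for row in database}
    wanted = [reference_taxid] + [t for t in host_taxids if t != reference_taxid]
    missing = [t for t in wanted if t not in by_taxid]
    if missing:
        raise KeyError(f"taxid(s) not found in database: {missing}")
    return [by_taxid[t] for t in wanted]


# Tiny stand-in for the full codon usage database (counts are made up).
database = [
    {"taxid": 111, "name": "hypothetical host A", "AAA": 12, "AAG": 30},
    {"taxid": 469008, "name": "Escherichia coli BL21(DE3)", "AAA": 40, "AAG": 13},
    {"taxid": 222, "name": "hypothetical host B", "AAA": 25, "AAG": 25},
]

filtered = preselect(database, reference_taxid=469008, host_taxids=[222, 111])
print([row["taxid"] for row in filtered])
```

<p>Querying by taxid rather than by name avoids mismatches caused by spelling variants, and raising an error for unknown taxids makes a mistyped identifier visible immediately.</p>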
        <br>
+
                                <br>
   
+
  
        <h4>Harmonization code</h4>
 
  
 +
                                <h4>Harmonization code</h4>
  
        Before the harmonized coding sequence for the organisms of interest is generated, the deviation of the endogenous GC content of each organism of interest from the mean GC content is calculated. This calculation has been added to inform the user whether the GC contents of the organisms of interest are too different from each other. If the GC content of one of the organisms deviates more than 5% from the mean, a notification informs the user that the GC contents are too different from each other, which might result in a harmonized coding sequence that does not function in the same way in every organism.
 
  
        <p>First, the codon frequency is calculated in the same way as described by Athey et al. (2017). We used this formula instead of the formula for CAI (equation ....), since the CAI calculation (formula 1) requires a reference gene. However, our system functions across species, so an adapted version of the CAI is used, as shown in the equation obtained from Athey et al. (2017). This adapted version calculates the frequency of the used codon for that specific amino acid instead of calculating the relative codon use for that specific amino acid. .........
+
                                Before the harmonized coding sequence for the organisms of interest is generated, the deviation of the endogenous GC content of each organism of interest from the mean GC content is calculated. This calculation has been added to inform the user whether the GC contents of the organisms of interest are too different from each other. If the GC content of one of the organisms deviates more than 5% from the mean, a notification informs the user that the GC contents are too different from each other, which might result in a harmonized coding sequence that does not function in the same way in every organism.
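<p>A minimal sketch of this GC check is given below. It assumes that the 5% threshold is meant as 5 percentage points of GC content. The function name, organism names and GC values are hypothetical.</p>

```python
def gc_deviation(gc_percent_by_organism, threshold=5.0):
    """Return the mean GC content, per-organism deviations and flagged organisms."""
    values = list(gc_percent_by_organism.values())
    mean_gc = sum(values) / len(values)
    deviations = {name: gc - mean_gc for name, gc in gc_percent_by_organism.items()}
    # Assumption: "more than 5%" is read as more than 5 percentage points.
    flagged = [name for name, dev in deviations.items() if abs(dev) > threshold]
    return mean_gc, deviations, flagged


# Hypothetical endogenous GC contents in percent.
gc_content = {"host_A": 50.0, "host_B": 52.0, "host_C": 66.0}

mean_gc, deviations, flagged = gc_deviation(gc_content)
if flagged:
    print(f"Warning: GC content deviates more than 5 from the mean for {flagged}")
```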
        </p>
+
        $$freq_{codon,i} = \frac{codon}{\sum codon,i}$$
+
        <br>
+
        <br>
+
        <p>
+
        Secondly, the variance for each codon position is calculated separately as an intermediate step for the calculation of the least square variance. During the calculation of the variance, the codon frequency at that particular sequence position is also taken into account in order to remove outliers as much as possible. The calculated variance for each position is ordered from lowest to highest.
+
        <br>
+
        <br>
+
        In the first iteration of generating the final sequence, we use the lowest variance codon at every position. The generated sequence is then screened for type II restriction enzyme recognition sequences to make it MoClo compatible and easier to use in a construct. If a site is found, the codon at that particular position is substituted with the synonymous codon that has the second lowest variance. By going through this iteration cycle multiple times, we derive a single nucleotide sequence that is free of type II restriction sites and codon harmonized for all organisms of interest, with the aim of achieving the same translation rate in each organism of interest.
+
  
 +
                                <p>First, the codon frequency is calculated in the same way as described by Athey et al. (2017). We used this formula instead of the formula for CAI (equation ....), since the CAI calculation (formula 1) requires a reference gene. However, our system functions across species, so an adapted version of the CAI is used, as shown in the equation obtained from Athey et al. (2017). This adapted version calculates the frequency of the used codon for that specific amino acid instead of calculating the relative codon use for that specific amino acid. .........
 +
                                </p>
 +
                                $$freq_{codon,i} = \frac{codon}{\sum codon,i}$$
 +
                                <br>
 +
                                <br>
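<p>One way to read the frequency described above is as the count of a codon divided by the summed counts of all codons that encode the same amino acid. The sketch below follows that reading, which is our assumption. The function name and the counts are illustrative only.</p>

```python
def codon_frequencies(codon_counts, codon_to_aa):
    """Return each codon's share among the codons encoding the same amino acid."""
    totals = {}
    for codon, count in codon_counts.items():
        aa = codon_to_aa[codon]
        totals[aa] = totals.get(aa, 0) + count
    freqs = {}
    for codon, count in codon_counts.items():
        total = totals[codon_to_aa[codon]]
        # Amino acids that never occur get frequency 0 instead of a division error.
        freqs[codon] = count / total if total else 0.0
    return freqs


# Made-up counts for lysine and phenylalanine codons.
codon_to_aa = {"AAA": "K", "AAG": "K", "TTT": "F", "TTC": "F"}
counts = {"AAA": 30, "AAG": 10, "TTT": 20, "TTC": 20}

freqs = codon_frequencies(counts, codon_to_aa)
print(freqs)
```

<p>Unlike the relative adaptiveness used in the CAI, these frequencies sum to 1 for each amino acid and need no reference gene.</p>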
 +
                                <p>
 +
                                    Secondly, the variance for each codon position is calculated separately as an intermediate step for the calculation of the least square variance. During the calculation of the variance, the codon frequency at that particular sequence position is also taken into account in order to remove outliers as much as possible. The calculated variance for each position is ordered from lowest to highest.
 +
                                    <br>
 +
                                    <br>
 +
                                    In the first iteration of generating the final sequence, we use the lowest variance codon at every position. The generated sequence is then screened for type II restriction enzyme recognition sequences to make it MoClo compatible and easier to use in a construct. If a site is found, the codon at that particular position is substituted with the synonymous codon that has the second lowest variance. By going through this iteration cycle multiple times, we derive a single nucleotide sequence that is free of type II restriction sites and codon harmonized for all organisms of interest, with the aim of achieving the same translation rate in each organism of interest.
  
        The deviation of the endogenous GC content of each organism of interest from the mean GC content across these organisms is also reported. (This output is added to inform the user whether the GC contents of the organisms of interest are too different from each other. If the GC contents are too different, a notification will pop up on the screen.)
 
        </p>
 
  
 +
                                    The deviation of the endogenous GC content of each organism of interest from the mean GC content across these organisms is also reported. (This output is added to inform the user whether the GC contents of the organisms of interest are too different from each other. If the GC contents are too different, a notification will pop up on the screen.)
 +
                                </p>
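<p>The selection and screening loop described above can be sketched as follows. This is not our MATLAB implementation. As an assumption, each synonymous codon is scored by the sum of squared differences between its frequency in every host and the frequency of the original codon in the reference organism. All function names, frequencies and the two-codon input are made up for illustration.</p>

```python
def rank_candidates(ref_codon, synonyms, ref_freq, host_freqs):
    """Order synonymous codons from lowest to highest squared deviation."""
    target = ref_freq[ref_codon]

    def score(codon):
        return sum((freq[codon] - target) ** 2 for freq in host_freqs)

    return sorted(synonyms, key=score)


def harmonize(ref_codons, synonyms_of, ref_freq, host_freqs, sites, max_rounds=100):
    """Pick the best-ranked codon per position, then remove restriction sites."""
    ranked = [rank_candidates(c, synonyms_of[c], ref_freq, host_freqs)
              for c in ref_codons]
    choice = [0] * len(ref_codons)  # index into each position's ranking
    for _ in range(max_rounds):
        seq = "".join(r[i] for r, i in zip(ranked, choice))
        hits = [(seq.find(s), len(s)) for s in sites if s in seq]
        if not hits:
            return seq
        start, length = min(hits)
        first, last = start // 3, (start + length - 1) // 3
        # Move the first overlapping codon that still has an alternative
        # to its next-best synonymous codon.
        for pos in range(first, last + 1):
            if choice[pos] + 1 < len(ranked[pos]):
                choice[pos] += 1
                break
        else:
            raise ValueError("no synonymous alternative removes the site")
    raise ValueError("did not converge")


synonyms_of = {"GAA": ["GAA", "GAG"], "TTC": ["TTC", "TTT"]}
ref_freq = {"GAA": 0.7, "GAG": 0.3, "TTC": 0.6, "TTT": 0.4}
host_freqs = [
    {"GAA": 0.65, "GAG": 0.35, "TTC": 0.5, "TTT": 0.5},
    {"GAA": 0.6, "GAG": 0.4, "TTC": 0.55, "TTT": 0.45},
]

# GAA followed by TTC spells the EcoRI recognition site GAATTC.
result = harmonize(["GAA", "TTC"], synonyms_of, ref_freq, host_freqs, sites=["GAATTC"])
print(result)
```

<p>In this toy input the best-ranked codons form an EcoRI site, so the first codon is replaced by its second-ranked synonym and the screening is repeated until no site remains.</p>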
  
        <h4>Inputs and Outputs </h4>
 
  
        <p>Our codon harmonization tool requires three inputs and will generate two outputs. <br>
+
                                <h4>Inputs and Outputs </h4>
        As input the harmonization script requires the following three inputs:</p>
+
        <ol>
+
        <li> The filtered database containing only the codon usage of the organisms of interest and the reference organism's codon usage (data_formatted = output file name of the filtered database).</li>
+
        <li> Database containing recognition sites for type II restriction enzymes (restriction_enzymes_database).</li>
+
        <li> Nucleotide sequence of the gene of interest that will be harmonized. </li>
+
        </ol>
+
        <br>
+
        <p> As output the codon harmonization script will generate the following two outputs:</p>
+
        <ol>
+
        <li> A codon harmonized nucleotide sequence usable in all organisms of interest.</li>
+
        <li> The deviation of the endogenous GC content of each organism of interest from the mean GC content across these organisms. (This output is added to inform the user whether the GC contents of the organisms of interest differ too much from each other. If they do, a notification pops up on the screen.)</li>
+
        </ol>
+
        <br>
+
        <li>
+
  
 +
                                <p>Our codon harmonization tool requires three inputs and will generate two outputs. <br>
 +
                                    As input the harmonization script requires the following three inputs:</p>
 +
                                <ol>
 +
                                    <li> The filtered database containing only the codon usage of the organisms of interest and the reference organisms codon usage (data_formatted = output file name of the filtered database).</li>
 +
                                    <li> Database containing recognition sites for type II restriction enzymes (restriction_enzymes_database).</li>
 +
                                    <li> Nucleotide sequence of the gene of interest that will be harmonized. </li>
 +
                                </ol>
 +
                                <br>
 +
                                <p> As output the codon harmonization script will generate the following two outputs:</p>
 +
                                <ol>
 +
                                    <li> A codon harmonized nucleotide sequence usable in all organisms of interest.</li>
 +
                                    <li> The deviation of the endogenous GC content of each organism of interest from the mean GC content across these organisms. (This output is added to inform the user whether the GC contents of the organisms of interest differ too much from each other. If they do, a notification pops up on the screen.)</li>
 +
                                </ol>
 +
                                <br>
 +
                                <li>
  
        <b> Obtained output </b>
 
        <br>
 
  
        To validate our model, we used the microorganisms listed below:
+
                                    <b> Obtained output </b>
        <table id="tabletu">  
+
                                    <br>
  
          <tr>
+
                                    To validate our model, we used the microorganisms listed below:
          <th> Name organism </th>
+
                                    <table id="tabletu">  
          <th> taxid</th>
+
  
          </tr>
+
                                        <tr>
 +
                                            <th> Name organism </th>
 +
                                            <th> taxid</th>
  
 +
                                        </tr>
  
          <tr>
 
          <td> <I> Escherichia coli BL21 (DE3)</i> </td>
 
          <td> 469008 </td>
 
  
          </tr>
+
                                        <tr>
 +
                                            <td> <I> Escherichia coli BL21 (DE3)</i> </td>
 +
                                            <td> 469008 </td>
  
          <tr>
+
                                        </tr>
          <td> <I> Vibrio natriegens</i> (NBRC 15636 = ATCC 14048 = DSM 759) </td>
+
          <td> 1219067 </td>
+
  
          </tr>
+
                                        <tr>
          <tr>
+
                                            <td> <I> Vibrio natriegens</i> (NBRC 15636 = ATCC 14048 = DSM 759) </td>
          <td> <I> Bacillus subtilis </i> subsp. subtilis str. 168 </td>
+
                                            <td> 1219067 </td>
          <td> 224308 </td>
+
  
          </tr>
+
                                        </tr>
 +
                                        <tr>
 +
                                            <td> <I> Bacillus subtilis </i> subsp. subtilis str. 168 </td>
 +
                                            <td> 224308 </td>
  
        </table>
+
                                        </tr>
  
 +
                                    </table>
  
  
        <p>As input sequence we use the coding sequence for eGFP ( <a target="_blank" href="https://static.igem.org/mediawiki/2019/1/15/T--TUDelft--inputsequence.txt">click here to get the sequence</a>). <br>
 
  
          The generated output sequence is here (<a target="_blank" href="https://static.igem.org/mediawiki/2019/3/32/T--TUDelft--outputsequence.txt">click here to get the sequence</a>).  
+
                                    <p>As input sequence we use the coding sequence for eGFP ( <a target="_blank" href="https://static.igem.org/mediawiki/2019/1/15/T--TUDelft--inputsequence.txt">click here to get the sequence</a>). <br>
        <p>
+
  
        </p>
+
                                        The generated output sequence is here (<a target="_blank" href="https://static.igem.org/mediawiki/2019/3/32/T--TUDelft--outputsequence.txt">click here to get the sequence</a>).
 +
                                    <p>
  
        </li>
+
                                    </p>
  
 +
                                </li>
  
      </ul>
 
      </li>
 
    </ul>
 
    <ul class="accordion">
 
      <li>
 
      <a class="toggle" href="javascript:void(0);" ><b>Experimental Validation</b><span style="float:right;"><b>&#xfe40;</b></span></a>
 
      <div class="inner accordion "> 
 
  
        <p> text here </p>
+
                            </ul>
      </div>
+
                        </li>
      </li>
+
                    </ul>  
    </ul>
+
                    <ul class="accordion">
    </div>
+
                        <li>
 +
                            <a class="toggle" href="javascript:void(0);" ><b>Experimental Validation</b><span style="float:right;"><b>&#xfe40;</b></span></a>
 +
                            <div class="inner accordion ">
  
 +
                                <p> text here </p>
 +
                            </div>
 +
                        </li>
 +
                    </ul>
 +
                </div>
  
    <div id="Insulations">  
+
            <h3>References</h3>
    <h2>Importance of insulation </h2>
+
            <div id="reference1" class="reftu">
    <p>
+
                <ul style="list-style:none;">
      In our model solutions so far, we assumed the promoter of the GOI to be completely insulated from the expression of the TALE protein. In reality, however, when two transcription units are placed in series, leaky expression of the second gene can occur. How much leaks through depends on the efficiency of the terminator of the first gene <cite><a href="https://www.nature.com/articles/nmeth.2515">(Ying-Ja et al., 2018)</a></cite>. The iFFL system originally developed by <cite><a href="https://www.nature.com/articles/nbt.4111">Segall-Shapiro et al. (2018)</a></cite></figcaption> uses the ECK120029600 terminator for the TALE protein. This terminator has a reported efficiency of 1/612, meaning that for every 612 TALE proteins produced, 1 protein of the GOI is made <cite><a href="https://www.nature.com/articles/nmeth.2515">(Ying-Ja et al., 2018)</a></cite>. We incorporate this efficiency into our model and solve again for steady-state GOI expression levels to see the effect of terminator efficiency on plasmid copy number independence.
+
                    <li>
<br>
+
                        <a id="RBS calculator" href="http://www.nature.com/articles/nbt.1568" target="_blank">
      <br>
+
                            Salis, H. M., et al. (2009). "Automated design of synthetic ribosome binding sites to control protein expression." <i>Nature Biotechnology </i> 27(10): 946-950.
    <img src="https://static.igem.org/mediawiki/2019/0/0e/T--TUDelft--leakyterminator.svg" style="width:70%;border:1px solid #00a6d6;" class="centermodel"
+
                        </a>
      alt="TALE system">
+
                    </li>
    <figcaption class="centermodel"><b>Figure 11</b>: Comparison of a perfect terminator and a leaky terminator on the expression level at different plasmid copy number. </figcaption>
+
                    <li>
    <br>
+
                        <a id="Segall-Shapiro" href="http://doi.org/10.1038/nbt.4111" target="_blank">
    <p>The model shows that leaky expression negatively impacts the system's ability to adapt to plasmid copy number. We therefore designed our system so that the transcriptional unit of TALE has a different orientation than the transcriptional unit of the GOI.</p>
+
                            Segall-Shapiro, T. H., <i>et al.</i> (2018). "Engineered promoters enable constant gene expression at any plasmid copy number in bacteria." <i>Nature Biotechnology</i> 36: 352.
    <br>
+
                        </a>
    <br>
+
                    </li>
    <img src="https://static.igem.org/mediawiki/2019/b/bf/T--TUDelft--TALEsystemimprove.png" style="width:70%;border:1px solid #00a6d6;" class="centermodel"
+
      alt="TALE system">
+
    <figcaption class="centermodel"><b>Figure 12</b>: Design of genetic circuit, to circumvent the issue of leaky expression </figcaption>
+
  
    <br>
+
                    <li>
  </div>
+
                        <a id="Doyle" href="https://lib.dr.iastate.edu/cgi/viewcontent.cgi?article=4139&context=etd" target="_blank">
 +
                            Doyle, E. L. (2013). Computational and experimental analysis of TALeffector-DNA binding. Plant Pathology and Microbiology, Iowa State University. Dissertation.
 +
                        </a>
 +
                    </li>
 +
                    <li>
 +
                        <a id="Yang" href="http://www.ncbi.nlm.nih.gov/pubmed/29061047" target="_blank">
 +
                            Yang, S., et al. (2018). "Construction and Characterization of Broad-Spectrum Promoters for Synthetic Biology." <u> ACS Synthetic Biology </u> 7(1): 287-291.
 +
                        </a>
 +
                    </li>
 +
                    <li>
 +
                        <a id="Jain" href="https://academic.oup.com/femsle/article/348/2/87/731695" target="_blank">
 +
                            Jain, A. and P. Srivastava (2013). "Broad host range plasmids." <i> FEMS Microbiology Letters </i>348(2): 87-96.
 +
                        </a>
  
  <h3>References</h3>
+
                    </li>
  <div id="reference1" class="reftu">
+
                    <li>
    <ul style="list-style:none;">
+
                        <a id="Ryo" href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0196905" target="_blank">
    <li>
+
                            Ryo Komura, W. A., Keisuke Motone, Atsushi Satomura, Mitsuyoshi Ueda (2018). "High-throughput evaluation of T7 promoter variants using biased randomization and DNA barcoding." <i>PLOS ONE</i>.
      <a id="RBS calculator" href="http://www.nature.com/articles/nbt.1568" target="_blank">
+
                        </a>
      Salis, H. M., et al. (2009). "Automated design of synthetic ribosome binding sites to control protein expression." <i>Nature Biotechnology </i> 27(10): 946-950.
+
      </a>
+
    </li>
+
    <li>
+
      <a id="Segall-Shapiro" href="http://doi.org/10.1038/nbt.4111" target="_blank">
+
      Segall-Shapiro, T. H., <i>et al.</i> (2018). "Engineered promoters enable constant gene expression at any plasmid copy number in bacteria." <i>Nature Biotechnology</i> 36: 352.
+
      </a>
+
    </li>
+
  
    <li>
+
                    </li>
      <a id="Doyle" href="https://lib.dr.iastate.edu/cgi/viewcontent.cgi?article=4139&context=etd" target="_blank">
+
                    <li>
      Doyle, E. L. (2013). Computational and experimental analysis of TALeffector-DNA binding. Plant Pathology and Microbiology, Iowa State University. Dissertation.
+
                        <a id="Angov2008" href="https://doi.org/10.1371/journal.pone.0002189" target="_blank">
      </a>
+
                            Angov, E., Hillier, C. J., Kincaid, R. L., & Lyon, J. A. (2008). Heterologous protein expression is enhanced by harmonizing the codon usage frequencies of the target gene with those of the expression host. <i>PLoS ONE</i>, 3(5): e2189.
    </li>
+
                        </a>
    <li>
+
                    </li>
      <a id="Yang" href="http://www.ncbi.nlm.nih.gov/pubmed/29061047" target="_blank">
+
                    <li>
      Yang, S., et al. (2018). "Construction and Characterization of Broad-Spectrum Promoters for Synthetic Biology." <u> ACS Synthetic Biology </u> 7(1): 287-291.
+
                        <a id="Nascimento2018" href="https://doi.org/10.7554/eLife.32467" target="_blank">
      </a>
+
                            Nascimento, J. de F., Kelly, S., Sunter, J., & Carrington, M. (2018). Codon choice directs constitutive mRNA levels in trypanosomes. <I>ELife</I>.
    </li>
+
                        </a>
    <li>
+
                    </li>
      <a id="Jain" href="https://academic.oup.com/femsle/article/348/2/87/731695" target="_blank">
+
                    <li>
      Jain, A. and P. Srivastava (2013). "Broad host range plasmids." <i> FEMS Microbiology Letters </i>348(2): 87-96.
+
                        <a id="Mignon2018" href="http://doi.org/10.1002/1873-3468.13046" target="_blank">
      </a>
+
                            Mignon, C., Mariano, N., Stadthagen, G., Lugari, A., Lagoutte, P., Donnat, S., Werle, B. (2018). Codon harmonization – going beyond the speed limit for protein expression. <I>FEBS Letters</I>.
 +
                        </a>
 +
                    </li>
 +
                    <li>
 +
                        <a id="Athey2017" href="https://doi.org/10.1371/journal.pone.0002189" target="_blank">
 +
                            Athey, J., Alexaki, A., Osipova, E., Rostovtsev, A., Santana-Quintero, L. V., Katneni, U., … Kimchi-Sarfaty, C. (2017). A new and updated resource for codon usage tables. <I> BMC bioinformatics</I>, 18(1), 391.
 +
                        </a>
 +
                    </li>
 +
                    <li>
 +
                        <a id="Puigbo2008" href="http://doi.org/10.1371/journal.pone.0002189" target="_blank">
 +
                            Puigbò, P., Bravo, I. G., & Garcia-Vallve, S. (2008). CAIcal: a combined set of tools to assess codon usage adaptation. <I>Biology direct</I>, 3, 38.
 +
                        </a>
 +
                    </li>
 +
                    <li>
 +
                        <a id="Federhen2011" href="http://doi.org/10.1371/journal.pone.0002189" target="_blank">
 +
                            Federhen, S. (2011, April 7). Entrez Taxonomy Quick Start. https://www.ncbi.nlm.nih.gov/books/NBK53758/.  
 +
                        </a>
 +
                    </li>
  
    </li>
+
                </ul>
    <li>
+
            </div>
      <a id="Ryo" href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0196905" target="_blank">
+
      Ryo Komura, W. A., Keisuke Motone, Atsushi Satomura, Mitsuyoshi Ueda (2018). "High-throughput evaluation of T7 promoter variants using biased randomization and DNA barcoding." <i>PLOS ONE</i>.
+
      </a>
+
  
    </li>
+
        </div>
    <li>
+
        <script> $('.toggle').click(function(e) {
      <a id="Angov2008" href="https://doi.org/10.1371/journal.pone.0002189" target="_blank">
+
                e.preventDefault();
      Angov, E., Hillier, C. J., Kincaid, R. L., & Lyon, J. A. (2008). Heterologous protein expression is enhanced by harmonizing the codon usage frequencies of the target gene with those of the expression host. <i>PLoS ONE</i>, 3(5): e2189.
+
      </a>
+
    </li>
+
    <li>
+
      <a id="Nascimento2018" href="https://doi.org/10.7554/eLife.32467" target="_blank">
+
      Nascimento, J. de F., Kelly, S., Sunter, J., & Carrington, M. (2018). Codon choice directs constitutive mRNA levels in trypanosomes. <I>ELife</I>.
+
      </a>
+
    </li>
+
    <li>
+
      <a id="Mignon2018" href="http://doi.org/10.1002/1873-3468.13046" target="_blank">
+
      Mignon, C., Mariano, N., Stadthagen, G., Lugari, A., Lagoutte, P., Donnat, S., Werle, B. (2018). Codon harmonization – going beyond the speed limit for protein expression. <I>FEBS Letters</I>.
+
      </a>
+
    </li>
+
    <li>
+
      <a id="Athey2017" href="https://doi.org/10.1371/journal.pone.0002189" target="_blank">
+
      Athey, J., Alexaki, A., Osipova, E., Rostovtsev, A., Santana-Quintero, L. V., Katneni, U., … Kimchi-Sarfaty, C. (2017). A new and updated resource for codon usage tables. <I> BMC bioinformatics</I>, 18(1), 391.
+
      </a>
+
    </li>
+
    <li>
+
      <a id="Puigbo2008" href="http://doi.org/10.1371/journal.pone.0002189" target="_blank">
+
      Puigbò, P., Bravo, I. G., & Garcia-Vallve, S. (2008). CAIcal: a combined set of tools to assess codon usage adaptation. <I>Biology direct</I>, 3, 38.
+
      </a>
+
    </li>
+
    <li>
+
      <a id="Federhen2011" href="http://doi.org/10.1371/journal.pone.0002189" target="_blank">
+
      Federhen, S. (2011, April 7). Entrez Taxonomy Quick Start. https://www.ncbi.nlm.nih.gov/books/NBK53758/.
+
      </a>
+
    </li>
+
  
    </ul>
+
                var $this = $(this);
  </div>
+
  
  </div>
+
                if ($this.next().hasClass('show')) {
  <script> $('.toggle').click(function(e) {
+
                    $this.next().removeClass('show');
    e.preventDefault();
+
                    $this.next().slideUp(350);
 +
                } else {
 +
                    $this.parent().parent().find('li .inner').removeClass('show');
 +
                    $this.parent().parent().find('li .inner').slideUp(350);
 +
                    $this.next().toggleClass('show');
 +
                    $this.next().slideToggle(350);
 +
                }
 +
            });
  
    var $this = $(this);
+
            $(document).on("click", ".toggle-text-buttom", function() {
  
    if ($this.next().hasClass('show')) {
 
    $this.next().removeClass('show');
 
    $this.next().slideUp(350);
 
    } else {
 
    $this.parent().parent().find('li .inner').removeClass('show');
 
    $this.parent().parent().find('li .inner').slideUp(350);
 
    $this.next().toggleClass('show');
 
    $this.next().slideToggle(350);
 
    }
 
  });
 
  
  $(document).on("click", ".toggle-text-buttom", function() {
+
                if ($(this).text() == "Read More") {
  
 +
                    $(this).text("Read Less");
  
    if ($(this).text() == "Read More") {
+
                    // Use a jquery selector using the `.attr()` of the link
 +
                    $("#toggle-textm-" + $(this).attr("toggle-textm")).slideDown();
  
    $(this).text("Read Less");
+
                } else {
  
    // Use a jquery selector using the `.attr()` of the link
+
                    $(this).text("Read More");
    $("#toggle-textm-" + $(this).attr("toggle-textm")).slideDown();
+
  
    } else {
+
                    // Use a jquery selector using the `.attr()` of the link
 +
                    $("#toggle-textm-" + $(this).attr("toggle-textm")).slideUp();
  
    $(this).text("Read More");
+
                }
  
    // Use a jquery selector using the `.attr()` of the link
+
            });
    $("#toggle-textm-" + $(this).attr("toggle-textm")).slideUp();
+
  
    }
+
        </script>
 
+
    </body>
  });
+
 
+
  </script>
+
</body>
+
  
 
</html>
 
</html>
 
{{:Team:TUDelft/Footer}}
 
{{:Team:TUDelft/Footer}}

Revision as of 22:11, 19 October 2019

Sci-Phi 29

Overview

With our modeling, we aimed to apply a control systems approach to achieve stability of gene expression across bacterial species. The behavior of genetic circuits depends on many variables, most of which change when a circuit is transferred to a different organism. To make expression host-independent, we included an incoherent feed-forward loop (iFFL) in our design. An iFFL can be used to make the output of a system independent of its input. The input of a genetic circuit can be any of several variables, such as plasmid copy number and transcriptional and translational rates. We therefore wanted to apply the iFFL to make our genetic circuit independent of plasmid copy number and of transcriptional and translational rates.

We made a mathematical model of a genetic implementation of the iFFL and derived a steady-state solution analytically. Our analytical steady-state solution of this loop showed that expression was completely independent of plasmid copy number and transcriptional-translational rates. We verified this analytical solution by implementing a full kinetic model.

The key variables in the design of genetic circuits are plasmid copy number and transcriptional-translational rates. These variables determine the steady-state levels of gene expression. However, when transferring genetic circuits between organisms, these variables change in unpredictable ways.

promoter SOBL

Promoters have different strengths in different organisms. Some promoters only work in a very narrow range of bacterial species (Yang et al., 2018). To circumvent host-related changes, we designed our system to be orthogonal to the host. We implement orthogonality by using T7 RNA polymerase. However, orthogonal transcription might not behave the same way in different biological contexts. Through our modeling, we show that gene expression levels remain the same in varying biological contexts when using our genetic circuit implementation of an iFFL.


RBS SOBL

Ribosome binding sites contain the Shine-Dalgarno sequence, where the 16S rRNA of the ribosome binds. However, this sequence varies across species, and ribosome binding sites are often extremely inefficient when used in phylogenetically distant species (Salis et al., 2009). Assuming translation elongation is similar across species, our model shows that expression levels can be maintained across organisms when all genes in our genetic circuit contain the same ribosome binding site. Nevertheless, translation elongation is influenced by codon usage, which differs per organism. We therefore developed a software tool that determines a coding sequence whose codon usage is similar across different species. Similar codon usage minimizes the chance of different translation elongation rates between organisms.
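The core loop of such a tool can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's actual script: the usage table below is hypothetical (made-up frequencies per thousand codons for three unnamed organisms), "variance" is taken to mean the variance of a codon's usage frequency across the organisms, the single recognition site is chosen only for illustration, and the sketch starts from a protein sequence rather than a nucleotide sequence. The names `USAGE`, `SITES`, `ranked_codons`, `find_site` and `harmonize` are all invented for this example.

```python
from statistics import pvariance

# Hypothetical codon usage (frequency per thousand) in three organisms.
# These numbers are made up for illustration and are NOT real usage data.
USAGE = {
    "M": {"ATG": [25.0, 26.0, 27.0]},
    "E": {"GAA": [39.0, 38.0, 40.0], "GAG": [18.0, 25.0, 12.0]},
    "D": {"GAC": [19.0, 20.0, 21.0], "GAT": [32.0, 25.0, 40.0]},
    "K": {"AAA": [33.0, 35.0, 34.0], "AAG": [10.0, 20.0, 15.0]},
}
SITES = ["GAAGAC"]  # illustrative recognition site to avoid


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))


def ranked_codons(aa):
    """Synonymous codons of one amino acid, lowest cross-organism variance first."""
    table = USAGE[aa]
    return sorted(table, key=lambda codon: pvariance(table[codon]))


def find_site(seq):
    """Return (position, length) of the first recognition site on either strand."""
    for site in SITES:
        for pattern in (site, revcomp(site)):
            pos = seq.find(pattern)
            if pos != -1:
                return pos, len(pattern)
    return None


def harmonize(protein, max_rounds=20):
    """Pick lowest-variance codons, then swap codons until no site remains."""
    ranks = [0] * len(protein)  # 0 = lowest variance, 1 = second lowest, ...
    for _ in range(max_rounds):
        seq = "".join(ranked_codons(aa)[r] for aa, r in zip(protein, ranks))
        hit = find_site(seq)
        if hit is None:
            return seq
        pos, length = hit
        # Move the first overlapping codon that has an alternative one rank down.
        for i in range(pos // 3, (pos + length - 1) // 3 + 1):
            if ranks[i] + 1 < len(USAGE[protein[i]]):
                ranks[i] += 1
                break
        else:
            raise ValueError("no synonymous codon left to remove the site")
    raise ValueError("restriction site not removed within max_rounds")


print(harmonize("MEDK"))
```

With this toy table, the first pass produces a sequence containing the illustrative site, so the codon for E is swapped to its second-lowest-variance synonym in the next round.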


The core of our design - The incoherent feed-forward loop

We implemented an incoherent feed-forward loop (iFFL) in a genetic circuit. In an iFFL, the input signal regulates both the activator and the repressor of the output of the system in the same way (Figure 1). The iFFL results in perfect adaptation to the input when the binding of the repressor is fully non-cooperative (binding of one repressor at a time) (Segall-Shapiro et al., 2018). In our case, the input is the plasmid copy number of the DNA template, and the output is the steady-state expression of a GOI. In our system, we use a transcription activator-like effector (TALE) protein as a repressor. TALE proteins recognize DNA by a simple DNA-binding mechanism (Doyle 2013) and have been shown to bind fully non-cooperatively (Figure 2) (Segall-Shapiro et al., 2018). The promoter controlling the GOI has been engineered to contain a binding site for a TALE protein. When the TALE protein is bound to the promoter, the expression of the GOI is repressed, as demonstrated by 2018 iGEM Thessaloniki.

iFFL

Figure 1: Scheme of the incoherent feed-forward loop. Red indicates how the output normally increases linearly with the input. Green depicts the addition of a repressor, which makes the output independent of the input.

iFFL

Figure 2: Animation of TALE protein binding to the promoter of a GOI. The binding of the TALE protein represses the expression of a GOI.
We have modeled the genetic implementation of this system. An analytical steady-state solution showed that the steady-state expression level of a GOI is completely independent of plasmid copy number and can be independent of transcriptional and translational rates when the right design choices are made. After further verification through the implementation of a full ordinary differential equation (ODE) model, we designed experiments to test the independence from these variables. Using our model, we were able to identify some key design choices for our project. These consist of:
  • The need for good insulation of the genes.
  • The promoter strengths of the TALE protein and the GOI need to maintain the same ratio.
  • The ribosome binding site strengths of the TALE protein and the GOI need to maintain the same ratio.

The kinetics

In this section, we explain the kinetics of our iFFL and derive a system of ordinary differential equations to describe the interactions within the genetic circuit. We derive a steady-state solution from the system of equations and describe the properties of the system. In the next sections, we use the steady-state solution and ODE model to describe how our circuit can be used to transfer genetic circuits between prokaryotes.

Figure 3 depicts all interactions considered in our system.

TALE system

Figure 3: Scheme of genetic circuit interactions developed by Segall-Shapiro et al. (2018)

  • Click here to find out more about the details of the kinetic model

      From these interactions we can derive the following system of ordinary differential equations:


        ${dm_T \over dt} = {c \cdot a_T - y_m \cdot m_T}$

        $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T - n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}$

        $\frac{dP_G}{dt} = k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T}$

        $\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T} - n \cdot y_T \cdot P_{G.T} $

        $\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T} - y_m \cdot m_G $

        $\frac{dG}{dt} = b_G \cdot m_G - y_G \cdot G $


      Parameter | Value | Unit | Explanation | Source
      $a_T$ | 1.03 | nM/min | Transcription rate of TALE | 2018 iGEM Thessaloniki
      $y_m$ | log(2)/5 | 1/min | Degradation rate of mRNA | Kushwaha and Salis 2015
      $b_T$ | 0.44 | 1/min | Translation rate of TALE | 2018 iGEM Thessaloniki
      $y_T$ | 0.0347 | 1/min | Degradation rate of TALE | Assuming degradation is only dependent on growth rate (20 min)
      n | 1 | - | Cooperativity of binding | Segall-Shapiro et al.
      $k_{on}$ | 9.85 | 1/(nM*min) | Binding of TALE to promoter | 2018 iGEM Thessaloniki
      $k_{off}$ | 2.19 | 1/min | Unbinding of TALE from promoter | 2018 iGEM Thessaloniki
      $a_{Gmax}$ | 3.78 | 1/(nM*min) | Maximum transcription of GFP | 2018 iGEM Thessaloniki
      $a_{Gmin}$ | 0 | 1/(nM*min) | Minimum transcription of GFP | Segall-Shapiro et al.
      $b_G$ | 3.65 | 1/min | Translation rate of GFP | 2018 iGEM Thessaloniki
      $y_G$ | 0.0347 | 1/min | Degradation rate of GFP | Assuming degradation is only dependent on growth rate (20 min)
      $c$ | variable | Unitless | Plasmid copy number |

      Variable | Explanation
      $m_T$ | Concentration of TALE mRNA
      T | Concentration of TALE
      $P_G$ | GFP promoter (free)
      $P_{G.T}$ | GFP promoter with TALE bound
      $m_G$ | Concentration of GFP mRNA
      G | Concentration of GFP
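The system of equations and the parameter table translate directly into a right-hand-side function. The sketch below is written in Python purely for illustration; it is an assumption of this example, not a reproduction of the team's own implementation. The function name `iffl_rhs`, the `PARAMS` dictionary and the state ordering are invented here, and the starting state (all promoters free, everything else zero) is also an assumption.

```python
import math

# Parameter values copied from the table above; log(2) is the natural logarithm.
PARAMS = {
    "a_T": 1.03, "b_T": 0.44, "a_Gmax": 3.78, "a_Gmin": 0.0, "b_G": 3.65,
    "y_m": math.log(2) / 5, "y_T": 0.0347, "y_G": 0.0347,
    "k_on": 9.85, "k_off": 2.19, "n": 1,
}


def iffl_rhs(state, c, p=PARAMS):
    """Time derivatives of (m_T, T, P_G, P_GT, m_G, G) for copy number c."""
    m_T, T, P_G, P_GT, m_G, G = state
    n = p["n"]
    binding = n * p["k_on"] * T**n * P_G  # TALE binding to the free promoter
    dm_T = c * p["a_T"] - p["y_m"] * m_T
    dT = p["b_T"] * m_T - p["y_T"] * T - binding + n * p["k_off"] * P_GT
    dP_G = p["k_off"] * P_GT - binding + n * p["y_T"] * P_GT
    dP_GT = binding - p["k_off"] * P_GT - n * p["y_T"] * P_GT
    dm_G = p["a_Gmax"] * P_G + p["a_Gmin"] * P_GT - p["y_m"] * m_G
    dG = p["b_G"] * m_G - p["y_G"] * G
    return (dm_T, dT, dP_G, dP_GT, dm_G, dG)


copies = 10
# Assumed starting point: every promoter copy is free, nothing is expressed yet.
start = (0.0, 0.0, float(copies), 0.0, 0.0, 0.0)
print(iffl_rhs(start, copies))
```

Because the derivatives of $P_G$ and $P_{G.T}$ are equal and opposite, the total promoter amount $P_G + P_{G.T}$ stays constant, which is the conservation relation $c = P_G + P_{G.T}$ used in the derivation. Note that with these rate constants the system is stiff, so a stiff ODE solver is the safer choice when integrating it over time.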

      Simplification of the system

      This system can be simplified by making a few assumptions (Segall-Shapiro et al., 2018):

      1. The amount of TALE protein is much larger than the number of binding sites for TALE.
      2. TALE binding and unbinding occur much more rapidly than protein production and degradation.
      3. When the promoter is repressed, the expression level is negligible.

      Using these assumptions, we can derive analytically a steady-state solution for this system. This derivation results in the following steady-state solution:

      $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_Tb_Ty_Gy_m}\right)$$
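To make the role of the cooperativity $n$ concrete, the steady-state expression above can be evaluated for a range of copy numbers. The Python sketch below is illustrative only: the function name `steady_state_goi` is invented here, and it assumes $a_G = a_{Gmax} \cdot k_{off} / k_{on}$ with the parameter values from the table above. The case $n = 2$ is a hypothetical comparison, not a property of the TALE repressor.

```python
import math


def steady_state_goi(c, n, a_G, b_G, a_T, b_T, y_T, y_G, y_m):
    """Evaluate G = (c / c^n) * (a_G b_G y_T^n y_m^n) / (a_T b_T y_G y_m)."""
    return (c / c**n) * (a_G * b_G * y_T**n * y_m**n) / (a_T * b_T * y_G * y_m)


# Assumption of this sketch: a_G = a_Gmax * K_D with K_D = k_off / k_on.
PARAMS = dict(
    a_G=3.78 * 2.19 / 9.85, b_G=3.65, a_T=1.03, b_T=0.44,
    y_T=0.0347, y_G=0.0347, y_m=math.log(2) / 5,
)

for copies in (1, 10, 100, 600):
    non_cooperative = steady_state_goi(copies, 1, **PARAMS)
    cooperative = steady_state_goi(copies, 2, **PARAMS)  # hypothetical n = 2
    print(copies, non_cooperative, cooperative)
```

For $n = 1$ the factor $c / c^n$ equals 1, so the printed value does not change with copy number; for the hypothetical $n = 2$ the same formula falls off as $1/c$.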

    • Click here to see the full derivation

      By making use of assumption 2, we can assume a quasi-steady state for $\frac{dP_G}{dt}$ and $\frac{dP_{G.T}}{dt}$. Quasi-steady state means we assume the equation reaches steady-state much faster than the other variables in the model and thus the rate of change is zero. This results in the following system of equations.


      1. ${dm_T \over dt} = {c \cdot a_T - y_m \cdot m_T}$

      2. $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T - n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T} + n \cdot (n-1)\cdot y_T \cdot P_{G.T}$

      3. $\frac{dP_G}{dt} = k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$

      4. $\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T} - n \cdot y_T \cdot P_{G.T} = 0$

      5. $\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T} - y_m \cdot m_G $

      6. $\frac{dG}{dt} = b_G \cdot m_G - y_G \cdot G $


      We can now use equation 3 to simplify the system:


      • $\color{red}{k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G} + n \cdot y_T \cdot P_{G.T} = 0$
      • The red part can be taken to one side of the equation:
      • $\color{red}{n \cdot k_{on} \cdot T^n \cdot P_G - k_{off} \cdot P_{G.T}} = n \cdot y_T \cdot P_{G.T}$
      • Then we substitute that expression in equation 2:
      • $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T \color{red}{- n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}}+ n \cdot (n-1)\cdot y_T \cdot P_{G.T}$
      • Which becomes:
        $\frac{dT}{dt} = b_T \cdot m_T - y_T \cdot T -(n \cdot y_T \cdot P_{G.T}) + n \cdot (n-1)\cdot y_T \cdot P_{G.T} = b_T \cdot m_T - y_T \cdot T$

      Furthermore, we can use equation 3 to get an expression for $P_G$


      • $k_{off} \cdot P_{G.T} - n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$
      • $P_G = \frac{k_{off}}{n \cdot k_{on} \cdot T^n} \cdot P_{G.T} + \frac{y_T}{k_{on} \cdot T^n} \cdot P_{G.T}$

      Using assumption 1, we can say T >> c. It follows that the amount of free repressor barely changes when some of the repressors bind to $P_G$, meaning $T \approx T + nP_{G.T}$


      • $P_G \cdot T^n = \frac{k_{off}}{n \cdot k_{on}} \cdot P_{G.T}$
      • Using: $c = P_G + P_{G.T}$ we get the following:
      • $P_G = \frac{c \cdot K_D}{K_D + T^n}$, where $K_D = \frac{k_{off}}{n \cdot k_{on}}$ (equal to $\frac{k_{off}}{k_{on}}$ for $n = 1$)
      • Plugging this into equation 5 and again making use of $c = P_G + P_{G.T}$, equation 5 becomes:
      • $\frac{dm_G}{dt} = c \cdot (a_{Gmin} + (a_{Gmax} - a_{Gmin})[\frac{K_D}{K_D + T^n}]) - y_m \cdot m_G$

      Assumption 3 tells us that we can assume $a_{Gmin} \approx 0$. Again using assumption 1, we can say $T^n \gg K_D$, resulting in $T^n + K_D \approx T^n$. Using these two assumptions, equation 5 can be further simplified to:


      • $\frac{dm_G}{dt} = c \cdot \frac{a_{G}}{T^n} - y_m \cdot m_G$, where $a_{G} = a_{Gmax} \cdot K_D$

      Using this reduced system of equations we can now derive the steady-state solution for the GOI.


      • $ c \cdot a_T - y_m \cdot m_T = 0$ $\rightarrow$ $ m_T = c \frac{a_T}{y_m} $
      • $ b_T \cdot m_{T} - y_T \cdot T = 0$ $\rightarrow$ $ T = \frac{b_T \cdot m_{T}}{y_T}$
      • $ c \cdot \frac{a_{G}}{T^n} - y_m \cdot m_G = 0$ $\rightarrow$ $ m_G = c \frac{a_{G}}{y_m \cdot T^n}$
      • $ b_G \cdot m_{G} - y_G \cdot G = 0$ $\rightarrow$ $ G = \frac{b_G \cdot m_{G}}{y_G} $

      Plugging everything into the last equation gives: $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_Tb_Ty_Gy_m}\right)$$
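
      As a worked check of this substitution, here is a sketch for the non-cooperative case $n = 1$. It assumes $a_G = a_{Gmax} \cdot K_D$ with $K_D = k_{off} / k_{on}$ and uses the parameter values from the table above purely for illustration:

      $$m_T = \frac{c \cdot a_T}{y_m}, \quad T = \frac{c \cdot a_T \cdot b_T}{y_m \cdot y_T}, \quad m_G = \frac{c \cdot a_G}{y_m \cdot T} = \frac{a_G \cdot y_T}{a_T \cdot b_T}, \quad G = \frac{b_G \cdot m_G}{y_G} = \frac{a_G \cdot b_G \cdot y_T}{a_T \cdot b_T \cdot y_G}$$

      $$G \approx \frac{(3.78 \cdot 2.19 / 9.85) \cdot 3.65 \cdot 0.0347}{1.03 \cdot 0.44 \cdot 0.0347} \approx 6.8$$

      The copy number $c$ cancels in the step from $T$ to $m_G$, which is where the independence of plasmid copy number comes from when $n = 1$.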

    • According to our analytical solution, the level of the protein of interest depends only on the plasmid copy number and on the ratios of the transcription and translation rates of the genes in the circuit. In the next sections, we use this steady-state solution to demonstrate how it can be used to transfer genetic circuits between organisms. Furthermore, we solve the full system of ordinary differential equations in MATLAB to gain insight into the kinetics of the system.
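As a quick numerical check of the steady-state derivation, the following Python sketch chains the four steady-state relations and compares the result with the closed-form expression obtained by substitution. The function names and all parameter values are hypothetical placeholders chosen for illustration; they are not the fitted values of our model.

```python
import math


def goi_steady_state(c, n, a_T, a_G, b_T, b_G, y_m, y_T, y_G):
    """Chain the four steady-state relations of the reduced model."""
    m_T = c * a_T / y_m             # TALE mRNA
    T = b_T * m_T / y_T             # TALE protein
    m_G = c * a_G / (y_m * T**n)    # GOI mRNA (a_G already contains K_D)
    return b_G * m_G / y_G          # GOI protein


def goi_closed_form(c, n, a_T, a_G, b_T, b_G, y_m, y_T, y_G):
    """Closed form after substitution; T^n carries the exponent n onto a_T and b_T."""
    return (c / c**n) * (a_G * b_G * y_T**n * y_m**n) / (a_T**n * b_T**n * y_G * y_m)


# Placeholder parameters (illustrative only, arbitrary units).
params = dict(a_T=2.0, a_G=5.0, b_T=3.0, b_G=4.0, y_m=0.5, y_T=0.1, y_G=0.2)

for copies in (1, 10, 100):
    stepwise = goi_steady_state(copies, 1, **params)
    closed = goi_closed_form(copies, 1, **params)
    print(copies, stepwise, math.isclose(stepwise, closed))
```

For n = 1 the printed level is the same for every copy number, and scaling a_T and a_G (or b_T and b_G) by a common factor leaves it unchanged.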


Plasmid copy number

The expression levels in a genetic circuit are strongly correlated with the copy number of the DNA template (Segall-Shapiro et al., 2018). The plasmid copy number can change when a plasmid is transferred between organisms. Therefore, expression levels need to be independent of plasmid copy number if the same genetic circuit is to be used in different organisms. The steady-state solution of our model tells us that when repressor binding is fully non-cooperative, n = 1, expression is completely independent of plasmid copy number:


$$G = \left(\frac{\color{red}c}{\color{red}c^\color{red}n}\right)_{\color{red}n\color{red}=\color{red}1} \left(\frac{a_Gb_Gy_T^ny_m^n}{a_T^nb_T^ny_Gy_m}\right)$$

This formula is however based on a few assumptions. To see how the system would behave without making these assumptions we implemented the full system of ordinary differential equations (Figure 4).

Figure 4: Steady-state GOI production for plasmid copy numbers from 1 to 600 (from genome integration to a high-copy-number plasmid).

The model without these assumptions also gives an expression level that is independent of plasmid copy number (figure 4). We can therefore transfer our circuit between organisms and expect the expression of the GOI to be independent of changes in the copy number of our orthogonal plasmid.
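The copy-number sweep can be illustrated with a small simulation. The sketch below is not our full MATLAB model: it integrates only the Hill-type mRNA equation (with $a_{Gmin} = 0$) together with the TALE and GOI protein equations, using a simple forward-Euler scheme. The function name `simulate_iffl` and every parameter value are assumptions made for illustration.

```python
def simulate_iffl(c, n=1, t_end=300.0, dt=0.01):
    """Integrate a reduced iFFL model to (near) steady state and return G."""
    # Hypothetical parameter values in arbitrary units, for illustration only.
    a_T, a_Gmax, K_D = 1.0, 1.0, 0.01
    b_T, b_G = 1.0, 1.0
    y_m, y_T, y_G = 1.0, 0.1, 0.1

    m_T = T = m_G = G = 0.0
    for _ in range(int(t_end / dt)):
        dm_T = c * a_T - y_m * m_T                          # TALE mRNA
        dT = b_T * m_T - y_T * T                            # TALE protein
        dm_G = c * a_Gmax * K_D / (K_D + T**n) - y_m * m_G  # repressed GOI mRNA
        dG = b_G * m_G - y_G * G                            # GOI protein
        m_T += dt * dm_T
        T += dt * dT
        m_G += dt * dm_G
        G += dt * dG
    return G


# Sweep from genome integration (1 copy) to a high-copy plasmid (600 copies).
levels = {copies: simulate_iffl(copies) for copies in (1, 10, 600)}
for copies, level in levels.items():
    print(copies, level)
```

With these placeholder values $T^n \gg K_D$ already holds at a single copy, so the output level changes by well under one percent across the sweep. Choosing a larger $K_D$ or a weaker TALE expression would make the low-copy end deviate.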


Wet lab

We tested the prediction of plasmid copy number independence by implementing the iFFL system, with GFP as the output. We cloned the system in backbones containing different origins of replication. As a control, we also cloned GFP into the same backbones to demonstrate different levels of expression. More info can be found here.

Transcriptional variations

Every promoter might have a different strength when used in different organisms (Yang et al., 2018). Thus, when using the same promoter in different organisms, you can get unpredictable behavior. The steady-state solution of our model tells us that the steady-state expression level of the GOI is only dependent on the ratio of transcription rates of the GOI and TALE.


$$G = \left(\frac{c}{c^n}\right) \left(\frac{\color{red}a_\color{red}G \color{black} b_Gy_T^ny_m^n}{\color{red}a_\color{red}T^\color{red}n \color{black} b_T^ny_Gy_m}\right)$$
In our full kinetic model we vary both transcription parameters. In figure 5 we plot the resulting steady-state solutions as a function of the transcription rate of TALE and GFP.
Figure 5: Steady-state GFP production while the transcription rates of both TALE and GOI are changed (aT/aG = constant). The lines indicate a constant ratio of transcription rates.

The full kinetic model shows that the expression level of GFP is the same when the ratio of the transcription rates of TALE and GFP remains constant (figure 5). In order to achieve a constant ratio of transcription rates in our genetic circuit, we use the orthogonal T7 transcription system, which is transcribed by its own RNA polymerase. We implemented T7 promoters with varying strengths compared to the wild type, developed by Ryo Komura et al. (2018). More information can be found here.

Wet lab

We have successfully demonstrated the prediction of transcription rate independence when the same ratio in the transcription rate of both genes is maintained.

We made variations of the system where we changed the promoters of both genes to a 50% strength version of that same promoter (figure 6). As a control, we also cloned GFP without repression under the control of these same promoters.

Figure 6: Steady-state GFP fluorescence measurement of promoter variation using FACS. The graph depicts T7 and medium T7 iFFL systems, expected to give the same fluorescence according to the model. As a control, GFP under control of an unrepressed T7 promoter was used.

Furthermore, we also tested independence to transcriptional variation by using different IPTG concentrations (figure 7). Again, as a control, we cloned GFP without repression under the control of these same promoters.

Figure 7: Steady-state GFP fluorescence measurement of IPTG titration using FACS. The graph depicts a T7 iFFL system induced using different levels of IPTG, which according to the model should give the same result. As a control, GFP under control of an unrepressed T7 promoter was used.


More information can be found here.

Download our code here.

Translational variations

As for transcription, the steady-state solution of our model tells us that the steady-state expression level of the GOI depends only on the ratio of the translation rates of the GOI and TALE:


$$G = \left(\frac{c}{c^n}\right) \left(\frac{a_G \color{red}b_\color{red}G \color{black} y_T^ny_m^n}{a_T^n \color{red}b_\color{red}T^\color{red}n \color{black} y_Gy_m}\right)$$

In figure 8 we plot the resulting steady-state solutions as a function of the translation rates of TALE and GOI, using the full kinetic model to see how the system behaves without the assumptions.


Figure 8: Steady-state GOI production while the translation rates of both TALE and GOI are changed. The lines indicate a constant ratio of the translation rates.

As can be seen in figure 8, the full kinetic model maintains the same level of GFP expression when the translation rates of both genes remain in a constant ratio. To keep the same ratio of translation rates across organisms, we used the same ribosome binding site (RBS) for both genes. Using the same RBS ensures that translation initiation for both genes changes in a similar manner (Salis et al., 2009). More on the design choices can be found here.

Wet lab

We tested the prediction of independence to variation in translational rates by implementing the iFFL system, where the output is GFP. We made variations of the system where we change both ribosome binding sites in the same way. As a control, we also cloned GFP without repression under control of these ribosome binding sites. More information can be found here.

Importance of insulation

In our model solutions so far we assumed the promoter of the GOI to be completely insulated from expression of the TALE protein. However, in reality, when two transcription units are placed in series, leaky expression of the second gene can occur. This is due to the limited efficiency of the terminator of the first gene (Ying-Ja et al., 2018). The iFFL system originally developed by Segall-Shapiro et al. (2018) uses the ECK120029600 terminator for the TALE protein. This terminator has a reported efficiency of 1/612, meaning that for every 612 TALE proteins produced, 1 protein of the GOI is made (Ying-Ja et al., 2018). We incorporate this efficiency into our model and solve again for the steady-state GOI expression level to see the effect of terminator efficiency on plasmid copy number independence.

Figure 11: Comparison of the expression level with a perfect terminator and with a leaky terminator at different plasmid copy numbers.

The model shows that the leaky expression negatively impacts the system's ability to adapt to gene plasmid copy number. We therefore designed our system to have the transcriptional unit of TALE in a different orientation than the transcriptional unit of the GOI.
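The effect of read-through can be sketched with the reduced steady-state equations. The example below assumes, purely for illustration, that a fraction `leak` of TALE transcription continues into the GOI and adds an unrepressed term $leak \cdot c \cdot a_T$ to GOI mRNA production; the exact way the terminator efficiency enters our full model may differ. The function name and parameter values are hypothetical.

```python
def goi_with_readthrough(c, leak, a_T=1.0, a_G=1.0, b_T=1.0, b_G=1.0,
                         y_m=1.0, y_T=0.1, y_G=0.1):
    """Steady-state GOI level for n = 1 with terminator read-through (assumed form)."""
    T = b_T * (c * a_T / y_m) / y_T          # TALE protein at steady state
    repressed = c * a_G / T                  # copy-number independent for n = 1
    read_through = leak * c * a_T            # grows linearly with copy number
    m_G = (repressed + read_through) / y_m   # GOI mRNA
    return b_G * m_G / y_G                   # GOI protein


for copies in (1, 100, 600):
    perfect = goi_with_readthrough(copies, leak=0.0)
    leaky = goi_with_readthrough(copies, leak=1 / 612)
    print(copies, perfect, leaky)
```

Under this assumption the repressed term stays constant while the read-through term scales with copy number, so the leaky circuit loses copy-number independence at high copy numbers.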



Figure 12: Design of the genetic circuit to circumvent the issue of leaky expression.

Codon Usage - Cross-species codon harmonization

When transferring genetic circuits across different organisms, translation does not depend only on translation initiation; it also depends on codon usage. When a heterologous protein is expressed in a new bacterial host, its protein level is often altered compared to that in the original microorganism. One of the reasons for a lower expression level is the difference in codon usage between the original organism and the new host cell (Angov et al., 2008). The foundation of this difference in codon usage is written in the DNA sequence itself. Protein structures are dependent on the DNA sequence, which is converted into a functional protein through two subsequent cellular processes: transcription and translation (Angov et al., 2008).

In general, bacterial cells use 20 different amino acids encoded by 61 sense codons (64 codons minus 3 stop codons), which results in a phenomenon called synonymous codon usage. Synonymous codon usage means that most of the 20 amino acids are encoded by more than one codon (Nascimento et al., 2018). Nascimento et al. (2018) have shown that cells make great use of the codon choice that this offers, since codon usage directly affects both the number of mRNA copies and the translation rate. They showed that proteins expressed at high levels have more mRNA copies and contain more frequently used codons in order to speed up the translation rate (Nascimento et al., 2018). To increase the expression level of a heterologous protein in the host cell, codon optimization tools were developed. However, it remains difficult to predict which tool will design the optimal sequence (Mignon et al., 2008). The codon adaptation tools available now can be divided into two main groups based on how the tool's algorithm functions:


  1. Codon optimization tools: The basic idea of a codon optimization tool is to achieve the highest translation rate possible. The translation rate is increased by substituting each codon with the codon that is used most frequently for the corresponding amino acid, and by keeping the ribosome binding site (RBS) freely accessible to the ribosomal subunit by avoiding hairpin formation at the translation initiation site (Puigbò et al., 2018).
    The relative codon frequencies are calculated as the weights used in the Codon Adaptation Index (CAI), as shown in the following equation. In this equation $w_i$ is the relative weight of codon $i$, $f_i$ is the frequency of that codon, and $max(f_i)$ is the frequency of the codon that is used most frequently for the corresponding amino acid.
    $$w_i = \frac{f_i}{max( f_i )}$$

  2. Codon harmonization tools: The basic idea of a codon harmonization tool is to mimic the native translation rate in the host organism by using rare codons at specific places and avoiding hairpin formation as much as possible. The codon usage of the original microorganism functions as the reference point (Athey et al., 2017). This approach allows pre-folding of the protein during translation in order to reduce the chance of the protein misfolding as much as possible (Figure 9).
    Figure 9: Schematic representation of protein translation. The green parts are encoded with high frequency codons in order to speed up the translation rate. The red parts are encoded by rare codons in order to slow down the translation rate, which limits misfolding of proteins by creating a small time window for protein pre-folding. Codon harmonization aims to create the same codon usage pattern as the native host in order to increase the amount of functional protein.
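The weight $w_i$ used by codon optimization tools can be computed in a few lines. The sketch below uses made-up frequencies for the four alanine codons; the function name `relative_weights` and the numbers are hypothetical and serve only to show the calculation.

```python
def relative_weights(codon_freqs):
    """Return w_i = f_i / max(f_i) for the synonymous codons of one amino acid."""
    f_max = max(codon_freqs.values())
    return {codon: f / f_max for codon, f in codon_freqs.items()}


# Made-up frequencies of the alanine codons in some host (illustrative only).
alanine = {"GCT": 0.16, "GCC": 0.27, "GCA": 0.21, "GCG": 0.36}

weights = relative_weights(alanine)
best_codon = max(weights, key=weights.get)  # the codon an optimization tool would pick
print(weights)
print(best_codon)
```

A codon optimization tool would always choose `best_codon`, whereas a harmonization tool deliberately keeps lower-weight codons where the native sequence uses them.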


The current limitation of both kinds of codon adaptation tools is that it is not possible to design a single DNA part that has been adapted to function across multiple species. We boosted our project by creating the first cross-species codon harmonization tool. This harmonization tool provides the user with a single DNA coding sequence that is designed to yield the same protein expression level in different bacterial host cells. The codon harmonization approach as explained above forms the core of our algorithm. We modified this algorithm by making use of the statistical method of least-squares variance. Furthermore, we made our tool BioBrick RFC compatible by removing type II restriction sites.


  • Click here to find out more about the codon harmonization

      Preselection data

      Our harmonization tool is based on a database containing the codon usage of 152903 microorganisms obtained from a paper published by Athey et al. (2017). Since working with such a big database slows down the tool, we designed a MATLAB script that pre-selects the data of the microorganisms of interest. This MATLAB script is available in the supplementary list below.
      The taxonomy identification number (taxid) is used as input for the preselection in order to reduce the chance of a mistake caused by a typing error in the name. Each organism name is converted into a specific taxid in order to make it easier to search for a specific organism or strain in the NCBI Taxonomy database (Federhen et al., 2011).
      As mentioned before, the codon harmonization approach functions as the core of our code, which means that we create the same codon usage pattern as the native host in order to increase the amount of functional protein expression. The native organism's own codon usage functions as the reference point for the tool.
      In our case we used eGFP as the heterologous protein and Escherichia coli strain BL21(DE3) as native host (taxid 469008). The actual native organism for the eGFP protein is the jellyfish Aequorea victoria (taxid 6100). However, we first designed our tool to function in bacterial hosts before expanding the tool further to eukaryotic cells.
      When generating the filtered database containing data of only the organisms of interest, we designed the code in such a way that the first input row will contain data of the reference organism and the rows below contain data of the potential new host organisms. Each new row is a new potential organism. A schematic representation of the preselection is shown in Figure 10.

      Figure 10: A schematic representation of the preselection code to obtain data of only the organisms of interest from the main database. The top row corresponds to the codon usage of the reference organism, while each following row corresponds to the codon usage of one of the organisms of interest. In the preselection code, the taxid of each organism of interest is used as a query to find the right row in the main database. The selected rows are combined to form a filtered database containing the codon usage of only the reference organism and the hosts of interest.
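      The preselection step can be sketched as a lookup by taxid. The example below is a Python illustration rather than our MATLAB script; the record layout (a list of dictionaries with `taxid`, `name` and `codon_counts` keys) and the codon counts are hypothetical, while the taxids are the ones used on this page.

```python
def preselect(database, reference_taxid, host_taxids):
    """Return the reference organism's row first, followed by one row per host."""
    by_taxid = {row["taxid"]: row for row in database}
    # A missing taxid raises KeyError, which surfaces typing errors early.
    selected = [by_taxid[reference_taxid]]
    selected += [by_taxid[taxid] for taxid in host_taxids]
    return selected


# Tiny stand-in for the main database; the codon counts are made up.
main_database = [
    {"taxid": 224308, "name": "Bacillus subtilis", "codon_counts": {"AAA": 49, "AAG": 21}},
    {"taxid": 6100, "name": "Aequorea victoria", "codon_counts": {"AAA": 30, "AAG": 25}},
    {"taxid": 469008, "name": "Escherichia coli BL21(DE3)", "codon_counts": {"AAA": 34, "AAG": 11}},
    {"taxid": 1219067, "name": "Vibrio natriegens", "codon_counts": {"AAA": 40, "AAG": 12}},
]

filtered = preselect(main_database, reference_taxid=469008,
                     host_taxids=[1219067, 224308])
for row in filtered:
    print(row["taxid"], row["name"])
```

      Keeping the reference organism in the first row means that later steps can address it by position without storing its taxid separately.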

      Harmonization code

      Before the harmonized coding sequence for the organisms of interest is generated, the deviation of the endogenous GC content of each organism of interest from the mean GC content is calculated. This calculation has been added to inform the user whether the GC contents of the organisms of interest are too different from each other. If the GC content of one of the organisms deviates more than 5% from the mean, a notification informs the user that the GC contents are too different from each other, which might result in a harmonized coding sequence that does not function in the same way in every organism.
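      A minimal sketch of this GC check is shown below. It assumes that the 5% threshold refers to percentage points of GC content (an interpretation, not something stated above), and the organism labels and GC values are placeholders rather than measured data.

```python
def gc_deviations(gc_by_organism, threshold=5.0):
    """Return the mean GC content, each deviation from it, and the flagged organisms."""
    mean_gc = sum(gc_by_organism.values()) / len(gc_by_organism)
    deviations = {name: gc - mean_gc for name, gc in gc_by_organism.items()}
    flagged = [name for name, dev in deviations.items() if abs(dev) > threshold]
    return mean_gc, deviations, flagged


# Placeholder GC contents in percent (illustrative only).
gc_content = {"host_A": 52.0, "host_B": 45.0, "host_C": 41.0}

mean_gc, deviations, flagged = gc_deviations(gc_content)
print(mean_gc, deviations)
if flagged:
    print("Warning: GC content deviates more than 5% from the mean for:", flagged)
```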

      First, the codon frequency is calculated in the same way as described by Athey et al. (2017). We used this formula instead of the formula for CAI (equation ....), since the CAI calculation (formula 1) requires a reference gene. However, our system functions across species, so an adapted version of the CAI is used, as shown in the equation below, obtained from Athey et al. (2017). This adapted version calculates the frequency of the used codon for that specific amino acid instead of calculating the relative codon use for that specific amino acid. .........

      $$freq_{codon,i} = \frac{codon}{\sum codon,i}$$
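      Read as "the count of a codon divided by the summed counts of all codons for the same amino acid", the frequency can be computed as in the sketch below. The function name and the codon counts are hypothetical.

```python
def codon_frequencies(codon_counts, synonymous_codons):
    """Frequency of each codon among the synonymous codons of one amino acid."""
    total = sum(codon_counts[codon] for codon in synonymous_codons)
    return {codon: codon_counts[codon] / total for codon in synonymous_codons}


# Made-up codon counts for one organism (illustrative only).
counts = {"AAA": 300, "AAG": 100, "GAA": 250, "GAG": 250}

lysine = codon_frequencies(counts, ["AAA", "AAG"])
glutamate = codon_frequencies(counts, ["GAA", "GAG"])
print(lysine)
print(glutamate)
```

      Unlike the CAI weight, these frequencies sum to one per amino acid and do not require a reference gene set.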

      Secondly, the variance for each codon position is calculated separately as an intermediate step for the calculation of the least-squares variance. During the calculation of the variance, the codon frequency at that particular sequence position is also taken into account in order to remove outliers as much as possible. The calculated variances for each position are ordered from lowest to highest.

      In the first iteration of generating the final sequence, we use the lowest-variance codon at every position. The generated sequence is then screened for type II restriction enzyme recognition sequences to make it MoClo compatible and easier to use in a construct. If a site is found, the codon at that particular position is substituted with the synonymous codon that has the second-lowest variance. After going through this iteration cycle multiple times, we obtain a single nucleotide sequence that is cleared of type II restriction sites and codon harmonized for all organisms of interest, in order to achieve the same translation rate in each organism of interest.
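      The iteration described above can be sketched as follows. This is an illustration under stated assumptions, not our MATLAB implementation: each synonymous codon is scored by the summed squared difference between its frequency in every host and the frequency of the original codon in the reference organism, and a restriction site is resolved by moving the codon in which the site starts to its next-best alternative. All names and frequencies are hypothetical; GGTCTC is the BsaI recognition sequence.

```python
def rank_candidates(ref_freq, host_freqs, candidates):
    """Order synonymous codons by summed squared deviation from the reference frequency."""
    def score(codon):
        return sum((freqs[codon] - ref_freq) ** 2 for freqs in host_freqs)
    return sorted(candidates, key=score)


def harmonize(codons, ref_freqs, host_freqs, synonyms, sites):
    """Pick the best-ranked codon per position, then remove restriction sites."""
    ranked = [rank_candidates(ref_freqs[c], host_freqs, synonyms[c]) for c in codons]
    choice = [0] * len(codons)  # index of the chosen candidate at each position
    while True:
        seq = "".join(r[i] for r, i in zip(ranked, choice))
        # Only the given strand is checked; add reverse complements to `sites` if needed.
        hit = next((seq.find(s) for s in sites if s in seq), -1)
        if hit < 0:
            return seq
        pos = hit // 3  # codon in which the site starts
        if choice[pos] + 1 >= len(ranked[pos]):
            raise ValueError(f"no synonymous alternative left at codon {pos}")
        choice[pos] += 1  # fall back to the next-lowest score


gly = ["GGT", "GGC", "GGA", "GGG"]
leu = ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"]
synonyms = {"ATG": ["ATG"], "GGC": gly, "CTG": leu}
ref_freqs = {"ATG": 1.0, "GGC": 0.40, "CTG": 0.50}  # made-up reference frequencies
host_a = {"ATG": 1.0, "GGT": 0.38, "GGC": 0.30, "GGA": 0.12, "GGG": 0.20,
          "CTT": 0.10, "CTC": 0.48, "CTA": 0.04, "CTG": 0.20, "TTA": 0.08, "TTG": 0.10}
host_b = {"ATG": 1.0, "GGT": 0.42, "GGC": 0.32, "GGA": 0.16, "GGG": 0.10,
          "CTT": 0.12, "CTC": 0.52, "CTA": 0.05, "CTG": 0.15, "TTA": 0.06, "TTG": 0.10}

result = harmonize(["ATG", "GGC", "CTG"], ref_freqs, [host_a, host_b],
                   synonyms, sites=["GGTCTC"])
print(result)
```

      In this toy input the first pass selects GGT followed by CTC, which creates a BsaI site, so the glycine codon falls back to its second-ranked alternative. A production version would also need a strategy for sites that start in a codon without synonyms, such as ATG.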

      Inputs and Outputs

      Our codon harmonization tool requires three inputs and generates two outputs.
      The harmonization script requires the following three inputs:

      1. The filtered database containing only the codon usage of the organisms of interest and the reference organism's codon usage (data_formatted = output file name of the filtered database).
      2. Database containing recognition sites for type II restriction enzymes (restriction_enzymes_database).
      3. Nucleotide sequence of the gene of interest that will be harmonized.

      As output the codon harmonization script will generate the following two outputs:

      1. A codon harmonized nucleotide sequence usable in all organisms of interest.
      2. The deviation of the endogenous GC content of each organism of interest from the mean GC content across these organisms. (This output is added to inform the user whether the GC contents of the organisms of interest are too different from each other. If the GC contents are too different, a notification will pop up on the screen.)

    • Obtained output
      For the validation of our model we used the microorganisms listed below:
      Organism name | taxid
      Escherichia coli BL21 (DE3) | 469008
      Vibrio natriegens NBRC 15636 = ATCC 14048 = DSM 759 | 1219067
      Bacillus subtilis subsp. subtilis str. 168 | 224308

      As input sequence we use the coding sequence for eGFP (click here to get the sequence).
      The generated output sequence is here (click here to get the sequence).

References