Difference between revisions of "Team:Marburg/Model"

Line 1,154: Line 1,154:
 
             <section class="section">
 
             <section class="section">
 
               <!--Content of popup-->
 
               <!--Content of popup-->
 +
              <h1 class="title">Terminator Model </h1>
 +
 +
<p>
 +
  Conversations with numerous experts in the field of phototrophic research highlighted
 +
the need for strong transcriptional termination for large genetic engineering projects.
 +
<br>
 +
In bacteria, two processes are responsible for proper transcript termination: intrinsic Rho-independent terminators, generally low energy RNA hairpins; and Rho-dependent terminators, which rely on the binding of the Rho protein.
 +
The majority of bacteria have a homolog of the E. coli Rho protein, with a few exceptions such as our organism S. elongatus <a href="https://doi.org/10.1371/journal.pcbi.0010025">[<i>de Hoon et al.,</i> 2005]</a>.
 +
 +
<br>
 +
We therefore concentrated first on the natural intrinsic terminators of our strain UTEX 2973, which required a closer look at how intrinsic terminators function. Rho-independent terminators typically consist of short (7-20 base pairs), mostly GC-rich hairpins. The hairpin is followed by a stretch of uracil residues. A protein bound to the RNA polymerase then binds to the stem-loop tightly enough to cause the polymerase to stall temporarily. The pausing of the polymerase coincides with the transcription of the poly-uracil region. The weak adenine-uracil base pairs then lower the energy needed to destabilize the RNA-DNA duplex, allowing it to unwind and the transcript to dissociate from the RNA polymerase [<i>Krebs et al.,</i> 2014].
 +
<br>
 +
It’s important to note that, especially in our organism S. elongatus, not all terminators cause complete termination. In some cases, these terminators are found in between ORFs inside the same operon and might be involved in creating complex transcription structures. From here on, however, our analysis will be mainly focused on the standard case.
 +
<br>
 +
Our first objective was to find promising natural terminators. To achieve this goal we applied several state-of-the-art bioinformatics tools to obtain a comprehensive overview of as many candidates as possible. We used the following software:
 +
</p>
 +
<ol>
 +
  <li>ARNold, which itself consists of two complementary programs: Erpin (<i>Gautheret et al.,</i> 2001) and RNAmotif (<i>Macke et al.,</i> 2001).</li>
 +
  <li>TransTermHP (<i>Kingsford et al.,</i> 2007)</li>
 +
  <li>FindTerm (<i>Solovyev et al.,</i> 2011)</li>
 +
</ol>
 +
<p>
 +
 +
By design of this multi-tool approach, the resulting list of 2113 sequences contained many false-positive and duplicate terminator candidates.
 +
<br>
 +
To analyze the data we split the list in two, grouping the sequences by strand. The next step was to clear the list of possible duplicates. This was done by analyzing the intersection of the respective bp positions. If both the intersection and the symmetric difference of two separate terminator candidates were non-empty, we merged them by extending one candidate by the difference. To refine the selection we later analyzed the secondary RNA structure via kinetic modeling.
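<br>
The duplicate removal described above can be sketched as a per-strand interval merge. This is a minimal illustration, not our original script: the tuple layout <code>(strand, start, end)</code> with inclusive coordinates and the function name <code>merge_candidates</code> are assumptions made for the example. Identical candidates (empty symmetric difference) collapse into one entry as a side effect.

```python
from collections import defaultdict


def merge_candidates(candidates):
    """Merge overlapping terminator candidates strand by strand.

    `candidates` is an iterable of (strand, start, end) tuples with
    inclusive coordinates (an assumed layout for this sketch).
    Returns a dict mapping each strand to a sorted list of merged
    (start, end) intervals.
    """
    by_strand = defaultdict(list)
    for strand, start, end in candidates:
        by_strand[strand].append((start, end))

    merged = {}
    for strand, intervals in by_strand.items():
        intervals.sort()
        result = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = result[-1]
            if start <= last_end:
                # Non-empty intersection: extend the previous
                # candidate by the part that sticks out.
                result[-1] = (last_start, max(last_end, end))
            else:
                result.append((start, end))
        merged[strand] = result
    return merged
```

For example, two candidates at positions 10-40 and 30-55 on the same strand would be reported as a single candidate spanning 10-55.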
 +
<br>
 +
In order to filter out the misrecognized terminators from our list, we decided to use the much more detailed transcriptomics data of both UTEX 2973 and its closely related strain PCC 7942.
 +
Our approach was divided into two parts:
 +
</p>
 +
<ol>
 +
  <li>Identify whether the sequence is contained inside an open reading frame.</li>
 +
  <li>Determine the approximate in vivo termination efficiency of each candidate.</li>
 +
</ol>
 +
<p>
 +
For the first part of this approach we took into account the Joint Genome Institute (JGI) predictions and transcriptionally identified ORFs. To avoid considering wrong candidates, we removed any sequence whose intersection with an ORF exceeded a threshold of 15%.
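<br>
A minimal sketch of this ORF filter is shown below. It assumes that the 15% refers to the fraction of the candidate's own length that lies inside an ORF and that coordinates are inclusive; both points, as well as the function names, are assumptions for illustration.

```python
def orf_overlap_fraction(candidate, orfs):
    """Largest fraction of the candidate covered by any single ORF.

    `candidate` and each entry of `orfs` are (start, end) tuples with
    inclusive coordinates on the same strand (assumed layout).
    """
    start, end = candidate
    length = end - start + 1
    best = 0
    for orf_start, orf_end in orfs:
        # Overlap length; negative or zero means no shared positions.
        overlap = min(end, orf_end) - max(start, orf_start) + 1
        best = max(best, overlap)
    return best / length


def filter_intergenic(candidates, orfs, threshold=0.15):
    """Keep candidates whose ORF overlap does not exceed the threshold."""
    return [c for c in candidates
            if orf_overlap_fraction(c, orfs) <= threshold]
```

With ORFs at 1-103 and 220-400, a candidate at 100-139 overlaps an ORF by 10% and is kept, while a candidate at 200-239 overlaps by 50% and is removed.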
 +
<br>
 +
To approximate the in vivo efficiency of each sequence, we calculated the relative decline in average base counts between the 25-base windows before and after the terminator candidate (Creecy et al., 2015). Candidates with an estimated efficiency below a stringent threshold of 80% were excluded from further consideration.
 +
</p>
 +
<figure>
 +
  <img src="https://2019.igem.org/File:T--Marburg--m_terminator_bpcount.jpg">
 +
  <figcaption>
 +
    Exemplary efficiency analysis of a predicted terminator on the sense strand. The x-axis shows the nucleotide position on the genome, the y-axis the read counts for the associated base. The predicted terminator is displayed in red; the 25 bases before and after the terminator sequence are colored blue.
 +
  </figcaption>
 +
</figure>
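<p>
The efficiency estimate can be sketched as follows. This is an illustration under stated assumptions rather than our original code: <code>coverage</code> is taken to be a list of per-base read counts ordered in the direction of transcription (for the antisense strand it would have to be reversed first), and indices are 0-based and inclusive.
</p>

```python
def termination_efficiency(coverage, start, end, window=25):
    """Relative decline in mean coverage across a terminator candidate.

    `coverage` holds per-base read counts in the direction of
    transcription; `start` and `end` are 0-based inclusive indices of
    the candidate. Returns 1 - mean(after) / mean(before), or None if
    there is no signal upstream to compare against.
    """
    upstream = coverage[max(0, start - window):start]
    downstream = coverage[end + 1:end + 1 + window]
    if not upstream or not downstream:
        raise ValueError("candidate too close to the sequence boundary")

    mean_up = sum(upstream) / len(upstream)
    mean_down = sum(downstream) / len(downstream)
    if mean_up == 0:
        return None
    return 1 - mean_down / mean_up


def passes_threshold(efficiency, threshold=0.8):
    """Apply the 80% cut-off; candidates without signal are rejected."""
    return efficiency is not None and efficiency >= threshold
```

<p>
A candidate with an average of 100 counts upstream and 25 counts downstream would have an estimated efficiency of 75% and would therefore be excluded.
</p>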
 +
<p>
 +
After the careful separation of the unsuitable candidates we were left with the most promising terminators. To further analyze the functions of these terminators a kinetic approach was indispensable.
 +
<br>
 +
The RNA secondary structures were predicted using KineFold. To choose the most likely formation we performed multiple independent runs using different random seeds and chose the most frequent structure. 
 +
</p>
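<p>
Choosing the most frequent structure among repeated runs can be sketched as below. The example does not call KineFold itself; it assumes the final structure of each run has already been collected as a dot-bracket string, and the function name is hypothetical.
</p>

```python
from collections import Counter


def consensus_structure(structures):
    """Return the most frequent structure and its share of all runs.

    `structures` is a non-empty list of dot-bracket strings, one per
    independent folding run (assumed input format).
    """
    counts = Counter(structures)
    structure, hits = counts.most_common(1)[0]
    return structure, hits / len(structures)
```

<p>
The returned share gives a rough indication of how consistently the runs agree on one fold.
</p>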
 +
<figure>
 +
  <img src="https://2019.igem.org/File:T--Marburg--m_terminator_folding.jpg">
 +
  <figcaption>
 +
    Example of a secondary structure prediction using MFOLD. The poly(U) region is clearly part of the predicted hairpin. The prediction also shows the high GC content at the base of the stem, a typically small stem loop and the poly(A) region at the 5’ end.
 +
  </figcaption>
 +
</figure>
 +
Based on these results, we needed to correctly identify the U-tract, hairpin and A-tract regions. The predicted secondary structures were often hairpins that extended beyond the terminator hairpin, because base pairs formed between the upstream poly(A) sequence and the U-tract. For a precise identification of these regions, the poly(U) region had to be assigned to the U-tract and not to the hairpin. Several steps were needed to distinguish the two correctly. Given a stem-loop structure, we screened for possible U-tracts in the region between the sixth nucleotide in the 3’-arm of the stem loop and the eighth nucleotide after the stem by evaluating every window of 8 nucleotides.
 +
<br>
 +
For this we calculated the Gibbs free energy of every possible U-tract using the formula
 +
<br>
 +
<figure>
 +
  <img src="https://via.placeholder.com/150" alt="Placeholder image">
 +
</figure>
 +
<p>
 +
Where N<sub>U</sub> = 8 is the length of the U-tract and ΔG<sub>RNA:DNA</sub> is the free-energy contribution of the RNA:DNA hybridization of the two nucleotide pairs at positions i and i+1.
 +
 +
The hybridization energies were calculated using the nearest-neighbor thermodynamic parameters at the respective positions (Sugimoto et al., 1996).
 +
The 8-nucleotide sequence with the highest ΔG<sub>U</sub> value was then selected as the U-tract.
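<br>
The window scan can be sketched as follows. The sketch assumes that the free energy of a window is the sum of the nearest-neighbor terms over its adjacent nucleotide pairs; any constant term would not change which window ranks highest. The parameter table <code>nn_params</code> must be supplied by the user, and the values used in the usage note are placeholders, not published parameters. Function names and the 0-based, end-exclusive region bounds are also assumptions.

```python
def utract_energy(window, nn_params):
    """Sum nearest-neighbor terms over adjacent pairs (i, i+1)."""
    return sum(nn_params[window[i:i + 2]]
               for i in range(len(window) - 1))


def select_utract(rna, region_start, region_end, nn_params, n_u=8):
    """Pick the n_u-nucleotide window with the highest free energy.

    `rna` is the transcript sequence (A, C, G, U); the search region
    is rna[region_start:region_end]. Only windows lying completely
    inside the region are evaluated. Returns (start, window, energy)
    or None if the region is shorter than n_u.
    """
    best = None
    for i in range(region_start, region_end - n_u + 1):
        window = rna[i:i + n_u]
        energy = utract_energy(window, nn_params)
        if best is None or energy > best[2]:
            best = (i, window, energy)
    return best
```

With a placeholder table in which every dinucleotide is assigned -2.0 except UU at -0.5, scanning <code>GCGCAUUUUUUUUGCA</code> selects the run of eight uracils, since the weakest RNA:DNA hybrid has the highest (least negative) free energy.
<br>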
 +
 +
With the proper identification of the U-tract it was now possible for us to precisely define each region.
 +
<br>
 +
 +
<b>< TABLE OF 10 TER HERE></b>
 +
 +
<br>
 +
We now wanted to use these records to analyze the impact of mutations in different terminator regions. In order to experimentally test this, we established a workflow that allows us to screen a huge combinatorial library of terminators.
 +
 +
For this we selected three of the strongest terminators, which have mutually distinct features such as different hairpin and loop lengths.
 +
Based on prior research experience we decided to include mutations in the respective U- and A-tracts. The synthetic library was ordered as degenerate oligos.
 +
 +
To test the terminator efficiency in vivo we built a Golden Gate Lvl2 construct with a terminator placeholder flanked by two fluorescent protein genes.
 +
<br>
 +
< Construct >
 +
<br>
 +
Because these fluorescent proteins have different emission spectra, we will be able to measure both independently, which allows for an indirect measurement of terminator strength.
 +
For this we calculate the ratio between induced mTurquoise and induced YFP, normalized by the control (a plasmid with no terminator inserted).
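<br>
The normalized ratio can be written down in a few lines. The sketch below only reproduces the ratio as worded above; it makes no assumption about which reporter sits downstream of the terminator, and the function and argument names are hypothetical.

```python
def normalized_ratio(mturquoise, yfp, mturquoise_ctrl, yfp_ctrl):
    """mTurquoise/YFP ratio of a sample relative to the control.

    All four arguments are induced fluorescence readings; the control
    values come from the plasmid without a terminator. A value of 1
    means the sample behaves like the control.
    """
    if yfp <= 0 or yfp_ctrl <= 0 or mturquoise_ctrl <= 0:
        raise ValueError("fluorescence readings must be positive")
    sample_ratio = mturquoise / yfp
    control_ratio = mturquoise_ctrl / yfp_ctrl
    return sample_ratio / control_ratio
```

For instance, a sample with readings of 200 (mTurquoise) and 1000 (YFP) against a control with 800 and 1000 yields a normalized ratio of 0.25.
<br>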
 +
 +
With the help of FACS we will be able to systematically separate the different terminators and analyze the impact of different mutations.
 +
 +
We hope that this approach will inspire other teams to build and screen large libraries of synthetic parts so that the scientific community can gain a deeper insight into the inner workings of elementary molecular building blocks.
 +
 +
</p>
 +
 +
<h2 class="subtitle">References</h2>
 +
<p>
 +
 +
  Chen, J., Morita, T., & Gottesman, S. (2019). Regulation of Transcription Termination of Small RNAs and by Small RNAs: Molecular Mechanisms and Biological Functions. Frontiers in Cellular and Infection Microbiology, 9. https://doi.org/10.3389/fcimb.2019.00201
 +
<br>
 +
de Hoon, M. J. L., Makita, Y., Nakai, K., & Miyano, S. (2005). Prediction of Transcriptional Terminators in Bacillus subtilis and Related Species. PLoS Computational Biology, 1(3), e25. https://doi.org/10.1371/journal.pcbi.0010025
 +
<br>
 +
 +
Krebs, J., Lewin, B., Kilpatrick, S. & Goldstein, E. (2014). Lewin's genes XI. Burlington, Mass: Jones & Bartlett Learning.
 +
<br>
 +
 +
Gautheret, D., & Lambert, A. (2001). Direct RNA Motif Definition and Identification from Multiple Sequence Alignments using Secondary Structure Profiles. Journal of Molecular Biology, 313, 1003–1011.
 +
<br>
 +
 +
Macke, T., Ecker, D., Gutell, R., Gautheret, D., Case, D. A., & Sampath, R. (2001). RNAMotif – A new RNA secondary structure definition and discovery algorithm. Nucleic Acids Research, 29, 4724–4735.
 +
<br>
 +
 +
Kingsford, C. L., Ayanbule, K., & Salzberg, S. L. (2007). Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biology, 8(2), R22. https://doi.org/10.1186/gb-2007-8-2-r22
 +
<br>
 +
 +
Solovyev, V., & Salamov, A. (2011). Automatic Annotation of Microbial Genomes and Metagenomic Sequences. In R. W. Li (Ed.), Metagenomics and its Applications in Agriculture, Biomedicine and Environmental Studies (pp. 61–78). Nova Science Publishers.
 +
<br>
 +
 +
Chen, Y.-J., Liu, P., Nielsen, A. A. K., Brophy, J. A. N., Clancy, K., Peterson, T., & Voigt, C. A. (2013). Characterization of 582 natural and synthetic terminators and quantification of their design constraints. Nature Methods, 10(7), 659–664. https://doi.org/10.1038/nmeth.2515
 +
<br>
 +
 +
Tan, X., Hou, S., Song, K., Georg, J., Klähn, S., Lu, X., & Hess, W. R. (2018). The primary transcriptome of the fast-growing cyanobacterium Synechococcus elongatus UTEX 2973. Biotechnology for Biofuels, 11(1). https://doi.org/10.1186/s13068-018-1215-8
 +
<br>
 +
 +
Vijayan, V., Jain, I. H., & O’Shea, E. K. (2011). A high resolution map of a cyanobacterial transcriptome. Genome Biology, 12(5), R47. https://doi.org/10.1186/gb-2011-12-5-r47
 +
<br>
 +
 +
Creecy, J. P., & Conway, T. (2015). Quantitative bacterial transcriptomics with RNA-seq. Current Opinion in Microbiology, 23, 133–140. https://doi.org/10.1016/j.mib.2014.11.011
 +
<br>
 +
 +
Sugimoto, N., Nakano, S. -i., Yoneyama, M., & Honda, K. -i. (1996). Improved Thermodynamic Parameters and Helix Initiation Factor to Predict Stability of DNA Duplexes. Nucleic Acids Research, 24(22), 4501–4505. https://doi.org/10.1093/nar/24.22.4501
 +
 +
</p>
 +
 +
 
             </section>
 
             </section>
 
           </div>
 
           </div>

Revision as of 01:40, 22 October 2019

Modelling


This year we used our mathematical and programming background to look for artificial Neutral integration Site options (aNSo) and suitable terminators for our project. We took advantage of the genome database of UTEX 2973 and used bioinformatics tools to gain insights and apply them to our project. In addition, we designed a model to predict the doubling times of UTEX 2973, which was only possible after a thorough investigation and standardization of current state-of-the-art methods. To achieve this level of standardization we also implemented a light model to properly predict light intensities for our cultures.


Growth Curve Model


artificial Neutral integration Site options


Terminator Model