Team:OUC-China/hometest2

kkkkk


























































Overview:

This year OUC-China are working on post-transcriptional regulation. At present, post-transcriptional regulation is still a hot topic of research, and various regulation methods and levels are what most people pursue. In our work, we want the modular riboswitch to have a wider level of regulation. Therefore, in order to explore what parameters play a crucial role in the translation level of our RiBoLeGo system, we carried out sensitivity analysis on the parameters obtained by ordinary differential equation(ODE). And the result of the ODE model is that translation rate plays an important role in different translation levels.

So in order to change the translation rate, the thermodynamic model was introduced . By this model, we know that riboswitch and stabilizer is how to affect the downstream gene translation level.What’s more, this sequence-to-function thermodynamic model can predict expression level ratios in natural operons and to design synthetic operons with desired expression level ratios.Depending on the thermodynamic model, a variety of tunners were successfully designed.

The thermodynamic model shows that different stabilizers will affect downstream gene translation rate. So how to select the suitable stabilizer become crucial. We get the docking matrix corresponding score function, and compared these score values, so as to obtain the suitable to stabilizer sequence.

Through various tunners, we have more translation levels in our Ribolego system. But our aim is to have more diverse regulation, so we introduced asRNA to re-activate or re-repress the riboswitch by hiding or exposing the RBS. According to the following design factors: an Hfq binding site, thermodynamics, the asRNA length, the number of mismatches, and the binding location. Finally, we successfully designed the corresponding asRNA sequence. And the inhibition effect of the designed asRNA sequence was predicted by the sequence-effect function.

Finally, in order to explore the whole process of the riboswitch from off to on at the molecular level with or without ligands, we used molecular dynamics to simulate the whole process and explored how the riboswitch function in more detail by distinguishing its on ,off and transition states by molecular distance. And depend on it, we can understand the molecular mechanism which the riboswitch is turned on.


Fig A. Model overview




PART1:ODE

In our work, our aim is to have more different translation’s levels. Therefore, in order to understand which parameters play a key role in the RiBoLeGo system, we used ODE model to simulate the whole process of the RiBoLeGo system.

REACTION:

We can describe our RiBoLeGo system to be followings:

① null -> Riboswitch_off

② Riboswitch_off + ligand <-> Riboswitch_on

③ Riboswitch_on -> STA + Riboswitch_on

④ Riboswitch_on -> GOI + Riboswitch_on

⑤ Riboswitch_off -> null

⑥ Riboswitch_on -> ligand

⑦ STA -> null

⑧ GOI -> null

ODES:

To simulate the dynamics of sfGFP, we use ordinary differential equations to model the reactions above. And ODEs are given as follows:

d[ Riboswitch_off ] dt = k 1 k d1 [ Riboswitch_off ] k on1 [ Riboswitch_off ][ ligand ]+ k off1 [ Riboswitch_on ] MathType@MTEF@5@5@+= feaagKart1ev2aaaM1ovLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGSNuPj2BZDxA0ngAC91BMzwFGWLCPDgA01vF 9T3EKrxF9bspGS3AFftFG0ci7T2x7rwm91hxSvMz0Tfi7jvAI92C3L gDJHgxF9MzM1hiCjxANHgDDbslGS3AFT3BUftF9XfBLzgDBbYEsLMy Vn3DPr3yOX1xVzMz9bcxYL2zOrxxCXwzMr3wGShBPDwyUrwFGWLCPD gA01fiRaYER91EVzMzX0xFCXwzMr3wGSNuPj2BZDxA0ngAC91BU1hi CjxANHgDDbWexLMBbXgBcf2CPn2qVrwzqf2zLnharuavP1wzZbItLD his9wBH5garmWu51MyVXgaruWqVvNCPvMCG4uz3bqee0evGueE0jxy amXvP5wqSX2qVrwzqf2zLnharyWYoZC5aibaieYlNi=xH8yiVC0xbb L8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbbG8FasPYRqj0=yi0dXd bba9pGe9xq=JbbG8A8frFve9Fve9Ff0dmeaabaqaciGacaGaaeqaba WaaeaaeaqbaOqaaabaaaaaaaaapeWaaSaaa8aabaWdbiaabsgadaWa daWdaeaapeGaamOuaiaadMgacaWGIbGaam4BaiaadohacaWG3bGaam yAaiaadshacaWGJbGaamiAaiaac+facaWGVbGaamOzaiaadAgaaiaa wUfacaGLDbaaa8aabaWdbiaadsgacaWG0baaaiabg2da9iaadUgapa WaaSbaaSqaa8qacaaIXaaapaqabaGcpeGaeyOeI0Iaam4Aa8aadaWg aaWcbaWdbiaadsgacaaIXaaapaqabaGcpeWaamWaa8aabaWdbiaadk facaWGPbGaamOyaiaad+gacaWGZbGaam4DaiaadMgacaWG0bGaam4y aiaadIgacaGGFbGaam4BaiaadAgacaWGMbaacaGLBbGaayzxaaGaey OeI0Iaam4Aa8aadaWgaaWcbaWdbiaad+gacaWGUbGaaGymaaWdaeqa aOWdbmaadmaapaqaa8qacaWGsbGaamyAaiaadkgacaWGVbGaam4Cai aadEhacaWGPbGaamiDaiaadogacaWGObGaai4xaiaad+gacaWGMbGa amOzaaGaay5waiaaw2faamaadmaapaqaa8qacaWGSbGaamyAaiaadE gacaWGHbGaamOBaiaadsgaaiaawUfacaGLDbaacqGHRaWkcaWGRbWd amaaBaaaleaapeGaam4BaiaadAgacaWGMbGaaGymaaWdaeqaaOWdbm aadmaapaqaa8qacaWGsbGaamyAaiaadkgacaWGVbGaam4CaiaadEha caWGPbGaamiDaiaadogacaWGObGaai4xaiaad+gacaWGUbaacaGLBb Gaayzxaaaaaa@F2B7@

d[ Riboswitch_on ] dt =kon1[ Riboswitch_off ][ ligand ]-kdc1[ Riboswitch_on ] MathType@MTEF@5@5@+= feaagKart1ev2aaaMXlvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGSNuPj2BZDxA0ngAC91BU1hiCjxANHgDD1xF 7Thz01xFG0diR9MBXWfBLzgDBbYEsLMyVn3DPr3yOX1xVzMz9bcxYL 2zOrxxCXwzMr3wGShBPDwyUrwFGWLCPDgA011ECjxB7bslG0xFRr2y XWfBLzgDBbYEsThxY12EPj2BZDxA0ngA91hxF9MB9bcxYL2zOrxxam XvP5wqSXMqHnxAJn0BKvguHDwzZbqefqvATv2CG4uz3bIuV1wyUbqe dmvETj2BSbqefm0B1jxALjhiov2DaebbnrfifHhDYfgatCvAUfeBSn 0BKvguHDwzZbqegSSZmxoasaacH8YjY=vipgYlh9vqqj=hEeeu0xXd bba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9q8qqaq=dir=f0= yqaiVgFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabauaa keaaqaaaaaaaaaWdbmaalaaapaqaa8qacaqGKbWaamWaa8aabaWdbi aadkfacaWGPbGaamOyaiaad+gacaWGZbGaam4DaiaadMgacaWG0bGa am4yaiaadIgacaGGFbGaam4Baiaad6gaaiaawUfacaGLDbaaa8aaba WdbiaadsgacaWG0baaaiabg2da9iaadUgacaWGVbGaamOBaiaaigda daWadaWdaeaapeGaamOuaiaadMgacaWGIbGaam4BaiaadohacaWG3b GaamyAaiaadshacaWGJbGaamiAaiaac+facaWGVbGaamOzaiaadAga aiaawUfacaGLDbaadaWadaWdaeaapeGaamiBaiaadMgacaWGNbGaam yyaiaad6gacaWGKbaacaGLBbGaayzxaaGaaeylaiaadUgacaWGKbGa am4yaiaaigdadaWadaWdaeaapeGaamOuaiaabMgacaqGIbGaae4Bai aabohacaqG3bGaaeyAaiaabshacaqGJbGaaeiAaiaac+facaWGVbGa amOBaaGaay5waiaaw2faaaaa@C7B6@

d[ ligand ] dt = k on1 [ Riboswitch_off ][ ligand ]+ k dc1 [ Riboswitch_on ] k off1 [ Riboswitch_on ] MathType@MTEF@5@5@+= feaagKart1ev2aaaMvnvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7r gxSvMz0Tfi7XwANfMBK1hiCjxANHgDD1xF7Thz01xFG0di7T2x79MB X0xFCXwzMr3wGSNuPj2BZDxA0ngAC91BMzwFGWLCPDgA01fxSvMz0T fi7XwANfMBK1hiCjxANHgDDbYkGS3AFThzJftF9XfBLzgDBbYEsLMy Vn3DPr3yOX1xV5wFGWLCPDgA01fiTaYER91EVzMzX0xFCXwzMr3wGS NuPj2BZDxA0ngAC91BU1hiCjxANHgDDbWexLMBbXgBcf2CPn2qVrwz qf2zLnharuavP1wzZbItLDhis9wBH5garmWu51MyVXgaruWqVvNCPv MCG4uz3bqee0evGueE0jxyamXvP5wqSX2qVrwzqf2zLnharyWYoZC5 aibaieYlNi=xH8yiVC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm 0xbbG8FasPYRqj0=yi0dXdbba9pGe9xq=JbbG8A8frFve9Fve9Ff0d meaabaqaciGacaGaaeqabaWaaeaaeaqbaOqaaabaaaaaaaaapeWaaS aaa8aabaWdbiaadsgadaWadaWdaeaapeGaamiBaiaadMgacaWGNbGa amyyaiaad6gacaWGKbaacaGLBbGaayzxaaaapaqaa8qacaWGKbGaam iDaaaacqGH9aqpcaWGRbWdamaaBaaaleaapeGaam4Baiaad6gacaaI XaaapaqabaGcpeWaamWaa8aabaWdbiaadkfacaWGPbGaamOyaiaad+ gacaWGZbGaam4DaiaadMgacaWG0bGaam4yaiaadIgacaGGFbGaam4B aiaadAgacaWGMbaacaGLBbGaayzxaaWaamWaa8aabaWdbiaadYgaca WGPbGaam4zaiaadggacaWGUbGaamizaaGaay5waiaaw2faaiabgUca RiaadUgapaWaaSbaaSqaa8qacaWGKbGaam4yaiaaigdaa8aabeaak8 qadaWadaWdaeaapeGaamOuaiaadMgacaWGIbGaam4BaiaadohacaWG 3bGaamyAaiaadshacaWGJbGaamiAaiaac+facaWGVbGaamOBaaGaay 5waiaaw2faaiabgkHiTiaadUgapaWaaSbaaSqaa8qacaWGVbGaamOz aiaadAgacaaIXaaapaqabaGcpeWaamWaa8aabaWdbiaadkfacaWGPb GaamOyaiaad+gacaWGZbGaam4DaiaadMgacaWG0bGaam4yaiaadIga caGGFbGaam4Baiaad6gaaiaawUfacaGLDbaaaaa@DEFF@

d[ STA ] dt =kp1[ Riboswitch_on ]-kd2[ STA ] MathType@MTEF@5@5@+= feaagKart1ev2aaaM5hvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7r gxSvMz0Tfi7nfvb1hiCjxANHgDD1xF7Thz01xFG0diRbxmCXwzMr3w GSNuPj2BZDxA0ngAC91BU1hiCjxANHgDDThxY12EG0ciRrMm91hxSv Mz0Tfi7nfvb1hiCjxANHgDDbWexLMBbXgBcf2CPn2qVrwzqf2zLnha ruavP1wzZbItLDhis9wBH5garmWu51MyVXgaruWqVvNCPvMCG4uz3b qee0evGueE0jxyamXvP5wqSX2qVrwzqf2zLnharyWYoZC5aibaieYl Ni=xH8yiVC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbbG8Fa sPYRqj0=yi0dXdbba9pGe9xq=JbbG8A8frFve9Fve9Ff0dmeaabaqa ciGacaGaaeqabaWaaeaaeaqbaOqaaabaaaaaaaaapeWaaSaaa8aaba WdbiaadsgadaWadaWdaeaapeGaam4uaiaadsfacaWGbbaacaGLBbGa ayzxaaaapaqaa8qacaWGKbGaamiDaaaacqGH9aqpcaWGRbGaamiCai aaigdadaWadaWdaeaapeGaamOuaiaadMgacaWGIbGaam4Baiaadoha caWG3bGaamyAaiaadshacaWGJbGaamiAaiaac+facaWGVbGaamOBaa Gaay5waiaaw2faaiaab2cacaqGRbGaaeizaiaabkdadaWadaWdaeaa peGaam4uaiaadsfacaWGbbaacaGLBbGaayzxaaaaaa@8FE2@

d[ GOI ] dt =kp2[ Riboswitch_on ]-kd3[ GOI ] MathType@MTEF@5@5@+= feaagKart1ev2aaaMvivLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGS3rpLuFGWLCPDgA01vF9T3EKrxF9bspGSgC YWfBLzgDBbYEsLMyVn3DPr3yOX1xV5wFGWLCPDgA011ECjxB7bslGS gzZ0xFCXwzMr3wGS3rpLuFGWLCPDgA01fatCvAUfeBSjuyZL2yd9gz LbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYL wzYbItLDharqqtubsr4rNCHbWexLMBbXgBd9gzLbvyNv2CaeHbl7mZ LdGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8 WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vq pWqaaeaabiGaciaacaqabeaadaqaaqaafaGcbaaeaaaaaaaaa8qada WcaaWdaeaapeGaaeizamaadmaapaqaa8qacaWGhbGaam4taiaadMea aiaawUfacaGLDbaaa8aabaWdbiaadsgacaWG0baaaiabg2da9iaadU gacaWGWbGaaGOmamaadmaapaqaa8qacaWGsbGaamyAaiaadkgacaWG VbGaam4CaiaadEhacaWGPbGaamiDaiaadogacaWGObGaai4xaiaad+ gacaWGUbaacaGLBbGaayzxaaGaaeylaiaabUgacaqGKbGaae4mamaa dmaapaqaa8qacaWGhbGaam4taiaadMeaaiaawUfacaGLDbaaaaa@92F2@

For the readability, the complex symbol is simplified as:

d[ A ] dt =k1-kd1[ A ]-kon1[ A ][ B ]+koff1*[ C ] MathType@MTEF@5@5@+= feaagKart1ev2aaaMjlvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGeeiCjxANHgDD1xF7T3ECjxB7rgD91xF9bsp GSwm7XLCTThiTaYAKftF9XfBLzgDBbsqGWLCPDgA011ECjxB7bslGS 2BUftF9XfBLzgDBbsqGWLCPDgA01fxSvMz0TficbcxYL2zOrxxGSci 7XLCTT3AVzMzX0xFQWfBLzgDBbYqGWLCPDgA01fatCvAUfeBSjuyZL 2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerb d9wDYLwzYbItLDharqqtubsr4rNCHbWexLMBbXgBd9gzLbvyNv2Cae Hbl7mZLdGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0x bba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0= vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaafaGcbaaeaaaaaaaa a8qadaWcaaWdaeaapeGaaeizamaadmaapaqaa8qacaWGbbaacaGLBb Gaayzxaaaapaqaa8qacaqGKbGaaeiDaaaacqGH9aqpcaWGRbGaaGym aiaab2cacaqGRbGaaeizaiaabgdadaWadaWdaeaapeGaamyqaaGaay 5waiaaw2faaiaab2cacaqGRbGaae4Baiaab6gacaqGXaWaamWaa8aa baWdbiaadgeaaiaawUfacaGLDbaadaWadaWdaeaapeGaamOqaaGaay 5waiaaw2faaiabgUcaRiaabUgacaqGVbGaaeOzaiaabAgacaqGXaGa aiOkamaadmaapaqaa8qacaWGdbaacaGLBbGaayzxaaaaaa@A299@

d[ B ] dt =B-kon1[ A ][ B ]+koff1[ C ]+kdc1[ C ] MathType@MTEF@5@5@+= feaagKart1ev2aaaMflvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGieiCjxANHgDD1xF7T3ECjxB7rgD91xF9bsp Gi0ECjxB7bslGS2BUftF9XfBLzgDBbsqGWLCPDgA01fxSvMz0Tficb cxYL2zOrxxGSci7XLCTT3AVzMzX0xFCXwzMr3wGmeiCjxANHgDDbYk GShxY12ERr2yX0xFCXwzMr3wGmeiCjxANHgDDbWexLMBbXgBcf2CPn 2qVrwzqf2zLnharuavP1wzZbItLDhis9wBH5garmWu51MyVXgaruWq VvNCPvMCG4uz3bqee0evGueE0jxyamXvP5wqSX2qVrwzqf2zLnhary WYoZC5aibaieYlNi=xH8yiVC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vq aqpepm0xbbG8FasPYRqj0=yi0dXdbba9pGe9xq=JbbG8A8frFve9Fv e9Ff0dmeaabaqaciGacaGaaeqabaWaaeaaeaqbaOqaaabaaaaaaaaa peWaaSaaa8aabaWdbiaabsgadaWadaWdaeaapeGaamOqaaGaay5wai aaw2faaaWdaeaapeGaaeizaiaabshaaaGaeyypa0JaamOqaiaab2ca caqGRbGaae4Baiaab6gacaqGXaWaamWaa8aabaWdbiaadgeaaiaawU facaGLDbaadaWadaWdaeaapeGaamOqaaGaay5waiaaw2faaiabgUca RiaabUgacaqGVbGaaeOzaiaabAgacaqGXaWaamWaa8aabaWdbiaado eaaiaawUfacaGLDbaacqGHRaWkcaqGRbGaaeizaiaabogacaqGXaWa amWaa8aabaWdbiaadoeaaiaawUfacaGLDbaaaaa@A201@

d[ C ] dt =kon1[ A ][ B ]-koff1*[ C ]-kdc1[ C ] MathType@MTEF@5@5@+= feaagKart1ev2aaaM5kvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGmeiCjxANHgDD1xF7T3ECjxB7rgD91xF9Thx Y12EG0diR9MB91xmCXwzMr3wGeeiCjxANHgDDXfBLzgDBbIqGWLCPD gA011ECjxB7bslGS2BMzwmQ0xFCXwzMr3wGmeiCjxANHgDDThxY12E G0ciRr2yX0xFCXwzMr3wGmeiCjxANHgDDbWexLMBbXgBcf2CPn2qVr wzqf2zLnharuavP1wzZbItLDhis9wBH5garmWu51MyVXgaruWqVvNC PvMCG4uz3bqee0evGueE0jxyamXvP5wqSX2qVrwzqf2zLnharyWYoZ C5aibaieYlNi=xH8yiVC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpe pm0xbbG8FasPYRqj0=yi0dXdbba9pGe9xq=JbbG8A8frFve9Fve9Ff 0dmeaabaqaciGacaGaaeqabaWaaeaaeaqbaOqaaabaaaaaaaaapeWa aSaaa8aabaWdbiaabsgadaWadaWdaeaapeGaam4qaaGaay5waiaaw2 faaaWdaeaapeGaaeizaiaabshaaaGaaeypaiaabUgacaqGVbGaaeOB aiaaigdadaWadaWdaeaapeGaamyqaaGaay5waiaaw2faamaadmaapa qaa8qacaWGcbaacaGLBbGaayzxaaGaaeylaiaabUgacaqGVbGaaeOz aiaabAgacaqGXaGaaeOkamaadmaapaqaa8qacaWGdbaacaGLBbGaay zxaaGaaeylaiaabUgacaqGKbGaae4yaiaabgdadaWadaWdaeaapeGa am4qaaGaay5waiaaw2faaaaa@A012@

d[ D ] dt =kp1[ C ]-kd2[ D ] MathType@MTEF@5@5@+= feaagKart1ev2aaaMXhvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGqeiCjxANHgDD1xF7T3ECjxB7rgD91xF9Thx Y12EG0diRbxm91hxSvMz0TfidbcxYL2zOrxx7XLCTThiTaYAKjtF9X fBLzgDBbcrGWLCPDgA01fatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerb uLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharq qtubsr4rNCHbWexLMBbXgBd9gzLbvyNv2CaeHbl7mZLdGeaGqiVCI8 FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLk VcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGa ciaacaqabeaadaqaaqaafaGcbaaeaaaaaaaaa8qadaWcaaWdaeaape Gaaeizamaadmaapaqaa8qacaWGebaacaGLBbGaayzxaaaapaqaa8qa caqGKbGaaeiDaaaacaqG9aGaae4AaiaabchacaqGXaWaamWaa8aaba WdbiaadoeaaiaawUfacaGLDbaacaqGTaGaae4AaiaabsgacaqGYaWa amWaa8aabaWdbiaadseaaiaawUfacaGLDbaaaaa@80B9@

d[ E ] dt =kp2*[ C ]-kd3*[ E ] MathType@MTEF@5@5@+= feaagKart1ev2aaaM5hvLHfij5gC1rhimfMBNvxyNvgaCzMCHn2E7T hxY12EK1xFCXwzMr3wGueiCjxANHgDD1xF7T3ECjxB7rgD91xF9Thx Y12EG0diRbxF9jJkCXwzMr3wGmeiCjxANHgDDThxY12EG0ciRrwF9n JkCXwzMr3wGueiCjxANHgDDbWexLMBbXgBcf2CPn2qVrwzqf2zLnha ruavP1wzZbItLDhis9wBH5garmWu51MyVXgaruWqVvNCPvMCG4uz3b qee0evGueE0jxyamXvP5wqSX2qVrwzqf2zLnharyWYoZC5aibaieYl Ni=xH8yiVC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbbG8Fa sPYRqj0=yi0dXdbba9pGe9xq=JbbG8A8frFve9Fve9Ff0dmeaabaqa ciGacaGaaeqabaWaaeaaeaqbaOqaaabaaaaaaaaapeWaaSaaa8aaba WdbiaabsgadaWadaWdaeaapeGaamyraaGaay5waiaaw2faaaWdaeaa peGaaeizaiaabshaaaGaaeypaiaabUgacaqGWbGaaGOmaiaacQcada WadaWdaeaapeGaam4qaaGaay5waiaaw2faaiaab2cacaqGRbGaaeiz aiaaiodacaGGQaWaamWaa8aabaWdbiaadweaaiaawUfacaGLDbaaaa a@8281@

SPECIES, SYMBOLS AND PARAMETERS


FIG A.Species, symbols and parameters


The parament we used in the ODEs is listed in the following table:


FIG B.The parament we used in the ODEs(T--OUC-China--The parament we used in the ODEs.png)


SIMULATION

With the help of Simbiology toolbox in Matlab,we simulate our RiBoLeGo system for 16h, the result can be seen in the Fig.A.


Fig.A The dynamics of sfGFP by model prediction


We compare the experimental data to the simulation:




Fig.B The comparison between experimental data and simulation data


SENSITIVITIES ANALYSIS

In order to explore what parameters play a crucial role in the translation level of our Ribolego system, we carried out sensitivity analysis on the parameters obtained by Ordinary differential equation.


Fig C. The numerical integration of sensitivities of parameters in 14h


Through sensitivity analysis, we found that parameters k1, kp2, kdc1 and kd2 had an effect on the expression level of GOI. Since our work focuses on post-transcriptional regulation, we want to keep the riboswitch on-state and GOI translation level stable. Therefore, it is our best choice to change the translation rate to achieve the diversification of the regulation level of RiBoLeGo。












PART 2: NEW THERMODYNAMIC MODEL

FIRST PART: THERMODYNAMIC MODEL WITH PROMOTER

By performing sensitivity analysis of all parameters in this process, we find that translation rate plays an important role in different translation levels. So how to change the translation rate is a big problem for us. Fortunately, from last year’s work, we found that the rate of translation initiation is positively proportional to the translation rate. And the thermodynamic model can calculate the rate of translation initiation.So we try to use thermodynamic model to design different tunners to achieve our goal.

As we all know, in bacteria,translational coupling provides a genetically encoded mechanism to control expression level ratios within multi-cistronic operons.From picture A, we can know the coupled translation of a bi-cistronic operon.The schematic shows the promoter,the (yellow box) 5 UTR, the (red) upstream coding sequence, the (orange box) intergenic region, the (green) downstream coding sequence, and the transcriptional terminator. The intergenic region contains an inhibitory RNA structure composed of the common RBS, repressing region and coupling region . We named it tunner. When the upstream coding sequence is translated, the ribosome unbinds tunner’s inhibitory structure to facilitate downstream coding sequence.


Fig A. translational coupling encoded mechanism


Therefore, the inhibitory structure of tunner directly affects the translation rate of downstream coding sequence. The thermodynamic model takes advantage of this principle to transform the inhibitory structure into translation rate. Finally, We have developed a sequence-to-function thermodynamic model of translational coupling to predict expression level ratios in natural operons and to design synthetic operons with desired expression level ratios. You only give the sequences, the model will use this sequences to calculate the sequences' translation initiation rate .

Now we have known the coupled translation model for bicistronic. So we should know how to use this thermodynamic model to predict GOI’s translation level. In the process of translation, the initial rate of translation determines the rate of translation process. This thermodynamic model will predict the GOI’s initial rate of translation in two parts. The first part is ribosome re-initiation. The second part is ribosome de novo initiation. Ribosome re-initiation occurs when an upstream elongating ribosome dissociates and reassembles at an intergenic region, followed by translation initiation of the downstream coding sequence. Ribosome de novo initiation occurs when cytosolic ribosomes assemble onto the mRNA at intergenic regions and initiate translation of the downstream coding sequence.


Fig B. Ribosome re-initiation and de novo ribosome initiation


From this model, we get a formula to calculate the GOI’S initial rate of translation.The proposed thermodynamic model of translational coupling quantifies the molecular interactions that control both ribosome re-initiation and de novo initiation rates according to an inputted mRNA sequence. Model components and their corresponding sequence region are color coded(picture B), including the free energies of inhibitory RNA structures (coupled and non-coupled), the intergenic distance, the free energy of the ribosome-bound state, and the upstream coding sequence’s translation rate.



Fig C. The proposed biophysical model of translational coupling quantifies the molecular interactions that control both ribosome re-initiation and de novo initiation rates according to an inputted mRNA sequence(tunner).



The ΔG total,upstream refers to the total binding free energy between the ribosome and 5’UTR, according to the equation:

ΔGtotal,upstream=ΔGmRNArRNA+ΔGspacing+ΔGstart+ΔGstandbyΔGnoncoupling-ΔGcoupling*Fcoupling MathType@MTEF@5@5@+= feaagKart1ev2aaaMLovLHfij5gC1rhimfMBNvxyNvgaCruzSrxyGS hxY12EhbcDVrxySXsDWnhDYvwyT1xFG0diCruzSrxyGC0ECjxB71wF 9jLtbbslGShxY12EY1xFs5uqGSciCruzSrxyGC0ECjxB7nhCHnwAUD wF9bYkGWfrLXgDHbYr7XLCTT3C0fMC01xFGSciCruzSrxyGC0ECjxB 7nxF9XfDH5gi7XLCTThzILxF9bslGWfrLXgDHbYr7XLCTTNBV52yVv hCSLMBNbslG0xFCruzSrxyGShxY12Ehn2B1bhBP52zQy0yVvhCSLMB N1xFamXvP5wqSXMqHnxAJn0BKvguHDwzZbqefqvATv2CG4uz3bIuV1 wyUbqedmvETj2BSbqefm0B1jxALjhiov2DaebbnrfifHhDYfgatCvA UfeBSn0BKvguHDwzZbqegSSZmxoasaacH8YjY=vipgYlh9vqqj=hEe eu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9q8qqaq=d ir=f0=yqaiVgFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaaba abauaakeaaqaaaaaaaaaWdbiabfs5aejaabEeacaqG0bGaae4Baiaa bshacaqGHbGaaeiBaiaacYcacaqG1bGaaeiCaiaabohacaqG0bGaae OCaiaabwgacaqGHbGaaeyBaiabg2da9iabfs5aejaadEeacaqGTbGa amOuaiaad6eacaWGbbGaeyOeI0IaaeOCaiaadkfacaWGobGaamyqai abgUcaRiabfs5aejaadEeacaqGZbGaaeiCaiaabggacaqGJbGaaeyA aiaab6gacaqGNbGaey4kaSIaeuiLdqKaam4raiaabohacaqG0bGaae yyaiaabkhacaqG0bGaey4kaSIaeuiLdqKaam4raiaabohaciGG0bGa aiyyaiaac6gacaqGKbGaaeOyaiaabMhacqGHsislcqqHuoarcaWGhb GaaeOBaiaab+gacaqGUbGaae4yaiaab+gacaqG1bGaaeiCaiaabYga caqGPbGaaeOBaiaabEgacaqGTaGaeuiLdqKaae4raiaabogacaqGVb GaaeyDaiaabchacaqGSbGaaeyAaiaab6gacaqGNbGaaeOkaiaabAea caqGJbGaae4BaiaabwhacaqGWbGaaeiBaiaabMgacaqGUbGaae4zaa aa@EBED@

The translation initiation rate of the first CDS(upstream) is calculated according to r 1 exp( βΔ G total ) MathType@MTEF@5@5@+= feaagKart1ev2aaaMfgvLHfij5gC1rhimfMBNvxyNvga7ThxY12EY1 xFFftFGWfCY9gC09giCvgEWbcxSvMz0Hci7bslGWLyLrxyGWfrLXgD HbYEh91E7XLCTThDVrxyS1xF91xFGWLCPDgA0LcatCvAUfeBSjuyZL 2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerb d9wDYLwzYbItLDharqqtubsr4rNCHbWexLMBbXgBd9gzLbvyNv2Cae Hbl7mZLdGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0x bba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0= vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaafaGcbaaeaaaaaaaa a8qacaqGYbWdamaaBaaaleaapeGaaGymaaWdaeqaaOWdbiabg2Hi1k GacwgacaGG4bGaaiiCamaabmaapaqaa8qacqGHsislcqaHYoGycqqH uoarcaWGhbWdamaaBaaaleaapeGaaeiDaiaab+gacaqG0bGaaeyyai aabYgaa8aabeaaaOWdbiaawIcacaGLPaaaaaa@7598@ .

The ΔG mRNA-rRNA refers to the free energy of folding for mRNA-rRNA complex, which is negative.

The ΔG spacing refers to an energetic penalty for ribosomal stretching or compression caused by a long or short spacer region between the 16S rRNA binding site and start codon, which is negative.

The ΔG start refers to the free energy for tRNA fMET -start codon complex, which is negative.

The ΔG standby refers to an energy penalty determined by the mRNA standby site’s interactions with the ribosome’s platform domain, which is positive.

The ΔG mRNA refers to the free energy of folding for 5’UTR,which is negative.


Fig D. The regions that consisted of tunner and their corresponding free energy


To sum up, each tunner sequence has its corresponding free energy, which represents the regulation level of each tunner. In order to accurately calculate the magnitude of each free energy, it is necessary to learn to distinguish the regions of the sequence.The reason why tunner is consisting of 35 nucleotides before and after the start codon,is that the model predictions do not improve when longer subsequences are considered. The tunner subsequence S1 consists of the max(1, nstart ? 35) to nstart nucleotides and the subsequence S2 consists of the max(1,nstart ? 35) to nstart + 35 nucleotides, where nstart is the position of a start codon.

To calculate the ΔGmRNA:rRNA, ΔGstart, ΔGmRNA, and ΔGstandby free energies, we use the NUPACK suite of algorithms.And five energy can be seen vividly in the Fig.C.These free energy calculations do not have any additional fitting or training parameters and explicitly depend on the mRNA sequence. In addition, the free energy terms are not orthogonal; changing a single nucleotide can potentially affect multiple energy terms.

ΔGmRNA.Using the NuPACK ‘mfe’ algorithm and Mfold parameters, the mfe configuration of

sequence S2 is calculated and its free energy is designated ΔGmRNA.

ΔG standby.The standby site is the 4 nt region upstream of the 16S rRNA binding site. The energy required to unfold the standby site is determining by calculating the mfe of sequence S2 with and without preventing the standby site from forming base pairs. The difference between these mfe's is

designated ΔGstandby. The mfe of sequence S2 without a standby site is considered as △GmRNA. To calculate the mfe of sequence S2 with a standby site that is constrained to be single-stranded, the sequence is first split into two subsequences, their mfes are each calculated, and then summed together. The two subsequences are the nucleotides nstart ? 35 to n3 ? 4 and n3 to nstart + 35, where n3 is the most 5′ base pair in the 16S rRNA binding site and 4 is the standby site length.

ΔGmRNA:rRNA.Using the NuPACK ‘subopt’ algorithm with Mfold 3.0 parameters at 37°C base pair configurations of the folded 16S rRNA and sequence S1 are enumerated, starting with the minimum free energy (mfe) configuration and continuing with suboptimal configurations, each with a corresponding ΔGmRNA:rRNA. In our work, we choose the mfe configuration’sΔGmRNA:rRNA .

ΔGspacing .The spacing(s) is a distance between SD sequence’s 3’ terminal and the start codon’s first base .When the 30S complex is stretched (s > 5 nt), the ΔGspacing is calculated according to the quadratic equation,

Δ G spacing = C 1 ( S S opt ) 2 + C 2 ( S S opt ) MathType@MTEF@5@5@+= feaagKart1ev2aaaMrjvLHfij5gC1rhimfMBNvxyNvgaCruzSrxyGS 3rFT3ECjxB7nhCHnwAUDwF91xF7XLCTThi9asF9T3qFftF7XfBLzgD OaYEtbslGS3uFT3ECjxB79gC01xF91xFGWLCPDgA0LIxY0hiRaYEd9 Lm9XfBLzgDOaYEtbslGS3uFT3ECjxB79gC01xF91xFGWLCPDgA0Lca tCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBae XatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbWexLMBbXgB d9gzLbvyNv2CaeHbl7mZLdGeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8 qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as0Fb9 pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaqaafa Gcbaaeaaaaaaaaa8qacqqHuoarcaWGhbWdamaaBaaaleaapeGaae4C aiaabchacaqGHbGaae4yaiaabMgacaqGUbGaae4zaaWdaeqaaOWdbi aab2dacaWGdbWdamaaBaaaleaapeGaaGymaaWdaeqaaOWdbmaabmaa paqaa8qacaWGtbGaeyOeI0Iaam4ua8aadaWgaaWcbaWdbiaab+gaca qGWbGaaeiDaaWdaeqaaaGcpeGaayjkaiaawMcaa8aadaahaaWcbeqa a8qacaaIYaaaaOGaey4kaSIaam4qa8aadaWgaaWcbaWdbiaaikdaa8 aabeaak8qadaqadaWdaeaapeGaam4uaiabgkHiTiaadofapaWaaSba aSqaa8qacaqGVbGaaeiCaiaabshaa8aabeaaaOWdbiaawIcacaGLPa aaaaa@95B4@

where sopt = 5 nt, c1 = 0.048 kcal/mol/nt2, and c2 = 0.24 kcal/mol/nt. When the 30S complex

is compressed (s < 5 nt), the ΔGspacing is calculated according to the sigmoidal function,

Δ G spacing = C 1 [ 1+exp( C 2 ( S S opt +2 ) ) ] 3 MathType@MTEF@5@5@+= feaagKart1ev2aaaMPkvLHfij5gC1rhimfMBNvxyNvgaCruzSrxyGS 3rFT3ECjxB7nhCHnwAUDwF91xF7XLCTThi9asF9XLzYf2y7T3Ed9vm 91xF7T3E7XfBLzgDBbYEXaYkGWvz4bhiCXwzMrhkGS3Ed9Lm9XfBLz gDOaYEtbslGS3uFT3ECjxB79gC01xF91hiRaIm9bcxYL2zOrxk9bcx YL2zOrxk9bcxYL2zOrxx951m91xFamXvP5wqSXMqHnxAJn0BKvguHD wzZbqefqvATv2CG4uz3bIuV1wyUbqedmvETj2BSbqefm0B1jxALjhi ov2DaebbnrfifHhDYfgatCvAUfeBSn0BKvguHDwzZbqegSSZmxoasa acH8YjY=vipgYlh9vqqj=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vq ai=hGuQ8kuc9pgc9q8qqaq=dir=f0=yqaiVgFr0xfr=xfr=xb9adba qaaeGaciGaaiaabeqaamaabaabauaakeaaqaaaaaaaaaWdbiabfs5a ejaadEeapaWaaSbaaSqaa8qacaqGZbGaaeiCaiaabggacaqGJbGaae yAaiaab6gacaqGNbaapaqabaGcpeGaaeypamaalaaapaqaa8qacaWG dbWdamaaBaaaleaapeGaaGymaaWdaeqaaaGcbaWdbmaadmaapaqaa8 qacaaIXaGaey4kaSIaciyzaiaacIhacaGGWbWaaeWaa8aabaWdbiaa doeapaWaaSbaaSqaa8qacaaIYaaapaqabaGcpeWaaeWaa8aabaWdbi aadofacqGHsislcaWGtbWdamaaBaaaleaapeGaae4BaiaabchacaqG 0baapaqabaGcpeGaey4kaSIaaGOmaaGaayjkaiaawMcaaaGaayjkai aawMcaaaGaay5waiaaw2faa8aadaahaaWcbeqaa8qacaaIZaaaaaaa aaa@9F41@

where c1 = 12.2 kcal/mol and c2 = 2.5 nt -1.

ΔGstart.The ΔGstart is -1.19 and -0.075 kcal/mol for AUG and GUG start codons, respectively


The translation initiation rate of the second CDS is then calculated by summing together two sources of translational coupling: ribosome re-initiation and ribosome de novo initiation according to

r 2 α r reinitiation ( 2 ) +exp( βΔ G total (2) ) MathType@MTEF@5@5@+= feaagKart1ev2aaaMDjvLHfij5gC1rhimfMBNvxyNvga7ThxY12EY1 xFFjtFCfgBWHwyGShxY12EY1xFFT3ECjxB7jxzP5wA0Lwy0L2BU1xF 951ECXwzMrhkGidiCjxANHgDP0hiRacxLHhCGWfBLzgDOaYEG0ciCj wz0fgiCruzSrxyGC0x7ThxY12E09gDHXwF91Nx7HImP0xFGWLCPDgA 0LcatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTf MBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbWexLMB bXgBd9gzLbvyNv2CaeHbl7mZLdGeaGqiVCI8FfYJH8YrFfeuY=Hhbb f9v8qqaqFr0xc9pk0xbba9q8WqFfeaY=biLkVcLq=JHqpepeea0=as 0Fb9pgeaYRXxe9vr0=vr0=vqpWqaaeaabiGaciaacaqabeaadaqaaq aafaGcbaaeaaaaaaaaa8qacaqGYbWdamaaBaaaleaapeGaaGOmaaWd aeqaaOWdbiabeg7aHjaabkhapaWaa0baaSqaa8qacaqGYbGaaeyzai aabMgacaqGUbGaaeyAaiaabshacaqGPbGaaeyyaiaabshacaqGPbGa ae4Baiaab6gaa8aabaWdbmaabmaapaqaa8qacaaIYaaacaGLOaGaay zkaaaaaOGaey4kaSIaciyzaiaacIhacaGGWbWaaeWaa8aabaWdbiab gkHiTiabek7aIjabfs5aejaadEeapaWaa0baaSqaa8qacaqG0bGaae 4BaiaabshacaqGHbGaaeiBaaWdaeaapeGaaiikaiaaikdacaGGPaaa aaGccaGLOaGaayzkaaaaaa@9C3E@

The first part in formula can be calculated by the following formula:


r reinitiation (2) = kp*k reinitiation ( d 1,2 )*exp(?β*Δ G total)

Where the kreinitiation(d1,2) refers to the intergenic distance dependence and the kp refers to the proportionality constant betwe rreinitiation( 2 )=kp*kreinitiation( d1,2 )*exp( β*ΔGtotal ) MathType@MTEF@5@5@+= feaagKart1ev2aaaMXkvLHfij5gC1rhimfMBNvxyNvga7XLCTTNCGi xzP5wA0Lwy0L2BU1xFCXwzMrhkGidiCjxANHgDPShxY12EG0diRbxF 9PYECjxB7TgiYvwAULgDPfgDP9MB91hxSvMz0Hci7ThxY12EK1xFXW Im9bcxYL2zOrxkQWvz4bhiCXwzMrhkGShiTacxIvgDHbIkCruzSrxy GC0ECjxB7r3B0fgB91xFGWLCPDgA0LcatCvAUfeBSjuyZL2yd9gzLb vyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwz YbItLDharqqtubsr4rNCHbWexLMBbXgBd9gzLbvyNv2CaeHbl7mZLd GeaGqiVCI8FfYJH8YrFfeuY=Hhbbf9v8qqaqFr0xc9pk0xbba9q8Wq FfeaY=biLkVcLq=JHqpepeea0=as0Fb9pgeaYRXxe9vr0=vr0=vqpW qaaeaabiGaciaacaqabeaadaqaaqaafaGcbaaeaaaaaaaaa8qacaqG YbGaaeOCaiaabwgacaqGPbGaaeOBaiaabMgacaqG0bGaaeyAaiaabg gacaqG0bGaaeyAaiaab+gacaqGUbWaaeWaa8aabaWdbiaaikdaaiaa wIcacaGLPaaacaqG9aGaae4AaiaabchacaGGQaGaae4Aaiaabkhaca qGLbGaaeyAaiaab6gacaqGPbGaaeiDaiaabMgacaqGHbGaaeiDaiaa bMgacaqGVbGaaeOBamaabmaapaqaa8qacaqGKbGaaGymaiaacYcaca aIYaaacaGLOaGaayzkaaGaaiOkaiGacwgacaGG4bGaaiiCamaabmaa paqaa8qacqGHsislcqaHYoGycaGGQaGaeuiLdqKaam4raiaabshaca qGVbGaaeiDaiaabggacaqGSbaacaGLOaGaayzkaaaaaa@B123@ en the ribosome assemble rate and the translation initiation rate.

For the kreinitiation(d1,2) is proved that can be calculate by the formula following:


d=-1 k=0.23

Where the d=xstart?xstop?3, xstart refers to the first nucleotides in jth CDS’s start codon while the xstop refers to first nucleotides in i th CDS ‘s stop codon.



Fig E. d>0 and d<0

(T--OUC-China--d0.png)

And it also points that the kp=10 .

The second part in formula can be calculated by the following formula:

ΔG total(2) =ΔG mRNA?rRNA+ΔG spacing+ΔG start+ΔG standby?ΔG noncoupling?ΔG coupling*Fcoupling

Where the ΔG coupling refers to the free energy of folding for all inhibitory RNA structure that block the standby site, overlap with SD sequence, spacing region or the downstream footprint region of ribosome.(If we assume that elongating ribosomes are uniformly distributed across a coding sequence with a footprint of 30 nucleotides (fp), then each one will occupy a fraction of the coding sequence’s length (fp/L).)

The ΔG noncoupling refers to the free energy of all the other RNA structure except the inhibitory RNA structure. And the Fcoupling can be calculated by the following formula:

Fcoupling = 1 /[1+C*exp(-β*ΔG total)]

Which the C is the ribosome-assisted unfolding coefficient. C=0.81 in our study.

We’ve understood about this thermodynamic model, but this model doesn’t apply for riboswitch. Because compared to normal 5’ UTR, the riboswitch’s structure will affect the upstream gene’s translation level in different ways. So we should build a new coupled translation model for bicistronic with riboswitch and use this new thermodynamic model to describe how the riboswitch affects the upstream and downstream gene’s translation level.


SECOND PART: THERMODYNAMIC MODEL WITH RIBOSWITCH


Fig F. The reaction coordinate diagram showing the states, energies, ligand-binding and translation initiation. Without ligand, the ribosome binds to a folded mRNA (state 1) with free energy ΔG total,OFF . When excess ligand is present, the co-transcriptional folding of mRNA and ligand through paths a ends with a mRNA–ligand complex (state 3) that binds the ribosome with free energy ΔG total,ON . The stability of the mRNA–ligand complex is controlled by the switching free energy (ΔG mRNA + ΔG ligand ).

(T--OUC-China--The reaction coordinate diagram showing the states, energies, ligand-binding and translation initiation .png)


As it shows in the Fig E, when there is no ligand, the conformational change leads to ΔG total,OFF, while ligands are excess, the conformational change leads to ΔG total,ON. In the later cases, it could be divided into three conditions. First, excessive concentration of intracellular ligands and stable binding of mRNA molecules to ligands. Second, the concentration of ligand is small, and the mRNA molecule can bind to ligand stably. Third, mRNA molecules cannot bind to ligands stably.

First situation: Excessive concentration of intracellular ligands and stable binding of mRNA molecules to ligands


Fig G. The change of free energies during the ligand binding and ribosome 30S subunit binding to the mRNA. (A) No ligand binds. (B) Ligand binds.

(T--OUC-China--The change of free energies during the ligand binding and ribosome 30S subunit binding to the mRNA..png)

As it shows in the Fig F, conformational changes lead to the changes of free energies. The initial state is the minimum energy state, when ribosome 30S subunit binds to mRNA, it will change to the final state. We focus on the energy difference of final and initial state, that is, ΔG total, because it is proportionable to our desired translation initiation rate(ron and roff).


When there is no lgand:


When ligand is binding:


The corresponding translation initiation rates r are:



In the riboswitch model, the calculating method for ΔGstart and ΔGspacing remains the same, however, there are quite a lot of changes for ΔGstandby, ΔGmRNA and ΔGmRNA:rRNA. The specific algorithm are listed as follows:

ΔGmRNA:rRNA and ΔGmRNA:ligand:rRNA: Using NuPACK ‘mfe’ algorithm to calculate the free energy when 16rRNA binds to mRNA.

ΔGmRNA : Using NuPACK ‘mfe’ algorithm to calculate the free energy of the initial state of riboswitch.

ΔGstandby: we use formula below to calculate this free energy:


Here, ΔG distortion describes ribosomal platform’s distortion energy penalty, and can be calculate by this:



As means the available single-stranded surface area; H means the hairpin’s height; D&P stand for distal and proximal binding sites, while C1 , C2 and C3 are the fitted parameter values and equal to 0.038kcal/mol/nt2 ,-1.629kcal/mol/nt and 17.359kcal/mol.

ΔG sliding quantize the presence of a sliding mechanism for the ribosomal platform. It’s the energy penalty the intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. To calculate this energy, we sum up the heights of all downstream RNA structures, and multiplied it by the coefficient (c = 0.2 kcal/mol/nt) to yield ΔG sliding.

ΔG unfolding describes the free energy released by unfolding base pairs. According to Amin Espah Borujeni[1], Unfolding 1 bp increases the standby site module’s surface area by 3 nt and increases the unfolding energy penalty to 0.90 kcal/mol.

④ΔGmRNA:ligand: In the case that we know the secondary structure of initial state of riboswitch in “ON” state, we use the RNAeval web server to calculates the energy of the ligand-binding riboswitch.


Second situation: the concentration of ligand is small, and the mRNA molecule can bind to ligand stably

During the ligand binding event, the mRNA folding conformation does switch from its minimum free energy at the OFF state to a more positive free energy at the ON state, resulting in a positive energy penalty ΔΔG mRNA


This energy penalty can be compensated by the free energy input from the binding of ligand to the mRNA, ΔG ligand (negative value)


Here, R is the universal gas constant, while KD is the ligand’s dissociation constant


When the switching free energy ( ΔΔG mRNA +ΔG ligand ) is negative, the mRNA-ligand binding is stable (first situation), and when the summation is positive the binding becomes metastable (second situation).

For the second situation, ON state translation initiation rate, rON,conc is:


Third situation , mRNA molecules cannot bind to ligands stably.

For the third situation, ON state translation initiation rate, rON,actual is



After figuring about all the three situations, we decide to combine the riboswitch part and bicistronic part together to yield the translation initiation rate (TIR) for our own GOI.

As we talk about in the first part, r2 can be calculated like this:




Here, we regard the TIR of riboswitch (rON) as r1, so as to build a connection between the riboswitch model and bicistronic thermodynamic model.

Depend on the riboswitch model and bicistronic thermodynamic model , we can design different tunners and predict different tunners’ functions under the riboswitch easily. you only give the sequences, the thermodynamic model will use this sequences to calculate the sequences' energy. You only give the sequences, the thermodynamic model will use this sequences to calculate the sequences' energy. This energy will predict the GOI’s initial rate of translation.

The following calculation example is given:


Fig H. Division of tunner3 region

(T--OUC-China--T--OUC-China--The change of free energies during the ligand binding and ribosome 30S subunit binding to the mRNA.png)

From Fig G,we can see that there are a lot of regions from the tunner3 which are mentioned before.

We also defined three new regions which called m1, m2 and m3,

When d<0:

The m1 is from the tunner’s first base to the start codon’s first base. The m2 is the distance between start codon and stop codon. M2 could be negative. For example, the distance of “TAATG” is -1. The m3 is the distance from the stop codon’s first base to tunner’s last base. Th s1 can be defined as m1, the s2 can be defined as m1.

When d >0:

The m1 is from the tunner’s first base to the stop codon’s first base. The m2 is the distance between stop codon and start codon. The m3 is the distance from the start codon’s first base to tunner’s last base.

The reason why we use these definitions to describe tunner is that we can use python to distinguish tunners’ different regions easily.

As we mentioned before, the free energy associated with the roboswitch were given,and we also defined the tunner’s regions, so we could use ‘NuPACK’ algorithm to calculate the corresponding energy to get the tunner’s translation initial rate.

The results calculated by the program are shown in the following Fig H:



Fig H. Individual free energy values associated with tunner3

We compared the predicted translation rate of all tunner with the experimental results, and the results are as follows:



Fig I.Comparison between experimental data and model results

(T--OUC-China--Model5.png)


Now, we know how to calculate the energy of the tunner, but how to get the suitable tunner’s sequence is a important problem for us. In our work, we use python to get the target sequence. First, we generated a random sequence which consists of the max(x, nstart ? 35) to nstart + 35 nucleotides. The nstart is the position of a start codon. The x is the first base of the sequence. The reason why we choose a sequence longer than nstart-35 is that we need to generate a stable secondary structure to hide the RBS. It so happened that this generated structure’s length is longer than nstart ? 35 to nstart + 35 nucleotides.Second, we set a target energy of the tunner. The software will defined the different regions of tunner which we mentioned before. And then the software will use ‘NuPACK’ algorithm to calculate the corresponding energy. This energy will compared with the goal energy. If they are too far apart, the software will change the common RBS, repressing region and coupling region’s sequences to keep approaching the target energy until the gap is small.

However, this thermodynamic model can not accurately predict the translation level. The translation rate is indeed related to the translation initiation rate, but the translation elongation rate also has great influence on the translation level. Therefore, to solve this problem, we introduce model relative codon bias(RCB).

There are a number of measures currently in use that quantify codon usage in genes. Based on the hypothesis that gene expressivity and codon composition is strongly correlated, RCB has been defined to provide an intuitively meaningful measure of an extent of the codon preference in a gene. This model was confirmed in Escherichia coli. Depend on this model we can assess the strength of RCB(RCBS) in genes as a guide to the gene’s expression level. RCBS is the overall score of a gene indicating the influence of RCB of each codon in a gene.

The RCBS in estimating gene expression level is related to codon usage difference of a gene with respect to biased nucleotide composition at the three codin sites. Let f(x,y,z) be the normalized codon frequency for the codon triplet(x,y,z) of a gene. Then the RCB of a codon triplet (x,y,z) in a gene is defined as:


where f1(x) is the normalized frequency of x at the first codon position, f2(y) is the normalized frequency of y at the second codon position, and f3(z) is the normalized frequency of z at the third codon position of the gene. The frequencies f1, f2, f3 have been derived from the set of codon samples of a gene and the normalization of frequency is done over the gene length in codons, in an attempt to compensate for the expected increase of RCB with the total number of codons. We quantify the degree of codon bias of a gene in such a way that comparisons can be made both within and between genomes. As defined earlier, dxyz contains somewhat more quantitative information than others, since it considers codon usage as well as the base compositional bias. Then the expression measure of a gene is


where is the codon usage difference of ith codon of a gene. L is the number of codons in the gene.

Our analysis is based on the hypothesis that RCB reflects the level of gene expression. The expression measure of a gene in this approach is denoted by RCBS. RCBS s thus useful for comparing different sets of genes. In most of the case, highly expressed genes have RCBS >0.5,the majority of gene (63%) have RCBS values lying between 0.2 and 0.4 and RCBS value close to 0 indicates a lack of bias for the codons.

Here is a example:

GOI sequence: AAA CGA CAC AGG AAA, L=5

dAAA=41/9 ; dCGA=13/12; dCAC=19/6;dAGG=19/6

RCBS=[(1+41/9)(1+13/12)(1+19/6)(1+19/6)]^(1/4) - 1=1.888


Finally, by combing the thermodynamic model with RCBS, a new formula describing the translation rate is obtained:

R(translation rate)=r2*RCBS

Because of this new formula, we can predict the translation rate of the target gene more accurately that the thermodynamic model.





PART 3: STABILIZER

After we overcome the problem of designing tunner, we also need to select the appropriate stabilizer clearly.

As we all konw, different riboswitches need different stabilizers to stable its second structure. If someone want to use a new riboswitch or they want to find out the shortest stabilizer’s length, we can use the docking matrix to get target stabilizer.

The two sides of the docking matrix consist of sequences of riboswitch. If the bases on both sides are paired, then we put a 1 in the matrix. If the bases on both sides are not paired, then we put a 0 in the matrix. By predicting the secondary structure of a riboswitch in a sequence of only riboswitch. Thus, we have a matrix of 0, 1 and riboswitch sequences. We call this matrix the Ribo. From this matrix we get a total target value i(that is, all zeros and ones added together).


Fig A. The types and composition of docking matrix.

(T--OUC-China--The_types_and_composition_of_docking_matrix2.png )

Next, we should find the appropriate GOI downstream of the riboswitch by blast or paper. We added this GOI to the riboswitch sequence. On this basis, we predict the secondary structure of the riboswitch. We can get a new matrix called Ribo-GOI. From this matrix we get a total target value j(that is, all zeros and ones added together).Finally,We added this GOI in multiples of three to the riboswitch sequence to predict the sequence pairing of riboswitch by adding the stabilizer. This new matrix called Ribo-STA.From this matrix we get a total target value k(that is, all zeros and ones added together). If the values of the same position of the matrix are the same(i.e., both are 0 or 1), we consider the position to be structurally correct.If the value are different at the same place in the matrix(i.e., not all of them are 0 or 1), we consider the riboswitch sequence to be mismatched. So, in the corresponding position in this matrix, we’re going to give it very high Penalty value to alert us.Therefore we also get a new total value k· from the Ribo-STA.

In conclusion, we get there matrices, Ribo, Ribo-GOI and Ribo-STA. By comparing three kinds of matrix, we select the most suitable stabilizer.We will discuss it in two ways:

If i-j≠0:

This represents a change in the secondary structure of the riboswitch by adding the GOI.However, STA of blast or paperbase has been proved to be successful in stabilizing secondary structure of riboswitch.Therefore, although its structure is not the best, it is relatively stable. So we calculate the different between i and k and i and j by the residual sum of squares . We want GOAL2/GOAL1 to be close to1 or 0. Close to 1 represents that the introduction of stabilizer’s stable effect is close to blast or paperbase. Close to 0 represents that the introduction of stabilizer’s stable effect is better than blast or paperbase. So on this comparison method, we can choose the relatively suitable stabilizer.

GOAL1=(i-j)**2; GOAL2=(i-k)**2


If i-j=0:

This indicates that the introduction of GOI stabilizes the secondary structure of the riboswitch. So

We divide i by k, and if it’s 1, that means the Ribo-STA corresponds to adding sequence is the most suitable stabilizer. If not, we’re going to pick the one that approaches 1.

Therefore, we translate the above procedure into code. with the help of python, we can get the suitable stabilizer sequence.we called them STA9,STA21STA81 and STA129. Among them, the program tells us that STA9 and STA21 stabilizer changed the secondary structure of riboswitch, they all have high levels of punishment , in contrast, STA81 and STA129 both close to 1 ,STA81 and STA129 stable the secondary structure of riboswitch. From the following experimental result, we can successfully verify the correctness of the theory and program.


(T--OUC-China--sta9.png)(T--OUC-China--sta21.png)


T--OUC-China--sta81.png T--OUC-China--sta129.png











PART 4: ASRNA

In order to regulate the riboswitch’s function more plentiful , we want to use asRNA to re-activate or re-repress the riboswitch by hiding or exposing the RBS. The mechanism is shown in the figure below. And we can also discover that asRNA contains a hfq binding site. The hfe binding site will recruit a hfq protein to stable the asRNA. Under the hfq protein’s protection, the asRNA prevents degradation.


Fig A. Principle of activation and repressing of asRNA

(T--OUC-China--Principle of activation and repressing of asRNA.png)


FIRST PART : REPRESSING ASRNA

A challenge in developing synthetic repressing asRNA design rules is a wide range of parameters that need to be considered: an Hfq binding site, thermodynamics, the asRNA length, the number of mismatches, and the binding location. Studying one parameter in isolation can be difficult because of the interdependence of each of these parameters. For example, changing the asRNA length or the number of mismatches can affect the thermodynamics of asRNA-mRNA interaction.


Fig B. Synthetic repressing asRNA design rules

(T--OUC-China--Model6.png)

From this picture, we can know the asRNA design rules. The first rule is choose the best Hfq binding site. There are four Hfq binding sites, and the Spot42 had the highest overall repression with an average of 76.9% repression for all five TBRs. The MicC,MicF, and MicF M7.4 binding sites also performed well, with average repressions of 67.2%, 71.6%, and 69.8%, respectively. Because orthogonality is critical to the function of complex gene circuits and the MicF M7.4 had the lowest off-target effects.Finally we choose the Micf M7.4 Hfq binding site.

The second rule is about target binding region design.The asRNAs are composed of two regions: the TBR and the Hfq binding site.The goal is to develop a list of design rules that will allow for the de novo design of asRNAs that can achieve high levels of target repression, high orthogonality, and generally predictable behavior. The following categories of design parameters were investigated: (1) target location, (2)mismatch, (3) length, (4) thermodynamics.

target location.The first design characteristic that was varied was the binding location of the asRNA. Several previous studies have shown that asRNAs are most effective when they target the translation initiation region (TIR).And the TIR was divided into four regions (USD, SD,AUG, and C2?8).The first region is the portion of the 5′ untranslated region (5′UTR) that is upstream of the Shine Dalgarno sequence, and is hereafter referred to as the Upstream Shine Dalgarno (USD).The second region contains the Shine Dalgarno (SD) sequence in addition to the nucleotides between the SD sequence and the start codon. This region is simply referred to as the Shine Dalgarno region (SD),to indicate that it includes the SD sequence.The third region is the three-nucleotide start codon (AUG), and the fourth region is codons 2?8 in the coding region (C2?8).and then each TBR was designed to target one of seven different combinations of these four regions.TBRs were designed to target one of seven mRNA locations.The binding

location has no impact on the repression efficiency of the asRNA, as each location

showed approximately equal percent repression.


Fig C. Repressing asRNA's target region

(T--OUC-China--Repressing asRNA's target region.png)


Mismatch. Though most engineered TBRs are perfectly complementary to their target gene, asRNAs in natural systems bind imperfectly to their target, resulting in several shorter (8?9 nt) regions of dsRNA. Because many naturally occurring asRNAs have some degree of mismatch,which raises the question as to what the maximum mismatch tolerance is for effective repression. The knowledge that TBRs with up to 15% mismatch can still effectively repress gene expression will be a helpful design guideline in preventing off-target effects.We introduced the mismatch by deleting nucleotides from the TBR sequence in order to introduce mismatches into the asRNA?mRNA complex.

Length. AsRNAs that had at least 15 nt of dsRNA had significantly higher target gene repression than asRNAs with less than 15 nt of dsRNA. A scatterplot

showing the correlation between percent repression and maximum dsRNA length in the asRNA?mRNA complex shows a sharp increase in repression, followed by a plateau in repression after approximately 15 nt of dsRNA.


Fig D. Repressing asRNA's length

(T--OUC-China--Repressing asRNA's length.png)

Thermodynamics. Thermodynamics were found to have a significant impact on repressing asRNA function. This result is expected, since several studies have already identified the importance of thermodynamics to the function of RNA regulators. In this study, two thermodynamic parameters were considered. First,the ΔG of the asRNA?mRNA complex was taken into consideration (ΔG Complex). This accounted for intermolecular forces between the asRNA and mRNA, and because this structure was designed to be stable, this value was very negative.This parameter was estimated using Mfold, as described previously.Second, the difference between the ΔG of the TBR (ΔG TBR), which was unstructured and generally very close to zero, and the ΔG of the asRNA?mRNA complex was calculated and called ΔG Complex Formation (ΔG CF). Lower ΔG values, which indicate more thermodynamically favorable structures, were correlated with higher repression, as expected.AsRNAs with ΔG CF values that were less than ?40 kcal/mol had significantly higher repression than asRNAs whose ΔG CF values were higher than ?40 kcal/mol.

In order to predict the asRNA’s repression efficiency, we use a multivariate model.


where F is the predicted repression efficiency, X1 is ΔGCF (in kcal/mol), X2 is percent mismatch (in %), and ε is the standard error (ε = 0.123). All coefficients had p

values less than 0.001.



Depend on these design principles, we can design a lot of applicable asRNAs by python.First, we get the sequence which matches TIR 100%. The TIR is located in the USD, SD,AUG, and C2?8 region of the adda riboswitch in our work. Second, We introduced the mismatch by deleting nucleotides from the TBR sequence in order to introduce mismatches into the asRNA?mRNA complex. This mismatches is less than 15%. Third, we calculate the energy ΔGCF and made sure it was less than -40 kcal/mol. Finally, we get a lot of feasible asRNAs. We use the a multivariate model to predict the asRNA’s repression efficiency. And we choose the best one.

Here is a design example:

Repressing asRNA sequence:

CTTCGGCAGGTCAAAGTAATTCATAGTCTCTCTTTATAA(TBR)CGTCCCGCAAGGATGCGGGTCTGTTTACCCCTATTTCAACCGGCCGCCTCGCGGCCGGTTTTTTTTT(hfq binding site)

Target sequence:

TTATAAAGAGAAGACTATGAATTACTTTGACCTGCCGAAG

Mismatch:1/39

ΔG Complex Formation=-54.381+1.1=-53.281

The results of repressing asRNA designed by the model are given as follows:






SECOND PART : ACTIVATING ASRNA

Compare to the repressing asRNA, there are no mature design principles in activating asRNA. On the present paper introduction, the design of an activating asRNA requires consideration of an Hfq binding site, thermodynamics, and the binding location.

As we mentioned before, we choose the Micf M7.4 Hfq binding site.

Thermodynamics were also found to have a significant impact on activating asRNA function. Here is a formula that describe the effect of thermodynamics on the activating asRNA.


The A and B represent asRNA and TIR, respectively. The ΔGc represents the energy of A?B complex which is calculated by NUPACK_mfe_multi.(ΔGc Complex). The ΔG is ΔGc Complex-ΔGA-△GB. The ΔGA and ΔGB represent the energy of A and B which are calculated by NUPACK_mfe,respectively. The ? is a length of the toehold of all targeted pairwise interactions AB. In order to design activated asRNA easily, α is a sequence length independent of the TBR stem ring which is considered to be a sequence length perfectly complementary to TIR(Fig E). And the α’s saturation levels is 6 (?sat=6 ). Gp=-1.28 Kcal/mol is the average contribution of a nucleotide in the toehold to the free energy of the initiation process.


Fig E.The definition of alpha

(T--OUC-China--The definition of alpha.png)

The binding location are also have some difference between repressing asRNA and activated asRNA.No studies have clearly shown the effect of different binding location on activated asRNA.In order to ensure that displace the cis-repressive sequences easily,We don’t introduce the mismatch but make the activated asRNA’s TBR matches the cis-repressive sequences 100%.To maximize the translation rate, we also enforce that the four nucleotides upstream the RBS(standby site) and all downstream nucleotides be unpaired in the final hybridized structure.



Fig F. Principle of of activating asRNA

(T--OUC-China--Principle of of activating asRNA.png)

In order to predict the activation, We used the fitting formula :

fold change=-0.1640x-1.4655

This formula is derived from experimental date.According to the formula, if we want to have a good activation effect of asRNA, the objective function △GKin needs to be less than -20 kal/mol.

Fig G. The activation times of △Gkin

With these guidelines, we were able to design activated asRNA.

Here is a design example:

Activating asRNA sequence:GGGAGGGUUGAUUGUGUGAGUCUGUCACAGUUCAGCGGAAACGUUGAUGCUGUGACAGAUUUAUGCGAGGC

△G= -34.500 kal/mol

Target sequence:

CCUCGCAUAAUUUCACUUCUUCAAUCCUCCCGUUAAAGAGGAGAAAUUAUGAAUG

△GA=-14.800 kal/mol

Complex sequence:CCUCGCAUAAUUUCACUUCUUCAAUCCUCCCGUUAAAGAGGAGAAAUUAUGAAUG+GGGAGGGUUGAUUGUGUGAGUCUGUCACAGUUCAGCGGAAACGUUGAUGCUGUGACAGAUUUAUGCGAGGC

△GB=-62.581 kal/mol

As a result, △Gkin=-62.581+34.5+14.8-6*1.28=-20.961 kal/mol

The results of repressing asRNA designed by the model are given as follows:


(T--OUC-China--btub+tunnere+asRNA.png)


PART5:MOLECULAR DYNAMICS

This year, we want to use molecular dynamics method to explore conformational change of riboswitch between ON state and OFF state. First, we find the secondary structure of addA in OLOF ALLNéR’s research, we know the sequence of addA riboswitch can be divided into several parts: stem P1, P2 and P3, loop L2 and L3, junction-connecting segments J1-2, J2-3, and J3-1, as it shows in Fig A Then we get tertiary structure file of addA riboswitch from RCSB PDB, which is a protein data bank. The PDB ID we search for is 1Y26, and we use PyMOL to visualize it. The result is shown in Fig B.


Fig A. 2-D secondary structure of addA riboswitch.


Fig B. Using PyMOL to visualize addA tertiary structure. Green section represents P1 stem, Blue section represents junction area, red section represents the ligand 2-AP, yellow section represents P2 stem, orange section represents P3 stem, purple section represents L2 stem , cyan section represents L3 stem and five pink dots represents Mg2+.

(T--OUC-China--Using PyMOL to visualize addA tertiary structure.png)

According to OLOF ALLNéR’s study, the distance between loop L2 and L3 is extremely important in the conformational change of addA riboswitch between ‘ON’ state and ‘OFF’ state. When such distance is larger than 27 ?, we can consider the ribowitch as an ‘ON’ state. As it shows in Fig C, with the distance between loop L2 and L3 increasing, the structure of addA becomes more and more loose, which provides a chance for RBS to expose, leading to slow change of riboswitch conformation from ‘OFF’ state to ‘ON’ state.


Fig C. Riboswitch aptamer in complex with adenine. Showing the relationship between the L2-L3 distnce and addA riboswitch ON/OFF state.

(T--OUC-China--Riboswitch aptamer in complex with adenine.png)

Also, in another study of adenine riboswitch structure (written by U. Deva Priyakumar and Alexander D. MacKerell Jr)]、, we find the ligand is essential in stabilizing the addA riboswitch’s structure. When ligand adenine binds to the addA riboswitch, the whole system becomes stable, so the classical structure such as L2, L3 and P1 come into being (Fig D). However, when adenine disassociates with the addA riboswitch, riboswitch suffers a heavy loss of stable binding of ligand covalent bonds, so the whole systems start to collapse. The bonds between the ribonucleotide bases begin to break, leading to disappearance of the stems and loops. In the end, the structure of addA riboswitch totally vanishes and becomes a straight chain, which couldn’t accept ligand-binding to initiate the translation of downstream gene.


Fig D. With or without ligand-binding, the conformational change of addA riboswitch.

(T--OUC-China--With or without ligand-binding, the conformational change of addA riboswitch.png)

Inspired by their previous work, we decide to simulate the whole process of how the addA ribswitch opens and closes. We choose GROMACS as our molecular dynamics simulations program because of its widely-use and great manipulational convenience compared to other programs. Before using it, in order to add the missing atoms to addA PDB file, we use PDBfixer to fix it, so as to get the proper input for the next move. We use the output of PDBfixer as input for GROMACS and start the simulation. This simulation refers to the GROMACS Tutorial: Lysozyme in Water to make the simulation more accurate.

We choose CHARMM27 all-atom force field as this simulation’s force field. We use a cubic water box, put the RNA in the center, and add Na+ and Cl- to neutralize the charge of the system. For the energy minimization process, we simulate 50000 steps, while for the NVT and NPT equilibration process, we simulate 100ps. For the last step of Production MD, simulation lasts 10ns. Except the situation of ligand-binding, we also probe the situation of no ligand-binding. We use PyMOL to delete the ligand of addA, and use the same method to run the simulation for 30ns.


Fig E. RMSD analysis of addA with or without ligand-binding. (a)RMSD of addA with ligand-binding. (b)RMSD of addA without ligand-binding.

(T--OUC-China--RMSD analysis of addA with or without ligand-binding1.png

T--OUC-China--RMSD analysis of addA with or without ligand-binding2.png)


First, we detect root-mean-square deviation(RMSD) of the addA during the whole simulation. As it shows in Fig E, with time passing by, RMSD of addA with ligand-binding doesn’t vary too much. It fluctuates around 1nm, indicating during the whole process of simulation, the structure of addA riboswitch is very stable. Due to ligand-binding, 2-AP help stabilize the structure of addA, so structure remains almost the same as the original state. However, as for addA without ligand-binding, RMSD becomes larger and larger as times goes by, meaning the structure changes a lot, which proves that without ligand-binding, the structure of addA tends to becomes loose and lose its function gradually.


(a)The distance for addA with ligand-binding. (b) The distance for addA without ligand-binding.


(c) Schematically shows the distance between A35 and U63 using PyMOL.

Fig F. Average distance of L2 and L3.

(T--OUC-China--The distance for addA with ligand-binding.png)

Then we explores how the distance between of loop L2 and L3 varies with time. We difines the distance of A35(in loop L2) and U63(in loop L3) as the distance between L2 and L3. We see the same regulation as we find in the RMSD(Fig F): the distance doesn’t change much for addA with ligand-binding, but changes a lot for addA without ligand-binding, which further proves our previous assumption.

Then, we further explore how the length of stabilizer affects addA riboswitch’s structure. We take 9bp and 150bp stabilizer as example to demonstrate such influence. First, we use SimRNA to predict the structure of addA riboswitch with 9bp stabilizer. As we can see in Fig G, the structure becomes linear, losing all the classic structure of P1, L2 and L3. And most of all, it loses ligand-binding site, which indicates it can’t correspond to ligand-bind, not to speak of the “ON” state or “OFF” state of riboswitch.


Fig G. The predict structure of addA riboswitch with 9bp stabilizer.

(T--OUC-China--The predict structure of addA riboswitch with 9bp stabilizer.png)

Next, we use RNAcompser to predict the structure of addA riboswitch with 150bp stabilizer. The certain sequence and secondary structure we use are given in the Fig H. The result of RNAcomposer are shown in Fig I.


Fig H. The sequence and secondary structure (in dot and bracket) of addA riboswitch with 150bp stabilizer.


Fig I. The predict tertiary structure of addA riboswitch with 150bp stabilizer. The area with color is addA riboswitch, while the white area represents 150bp stabilizer.

(The predict tertiary structure of addA riboswitch with 150bp stabilizer.png)

We then use autodock to dock 2-AP to the structure, to see if it can get the right conformation. We get 50 results of docking, and we choose the conformation where 2-AP exists between P1 stem and junction area, which is consistent with crystal structure. (Fig J)


Fig J.The docking result of addA riboswitch with 150bp stabilizer. The area with color is addA riboswitch, the red molecule represents ligand 2-AP, the white area represents 150bp stabilizer.

(The docking result of addA riboswitch with 150bp stabilizer .png)


For the last step, we put the docking result into GROMACS, to see the conformational change over time. The simulation lasts for 1ns. The result can be seen in the following video. From the video, we can see The structure is gradually elongated, but the ligand stays at the proper place. So the whole structure doesn’t collapse, indicating 150bp might stablize the structure of addA riboswitch.

(视频!)

For the future work, we will change the simulation conditions, so as to see when the distance between L2 and L3 increases, the structure changes from “ON” state to “OFF” state and visualize it with PyMOL.





Reference:

1. Tian Tian and Howard M. Salis.A predictive biophysical model of translational coupling to coordinate and control protein expression in bacterial operons.Nucleic Acids Research, 2015, Vol. 43, No. 14 7137–7151.

2. Howard M. Salis, Ethan A. Mirsky, and Christopher A. Voigt.Automated Design of Synthetic Ribosome Binding Sites to Precisely Control Protein Expression.Nat Biotechnol. 2009 October ; 27(10): 946–950.

3. Nicholas J. Porubsky, Brian R. Wolfe, Justin S. Bois, and Niles A. Pierce.NUPACK 3.2 User Guide

Analysis and Design of Nucleic Acid Structures, Devices, and Systems.California Institute of Technology March 7, 2017.

4. Amin Espah Borujeni, Dennis M. Mishler, Jingzhi Wang,etc.Automated physics-based design of synthetic riboswitches from diverse RNA aptamers.Nucleic Acids Research, 2015.

5. Amin Espah Borujeni, Anirudh S. Channarasappa and Howard M. Salis.Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream standby sites.Nucleic Acids Research, 2014, Vol. 42, No. 4.

6. Uttam ROYMONDAL, Shibsankar DAS, and Satyabrata SAHOO.Predicting Gene Expression Level from Relative Codon Usage Bias: An Application to Escherichia coli Genome.DNA RESEARCH 16, 13–30, (2009).

7. Serganov A, Yuan YR, Pikovskaya O, et al. Structural basis for discriminative regulation of gene expression by adenine- and guanine-sensing mRNAs. Chem Biol. 2004;11(12):1729–1741.

8. James E. Johnson, Jr., Francis E. Reyes,1 Jacob T. Polaski,etc.B12 cofactors directly stabilize an mRNA regulatory switch.Nature. 2012 Dec 6; 492(7427): 133–137.

9. Jacob T. Polaski,1 Samantha M. Webster,1 James E. Johnson, Jr,etc. Cobalamin riboswitches exhibit a broad range of ability to discriminate between methylcobalamin and adenosylcobalamin.J Biol Chem. 2017 Jul 14; 292(28): 11650–11658.

10. Antony Lussier, Laurène Bastet, Adrien Chauvier,etc.A Kissing Loop Is Important for btuB Riboswitch Ligand Sensing and Regulatory Control.J Biol Chem. 2015 Oct 30; 290(44): 26739–26751.

11. J?rg Vogel and Ben F. Luisi.Hfq and its constellation of RNA.Nature.2011.

12. Young Je Lee, Tae Seok Moon.Design rules of synthetic non-coding RNAs in bacteria.Methods 143 (2018) 58–69.

13. Allison Hoynes-O’Connor and Tae Seok Moon.Development of Design Rules for Reliable Antisense RNA Behavior in E. coli.ACS Synth. Biol. 2016, 5, 1441?1454.

14. Young Je Lee,? Soo-Jung Kim,? Matthew B. Amrofell,etc.Establishing a Multivariate Model for Predictable Antisense RNA Mediated Repression.ACS Synth. Biol. 2019, 8, 45?56.

15. Rodrigo G, Landrain TE, Majer E, Daro`s J-A, Jaramillo A (2013) Full Design Automation of Multi-State RNA Devices to Program Gene Expression Using Energy-Based Optimization. PLoS Comput Biol 9(8): e1003172.

16. Guillermo Rodrigo, Thomas E. Landrain, and Alfonso Jaramillo.De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells.PNAS ∣ September 18, 2012 ∣ vol. 109 ∣ no. 38 ∣ 15271–15276.

17. Yuta Sakai,Koichi Abe, Saki Nakashima,etc.Improving the Gene-Regulation Ability of Small RNAs by Scaffffold. Engineering in Escherichia coli .ACS Synth. Biol. 2014, 3, 152?162

18. Yin P, Choi HM, Calvert CR, Pierce NA (2008) Programming biomolecular self-assembly

pathways. Nature 451:318–322.

19. OLOF ALLNéR, LENNART NILSSON, and ALESSANDRA VILLA.Loop–loop interaction in an adenine-sensing riboswitch: A molecular dynamics study.RNA 19:916–926; ? 2013.

20. Priyakumar U D , Jr M K . Role of the Adenine Ligand on the Stabilization of the Secondary and Tertiary Interactions in the Adenine Riboswitch[J]. Journal of Molecular Biology, 2010, 396(5):0-1438.

21. Rolf Lutz and Hermann Bujard. Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Research, 1997, Vol. 25, No. 6 1203–1210.