Model
Modeling
Introduction
We constructed a series of RNA sequences to predict how stability depends on the type of hydrogen-bonded base pair (C-G or A-U) and on the stem-loop structure of the sequence.
To test the effect of the bases, we altered only the A, C, G, or U bases in a specific region of otherwise identical sequences. For example, consider these two sequences:
GGAUAAAUCCUUACAAGUCUGCUGAAGGAGAUAUACCC
GGAUCCAUCCUUACAAGUCUGCUGAAGGAGAUAUACCC
They are the same sequence except for two bases: AA at positions 5 and 6 is changed to CC.
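The comparison above can be checked mechanically. The following is a minimal sketch, not part of our original workflow; the helper name `differing_positions` is our own illustration. It lists every position at which two equal-length sequences differ.

```python
def differing_positions(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in seq_a, base in seq_b) for every mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# The two example sequences shown above.
SEQ_1 = "GGAUAAAUCCUUACAAGUCUGCUGAAGGAGAUAUACCC"
SEQ_2 = "GGAUCCAUCCUUACAAGUCUGCUGAAGGAGAUAUACCC"

for position, base_1, base_2 in differing_positions(SEQ_1, SEQ_2):
    print(f"position {position}: {base_1} -> {base_2}")
# prints:
# position 5: A -> C
# position 6: A -> C
```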
To test the stability of the stem-loop structure, we varied it in two main ways: changing the size of the loop and changing the length of the stem, as in the examples below.
Purpose
With this model, we want to predict the behavior of the final product. Since our product is meant to work as a biochemical thermometer, we need to restrict expression of the fluorescent protein to a certain temperature range. In addition, we have to make several kinds of product, in which the fluorescent protein is expressed at different temperatures. It is therefore necessary to predict which structures are suitable for our investigation. This is likely to reduce bias and lower the cost of experiments.
How to construct the model
To make a prediction, we need an estimate of what fraction of the bases in each RNA sequence will be unpaired at temperatures from 0 °C to 100 °C.
We used the website called “NUPACK” to build our model.
Here is the link: http://www.nupack.org/
This is the first page of the website. When you check “compute melt”, fields for the minimum temperature, increment, and maximum temperature appear. We set the temperature range to 0-100 °C and entered 10 in the increment field, so the website reports the fraction of bases unpaired every 10 °C. After setting this up, click “analyze” to go to the next step.
The graph then appears. It shows the fraction of bases unpaired at each temperature, with a dot marking every 10 °C. When you click “download data”, the data for all 11 dots is downloaded, as the picture below shows.
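The settings above imply 11 temperature points (0, 10, ..., 100 °C). The sketch below reproduces that grid and reads a downloaded data file into (temperature, fraction unpaired) pairs. The file layout is an assumption: we assume plain text with one temperature and one fraction per line, separated by whitespace or a comma, with optional comment lines starting with `#` or `%`. The function names are illustrative and are not part of NUPACK.

```python
def temperature_grid(t_min: float = 0.0, t_max: float = 100.0,
                     step: float = 10.0) -> list[float]:
    """Temperatures from t_min to t_max inclusive, spaced by step."""
    count = int(round((t_max - t_min) / step)) + 1
    return [t_min + i * step for i in range(count)]


def parse_melt_data(text: str) -> list[tuple[float, float]]:
    """Parse lines of 'temperature fraction_unpaired' (assumed layout).

    Blank lines and lines starting with '#' or '%' are skipped.
    """
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#%":
            continue
        fields = line.replace(",", " ").split()
        points.append((float(fields[0]), float(fields[1])))
    return points


print(len(temperature_grid()))  # 11 points, matching the 11 dots on the graph
```

If the real download uses a different layout, only `parse_melt_data` needs to change; the rest of an analysis can keep working with the list of pairs.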
Observation and Analysis of our model
We designed 49 RNA sequences in total and sorted them into two categories according to the feature we varied. One category focuses on loop size. The other focuses on the length of the stem and on the bases on the complementary side of the RBS (ribosome binding site, the site that initiates protein expression but is blocked while the stem-loop structure is intact). Because each base pairs only with its complementary base, changing or adding non-complementary bases on the complementary side of the RBS forms a bulge or a small loop on the side of the stem. (Note: because our final goal is an RNA thermometer for vaccines, and because of the properties of organic matter, we focus mainly on the temperature span of 10-40 °C.)
In this picture, the red circle marks the bulge mentioned above. From the data under the diagram, we can see that a single base change on the complementary side of the RBS clearly destabilizes the sequence. (The data under the graph means that X requires 5.15 kcal/mol to break the structure, and Y requires 8.95 kcal/mol.)
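To put the two values in proportion, the short calculation below takes the difference between them and expresses it relative to Y. It uses only the two numbers quoted above; the variable names are ours.

```python
ENERGY_X = 5.15  # kcal/mol needed to break structure X (value quoted above)
ENERGY_Y = 8.95  # kcal/mol needed to break structure Y (value quoted above)

difference = ENERGY_Y - ENERGY_X
relative_drop = difference / ENERGY_Y

print(f"difference: {difference:.2f} kcal/mol")           # difference: 3.80 kcal/mol
print(f"X needs {relative_drop:.0%} less energy than Y")  # X needs 42% less energy than Y
```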
All the models
Here are the 49 models we constructed, plus two reference models (X0 and Y0, from “Design of simple synthetic RNA thermometers for temperature-controlled gene expression in Escherichia coli” [1]).
We divided these models into 6 groups by their characteristics. Some groups vary the same kind of feature, because we wanted to see what happens under both moderate and dramatic changes. Some structures may look absurd, such as B5, whose loop is 71 A’s in a row, while others change the loop gradually, as in Group 1. We also made groups with an internal loop on the stem, and compared sequences that are identical except for a few substitutions of A, C, G, or U. With all of this, we wanted to cover as wide a variety of structures as possible, so that we can have more confidence in the final product.
This is the overall graph of all the RNA sequences together. The X-axis indicates the fraction of bases unpaired, and the Y-axis represents temperature from 0 to 100 °C. Each line represents the data we gathered for one sequence.
Observation and Analysis of all the groups
Group 1
For this group, we gradually increased the size of the loop by adding one base at a time, from y1 through y10, in the middle of the loop structure. We found that stability decreased slightly with each added base.
This is the graph of the data. It shows that each sequence has a different starting point and a different line. From bottom to top, the lines are y1 through y10 in order. We conclude that loop size affects the stability of the sequence: for the same stem structure, the bigger the loop, the less stable the sequence.
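The way such a series of sequences can be generated is sketched below. This is only an illustration under stated assumptions: the example hairpin, the choice of A as the inserted base, and the function names are hypothetical and are not our actual y1-y10 sequences.

```python
def extend_loop(sequence: str, loop_start: int, loop_end: int,
                extra: int, base: str = "A") -> str:
    """Insert `extra` copies of `base` at the midpoint of the loop.

    The loop is sequence[loop_start:loop_end] (0-based, end exclusive).
    """
    if not 0 <= loop_start <= loop_end <= len(sequence):
        raise ValueError("loop bounds must lie inside the sequence")
    middle = (loop_start + loop_end) // 2
    return sequence[:middle] + base * extra + sequence[middle:]


def loop_series(sequence: str, loop_start: int, loop_end: int,
                count: int, base: str = "A") -> list[str]:
    """Variants with 1, 2, ..., count extra bases in the middle of the loop."""
    return [extend_loop(sequence, loop_start, loop_end, n, base)
            for n in range(1, count + 1)]


# Hypothetical hairpin: stem GGGC / GCCC around a four-base loop AAAA.
HAIRPIN = "GGGCAAAAGCCC"
for variant in loop_series(HAIRPIN, loop_start=4, loop_end=8, count=3):
    print(variant)
```

Each variant can then be submitted to the melt analysis in the same way as the original sequence.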
Group 2
In this group, every sequence has a bulge at a different location on the stem.
The data fluctuate up and down with no clear order. For now, pending further investigation, we assume that the location of a bulge does not affect stability.
Group 3
For this group, we created RNA sequences with dramatic differences in loop size. Some loops are huge, and some sequences have no loop at all. The result agrees with our earlier analysis: the bigger the loop, the less stable the sequence.
Group 4
This group is a comparison group derived from Group 2. Instead of changing one base on the stem, this time we changed two bases at each of the different locations on the stem. For a clearer comparison, the two groups are shown together here.
You can see from this picture that Group 2 is more tightly clustered than Group 4. This shows that a larger bulge on the stem does affect stability. However, we have not figured out how the location of the bulge changes stability. All the lines lie at different positions on the graph, but not in any clear order. This could be the key to how the location of additional bases affects the stability of the whole sequence. However, since our test groups are too small, we cannot draw a conclusion here. We will try to improve the model in future investigation.
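The statement that one group is "more tightly clustered" than another can be made quantitative. The sketch below is one possible measure, not the method we used: for each temperature it takes the gap between the highest and lowest fraction of bases unpaired within a group, then averages that gap over the 10-40 °C window we focus on. The data structure (a dictionary mapping sequence name to a temperature-to-fraction dictionary) and the function names are assumptions.

```python
def spread_by_temperature(curves: dict[str, dict[float, float]]) -> dict[float, float]:
    """Max minus min fraction unpaired at each temperature shared by all curves."""
    if not curves:
        raise ValueError("at least one curve is required")
    shared = set.intersection(*(set(curve) for curve in curves.values()))
    return {
        t: max(curve[t] for curve in curves.values())
           - min(curve[t] for curve in curves.values())
        for t in sorted(shared)
    }


def mean_spread(curves: dict[str, dict[float, float]],
                t_low: float = 10.0, t_high: float = 40.0) -> float:
    """Average spread over temperatures in [t_low, t_high]."""
    spreads = [s for t, s in spread_by_temperature(curves).items()
               if t_low <= t <= t_high]
    if not spreads:
        raise ValueError("no shared temperatures inside the window")
    return sum(spreads) / len(spreads)
```

A smaller `mean_spread` for Group 2 than for Group 4 would support the visual impression from the graph.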
Group 5
This group is a simple demonstration of how the bases interact and what effect that has. The result largely matches our expectation. B23 is clearly more stable than the others. The data show that matched bases make the sequence much more stable, and mismatched bases make it less stable.
Group 6
For this group, we placed a small internal loop on the stem, beneath the RBS and its complementary side. We changed only the composition of this small loop each time, to see how the interaction between the bases in the small loop changes the stability of the whole sequence. In the data, we can clearly see a difference between sequences B25, B28, B30 and sequences B26, B27, B29. What surprised us is that B26 is relatively stable. From this result, we assume that the combination of G and U is more stable. We do not yet know the reason.
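One known fact that may be relevant here, although we have not tested whether it explains the B26 result, is that G and U can form a "wobble" base pair in RNA in addition to the standard A-U and G-C pairs. The sketch below classifies a pair of bases accordingly. The function name and the labels are our own illustration.

```python
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(base_1: str, base_2: str) -> str:
    """Label two RNA bases as 'watson-crick', 'wobble', or 'mismatch'."""
    pair = frozenset((base_1.upper(), base_2.upper()))
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


for left, right in [("G", "C"), ("A", "U"), ("G", "U"), ("A", "C")]:
    print(left, right, classify_pair(left, right))
```

Applying this to the bases facing each other in the small internal loop would show which variants can form a G-U pair there.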
Summary
In conclusion, we summarize the characteristics of the RNA sequences as follows. First, as the loop gets bigger, stability gets lower, and vice versa. Second, the number and size of bulges on the stem affect stability: the more bulges there are and the bigger they are, the lower the stability.
Future investigation
We cannot yet clearly define how the choice of bases and the location of bulges affect a sequence’s stability. We will use a bigger sample to determine these effects in future investigation. We will also try to identify more factors that affect the stability of an RNA sequence, such as how stem length contributes to stability.
Reference
1. Juliane Neupert, Daniel Karcher and Ralph Bock. Design of simple synthetic RNA thermometers for temperature-controlled gene expression in Escherichia coli. Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, D-14476 Potsdam-Golm, Germany. Received July 16, 2008; Revised August 7, 2008; Accepted August 8, 2008.
2. Shaunak Sen, Divyansh Apurva, Rohit Satija, Dan Siegal, and Richard M. Murray. ACS Synth. Biol., Just Accepted Manuscript, Publication Date (Web): 24 Apr 2017. Downloaded from http://pubs.acs.org on April 27, 2017.