Team:Stuttgart/Model

Awards

Model

Abstract

This year’s iGEM project consists of two distinct subprojects (tRNA optimization of Vibrio Natriegens and creation of an eco-friendly cultivation medium based on microalgae). Hereby, modelling in both subprojects was a key factor for our success.
For one, we underwent a systems biology approach in proving that a rare codon tag/tandem repeat of rare codons at the C-terminus slows down the protein translation initially and that this burden can be overcome by increasing the rare tRNA availability.. The gene expression model which uses ~500 ordinary differential- and 7 algebraic equations was adapted as far as possible to Vibrio natriegens.  By using the gene expression model, we gained insight into the expression of superfold-GFP (sfGFP) and derivatives of it in Vibrio natriegens.  We plan to use these sfGFP derivatives to validate the increased potential of our new Vibrio natriegens strain to express heterologous genes with an increased number of rare codons.
In the other subproject we were tasked with creation of enough biomass in order to follow through with our experiments. Hereby, we modelled a unique airlift photobioreactor with six integrated light tubes, increasing the effective illumination surface. The reactor was modelled using computational fluid dynamics (CFD), solving the navier-stokes equation for more than 737 043 tetrahedral cells. The confirmation that integration of light tubes into the airlift does not interfere with the fluid flow causing an effective mixing of the microalgae, helped us set up the reactor. With the usage of said bioreactor microalgae biomass production was able to be increased significantly enabling us to follow through with our experiments.

Vibrio natriegens

Model

Introduction

The bacterial gene expression model that we used, was developed in 2005 by Arnold et. al at the Institute of Biochemical Engineering at the University of Stuttgart1. The target of this model was to fully depict the bacterial transcription, translation initiation, elongation and termination. A schematic view of the biological processes depicted by the model is shown in figure 1.


This model is still used at the institute and was refined multiple times over the last 14 years. The current state, which served as a basis for our adaptation was published by Nieß et al. in 20172. During these iterations, the model’s runtime, simplicity and operability were improved. Table 1 shows a brief comparison of the Arnold- and Nieß-Model.


Table 1 – Summarized comparison of the most important features of the first and the last version of the protein translation model.
Arnold-Model (2005) Nieß-Model (2017)
Programs MapleV, ACSL, OptdesX MATLAB
Run time Several hours up to weeks Several minutes to one hour
Size ~1000 ODEs + 7 algebraic equations ~500 ODEs + 7 algebraic equations
System in vivo in vivo and in vitro

The program has a modular structure. It consists of a main script in which all subfunctions are called. Individual modules can be switched on or off, depending on the model statement to be made. Modules that are integrated are e.g.:

  • various energy regeneration systems (acetate kinase ‘MOD-AK’, nucleoside kinase ‘MOD-NK’ and creatine phosphate kinase ‘MOD-CP’)
  • Transcription (‘MOD-TC’)
  • Translation (Initiation (‘MOD-IN’), Elongation + tRNA recharging (‘MOD-EL’), Termination (‘MOD-TE’))
  • mRNA Degradation (‘MOD-mD’)

As this model consists of ~500 ODEs and 174 metabolites (amino acids (20), tRNA species (383), tRNA-amino acid species (38), aminoacylated tRNA + elongation factor thermo-unstable + GTP (39), Nucleotides (12 NxP), Initiation & elongation factors (14), Substrate metabolites (13 incl. free mRNA, mRNA with bound ribosomes & protein), its structure is quite complex. The individual ODEs, rate equations and their parameters are embedded into individual modules. Figure 2 depicts the structure of the gene expression model framework, the input and the output.


Some neat features of this model are its extendibility/modularity e.g. the addition of protein folding, the switch between an in vitro and in vivo model by switching on/off individual modules in the model and the ease to need just two input files (the coding sequence of the protein of interest (fasta format) and a metabolite file with initial concentrations (excel sheet)).


Using the gene translation model several results were published by the Institute of Biochemical Engineering1,2,4,5, we want to highlight one with high value for our project. Nieß et al. published in 2017 a study to debottleneck the in vitro protein synthesis, here they were able to analyze the sensitivity of translation factors with the described Nieß-model. The various translation factors (e.g. tRNA, ribosomes, elongation factors, …) have different impacts on the translation rate of a gene. Different concentrations of these translation factors were tested, while all other input parameters and reaction conditions remained constant. The impact of the translation factors is summed up in figure 3.


Figure 3 shows, that the translational factors with the highest impact on the translation rate in vitro are the tRNA and EFTu concentration. This supports our project design, as our project focusses on the improvement of heterologous gene expression in Vibrio natriegens by increasing the cellular tRNA concentration. By adapting the Nieß-model to Vibrio natriegens and analyzing the results, we gained further insides into our hypothesis.





Figure 1 -The schematic overview of the model mechanisms is shown, starting with RNA polymerase and subsequent sequence-oriented translation as well as regeneration of cofactors. Abbreviations: RNAP = RNA polymerase; IF = initiation factor; EFTu = elongation factor Tu; EFTs = elongation factor Ts; EFG = elongation factor G; RF = releasing factor; NTP = nucleotide triphosphate; GTP = guanosine triphosphate; GDP = guanosine diphosphate. Adapted from Nieß et al. 20172.


Figure 2 - Structure of the program. A Reading the starting concentrations and the protein sequence. Conversion of the Excel datasheet into a .dat file.  Creation of auxiliary matrices that enable a suitable codon/tRNA assignment. Pre-integration of the modules consisting of kinetic parameters (kin) and differential equations (DGL) to determine the equilibrium/start concentrations. Integration of the differential equations or solving of the algebraic equations in order to display concentration curves. B Detailed view of the model framework and its substructures.



Figure 3 - Sensitivity analysis of translation factors regarding translation rate. Values are derived from an in silico 50% increase in the concentration of each translation factor. Resulting translation rate is normalized to the non-altered rate to allow easy comparison. Factors considered are initiation factors 1−3 (IF); elongation factors G (EFG), Ts (EFTs), and Tu (EFTu); ribosome releasing factor (RF); T7 RNA polymerase and tRNA. A value of 1 indicates no influence of this factor on the translation rate. From Nieß et. al (2017)2.

Results/Discussion

By using the gene expression model, we wanted to gain insight into the expression of superfolder GFP (sfGFP) and derivatives of it in Vibrio natriegens. To adjust the model to more accurately depict Vibrio natriegens, the metabolite levels had to be adjusted to the growth rates experienced with Vibrio natriegens. Since there is no published data on the tRNA concentrations of Vibrio natriegens, we opted to adjust the tRNA levels in our model to the increased growth rate of Vibrio natriegens according to Dong et al.3, the initiation – and elongation factors according to Arnold et al.1.


To test our adjusted model, we designed GFP derivatives with a C-terminal tag consisting of 5 repeats of Vibrio natriegens’ rare codons (individual constructs for AGG, AGA, TGC, TCC and CGG). Figure 4 shows a graphical representation of the 5xAGG-GFP derivative.


Figure 4 - Schematic view of the sfGFP derivative with 5 AGG codons added to the C-terminus.

To simulate our improved Vibrio natriegens strain with increased rare tRNA levels, we increased the initial concentration of the respective rare codon by factor 10 (one at a time) and ran the gene expression model with the respective GFP derivative as input gene. We compared this run, with the simulation of the same GFP derivative in the model with normal tRNA levels, representing the wild type of Vibrio natriegens. Initially, we made sure that the increase in the initial level of rare tRNA concentration transferred to all metabolites in the pathway to the protein synthesis: tRNA à amino acid loaded tRNA à ternary complex (amino acid loaded tRNA, EFTu and GTP). Figure 5 shows these metabolite levels during the complete simulation time.


All tRNA containing species in figure 5 shows the same behavior but exactly shifted by factor 10 (only Arg5 is shown representatively for all five rare tRNAs).  Therefore, the tRNA-species concentrations were successfully increased without disturbing the overall response of the model. To comprehensively quantify the protein translation rate with codon level resolution the model quantifies the concentration of ribosomes bound to each codon on the mRNA. This comparison is shown for all GFP derivatives individually in figure 6.


The higher CjR is for one codon, the more ribosomes are bound at this codon and therefore the overall translation is stalled. As seen in Figure 6, CjR differs between the wild type - and the optimized Vibrio natriegens strain for all GFP derivatives only in the initial 5 codons, additionally CjR is always smaller in the optimized strain than in the wild type strain. Interestingly, all rare codons vary in their effect on CjR with AGG having the strongest retarding effect on ribosomes, but not being the rarest codon according to Vibrio natriegens codon usage (see https://2019.igem.org/Team:Stuttgart/Design), this is due to the fact, that the tRNA concentrations are based on measurements resulting from Escherichia coli3, whose codon usage varies slightly from that of Vibrio natriegens.


The comprehensiveness of our gene expression model allowed us not only to inspect the translation rate with codon resolution but also on the level of the whole gene. For this, we looked at CjR for all codons of the gene over the simulation time course and compared again the cases shown in figure 6. Representatively, the comparison for of 5xAGG-GFP expressed in wild type Vibrio natriegens and in the AGG optimized strain is shown in movie 1.


Movie 1 - Plot of the time courses of CjR for the whole gene. Green 5xAGG-GFP expressed in the wild type Vibrio natriegens model. Blue 5xAGG-GFP expressed in the Vibrio natriegens model with 10-fold increased rare tRNA.

In movie 1, the progression of the ribosomes on the mRNA is initially clearly stalled in the wild type strain model, while the ribosomes progress faster in the AGG optimized Vibrio natriegens strain model. Nevertheless, the ribosomes reach in both cases, the last codon at the same time and can complete the protein synthesis. After reaching the last codon, CjR increases for all codons until a steady state is reached for all codons. The effect of the initial ribosome stalling on the amount of synthesized protein can be seen in figure 7.

The protein time course in figures 7 A and B are identical at all time points, implying that the improved initial translation rate doesn’t affect the overall protein concentration. In respect to figure 3 and Nieß et al., the two metabolites with the strongest effect on the translation rate are the tRNA and the elongation factor thermo unstable (EFTu)2. Therefore, the model runs into the bottleneck of EFTu-GTP (the active species in this model) availability, when the tRNA availability is overcome. According to Chu et al. good protein expression requires a high translation initiation rate, but the key to optimal expression is high translation elongation6. Therefore the initial stalling of the translation might not have a big effect on the overall amount of protein that is synthesized, but Lipinszki et al. showed experimentally that the rare codon tags on GFP reduce the protein synthesis in Escherichia coli7. An alternative could be to scatter multiple tandem repeats of rare codons throughout the gene sequence, as can be found in vivo in various species8.

If the rare codon tags really have an impact on the protein yield can only be seen in an experimental setup at this point. Still, the use of the gene expression model gave us insight into the details of the molecular relations and the species involved in gene translation. We could successfully validate, that a rare codon tag slows down the protein translation initially and that this burden can be overcome by increasing the rare tRNA availability.


Acknowledgment

We want to thank the Institute of Biochemical Engineering at the University of Stuttgart and especially Jan Müller (PhD student, https://www.ibvt.uni-stuttgart.de/institute/team/Mueller-00126/) for helping us adapt their gene expression model to our needs, interpret the model results and provide the computational capacity.



References

  1. Arnold, S., Siemann-Herzberg, M., Schmid, J. & Reuss, M. in Biotechnology for the Future, edited by J. Nielsen (Springer Berlin Heidelberg, Berlin, Heidelberg, 2005), pp. 89–179.
  2. Nieß, A., Failmezger, J., Kuschel, M., Siemann-Herzberg, M. & Takors, R. Experimentally Validated Model Enables Debottlenecking of in Vitro Protein Synthesis and Identifies a Control Shift under in Vivo Conditions. ACS synthetic biology 6, 1913–1921; 10.1021/acssynbio.7b00117 (2017).
  3. Dong, H., Nilsson, L. & Kurland, C. G. Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates. Journal of molecular biology 260, 649–663; 10.1006/jmbi.1996.0428 (1996).
  4. Nieß, A., Siemann-Herzberg, M. & Takors, R. Protein production in Escherichia coli is guided by the trade-off between intracellular substrate availability and energy cost. Microbial cell factories 18, 8; 10.1186/s12934-019-1057-5 (2019).
  5. Arnold, S. et al. Kinetic modeling and simulation of in vitro transcription by phage T7 RNA polymerase. Biotechnol. Bioeng. 72, 548–561; 10.1002/1097-0290(20010305)72:5<548::AID-BIT1019>3.0.CO;2-2 (2001).
  6. Chu, D. et al. Translation elongation can control translation initiation on eukaryotic mRNAs. The EMBO journal 33, 21–34; 10.1002/embj.201385651 (2014).
  7. Lipinszki, Z. et al. Enhancing the Translational Capacity of E. coli by Resolving the Codon Bias. ACS synthetic biology 7, 2656–2664; 10.1021/acssynbio.8b00332 (2018).
  8. Clarke, T. F. & Clark, P. L. Rare codons cluster. PloS one 3, e3412; 10.1371/journal.pone.0003412 (2008).


Figure 5 - Visualization of all metabolite levels of ‘Arg5’-tRNA (tRNA coding for AGG) containing species in the model plotted over time. Abbreviations: ‘t-Arg5’ = tRNA encoding for AGG, ‘aa-t-Arg5’ = amino acid loaded Arg5-tRNA, ’t3-Arg5’ = ternary complex (amino acid loaded Arg5-tRNA, EFTu and GTP).

Figure 6 - Plot of the steady state mRNA bound ribosome concentration at each codon position (‘CjR’) for all GFP derivatives (A 5xAGA-GFP, B 5xAGG-GFP, C 5xCGG-GFP, D 5xTCC-GFP, E 5xTGC-GFP) with C-terminal rare codon tags expressed in the wild type Vibrio natriegens model and in the Vibrio natriegens model with 10-fold increased rare tRNA.

Figure 7 - Plot of the time courses of the protein. A 5xAGG-GFP expressed in the wild type Vibrio natriegens model. B 5xAGG-GFP expressed in the Vibrio natriegens model with 10-fold increased rare tRNA.



Chlorella vulgaris

Introduction and Background

During our lab implementation and first experiments we quickly ran into issues with finishing our experiments on-time. The first major issue was the growth of the microalgae Chlorella Vulgaris. With no one in our group having notable experience with microalgae we started to cultivate C. vulgaris in a 200 mL flask, quickly noticing that the flask volume was not sufficient for quick microalgae growth. Therefore, we scaled up our microalgae cultivation. Firstly, into a 2-liter bottle and afterwards, with enough biomass, into a 5- and 10-liter bottle. The next optimizations were adjusting the light source for appropriate flux to allow photosynthesis, pumping air into the bottle and mixing the suspension with a magnetic stirrer.  All optimization steps included, the algae growth increased but was still too low to obtain a substantial amount of biomass for our experiments, with several problems arising during the cultivation.

Problem 1

Ineffective mixing caused some algae to grow attached to the glass bottle hindering the light source to pass through, therefore rendering our light source useless (Figure 1). Furthermore, the fumigation stone clogged up due to the attached growth of the algae. As a result of ineffective mixing the algae grew directly on the glass bottle which prevented the light from passing through. In addition to that, the mass of algae had caused clogging of the fumigation stone.

Building an airlift photobioreactor with proper mixing though the pumped air, preventing any attached growth from occurring.



Problem 2

However, the new solution had brought a new question How far and how deep can light travel through the bioreactor? How much energy do we lose through placing the light source outside of the reactor? How can we increase the effective illumination surface?


Solution to our second problem

An airlift photobioreactor with a number of tubes containing an LED-chain added into the bioreactor. Therefore, we consulted Prof. Dr. Takors from the Institute of Biotechnology from the university of Stuttgart. He provided us with expert knowledge on airlift design as well as helpful insight for our design. Based on his advice and airlift models found in literature we designed an in-silico airlift bioreactor with the proprietary specs we wanted to use.



Figure 1 - Cultivation of Chlorella vulgaris prior to setting up the reactor. The microalgae were cultivated in a 5- or 10-liters bottle, mixed using a magnet stirrer, gassed with air and illuminated using two extern light sources on the outside of the bottle.

Modelling

Aim

The main goal of having a computational fluid dynamics (CFD) simulation was to enable water recirculation in the proposed design. Keeping in mind the aforementioned algae growth issue and the problems with outwardly placed light sources, we decided to include the illumination within the downer chamber. This provided two-fold benefits:

  1. Having the lamps inside would increase the illuminated volume of the tank.,
  2. Based on the continuity assumption of fluid flow we expected the same kind of volumetric flow regardless of light source placement.

Furthermore, due to the flow rate equation (1&2), illumination inside of the airlift bioreactor increases the velocity of the liquid.

Q = liquid flow rate (m3/s); A = area of the pipe (m2); v = velocity of the liquid (m/s)

Simulation

Prior to setting up the reactor, we conducted a CFD simulation to ensure that that the light tubes wouldn’t disrupt the circulation of the flow. The CFD software uses Navier-Stokes equations to describe the fluid motion. With the fluid’s motion being based on the law of conservation, the Navier-Stokes equations use 3 conservation equations:

Assuming the temperature, the energy conservation equation can be neglected in our case. The mass conservation equation as well as the momentum conservation equation however need to be considered simultaneously in order to analyze the fluid flow. The simulation space is divided into smaller volumetric elements called “finite volumes” or cells. This division of simulation space is termed “meshing”. Each cell is a “building block” taking up a space in the empty volume of the reactor through which the liquid will flow. The size of the cells and their distribution plays a key role in determination of the flow behavior, placing large cells in crucial places of the geometry assembly might leave out essential characteristics which might have a significant impact on the output of the simulation. The Navier-Stokes equation is then solved for each cell building up to a specific geometry of the reactor.



Results

FLUENT 2019 R2 was used for the intended simulation. The geometry of the reactor was meshed in 737 043 tetrahedral cells (Figure 2). As shown, the cells were smaller and therefore of higher abundance in critical spaces of the reactor. Hereby, we can make sure that the flow in the reactor at places, which we considered critical for the flow, is calculated correctly.



The geometry of the reactor is being reconstructed by the 737 043 tetrahedral cells describe our preliminary reactor draft (Figure 3). Hereby the reactor is viewed in-silico from both sides, from the top as well as from an angle.


Because of the transient nature of the process the calculations were carried out in a pseudo-transient mode. For the sake of simplicity, the liquid phase was assumed to have the same properties as water. Air was chosen for the gas phase. For turbulence modeling “realizable k-ε” model was used and turbulence for multiphase was modeled using mixture model. For multiphase modeling Eulerian model was used with all the default settings for phase interactions and bubble size of 5 mm (this was assumed based on visual observations). It is worth mentioning that like every other type of modeling, CFD simulations need to be goal oriented and practical in a way that the intended results are obtained with a reasonable amount of resources (manpower, time, and computational power!). In our case CFD is being used to prove the concept and in a qualitative way to capture the overall behavior of the flow field. Usually, in most of the cases where quantification of the parameters are of interest, some experimental validation measurement are needed for the scope of our conceptual design. The discretization schemes for all variables were set to first order discretization and P-V coupling was carried out in a coupled scheme.


Next step was to introduce the Boundary Conditions (BCs) for the system. For our case, only two were needed: Gas flow inlet (set as a velocity inlet) and the boundary on fluid surface (set to be a pressure outlet with no backflow for the gas phase).
The velocity of the gas pumped into the system was set to 600 norm-liter per hour, a typical velocity for Chlorella vulgaris cultivation in an airlift often found in literature (Sadeghizadeh et al., 2017; Madhubalaji et al., 2019) .


First, the complete y-flow of the reactor was analyzed. It was of interest if the light tubes disrupt a typical liquid flow of an airlift reactor (Figure 4). At the sparger, the velocity and direction of the liquid flow is in a muddle. This is caused through the high input velocity as well as through the incoming water from the outer tubes. As can be seen, the water in the inner tube exclusively travels upwards with a velocity of around 20 centimeters per second. In the outer tubes the water exclusively travels downwards with a velocity of around 4 centimeters per second (Figure 4).


During analyzation of the downwards waterflow we detected discrepancies in the downward waterflow between the downward waterflow without tubes (Figure 5, red 1) in comparison with light tubes (Figure 5, red2).


With the differences in velocity in the downward flow, it was of interest how the influence of the light tubes is. Therefore, we conducted a specific CFD for a small part of the reactor and modelled the flow in the y-direction without (Figure 6, left side) and with light tubes (Figure 6, right side). The downward waterflow in spaces without light tubes exhibits a faster velocity (around 10 cm per second) whereas the downward waterflow with light tubes exhibit a velocity of around 4 cm per second.




Figure 2 - Meshing of the reactor is critical for correct prediction of the liquid flow.The reactor was meshed into 737.043 cells building up to the geometry of the reactor. Cells are more dense and smaller in size at critical flow coordinates in the reactor

Figure 3 - Correct in-silico reconstruction of the reactor is critical for successful simulation. The reactor was reconstructed in-silico based on the preliminary drawing.

Figure 4 - Flow calculation based on a CFD-simulation. The liquid flow of the complete airlift photobioreactor in the y-direction was simulated. Displayed is the direction of the water flow (arrows) as well as the velocity of the water flow (color-coded, left side).

Relevance for our project

The modelling indicated, that even with insertion of six light tubes into the airlift reactor the typical airlift flow is provided. Enabling us to set up the airlift reactor and start cultivation of Chlorella vulgaris, with the six inserted light tubes able to increase effective illumination surface. Furthermore, the predicted velocity of the liquid was high enough to prevent attached growth of the microalgae. The growth rate was able to be increased compared to our 10 liters bottle. With this increased growth rate, we were able to achieve quicker and higher biomass production biomass for our experiments and collaboration partners.


To conclude, without the airlift reactor we would not have been able to successfully conclude our algae project. With the validations of the flow while placing the light tubes as well as optimal distribution of light tubes, the reactor would not have been successfully set up without modelling. Therefore, the modelling part of the algae project was critical for our success and we could have not been able to conclude our experiments without it.


References




Figure 5 - Cross-section of the reactor from the top. The light tubes cause the downward waterflow to flow through two distinct topologies. The downward waterflow can flow like a typical airlift reactor (red 1), but also be limited in space due to the inserted light tubes (red 2).



Figure 5 - Analysis of liquid flow behavior with and without tubes. A part of the reactor was analyzed for liquid flow in the y-direction with downward waterflow without (left side) and with light tubes (right side). The direction of water flow is displayed as arrows with the velocity being color-coded.