S.P.L.A.S.H.
Suckerin Polymer Layer to Achieve Sustainable Health
Model
Mathematical modeling provides an excellent platform for both design and improvement of an experimental setup. Large scale production of suckerin proved challenging in our project. Thus, we created a model of E. coli metabolism, including the suckerin-19 reaction, to optimize production conditions. Suckerin yield after growth in LB medium in our bioreactors was low. Based on model predictions, we improved defined Studier Phosphate Glucose medium to replace LB. Growth in this new medium yielded increased levels of suckerin-19 as predicted. A second model, simulating suckerin-19 production, was then created to predict metabolic reactions that bottlenecked suckerin production. These reactions are viable targets for genetic intervention, further improving suckerin-19 production.
Our modeling page also provides a step-by-step guide through the making of our model, which future iGEM teams can use as a reference. This way, we hope to inspire other teams to implement metabolic modeling in their projects, as it was vital to the success of our own project. To this end, we also provide all our modeling code on GitHub to aid future modeling ventures.
Chapter 1: Background & Goals
The number of burn wound victims around the world amounts to a staggering eleven million. Treating such high numbers of patients with hydrogels will require vast quantities of suckerin protein. Therefore, scaling up suckerin production is key to our project. In order to predict how protein production could be increased, we developed a model of the metabolism of Escherichia coli (E. coli). At the start of our production process, it was uncertain whether we would be able to make our own suckerin-12 production constructs. Therefore, we decided to develop the model for the suckerin-19 producing organism carrying the plasmid pQE80-L SRT-19, provided by the Miserez group. Suckerin-19 is not secreted after production, but remains confined intracellularly. This made continuous production impossible, so a batch process was required. Suckerin-19 production was under the control of the pLac promoter and induced by isopropyl β-D-1-thiogalactopyranoside (IPTG). Because we have no control over the individual reactions contributing to suckerin-19 production, we assumed that the largest absolute amount of suckerin-19 is obtained when the amount of biomass at the time of induction is as high as possible. Moreover, we assumed that induction acts like a switch that changes the cells from growth to suckerin production.
Multiple organisms are suitable for large scale protein production. In our project, we proposed using Saccharomyces cerevisiae, Streptomyces lividans, and Escherichia coli. This would require the construction of three separate models. However, due to time constraints, we only constructed a model of the relatively simple metabolism of E. coli. To model the E. coli metabolism, we made use of the COnstraint-Based Reconstruction and Analysis (COBRA) toolbox in the MATLAB modeling environment. The COBRA toolbox provides in-depth analysis methods for biomass production under specified growth conditions (see chapter 2). Our model is based on the iJO1366 model, the latest representation of E. coli metabolism in the COBRA environment [4].
During our project, large scale production of suckerin-19 proved challenging. E. coli growth in LB medium did not yield high levels of suckerin-19 after purification. The viscosity of the LB medium likely hampered the suckerin purification, which explains our low yield. By implementing predictions made by the model of E. coli metabolism, we set out to improve suckerin-19 yield. Two approaches were implemented, the first being increased biomass production, for which we developed and optimized a minimal medium according to the model predictions to replace LB in bioreactors. The second approach predicted metabolic reactions that would bottleneck suckerin-19 production. By genetically modifying the genes behind these reactions, higher levels of suckerin production could be achieved. We will explain both approaches in detail in the following chapters. The flowchart below describes the different stages in the development of our metabolic model and will be present at the start of each chapter to describe our progress.
Our modeling page not only describes our modeling efforts, but also serves as a guide for other iGEM teams that want to use a metabolic model in their project. The COBRA toolbox provides an ideal platform for metabolic modeling, but it is very complex. We attempt to concisely describe the basics of working in the COBRA environment. To achieve this, we also provide our model and all results generated by it on GitHub. Any team can access our model and use it to further their research or improve the model. By doing this, we hope to stimulate other teams to implement metabolic modeling in their projects, as it proved invaluable to the success of our own project.
Chapter 2: COBRA Toolbox Explanation & Biomass Formation Model
We decided that our goal was to find experimental conditions for which the production of suckerin-19 is maximal. Furthermore, we settled on using the COBRA toolbox environment for our modeling efforts. Our next step was the construction of a metabolic model that represented biomass production in E. coli. This model would allow us to simulate the maximum biomass production of E. coli cells, as we assumed that this would allow us to obtain a larger amount of suckerin-19.
COBRA Toolbox Modeling
COBRA toolbox modeling visualizes flux through metabolic reactions during biomass formation. This enables the model to predict modifications to the media composition that would increase biomass formation and suckerin-19 production. This, however, requires data generated in a controlled environment. Because bioreactors are capable of continuously monitoring multiple parameters, like pH, oxygen uptake and carbon dioxide production, they provide the ideal source of data for metabolic modeling using the COBRA toolbox. A basic understanding of how modeling in the toolbox works is required to understand the need for this controlled setup. The basics behind the toolbox will be explained in this chapter.
The COBRA toolbox is used to both model and visualize the reactions that take place in a cellular environment. The toolbox bases itself on a network of reactions that use and produce metabolites and the genes that influence these reactions. The following simplified network explains how the software simulates the metabolism of living cells:
Figure 1. Simplified representation of metabolic reactions in the COBRA toolbox. Reactions are displayed in green and metabolites in red.
The network in Figure 1 shows a basic reaction pathway in a cell and the metabolites that are consumed and produced. The toolbox uses a stoichiometric coefficient for each metabolite used in the reactions. In reaction 3, for instance, metabolite B has a coefficient of -2 and C of 2. This means two B metabolites are used in the reaction and two C metabolites are produced. By doing this for every reaction in the network, an ($m\times n$)-matrix, the stoichiometric matrix ($S$), is generated, where $m$ is the number of metabolites and $n$ the number of reactions:
$$\qquad \quad \textcolor{grey}{ \begin{matrix} R1\! & R2\! & R3\! & R4\! & R5\! & R6\! & R7\! \end{matrix}}\\[5pt] S =\textcolor{grey}{\begin{matrix} A\\ B\\ C\\ D\\ E \end{matrix}} \; \begin{bmatrix} 1 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & -2 & -1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & -1 \end{bmatrix} $$
Stoichiometric matrix ($S$) based on the metabolic reactions in Figure 1. Every metabolite gets assigned a stoichiometric coefficient based on the reaction coefficients of the respective reaction.
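As an illustration of how such a matrix is assembled, the short Python sketch below rebuilds $S$ from a list of reactions. It is not part of our COBRA/MATLAB code. The dictionary layout and the function name `build_stoichiometric_matrix` are hypothetical, and reading R1 as an uptake and R7 as an export is our interpretation of the matrix above.

```python
# Toy network from Figure 1, written as {reaction: {metabolite: coefficient}}.
# Negative coefficients are consumed, positive coefficients are produced.
reactions = {
    "R1": {"A": 1},              # interpreted as uptake of A
    "R2": {"A": -1, "B": 1},
    "R3": {"B": -2, "C": 2},
    "R4": {"B": -1, "D": 1},
    "R5": {"C": -1, "E": 1},
    "R6": {"D": -1, "E": 1},
    "R7": {"E": -1},             # interpreted as export of E
}
metabolites = ["A", "B", "C", "D", "E"]


def build_stoichiometric_matrix(metabolites, reactions):
    """Return S as a list of rows: one row per metabolite, one column per reaction."""
    return [
        [coefficients.get(metabolite, 0) for coefficients in reactions.values()]
        for metabolite in metabolites
    ]


S = build_stoichiometric_matrix(metabolites, reactions)
for metabolite, row in zip(metabolites, S):
    print(metabolite, row)
```

Each printed row corresponds to one row of the matrix above, so the result has $m = 5$ rows and $n = 7$ columns.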
This basic concept is used to display every reaction that happens in the cell, including exchanges with the extracellular environment [7, 15].
Besides the stoichiometric matrix ($S$), the model requires a second variable: the flux through each of these reactions. This is the reaction rate in mmol/gDW/h (mmol per gram dry weight per hour). The flux of every reaction is stored in the flux vector ($v$), of length $n$ (the number of reactions), and is used in the following equation:
$$ Sv = 0 $$
This equation represents the mass balance over the system in steady state, as shown in Figure 2. The model assumes that reactions are much faster than the time frame the model looks at. It therefore assumes a pseudo steady state, in which every small perturbation returns to steady state after a short time. The flux vector ($v$) contains the flux through all reactions in the network and the stoichiometric matrix ($S$) contains the stoichiometric coefficients of all metabolites in these reactions. Together, they form the mass balance equations that make up the metabolism in the model.
Figure 2. Depiction of how $Sv = 0$ represents the mass balance over a system in a steady state in the model. Includes the stoichiometric matrix ($S$) and the flux vector ($v$).
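To make the steady-state condition concrete, the sketch below multiplies the toy matrix of Figure 1 by two candidate flux vectors. The flux values and the helper name `mass_balance` are hypothetical and chosen purely for illustration.

```python
# Stoichiometric matrix of the toy network (rows A-E, columns R1-R7).
S = [
    [1, -1, 0, 0, 0, 0, 0],
    [0, 1, -2, -1, 0, 0, 0],
    [0, 0, 2, 0, -1, 0, 0],
    [0, 0, 0, 1, 0, -1, 0],
    [0, 0, 0, 0, 1, 1, -1],
]


def mass_balance(S, v):
    """Return S*v: the net production rate of every metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


v_balanced = [3, 3, 1, 1, 2, 1, 3]    # hypothetical fluxes that satisfy Sv = 0
v_unbalanced = [3, 3, 1, 1, 2, 1, 2]  # export R7 too slow: E accumulates

print(mass_balance(S, v_balanced))
print(mass_balance(S, v_unbalanced))
```

The first vector gives a zero for every metabolite, so it is a valid steady state. In the second vector the last entry is positive, meaning metabolite E would accumulate and the vector is rejected by the model.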
In a large metabolic model, there are more reactions than metabolites, which means $n>m$. Thus, there are more unknown variables than equations in the steady state, so this underdetermined system has no unique solution. Because of this, COBRA metabolic modeling requires additional constraints on the fluxes. The model starts with every flux in the flux vector ($v$) maximized, meaning that every reaction can run as fast as physically possible. This is not the case in the actual organism, so reaction fluxes can be lowered to introduce constraints to the model. This means, for instance, that the model has to run using a limited glucose flux and adapt maximum biomass production to these conditions [7, 15].
Biomass Formation Model
Creating a complex model for E. coli from scratch is time-consuming and laborious, therefore we decided to adapt and build on an existing reconstruction of E. coli metabolism. This model, known as iJO1366, incorporates 1805 metabolites, 2583 reactions, and 1367 genes in E. coli metabolism [4]. This model is capable of simulating growth under specific conditions, simulating what happens in living cells. The limitation of using this model is that it represents a general strain, E. coli K-12, and not the E. coli strain Rosetta used in our experiments. The differences between the two strains could impose slight variations between the model and reality.
The model was modified by adding the reaction for the production of suckerin-19 as shown in Figure 3. This more accurately represents the conditions in our strain than a model that does not contain the suckerin-19 production reaction. The amino acid and ATP components of suckerin-19 were obtained from literature [16]. An important part of this reaction is the conversion of ATP to ADP, which provides the energy. If energy in the form of ATP is used in a reaction, new ATP must be created in another reaction in the cell to compensate, which changes the calculated fluxes.
Figure 3. The suckerin-19 production reaction as added to the model. Each metabolite has its respective stoichiometric coefficient, which is added to the stoichiometric matrix ($S$).
The problem with this reaction is what the model does after suckerin-19 is produced. After production, the protein will have no place to go in the model, since it is not used or broken down. Because of this, the $Sv = 0$ condition becomes impossible. We created a suckerin-19 sink, simulating the accumulation of suckerin-19 in the cell. This basically tells the model that the protein gets removed from the system after production, which restores the possibility for the $Sv = 0$ condition and solves the problem of suckerin-19 accumulation.
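A minimal worked example shows why the sink is needed. The symbols $v_{\text{prod}}$ and $v_{\text{sink}}$ are our own labels for the flux through the suckerin-19 production reaction and the sink reaction, and we assume a coefficient of 1 for suckerin-19 in both. Without a sink, suckerin-19 only appears as a product, so its row of $Sv = 0$ reads:

$$ 1 \cdot v_{\text{prod}} = 0 \quad\Rightarrow\quad v_{\text{prod}} = 0 $$

Steady state is then only possible when no suckerin-19 is produced at all. With the sink added, the same row gains a second entry:

$$ v_{\text{prod}} - v_{\text{sink}} = 0 \quad\Rightarrow\quad v_{\text{prod}} = v_{\text{sink}} $$

Any production flux can now be balanced by an equal sink flux.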
The question remains if this sink reaction is representative of what actually happens in the cells. Fortunately, the sink reaction comes quite close to what E. coli actually does with suckerin, since the protein is stored in inclusion bodies by the organism, as shown in Figure 4. This removes the protein from the system, which is similar to what is modeled with a sink [10, 11].
Figure 4. After production, suckerin-19 is stored in inclusion bodies. This process is simulated by the suckerin-19 sink reaction in the model, which removes the protein from the system in a similar fashion.
Flux Balance Analysis (FBA)
The first step in improving suckerin production is to determine what the maximum growth and production rates are for E. coli. To this end, Flux Balance Analysis (FBA) was used, which makes use of the predetermined reaction fluxes in the metabolic network to determine the maximum growth rate and production under specific conditions.
The challenge with FBA is that there are multiple ways to satisfy the condition $Sv = 0$, since the model has no defined objective to optimize. This creates an enormous solution space, in which an optimal $v$ for all reaction fluxes cannot be found. Because of this, the introduction of an objective function is required. This is a goal that the model must achieve, for instance maximizing biomass production. The objective can be any reaction and serves to decrease the solution space in the metabolic network, allowing for the determination of an optimal flux vector ($v$). Determination of the objective function is displayed by the following equation:
$$ \max_{v} \; Z = c^Tv\\[8pt] \text{subject to}\quad Sv = 0\\ v_l \leq v \leq v_u $$
In this formula, $Z$ represents the objective function and $v$ the flux vector. Vector $c$ contains the weight with which each reaction contributes to the objective function. Normally, only one reaction is optimized using FBA, meaning that all elements of vector $c$ equal zero except the element of the objective reaction, which equals 1. The lower and upper bounds $v_l$ and $v_u$ are the constraints imposed on the fluxes. The determination of the objective function is visualized in Figure 5 [8, 9, 15].
Figure 5. Determination of the objective function of the model. Since the objective function is one reaction in the model, $m = 1$. This means only one reaction in the model is set as $Z$, the objective. Because of this, $Z$ is satisfied by $c^T$.
FBA uses linear programming to solve $Sv = 0$ for the combination of fluxes that maximizes the objective function. The solution space is defined by the upper ($v_u$) and lower ($v_l$) flux bounds of the reactions in the network, which are the constraints that were imposed upon the model before. FBA is able to find one optimal flux distribution for vector $v$ in the solution space that satisfies the objective, as displayed in Figure 6.
When using a different objective function than biomass production in FBA, the analysis assumes that biomass production is 10% of the maximum value. This is done to prevent solutions in the analysis without the production of biomass. However, this influences how accurately the model represents reality, since actual growth rates can be higher or lower. In order to refine the model, measurements of biomass formation at maximum suckerin-19 production were needed, which was not possible due to time constraints [8, 9, 15].
Figure 6. FBA uses the solution space defined by $v_u$ and $v_l$ to maximize objective $Z$ for $Sv = 0$. This means FBA will find an optimal flux set for all reactions in vector $v$ to maximize the flux through the objective reaction $Z$.
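The optimization that FBA performs can be tried out on the toy network of Figure 1. The sketch below is not our COBRA code. It assumes hypothetical flux bounds, takes R7 as the objective reaction, and replaces the linear programming solver of the toolbox with a brute-force search over the corners of the solution space. This only works for very small networks whose stoichiometric matrix has full row rank. The function names `solve_linear_system` and `fba` are our own.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_linear_system(A, b):
    """Solve the square system A x = b exactly; return None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [x / pivot_value for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]


def fba(S, lower, upper, objective):
    """Maximise v[objective] subject to S v = 0 and lower <= v <= upper.

    The optimum of a bounded linear program lies at a corner of the solution
    space, where at least n - m fluxes sit on one of their bounds. All such
    corners are enumerated and the best feasible one is returned.
    """
    m, n = len(S), len(S[0])
    best_value, best_v = None, None
    for fixed in combinations(range(n), n - m):
        free = [j for j in range(n) if j not in fixed]
        A = [[S[i][j] for j in free] for i in range(m)]
        for values in product(*[(lower[j], upper[j]) for j in fixed]):
            b = [-sum(S[i][j] * val for j, val in zip(fixed, values))
                 for i in range(m)]
            solution = solve_linear_system(A, b)
            if solution is None:
                continue
            v = [Fraction(0)] * n
            for j, val in zip(fixed, values):
                v[j] = Fraction(val)
            for j, val in zip(free, solution):
                v[j] = val
            feasible = all(lower[j] <= v[j] <= upper[j] for j in range(n))
            if feasible and (best_value is None or v[objective] > best_value):
                best_value, best_v = v[objective], v
    return best_value, best_v


S = [
    [1, -1, 0, 0, 0, 0, 0],
    [0, 1, -2, -1, 0, 0, 0],
    [0, 0, 2, 0, -1, 0, 0],
    [0, 0, 0, 1, 0, -1, 0],
    [0, 0, 0, 0, 1, 1, -1],
]
lower = [0] * 7
upper = [10, 1000, 2, 3, 1000, 1000, 1000]  # hypothetical: R1 <= 10, R3 <= 2, R4 <= 3

z, v = fba(S, lower, upper, objective=6)    # index 6 is reaction R7
print("maximum flux through R7:", z)
print("flux vector:", [float(x) for x in v])
```

With these bounds the two branches through R3 and R4 limit the objective rather than the uptake through R1, which mirrors how a single constrained reaction can determine the maximum of the objective in the full model.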
Chapter 3: Flux Balance Analysis & Suckerin-19 Production Model
So far, we have determined that our goal is to scale up suckerin-19 production in E. coli. To achieve this, we have constructed a metabolic model of E. coli in the COBRA toolbox. Our first objective was to change the medium used in our bioreactor setup, since LB was not suitable for high levels of suckerin-19 production. To do this, we needed to impose constraints on the model to simulate the behavior of real cells in a different medium.
Approach
As previously mentioned, the first goal of our model was to find a medium to replace LB in a bioreactor setup. Because of this, uptake reactions in the model were constrained based on concentrations of the metabolites in Studier Phosphate Glucose (SPG) minimal medium (Table 1). Under these conditions, maximum biomass production was calculated for E. coli. To determine maximum biomass production in our model, we made use of FBA. FBA also allows for the prediction of metabolites in SPG that were limiting for E. coli growth. This enabled us to substitute the medium with limiting metabolites and provide more optimal conditions for biomass production.
Next, a second metabolic model was created. The first model represents growth before induction of suckerin-19 production, but the second model represents suckerin-19 production after induction. The production of suckerin-19 was optimized by using FBA with suckerin-19 production as its objective function. This predicted the maximum suckerin-19 yield in SPG medium.
FBA for Biomass Formation in Studier Phosphate Glucose (SPG) Medium
FBA, or flux balance analysis, uses predetermined reaction flux constraints to determine the maximum possible rate for a selected objective reaction. FBA solves $Sv = 0$ for this objective function, which optimizes the flux vector ($v$) for this objective function by changing flux in other metabolic reactions. This yields the highest possible rate for the selected reaction under the introduced constraints. See chapter 2 for an in-depth explanation of FBA.
We based our metabolic model on growth in SPG minimal medium (Table 1) [12, 13]. To produce realistic FBA results, constraints were imposed upon the model according to this medium. Our original plan was to use GC-MS to determine uptake rates of metabolites from the SPG medium by E. coli cells. This proved impossible due to time constraints, so we used the mmol/L concentrations of metabolites in SPG instead (Table 2) [12, 13]. Using metabolite concentrations as constraints is a common method in FBA when no experimental data are available [3].
Table 1. SPG medium ingredients and concentrations. These concentrations were used to define the constraints shown in Table 2 [12, 13].
| Ingredients | Concentration (g/L) |
| --- | --- |
| (NH4)2SO4 | 3.3 |
| KH2PO4 | 6.8 |
| Na2HPO4 | 7.1 |
| MgSO4 | 0.24 |
| Glucose | 5 |
Table 2. Constraints imposed on exchange reactions in the model based on SPG minimal medium. Each constraint is based on the concentrations of medium ingredients as displayed in Table 1.
| Exchange reactions in the model | Imposed uptake constraints (mmol/gDW/hr) |
| --- | --- |
| Ammonium Exchange | 49.9472 |
| Hydrogen Exchange | 149.9517 |
| Phosphate Exchange | 99.9831 |
| Glucose Exchange | 27.7537 |
| Magnesium Exchange | 1.9938 |
| Potassium Exchange | 49.9685 |
| Sodium Exchange | 50.0145 |
| Sulphate Exchange | 26.9675 |
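The values in Table 2 appear to follow from Table 1 by dividing each concentration in g/L by the molar mass of the compound and counting how many exchanged ions each formula unit delivers. The sketch below reproduces three of them. The molar masses are standard textbook values that we assume here, as they are not listed in the tables, and the function name `mmol_per_litre` is hypothetical.

```python
# Molar masses in g/mol: assumed standard values, not taken from Table 1.
MOLAR_MASS = {
    "(NH4)2SO4": 132.14,
    "MgSO4": 120.37,
    "Glucose": 180.16,
}
# Concentrations in g/L as listed in Table 1.
GRAMS_PER_LITRE = {"(NH4)2SO4": 3.3, "MgSO4": 0.24, "Glucose": 5.0}


def mmol_per_litre(compound, ions_per_formula_unit=1):
    """Convert a Table 1 concentration (g/L) into mmol/L of the exchanged species."""
    mol_per_litre = GRAMS_PER_LITRE[compound] / MOLAR_MASS[compound]
    return 1000 * mol_per_litre * ions_per_formula_unit


glucose = mmol_per_litre("Glucose")
ammonium = mmol_per_litre("(NH4)2SO4", ions_per_formula_unit=2)  # two NH4+ per unit
magnesium = mmol_per_litre("MgSO4")

print(round(glucose, 2), round(ammonium, 2), round(magnesium, 2))
```

The results agree with the glucose, ammonium and magnesium entries of Table 2 to within rounding of the molar masses. Note that these numbers are concentrations that were used directly as upper limits on the uptake fluxes, not measured uptake rates.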
FBA for biomass formation under the constraints in Table 2 yielded a maximum growth rate of 0.9824 mole biomass generated per mole biomass present per hour. Because the model was not constrained besides the exchange reactions in Table 2, this growth rate is likely higher than in real cells. The lack of further constraints makes the model assume almost ideal growth conditions. In reality, the maximum growth rate is limited by far more factors than are simulated by the model.
Though the growth rates predicted by FBA were likely too high, they could still be used to predict which medium components were limiting for growth. This was achieved by successively increasing and decreasing the flux bound for each exchange reaction in Table 2. Every time a bound was changed, FBA was executed to optimize biomass formation. The resulting growth rates only increased when changing the glucose exchange rate, as depicted in Figure 7. The model thus predicted that glucose was the only limiting ingredient in SPG minimal medium, and that adding more glucose would yield faster biomass formation.
Figure 7. Graph displaying FBA results for biomass formation under increasing glucose uptake rates. The more mmol/gDW/h glucose cells are allowed to take up, the faster they form biomass.
The predictions in Figure 7 are likely too high, because real cells can only take up a limited amount of glucose. Besides this, high concentrations of glucose result in hypertonic solutions, which can be osmotically stressful for the cells. Nonetheless, this prediction indicates that increasing glucose levels above the original 5 g/L of the medium would increase the growth rate. As explained in chapter 1, suckerin-19 production must happen in batches. Faster growth means more batches can be produced and more suckerin-19 can be obtained. That is why we tested these predictions in a growth assay of flask cultures. SPG was supplemented with 5 g/L, 10 g/L and 20 g/L of glucose, respectively. The final optical density at 600 nm (OD600) was used to assess the biomass formation after 24 hours of growth (Fig. 8). These measurements showed increased final OD600 at increased glucose levels, suggesting that higher glucose concentrations increase growth. At 20 g/L, OD600 did not increase further, possibly due to the aforementioned osmotic stress. A more detailed account of the experiments can be found on the results page.
The results in Figure 8 are no definite proof that higher glucose levels in SPG result in a higher growth rate, but they do indicate that additional glucose has a positive effect on the growth rate. Due to time constraints, however, we were only able to test this prediction in the described flask culture growth assay. We attempted growth assays in 96-well plates, but these were unsuccessful due to negative stress effects on cell growth. Due to time constraints, a growth assay in a controlled bioreactor was also not possible. Thus, we cannot provide further validation of these results.
Figure 8. Comparison of OD600 of cultures grown in LB, as well as different formulations of SPG with varying glucose. The OD600 was measured 24 h after initial inoculation. The data was obtained in biological duplicates. The comparisons were made using a parametric t-test, p-values are shown. A significance level of 0.05 was chosen.
Suckerin-19 Production Model
As stated before, we assumed that suckerin-19 production did not occur in E. coli before induction by IPTG. This assumption allowed us to create a second metabolic model of E. coli. The first model, discussed until now, represented the growth phase of E. coli without suckerin-19 production. The model included the suckerin-19 reaction, but there was no flux through the reaction. The second model assumed that biomass production was at 10% of the maximum, but did in fact have a flux through the suckerin-19 reaction. Therefore, it represents the situation in the cells when suckerin-19 production had been induced by IPTG. We used the second model to determine maximum suckerin-19 production under the given constraints through FBA (Table 1). FBA with the suckerin-19 production as its objective yielded a maximum production rate of 0.0255 mmol/gDW/hr under the given constraints. This indicated that suckerin-19 production could occur in SPG minimal medium. This was tested by harvesting and purifying suckerin-19 in different SPG media formulations. An SDS-PAGE gel showed clear bands in each SPG medium formulation, indicating the presence of suckerin-19 (Fig. 9). A more detailed account of the experiments can be found on the results page.
Figure 9. SDS-PAGE of proteins from cultures grown in LB and SPG media formulations and purified using a His-Link purification kit. Lane 1: ladder; lanes 3+4: LB; lanes 5+6: SPG + 5 g/L glucose; lanes 7+8: SPG + 10 g/L glucose; lanes 9+10: SPG + 20 g/L glucose. Suckerin-19 has a mass of around 39 kDa.
To summarize, we implemented the biomass formation model to determine that glucose was the limiting metabolite in the SPG medium for growth (Fig. 7). This prediction was used to improve SPG medium. By supplementing SPG with increasing concentrations of glucose, higher levels of biomass were obtained (Fig. 8). This was vital in our project, as we relied on high biomass formation to produce enough suckerin-19. On top of this, the suckerin-19 production model predicted that suckerin-19 production was possible in SPG medium. We validated these predictions by harvesting and purifying suckerin-19 from different SPG media formulations (Fig. 9). This showed that the model had correctly predicted that SPG was a suitable medium for suckerin-19 production. According to these results, we recommend using these SPG formulations for suckerin-19 production in future experiments.
Chapter 4: Flux Variability Analysis & OptForce Algorithm
Up to this point, we have used our metabolic model of E. coli to optimize biomass formation by our strain in minimal SPG medium. This was done to replace LB medium, because this medium yielded low levels of suckerin-19 as shown on the results page. We determined that glucose was the limiting component in SPG and therefore supplemented the medium accordingly, resulting in higher biomass formation. On top of this, we constructed a second model, displaying suckerin-19 production. We used this model to determine if suckerin-19 would be produced in sufficient quantities in the SPG medium. Predictions of both models were experimentally confirmed, highlighting the importance of modeling in our project. In the following chapter, we will discuss our second approach to increase production: finding bottlenecks in suckerin-19 production.
Approach
To increase the production of suckerin-19 in E. coli, FVA (Flux Variability Analysis) and the OptForce Algorithm were used. These methods found reactions in the metabolic network that posed a bottleneck to the production of suckerin-19. Limiting reactions could be altered by genetic mutations, allowing for removal of these bottlenecks. This identifies suitable targets for future upscaling of suckerin-19 production.
Flux Variability Analysis (FVA)
FVA determines the minimum and maximum flux through each reaction in the metabolic network that can still satisfy a certain objective function. This means the flux through one objective reaction, for instance biomass production, is fixed at its maximum, which the model cannot change during the analysis. With biomass production locked at its maximum, the model varies the flux of all other reactions within the range that still satisfies the objective. To do this, the following optimization problem is solved:
$$ {\max_{v} \atop \min_{v}} \; \text{for} \, v_i\\[8pt] \text{subject to}\quad Sv = 0\\ c^Tv \ge \gamma Z_0\\ v_l \leq v \leq v_u $$
This problem directly builds on the results of FBA. FVA solves for the minimum and maximum flux ($\min_v$ and $\max_v$) through each reaction in the network ($v_i$) while the objective remains satisfied. Here, $c^Tv$ is the value of the objective for flux vector $v$, $Z_0$ is the optimal objective value found by FBA, and $\gamma$ is the fraction of this optimum that must be maintained. FVA therefore determines how much the remaining fluxes in $v$ can vary without lowering the objective below $\gamma Z_0$ [6].
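As a worked illustration on the toy network of Figure 1 (not on iJO1366), assume that all fluxes are non-negative, that the uptake reaction R1 is capped at a hypothetical $v_1 \le 10$, and that R7 is the objective reaction. The rows of $Sv = 0$ then give:

$$ v_1 = v_2 = v_7 = 2v_3 + v_4, \qquad v_5 = 2v_3, \qquad v_6 = v_4 $$

FBA therefore returns $Z_0 = 10$. With $\gamma = 1$, the constraint $c^Tv \ge \gamma Z_0$ fixes $2v_3 + v_4 = 10$, and minimizing and maximizing each flux in turn yields:

$$ v_1 = v_2 = v_7 = 10, \qquad 0 \le v_3 \le 5, \qquad 0 \le v_4 \le 10, \qquad 0 \le v_5 \le 10, \qquad 0 \le v_6 \le 10 $$

The fluxes on the main route are fully determined by the objective, whereas the two parallel branches can trade off against each other. These ranges are exactly the kind of output that is compared between two models in the next section.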
OptForce Algorithm
By using FVA analysis on the biomass production model and on the suckerin-19 production model, two sets of minimum and maximum reaction flux were obtained. This allowed us to compare metabolism between the two models and pick out reactions that had a high difference in flux range. Reactions that highly differ in flux are likely important in the production of suckerin-19 and are therefore a target for genetic modification.
Data generated by FVA of both models were compared using the OptForce algorithm. Since the two models have different objectives, growth and suckerin-19 production respectively, flux through reactions will differ between them. OptForce compares the flux range of each reaction between the two models, using the minima and maxima generated by FVA. When the flux ranges for a reaction do not overlap, flux through the reaction must change to produce more suckerin-19. OptForce places these reactions in a MUST set, a set of reactions that must change in flux to increase flux through the target reaction. Overlapping flux ranges are displayed in Figure 10A, whereas Figures 10B and 10C display no overlap. Reactions B and C therefore require a change in flux to produce more suckerin-19 [2, 14].
Figure 10. Flux range overlap of reactions in the biomass formation model and the suckerin-19 production model. Reaction A shows overlap in flux range, meaning that no change in flux is required to increase suckerin-19 production. Reaction B shows a higher range in the suckerin-19 production model, so its flux must increase. The algorithm places this reaction in the MUST U set. The opposite is true for reaction C, which is placed in the MUST L set.
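The range comparison for single reactions can be sketched in a few lines of Python. This is only the first-order comparison described above, not the OptForce implementation of the COBRA toolbox. The flux ranges are hypothetical numbers that mirror reactions A, B and C of Figure 10, and the function name `classify_must` is our own.

```python
def classify_must(range_growth, range_production):
    """Compare two (min, max) flux ranges for one reaction.

    Returns "MUST U" if the production range lies entirely above the growth
    range, "MUST L" if it lies entirely below, and None if the ranges overlap.
    """
    growth_min, growth_max = range_growth
    prod_min, prod_max = range_production
    if prod_min > growth_max:
        return "MUST U"
    if prod_max < growth_min:
        return "MUST L"
    return None


# Hypothetical FVA ranges in mmol/gDW/h for the two models.
fva_growth = {"A": (1.0, 4.0), "B": (0.5, 1.5), "C": (3.0, 5.0)}
fva_production = {"A": (2.0, 6.0), "B": (2.0, 3.0), "C": (0.0, 1.0)}

must_sets = {
    reaction: classify_must(fva_growth[reaction], fva_production[reaction])
    for reaction in fva_growth
}
print(must_sets)
```

Reaction A overlaps and is left unclassified, B ends up in MUST U and C in MUST L, matching the three cases in Figure 10.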
OptForce compares the flux range of every reaction between the two models using this method. The analysis creates five different MUST sets that describe the differences between the two models:
- MUST U: all single reactions that must increase in flux to optimize the target reaction
- MUST L: all single reactions that must decrease in flux to optimize the target reaction
- MUST UU: all sums of two reactions that must both increase in flux to optimize the target reaction
- MUST LL: all sums of two reactions that must both decrease in flux to optimize the target reaction
- MUST UL: all pairs of reactions of which the first must increase and the second must decrease in flux to optimize the target reaction
OptForce then compares all MUST sets and finds the minimum number of genetic knockouts or upregulations needed to satisfy the MUST sets. The result is a FORCE set, a set of reactions that can increase flux through the objective reaction after genetic intervention. Using these predictions, a strain can be genetically modified to increase the yield of a target compound, in our case suckerin-19 [14].
OptForce Results
The OptForce algorithm targeted the reactions displayed in Table 3 for genetic intervention. All targeted reactions are part of amino acid metabolism, which makes sense when producing proteins. According to these results, the largest bottleneck in suckerin-19 production in E. coli is histidine production. Reactions in histidine metabolism are the main target for intervention according to OptForce.
Table 3. Reactions targeted by the OptForce Algorithm for genetic intervention
| Reaction ID | Metabolic System | Regulation Type | Target Genes |
| --- | --- | --- | --- |
| PRMICI | Histidine | Upregulation | hisA |
| IGPDH | Histidine | Upregulation | hisB |
| PRATPP | Histidine | Upregulation | hisI |
| PRAMPC | Histidine | Upregulation | hisI |
| ATPPRT | Histidine | Upregulation | hisG |
| HSTPT | Histidine | Upregulation | hisC |
| HISTP | Histidine | Upregulation | hisB |
| IG3PS | Histidine | Upregulation | hisF & hisH |
| PPND | Tyrosine, Tryptophan and Phenylalanine | Upregulation | tyrA |
| IPPS | Valine, Leucine and Isoleucine | Upregulation | leuA |
| IPPMIb | Valine, Leucine and Isoleucine | Downregulation | leuC & leuD |
| IPMD | Valine, Leucine and Isoleucine | Upregulation | leuB |
Since histidine metabolism is such a large bottleneck in suckerin-19 production, we have visualized all reactions in this metabolism targeted by OptForce in Figure 11 [1]. When examining this metabolism, it becomes apparent that OptForce has targeted almost every reaction in it. This underlines the importance of histidine metabolism in the production of suckerin-19, making it a major target for genetic intervention. Due to time constraints, the predicted genetic interventions were not tested, but they pose interesting targets for future suckerin-19 research. Since histidine metabolism is tightly regulated, upregulating expression of the individual target genes might not yield higher suckerin-19 production. The indicated target genes in histidine metabolism are all part of the histidine operon. By targeting the regulation of this operon, we could possibly overexpress multiple target genes at once. By upregulating the entire histidine biosynthesis pathway, higher levels of histidine can be produced. The histidine operon is, however, also subject to attenuation at high concentrations of histidine in the cell, which stops further histidine biosynthesis. This regulation must be targeted as well if higher levels of histidine synthesis are to be achieved [5].
Figure 11. Overview of histidine metabolism in E. coli and the reactions targeted for genetic intervention by OptForce. This overview contains all metabolites and reactions used in histidine metabolism. Reaction IDs can be found in Table 3, together with the respective genetic targets. The model predicts that upregulating the indicated reactions will increase suckerin-19 production.
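The distribution of targets over the metabolic systems can be checked directly from Table 3. The Python sketch below only re-encodes the rows of that table and tallies them. The function names are our own illustrative choices and are not part of the modeling code used in the project.

```python
from collections import Counter

# Rows of Table 3: (reaction ID, metabolic system, regulation type, target genes).
TABLE_3 = [
    ("PRMICI", "Histidine", "Upregulation", ["hisA"]),
    ("IGPDH", "Histidine", "Upregulation", ["hisB"]),
    ("PRATPP", "Histidine", "Upregulation", ["hisI"]),
    ("PRAMPC", "Histidine", "Upregulation", ["hisI"]),
    ("ATPPRT", "Histidine", "Upregulation", ["hisG"]),
    ("HSTPT", "Histidine", "Upregulation", ["hisC"]),
    ("HISTP", "Histidine", "Upregulation", ["hisB"]),
    ("IG3PS", "Histidine", "Upregulation", ["hisF", "hisH"]),
    ("PPND", "Tyrosine, Tryptophan and Phenylalanine", "Upregulation", ["tyrA"]),
    ("IPPS", "Valine, Leucine and Isoleucine", "Upregulation", ["leuA"]),
    ("IPPMIb", "Valine, Leucine and Isoleucine", "Downregulation", ["leuC", "leuD"]),
    ("IPMD", "Valine, Leucine and Isoleucine", "Upregulation", ["leuB"]),
]


def targets_per_system(rows):
    """Count how many targeted reactions fall in each metabolic system."""
    return Counter(system for _, system, _, _ in rows)


def genes_for_system(rows, system):
    """Return the sorted, de-duplicated target genes of one metabolic system."""
    return sorted({g for _, s, _, genes in rows if s == system for g in genes})


def reactions_with_regulation(rows, regulation):
    """Return the reaction IDs that share a given regulation type."""
    return [rxn for rxn, _, reg, _ in rows if reg == regulation]


if __name__ == "__main__":
    for system, count in targets_per_system(TABLE_3).most_common():
        print(f"{system}: {count} targeted reaction(s)")
    print("Histidine target genes:", ", ".join(genes_for_system(TABLE_3, "Histidine")))
    print("Downregulated:", ", ".join(reactions_with_regulation(TABLE_3, "Downregulation")))
```

Eight of the twelve targeted reactions belong to histidine metabolism, and together they map to seven distinct his genes, which is why intervening at the level of the operon is attractive.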
To summarize, we compared reaction fluxes between the biomass formation model and the suckerin-19 production model to determine which reactions were bottlenecks in suckerin-19 production. The OptForce algorithm identified reactions in amino acid biosynthesis as bottlenecks, and histidine biosynthesis in particular. Upregulating histidine production in the cells is, therefore, a promising strategy for increasing suckerin-19 production. To achieve this, either the target genes on the histidine operon or the regulation of the operon must be genetically altered. Due to time constraints, we were not able to implement these predictions in our own research. This information can, however, be used in future research to further improve suckerin-19 production.
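The idea of comparing flux ranges between the two models can be illustrated with a minimal Python sketch. It is a simplified, first-order version of the comparison that OptForce performs: a reaction is flagged when its flux range in the overproducing state does not overlap its range in the wild-type state. It is not the COBRA Toolbox code used in our project. The reaction names `RXN_A`, `RXN_B` and `RXN_C` and all flux values are hypothetical placeholders, not model output, and the sketch assumes fluxes are reported in the forward direction.

```python
from typing import Dict, Tuple

FluxRange = Tuple[float, float]  # (minimum flux, maximum flux)


def classify_reaction(wild_type: FluxRange, overproducing: FluxRange,
                      tol: float = 1e-9) -> str:
    """Compare two flux ranges and report whether they are disjoint."""
    wt_min, wt_max = wild_type
    op_min, op_max = overproducing
    if op_min > wt_max + tol:
        # Overproduction needs more flux than the wild type can ever carry.
        return "upregulate"
    if op_max < wt_min - tol:
        # Overproduction needs less flux than the wild type always carries.
        return "downregulate"
    return "no change required"


def find_targets(wild_type_ranges: Dict[str, FluxRange],
                 overproducing_ranges: Dict[str, FluxRange]) -> Dict[str, str]:
    """Return only the reactions whose flux ranges do not overlap."""
    targets = {}
    for rxn, wt_range in wild_type_ranges.items():
        if rxn not in overproducing_ranges:
            continue  # reaction absent from the second model, skip it
        label = classify_reaction(wt_range, overproducing_ranges[rxn])
        if label != "no change required":
            targets[rxn] = label
    return targets


if __name__ == "__main__":
    # Hypothetical flux ranges, invented purely for illustration.
    biomass_model = {"RXN_A": (0.1, 0.5), "RXN_B": (2.0, 3.0), "RXN_C": (0.0, 1.0)}
    production_model = {"RXN_A": (0.8, 1.2), "RXN_B": (0.5, 1.0), "RXN_C": (0.2, 0.9)}
    for rxn, label in find_targets(biomass_model, production_model).items():
        print(rxn, "->", label)
```

In this toy input, `RXN_A` is flagged for upregulation and `RXN_B` for downregulation, while `RXN_C` is left alone because its two ranges overlap. The full OptForce procedure goes further than this sketch, for example by also considering combinations of reactions.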
Conclusions & Future Prospects
Chapter 5: Conclusions & Future Prospects
At the end of our modeling efforts, we had successfully established two models of E. coli metabolism and used them to improve suckerin-19 production. The first model simulated biomass formation in E. coli. Since growth in LB did not yield sufficient levels of suckerin-19, we used this model to find a replacement for LB medium in our bioreactor setup. Guided by model predictions, we optimized a defined minimal medium, SPG, for biomass formation. In this way, the model strongly influenced the scale-up of suckerin-19 production.
A second metabolic model of E. coli was then created, simulating suckerin-19 production after IPTG induction. We used this model to predict whether E. coli was able to produce suckerin-19 in SPG medium. As this proved to be the case, we implemented SPG in our bioreactor setup. Due to time constraints, we were not able to optimize SPG medium any further. The second model was also used to determine which reactions were bottlenecks in suckerin-19 production, which was the second goal of our modeling efforts. By comparing reaction flux ranges from the biomass formation model and the suckerin-19 production model using the OptForce algorithm, bottleneck reactions could be identified. Histidine biosynthesis in particular was a large bottleneck. Genetic intervention in the histidine operon or its regulation system could increase histidine production, alleviating the bottleneck. However, the regulation of the histidine operon is complex, so we were not able to implement these predictions [5].
In the end, we explored multiple approaches to increase suckerin-19 production in E. coli, and both models contributed substantially to our ability to scale up the production of suckerin-19.
The resulting medium improvements and the identification of promising genetic interventions form a solid basis for future research. Our model page also serves as a guide for other iGEM teams that want to use metabolic modeling. We have provided our entire model and all results generated by it on GitHub, so other teams can use it to their advantage. We hope to inspire other teams to use metabolic modeling in their projects, since it was so vital to the success of our own project.
Appendix
References
1. Z. King, A. Dräger, A. Ebrahim, N. Sonnenschein, N. Lewis, and B. Palsson, “Escher: A web application for building, sharing, and embedding data-rich visualizations of biological pathways”, PLoS Computational Biology. 2015
2. S. Ranganathan, P. Suthers, and C. Maranas, “OptForce: An Optimization Procedure for Identifying All Genetic Manipulations Leading to Targeted Overproductions”, PLoS Computational Biology. 2010
3. H. Kim, S. Kim, and S. Yoon, “Metabolic network reconstruction and phenome analysis of the industrial microbe, Escherichia coli BL21(DE3)”, PLoS ONE. 2018
4. Z. King, J. Lu, A. Dräger, P. Miller, S. Federowicz, J. Lerman, A. Ebrahim, B. Palsson, and N. Lewis, “BiGG Models: A platform for integrating, standardizing, and sharing genome-scale models”, Nucleic Acids Research. 2016
5. M. Winkler and S. Ramos-Montañez, “Biosynthesis of Histidine”, EcoSal Plus. 2009
6. V. Vlasov, “Flux Variability analysis (FVA)”, COBRA Toolbox tutorials. 2017
7. S. Hinton, “E. coli Core Model for Beginners (PART 1)”, COBRA Toolbox tutorials. 2017
8. S. Hinton, “E. coli Core Model for Beginners (PART 2)”, COBRA Toolbox tutorials. 2017
9. S. Hinton, “E. coli Core Model for Beginners (PART 3)”, COBRA Toolbox tutorials. 2017
10. T. Pfau, “Creating a Model”, COBRA Toolbox tutorials. 2017
11. V. Vlasov and T. Pfau, “Model manipulation”, COBRA Toolbox tutorials. 2017
12. C. Thakur, M. Brown, J. Sama, M. Jackson, and T. Dayie, “Growth of wildtype and mutant E. coli strains in minimal media for optimal production of nucleic acids for preparing labeled nucleotides”, Appl. Microbiol. Biotechnol. 2010
13. F. Studier, “Protein production by auto-induction in high-density shaking cultures”, Protein Expression & Purification. 2005
14. S. Mendoza, “OptForce”, COBRA Toolbox tutorials. 2017
15. J. Orth, I. Thiele, and B. Palsson, “What is flux balance analysis?”, Nat Biotechnol. 2010
16. D. Bier, “The Role of Protein and Amino Acids in Sustaining and Enhancing Performance”, National Academies Press (US). 1999