S.P.L.A.S.H.
Suckerin Polymer Layer to Achieve Sustainable Health
Model
Mathematical modeling provides an excellent platform for both the design and improvement of an experimental setup. Large scale production of suckerin proved challenging in our project. Thus, we created a model of E. coli metabolism, including the suckerin-19 reaction, to optimize production conditions. Suckerin yield after growth in Luria-Bertani (LB) medium in our bioreactors was low. Based on model predictions, we improved Studier Phosphate Glucose (SPG) medium to replace LB. Growth in this new medium yielded increased levels of suckerin-19 as predicted. A second model, simulating suckerin-19 production, was then created to predict metabolic reactions that serve as bottlenecks for suckerin production. These reactions were viable targets for genetic intervention, further improving suckerin-19 production.
Our modeling page also provides a step-by-step guide through the making of our model, which future iGEM teams can use as a reference. This way, we hope to inspire other teams to implement metabolic modeling in their projects, as it was vital to the success of our own project. To this end, we also provide all our modeling code on GitHub to aid future modeling ventures.
Chapter 1: Background & Goals
The number of burn wound victims around the world amounts to a staggering 11 million. Treating so many patients with hydrogels will require vast quantities of suckerin protein, so scaling up suckerin production is key to our project. To predict how protein production could be increased, we developed a model of the metabolism of Escherichia coli (E. coli). At the start of our production process, it was uncertain whether we would be able to make our own suckerin production constructs. Therefore, we decided to base the model on the suckerin-19 producing organism carrying the plasmid pQE80-L SRT-19, provided by the Miserez group. Suckerin-19 is not secreted after production but remains confined inside the cell. This made continuous production impossible, so a batch process was required. Suckerin-19 production was under the control of the PLac promoter and induced by isopropyl β-D-1-thiogalactopyranoside (IPTG). Because we do not have any control over the reactions contributing to suckerin-19 production, we assumed that the largest absolute amount of suckerin-19 would be obtained with as much biomass as possible at the time of induction. Moreover, we assumed that induction of suckerin-19 production diverts all substrate use from biomass formation to suckerin-19 production.
Multiple organisms are suitable for large-scale protein production. In our project, we proposed using Saccharomyces cerevisiae, Streptomyces lividans, and Escherichia coli. This would require the construction of three separate models. However, due to time constraints, we constructed only a model of the relatively simple metabolism of E. coli. To model E. coli metabolism, we made use of the COnstraint-Based Reconstruction and Analysis (COBRA) toolbox in the MATLAB modeling environment. The COBRA toolbox provides in-depth analysis methods for biomass formation under specified growth conditions (see chapter 2). Our model is based on the iJO1366 model, the latest representation of E. coli metabolism in the COBRA environment [4].
During our project, large scale production of suckerin-19 proved challenging. E. coli growth in LB medium did not yield high levels of suckerin-19 after purification. The viscosity of the LB medium likely hampered the suckerin purification, which explains our low yield. By implementing predictions made by the model of E. coli metabolism, we set out to improve suckerin-19 yield. Two approaches were implemented, the first being increased biomass formation, for which we developed and optimized a minimal medium according to the model predictions to replace LB in bioreactors. The second approach predicted metabolic reactions that would bottleneck suckerin-19 production. By genetically modifying the genes controlling these reactions, higher levels of suckerin production could be achieved. We will explain both approaches in detail in the following chapters. The flowchart below describes the different stages in the development of our metabolic model and will be present at the start of each chapter to describe our progress.
Our modeling page not only describes our modeling efforts, but also serves as a guide for other iGEM teams that want to use a metabolic model in their project. The COBRA toolbox provides an ideal platform for metabolic modeling, but it is very complex. We attempt to concisely describe the basics of working in the COBRA environment and make the platform more accessible for new users. To achieve this, we also provide our model and all results generated by it on GitHub. Any team can access our model and use it to further their research or improve their own model. By doing this, we hope to stimulate other teams to implement metabolic modeling in their projects, as it proved invaluable to the success of our own project.
Chapter 2: COBRA Toolbox Explanation & Biomass Formation Model
In the previous chapter, we decided that our goal was to find experimental conditions for which the production of suckerin-19 is maximal. Furthermore, we settled on using the COBRA toolbox environment for our modeling efforts. Our next step was the construction of a metabolic model that represented biomass formation of E. coli. This model would allow us to simulate the maximum lean biomass formation (biomass formation excluding the suckerin-19 protein) of E. coli cells, as we assumed that this would allow us to obtain a larger amount of suckerin-19.
COBRA Toolbox Modeling
COBRA toolbox modeling visualizes flux through metabolic reactions during biomass formation. This enables the model to predict modifications to media compositions that would increase biomass formation and suckerin-19 production. This, however, requires data generated in a controlled environment to calibrate the model. Because bioreactors are capable of continuously monitoring multiple parameters, like pH, oxygen uptake and carbon dioxide production, they provide the ideal source of data for metabolic modeling using the COBRA toolbox. The basics behind the toolbox will be explained in this chapter.
The COBRA toolbox is used to both model and visualize the reactions that take place in a cellular environment. The toolbox bases itself on a network of reactions that use and produce metabolites and the genes that influence these reactions. The following simplified network explains how the software simulates the metabolism of living cells:
Figure 1. Simplified representation of metabolic reactions in the COBRA toolbox. Reactions are displayed in green and metabolites in red.
The network in figure 1 shows a basic reaction pathway in a cell and the metabolites that are consumed and produced. The toolbox assigns a stoichiometric coefficient to each metabolite in each reaction: negative when the metabolite is consumed and positive when it is produced. In reaction 3, for instance, metabolite B has a coefficient of -2 and C of 2. This means two B metabolites are used in the reaction and two C metabolites are produced. Doing this for every reaction in the network yields an ($m\times n$) matrix, the stoichiometric matrix ($S$), where $m$ is the number of metabolites and $n$ the number of reactions:
$$\qquad \quad \textcolor{grey}{ \begin{matrix} R1\! & R2\! & R3\! & R4\! & R5\! & R6\! & R7\! \end{matrix}}\\[5pt] S =\textcolor{grey}{\begin{matrix} A\\ B\\ C\\ D\\ E \end{matrix}} \; \begin{bmatrix} 1 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & -2 & -1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & -1 \end{bmatrix} $$
Stoichiometric matrix ($S$) based on the metabolic reactions in figure 1. Every metabolite gets assigned a stoichiometric coefficient based on the reaction coefficients of the respective reaction.
This basic concept is used to display every reaction that happens in the cell, including exchanges with the extracellular environment [7, 15].
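As a minimal sketch of how such a matrix can be assembled, the Python snippet below rebuilds $S$ from a list of reactions. The reaction definitions are read off the matrix above (figure 1 itself is not reproduced here), and the helper name `build_stoichiometric_matrix` is hypothetical; it is not a COBRA toolbox function.

```python
# Toy network read off the matrix S above.
# Negative coefficient = metabolite consumed, positive = metabolite produced.
reactions = {
    "R1": {"A": 1},               # uptake:  -> A
    "R2": {"A": -1, "B": 1},      # A -> B
    "R3": {"B": -2, "C": 2},      # 2 B -> 2 C
    "R4": {"B": -1, "D": 1},      # B -> D
    "R5": {"C": -1, "E": 1},      # C -> E
    "R6": {"D": -1, "E": 1},      # D -> E
    "R7": {"E": -1},              # export: E ->
}
metabolites = ["A", "B", "C", "D", "E"]


def build_stoichiometric_matrix(reactions, metabolites):
    """Return S as a list of rows: one row per metabolite, one column per reaction."""
    return [
        [coeffs.get(met, 0) for coeffs in reactions.values()]
        for met in metabolites
    ]


S = build_stoichiometric_matrix(reactions, metabolites)
for met, row in zip(metabolites, S):
    print(met, row)
```

Storing each reaction as a small dictionary keeps the definition readable; metabolites that do not take part in a reaction simply get a coefficient of 0.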
Besides the stoichiometric matrix ($S$), the model requires a second variable: the flux through each of these reactions. This is the reaction rate in mmol/gDW/h (mmol per gram Dry Weight per hour) for each reaction. The flux of every reaction is stored in the flux vector ($v$), of size $n$ (number of reactions), and is used in the following equation:
$$ Sv = 0 $$
This equation represents the mass balance over the system in a steady state, as shown in figure 2. The model assumes that reactions are much faster than the time frame being simulated, so that any small perturbation quickly returns to steady state; this is the pseudo-steady-state assumption. The flux vector ($v$) represents the flux through all reactions in the network and the stoichiometric matrix ($S$) holds the stoichiometric coefficients of all metabolites in those reactions. Together, they represent the mass balance equations that make up the metabolism in the model.
Figure 2. Depiction of how $Sv = 0$ represents the mass balance over a system in a steady state in the model. Includes the stoichiometric matrix ($S$) and the flux vector ($v$).
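The steady-state condition can be checked numerically for any candidate flux vector. The sketch below does this for the toy network of figure 1; the two flux vectors are made-up values chosen for illustration, and the function names are hypothetical.

```python
# Stoichiometric matrix of the toy network (rows A-E, columns R1-R7).
S = [
    [1, -1,  0,  0,  0,  0,  0],
    [0,  1, -2, -1,  0,  0,  0],
    [0,  0,  2,  0, -1,  0,  0],
    [0,  0,  0,  1,  0, -1,  0],
    [0,  0,  0,  0,  1,  1, -1],
]


def net_production(S, v):
    """Compute the product S v: the net production rate of every metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when no metabolite accumulates or is depleted, i.e. Sv = 0."""
    return all(abs(x) <= tol for x in net_production(S, v))


v_balanced = [3, 3, 1, 1, 2, 1, 3]    # hypothetical fluxes in mmol/gDW/h
v_unbalanced = [3, 3, 1, 1, 1, 1, 3]  # R5 too slow: C accumulates, E is depleted

print(net_production(S, v_balanced), is_steady_state(S, v_balanced))
print(net_production(S, v_unbalanced), is_steady_state(S, v_unbalanced))
```

A non-zero entry in the result points directly at the metabolite whose mass balance is violated.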
In a large metabolic model, there are more reactions than metabolites, which means $n>m$. Thus, there are more unknown variables than equations at steady state, so this underdetermined system has no unique solution. Because of this, COBRA metabolic modeling requires additional constraints on the fluxes. The model starts with the bounds on every flux in the flux vector ($v$) set as wide as possible, meaning that every reaction can run as fast as physically possible. This is not the case in the actual organism, so the allowed reaction fluxes can be lowered to introduce constraints to the model. This means, for instance, that the model has to run using a limited glucose uptake flux and adapt maximum biomass formation to these conditions [7, 15].
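For the toy network of figure 1, this underdetermination can be written out explicitly. Reading $Sv = 0$ row by row for the matrix shown earlier gives five equations in seven unknowns:

$$ v_1 = v_2, \qquad v_2 = 2v_3 + v_4, \qquad v_5 = 2v_3, \qquad v_6 = v_4, \qquad v_7 = v_5 + v_6 $$

Two fluxes, for instance $v_3$ and $v_4$, can therefore be chosen freely, and every choice gives a valid steady state with $v_7 = 2v_3 + v_4$. A bound narrows this family down: with a hypothetical uptake limit $v_1 \le 10$ mmol/gDW/h, the relation $v_1 = 2v_3 + v_4$ implies $2v_3 + v_4 \le 10$, so the export flux $v_7$ can no longer exceed 10.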
Biomass Formation Model
Creating a complex model for E. coli from scratch is time-consuming and laborious. Therefore, we decided to adapt and build on an existing reconstruction of E. coli metabolism. This model, known as iJO1366, incorporates 1805 metabolites, 2583 reactions, and 1367 genes in E. coli metabolism [4]. It is capable of simulating growth under specific conditions, simulating what happens in living cells. The limitation of using this model is that it represents a general strain, E. coli K-12, and not the E. coli strain Rosetta used in our experiments. The differences between the two strains could impose slight variations between the model and reality, but no major differences are expected.
The model was modified by adding the reaction for the production of suckerin-19 as shown in figure 3. This more accurately represents the conditions in our strain than a model that does not contain the suckerin-19 production reaction. The amino acid and ATP components of suckerin-19 were obtained from literature [16]. Important in this reaction is the ATP to ADP reaction providing energy. If energy in the form of ATP is used in a reaction, new ATP must be created in another reaction in the cell to compensate, which changes the calculated fluxes.
Figure 3. The suckerin-19 production reaction as added to the model. Each metabolite has its respective stoichiometric coefficient, which is added to the stoichiometric matrix ($S$).
The problem with this reaction is what the model does after suckerin-19 is produced. After production, the protein will have no place to go in the model, since it is not used or broken down. Because of this, the $Sv = 0$ condition becomes impossible. We created a suckerin-19 sink, simulating the accumulation of suckerin-19 in the cell. This basically instructs the model that the protein must be removed from the system after production, which restores the possibility for the $Sv = 0$ condition and solves the problem of suckerin-19 accumulation.
The question remains if this sink reaction is representative of what actually happens in the cells. Fortunately, the sink reaction comes quite close to what E. coli actually does with suckerin, since the protein is stored in inclusion bodies by the organism, as shown in figure 4. This removes the protein from the system, which is similar to what is modeled with a sink [10, 11].
Figure 4. After production, suckerin-19 is stored in inclusion bodies. This process is simulated by the suckerin-19 sink reaction in the model, which removes the protein from the system in a similar fashion.
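The effect of the sink on the mass balance can be illustrated with a two-metabolite sketch. The stoichiometry here is deliberately simplified and hypothetical; it does not reproduce the actual suckerin-19 reaction of figure 3.

```python
# Rows: metabolites (precursor, suckerin); columns: reactions.
# Without a sink, the only reaction touching suckerin is its production.
S_no_sink = [
    [1, -1],   # precursor: supplied by uptake, consumed by production
    [0,  1],   # suckerin:  produced, never removed
]
# Adding a sink column (suckerin -> nothing) gives the protein somewhere to go.
S_with_sink = [
    [1, -1,  0],
    [0,  1, -1],
]


def residual(S, v):
    """Net production rate of each metabolite, i.e. the product S v."""
    return [sum(s * f for s, f in zip(row, v)) for row in S]


# Producing suckerin at rate 2 without a sink leaves a non-zero balance:
# the suckerin row is +2, so the protein accumulates and Sv = 0 fails.
print(residual(S_no_sink, [2, 2]))

# With the sink running at the same rate, every balance is zero again.
print(residual(S_with_sink, [2, 2, 2]))
```

Without the sink column, the only steady state has zero production flux, which is why the sink is needed before the production reaction can carry any flux.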
Flux Balance Analysis (FBA)
The first step in improving suckerin production is to determine what the maximum growth and production rates are for E. coli. To this end, Flux Balance Analysis (FBA) was used, which makes use of the predetermined reaction fluxes in the metabolic network to determine the maximum growth rate and production under specific conditions.
The challenge with FBA is that there are multiple ways to satisfy the condition $Sv = 0$, since the model has no defined objective to optimize. This creates an infinite solution space, in which an optimal $v$ for all reaction fluxes cannot be found. Because of this, the introduction of an objective function is required. This is a goal that the model must achieve, for instance maximizing biomass formation. The objective can be any reaction and serves to decrease the solution space in the metabolic network, allowing for the determination of an optimal flux vector ($v$). Determination of the objective function is displayed by the following equation:
$$ \max_{v} \; Z = c^Tv\\[8pt] \text{subject to}\quad Sv = 0\\ v_l \leq v \leq v_u $$
In this formula, $Z$ represents the objective function and $v$ the flux vector, which is bounded below and above by $v_l$ and $v_u$. Vector $c^T$ contains the weight with which each reaction contributes to the objective function. Normally, only one reaction is optimized using FBA, meaning that all elements of vector $c$ equal zero except the element of the objective reaction, which equals 1. The determination of the objective function is visualized in figure 5 [8, 9, 15].
Figure 5. Determination of the objective function of the model. Since the objective function is one reaction in the model, $m = 1$. This means only one reaction in the model is set as $Z$, the objective. Because of this, $Z$ is satisfied by $c^T$.
FBA uses linear programming to find, among all flux vectors that satisfy $Sv = 0$, the combination of fluxes that maximizes the objective function. The solution space is defined by the upper ($v_u$) and lower ($v_l$) flux bounds of the reactions in the network, which are the constraints that were imposed upon the model in the previous section. FBA is able to find one optimal flux distribution for vector $v$ in the solution space that satisfies the objective, as displayed in figure 6.
When using a different objective function than biomass production in FBA, the analysis by default assumes that biomass production is 10% of the maximum value. This is done to prevent solutions in the analysis without the production of biomass. However, this influences how accurately the model represents reality, since growth rates can be higher or lower. In order to refine the model, measurements of biomass formation at maximum suckerin-19 production were required, which was not possible due to time constraints [8, 9, 15].
Figure 6. FBA uses the solution space defined by $v_u$ and $v_l$ to maximize objective $Z$ for $Sv = 0$. This means FBA will find an optimal flux set for all reactions in vector $v$ to maximize the flux through the objective reaction $Z$.
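To make the optimization concrete, the sketch below runs FBA on the seven-reaction toy network of figure 1 using SciPy's general-purpose `linprog` solver in Python. This is an illustration only, not the COBRA toolbox implementation used in the project; treating R7 as the objective reaction and limiting R1 to 10 mmol/gDW/h are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network from figure 1 (rows A-E, columns R1-R7).
S = np.array([
    [1, -1,  0,  0,  0,  0,  0],
    [0,  1, -2, -1,  0,  0,  0],
    [0,  0,  2,  0, -1,  0,  0],
    [0,  0,  0,  1,  0, -1,  0],
    [0,  0,  0,  0,  1,  1, -1],
], dtype=float)
n = S.shape[1]

# Objective weights c: only R7 (standing in for the biomass reaction) counts.
c = np.zeros(n)
c[6] = 1.0

# Flux bounds (v_l, v_u); the uptake limit of 10 on R1 is a made-up value.
bounds = [(0, 10)] + [(0, 1000)] * (n - 1)

# linprog minimizes, so c^T v is maximized by minimizing -c^T v under Sv = 0.
result = linprog(-c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                 bounds=bounds, method="highs")
Z = -result.fun

print("solver message:", result.message)
print("optimal objective Z =", Z)
print("one optimal flux vector v =", result.x)
```

The optimal objective value is unique, but the flux vector that achieves it usually is not: here the flux can be split between the route via C and the route via D without changing $Z$.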
Chapter 3: Flux Balance Analysis & Suckerin-19 Production Model
So far, we have determined that our goal is to scale up suckerin-19 production in E. coli. To achieve this, we have constructed a metabolic model of E. coli in the COBRA toolbox environment. Our first objective was to change the medium used in our bioreactor setup, since LB was not suitable for high levels of suckerin-19 production. To do this, we needed to impose constraints on the model to simulate the behavior of real cells in a different medium.
Approach
As previously mentioned, the first goal of our model was to find a medium to replace LB in a bioreactor setup. Because of this, uptake reactions in the model were constrained based on concentrations of the metabolites in Studier Phosphate Glucose (SPG) minimal medium (Table 1). Under these conditions, maximum biomass formation was calculated for E. coli using FBA. FBA also allowed for the prediction of metabolites in SPG that were limiting for E. coli growth. This enabled us to supplement the medium with metabolites that were otherwise limiting and provide more optimal conditions for biomass formation. Next, a second metabolic model was created. The first model represents growth before induction of suckerin-19 production, whereas the second model represents suckerin-19 production after induction. The production of suckerin-19 was optimized by using FBA with suckerin-19 production as its objective function. This FBA predicted the maximum suckerin-19 yield in SPG medium.
FBA for Biomass Formation in Studier Phosphate Glucose (SPG) Medium
FBA uses predetermined reaction flux constraints to determine the maximum possible rate for a selected objective reaction. FBA solves $Sv = 0$ for this objective function, which optimizes the flux vector ($v$) for this objective function by changing flux in other metabolic reactions. This yields the highest possible rate for the selected reaction under the introduced constraints. See chapter 2 for an in-depth explanation of FBA.
We based our metabolic model on growth in SPG minimal medium (Table 1) [12, 13]. To produce realistic FBA results, constraints were imposed upon the model according to this medium. We converted the concentrations of metabolites in SPG (Table 1) [12, 13] to mmol/L and used these values as uptake constraints (Table 2) to simulate metabolite uptake from the SPG medium by E. coli cells. Using metabolite concentrations as constraints is a common method in FBA when no experimental data is available, which makes this a reasonable assumption [3]. Further refinement of uptake rates can be done using GC-MS.
Table 1. SPG medium ingredients and concentrations. These concentrations were used to define the constraints shown in Table 2 [12, 13].
| Ingredient | Concentration (g/L) |
| --- | --- |
| (NH4)2SO4 | 3.3 |
| KH2PO4 | 6.8 |
| Na2HPO4 | 7.1 |
| MgSO4 | 0.24 |
| Glucose | 5 |
Table 2. Constraints imposed on exchange reactions in the model based on SPG minimal medium. Each constraint is based on the concentrations of medium ingredients as displayed in Table 1.
| Exchange reaction in the model | Imposed uptake constraint (mmol/gDW/h) |
| --- | --- |
| Ammonium Exchange | 49.9472 |
| Hydrogen Exchange | 149.9517 |
| Phosphate Exchange | 99.9831 |
| Glucose Exchange | 27.7537 |
| Magnesium Exchange | 1.9938 |
| Potassium Exchange | 49.9685 |
| Sodium Exchange | 50.0145 |
| Sulphate Exchange | 26.9675 |
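Several Table 2 values appear to follow from converting the Table 1 concentrations to mmol/L and counting the ions that each salt releases. The sketch below shows that conversion for three of the entries. The molar masses are standard rounded values supplied for illustration (they are not given in the source), and the helper name is hypothetical.

```python
# Molar masses in g/mol (standard rounded values; an assumption of this sketch).
MOLAR_MASS = {
    "glucose": 180.156,
    "(NH4)2SO4": 132.14,
    "MgSO4": 120.37,
}
# Concentrations in g/L, taken from Table 1.
SPG_G_PER_L = {"glucose": 5.0, "(NH4)2SO4": 3.3, "MgSO4": 0.24}


def g_per_l_to_mmol_per_l(compound):
    """Convert a Table 1 concentration from g/L to mmol/L."""
    return 1000.0 * SPG_G_PER_L[compound] / MOLAR_MASS[compound]


glucose = g_per_l_to_mmol_per_l("glucose")
# Each (NH4)2SO4 formula unit releases two ammonium ions.
ammonium = 2 * g_per_l_to_mmol_per_l("(NH4)2SO4")
magnesium = g_per_l_to_mmol_per_l("MgSO4")

for name, value in [("glucose", glucose),
                    ("ammonium", ammonium),
                    ("magnesium", magnesium)]:
    print(f"{name}: {value:.2f} mmol/L")
```

With these molar masses, the results land within rounding distance of the glucose, ammonium and magnesium entries in Table 2.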
FBA for biomass formation under the constraints in Table 2 yielded a maximum growth rate of 0.9824 mole biomass generated per mole biomass present per hour. Because the model was not constrained besides the exchange reactions in Table 2, this growth rate is likely higher than in real cells. The lack of further constraints makes the model assume almost ideal growth conditions. In reality, the maximum growth rate is far more limited due to tougher growth conditions than simulated by the model.
Though the growth rates predicted by FBA were likely too high, they could still be used to predict which medium components were limiting for growth. This was achieved by successively increasing and decreasing the flux bound of each exchange reaction in Table 2. Every time a bound was changed, FBA was executed to optimize biomass formation. The resulting growth rates only increased when changing the glucose exchange rate, as depicted in figure 7. The model thus predicted that glucose was the only limiting ingredient in SPG minimal medium, and that adding more glucose would yield faster biomass formation.
Figure 7. Graph displaying FBA results for biomass formation under increasing glucose uptake rates. The more mmol/gDW/h glucose cells are allowed to take up, the faster they form biomass.
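The scan for limiting medium components can be sketched on a deliberately simple model. The assumption here is a toy network in which the biomass reaction draws a fixed amount of each nutrient directly from its exchange reaction; for that special case the FBA optimum has a closed form, so no solver is needed. All requirements and bounds are made-up numbers, not values from iJO1366 or Table 2.

```python
# Hypothetical toy model: biomass needs fixed amounts of each nutrient per unit
# of growth, so the optimum is set by whichever uptake bound runs out first.
REQUIREMENT = {"glucose": 1.0, "ammonium": 0.5, "phosphate": 0.1}        # made up
UPTAKE_BOUND = {"glucose": 10.0, "ammonium": 50.0, "phosphate": 100.0}   # made up


def max_growth(bounds):
    """Optimal biomass flux for this toy model: min over nutrients of bound / need."""
    return min(bounds[nutrient] / REQUIREMENT[nutrient] for nutrient in REQUIREMENT)


def limiting_nutrients(bounds, factor=2.0):
    """Relax each uptake bound in turn and report the ones that raise growth."""
    baseline = max_growth(bounds)
    limiting = []
    for nutrient in bounds:
        relaxed = dict(bounds)
        relaxed[nutrient] *= factor
        if max_growth(relaxed) > baseline + 1e-9:
            limiting.append(nutrient)
    return limiting


print("baseline growth:", max_growth(UPTAKE_BOUND))
print("limiting nutrients:", limiting_nutrients(UPTAKE_BOUND))
```

In a genome-scale model the call to `max_growth` would be replaced by a full FBA run, but the logic of the scan stays the same: change one bound, re-optimize, and compare with the baseline.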
Predicted uptake rates in figure 7 are higher than possible in real cells, due to limitations in glucose uptake. Besides this, high concentrations of glucose result in hypertonic solutions, which can be osmotically stressful for the cells. Nonetheless, these predictions gave an indication that increasing glucose levels above the original 5 g/L of the medium would increase the growth rate. As explained in chapter 1, suckerin-19 production must happen in batches. Faster growth means more batches can be produced and more suckerin-19 can be obtained. That is why we implemented these predictions in a growth assay using flask cultures. SPG was supplemented with 5 g/L, 10 g/L and 20 g/L of glucose, respectively. The final optical density at 600 nm (OD600) was used to assess the biomass formation after 24 hours of growth (Fig. 8). These measurements showed increased final OD600 at increased glucose levels, suggesting that higher glucose concentrations increase the growth. At 20 g/L, OD600 did not increase further, due to the aforementioned osmotic stress. A more detailed account of the experiments can be found on the results page.
The results in figure 8 are no definite proof that higher glucose levels in SPG result in a higher growth rate, but they do indicate that additional glucose has a positive effect on the growth rate. Due to time constraints, however, we were only able to test this prediction in the described flask culture growth assay. We attempted growth assays in 96-well plates, but these were unsuccessful due to negative stress effects on cell growth. A growth assay in a controlled bioreactor was also not possible. Thus, we cannot provide further validation of these results.
Figure 8. Comparison of OD600 of cultures grown in LB, as well as different formulations of SPG with varying glucose. The OD600 was measured 24 h after initial inoculation. The data was obtained in biological duplicates. The comparisons were made using a parametric t-test, p-values are shown. A significance level of 0.05 was chosen.
Suckerin-19 Production Model
As stated before, we assumed that suckerin-19 production did not occur in E. coli before induction by IPTG. This assumption allowed us to create a second metabolic model of E. coli. The first model, discussed until now, represented the growth phase of E. coli without suckerin-19 production. The model included the suckerin-19 reaction, but there was no flux through the reaction. The second model assumed that biomass production was at 10% of the maximum, but did have a flux through the suckerin-19 reaction. Therefore, it represented the situation in the cells when suckerin-19 production had been induced by IPTG. We used the second model to determine maximum suckerin-19 production through FBA under the given constraints (Table 2). FBA with suckerin-19 production as its objective yielded a maximum production rate of 0.0255 mmol/gDW/h. This indicated that suckerin-19 production could occur in SPG minimal medium. This was tested by harvesting and purifying suckerin-19 from flask cultures in different SPG media formulations. An SDS-PAGE gel showed clear bands in each SPG medium formulation, indicating the presence of suckerin-19 (Fig. 9). A more detailed account of the experiments can be found on the results page.
Figure 9. SDS-PAGE of proteins from cultures grown in LB and SPG media formulations and purified using a His-Link purification kit. Lane 1: Ladder, lane 3+4: LB, lane 5+6: SPG + 5 g/L glucose, lane 7+8: SPG + 10 g/L glucose, lane 9+10: SPG + 20 g/L glucose. Suckerin-19 has a mass of around 39 kDa.
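The two-phase setup described in this section (maximize biomass first, then hold biomass at 10% of that maximum while maximizing the product reaction) can be sketched on a one-metabolite toy network. The network, the bounds and the use of SciPy's `linprog` are assumptions made for illustration; the project itself used the COBRA toolbox with the iJO1366-based model.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical one-metabolite network: uptake -> G, G -> biomass, G -> product.
S = np.array([[1.0, -1.0, -1.0]])
b_eq = np.zeros(1)
UPTAKE, BIOMASS, PRODUCT = 0, 1, 2
bounds = [(0, 10), (0, 1000), (0, 1000)]   # the uptake limit of 10 is made up


def maximize(index, bounds):
    """Maximize the flux through one reaction subject to Sv = 0 and the bounds."""
    c = np.zeros(S.shape[1])
    c[index] = -1.0                        # linprog minimizes, so negate
    res = linprog(c, A_eq=S, b_eq=b_eq, bounds=bounds, method="highs")
    return -res.fun, res.x


# Growth phase: biomass formation is the objective.
max_biomass, _ = maximize(BIOMASS, bounds)

# Production phase: force biomass to at least 10% of its maximum, then
# maximize the product-forming reaction instead.
production_bounds = list(bounds)
production_bounds[BIOMASS] = (0.1 * max_biomass, 1000)
max_product, v = maximize(PRODUCT, production_bounds)

print("max biomass flux:", max_biomass)
print("max product flux with 10% biomass:", max_product)
```

Because biomass and product compete for the same substrate in this sketch, every unit of flux reserved for biomass is unavailable for the product, which is why the chosen biomass fraction directly affects the predicted production rate.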
To summarize, we used the biomass formation model to determine that glucose was the limiting metabolite for growth in the SPG medium (Fig. 7). This prediction was used to improve SPG medium. By supplementing SPG with increasing concentrations of glucose, higher levels of biomass were obtained (Fig. 8). This was vital in our project, as we relied on high biomass formation to produce enough suckerin-19. On top of this, the suckerin-19 production model predicted that suckerin-19 production was possible in SPG medium, making the medium a viable alternative to LB. We validated these predictions by harvesting and purifying suckerin-19 from different SPG media formulations (Fig. 9). This showed that the model had correctly predicted that SPG was a suitable medium for suckerin-19 production. Based on these results, we recommend using these SPG formulations for suckerin-19 production in further experiments.
Chapter 4: Flux Variability Analysis & OptForce Algorithm
Up to this point, we have used our metabolic model of E. coli to optimize biomass formation by our strain in minimal SPG medium. This was done to replace LB medium, because this medium yielded low levels of suckerin-19 as shown on the results page. We determined that glucose was the limiting component in SPG and therefore supplemented the medium accordingly, resulting in higher biomass formation. On top of this, we constructed a second model, displaying suckerin-19 production. We used this model to determine if suckerin-19 would be produced in sufficient quantities in the SPG medium. Predictions of both models were experimentally confirmed, highlighting the importance of modeling in our project. In the following chapter, we will discuss our second approach to increase production: finding bottlenecks in suckerin-19 production.
Approach
To increase the production of suckerin-19 in E. coli, FVA (Flux Variability Analysis) and the OptForce Algorithm were used. These methods found reactions in the metabolic network that posed a bottleneck to the production of suckerin-19. Limiting reactions could be altered by genetic mutations, allowing for removal of these bottlenecks. This method identified suitable targets for future large scale production of suckerin-19.
Flux Variability Analysis (FVA)
FVA determines the minimum and maximum flux through reactions in the metabolic network that can satisfy a certain objective function. This means one objective function, for instance biomass formation, is set to a maximum flux, which the model cannot change during the analysis. Locking biomass production at a maximum allows the model to vary the flux of all other reactions while still satisfying the objective reaction, in this case biomass production. To do this, the following equation is solved:
$$ {\max_{v} \atop \min_{v}} \; \text{for} \, v_i\\[8pt] \text{subject to}\quad Sv = 0\\ c^Tv \ge \gamma Z_0\\ v_l \leq v \leq v_u $$
This equation builds directly on the results of the FBA. Using the optimal objective value found in the FBA, the FVA determines the minimum and maximum flux ($\min_v$ and $\max_v$) through each reaction in the network ($v_i$) that still satisfies the objective. This means FVA fixes the flux through the objective reaction and determines all possible metabolic fluxes that can satisfy this fixed objective. Here, $c^T$ contains the objective weights and $v$ is the flux vector, so $c^Tv$ is the flux through the objective reaction. $Z_0$ is the optimal objective value obtained from the FBA and $\gamma$ is the fraction of this optimum that must be maintained while the FVA explores the remaining fluxes in vector $v$ [6].
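The FVA formulation above can be sketched for the toy network of figure 1 as follows. As before, SciPy's `linprog` stands in for the COBRA toolbox, and the bounds, the choice of R7 as objective reaction and the value $\gamma = 0.9$ are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network from figure 1 (rows A-E, columns R1-R7).
S = np.array([
    [1, -1,  0,  0,  0,  0,  0],
    [0,  1, -2, -1,  0,  0,  0],
    [0,  0,  2,  0, -1,  0,  0],
    [0,  0,  0,  1,  0, -1,  0],
    [0,  0,  0,  0,  1,  1, -1],
], dtype=float)
m, n = S.shape
bounds = [(0, 10)] + [(0, 1000)] * (n - 1)   # made-up limits v_l, v_u
c = np.zeros(n)
c[6] = 1.0                                    # R7 is the objective reaction
gamma = 0.9                                   # keep at least 90% of the optimum

# Step 1 (FBA): find the optimal objective value Z_0.
fba = linprog(-c, A_eq=S, b_eq=np.zeros(m), bounds=bounds, method="highs")
Z0 = -fba.fun

# Step 2 (FVA): for every reaction i, minimize and maximize v_i subject to
# Sv = 0, the bounds, and c^T v >= gamma * Z_0 (passed as -c^T v <= -gamma * Z_0).
common = dict(A_eq=S, b_eq=np.zeros(m), A_ub=-c.reshape(1, n),
              b_ub=[-gamma * Z0], bounds=bounds, method="highs")
flux_ranges = []
for i in range(n):
    e_i = np.zeros(n)
    e_i[i] = 1.0
    v_min = linprog(e_i, **common).fun
    v_max = -linprog(-e_i, **common).fun
    flux_ranges.append((v_min, v_max))

for i, (lo, hi) in enumerate(flux_ranges, start=1):
    print(f"R{i}: [{lo:.2f}, {hi:.2f}]")
```

Reactions on the only route to the objective (R1, R2, R7) end up with narrow ranges, while the two parallel branches (R3/R5 and R4/R6) keep wide ranges because flux can shift between them.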
OptForce Algorithm
By using FVA analysis on the biomass production model and on the suckerin-19 production model, two sets of minimum and maximum reaction flux were obtained. This allowed us to compare metabolism between the two models and pick out reactions that had a high difference in flux range between biomass formation and suckerin-19 production. Reactions that highly differ in flux are likely important in the production of suckerin-19 and are therefore a target for genetic modification.
Data generated by FVA of both models was compared by using the OptForce algorithm. OptForce compares minimum and maximum flux for every reaction between the two models, as determined by FVA. Since the two models have two different objectives, growth and suckerin-19 production respectively, flux through reactions will differ between them. OptForce compares the flux range of each reaction between the two models, using the minima and maxima generated by FVA. When flux ranges for a reaction do not overlap, it means flux through the reaction must change to produce more suckerin-19. OptForce places these reactions in a MUST set, a set of reactions that must change in flux in order to increase flux through the target reaction. Overlap of flux is displayed in figure 10A, whereas 10B and 10C display no overlap. Reactions 10B and 10C therefore require a change in flux to produce more suckerin-19 [2, 14].
Figure 10. Flux range overlap of reactions in the biomass formation model and the suckerin-19 production model. Reaction A shows overlap in flux range, meaning that no change in flux is required to increase suckerin-19 production. Reaction B shows no overlap and a higher range in the suckerin-19 production model, so its flux must increase to produce more suckerin-19. The algorithm places this reaction in the MUST U set. The opposite is true for reaction C, which is placed in the MUST L set.
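The comparison illustrated in figure 10 amounts to a simple interval test per reaction. The sketch below implements only this first-order MUST U / MUST L classification, not the full OptForce algorithm, and the flux ranges are made-up numbers chosen to mirror reactions A, B and C of the figure.

```python
def classify_must(range_growth, range_production):
    """Compare two (min, max) flux ranges from FVA and name the MUST set.

    Returns "MUST U" when the production-phase range lies entirely above the
    growth-phase range, "MUST L" when it lies entirely below, and None when
    the ranges overlap and no change in flux is required.
    """
    growth_min, growth_max = range_growth
    prod_min, prod_max = range_production
    if prod_min > growth_max:
        return "MUST U"
    if prod_max < growth_min:
        return "MUST L"
    return None


# Made-up flux ranges (mmol/gDW/h) mirroring reactions A, B and C of figure 10.
fva_growth = {"A": (1.0, 5.0), "B": (0.0, 2.0), "C": (6.0, 9.0)}
fva_production = {"A": (3.0, 8.0), "B": (4.0, 7.0), "C": (1.0, 3.0)}

must_sets = {
    rxn: classify_must(fva_growth[rxn], fva_production[rxn])
    for rxn in fva_growth
}
print(must_sets)
```

In practice, the two dictionaries would be filled with the FVA output of the biomass formation model and the suckerin-19 production model, respectively.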
OptForce compares the flux range of every reaction between the two models using this method. The analysis creates five different MUST sets that describe the differences between the two models:
- MUST U: all single reactions that must increase in flux to optimize the target reaction
- MUST L: all single reactions that must decrease in flux to optimize the target reaction
- MUST UU: all sums of two reactions that must both increase in flux to optimize the target reaction
- MUST LL: all sums of two reactions that must both decrease in flux to optimize the target reaction
- MUST UL: all sums of two reactions that must both increase and decrease in flux respectively to optimize the target reaction
OptForce then compares all MUST sets and finds the minimum number of genetic knockouts or upregulations needed to satisfy the MUST sets. From these, OptForce determines a FORCE set, a set of reactions that can increase flux through the objective reaction after genetic intervention. Using these predictions, a strain can be genetically modified to increase the yield of a target compound, in our case suckerin-19 [14].
OptForce Results
The OptForce algorithm targeted the reactions displayed in Table 3 for genetic intervention. All targeted reactions are part of amino acid metabolism, which makes sense when producing proteins. According to these results, the largest bottleneck in suckerin-19 production in E. coli is histidine production. Reactions in histidine metabolism are the main target for intervention according to OptForce.
Table 3. Reactions targeted by the OptForce Algorithm for genetic intervention
| Reaction ID | Metabolic System | Regulation Type | Target Genes |
| --- | --- | --- | --- |
| PRMICI | Histidine | Upregulation | hisA |
| IGPDH | Histidine | Upregulation | hisB |
| PRATPP | Histidine | Upregulation | hisI |
| PRAMPC | Histidine | Upregulation | hisI |
| ATPPRT | Histidine | Upregulation | hisG |
| HSTPT | Histidine | Upregulation | hisC |
| HISTP | Histidine | Upregulation | hisB |
| IG3PS | Histidine | Upregulation | hisF & hisH |
| PPND | Tyrosine, Tryptophan and Phenylalanine | Upregulation | tyrA |
| IPPS | Valine, Leucine and Isoleucine | Upregulation | leuA |
| IPPMIb | Valine, Leucine and Isoleucine | Downregulation | leuC & leuD |
| IPMD | Valine, Leucine and Isoleucine | Upregulation | leuB |
Since histidine metabolism is such a large bottleneck in suckerin-19 production, we have visualized all reactions in this metabolism targeted by OptForce in figure 11 [1]. When examining this metabolism, it becomes apparent that OptForce has targeted almost every reaction in it. This cements the importance of histidine metabolism in the production of suckerin-19, making it a major target for genetic intervention. Due to time constraints, predicted genetic interventions were not tested, but they pose interesting targets for future suckerin-19 research. Since histidine metabolism is tightly regulated, upregulating gene expression of the target genes might not yield higher suckerin-19 production. The indicated target genes in histidine metabolism are all part of the histidine operon. By targeting regulation of the expression of this operon, we could possibly overexpress multiple target genes at once. By upregulating the entire histidine biosynthesis pathway, higher levels of histidine can be produced. The histidine operon is, however, also subject to attenuation at high concentrations of histidine in the cell, stopping further histidine biosynthesis. This regulation must be targeted as well if higher levels of histidine synthesis are to be achieved [5].
Figure 11. Overview of histidine metabolism in E. coli and the reactions targeted for genetic intervention by OptForce. This overview contains all metabolites and reactions used in histidine metabolism. Reaction IDs can be found in Table 3, together with the respective genetic targets. The model predicts that upregulating the indicated reactions will increase suckerin-19 production.
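The observation that OptForce concentrates on histidine metabolism can be checked directly against Table 3. The following is a minimal Python sketch, not part of our modeling code: it transcribes the rows of Table 3 and tallies the targets per metabolic system and the reactions per gene. The names `TABLE_3`, `targets_per_system` and `reactions_per_gene` are introduced here for illustration only.

```python
from collections import Counter, defaultdict

# Rows transcribed from Table 3:
# (reaction ID, metabolic system, regulation type, target genes)
TABLE_3 = [
    ("PRMICI", "Histidine", "Upregulation", ["hisA"]),
    ("IGPDH", "Histidine", "Upregulation", ["hisB"]),
    ("PRATPP", "Histidine", "Upregulation", ["hisI"]),
    ("PRAMPC", "Histidine", "Upregulation", ["hisI"]),
    ("ATPPRT", "Histidine", "Upregulation", ["hisG"]),
    ("HSTPT", "Histidine", "Upregulation", ["hisC"]),
    ("HISTP", "Histidine", "Upregulation", ["hisB"]),
    ("IG3PS", "Histidine", "Upregulation", ["hisF", "hisH"]),
    ("PPND", "Tyrosine, Tryptophan and Phenylalanine", "Upregulation", ["tyrA"]),
    ("IPPS", "Valine, Leucine and Isoleucine", "Upregulation", ["leuA"]),
    ("IPPMIb", "Valine, Leucine and Isoleucine", "Downregulation", ["leuC", "leuD"]),
    ("IPMD", "Valine, Leucine and Isoleucine", "Upregulation", ["leuB"]),
]


def targets_per_system(rows):
    """Count how many targeted reactions fall in each metabolic system."""
    return Counter(system for _, system, _, _ in rows)


def reactions_per_gene(rows):
    """Map each target gene to the targeted reactions it is listed for."""
    by_gene = defaultdict(list)
    for reaction, _, _, genes in rows:
        for gene in genes:
            by_gene[gene].append(reaction)
    return dict(by_gene)


system_counts = targets_per_system(TABLE_3)
gene_map = reactions_per_gene(TABLE_3)
his_genes = sorted(g for g in gene_map if g.startswith("his"))

print(system_counts)
print(his_genes)
```

Eight of the twelve targeted reactions belong to histidine metabolism, and they map to seven his genes, which is why intervening at the level of the operon is attractive compared with overexpressing each gene separately.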
To summarize, we compared the flux of every reaction between the biomass formation model and the suckerin-19 production model to determine which reactions were bottlenecks in suckerin-19 production. The OptForce algorithm identified reactions in amino acid biosynthesis as bottlenecks, and histidine biosynthesis in particular was targeted. Upregulating histidine production in the cells is, therefore, a promising approach for increasing suckerin-19 production. To achieve this, either the target genes on the histidine operon or the regulation of the histidine operon needs to be genetically altered. Due to time constraints, we were not able to implement these predictions in our own research. This information can, however, be used in future research to improve suckerin-19 production further.
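The flux comparison summarized above can be illustrated with a small sketch. It is a simplified Python illustration of the first-order comparison step described for OptForce [2], not our actual code: a reaction must be upregulated if its entire flux range in the production model lies above its range in the biomass model, and must be downregulated if it lies entirely below. The flux ranges are invented numbers chosen for illustration, not model output. The function name `classify_must` and the tolerance `tol` are assumptions, and PGI is included only as an example of a reaction whose ranges overlap.

```python
def classify_must(reference, target, tol=1e-9):
    """Compare (min, max) flux ranges of two models reaction by reaction.

    Returns two sorted lists: reactions whose flux must increase and
    reactions whose flux must decrease to move from the reference
    state to the target state. Overlapping ranges are not classified.
    """
    must_up, must_down = [], []
    for reaction, (ref_min, ref_max) in reference.items():
        tgt_min, tgt_max = target[reaction]
        if tgt_min > ref_max + tol:
            # Whole target range lies above the reference range.
            must_up.append(reaction)
        elif tgt_max < ref_min - tol:
            # Whole target range lies below the reference range.
            must_down.append(reaction)
    return sorted(must_up), sorted(must_down)


# Hypothetical flux ranges (min, max), invented for illustration only.
biomass_ranges = {
    "ATPPRT": (0.05, 0.10),
    "IPPMIb": (0.30, 0.45),
    "PGI": (1.00, 8.00),
}
suckerin_ranges = {
    "ATPPRT": (0.40, 0.60),
    "IPPMIb": (0.05, 0.20),
    "PGI": (2.00, 7.00),
}

must_up, must_down = classify_must(biomass_ranges, suckerin_ranges)
print("Upregulate:", must_up)
print("Downregulate:", must_down)
```

With these invented ranges, ATPPRT is classified as an upregulation target and IPPMIb as a downregulation target, while PGI is left unclassified because its ranges overlap. The full OptForce procedure goes further than this first-order comparison and also searches for the smallest set of interventions that forces overproduction.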
Conclusions & Future Prospects
Chapter 5: Conclusions & Future Prospects
At the end of our modeling efforts, we had successfully established two models of E. coli metabolism and used them to improve suckerin-19 production. The first model simulated biomass formation in E. coli. Since growth in LB did not yield sufficient levels of suckerin-19, we used this model to find a replacement for LB medium in our bioreactor setup. Guided by model predictions, we optimized a minimal medium, SPG, for biomass formation. In this way, the model greatly influenced the scaling up of suckerin-19 production.
A second metabolic model of E. coli was then created, simulating suckerin-19 production after IPTG induction. We used this model to predict whether E. coli was able to produce suckerin-19 in SPG medium. As this proved to be the case, we implemented SPG in our bioreactor setup. Due to time constraints, we were not able to optimize SPG medium any further. The second model was also used to identify reactions that were bottlenecks in suckerin-19 production, which was the second goal of our modeling efforts. Bottleneck reactions were identified by comparing reaction flux ranges from the biomass formation model and the suckerin-19 production model using the OptForce algorithm. Histidine biosynthesis in particular was a large bottleneck. Genetic intervention in the histidine operon or its regulation system could increase histidine production, alleviating the bottleneck. However, the regulation of the histidine operon is complex, so we were not able to implement these predictions [5].
In the end, we were able to explore multiple approaches to increase suckerin-19 production in E. coli. In doing so, both models contributed greatly to our ability to scale up the production of suckerin-19.
The resulting medium improvements and the identification of useful genetic interventions are a great basis for future research. Our model page also serves as a guide for other iGEM teams that want to use metabolic modeling. We have provided our entire model and all results generated by it on GitHub, so other teams can use it to their advantage. We hope to inspire other teams to use metabolic modeling in their projects, since it was so vital to the success of our own project.
References
- King Z., Dräger A., Ebrahim A., Sonnenschein N., Lewis N., Palsson B., & Gardner P. (2015). Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways. PLoS Computational Biology, 11(8), E1004321.
- Ranganathan S., Suthers P., Maranas C., & Price N. (2010). OptForce: An Optimization Procedure for Identifying All Genetic Manipulations Leading to Targeted Overproductions. PLoS Computational Biology, 6(4), E1000744.
- Kim H., Kim S., & Yoon S. H. (2018). Metabolic network reconstruction and phenome analysis of the industrial microbe, Escherichia coli BL21 (DE3). PLoS ONE, 13(9), E0204375.
- King Z., Lu J., Dräger A., Miller P., Federowicz S., Lerman J., . . . Lewis N. (2015). BiGG Models: A platform for integrating, standardizing and sharing genome-scale models. Nucleic Acids Research, 44(D1), D515-22.
- Winkler M., & Ramos-Montañez S. (2009). Biosynthesis of Histidine, EcoSal Plus, 3(2).
- Vlasov V. (2017). Flux Variability analysis (FVA). COBRA Toolbox tutorials.
- Hinton S. (2017). E. coli Core Model for Beginners (PART 1). COBRA Toolbox tutorials.
- Hinton S. (2017). E. coli Core Model for Beginners (PART 2). COBRA Toolbox tutorials.
- Hinton S. (2017). E. coli Core Model for Beginners (PART 3). COBRA Toolbox tutorials.
- Pfau T. (2017). Creating a Model. COBRA Toolbox tutorials.
- Vlasov V., & Pfau T. (2017). Model manipulation. COBRA Toolbox tutorials.
- Thakur C., Brown S., Sama M., Jackson E., & Dayie J. (2010). Growth of wildtype and mutant E. coli strains in minimal media for optimal production of nucleic acids for preparing labeled nucleotides. Applied Microbiology and Biotechnology, 88(3), 771-779.
- Studier F. (2005). Protein production by auto-induction in high-density shaking cultures. Protein Expression and Purification, 41(1), 207-234.
- Mendoza S. (2017). OptForce. COBRA Toolbox tutorials.
- Orth J. D., Thiele I., & Palsson B. (2010). What is flux balance analysis? Nature Biotechnology, 28(3), 245-248.
- Institute of Medicine, & Committee on Military Nutrition Research. (2000). The Role of Protein and Amino Acids in Sustaining and Enhancing Performance. National Academies Press.