Model Overview

During the project, some important decisions were made based on the results of mathematical modeling. Below is a description of the kind of models we used, why they were used, the insight gained from the simulation results as well as a more detailed description of how the model was implemented.

PCB Degradation Modeling

While designing the project, one major concern was how many genes that would need to be integrated into the genome of Saccharomyces cerevisiae to achieve successful degradation of PCB. The pathway that we aimed to construct is based on genes from the biphenyl degradation pathway with an additional enzyme PcbA5 included. The addition of this enzyme would in theory increase the amount of different congeners that could be degraded by the system by first converting them from highly chlorinated to more lowly chlorinated congeners [1,2]. The pathway is displayed in Figure 1.

**Figure 1.** Illustration of the PCB degradation pathway. A. The enzyme PcbA5 dechlorinates highly chlorinated PCBs, converting them to lower chlorinated congeners. Adapted from Bedard 2014 [1]. B. The bph-pathway breaks down the lowly chlorinated PCBs, converting them to benzoate, pyruvate and acetyl-CoA. Adapted from Agulló et al. [2].

This intended pathway would consist of 12 genes, and integrating that many genes into a single cell is very ambitious. There was an idea to omit the final three genes, which are able to convert 2-hydroxypenta-2,4-dienoate to pyruvate and acetyl-CoA, from the pathway and thereby reduce the amount of genes that would need to be integrated into the yeast by 25%, from 12 to 9. However, since the validity of this approach was questionable we decided to use mathematical modeling to investigate if this would be a viable option.

The approach that was suggested was to use a genome scale metabolic model (GEM) of S. cerevisiae. Two models would be generated, one where the complete 12-gene pathway was implemented and one where the tree final genes were missing, and a comparison between the performance of the two models would indicate how encompassing the project would need to be. While at first glance the PCB degradation pathway seems simple in terms of reactions involved, in practice it is highly complex. Since PCB can come in many different forms depending on where the chlorines are placed, that means that each enzyme in the pathway can act on a number of different substrates, which will in turn yield different products. As an example of how many reactions that a single enzyme can catalyze, Figure 2 shows how the dehalogenase enzyme PcbA5 can convert one substrate (highly chlorinated congener) to four possible end products (lowly chlorinated congeners) by removing one chlorine at a time, and thereby going through intermediate products.

**Figure 2.** Illustration of the functionality of the dehalogenase enzyme PcbA5. The enzyme dechlorinates the numbered congeners that appear higher up in each network, gradually converting them into the congeners that appear at the bottom of each network.

By expanding the network shown in Figure 2 and analyzing all of the possible reactions, the number of reactions that needed to be added to the metabolic network of the GEM was 166 for PcbA5 alone. After compiling the reactions contained in the entire pathway, the two different models described above were generated. From here on the model containing the incomplete pathway will be referred to as model A and the model containing the complete pathway will be referred to as model B The finished models were then used as the basis for performing flux balance analysis (FBA).

FBA is a method of analyzing the flows of metabolites through metabolic networks. Given a system of equations that represent the stoichiometry of all reactions in the metabolic network, which is what the models described above essentially are, this approach aims to find biologically relevant information through the optimization of a so-called ‘objective function’ [3]. This approach assumes that under any set of given conditions, the fluxes, or reaction rates, through the metabolic network will eventually reach a steady-state, and the result yielded by the analysis is the metabolic flux distribution that under steady state will yield the optimal value of the chosen objective function [4].

The first objective that was set for FBA of the PCB degradation models was cellular growth, i.e. biomass production. Choosing this objective would simulate the cells simply growing naturally in the presence of PCB, and the results would indicate if the cells benefited from the degradation process. The simulated concentrations of different PCB congeners that the model was allowed to take up was based on the composition of Aroclor 1260, the PCB mix that was intended to be used in the lab [5]. The resulting flux distributions for the metabolites involved in the pathway are displayed in Figure 3.

**Figure 3.** Flux distribution for uptake/secretion of PCB congeners and metabolites involved in PCB degradation. For simplicity, all PCB congeners which have an equal amount of chlorines attached are shown as one bar. In addition, chlorinated pathway end products are displayed together with their non-chlorinated variants as one bar. A. Optimal flux distribution of model A while the objective function is biomass production. B. Optimal flux distribution of model B while the objective function is biomass production.

Figure 3 displays the optimal exchange fluxes of different PCB congeners and the metabolites that are produced as end products when the last three enzymes of the pathway are missing for the two models. By comparing Figure 3 A, which shows the flux distribution for model A, with Figure 3 B, which shows the flux distribution for model B, it becomes clear that while model B is able to degrade all types of PCB present in Aroclor 1260 (which is indicated by the negative flux values) and produce benzoate (which is indicated by the positive flux value), the same is not true for model A. Instead, model A is able to degrade the more highly chlorinated PCB congeners while producing predominantly tetrachlorinated biphenyls. It should be noted that for model A no pathway end products are produced, indicating that no degradation is taking place in this scenario. These results were not very surprising considering that since model B was able to convert PCB to pyruvate and acetyl-CoA, effectively growing on PCB, it would outperform model A when the objective function was defined as biomass production. To illustrate this we decided to compare the optimal growth rates of model A and model B to each other and to the original GEM. The results are displayed in Table 1.

**Table 1.** *Comparison of the optimal growth rates of the original GEM, here labeled 'WT', model A and model B.*
Model	Growth rate
WT	0.09741
A	0.098492
B	0.14286

By observing Table 1, it becomes clear that model B performs the best out of the three models. This is expected, since due to its capacity to completely metabolize PCB it has access to more nutrients and can therefore grow better. What is interesting however, is that model A shows a slightly improved optimal growth rate compared to the 'wild-type'-model despite its inability to fully break down PCB. This implies that even the incomplete pathway provides an advantage from a growth perspective in PCB contaminated areas. Note that the numbers displayed in Table 1 should not be interpreted as quantitative data. The reason for this is that the total amount of PCB that the model was allowed to take up was equal to the amount of glucose, a scenario that is not reasonable in real life. Nevertheless, the trends seen in Table 1 were taken as support for implementing the reduced pathway.

Even if model A was able to outgrow the wild type, it did not show the capacity to actually degrade PCBs. As an alternative simulation, the rationale that all of the enzymes in the pathway would be placed under strong, constitutive promoters was therefore used to justify changing the objective function to the production of benzoates instead and see if theoretical degradation was possible for model A. The resulting flux distributions are displayed in Figure 4.

**Figure 4.** Flux distribution for uptake/secretion of PCB congeners and metabolites involved in PCB degradation. For simplicity, all PCB congeners which have an equal amount of chlorines attached are shown as one bar. In addition, chlorinated pathway end products are displayed together with their non-chlorinated variants as one bar. A. Optimal flux distribution of model A while the objective function is benzoate production. B. Optimal flux distribution of model B while the objective function is benzoate production.

By observing Figure 4 it can be seen that while model B performs identically well as during the previous simulations, the results for model A are very promising. The efficiency with which lowly chlorinated PCBs are degraded is similar to those of model B, and benzoate is also produced at an equivalent rate, indicating that both models are able to completely metabolize all of the PCB they are provided. With these results behind us it was determined that constructing our yeast strain in accordance with model A would be enough for a proof of concept study, and if successful the final genes could be implemented at a later date. Indeed, while these modeling results determined the scope of our project, the contrast between the results for model A and model B regarding their ability to degrade PCB under the objective of growth make the completion of the pathway a very exciting future prospect.

PCB Dechlorination Modeling

As the project progressed, the focus shifted towards the dechlorinating gene pcbA5. The reason for this was insight gained from human practices, where it was indicated that the dechlorination of highly chlorinated PCB congeners would make them easier to handle. Indeed, it has been shown that lowly chlorinated biphenyls are biodegradable to some extent, while congeners with more that five chlorines attached are completely resistant towards degradation [6]. With the importance of dechlorination in mind, a new model, here called model C, was generated to simulate a yeast into which only pcbA5 had been integrated. Using the same justification as previously when choosing an objective function, chlorine exchange was chosen to simulate dechlorination. The results from the simulation can be seen in Figure 5.

**Figure 5.** *Flux distribution for uptake/secretion of PCB congeners. For simplicity, all PCB congeners which have an equal amount of chlorines attached are shown as one bar.*

As can be observed from Figure 5, model C successfully dechlorinates the highly chlorinated PCB congeners. When comparing the results with those displayed in Figure 3 and Figure 4, the model is able to remove highly chlorinated congeners at the same rate as that achieved by model B from the previous simulations. It can be noted that the majority of the PCB congeners that are being produced by this model are tetrachlorinated, with some tri- and dichlorinated biphenyls being produced as well. The production of many tetrachlorinated congeners is expected however. It is known that due to the positioning of the chlorines there are many such congeners that pcbA5 is not able to act upon due to its possible targets being meta- and para-positioned chlorines [7].

A simulation with the objective function set to growth was also performed with the aim to compare the optimal growth rates of model C to the previously generated model A. The results can be observed in Table 2.

**Table 2.** *Comparison of the optimal growth rates of the original GEM, here labeled 'WT', model A and model C.*
Model	Growth rate
WT	0.09741
A	0.098492
C	0.098492

While observing Table 2, it becomes very interesting to note that model A and model C have an identical optimal growth rate under identical conditions. Indeed, by observing Figure 1 it becomes apparent that model A only shows dechlorination activity under the objective of growth, meaning that likely pcbA5 is the only gene showing activity in this scenario. Once again, the numbers presented in Table 2 can not be taken as quantitative results, however this indicates that for the survival of the cell, pcbA5 is more important than any of the remaining genes added to model A.

These results highlight that this gene is of importance in isolation from the remaining pathway, since the congeners that are completely resistant towards biodegradation are indeed able to be removed, and implementing a single gene into a yeast strain is a less daunting task than implementing the entire pathway. However, it still does not undermine the fact that the intended pathway produces a more desirable outcome.

Model Design

In order to construct the desired models, the first thing that needed to be done was acquiring the consensus genome scale metabolic model (GEM) for S. cerevisiae [8], which is currently hosted on the SysBio GitHub. This was downloaded and loaded into MATLAB. The structure of the GEM was then analyzed, and based on how the reactions present in the model were written, excel sheets were constructed containing the necessary information about the additional reactions that were to be implemented. These sheets were designed such that each sheet represented a specific enzyme in the pathway, since this allowed for separate integration and analysis of each gene, akin to what would be done in the lab, with an additional sheet being created to house necessary transport and exchange reactions.

Underlying assumptions about the model

While analyzing the network of reactions and constructing the excel sheets, a number of assumptions were made about the pathway. Firstly, the reactions were defined with a 1:1 stoichiometry according to the pathway displayed in Figure 1. Additionally, due to lack of knowledge about the kinetics of the involved enzymes, most reactions were assumed to be reversible, with the exceptions being the dechlorinating reactions catalyzed by PcbA5 and BphK as well as the reactions that break apart the substrate molecule into two product molecules. These reactions are specifically those catalyzed by BphD and BphI, as can be observed in Figure 1. Another important assumption that was made was that all of the reactions take place in the cytosol of the cell, due to lack of information about specific localization. This assumption in turn led to another important assumption having to be made: PCB congeners and pathway metabolites that can not be further degraded were allowed to freely leave the cell, and PCB that existed outside of the cell was allowed to freely enter it. These assumptions were again made based on lack of data that contradicted this hypothesis.

Implementation in MATLAB

After compiling all of the reactions that make up the intended pathway, the next step was to incorporate them into the MATLAB model. The RAVEN toolbox [9] was downloaded, also from the SysBio GitHub,and used for this purpose. Functions from RAVEN were also used to perform simulations with the finished models. The specific version of RAVEN that was used to construct the models can also be found, along with all of the MATLAB files, excel sheets and the base yeast GEM used for this project, on our GitHub.

**Figure 6.** Overview of the file structure used for the genome-scale metabolic modeling. The dashed lines display how files are loaded into and/or called on by other files. The colored arrows represent the order in which the sections of the main file should be run to generate the results described under 'Model overview'. To generate 'model A', run the file according to the blue arrows. To generate model B, run the file according to the orange arrows. In order to change the objective function from growth to benzoate production, run the additional section indicated by the purple arrows.

Figure 6 shows a visual representation of the file structure, and how to run the code to construct the two models described above. The different sections of the main file ‘PCBDegradationGEM.m’ are used to load the model, integrate additional reactions using the function file ‘add_to_model.m’, and perform simulations with the generated models using the function file ‘simulate_PCB_degradation.m’. Please note that in order to run these files it is a requirement to have the RAVEN toolbox installed, as well as having all of the relevant files and excel sheets shown in Figure 5 in the working directory.

While constructing model C, described under 'PCB dechlorination modeling', new scripts were written to accommodate for this model containing fewer reactions. These are called 'PCBdechlorinationGEM.m' and 'simulate_PCB_dechlorination.m' and can also be found on our GitHub. These scripts are operated very similarly to the previously described scripts, and require that the previously mentioned files 'transport_and_exchange.xlsx', 'pcbA5.xlsx' and 'add_to_model.m' are present within the current directory.

References

Bedard DL. PCB dechlorinases revealed at last. Proceedings of the National Academy of Sciences. 2014 Aug 19;111(33):11919-20.
Agulló L, Pieper DH, Seeger M. Genetics and biochemistry of biphenyl and PCB biodegradation. Aerobic Utilization of Hydrocarbons, Oils, and Lipids. 2019:595-622.
Orth JD, Thiele I, Palsson BØ. What is flux balance analysis?. Nature biotechnology. 2010 Mar;28(3):245.
Kauffman KJ, Prakash P, Edwards JS. Advances in flux balance analysis. Current opinion in biotechnology. 2003 Oct 1;14(5):491-6.
Albro PW, Corbett JT, Schroeder JL. Quantitative characterization of polychlorinated biphenyl mixtures (aroclors® 1248, 1254 and 1260) by gas chromatograpy using capillary columns. Journal of Chromatography A. 1981 Jan 23;205(1):103-11.
Food and Agriculture Organization of the United Nations. Assessing soil contamination A reference manual [Internet]. Fao.org. 2019 [cited 13 October 2019]. Available from: http://www.fao.org/3/X2570E08.htm?fbclid=IwAR2kMzvRuwBMAF_ndSCfdRNQ-Bue_b6lx0gcO6fNJtj8b7zd_ExHZ2Qk2BE
Wang S, Chng KR, Wilm A, Zhao S, Yang KL, Nagarajan N, He J. Genomic characterization of three unique Dehalococcoides that respire on persistent polychlorinated biphenyls. Proceedings of the National Academy of Sciences. 2014 Aug 19;111(33):12103-8.
Aung HW, Henry SA, Walker LP. Revising the representation of fatty acid, glycerolipid, and glycerophospholipid metabolism in the consensus model of yeast metabolism. Industrial biotechnology. 2013 Aug 1;9(4):215-28.
Wang H, Marcišauskas S, Sánchez BJ, Domenzain I, Hermansson D, Agren R, Nielsen J, Kerkhoven EJ. (2018) RAVEN 2.0: A versatile toolbox for metabolic network reconstruction and a case study on Streptomyces coelicolor. PLoS Comput Biol 14(10): e1006541. doi:10.1371/journal.pcbi.1006541.

Team:Chalmers-Gothenburg/Model