Team:JiangnanU China/Model


Introduction

1. Stakeholders told us that benefits matter most in application, so our anti-phage strains must grow as robustly as the original strain under current fermentation conditions. However, the growth curves of all strains (Figure 1) do not make it obvious which strain grows best or which device has the least impact on cell growth. We therefore established a mathematical model to evaluate the growth properties of strains carrying different parts and to identify the most suitable device for future application.
2. The phage-induced promoters are vital to our genetic circuit: they respond to phage stimulation and start transcription of antP and P-1 (antimicrobial peptide) to counter phage infection (Figure 2). Because the phage-induced promoters we used were selected from E. coli, inappropriate promoter strength could cause problems such as leaky expression and inclusion body formation. We therefore developed a quantitative design method for phage-induced promoters based on strength prediction with an artificial neural network, which allows us to choose or design promoters of a desired strength without extra experiments.
Assumptions

(1) The measured strength of each promoter in the training set is the same regardless of the vector used.
(2) The growth properties of the original strain are the most suitable for industrial fermentation.
(3) The impact of each part on cell growth is approximately the same under different conditions.
GRA_EWM Model
Symbol Description
Weights Calculation of The Growth Curve
Analysis of The Weight Distribution

As shown in Figure 3, the OD600 values measured from 4 h to 10 h carried the most weight, that is, they were the most useful for distinguishing the strains. However, experts assigning the weights might instead regard the plateau phase as an important period for industrial fermentation.
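The weight calculation can be illustrated with a minimal sketch of the standard entropy weight method (EWM). This is an assumption about the general technique, not the team's exact implementation: the function name `entropy_weights` and the OD600 values are made up for illustration, with rows as strains and columns as sampling time points.

```python
import numpy as np

def entropy_weights(x):
    """Entropy weights for the columns (time points) of a positive data matrix.

    Rows are alternatives (strains); columns are indicators (OD600 at one time).
    """
    x = np.asarray(x, dtype=float)
    m = x.shape[0]
    # Share of each strain within each time point
    p = x / x.sum(axis=0, keepdims=True)
    # p * ln(p), with 0 * ln(0) treated as 0
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    # Normalized entropy of each time point, between 0 and 1
    e = -plogp.sum(axis=0) / np.log(m)
    # Lower entropy means more divergence between strains, hence more weight
    d = 1.0 - e
    return d / d.sum()

# Hypothetical OD600 readings: 3 strains x 3 time points (not measured data)
od600 = np.array([
    [0.10, 0.80, 2.00],
    [0.10, 0.50, 1.90],
    [0.10, 0.30, 2.10],
])
weights = entropy_weights(od600)
print(weights)
```

In this toy data the first time point is identical for all strains, so it receives essentially zero weight, while the middle time point, where the strains differ most, dominates.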
Grey Relational Analysis for Picking The Most Suitable Strain
Conclusion

Figure 4 shows that the distinguishing coefficient has a significant impact on the GRA result. Even so, part gntR ranks first for every distinguishing coefficient tested, which indicates that it is the most suitable part for future application.
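The role of the distinguishing coefficient can be sketched with a generic weighted grey relational analysis. The code below is a hedged illustration only: the function `grey_relational_grades`, the strain data and the weights are hypothetical, and it assumes the growth curves are already on a comparable scale, with the original strain as the reference sequence.

```python
import numpy as np

def grey_relational_grades(reference, candidates, weights, rho=0.5):
    """Weighted grey relational grade of each candidate against the reference.

    rho is the distinguishing coefficient, usually chosen in (0, 1].
    """
    reference = np.asarray(reference, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Absolute deviation of every candidate from the reference at every time point
    delta = np.abs(candidates - reference)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0:
        # All candidates coincide with the reference
        return np.full(len(candidates), weights.sum())
    # Grey relational coefficient for each candidate and time point
    xi = (d_min + rho * d_max) / (delta + rho * d_max)
    # Weighted sum over time points gives one grade per candidate
    return xi @ weights

# Hypothetical data (not measured values): reference strain and two candidates
reference = [0.10, 0.60, 2.00]
candidates = [
    [0.10, 0.55, 1.95],  # placeholder strain A, close to the reference
    [0.10, 0.30, 1.50],  # placeholder strain B, further from the reference
]
weights = [0.2, 0.5, 0.3]

for rho in (0.1, 0.5, 0.9):
    print(rho, grey_relational_grades(reference, candidates, weights, rho))
```

The absolute grades change with `rho`, but in this toy example the candidate closer to the reference ranks first for every value, which is the kind of stability check that a sweep over distinguishing coefficients provides.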
BP-ANN Model

Accurate and controllable regulatory elements such as promoters are indispensable tools for quantitatively regulating gene expression in rational pathway engineering, so a promoter of proper strength might be a simple solution to leakage and inclusion body formation. To select or design promoters of a desired strength without experiments, we developed a quantitative strength prediction method based on an artificial neural network (ANN), using the Neural Network Toolbox in the Matlab environment.
Construction of Early Phage-induced Promoter Strength Library

The endogenous promoters of E. coli BL21(DE3) that we selected to respond to phage infection might start transcription without phage stimulation, so we selected 19 T4 phage promoters with a range of strengths, reported in a previous study [3], as the phage-induced promoter library. (attachment: T4PE.docx)
The relative promoter strength is defined as the quotient of the β-lactamase activity divided by the 6-phospho-β-galactosidase activity. This ratio is then normalized to that obtained from a clone harboring P46.7 (the reference in all promoter strength measurements), giving the strength in pKWIII units.
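As a small worked example of this normalization, the sketch below computes a relative strength from two enzyme activities and the corresponding activities of the reference clone. The function name and all numbers are hypothetical and serve only to show the arithmetic.

```python
def relative_promoter_strength(bla, pbg, ref_bla, ref_pbg):
    """Relative promoter strength normalized to a reference clone.

    bla / pbg: beta-lactamase and 6-phospho-beta-galactosidase activities
    of the clone under test; ref_bla / ref_pbg: the same for the reference.
    """
    ratio = bla / pbg              # activity quotient of the tested clone
    ref_ratio = ref_bla / ref_pbg  # activity quotient of the reference clone
    return ratio / ref_ratio

# Made-up activities: the tested clone has a quotient of 3.0, the reference 2.0
strength = relative_promoter_strength(bla=30.0, pbg=10.0, ref_bla=20.0, ref_pbg=10.0)
print(strength)
```

By construction, the reference clone itself has a relative strength of 1.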
Computational platform construction

Matlab R2019b (MathWorks Inc., http://www.mathworks.com/) was run on a personal laptop with the Windows 10 64-bit operating system (Microsoft Inc., http://www.microsoft.com/). The Neural Network Toolbox within Matlab served as the basic tool for artificial neural network (ANN) model construction, data fitting and prediction. All programs used in this work were designed and run with the Neural Network Toolbox in the Matlab environment.
Construction And Training of ANN Predicting Models

The initial BP-ANN model was built with the Neural Network Toolbox. The model contains four layers: an input layer, two hidden layers and an output layer. The two hidden layers and the output layer have 26, 5 and 1 neurons, respectively. The initial weights of all neuron connections were randomly assigned by Matlab functions. The other parameters were set as follows:
net = newff(minmax(p),[26,5,1],{'tansig', 'tansig', 'purelin'},'trainlm');
net.trainParam.epochs = 15000;
net.trainParam.mc = 0.98;
net.trainParam.goal = 1e-6;
net.trainParam.lr = 0.01;
The original sequence data were translated into numerical data, which served as the input matrix, according to the following rules via a Python program (attachment: seqDigtal.py):
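The team's actual encoding rules live in seqDigtal.py and are not reproduced in this text. Purely as a hypothetical illustration of how a DNA sequence can become a numeric input column, the sketch below uses one-hot encoding; the scheme, the function names and the two example sequences are assumptions, not the rules used for the model.

```python
import numpy as np

BASES = "ACGT"  # assumed alphabet; the real script may handle other symbols

def one_hot_encode(seq):
    """Flatten a DNA sequence into a 0/1 vector with 4 entries per base."""
    seq = seq.upper()
    encoded = np.zeros((len(seq), len(BASES)))
    for i, base in enumerate(seq):
        # str.index raises ValueError for symbols outside ACGT
        encoded[i, BASES.index(base)] = 1.0
    return encoded.ravel()

def build_input_matrix(seqs):
    """Stack equal-length sequences as columns (features x samples)."""
    return np.column_stack([one_hot_encode(s) for s in seqs])

# Placeholder sequences, not taken from the T4 promoter library
example = build_input_matrix(["TTGACA", "TATAAT"])
print(example.shape)
```

Samples are stored as columns here because Matlab's network functions expect inputs arranged as features by samples.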
Among all generated models, NET_26_5 (attachment: NET_26_5.mat) shows the highest correlation coefficient for test set prediction, 0.93336, and its correlation coefficient for fitting the training data set reaches 0.99104 (Figure 3).
All predicted promoter strengths are shown in Figure 4. (attachment: pStrengthVsStrength.xlsx)
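A correlation coefficient between measured and predicted strengths can be computed as in the sketch below. It assumes the reported values are Pearson correlation coefficients, which the text does not state explicitly, and the numbers are made up rather than taken from pStrengthVsStrength.xlsx.

```python
import numpy as np

def correlation_coefficient(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(measured, predicted)[0, 1])

# Made-up strengths for illustration only
measured = [0.2, 0.5, 1.0, 1.4, 2.1]
predicted = [0.3, 0.4, 1.1, 1.3, 2.0]
r = correlation_coefficient(measured, predicted)
print(r)
```

A value close to 1 means the predictions follow the measurements closely in a linear sense; it does not by itself show that the predicted values are unbiased.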
Conclusion

With this BP-ANN model we can predict the strength of a designed promoter, which could save much of the time otherwise needed to test millions of designed promoters experimentally.
[1]. LI X, WANG K, LIU L, et al. Application of the Entropy Weight and TOPSIS Method in Safety Evaluation of Coal Mines [J]. Procedia Engineering, 2011, 26(4): 2085-2091.
[2]. KUO Y, YANG T, HUANG G W. The use of grey relational analysis in solving multiple attribute decision-making problems [J]. Computers & Industrial Engineering, 2008, 55(1): 80-93.
[3]. WILKENS K, RÜGER W. Characterization of Bacteriophage T4 Early Promoters in Vivo with a New Promoter Probe Vector [J]. Plasmid, 1996, 35(2): 108-120.