Team:SBS NY/Model

Modeling

Basic Idea

Here we provide a simple but efficient mathematical model to interpret our data. In this model, we tried to exclude the influence of cross interactions and get high accurate result.

Assumption

Any abstract model requires proper assumptions to approximate real system. Here are our basic assumptions.

  1. Fluorence intensity is proportional to the binding efficiency of ion with vector
  2. Any responsing curve fits Hill equation.
  3. When different kinds of ion exist in the detection system, they will not influence others’ responsing behaviour.

The first assumption is natural. Consider the whole responsing process as a computation module which consists of input, processing layer, and output. Therefore, the input is the number of ions. Processing layer is to produce fluorence protein when the ions successfully bound with vector. Output is the number of fluorence protein. The only thing affect the output is the probability of ion binding, which is determined by input. In average, we have the following:

I∝H(x)
Here I refers to intensity and H(x) refers to the probability of binding. x stands for the concentration of ion.

The second assumption helps us to calculate the probability in assumption one, which is based on a simplified but well-established biochemical model. In short, Hill equation reflects the binding of ligands to macromolecules, as a function of the ligand concentration. It can be written as:

Pactive=Ha(x)=xnkan+xnPinhibitive=Hi(x)=kinkin+xn

To explain the physical meaning of those parameters, here we briefly introduce the model. Suppose an β€œon” state of processing layer requires n binding ions, and corresponding chemical reaction can be written as:

P+nIβ‡ŒnPI
And the dissociation constant can be expressed as:
Kd=[P][I]n[nPI]
What we focus on is the probability of binding, which can be calculated by the proportion of [nPI] over all states. To simplify, we only consider two states: all binding([P]) and no binding([nPI]).Therefore the probability is:
P=[nPI][P]+[nPI]=[P][I]nKd[P]+[P][I]nKd=[I]nKd+[I]n=[I]nkan+[I]n
Since Kd is a positive constant, it can be rewritten in power form:Kd=kan. So the probability is the function of ion concentration. This is exactly the second assumption.

By integrating assumption 1 and 2, it is easy to find out the function of intensity:

I=Imax[I]nkan+[I]n
Imax is the maximum intensity the system can get.

The third assumption ensures a linear model system. Since each kind of ion has its unique responsing curve to each detector, the final intensity will be the addition of intensity from all kinds of ion. This assumption ensure that the intensity for each ion will only depend on the concentration of their own and will not be influenced by other ions. Therefore, the intensity for ith detector should be:

Ii=βˆ‘j(Imax)ijHij(xj)
Here Hij is the responsing curve of jth ion to ith vector.

Model

Based on the analysis in assumption, the model system is clear. For ith the detector should be:

Ii=βˆ‘j(Imax)ijHij(xj)
Here Hij is the responsing curve of jth ion to ith vector.

To analyze our data, we should first determine the all coeffcients in the expression and then use those standard functions to calculate the actual concentrations in an unknown sample.

Determine Coefficients

During the experiment, we have tested how the detectors response to ions by a concentration gradient. To couple the experiment with model, it is necessary to do some transformation on fitting equation.

Generally, Ij=(Imax)H(x) can be rewritten as:

log⁑(IImaxβˆ’I)=nlog⁑(ka)βˆ’nlog⁑(I)
In this form, we can easily get a linear relation between our input concerntration and output. The question is how to find out Imax in this equation because this value determine the reprocessed data of output. Another question is, due to the large scale of our data, to ease the workload of proceesing such data. To meet the needs of these two question, define the ratio between output data and the maximum of all output data as the standard output. As following shows:

output=I1,I2,Β·Β·Β·,In

Ioutputβ€²={I1β€²,I2β€²,Β·Β·Β·,Inβ€²}whichIiβ€²=IimaxIoutput

The elements in Ioutputβ€² fit following equation:

log⁑Iiβ€²maxIoutputImaxβˆ’Iiβ€²maxIoutput=nlog⁑xiβˆ’nlog⁑k
We define the value of ImaxmaxIoutput as a parameter PImax. So the equation we actually simulate is following one:
log⁑yiβ€²PImaxβˆ’yiβ€²=nlog⁑xiβˆ’nlog⁑k
Use Mathematica, the following code is shown:

outputdata = {Output1, Output2, Output3, Output4, Output5, Output6, Output7};
Processeddata = outputdata/Max[outputdata] // N;
data' = {{Log10[10^(-10)], Processeddata[[1]]}, {Log10[10^(-9)], 
    Processeddata[[2]]}, {Log10[10^(-8)], 
    Processeddata[[3]]}, {Log10[10^(-7)], 
    Processeddata[[4]]}, {Log10[10^(-6)], 
    Processeddata[[5]]}, {Log10[10^(-5)], 
    Processeddata[[6]]}, {Log10[10^(-4)], Processeddata[[7]]}};
data = {{data'[[1, 1]], data'[[1, 2]]}, {data'[[2, 1]], 
    data'[[2, 2]]}, {data'[[3, 1]], data'[[3, 2]]}, {data'[[4, 1]], 
    data'[[4, 2]]}, {data'[[5, 1]], data'[[5, 2]]}, {data'[[6, 1]], 
    data'[[6, 2]]}, {data'[[7, 1]], data'[[7, 2]]}};
solu = Flatten[
   Solve[Log10[(y*PImax)/(1 - (y*PImax))] == n*x - n*logk, y]];
fitparameter = (FindFit[data, y /. solu, {PImax, logk, n}, x])
fit = y /. solu /. fitparameter;
Show[ListPlot[data, PlotStyle -> Red], Plot[fit, {x, -11, 0}], 
 PlotRange -> {0, 1}]

Data Analysis

In last section, we successfully got statistics of each detector. Now they will be used to analyze an unknown sample.

Denote the concentration of each ion in the sample is Xn.

For each detector:

Ii=βˆ‘j(Imax)ijHij(Xj)for i=1,2,3,Β·Β·Β·,n
Which {(Imax)ijHij(x)} has been determined for all i,j. {Ii} is the output of unknown sample in ith detector.

Now we have n equation for n variables, it should determined the value of all variables. But unfortunately, this is not a linear system and more importantly, the technique we used to get linear form in last section cannot be transplanted here. A general way to solve such an nonlinear system is so-called β€œNetwon Iteration Method”.

New Form

First rewrite the model as:

βˆ‘j(Imax)ijHij(Xj)βˆ’Ii=0
Define:
F(X)=(F1(X)F2(X)Β·Β·Β·Fn(X));X=(x1x2Β·Β·Β·xn)Fi(X)=βˆ‘j(Imax)ijHij(xj)βˆ’Ii,for i=1,2,3,Β·Β·Β·,n∴F(X)=0
Now calculate the Jacobian matrix:
J(X)=(βˆ‚F1(X)βˆ‚x1βˆ‚F1(X)βˆ‚x1Β·Β·Β·βˆ‚F1(X)βˆ‚xnβˆ‚F2(X)βˆ‚x1βˆ‚F2(X)βˆ‚x1Β·Β·Β·βˆ‚F2(X)βˆ‚xnΒ·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·βˆ‚Fn(X)βˆ‚x1βˆ‚Fn(X)βˆ‚x1Β·Β·Β·βˆ‚Fn(X)βˆ‚xn)βˆ‚Fi(X)βˆ‚xj=βˆ‚βˆ‚xj(βˆ‘j(Imax)ijHij(xj)βˆ’Ii)=(Imax)ijβˆ‚βˆ‚xjHij(xj)
Once Jacobian matrix is determined, take iteration:
xn+1=xnβˆ’Jβˆ’1(xn)F(xn)
Theoretically we have
limnβ†’+∞xn=(X1X2Β·Β·Β·Xn)

This is exactly the newton iteration method.

Notice

The iteration method does not always works well, which is depending on the property of iteration functions. For some nonlinear system, the approximation could be useful in a very ting neighborhood. Such neighborhood determines the flexibility of intial value choosing. If the initial value is choosen far from the solution and the solving system is bad, then the iteration will be deficient. Therefore, to choose proper position and proper inital value to start iteration algorithm is critial to final result.

How to choose initial value

To get a good initial value for iteration, a proper range of solution should be guessed. Based on the biological property of detectors, the cross talk between target ion and untarget ion should be low. In extreme, the response of a wonderful detector to untarget ion should be a constant. First get rid of all posible background, and then solve the equation only with target ion.

That means to solve xi0 in following:

(Imax)iiHii(xi)=Iiβˆ’βˆ‘iβ‰ j(Imax)ij
For all i=1,2,Β·Β·Β·,n

Then we got initial value for iteration:

x0=(x10x20Β·Β·Β·xn0)

How to choose data?

In our test, the sample can be diluted to 10X or even 100X according to its quality. Therefore, we can choose the data that benefits to the data processing. From analysis, we want to make a robust approximation, which means a small perturbation will not lead to a big change of result. So the region with large derivative should be avoided. One thing should be noticed that since the inputs in algorithm are the detector results and outputs are deduced from response curves, the derivative of the inverse functions of the curve, not response curve, should be considered. Therefore, we choose the detector results with biggest change as the β€œproper” data. To define the β€œChange”, here considers the difference between adjacent data points.

Example

Here we provide an example for 2 variables:

H[n_, k_, x_, V_] := V*x^n/(k^n + x^n)(*Hill Function*); 
DH[n_, k_, x_, V_] := (
 V*k^n n x^(-1 + n))/(k^n + x^
   n)^2(*Derivative of Hill Function*);
M = {{0.003,0.0007}}
(*Initial Value,Determine by "Solve[5*x^2/((10^-3)^2+x^2)\\[Equal]4.5,x];Solve[7*x^2/((10^-3.5)^2+x^2)\[Equal]6,x]"*);

For[i = 1, i < 10, i++, x1 = M[[i, 1]]; x2 = M[[i, 2]];
 g1 = H[2, 10^(-3), x1, 5] + H[2, 10^(-1), x2, 4] - 4.5;
 g2 = H[2, 10^(-2), x1, 3] + H[2, 10^(-3.5), x2, 7] - 6;
 {x1d, y1d} = {x1, x2} - 
   Inverse[{{DH[2, 10^(-3), x1, 5], 
       DH[2, 10^(-1), x2, 4]}, {DH[2, 10^(-2), x1, 3], 
       DH[2, 10^(-3.5), x2, 7]}}].{g1, g2}; 
 AppendTo[M, {x1d, y1d}]](*Newton Method*); Print[M[[10]]]

Result:{0.00299939,0.000679022}

Compared to original:{0.003,0.0007}

Good approximation.

Experiment

Experimental Design

Transformation of the Recombinant pSB1C3-GFP Plasmids

In order to test the ability of the merR-like receptors to detect heavy metal ions, we isolated four merR-like receptor genes – merR, cueR, cadR, and zntR – and inserted them into pSB1C3-GFP plasmids (see Notes section for details). Then, we transformed plasmids with four merR-like receptor genes into E.coli (DH5Ξ± strain) bacteria.

Next, we exposed the transformed bacteria to chloramphenicol, to which the recombinant plasmids are resistant in order to test if transformation is successful.

We then grew the bacteria that were successfully transformed at 37Β°C for 8-12 hours and moved them to LB broth for amplification.

When fully grown, 500uL of the bacteria liquid was taken out and was sequenced.

Preparation for Flurescent Measuring

After sequencing, the transformed E.coli were further amplified in the liquid LB broth with chloramphenicol. We got 1L of E.coli bacteria liquid for each type of plasmid (corresponding to four genes). We ceased to grow the bacteria as soon as OD reached 0.5.

Next, at a laminar flow cabinet, we added E.coli into ELISA plates (96 wells), and then added different types of heavy metal solutions, two groups (48 wells each) per plate, and each well containing different concentrations of bacteria and heavy metal solutions. The arrangements are briefly demonstrated below (check the link for details):

Plate 1 Plate 2 Plate 3
Hg, merR / Cu, merR Hg, cueR / Cu, cueR Hg + Cu, merR / Hg+Cu, cueR

Above were the groups that we tested first. We did some combinations of heavy metals, not solely one at a time, due to the fact that we aimed to eliminate the possibility of cross-talk as much as we could through modeling.

Next, we measured the rest of the genes’ expression on the exposure of different heavy metals:

Plate 4 Plate 5 Plate 6 Plate 7
Cd, merR / Zn, merR Cd, cueR / Zn, cueR Hg, cadR / Cu, cadR Cd, cadR / Zn, cadR
Plate 8 Plate 9 Plate 10
Hg, zntR / Cu, zntR Cd, zntR / Zn, zntR Cd+Zn, cadR / Cd+Zn, zntR

We then put all the plates into plate reader for fluorescent measuring. After the data was obtained, we analyzed the results and established our models for further studies.

Fluorescent Measuring

Result

The Expression of capB and Cadmium (Cd2+) Absorption

In addition to measuring heavy metal to find out the sensitivity of the merR-like receptors to respond to heavy metal exposure, we also found the capB gene during our research, a gene responsible for absorbing cadmium ions (Cd2+), to further tackle heavy metal. To test the amount of cadmium ions a certain amount of plamid with capB can absorb, we came up with another experiment in addition to the fluorescent measuring of

Notebook

Gene Sequences and Recombinant Plasmids Maps


Lab Notebook

timelines and procedures

Reference

  1. Bondarczuk K, Piotrowska-Seget Z. Molecular basis of active copper resistance mechanisms in Gram-negative bacteria. Cell Biol Toxicol. 2013;29(6):397–405. doi:10.1007/s10565-013-9262-1

  2. Qin, W., Liu, X., Yu, X., Chu, X., Tian, J., and Wu, N. (2017). Identification of cadmium resistance and adsorption gene from Escherichia coli BL21 (DE3). RSC Adv. 7, 51460–51465. doi: 10.1039/c7ra10656d

  3. Sekhar, K., Priyanka, B., Reddy, V. D., and Rao, K. V. (2011). Metallothionein 1 (CcMT1) of pigeonpea (Cajanus cajan, L.) confers enhanced tolerance to copper and cadmium in Escherichia coli and Arabidopsis thaliana. Environ. Exp. Bot. 72, 131–139. doi: 10.1016/j.envexpbot.2011.02.017

  4. Vidhyaparkavi, A., Osborne, J., and Babu, S. (2017). Analysis of zntA gene in environmental Escherichia coli and additional implications on its role in zinc translocation. 3 Biotech 7:9. doi: 10.1007/s13205-017-0613-0