| Line 14: | Line 14: | ||
</head> | </head> | ||
| − | <h1 id="modeling">Modeling</h1> | + | <body> |
| + | <h1 id="modeling">Modeling</h1> | ||
| − | <h2 id="basicidea">Basic Idea</h2> | + | <h2 id="basicidea">Basic Idea</h2> |
| − | + | ||
| − | <p>Here we provide a simple but efficient mathematical model to interpret our data. In this model, we tried to exclude the influence of cross interactions and get high accurate result.</p> | + | <p>Here we provide a simple but efficient mathematical model to interpret our data. In this model, we tried to exclude the influence of cross interactions and get high accurate result.</p> |
| − | + | ||
| − | <h2 id="assumption">Assumption</h2> | + | <h2 id="assumption">Assumption</h2> |
| − | + | ||
| − | <p>Any abstract model requires proper assumptions to approximate real system. Here are our basic assumptions.</p> | + | <p>Any abstract model requires proper assumptions to approximate real system. Here are our basic assumptions.</p> |
| − | + | ||
| − | <ol> | + | <ol> |
| − | <li>Fluorence intensity is proportional to the binding efficiency of ion with vector</li> | + | <li>Fluorence intensity is proportional to the binding efficiency of ion with vector</li> |
| − | + | ||
| − | <li>Any responsing curve fits Hill equation.</li> | + | <li>Any responsing curve fits Hill equation.</li> |
| − | + | ||
| − | <li>When different kinds of ion exist in the detection system, they will not influence others' responsing behaviour. </li> | + | <li>When different kinds of ion exist in the detection system, they will not influence others' responsing behaviour. </li> |
| − | </ol> | + | </ol> |
| − | + | ||
| − | <p>The first assumption is natural. Consider the whole responsing process as a computation module which consists of input, processing layer, and output. Therefore, the input is the number of ions. Processing layer is to produce fluorence protein when the ions successfully bound with vector. Output is the number of fluorence protein. The only thing affect the output is the probability of ion binding, which is determined by input. In average, we have the following: | + | <p>The first assumption is natural. Consider the whole responsing process as a computation module which consists of input, processing layer, and output. Therefore, the input is the number of ions. Processing layer is to produce fluorence protein when the ions successfully bound with vector. Output is the number of fluorence protein. The only thing affect the output is the probability of ion binding, which is determined by input. In average, we have the following: |
| − | $$ | + | $$ |
| − | I \propto H(x) | + | I \propto H(x) |
| − | $$ | + | $$ |
| − | Here $I$ refers to intensity and $H(x)$ refers to the probability of binding. $x$ stands for the concentration of ion. </p> | + | Here $I$ refers to intensity and $H(x)$ refers to the probability of binding. $x$ stands for the concentration of ion. </p> |
| − | + | ||
| − | <p>The second assumption helps us to calculate the probability in assumption one, which is based on a simplified but well-established biochemical model. In short, Hill equation reflects the binding of ligands to macromolecules, as a function of the ligand concentration. It can be written as: | + | <p>The second assumption helps us to calculate the probability in assumption one, which is based on a simplified but well-established biochemical model. In short, Hill equation reflects the binding of ligands to macromolecules, as a function of the ligand concentration. It can be written as: |
| − | $$ | + | $$ |
| − | P<em>{active}=H</em>a(x)=\frac{x^n}{k<em>a^n+x^n}\ | + | P<em>{active}=H</em>a(x)=\frac{x^n}{k<em>a^n+x^n}\ |
| − | P</em>{inhibitive}=H<em>i(x)=\frac{k</em>i^n}{k_i^n+x^n} | + | P</em>{inhibitive}=H<em>i(x)=\frac{k</em>i^n}{k_i^n+x^n} |
| − | $$</p> | + | $$</p> |
| − | + | ||
| − | <p>To explain the physical meaning of those parameters, here we briefly introduce the model. Suppose an "on" state of processing layer requires $n$ binding ions, and corresponding chemical reaction can be written as: | + | <p>To explain the physical meaning of those parameters, here we briefly introduce the model. Suppose an "on" state of processing layer requires $n$ binding ions, and corresponding chemical reaction can be written as: |
| − | $$ | + | $$ |
| − | P+nI \rightleftharpoons nPI | + | P+nI \rightleftharpoons nPI |
| − | $$ | + | $$ |
| − | And the dissociation constant can be expressed as: $$K<em>d=\frac{[P][I]^n}{[nPI]}$$ | + | And the dissociation constant can be expressed as: |
| − | What we focus on is the probability of binding, which can be calculated by the proportion of $[nPI]$ over all states. To simplify, we only consider two states: all binding($[P]$) and no binding($[nPI]$).Therefore the probability is: | + | $$ |
| − | $$ | + | K<em>d=\frac{[P][I]^n}{[nPI]} |
| − | P=\frac{[nPI]}{[P]+[nPI]}\ | + | $$ |
| − | =\frac{\frac{[P][I]^n}{K</em>d}}{[P]+\frac{[P][I]^n}{K<em>d}} | + | |
| − | \=\frac{[I]^n}{K</em>d+[I]^n} | + | What we focus on is the probability of binding, which can be calculated by the proportion of $[nPI]$ over all states. To simplify, we only consider two states: all binding($[P]$) and no binding($[nPI]$).Therefore the probability is: |
| − | \=\frac{[I]^n}{k<em>a^n+[I]^n} | + | $$ |
| − | $$ | + | P=\frac{[nPI]}{[P]+[nPI]}\ |
| − | Since $K</em>d$ is a positive constant, it can be rewritten in power form:$K<em>d=k</em>a^n$. So the probability is the function of ion concentration. This is exactly the second assumption.</p> | + | =\frac{\frac{[P][I]^n}{K</em>d}}{[P]+\frac{[P][I]^n}{K<em>d}} |
| − | + | \=\frac{[I]^n}{K</em>d+[I]^n} | |
| − | <p>By integrating assumption 1 and 2, it is easy to find out the function of intensity: | + | \=\frac{[I]^n}{k<em>a^n+[I]^n} |
| − | $$ | + | $$ |
| − | I=I<em>{max}\frac{[I]^n}{k</em>a^n+[I]^n} | + | Since $K</em>d$ is a positive constant, it can be rewritten in power form:$K<em>d=k</em>a^n$. So the probability is the function of ion concentration. This is exactly the second assumption.</p> |
| − | $$ | + | |
| − | $I_{max}$ is the maximum intensity the system can get. </p> | + | <p>By integrating assumption 1 and 2, it is easy to find out the function of intensity: |
| − | + | $$ | |
| − | <p>The third assumption ensures a linear model system. Since each kind of ion has its unique responsing curve to each detector, the final intensity will be the addition of intensity from all kinds of ion. This assumption ensure that the intensity for each ion will only depend on the concentration of their own and will not be influenced by other ions. Therefore, the intensity for $i^{th}$ detector should be: | + | I=I<em>{max}\frac{[I]^n}{k</em>a^n+[I]^n} |
| − | $$ | + | $$ |
| − | I<em>i=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(x</em>j) <br /> | + | $I_{max}$ is the maximum intensity the system can get. </p> |
| − | $$ | + | |
| − | Here $H_{ij}$ is the responsing curve of $j^{th}$ ion to $i^{th}$ vector. </p> | + | <p>The third assumption ensures a linear model system. Since each kind of ion has its unique responsing curve to each detector, the final intensity will be the addition of intensity from all kinds of ion. This assumption ensure that the intensity for each ion will only depend on the concentration of their own and will not be influenced by other ions. Therefore, the intensity for $i^{th}$ detector should be: |
| − | + | $$ | |
| − | <h2 id="model">Model</h2> | + | I<em>i=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(x</em>j) <br /> |
| − | + | $$ | |
| − | <p>Based on the analysis in assumption, the model system is clear. For $i^{th}$ the detector should be: | + | Here $H_{ij}$ is the responsing curve of $j^{th}$ ion to $i^{th}$ vector. </p> |
| − | $$ | + | |
| − | I<em>i=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(x</em>j) | + | <h2 id="model">Model</h2> |
| − | $$ | + | |
| − | Here $H_{ij}$ is the responsing curve of $j^{th}$ ion to $i^{th}$ vector. </p> | + | <p>Based on the analysis in assumption, the model system is clear. For $i^{th}$ the detector should be: |
| − | + | $$ | |
| − | <p>To analyze our data, we should first determine the all coeffcients in the expression and then use those standard functions to calculate the actual concentrations in an unknown sample. </p> | + | I<em>i=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(x</em>j) |
| − | + | $$ | |
| − | <h3 id="determinecoefficients">Determine Coefficients</h3> | + | Here $H_{ij}$ is the responsing curve of $j^{th}$ ion to $i^{th}$ vector. </p> |
| − | + | ||
| − | <p>During the experiment, we have tested how the detectors response to ions by a concentration gradient. To couple the experiment with model, it is necessary to do some transformation on fitting equation. </p> | + | <p>To analyze our data, we should first determine the all coeffcients in the expression and then use those standard functions to calculate the actual concentrations in an unknown sample. </p> |
| − | + | ||
| − | <p>Generally, $I<em>j=(I</em>{max})H(x)$ can be rewritten as: | + | <h3 id="determinecoefficients">Determine Coefficients</h3> |
| − | $$ | + | |
| − | \log(\frac{I}{I<em>{max}-I})=n\log(k</em>a)-n\log(I) | + | <p>During the experiment, we have tested how the detectors response to ions by a concentration gradient. To couple the experiment with model, it is necessary to do some transformation on fitting equation. </p> |
| − | $$ | + | |
| − | In this form, we can easily get a linear relation between our input concerntration and output. The question is how to find out $I_{max}$ in this equation because this value determine the reprocessed data of output. Another question is, due to the large scale of our data, to ease the workload of proceesing such data. To meet the needs of these two question, define the ratio between output data and the maximum of all output data as the standard output. As following shows:</p> | + | <p>Generally, $I<em>j=(I</em>{max})H(x)$ can be rewritten as: |
| − | + | $$ | |
| − | <p>$$ | + | \log(\frac{I}{I<em>{max}-I})=n\log(k</em>a)-n\log(I) |
| − | {output}={I<em>1,I</em>2,···,I_n} | + | $$ |
| − | $$</p> | + | In this form, we can easily get a linear relation between our input concerntration and output. The question is how to find out $I_{max}$ in this equation because this value determine the reprocessed data of output. Another question is, due to the large scale of our data, to ease the workload of proceesing such data. To meet the needs of these two question, define the ratio between output data and the maximum of all output data as the standard output. As following shows:</p> |
| − | + | ||
| − | <p>$$ | + | <p>$$ |
| − | I'<em>{output}={I</em>1',I<em>2',···,I</em>n'}\quad which\quad I<em>i'=\frac{I</em>i}{\max{I_{output}}} | + | {output}={I<em>1,I</em>2,···,I_n} |
| − | $$</p> | + | $$</p> |
| − | + | ||
| − | <p>The elements in $I'<em>{output}$ fit following equation: | + | <p>$$ |
| − | $$ | + | I'<em>{output}={I</em>1',I<em>2',···,I</em>n'}\quad which\quad I<em>i'=\frac{I</em>i}{\max{I_{output}}} |
| − | \log{\frac{{I</em>i'}\max{I<em>{output}}}{I</em>{max}-{I<em>i'}\max{I</em>{output}}}}=n\log{x<em>i}-n\log{k} | + | $$</p> |
| − | $$ | + | |
| − | We define the value of $\frac{I</em>{max}}{\max{I<em>{output}}}$ as a parameter $PI</em>{max}$. So the equation we actually simulate is following one: | + | <p>The elements in $I'<em>{output}$ fit following equation: |
| − | $$ | + | $$ |
| − | \log{\frac{y<em>i'}{PI</em>{max}-y<em>i'}}=n\log{x</em>i}-n\log{k} | + | \log{\frac{{I</em>i'}\max{I<em>{output}}}{I</em>{max}-{I<em>i'}\max{I</em>{output}}}}=n\log{x<em>i}-n\log{k} |
| − | $$ | + | $$ |
| − | Use Mathematica, the following code is shown:</p> | + | We define the value of $\frac{I</em>{max}}{\max{I<em>{output}}}$ as a parameter $PI</em>{max}$. So the equation we actually simulate is following one: |
| − | + | $$ | |
| − | <pre><code class="Mathematica language-Mathematica">outputdata = {Output1, Output2, Output3, Output4, Output5, Output6, Output7}; | + | \log{\frac{y<em>i'}{PI</em>{max}-y<em>i'}}=n\log{x</em>i}-n\log{k} |
| − | Processeddata = outputdata/Max[outputdata] // N; | + | $$ |
| − | data' = {{Log10[10^(-10)], Processeddata[[1]]}, {Log10[10^(-9)], | + | Use Mathematica, the following code is shown:</p> |
| − | + | ||
| − | + | <pre><code class="Mathematica language-Mathematica">outputdata = {Output1, Output2, Output3, Output4, Output5, Output6, Output7}; | |
| − | + | Processeddata = outputdata/Max[outputdata] // N; | |
| − | + | data' = {{Log10[10^(-10)], Processeddata[[1]]}, {Log10[10^(-9)], | |
| − | + | Processeddata[[2]]}, {Log10[10^(-8)], | |
| − | data = {{data'[[1, 1]], data'[[1, 2]]}, {data'[[2, 1]], | + | Processeddata[[3]]}, {Log10[10^(-7)], |
| − | + | Processeddata[[4]]}, {Log10[10^(-6)], | |
| − | + | Processeddata[[5]]}, {Log10[10^(-5)], | |
| − | + | Processeddata[[6]]}, {Log10[10^(-4)], Processeddata[[7]]}}; | |
| − | solu = Flatten[ | + | data = {{data'[[1, 1]], data'[[1, 2]]}, {data'[[2, 1]], |
| − | + | data'[[2, 2]]}, {data'[[3, 1]], data'[[3, 2]]}, {data'[[4, 1]], | |
| − | fitparameter = (FindFit[data, y /. solu, {PImax, logk, n}, x]) | + | data'[[4, 2]]}, {data'[[5, 1]], data'[[5, 2]]}, {data'[[6, 1]], |
| − | fit = y /. solu /. fitparameter; | + | data'[[6, 2]]}, {data'[[7, 1]], data'[[7, 2]]}}; |
| − | Show[ListPlot[data, PlotStyle -> Red], Plot[fit, {x, -11, 0}], | + | solu = Flatten[ |
| − | + | Solve[Log10[(y*PImax)/(1 - (y*PImax))] == n*x - n*logk, y]]; | |
| − | </code></pre> | + | fitparameter = (FindFit[data, y /. solu, {PImax, logk, n}, x]) |
| − | + | fit = y /. solu /. fitparameter; | |
| − | <h3 id="dataanalysis">Data Analysis</h3> | + | Show[ListPlot[data, PlotStyle -> Red], Plot[fit, {x, -11, 0}], |
| − | + | PlotRange -> {0, 1}] | |
| − | <p>In last section, we successfully got statistics of each detector. Now they will be used to analyze an unknown sample. </p> | + | </code></pre> |
| − | + | ||
| − | <p>Denote the concentration of each ion in the sample is $X_n$. </p> | + | <h3 id="dataanalysis">Data Analysis</h3> |
| − | + | ||
| − | <p>For each detector: | + | <p>In last section, we successfully got statistics of each detector. Now they will be used to analyze an unknown sample. </p> |
| − | $$ | + | |
| − | I<em>i=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(X</em>j) | + | <p>Denote the concentration of each ion in the sample is $X_n$. </p> |
| − | \for\ i=1,2,3,···,n | + | |
| − | $$ | + | <p>For each detector: |
| − | Which ${(I<em>{max})</em>{ij}H<em>{ij}(x)}$ has been determined for all $i,j$. ${I</em>i}$ is the output of unknown sample in $i^{th}$ detector. </p> | + | $$ |
| − | + | I<em>i=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(X</em>j) | |
| − | <p>Now we have $n$ equation for $n$ variables, it should determined the value of all variables. But unfortunately, this is not a linear system and more importantly, the technique we used to get linear form in last section cannot be transplanted here. A general way to solve such an nonlinear system is so-called "Netwon Iteration Method". </p> | + | \for\ i=1,2,3,···,n |
| − | + | $$ | |
| − | <p><h4> New Form | + | Which ${(I<em>{max})</em>{ij}H<em>{ij}(x)}$ has been determined for all $i,j$. ${I</em>i}$ is the output of unknown sample in $i^{th}$ detector. </p> |
| − | First rewrite the model as: | + | |
| − | $$ | + | <p>Now we have $n$ equation for $n$ variables, it should determined the value of all variables. But unfortunately, this is not a linear system and more importantly, the technique we used to get linear form in last section cannot be transplanted here. A general way to solve such an nonlinear system is so-called "Netwon Iteration Method". </p> |
| − | \sum<em>j(I</em>{max})<em>{ij}H</em>{ij}(X<em>j)-I</em>i=0 | + | |
| − | $$ | + | <p><h4> New Form |
| − | Define: | + | First rewrite the model as: |
| − | $$ | + | $$ |
| − | F(X)=\left(\begin{array}{}F<em>1(X)\F</em>2(X)\···\F<em>n(X)\end{array}\right);X=\left(\begin{array}{}x</em>1\x<em>2\···\x</em>n\end{array}\right) | + | \sum<em>j(I</em>{max})<em>{ij}H</em>{ij}(X<em>j)-I</em>i=0 |
| − | \ F<em>i(X)=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(x</em>j)-I<em>i,for\ i=1,2,3,···,n | + | $$ |
| − | \\therefore F(X)=0 | + | Define: |
| − | $$ | + | $$ |
| − | Now calculate the Jacobian matrix: | + | F(X)=\left(\begin{array}{}F<em>1(X)\F</em>2(X)\···\F<em>n(X)\end{array}\right);X=\left(\begin{array}{}x</em>1\x<em>2\···\x</em>n\end{array}\right) |
| − | $$ | + | \ F<em>i(X)=\sum</em>j(I<em>{max})</em>{ij}H<em>{ij}(x</em>j)-I<em>i,for\ i=1,2,3,···,n |
| − | J(X)=\left(\begin{array}{}\frac{\partial F</em>1(X)}{\partial x<em>1}&\frac{\partial F</em>1(X)}{\partial x<em>1}&···&\frac{\partial F</em>1(X)}{\partial x<em>n}\\frac{\partial F</em>2(X)}{\partial x<em>1}&\frac{\partial F</em>2(X)}{\partial x<em>1}&···&\frac{\partial F</em>2(X)}{\partial x<em>n}\···&···&···&···\\frac{\partial F</em>n(X)}{\partial x<em>1}&\frac{\partial F</em>n(X)}{\partial x<em>1}&···&\frac{\partial F</em>n(X)}{\partial x<em>n}\end{array}\right) | + | \\therefore F(X)=0 |
| − | \ \frac{\partial F</em>i(X)}{\partial x<em>j}=\frac{\partial }{\partial x</em>j}(\sum<em>j(I</em>{max})<em>{ij}H</em>{ij}(x<em>j)-I</em>i)=(I<em>{max})</em>{ij}\frac{\partial }{\partial x<em>j}H</em>{ij}(x<em>j) | + | $$ |
| − | $$ | + | Now calculate the Jacobian matrix: |
| − | Once Jacobian matrix is determined, take iteration: | + | $$ |
| − | $$ | + | J(X)=\left(\begin{array}{}\frac{\partial F</em>1(X)}{\partial x<em>1}&\frac{\partial F</em>1(X)}{\partial x<em>1}&···&\frac{\partial F</em>1(X)}{\partial x<em>n}\\frac{\partial F</em>2(X)}{\partial x<em>1}&\frac{\partial F</em>2(X)}{\partial x<em>1}&···&\frac{\partial F</em>2(X)}{\partial x<em>n}\···&···&···&···\\frac{\partial F</em>n(X)}{\partial x<em>1}&\frac{\partial F</em>n(X)}{\partial x<em>1}&···&\frac{\partial F</em>n(X)}{\partial x<em>n}\end{array}\right) |
| − | x</em>{n+1}=x<em>n-J^{-1}(x</em>n)F(x<em>n) | + | \ \frac{\partial F</em>i(X)}{\partial x<em>j}=\frac{\partial }{\partial x</em>j}(\sum<em>j(I</em>{max})<em>{ij}H</em>{ij}(x<em>j)-I</em>i)=(I<em>{max})</em>{ij}\frac{\partial }{\partial x<em>j}H</em>{ij}(x<em>j) |
| − | $$ | + | $$ |
| − | Theoretically we have | + | Once Jacobian matrix is determined, take iteration: |
| − | $$ | + | $$ |
| − | \\lim</em>{n\to+\infty}x<em>n=\left(\begin{array}{}X</em>1\X<em>2\···\X</em>n\end{array}\right) | + | x</em>{n+1}=x<em>n-J^{-1}(x</em>n)F(x<em>n) |
| − | $$</p> | + | $$ |
| − | + | Theoretically we have | |
| − | <p>This is exactly the newton iteration method.</p> | + | $$ |
| − | + | \\lim</em>{n\to+\infty}x<em>n=\left(\begin{array}{}X</em>1\X<em>2\···\X</em>n\end{array}\right) | |
| − | <p><h4> Notice</p> | + | $$</p> |
| − | + | ||
| − | <p>The iteration method does not always works well, which is depending on the property of iteration functions. For some nonlinear system, the approximation could be useful in a very ting neighborhood. Such neighborhood determines the flexibility of intial value choosing. If the initial value is choosen far from the solution and the solving system is bad, then the iteration will be deficient. Therefore, to choose proper position and proper inital value to start iteration algorithm is critial to final result. </p> | + | <p>This is exactly the newton iteration method.</p> |
| − | + | ||
| − | <p><h5> How to choose initial value <\h5></p> | + | <p><h4> Notice</p> |
| − | + | ||
| − | <p>To get a good initial value for iteration, a proper range of solution should be guessed. Based on the biological property of detectors, the cross talk between target ion and untarget ion should be low. In extreme, the response of a wonderful detector to untarget ion should be a constant. First get rid of all posible background, and then solve the equation only with target ion. </p> | + | <p>The iteration method does not always works well, which is depending on the property of iteration functions. For some nonlinear system, the approximation could be useful in a very ting neighborhood. Such neighborhood determines the flexibility of intial value choosing. If the initial value is choosen far from the solution and the solving system is bad, then the iteration will be deficient. Therefore, to choose proper position and proper inital value to start iteration algorithm is critial to final result. </p> |
| − | + | ||
| − | <p>That means to solve $x^0<em>i$ in following: | + | <p><h5> How to choose initial value <\h5></p> |
| − | $$ | + | |
| − | (I</em>{max})<em>{ii}H</em>{ii}(x<em>i)=I</em>i-\sum<em>{i \neq j}(I</em>{max})_{ij} | + | <p>To get a good initial value for iteration, a proper range of solution should be guessed. Based on the biological property of detectors, the cross talk between target ion and untarget ion should be low. In extreme, the response of a wonderful detector to untarget ion should be a constant. First get rid of all posible background, and then solve the equation only with target ion. </p> |
| − | $$ | + | |
| − | For all $i=1,2,···,n$</p> | + | <p>That means to solve $x^0<em>i$ in following: |
| − | + | $$ | |
| − | <p>Then we got initial value for iteration: | + | (I</em>{max})<em>{ii}H</em>{ii}(x<em>i)=I</em>i-\sum<em>{i \neq j}(I</em>{max})_{ij} |
| − | $$ | + | $$ |
| − | x^0=\left(\begin{array}{}x^0<em>1\x^0</em>2\···\x^0_n\end{array}\right) | + | For all $i=1,2,···,n$</p> |
| − | $$</p> | + | |
| − | + | <p>Then we got initial value for iteration: | |
| − | <p><h5> How to choose data? <\h5></p> | + | $$ |
| − | + | x^0=\left(\begin{array}{}x^0<em>1\x^0</em>2\···\x^0_n\end{array}\right) | |
| − | <p>In our test, the sample can be diluted to 10X or even 100X according to its quality. Therefore, we can choose the data that benefits to the data processing. From analysis, we want to make a robust approximation, which means a small perturbation will not lead to a big change of result. So the region with large derivative should be avoided. One thing should be noticed that since the inputs in algorithm are the detector results and outputs are deduced from response curves, the derivative of the inverse functions of the curve, not response curve, should be considered. Therefore, we choose the detector results with biggest change as the "proper" data. To define the "Change", here considers the difference between adjacent data points. </p> | + | $$</p> |
| − | + | ||
| − | <p><h4> Example | + | <p><h5> How to choose data? <\h5></p> |
| − | Here we provide an example for 2 variables:</p> | + | |
| − | + | <p>In our test, the sample can be diluted to 10X or even 100X according to its quality. Therefore, we can choose the data that benefits to the data processing. From analysis, we want to make a robust approximation, which means a small perturbation will not lead to a big change of result. So the region with large derivative should be avoided. One thing should be noticed that since the inputs in algorithm are the detector results and outputs are deduced from response curves, the derivative of the inverse functions of the curve, not response curve, should be considered. Therefore, we choose the detector results with biggest change as the "proper" data. To define the "Change", here considers the difference between adjacent data points. </p> | |
| − | <pre><code class="mathematica language-mathematica">H[n_, k_, x_, V_] := V*x^n/(k^n + x^n)(*Hill Function*); | + | |
| − | DH[n_, k_, x_, V_] := ( | + | <p><h4> Example |
| − | + | Here we provide an example for 2 variables:</p> | |
| − | + | ||
| − | M = {{0.003,0.0007}} | + | <pre><code class="mathematica language-mathematica">H[n_, k_, x_, V_] := V*x^n/(k^n + x^n)(*Hill Function*); |
| − | (*Initial Value,Determine by "Solve[5*x^2/((10^-3)^2+x^2)\\[Equal]4.5,x];Solve[7*x^2/((10^-3.5)^2+x^2)\[Equal]6,x]"*); | + | DH[n_, k_, x_, V_] := ( |
| − | + | V*k^n n x^(-1 + n))/(k^n + x^ | |
| − | For[i = 1, i < 10, i++, x1 = M[[i, 1]]; x2 = M[[i, 2]]; | + | n)^2(*Derivative of Hill Function*); |
| − | + | M = {{0.003,0.0007}} | |
| − | + | (*Initial Value,Determine by "Solve[5*x^2/((10^-3)^2+x^2)\\[Equal]4.5,x];Solve[7*x^2/((10^-3.5)^2+x^2)\[Equal]6,x]"*); | |
| − | + | ||
| − | + | For[i = 1, i < 10, i++, x1 = M[[i, 1]]; x2 = M[[i, 2]]; | |
| − | + | g1 = H[2, 10^(-3), x1, 5] + H[2, 10^(-1), x2, 4] - 4.5; | |
| − | + | g2 = H[2, 10^(-2), x1, 3] + H[2, 10^(-3.5), x2, 7] - 6; | |
| − | + | {x1d, y1d} = {x1, x2} - | |
| − | </code></pre> | + | Inverse[{{DH[2, 10^(-3), x1, 5], |
| − | + | DH[2, 10^(-1), x2, 4]}, {DH[2, 10^(-2), x1, 3], | |
| − | <p>Result:{0.00299939,0.000679022}</p> | + | DH[2, 10^(-3.5), x2, 7]}}].{g1, g2}; |
| − | + | AppendTo[M, {x1d, y1d}]](*Newton Method*); Print[M[[10]]] | |
| − | <p>Compared to original:{0.003,0.0007}</p> | + | </code></pre> |
| − | + | ||
| − | <p>Good approximation. </p> | + | <p>Result:{0.00299939,0.000679022}</p> |
| + | |||
| + | <p>Compared to original:{0.003,0.0007}</p> | ||
| + | |||
| + | <p>Good approximation. </p> | ||
| + | </body> | ||
</html> | </html> | ||
Revision as of 18:13, 20 October 2019
Modeling
Basic Idea
Here we provide a simple but efficient mathematical model to interpret our data. The model is designed to exclude the influence of cross interactions and so obtain highly accurate results.
Assumption
Any abstract model requires proper assumptions to approximate the real system. Here are our basic assumptions.
- Fluorescence intensity is proportional to the binding efficiency of the ion with the vector.
- Every response curve fits the Hill equation.
- When different kinds of ions are present in the detection system, they do not influence each other's response behaviour.
The first assumption is natural. Consider the whole response process as a computation module that consists of an input, a processing layer, and an output. The input is the number of ions. The processing layer produces fluorescent protein when ions successfully bind to the vector. The output is the amount of fluorescent protein. The only thing that affects the output is the probability of ion binding, which is determined by the input. On average, we have the following: $$ I \propto H(x) $$ Here $I$ refers to intensity, $H(x)$ refers to the probability of binding, and $x$ stands for the ion concentration.
The second assumption, which is based on a simplified but well-established biochemical model, lets us calculate the probability in assumption one. In short, the Hill equation describes the binding of ligands to macromolecules as a function of the ligand concentration. It can be written as: $$ \begin{aligned} P_{active}&=H_a(x)=\frac{x^n}{k_a^n+x^n}\\ P_{inhibitive}&=H_i(x)=\frac{k_i^n}{k_i^n+x^n} \end{aligned} $$
To explain the physical meaning of these parameters, we briefly introduce the underlying model. Suppose the "on" state of the processing layer requires $n$ bound ions, so that the corresponding chemical reaction can be written as: $$ P+nI \rightleftharpoons nPI $$ The dissociation constant can then be expressed as: $$ K_d=\frac{[P][I]^n}{[nPI]} $$ What we focus on is the probability of binding, which is the proportion of $[nPI]$ over all states. To simplify, we only consider two states: no binding ($[P]$) and all binding ($[nPI]$). Therefore the probability is: $$ \begin{aligned} P&=\frac{[nPI]}{[P]+[nPI]}\\ &=\frac{\frac{[P][I]^n}{K_d}}{[P]+\frac{[P][I]^n}{K_d}}\\ &=\frac{[I]^n}{K_d+[I]^n}\\ &=\frac{[I]^n}{k_a^n+[I]^n} \end{aligned} $$ Since $K_d$ is a positive constant, it can be rewritten in power form: $K_d=k_a^n$. So the probability is a function of the ion concentration, which is exactly the second assumption.
Combining assumptions 1 and 2 gives the intensity as a function of the ion concentration: $$ I=I_{max}\frac{[I]^n}{k_a^n+[I]^n} $$ Here $I_{max}$ is the maximum intensity the system can reach.
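As a minimal illustration of the Hill response described above, the following Python sketch evaluates the activating and inhibiting forms and the resulting intensity. It is not part of the original work, which uses Mathematica; the function names and the sample values of `i_max`, `k`, and `n` are assumptions chosen for illustration only.

```python
def hill_active(x, k, n):
    """Activating Hill function: x^n / (k^n + x^n)."""
    return x**n / (k**n + x**n)


def hill_inhibitive(x, k, n):
    """Inhibiting Hill function: k^n / (k^n + x^n)."""
    return k**n / (k**n + x**n)


def intensity(x, i_max, k, n):
    """Intensity under assumptions 1 and 2: I = I_max * H_a(x)."""
    return i_max * hill_active(x, k, n)


# Hypothetical parameters, for illustration only.
half = hill_active(1e-3, 1e-3, 2)  # at x = k the binding probability is one half
total = hill_active(2e-3, 1e-3, 2) + hill_inhibitive(2e-3, 1e-3, 2)  # the two forms sum to 1
signal = intensity(3e-3, 5.0, 1e-3, 2)  # I_max times a binding probability of 0.9
print(half, total, signal)
```

The check at `x = k` shows the role of the constant: it is the concentration at which half of the maximum response is reached.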
The third assumption ensures a linear (additive) model system. Since each kind of ion has its own response curve for each detector, the final intensity is the sum of the intensities contributed by all kinds of ions. This assumption ensures that the contribution of each ion depends only on its own concentration and is not influenced by other ions. Therefore, the intensity for the $i^{th}$ detector should be:
$$
I_i=\sum_j(I_{max})_{ij}H_{ij}(x_j)
$$
Here $H_{ij}$ is the response curve of the $j^{th}$ ion on the $i^{th}$ vector.
Model
Based on the analysis of the assumptions, the model system is clear. For the $i^{th}$ detector, the intensity should be: $$ I_i=\sum_j(I_{max})_{ij}H_{ij}(x_j) $$ Here $H_{ij}$ is the response curve of the $j^{th}$ ion on the $i^{th}$ vector.
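The additive model can be sketched in a few lines of Python (the page's own code is Mathematica, so this is only an illustrative translation). The function names are hypothetical; the parameter values are the ones that appear in the two-variable Mathematica example later on this page, and a single Hill coefficient `n` shared by all curves is an assumption made to keep the sketch short.

```python
def hill(x, k, n):
    """Activating Hill function: x^n / (k^n + x^n)."""
    return x**n / (k**n + x**n)


def detector_intensities(x, i_max, k, n):
    """Return I_i = sum_j (I_max)_ij * H_ij(x_j) for every detector i."""
    return [
        sum(i_max[i][j] * hill(x[j], k[i][j], n) for j in range(len(x)))
        for i in range(len(i_max))
    ]


# Rows are detectors, columns are ions.
I_MAX = [[5.0, 4.0], [3.0, 7.0]]
K = [[1e-3, 1e-1], [1e-2, 10**-3.5]]

signals = detector_intensities([0.003, 0.0007], I_MAX, K, 2)
print(signals)
```

Each detector reading is dominated by its target ion (the diagonal term), with a smaller cross-talk contribution from the other ion.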
To analyze our data, we first determine all the coefficients in the expression and then use these calibrated functions to calculate the actual concentrations in an unknown sample.
Determine Coefficients
During the experiment, we tested how the detectors respond to ions over a concentration gradient. To couple the experiment with the model, the fitting equation needs to be transformed.
Generally, $I=I_{max}H(x)$ can be rewritten as: $$ \log\left(\frac{I}{I_{max}-I}\right)=n\log(x)-n\log(k_a) $$ since $I/(I_{max}-I)=(x/k_a)^n$. In this form, we get a linear relation between the logarithm of the input concentration and the transformed output. The first question is how to find $I_{max}$ in this equation, because this value determines the transformed output data. The second question is how to ease the workload of processing such data, given its large scale. To address both, we define the ratio between each output value and the maximum of all output values as the standard output, as follows:
$$ I_{output}=\{I_1,I_2,\cdots,I_n\} $$
$$ I'_{output}=\{I_1',I_2',\cdots,I_n'\}\quad \text{where}\quad I_i'=\frac{I_i}{\max I_{output}} $$
The elements of $I'_{output}$ satisfy the following equation: $$ \log{\frac{I_i'\max I_{output}}{I_{max}-I_i'\max I_{output}}}=n\log{x_i}-n\log{k} $$ We define the value of $\frac{I_{max}}{\max I_{output}}$ as a parameter $PI_{max}$. Writing $y_i'$ for the normalized output $I_i'$, the equation we actually fit is: $$ \log{\frac{y_i'}{PI_{max}-y_i'}}=n\log{x_i}-n\log{k} $$ The fit is carried out in Mathematica with the following code:
outputdata = {Output1, Output2, Output3, Output4, Output5, Output6, Output7};
(* Normalize by the largest measured output *)
Processeddata = N[outputdata/Max[outputdata]];
(* Pair each normalized output with Log10 of its concentration, from 10^-10 to 10^-4 *)
data = Transpose[{Range[-10, -4], Processeddata}];
(* Solve the linearized Hill equation for y. As written, PImax multiplies y here, *)
(* so it plays the role of the reciprocal of the PI_max defined in the text. *)
solu = Flatten[
  Solve[Log10[(y*PImax)/(1 - (y*PImax))] == n*x - n*logk, y]];
fitparameter = FindFit[data, y /. solu, {PImax, logk, n}, x]
fit = y /. solu /. fitparameter;
Show[ListPlot[data, PlotStyle -> Red], Plot[fit, {x, -11, 0}],
 PlotRange -> {0, 1}]
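The linearized form can also be checked with an ordinary least-squares line. The Python sketch below is a simplified stand-in for the Mathematica fit, not a reproduction of it: it assumes $PI_{max}$ is already known, so only $n$ and $k$ are estimated, and it runs on synthetic data generated from hypothetical parameters (`i_max`, `k_true`, `n_true`) rather than on measured outputs.

```python
import math


def hill(x, k, n):
    """Activating Hill function: x^n / (k^n + x^n)."""
    return x**n / (k**n + x**n)


def fit_hill_linearized(xs, ys, p_imax):
    """Fit log10(y / (PI_max - y)) = n*log10(x) - n*log10(k) by least squares."""
    us = [math.log10(x) for x in xs]
    vs = [math.log10(y / (p_imax - y)) for y in ys]
    u_mean = sum(us) / len(us)
    v_mean = sum(vs) / len(vs)
    sxx = sum((u - u_mean) ** 2 for u in us)
    sxy = sum((u - u_mean) * (v - v_mean) for u, v in zip(us, vs))
    n = sxy / sxx                      # slope of the line
    intercept = v_mean - n * u_mean    # equals -n * log10(k)
    return n, 10 ** (-intercept / n)


# Synthetic calibration data: concentrations 10^-10 ... 10^-4 (assumed values).
xs = [10.0**e for e in range(-10, -3)]
i_max, k_true, n_true = 5.0, 1e-7, 2.0
raw = [i_max * hill(x, k_true, n_true) for x in xs]
ys = [r / max(raw) for r in raw]       # standard output I_i / max(I_output)
p_imax = i_max / max(raw)              # PI_max = I_max / max(I_output)

n_fit, k_fit = fit_hill_linearized(xs, ys, p_imax)
print(n_fit, k_fit)
```

With real data $PI_{max}$ is not known in advance, which is why the Mathematica code fits it together with $n$ and $\log k$ in a nonlinear fit.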
Data Analysis
In the last section, we obtained the response parameters of each detector. Now they will be used to analyze an unknown sample.
Denote the concentration of the $j^{th}$ ion in the sample by $X_j$.
For each detector: $$ I_i=\sum_j(I_{max})_{ij}H_{ij}(X_j),\quad \text{for } i=1,2,3,\cdots,n $$ Here $(I_{max})_{ij}H_{ij}(x)$ has been determined for all $i,j$, and $I_i$ is the output of the unknown sample in the $i^{th}$ detector.
Now we have $n$ equations for $n$ variables, which should determine the values of all the variables. Unfortunately, this is not a linear system and, more importantly, the technique we used to get a linear form in the last section cannot be carried over here. A general way to solve such a nonlinear system is the so-called "Newton Iteration Method".
New Form
First rewrite the model as: $$ \sum_j(I_{max})_{ij}H_{ij}(X_j)-I_i=0 $$ Define: $$ F(X)=\left(\begin{array}{c}F_1(X)\\F_2(X)\\\vdots\\F_n(X)\end{array}\right);\quad X=\left(\begin{array}{c}x_1\\x_2\\\vdots\\x_n\end{array}\right) $$ $$ F_i(X)=\sum_j(I_{max})_{ij}H_{ij}(x_j)-I_i,\quad \text{for } i=1,2,3,\cdots,n $$ $$ \therefore F(X)=0 $$ Now calculate the Jacobian matrix: $$ J(X)=\left(\begin{array}{cccc}\frac{\partial F_1(X)}{\partial x_1}&\frac{\partial F_1(X)}{\partial x_2}&\cdots&\frac{\partial F_1(X)}{\partial x_n}\\\frac{\partial F_2(X)}{\partial x_1}&\frac{\partial F_2(X)}{\partial x_2}&\cdots&\frac{\partial F_2(X)}{\partial x_n}\\\vdots&\vdots&\ddots&\vdots\\\frac{\partial F_n(X)}{\partial x_1}&\frac{\partial F_n(X)}{\partial x_2}&\cdots&\frac{\partial F_n(X)}{\partial x_n}\end{array}\right) $$ $$ \frac{\partial F_i(X)}{\partial x_j}=\frac{\partial }{\partial x_j}\left(\sum_k(I_{max})_{ik}H_{ik}(x_k)-I_i\right)=(I_{max})_{ij}\frac{\partial }{\partial x_j}H_{ij}(x_j) $$ Once the Jacobian matrix is determined, iterate: $$ x^{m+1}=x^{m}-J^{-1}(x^{m})F(x^{m}) $$ where $m$ is the iteration index. Theoretically we have $$ \lim_{m\to+\infty}x^{m}=\left(\begin{array}{c}X_1\\X_2\\\vdots\\X_n\end{array}\right) $$
This is exactly the Newton iteration method.
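The iteration can be sketched in Python as follows. This is an illustrative translation rather than the code used for the project, which is in Mathematica. The helper names are hypothetical, a single Hill coefficient `n` is assumed for all curves, and the linear step is solved by Gaussian elimination instead of forming $J^{-1}$ explicitly. The parameter values, detector readings, and starting point are taken from the two-variable Mathematica example on this page.

```python
def hill(x, k, n):
    """Activating Hill function: x^n / (k^n + x^n)."""
    return x**n / (k**n + x**n)


def hill_derivative(x, k, n):
    """dH/dx = n * k^n * x^(n-1) / (k^n + x^n)^2."""
    return n * k**n * x ** (n - 1) / (k**n + x**n) ** 2


def solve_linear(a, b):
    """Solve a.d = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    d = [0.0] * size
    for r in reversed(range(size)):
        tail = sum(aug[r][c] * d[c] for c in range(r + 1, size))
        d[r] = (aug[r][size] - tail) / aug[r][r]
    return d


def newton_solve(readings, i_max, k, n, x0, steps=9):
    """Run a fixed number of Newton steps on F(x) = 0 for the additive Hill model."""
    x = list(x0)
    size = len(x)
    for _ in range(steps):
        f = [sum(i_max[i][j] * hill(x[j], k[i][j], n) for j in range(size)) - readings[i]
             for i in range(size)]
        jac = [[i_max[i][j] * hill_derivative(x[j], k[i][j], n) for j in range(size)]
               for i in range(size)]
        step = solve_linear(jac, f)            # J * step = F
        x = [xj - dj for xj, dj in zip(x, step)]
    return x


I_MAX = [[5.0, 4.0], [3.0, 7.0]]
K = [[1e-3, 1e-1], [1e-2, 10**-3.5]]
solution = newton_solve([4.5, 6.0], I_MAX, K, 2, [0.003, 0.0007])
print(solution)
```

A fixed step count mirrors the Mathematica loop; a production version would more likely stop when the residual falls below a tolerance.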
Notice
The iteration method does not always work well; its behaviour depends on the properties of the iterated functions. For some nonlinear systems, the approximation is useful only in a very tiny neighborhood of the solution. This neighborhood determines how much freedom there is in choosing the initial value. If the initial value is chosen far from the solution and the system is badly behaved, the iteration will perform poorly or fail to converge. Therefore, choosing a proper position and a proper initial value to start the iteration is critical to the final result.
How to choose initial value
To get a good initial value for the iteration, a proper range for the solution should be guessed. Based on the biological properties of the detectors, the cross talk between the target ion and non-target ions should be low. In the extreme case, the response of an ideal detector to non-target ions is a constant. So we first subtract all possible background, and then solve the equation with the target ion only.
That means solving for $x^0_i$ in the following: $$ (I_{max})_{ii}H_{ii}(x^0_i)=I_i-\sum_{j \neq i}(I_{max})_{ij} $$ for all $i=1,2,\cdots,n$.
Then we get the initial value for the iteration: $$ x^0=\left(\begin{array}{c}x^0_1\\x^0_2\\\vdots\\x^0_n\end{array}\right) $$
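Because a single Hill curve can be inverted in closed form, each component of the initial value can be computed directly. The Python sketch below is only an illustration (the page's own code is Mathematica). The function name and the optional `background` argument are assumptions; the background defaults to zero, which matches the `Solve` calls quoted in the comment of the two-variable Mathematica example.

```python
def initial_guess(reading, i_max, k, n, background=0.0):
    """Invert I_max * x^n / (k^n + x^n) = reading - background for x."""
    ratio = (reading - background) / i_max
    if not 0.0 < ratio < 1.0:
        raise ValueError("corrected reading must lie strictly between 0 and I_max")
    return k * (ratio / (1.0 - ratio)) ** (1.0 / n)


# Diagonal (target-ion) parameters from the two-variable example on this page.
guess_1 = initial_guess(4.5, 5.0, 1e-3, 2)
guess_2 = initial_guess(6.0, 7.0, 10**-3.5, 2)
print(guess_1, guess_2)
```

The guard on `ratio` reflects that a reading at or above the maximum intensity, or at or below the background, has no finite positive solution on the target curve.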
How to choose data?
In our test, the sample can be diluted 10X or even 100X according to its quality. Therefore, we can choose the data that is most favourable for processing. We want a robust approximation, meaning that a small perturbation in the measurement does not lead to a big change in the result. So regions with a large derivative should be avoided. Note that since the inputs of the algorithm are the detector results and the outputs are deduced from the response curves, the derivative that matters is that of the inverse function of the response curve, not of the response curve itself. The inverse has a small derivative where the response curve is steep, so we choose the detector results with the biggest change as the "proper" data. The "change" is defined here as the difference between adjacent data points.
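One possible reading of this selection rule, sketched in Python (the page's own code is Mathematica): find the pair of adjacent calibration points with the largest difference, then keep the dilution whose reading lies closest to the middle of that interval. The function names, the midpoint criterion, and all numbers below are assumptions for illustration; the text itself only specifies the adjacent-difference measure.

```python
def steepest_interval(responses):
    """Index i of the adjacent pair (i, i + 1) with the largest absolute change."""
    changes = [abs(b - a) for a, b in zip(responses, responses[1:])]
    return max(range(len(changes)), key=changes.__getitem__)


def pick_dilution(readings_by_dilution, responses):
    """Choose the dilution whose reading is nearest the middle of the steepest interval."""
    i = steepest_interval(responses)
    target = 0.5 * (responses[i] + responses[i + 1])
    return min(readings_by_dilution, key=lambda d: abs(readings_by_dilution[d] - target))


# Hypothetical normalized calibration curve and readings of one sample at three dilutions.
calibration = [0.01, 0.05, 0.30, 0.80, 0.97, 1.00]
readings = {1: 0.98, 10: 0.62, 100: 0.08}

steepest = steepest_interval(calibration)
best = pick_dilution(readings, calibration)
print(steepest, best)
```

Here the undiluted and 100X readings sit on the flat ends of the curve, where a small measurement error maps to a large concentration error, so the 10X reading is preferred.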
Example
Here we provide an example for 2 variables:
H[n_, k_, x_, V_] := V*x^n/(k^n + x^n); (* Hill function *)
DH[n_, k_, x_, V_] := (V*k^n*n*x^(n - 1))/(k^n + x^n)^2; (* Derivative of the Hill function with respect to x *)
(* Initial value, determined by Solve[5*x^2/((10^-3)^2 + x^2) == 4.5, x] and Solve[7*x^2/((10^-3.5)^2 + x^2) == 6, x] *)
M = {{0.003, 0.0007}};
(* Newton method: 9 iterations, each appending the new estimate to M *)
For[i = 1, i < 10, i++,
 x1 = M[[i, 1]]; x2 = M[[i, 2]];
 g1 = H[2, 10^(-3), x1, 5] + H[2, 10^(-1), x2, 4] - 4.5;
 g2 = H[2, 10^(-2), x1, 3] + H[2, 10^(-3.5), x2, 7] - 6;
 {x1d, x2d} = {x1, x2} -
   Inverse[{{DH[2, 10^(-3), x1, 5],
       DH[2, 10^(-1), x2, 4]}, {DH[2, 10^(-2), x1, 3],
       DH[2, 10^(-3.5), x2, 7]}}].{g1, g2};
 AppendTo[M, {x1d, x2d}]];
Print[M[[10]]]
Result:{0.00299939,0.000679022}
Compared to original:{0.003,0.0007}
Good approximation.