Model

Theoretical Analysis of the Dry Experiment

Computation Model

In this part, we propose molecular computation models for artificial neural networks and prove their validity via kinetic analysis.

In our systems, numerical values are represented by concentrations of certain species. Chemical reactions are abstracted as formal reactions such as \(A+B \xrightarrow{k} C\), which consist of reactants (\(A\) and \(B\)), products (\(C\)) and a rate constant (\(k\)). For each computation operation, the initial concentrations of certain species serve as inputs, and the outputs are represented by the concentrations of other species once the reactions have run to completion.

According to [1], formal reactions with at most two reactants can be mapped to DNA strand displacement (DSD) reactions [2] without losing the kinetic features of the original reaction. Therefore, in our project we design formal reactions that satisfy this constraint and then use this mapping to translate our calculation operations into implementable DSD reactions.

As the calculation steps (weighted summation, activation and backpropagation) can be decomposed into addition, subtraction and multiplication, we only provide implementations of these three operations. For each operation, we first give the formal reactions, then the kinetic analysis, and finally the proposed DSD implementation. All analysis in our model is based on classical mass-action kinetics.

Addition:

To implement addition, we utilize:
\(A_1 \xrightarrow{k_1} O,\ A_2 \xrightarrow{k_2} O,\ \dots,\ A_n \xrightarrow{k_n} O\).
The initial concentrations of the reactants \(A_i\) are the addends, and the final concentration of the product \(O\) represents the sum: these reactions compute \([O](\infty)=\sum_{i=1}^n[A_i](0)\).

Proof:

\(\dfrac{d[A_i](t)}{dt}=-k_i[A_i](t)\) (\(i=1,2,\dots,n\)) \(\Rightarrow [A_i](t)=[A_i](0)e^{-k_it}\). Then \(\dfrac{d[O](t)}{dt}=\sum_{i=1}^n k_i[A_i](t)\) \(\Rightarrow [O](t)=\sum_{i=1}^n[A_i](0)\left(1-e^{-k_it}\right)\) \(\Rightarrow [O](\infty)=\sum_{i=1}^n[A_i](0)\). Thus addition is implemented.
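
As a sanity check, these kinetics can be simulated numerically. Below is a minimal MATLAB sketch (separate from our wet-lab code; the rate constants and initial concentrations are illustrative) that integrates the addition ODEs with ode45 and confirms that \([O]\) converges to \(\sum_i [A_i](0)\).

    % Mass-action simulation of the addition CRN: A_i -> O (i = 1..3).
    % Rate constants and initial concentrations are illustrative choices.
    k  = [1.0; 1.2; 0.8];                      % rate constants k_i
    A0 = [2.0; 3.0; 5.0];                      % addends [A_i](0)
    % State y = [A_1; A_2; A_3; O]: d[A_i]/dt = -k_i*A_i, d[O]/dt = sum(k_i*A_i)
    odefun = @(t, y) [-k .* y(1:3); sum(k .* y(1:3))];
    [~, y] = ode45(odefun, [0 20], [A0; 0]);
    fprintf('final [O] = %.4f, expected sum = %.4f\n', y(end, 4), sum(A0));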

The proposed DSD implementation is shown in the figure at https://static.igem.org/mediawiki/2019/8/8f/T--SEU--additionDNA.png.

Subtraction:

To build a subtractor, we use:
\(A+B \xrightarrow{k_1} \phi\).
Here \(A\) and \(B\) are the inputs and \(\phi\) denotes waste, i.e. species that are not used elsewhere. Either input may serve as the minuend. The reaction computes \([A](\infty)=\max([A](0)-[B](0),\,0)\) and \([B](\infty)=\max([B](0)-[A](0),\,0)\): whichever species starts in excess survives, and its final concentration equals the difference.

Proof:

Since \(A\) and \(B\) are consumed in pairs, \([A](t)-[B](t)=\Delta\) is conserved, where \(\Delta=[A](0)-[B](0)\).
If \(\Delta \neq 0\), \(\dfrac{d[A](t)}{dt}=-k_1[A](t)\bigl([A](t)-\Delta\bigr)\) \(\Rightarrow [A](t)=\dfrac{[A](0)\Delta}{[A](0)-[B](0)e^{-k_1\Delta t}}\). If \(\Delta > 0\), \([A](\infty)=\Delta\); otherwise \([A](\infty)=0\).
If \(\Delta =0\), \(\dfrac{d[A](t)}{dt}=-k_1[A](t)^2\) \(\Rightarrow [A](t)=\dfrac{[A](0)}{1+k_1[A](0)t}\), so \([A](\infty)=0\). Hence subtraction is implemented.
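
Again, a quick numerical check is possible. The following MATLAB sketch (illustrative values only) integrates \(A+B\xrightarrow{k_1}\phi\) and confirms that the excess species converges to \(\Delta\).

    % Mass-action simulation of the subtraction CRN: A + B -> phi.
    % Illustrative values: [A](0) = 5, [B](0) = 3, so Delta = 2.
    k1 = 1.0;
    y0 = [5.0; 3.0];                           % [A](0), [B](0)
    odefun = @(t, y) [-k1*y(1)*y(2); -k1*y(1)*y(2)];
    [~, y] = ode45(odefun, [0 50], y0);
    fprintf('final [A] = %.4f (expected %.4f), final [B] = %.4f (expected 0)\n', ...
            y(end,1), y0(1) - y0(2), y(end,2));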

The proposed DSD implementation is shown in the figure at https://static.igem.org/mediawiki/2019/b/bf/T--SEU--subtractionDNA.png.

Multiplication:

To calculate the product of two concentrations, we utilize
\(\alpha \xrightarrow{k_1} \phi,\quad A+B+\alpha \xrightarrow{k_2} A+B+\alpha+C\).
The concentrations of \(A\) and \(B\) represent the factors and the final concentration of \(C\) is the product: with \(k_2[\alpha](0)=k_1\) (e.g. \(k_1=k_2\) and \([\alpha](0)=1\)), the scheme computes \([C](\infty)=[A](0)\times[B](0)\).

Proof:

\(\dfrac{d[\alpha](t)}{dt}=-k_1[\alpha](t)\) \(\Rightarrow [\alpha](t)=[\alpha](0)e^{-k_1t}\). Since \(A\) and \(B\) act as catalysts, \(\dfrac{d[A](t)}{dt}=\dfrac{d[B](t)}{dt}=0\), and \(\dfrac{d[C](t)}{dt}=k_2[A](t)[B](t)[\alpha](t)\) \(\Rightarrow [C](\infty)=\int_0^\infty k_2[A](0)[B](0)[\alpha](t)\,dt=\dfrac{k_2}{k_1}[\alpha](0)[A](0)[B](0)\), which equals \([A](0)[B](0)\) under the normalization \(k_2[\alpha](0)=k_1\). Hence multiplication is implemented.
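
The same kind of numerical check applies here. In the sketch below (illustrative values, with \(k_1=k_2\) and \([\alpha](0)=1\)) the catalysts \(A\) and \(B\) are held constant, so only \([\alpha]\) and \([C]\) need to be integrated.

    % Mass-action simulation of the multiplication CRN:
    %   alpha -> phi (rate k1),  A + B + alpha -> A + B + alpha + C (rate k2)
    % With k1 = k2 and [alpha](0) = 1, [C] converges to [A](0)*[B](0).
    k1 = 1.0; k2 = 1.0;
    A0 = 2.0; B0 = 3.0;                        % factors (A, B stay constant)
    y0 = [1.0; 0.0];                           % [alpha](0), [C](0)
    odefun = @(t, y) [-k1*y(1); k2*A0*B0*y(1)];
    [~, y] = ode45(odefun, [0 30], y0);
    fprintf('final [C] = %.4f, expected product = %.4f\n', y(end,2), A0*B0);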

The proposed DSD implementation is shown in the figure at https://static.igem.org/mediawiki/2019/d/d6/T--SEU--multiplicationDNA.png.

To make the proposed formal reactions implementable, our DSD reactions contain two minor changes relative to the original formal versions:
1. \(\alpha\) is removed to reduce the number of reactants.
2. One of the reactants is consumed.

Neuron Implementation

The model shown here is used to construct our Software Tool (https://2019.igem.org/Team:SEU/Software), which automatically generates a neuron according to user specifications. As concentrations are always non-negative, a possibly negative value must be represented by two species, a scheme called the dual-rail representation. For example, to represent \(x\) we use the concentrations of \(x^+\) and \(x^-\), with \(x=[x^+]-[x^-]\), and assign the annihilation reaction \(x^++x^-\xrightarrow{} \phi\) as a subtractor.

1. Input layer

(Figure: https://static.igem.org/mediawiki/2019/7/79/T--SEU--inputLayer.png)
The input layer calculates the weighted sum of the input data, i.e. \(\sum_{i=1}^n w_ix_i\), where \(n\) is the number of inputs and \(w_i, x_i\) (\(i=1,2,\dots,n\)) are the weights and inputs, respectively. The multiplication scheme proposed above is employed. Because weights and inputs are dual-rail encoded, multiplication reactions must generate the positive and negative output species separately: like-sign factor pairs feed the positive sum species, unlike-sign pairs feed the negative one, as made concrete in the sketch below.
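
The following MATLAB sketch illustrates this bookkeeping for a single term \(w_ix_i\); all species names (wi_p, xi_m, s_p, ...) are our own illustrative labels, not a fixed convention of the tool.

    % Dual-rail sign handling for one weighted term w_i * x_i.
    % Like-sign pairs contribute to s+ (s_p); unlike-sign pairs to s- (s_m).
    pairs = { 'wi_p', 'xi_p', 's_p';           % (+)*(+) -> +
              'wi_m', 'xi_m', 's_p';           % (-)*(-) -> +
              'wi_p', 'xi_m', 's_m';           % (+)*(-) -> -
              'wi_m', 'xi_p', 's_m' };         % (-)*(+) -> -
    for r = 1:size(pairs, 1)
        fprintf('%s + %s + alpha -> %s + %s + alpha + %s\n', ...
                pairs{r,1}, pairs{r,2}, pairs{r,1}, pairs{r,2}, pairs{r,3});
    end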

2. Activation

(Figure: https://static.igem.org/mediawiki/2019/c/c7/T--SEU--activation.png)
Activation functions are used in neural networks to provide non-linearity: after the weighted sum passes through an activation layer, the result is no longer a linear combination of the inputs. Widely used activation functions include the sigmoid function \(f(x)=\dfrac{1}{1+e^{-x}}\) and the rectified linear unit (ReLU), \(f(x)=0\) for \(x\le 0\) and \(f(x)=x\) for \(x>0\).

Assume the outputs of the input layer are \(s^+\) and \(s^-\). From the expression of ReLU, we can deduce that only subtraction is needed in this step. To generate \(0\) when the weighted sum is negative, we simply ignore the concentration of \(s^-\); for the positive part, the subtractor \(s^++s^- \xrightarrow{} \phi\) is employed as usual, and the output is read from the surviving \([s^+]\).
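
A small numerical check (illustrative values) shows that reading \([s^+]\) after annihilation indeed implements \(\max(s,0)\):

    % ReLU via dual-rail annihilation: y = max(s, 0), where s = [s+] - [s-].
    % After s+ + s- -> phi runs to completion, the output is the surviving [s+].
    sp0 = 1.5;  sm0 = 4.0;                     % illustrative inputs, s = -2.5
    odefun = @(t, y) [-y(1)*y(2); -y(1)*y(2)];
    [~, y] = ode45(odefun, [0 100], [sp0; sm0]);
    fprintf('ReLU output = %.4f (expected %.4f)\n', y(end,1), max(sp0 - sm0, 0));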

3. Backpropagation

(Figure: https://static.igem.org/mediawiki/2019/d/d1/T--SEU--backpropagation.png)
The backpropagation algorithm [3] is widely applied in modern neural networks. Roughly speaking, it calculates how each parameter of a system affects the final output and adjusts the parameter accordingly. Our calculation is based on the loss function \(L=\frac{1}{2}(y-d)^2\), where \(y\) is the output and \(d\) is the desired output used in supervised learning. To determine how the weights should be adjusted, we calculate \(\frac{\partial L}{\partial w_i}\) and update the weights via \(w_{i}'=w_i-\epsilon \frac{\partial L}{\partial w_i}\), where \(i\) indexes the weight and \(\epsilon\) is the learning rate. The partial derivatives are obtained by the chain rule. This step employs multiplication, addition and subtraction.

Assume that the inputs are \(x_i\) (\(i=1,2,\dots,n\)), the weighted sum is \(s\), the output of the activation part is \(y=\max(s,0)\), the desired answer in training is \(d\) and the loss is \(L\). By the chain rule,
\(\dfrac{\partial L}{\partial w_i}=\begin{cases} 0, & s<0\\ \text{undefined}, & s=0\\ (y-d)\,x_i, & s>0 \end{cases}\)
We use subtraction to calculate \(y-d\) and multiplication to calculate \(\frac{\partial L}{\partial w_i}\); then both subtraction and addition are applied to update the weights (\(w_{i}'=w_i-\epsilon \frac{\partial L}{\partial w_i}\)).
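
For reference, the numerical computation being mapped to reactions is simply the following (a MATLAB sketch with illustrative values; lr stands for the learning rate \(\epsilon\)):

    % One gradient step for a single ReLU neuron, mirroring the piecewise formula.
    x = [1.0; 2.0; 0.5];                       % inputs x_i (illustrative)
    w = [0.3; 0.8; 0.4];                       % weights w_i (illustrative)
    d = 1.0;  lr = 0.1;                        % desired output, learning rate
    s = w' * x;                                % weighted sum
    y = max(s, 0);                             % ReLU activation
    if s > 0
        grad = (y - d) * x;                    % dL/dw_i = (y - d) * x_i
    else
        grad = zeros(size(w));                 % dL/dw_i = 0 for s < 0
    end
    w = w - lr * grad;                         % update w_i' = w_i - lr * dL/dw_i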

Algorithm for the Wet Experiment

Sequence generation must address several issues. First, to guarantee reusability, the generation process should be as random as possible. Second, several constraints must be taken into account to produce applicable sequences. The requirements are listed as follows:

Sequence generation requirements (conforming to [4]):

1. All DNA strands consist of long (recognition) domains and short (toehold) domains.
2. The length of the recognition domain is adjustable, tentatively set at 15 nt.
3. The toehold domain is 5 nt long, consisting of a 3 nt toecore and a 2 nt clamp. The toecore contains a C at a random position, and the clamp sequence is fixed to AC.

The generation steps are listed below:

Sequence generation steps

1. To ensure a suitable melting temperature in the experiment, the initial strand is generated with a GC content of 30%-70%.
2. The generated strand is then checked for runs of AAAA, TTTT or CCC. Whenever such a run is found, its last nucleotide is randomly replaced, producing strands free of AAAA, TTTT and CCC runs.
3. Finally, the complementary strand is generated by Watson-Crick base pairing.

Our original MATLAB code can be viewed as a PDF at https://static.igem.org/mediawiki/2019/2/25/T--SEU--codeMatlab.pdf; a simplified sketch of the procedure is given below.
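
A condensed, illustrative version of the three steps above (the full implementation is in the linked .m file; the domain length and run-removal loop here are simplifications):

    % Simplified, illustrative sketch of the three generation steps above.
    rng('shuffle');
    len   = 15;                                  % recognition-domain length (nt)
    bases = 'ATCG';
    % Step 1: redraw until the GC content lies in [30%, 70%].
    gc = -1;
    while gc < 0.30 || gc > 0.70
        seq = bases(randi(4, 1, len));
        gc  = sum(seq == 'G' | seq == 'C') / len;
    end
    % Step 2: remove forbidden runs (AAAA, TTTT, CCC) by redrawing the last base.
    bad = {'AAAA', 'TTTT', 'CCC'};
    for b = 1:numel(bad)
        idx = strfind(seq, bad{b});
        while ~isempty(idx)
            pos      = idx(1) + length(bad{b}) - 1;  % last nt of the run
            seq(pos) = bases(randi(4));
            idx      = strfind(seq, bad{b});
        end
    end
    % Step 3: complementary strand by Watson-Crick pairing (written 5'->3').
    comp = seq;
    comp(seq == 'A') = 'T';  comp(seq == 'T') = 'A';
    comp(seq == 'G') = 'C';  comp(seq == 'C') = 'G';
    comp = fliplr(comp);
    fprintf('strand:     %s\ncomplement: %s\n', seq, comp);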

The MATLAB .m source file can be downloaded from https://static.igem.org/mediawiki/2019/7/72/T--SEU--codeMatlab.m.

This algorithm is integrated into our Software Tool (https://2019.igem.org/Team:SEU/Software).

References

[1] D. Soloveichik, G. Seelig, E. Winfree, "DNA as a universal substrate for chemical kinetics," Proceedings of the National Academy of Sciences, vol. 107, no. 12, pp. 5393–5398, 2010.

[2] DNA Strand Displacement (video), https://www.youtube.com/watch?v=zjC_0PW9A4c.

[3] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533-536, 1986.

[4] L. Qian, E. Winfree, "Scaling Up Digital Circuit Computation with DNA Strand Displacement Cascades," Science, vol. 332, no. 6034, pp. 1196-1201, 2011.