Model
Theoretical Analysis in Dry Experiment
Computation Model
In this part, we propose molecular computation models for artificial neural networks and prove their validity via kinetic analysis.
In our systems, numerical values are represented by the concentrations of certain species. Chemical reactions are simplified as formal reactions such as \(A+B \xrightarrow{k} C\). A formal reaction consists of reactants and products (\(A\), \(B\) and \(C\)) and a rate constant (\(k\)). For each computation operation, the initial concentrations of certain species are given as inputs, and the outputs are represented by the concentrations of other species in the system at the end of the reactions.
According to [1], formal reactions with at most two reactants can be mapped to DNA strand displacement (DSD) reactions [2] without losing the kinetic features of the reaction. Therefore, in our project, after designing formal reactions that satisfy this constraint, we use this model to map our calculation operations to implementable DSD reactions.
As the calculation steps (weighted summation, activation and backpropagation) can be decomposed into addition, subtraction and multiplication, we only provide the implementation of these three operations. For each operation, we first give the formal reactions, then the kinetic analysis, and finally the proposed DSD implementation. The analysis in our model is entirely based on classical mass-action kinetics.
Addition:
To implement addition, we utilize:
\(A_1 \xrightarrow{k_1} O, A_2 \xrightarrow{k_2} O,... A_n \xrightarrow{k_n} O\).
The initial concentrations of the reactants \(A_i\) are the addends and the final concentration of the product \(O\) represents the sum, i.e. these reactions compute \([O](\infty)=\sum_{i=1}^n[A_i](0)\).
Proof:
\(\dfrac{d [A_i](t)}{d t}=-k_i[A_i](t)\ (i=1,2,\dots,n)\) \(\Rightarrow [A_i](t)=[A_i](0)e^{-k_it}\). \(\dfrac{d [O](t)}{d t}=\sum_{i=1}^n k_i[A_i](t)\) \(\Rightarrow [O](t)=\sum_{i=1}^n[A_i](0)\left(1-e^{-k_it}\right)\) \(\Rightarrow [O](\infty)=\sum_{i=1}^n[A_i](0)\). Thus addition is successfully implemented.
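As a quick numerical sanity check of this result (not part of the wet-lab workflow; the rate constants and initial concentrations below are arbitrary illustrative values), the addition reactions can be integrated directly with SciPy:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Addition CRN: A_i -> O with rate k_i; [O](inf) should equal the sum of [A_i](0).
k = np.array([1.0, 0.5, 2.0])        # arbitrary rate constants
A0 = np.array([0.3, 0.7, 1.2])       # addends (initial concentrations)

def rhs(t, y):
    A, O = y[:-1], y[-1]
    dA = -k * A                      # each A_i decays...
    dO = np.sum(k * A)               # ...and everything consumed is released as O
    return np.append(dA, dO)

sol = solve_ivp(rhs, (0, 50), np.append(A0, 0.0), rtol=1e-9, atol=1e-12)
print(sol.y[-1, -1], A0.sum())       # final [O] vs. the expected sum (~2.2)
```

The rate constants only affect how fast \([O]\) converges, not the value it converges to.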
The proposed DSD implementation:
Subtraction:
To build a subtractor, we use:
\(A+B \xrightarrow{k_1} \phi\).
Here, \(A\) and \(B\) are inputs and \(\phi\) is a waste species that is not used elsewhere. Either input can serve as the minuend; the difference appears as the final concentration of whichever species has the larger initial concentration, while the other is driven to zero.
Proof:
Since \(A\) and \(B\) are consumed in equal amounts, \([A](t)-[B](t)\) is conserved, so \([A](t)=[B](t)+\Delta\) with \(\Delta=[A](0)-[B](0)\).
If \(\Delta \neq 0\), \(\dfrac{d [A](t)}{d t}=-k_1[A](t)[B](t)=-k_1[A](t)([A](t)-\Delta)\) \(\Rightarrow [A](t)=\dfrac{[A](0)\Delta}{[A](0)-([A](0)-\Delta)e^{-k_1\Delta t}}\). If \(\Delta > 0\), \([A](\infty)=\Delta\) and \([B](\infty)=0\); otherwise \([A](\infty)=0\) and \([B](\infty)=-\Delta\).
If \(\Delta =0\), \(\dfrac{d [A](t)}{d t}=-k_1[A](t)^2 \Rightarrow [A](t)=\dfrac{[A](0)}{1+k_1[A](0)t}\), so \([A](\infty)=0\). Hence subtraction is implemented.
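The subtractor admits the same kind of numerical check (again with arbitrary illustrative values):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Subtraction CRN: A + B -> waste. The species in excess survives with
# final concentration |[A](0) - [B](0)|; the other one is driven to zero.
k1, A0, B0 = 1.0, 1.5, 0.6           # arbitrary illustrative values

def rhs(t, y):
    A, B = y
    r = k1 * A * B                   # the single annihilation reaction
    return [-r, -r]

sol = solve_ivp(rhs, (0, 100), [A0, B0], rtol=1e-9, atol=1e-12)
print(sol.y[:, -1])                  # ~[0.9, 0.0], i.e. Delta = A0 - B0
```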
The proposed DSD implementation:
Multiplication:
To implement multiplication of concentrations, we utilize
\(\alpha \xrightarrow{k_1} \phi, A+B+\alpha \xrightarrow{k_2} A+B+\alpha+C\).
Concentrations of \(A\) and \(B\) represent the factors and the final concentration of \(C\) is the product. The reactions compute \([C](\infty)=\dfrac{k_2}{k_1}[\alpha](0)[A](0)[B](0)\), which reduces to \([A](0)\times[B](0)\) when \(k_1=k_2\) and \([\alpha](0)=1\).
Proof:
\(\dfrac{d [\alpha](t)}{d t}=-k_1[\alpha](t)\) \(\Rightarrow [\alpha](t)=[\alpha](0)e^{-k_1t}\). \(\dfrac{d [A](t)}{d t}=\dfrac{d [B](t)}{d t}=0\), \(\dfrac{d [C](t)}{d t}=k_2[A](t)[B](t)[\alpha](t)\) \(\Rightarrow [C](\infty)=\int_0^\infty k_2[A](0)[B](0)[\alpha](t)\,dt=\dfrac{k_2}{k_1}[\alpha](0)[A](0)[B](0)\). Hence multiplication is implemented.
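Because \(A\) and \(B\) are purely catalytic here, they stay at their initial concentrations and the system reduces to two state variables; a brief numerical check (illustrative values only) is:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Multiplication CRN: alpha -> waste (k1); A + B + alpha -> A + B + alpha + C (k2).
# Expected: [C](inf) = (k2 / k1) * [alpha](0) * [A](0) * [B](0).
k1, k2 = 1.0, 1.0
A0, B0, alpha0 = 0.8, 0.5, 1.0       # with k1 = k2 and [alpha](0) = 1, [C](inf) ~ A0 * B0

def rhs(t, y):
    alpha, C = y
    return [-k1 * alpha,             # alpha decays to waste
            k2 * A0 * B0 * alpha]    # C is produced while alpha lasts (A, B constant)

sol = solve_ivp(rhs, (0, 30), [alpha0, 0.0], rtol=1e-9, atol=1e-12)
print(sol.y[1, -1], (k2 / k1) * alpha0 * A0 * B0)   # both ~0.4
```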
The proposed DSD implementation:
To make our proposed formal reactions implementable, the DSD reactions differ from the original version in two minor ways:
1. \(\alpha\) is omitted to reduce the number of reactants.
2. One of the reactants is consumed.
Neuron Implementation
The model shown here is used to construct our Software Tool, which is capable of automatically generating a neuron according to user specifications. As concentrations are always non-negative, representing a possibly negative value requires two species; this is called the dual-rail representation. For example, to represent \(x\) we use the concentrations of both \(x^+\) and \(x^-\) (so that \(x=[x^+]-[x^-]\)) and assign a subtractor \(x^++x^-\xrightarrow{} \phi\).
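In software terms, the dual-rail convention amounts to the following small sketch (the helper names `encode` and `decode` are ours for illustration and are not part of the Software Tool):

```python
# Dual-rail encoding: a signed value x is carried by two non-negative
# concentrations (x_plus, x_minus) with x = x_plus - x_minus.
def encode(x: float) -> tuple[float, float]:
    return (x, 0.0) if x >= 0 else (0.0, -x)

def decode(x_plus: float, x_minus: float) -> float:
    # The annihilation x+ + x- -> waste drives the smaller rail to zero,
    # so the represented value is simply the signed difference of the rails.
    return x_plus - x_minus

print(encode(-0.7), decode(*encode(-0.7)))   # (0.0, 0.7) -0.7
```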
1. Input layer
The input layer calculates the weighted sum of all input data, i.e. \(\sum_{i=1}^n w_ix_i\), where \(n\) is the number of inputs and \(w_i, x_i\) (\(i=1,2,\dots,n\)) are the weights and inputs, respectively. In our implementation, the previously proposed multiplication method is employed. Because the weights and inputs are dual-rail values, we implement reactions that generate the positive species and the negative species of the sum separately using multiplication.
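For a single term \(w_ix_i\), one plausible set of sign-resolved multiplication reactions (our own sketch built from the multiplication motif above, with rate constants omitted; the implementable DSD version additionally drops \(\alpha\) and consumes one reactant, as noted earlier) is:
\(\begin{aligned}
w_i^+ + x_i^+ + \alpha &\xrightarrow{} w_i^+ + x_i^+ + \alpha + s^+, \qquad & w_i^- + x_i^- + \alpha &\xrightarrow{} w_i^- + x_i^- + \alpha + s^+,\\
w_i^+ + x_i^- + \alpha &\xrightarrow{} w_i^+ + x_i^- + \alpha + s^-, \qquad & w_i^- + x_i^+ + \alpha &\xrightarrow{} w_i^- + x_i^+ + \alpha + s^-.
\end{aligned}\)
Like-signed pairs feed the positive rail \(s^+\), opposite-signed pairs feed the negative rail \(s^-\), and the annihilation \(s^++s^-\xrightarrow{} \phi\) resolves the signed sum.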
2. Activation
Activation functions are usually used in neural networks to provide non-linearity: after the weighted sum passes through an activation layer, the result is no longer a linear combination of the inputs. Widely-used activation functions include the sigmoid function \(f(x)=\dfrac{1}{1+e^{-x}}\) and the rectified linear unit (ReLU) \(f(x)=\max(0,x)\).
Assume the outputs of the input layer are \(s^+\) and \(s^-\). From the expression of ReLU, we can deduce that only subtraction is needed in this step. To output \(0\) when the weighted sum is negative, we simply ignore the concentration of \(s^-\). For the positive part, the subtractor (\(s^++s^- \xrightarrow{} \phi\)) is employed as usual, so the surviving \(s^+\) gives the correct result.
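The readout logic of this step can be summarised in a few lines (a sketch of what the chemistry computes, not a chemical protocol):

```python
# Dual-rail ReLU: the subtractor s+ + s- -> waste leaves max(s, 0) on the
# positive rail (where s = s_plus - s_minus), and the surviving s- is ignored.
def relu_dual_rail(s_plus: float, s_minus: float) -> float:
    return max(s_plus - s_minus, 0.0)

print(relu_dual_rail(0.9, 0.3), relu_dual_rail(0.2, 0.5))   # 0.6 0.0
```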
3. Backpropagation
The backpropagation algorithm [3] is widely applied in modern neural networks. Simply put, it calculates how each parameter of the system affects the final output and adjusts the parameter accordingly. Our calculation is based on the loss function \(L=\frac{1}{2}(y-d)^2\), where \(y\) is the output and \(d\) is the desired output used in supervised learning. To determine how the weights should be adjusted, we calculate \(\frac{\partial L}{\partial w_i}\) and update the weights using \(w_{i}'=w_i-\epsilon \frac{\partial L}{\partial w_i}\), where \(i\) is the index of the weight and \(\epsilon\) is a parameter called the learning rate. The calculation of the partial derivatives follows the standard chain rule. In our implementation, multiplication, addition and subtraction are all employed.
Assume that the inputs are \(x_i\) \((i=1,2,\dots,n)\), the weighted sum is \(s\), the output of the activation part is \(y=\mathrm{ReLU}(s)\), the desired answer in training is \(d\) and the loss is \(L\). Then
\(\dfrac{\partial L}{\partial w_i}=
\left\{
\begin{aligned}
&0, & s<0\\
&\text{undefined}, & s=0\\
&(y-d)\,x_i, & s>0\\
\end{aligned}
\right.\)
We use subtraction to calculate \(y-d\) and apply multiplication to obtain \(\frac{\partial L}{\partial w_i}\). Then, both subtraction and addition are applied to update the weights (\(w_{i}'=w_i-\epsilon \frac{\partial L}{\partial w_i}\)).
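Numerically, one training step of the single ReLU neuron described above amounts to the following sketch (values are illustrative; this mirrors what the reactions compute, it is not the reactions themselves):

```python
import numpy as np

def train_step(w, x, d, eps):
    s = np.dot(w, x)                      # weighted sum (input layer)
    y = max(s, 0.0)                       # ReLU activation
    loss = 0.5 * (y - d) ** 2             # L = (1/2)(y - d)^2
    grad = (y - d) * x if s > 0 else np.zeros_like(w)   # dL/dw_i from the case analysis
    return w - eps * grad, loss           # w_i' = w_i - eps * dL/dw_i

w = np.array([0.2, -0.4, 0.1])
x = np.array([1.0, 0.5, 2.0])
w, loss = train_step(w, x, d=0.3, eps=0.05)
print(w, loss)                            # slightly adjusted weights, loss = 0.005
```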
Algorithm in Wet Experiment
The generation of sequences has to address several problems. Firstly, to guarantee reusability, the generation process should be as random as possible. Secondly, several constraints should be taken into account to produce applicable sequences. The requirements are listed as follows:
Sequence generation requirements (conforming to [4]):
1. All DNA strands include long (recognition) domains and short (toehold) domains.
2. The length of the recognition domain can be adjusted; it is tentatively set at 15 nt.
3. The toehold domain is 5 nt long and consists of a 3 nt toecore and a 2 nt clamp. More specifically, the toecore contains a C at a random position, and the clamp sequence is fixed as AC.
The generation steps are listed below; an illustrative Python sketch follows the list:
Sequence generation steps
1. When generating a DNA sequence, to ensure that the melting temperature required in the experiment can be reached, the initial strand is first generated with a GC content of 30%–70%.
2. After the initial strand is generated, each domain is checked for the consecutive runs AAAA, TTTT or CCC. If such a run is found, its last nucleotide is randomly modified, so that the resulting strands contain no consecutive AAAA, TTTT or CCC.
3. Finally, the complementary strands are generated according to the base-pairing principle.
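For illustration only, a minimal Python sketch of these three steps could look as follows (this is not the team's MATLAB implementation linked below; the domain order and the use of a reverse complement are our assumptions):

```python
import random

BASES = "ATCG"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def random_recognition_domain(length=15):
    """Step 1: draw a recognition domain whose GC content lies in 30%-70%."""
    while True:
        seq = "".join(random.choice(BASES) for _ in range(length))
        gc = (seq.count("G") + seq.count("C")) / length
        if 0.3 <= gc <= 0.7:
            return seq

def remove_runs(seq):
    """Step 2: break up AAAA / TTTT / CCC runs by re-drawing their last base."""
    seq, changed = list(seq), True
    while changed:
        changed = False
        for run in ("AAAA", "TTTT", "CCC"):
            i = "".join(seq).find(run)
            if i != -1:
                seq[i + len(run) - 1] = random.choice(BASES.replace(run[0], ""))
                changed = True
    return "".join(seq)

def toehold_domain():
    """5 nt toehold: 3 nt toecore with a C at a random position, plus the clamp 'AC'."""
    core = [random.choice("ATG") for _ in range(3)]
    core[random.randrange(3)] = "C"
    return "".join(core) + "AC"

def reverse_complement(seq):
    """Step 3: complementary strand by Watson-Crick base pairing."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

strand = toehold_domain() + remove_runs(random_recognition_domain())
print(strand, reverse_complement(strand))
```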
Our original MATLAB code is shown below.
Click here to download the MATLAB .m source file.
This algorithm is integrated into our Software Tool.
References
[1] D. Soloveichik, G. Seelig, E. Winfree, "DNA as a universal substrate for chemical kinetics," Proceedings of the National Academy of Sciences, vol. 107, no. 12, pp. 5393–5398, 2010.
[3] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, 1986.
[4] L. Qian, E. Winfree, "Scaling Up Digital Circuit Computation with DNA Strand Displacement Cascades," Science, vol. 332, no. 6034, pp. 1196–1201, 2011.