Model

Theoretical Analysis in Dry Experiment

Computation Model

In this part, we present our chemical-reaction computation model and verify its validity through kinetic analysis.

In our system, numerical values are represented by the concentrations of certain species. Chemical reactions are simplified as formal reactions such as \(A+B \xrightarrow{k} C\). A formal reaction consists of reactants (\(A\) and \(B\)), products (\(C\)) and a rate constant (\(k\)). For each computation operation, the initial concentrations of certain species are given as inputs, and the output is represented by the concentration of another species in the system at the end of the reaction.

According to [1], formal reactions with at most two reactants can be mapped to DNA strand displacement (DSD) reactions [2] without losing the kinetic features of the reaction. Therefore, in our project, after designing the formal reactions we use this model to map our calculation operations to implementable DSD reactions.

As the calculation steps (weighted summation, activation and backpropagation) can be decomposed into addition, subtraction and multiplication operations, we only provide implementations of these three operations. For each operation, we first present the formal reactions, then give a kinetic analysis, and finally propose a DSD reaction implementation. The analysis in our model is based entirely on classical mass-action kinetics.

Addition:

To implement addition, we utilize:
\(A_1 \xrightarrow{k_1} O,\quad A_2 \xrightarrow{k_2} O,\quad \dots,\quad A_n \xrightarrow{k_n} O\).
The initial concentrations of the reactants \(A_i\) are taken as inputs, and the final concentration of the product \(O\) represents the result. These reactions compute \([O](\infty)=\sum_{i=1}^n[A_i](0)\).

Proof:

\(\dfrac{d [A_i](t)}{d t}=-k_i[A_i](t)\ (i=1,2,\dots,n)\) \(\Rightarrow [A_i](t)=[A_i](0)e^{-k_it}\), \(\dfrac{d [O](t)}{d t}=\sum_{i=1}^n k_i[A_i](t)\) \(\Rightarrow [O](t)=\sum_{i=1}^n[A_i](0)\left(1-e^{-k_it}\right)\) \(\Rightarrow [O](\infty)=\sum_{i=1}^n[A_i](0)\). Thus addition is successfully implemented.
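
This result can be sanity-checked by integrating the rate equations numerically. Below is a minimal Python sketch using SciPy (not our laboratory code); the rate constants \(k_i\) and inputs \([A_i](0)\) are illustrative values:

```python
import numpy as np
from scipy.integrate import solve_ivp

k = np.array([1.0, 2.0, 0.5])     # illustrative rate constants k_i
A0 = np.array([0.3, 0.5, 0.2])    # input concentrations [A_i](0)

def rhs(t, y):
    A = y[:-1]                    # concentrations of A_1..A_n; y[-1] is O
    return np.concatenate([-k * A, [np.sum(k * A)]])  # dA_i/dt, dO/dt

sol = solve_ivp(rhs, (0.0, 50.0), np.concatenate([A0, [0.0]]),
                rtol=1e-9, atol=1e-12)
print(sol.y[-1, -1], A0.sum())    # both ~1.0, i.e. [O](inf) = sum_i [A_i](0)
```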

Figure: DSD implementation of addition.

Subtraction:

To build a subtractor, we use \(A+B \xrightarrow{k_1} \phi\), where \(\phi\) denotes inert waste.
The result \(|[A](0)-[B](0)|\) is the final concentration of \(A\) (when \([A](0)>[B](0)\)) or of \(B\) (otherwise).

Proof:

Without loss of generality, set \(k_1=1\) (this only rescales time). Since \(A\) and \(B\) are consumed at the same rate, \([A](t)-[B](t)\) is conserved; writing \(\Delta=[A](0)-[B](0)\), we have \([A](t)=[B](t)+\Delta\).
If \(\Delta \neq 0\), \(\dfrac{d [A](t)}{d t}=-[A](t)([A](t)-\Delta)\) \(\Rightarrow [A](t)=\dfrac{[A](0)\Delta e^{\Delta t}}{[A](0)e^{\Delta t}-[A](0)+\Delta}\). If \(\Delta > 0\), \([A](\infty)=\Delta\); otherwise \([A](\infty)=0\).
If \(\Delta =0\), \([A](t)=\dfrac{[A](0)}{1+[A](0)t}\), so \([A](\infty)=0\). Hence subtraction is implemented.
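
A quick numerical check of this result (a SciPy sketch with illustrative values, not our laboratory code):

```python
from scipy.integrate import solve_ivp

k1, A0, B0 = 1.0, 0.8, 0.3        # illustrative values; Delta = A0 - B0 = 0.5

def rhs(t, y):
    rate = k1 * y[0] * y[1]       # rate of A + B -> waste
    return [-rate, -rate]

sol = solve_ivp(rhs, (0.0, 100.0), [A0, B0], rtol=1e-9, atol=1e-12)
print(sol.y[0, -1], sol.y[1, -1]) # A -> Delta = 0.5, B -> 0
```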

Figure: DSD implementation of subtraction.

Multiplication:

To calculate the multiplication of concentrations, we utilize \(\alpha \xrightarrow{k_1} \phi\) and \(A+B+\alpha \xrightarrow{k_2} A+B+\alpha+C\).
With \([\alpha](0)=k_1/k_2\), these reactions compute \([C](\infty)=[A](0)\times[B](0)\).

Proof:

\(\dfrac{d [\alpha](t)}{d t}=-k_1[\alpha](t)\) \(\Rightarrow [\alpha](t)=[\alpha](0)e^{-k_1t}\). Since \(A\), \(B\) and \(\alpha\) act as catalysts in the second reaction, \(\dfrac{d [A](t)}{d t}=\dfrac{d [B](t)}{d t}=0\) and \(\dfrac{d [C](t)}{d t}=k_2[A](t)[B](t)[\alpha](t)\) \(\Rightarrow [C](\infty)=\int_0^\infty k_2[A](0)[B](0)[\alpha](t)\,dt=\dfrac{k_2}{k_1}[\alpha](0)[A](0)[B](0)\), which equals \([A](0)[B](0)\) when \([\alpha](0)=k_1/k_2\). Hence multiplication is implemented.
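
The same kind of numerical check applies here (a SciPy sketch with illustrative rate constants and inputs):

```python
from scipy.integrate import solve_ivp

k1, k2 = 1.0, 2.0                 # illustrative rate constants
A0, B0 = 0.6, 0.5
alpha0 = k1 / k2                  # chosen so that (k2/k1)*alpha0 = 1

def rhs(t, y):
    alpha = y[0]                  # A and B are catalytic, hence constant
    return [-k1 * alpha, k2 * A0 * B0 * alpha]

sol = solve_ivp(rhs, (0.0, 50.0), [alpha0, 0.0], rtol=1e-9, atol=1e-12)
print(sol.y[1, -1], A0 * B0)      # both ~0.30, i.e. [C](inf) = [A](0)[B](0)
```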

Figure: DSD implementation of multiplication.

Compared with the formal reactions, the DSD implementation makes two minor changes:
1. \(\alpha\) is eliminated to reduce the number of reactants.
2. One of the reactants is consumed.

Neuron Implementation

The model shown here is used to construct our Software Tool, which automatically generates a neuron according to user specifications.

1. Input layer

The input layer computes the weighted sum of all inputs, i.e. \(\sum_{i=1}^n w_ix_i\), where \(n\) is the number of inputs and \(w_i, x_i\) (\(i=1,2,\dots,n\)) are the weights and inputs, respectively. In our implementation, the multiplication method proposed above is employed. Since concentrations cannot be negative, we compute the positive and negative parts of the result separately using multiplication, as sketched below.
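
For illustration, the dual-rail bookkeeping behind the weighted sum can be written in a few lines of Python. This illustrates the arithmetic only (the function name and values are ours), not a simulation of the reactions:

```python
def weighted_sum(weights, inputs):
    """Return (pos, neg) with pos - neg = sum_i w_i * x_i (inputs non-negative)."""
    pos = sum(w * x for w, x in zip(weights, inputs) if w > 0)
    neg = sum(-w * x for w, x in zip(weights, inputs) if w < 0)
    return pos, neg               # each term w_i * x_i comes from a multiplication CRN

print(weighted_sum([0.5, -0.2], [1.0, 2.0]))  # (0.5, 0.4), representing 0.1
```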

2. Activation

Activation functions are used in neural networks to provide non-linearity: they are non-linear functions, so after the weighted sum passes through an activation layer, the result is no longer a linear combination of the inputs. Widely used activation functions include the sigmoid function \(f(x)=\dfrac{1}{1+e^{-x}}\) and the rectified linear unit (ReLU), \(f(x)=\max(0,x)\), i.e. \(f(x)=0\) if \(x\le 0\) and \(f(x)=x\) if \(x>0\).

From the expression of ReLU, we can deduce that only subtraction is needed in this step: annihilating the positive part of the weighted sum against the negative part yields \(0\) whenever the sum is negative, as sketched below.
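
A minimal sketch of this step, treating the subtraction reaction \(A+B \xrightarrow{} \phi\) on the two rails as running to completion (the function name and values are illustrative):

```python
def relu_dual_rail(y_pos, y_neg):
    """Apply A + B -> waste to the two rails; the surplus on the positive rail survives."""
    consumed = min(y_pos, y_neg)  # annihilation stops when one rail is exhausted
    return y_pos - consumed       # = max(0, y_pos - y_neg) = ReLU(y)

print(relu_dual_rail(0.5, 0.4))   # 0.1
print(relu_dual_rail(0.3, 0.7))   # 0.0
```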

3. Backpropagation

The backpropagation algorithm [3] is widely applied in modern neural networks. Briefly, it calculates how each parameter in a system affects the final output and adjusts the parameter accordingly. Our calculation is based on the loss function \(L=\frac{1}{2}(y-d)^2\), where \(y\) is the output and \(d\) is the desired output used in supervised learning. To determine how the weights should be adjusted, we calculate \(\frac{\partial L}{\partial w_i}=(y-d)\frac{\partial y}{\partial w_i}\) and update the weights using \(w_{i}'=w_i-\epsilon \frac{\partial L}{\partial w_i}\), where \(i\) is the index of the weight and \(\epsilon\) is a parameter called the learning rate. The partial derivatives are computed using the chain rule. In our implementation, multiplication, addition and subtraction are all employed; a worked numerical example follows.
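
For a single ReLU neuron, one update step can be worked through numerically. The following Python sketch (with illustrative data and learning rate) mirrors the chain-rule computation described above:

```python
def train_step(w, x, d, eps=0.1):
    """One gradient-descent update for a single ReLU neuron; eps is the learning rate."""
    z = sum(wi * xi for wi, xi in zip(w, x))      # weighted sum
    y = max(0.0, z)                               # ReLU activation
    dy_dz = 1.0 if z > 0 else 0.0                 # ReLU derivative
    return [wi - eps * (y - d) * dy_dz * xi       # w_i' = w_i - eps * dL/dw_i
            for wi, xi in zip(w, x)]

print(train_step([0.5, -0.2], [1.0, 2.0], d=0.4))  # ~[0.53, -0.14]
```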

Algorithm in Wet Experiment

The generation of sequences must address several concerns. First, to guarantee reusability, the generation process should be as random as possible. Second, several constraints must be taken into account to produce applicable sequences. The requirements are listed as follows:

Sequence generation requirements (conforming to [4]):

1. All DNA strands consist of long (recognition) domains and short (toehold) domains.
2. The length of the recognition domain is adjustable and is tentatively set to 15 nt.
3. The toehold domain is 5 nt long, comprising a 3-nt toecore and a 2-nt clamp. The toecore contains a C at a random position, and the clamp sequence is fixed as AC.

The generation steps are listed below:

Sequence generation step

1. When generating a DNA sequence, the initial strand is first generated with a GC content of 30%-70% to ensure a suitable melting temperature in the experiment.
2. After the initial strand is generated, each domain is checked for runs of AAAA, TTTT or CCC. If such a run is found, its last nucleotide is randomly modified, producing DNA strands without AAAA, TTTT or CCC runs.
3. Finally, the complementary strands are generated according to Watson-Crick base pairing. A Python sketch of these steps is given below.
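
Since the original .m file is linked rather than reproduced here, the following Python sketch illustrates the three steps above. It mirrors the listed algorithm, not our exact MATLAB code:

```python
import random

def random_strand(length=15, gc_min=0.3, gc_max=0.7):
    """Step 1: draw strands until the GC content lies in [30%, 70%]."""
    while True:
        s = [random.choice("ACGT") for _ in range(length)]
        gc = sum(b in "GC" for b in s) / length
        if gc_min <= gc <= gc_max:
            return s

def remove_runs(s):
    """Step 2: break any AAAA/TTTT/CCC run by re-drawing its last base."""
    bad = ("AAAA", "TTTT", "CCC")
    while any(p in "".join(s) for p in bad):
        text = "".join(s)
        for p in bad:
            i = text.find(p)
            if i >= 0:
                # replace the run's last base with a different base
                s[i + len(p) - 1] = random.choice("ACGT".replace(p[0], ""))
                break
    return s

def complement(s):
    """Step 3: Watson-Crick pairing (reverse complement of the strand)."""
    pair = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return [pair[b] for b in reversed(s)]

strand = remove_runs(random_strand())
print("".join(strand), "".join(complement(strand)))
```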

Our original MATLAB implementation of this algorithm is available for download.

Click here to download the MATLAB .m source file.

This algorithm is integrated into our Software Tool.

References

[1] D. Soloveichik, G. Seelig, E. Winfree, "DNA as a universal substrate for chemical kinetics," Proceedings of the National Academy of Sciences, vol. 107, no. 12, pp. 5393–5398, 2010.

[2] DNA Strand Displacement.

[3] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, 1986.

[4] L. Qian, E. Winfree, "Scaling up digital circuit computation with DNA strand displacement cascades," Science, vol. 332, no. 6034, pp. 1196–1201, 2011.