Revision as of 08:18, 20 October 2019

Model

Biological Models

Flux Balance Analysis

Flux Balance Analysis (FBA) is a mathematical method for simulating metabolism in metabolic networks. It is based on linear programming to calculate fluxes when the model is stable.

Assuming S is a matrix, which contains all reactions in a model. In this matrix, metabolites consumed take negative coefficients, and metabolites produced then take positive ones. Furthermore, v is a vector that represents the flux of all reactions. When the system is steady, it satisfies: $$ \textbf{S}\textbf{v}=0 $$ And then, FBA tries to maximize or minimize the objective function $\textbf{c}^\textrm{T} \textbf{v}$, which c contains the weight of all reactions contributing to the function. Now it is clear that FBA is a linear programming problem, and it just works.

Flux Variability Analysis

Flux Variability Analysis (FVA) is an extension of FBA. It can show the minimum and maximum range of each reaction flux while they satisfy constraints and have the same optimal objective by solving a double linear programming problem.

However, when the metabolic network contains internal loops, the result can be too high in absolute to be realistic. Users can select the "loopless" option to avoid such internal loops, and get a more consistent result.

Computer Models

Object-Relational Mapping

Object-Relational Mapping (ORM) enables us to work with databases more comfortable and safer. It fills the gap between object-oriented programming languages and relational databases and avoids the vulnerability of SQL Injection.

We use ORM to organize our metabolic networks, biobricks data, and user's computational models. It brings us the convenience of creating, querying, listing, and connecting models, with acceptable performance.

In practice, we use Django ORM with MySQL backend to provide fast, flexible, and reliable service.

Message Queues

Our website requires computing large models, and it is quite embarrassing to see users waiting for browsers for so long. So we use message queues to maintain computing tasks.

We split our programs computing models from our website, and use message queues to send and receive information about our computation. A queue is a data structure that stores things waiting to be handled, and it obeys the "First Come, First Serve" principle. So, we can store our computing tasks in the queue, and response to users instantly about the progress of his tasks.

In Foresyn, we use RabbitMQ and Celery to build our message queues. Here, Celery uses its protocol to communicate with web programs and computation programs. And RabbitMQ acts as a broker who accepts and forwards messages of Celery.

Different Searching Engines

When we are developing Foresyn, we discovered that there is a contradiction between speed and accuracy. So our strategy is to use slow but accurate algorithm when searching small datasets, and use fast but less accurate algorithm when searching large ones.

Here, there are only some models like E. coli, but a lot of genes, reactions and metabolites in our database. We choose to calculate Levenshtein distances for model searching. Levenshtein distance defines as the following recursive function: $$ \qquad\operatorname{lev}_{a,b}(i,j) = \begin{cases} \max(i,j) & \text{if } \min(i,j)=0, \\ \min (\operatorname{lev}_{a,b}(i-1,j) + 1,\operatorname{lev}_{a,b}(i,j-1) + 1, \operatorname{lev}_{a,b}(i-1,j-1) + 1) & a_i \neq b_i, \\ \min (\operatorname{lev}_{a,b}(i-1,j) + 1,\operatorname{lev}_{a,b}(i,j-1) + 1, \operatorname{lev}_{a,b}(i-1,j-1)) & \text{otherwise.} \\ \end{cases} $$ which has a time complexity of $O(nm)$ where n and m are the length of the two strings.

When we are handling genes, reactions and metabolites, we use another algorithm.

Biobricks Recommendation

TBD

References

1. Orth JD, Thiele I, Palsson BØ. What is flux balance analysis?. *Nat Biotechnol*. 2010;28(3):245–248. doi:10.1038/nbt.1614
2. Gudmundsson S, Thiele I. Computationally efficient flux variability analysis. *BMC Bioinformatics*. 2010;11:489. Published 2010 Sep 29. doi:10.1186/1471-2105-11-489
3. Arne C. Müller, Alexander Bockmayr, Fast thermodynamically constrained flux variability analysis, *Bioinformatics*, Volume 29, Issue 7, 1 April 2013, Pages 903–909, https://doi.org/10.1093/bioinformatics/btt059
4. Abdelmoneim Amer Desouki, Florian Jarre, Gabriel Gelius-Dietrich, Martin J. Lercher, CycleFreeFlux: efficient removal of thermodynamically infeasible loops from flux distributions, *Bioinformatics*, Volume 31, Issue 13, 1 July 2015, Pages 2159–2165, https://doi.org/10.1093/bioinformatics/btv096
5. Ambler, Scott W. "Mapping objects to relational databases." On the World Wide Web: http://www. AmbySoft. com (2000).
6. Vinoski, Steve. "Advanced message queuing protocol." IEEE Internet Computing 6 (2006): 87-89.

@@ Line 28: / Line 28: @@
            <div class="passage">Flux Balance Analysis (FBA) is a mathematical method for simulating metabolism in
              metabolic networks. It is based on linear programming to calculate fluxes when the model is stable.
-             <br>
+             <br><br>
              Assuming S is a matrix, which contains all reactions in a model. In this matrix, metabolites consumed take
              negative coefficients, and metabolites produced then take positive ones. Furthermore, v is a vector that
@@ Line 44: / Line 44: @@
              reaction flux while they satisfy constraints and have the same optimal objective by solving a double linear
              programming problem.
-             <br>
+             <br><br>
              However, when the metabolic network contains internal loops, the result can be too high in absolute to be
              realistic. Users can select the "loopless" option to avoid such internal loops, and get a more consistent
@@ Line 55: / Line 55: @@
              gap between object-oriented programming languages and relational databases and avoids the vulnerability of
              SQL Injection.
-             <br>
+             <br><br>
              We use ORM to organize our metabolic networks, biobricks data, and user's computational models. It brings us
              the convenience of creating, querying, listing, and connecting models, with acceptable performance.
-             <br>
+             <br><br>
              In practice, we use Django ORM with MySQL backend to provide fast, flexible, and reliable service.
            </div>
@@ Line 65: / Line 65: @@
              Our website requires computing large models, and it is quite embarrassing to see users waiting for browsers
              for so long. So we use message queues to maintain computing tasks.
-             <br>
+             <br><br>
              We split our programs computing models from our website, and use message queues to send and receive
              information about our computation. A queue is a data structure that stores things waiting to be handled, and
              it obeys the "First Come, First Serve" principle. So, we can store our computing tasks in the queue, and
              response to users instantly about the progress of his tasks.
-             <br>
+             <br><br>
              In Foresyn, we use RabbitMQ and Celery to build our message queues. Here, Celery uses its protocol to
              communicate with web programs and computation programs. And RabbitMQ acts as a broker who accepts and
@@ Line 80: / Line 80: @@
              our strategy is to use slow but accurate algorithm when searching small datasets, and use fast but less
              accurate algorithm when searching large ones.
-             <br>
+             <br><br>
              Here, there are only some models like E. coli, but a lot of genes, reactions and metabolites in our
              database. We choose to calculate Levenshtein distances for model searching. Levenshtein distance defines as
@@ Line 94: / Line 94: @@
              $$
              which has a time complexity of \(O(nm)\) where n and m are the length of the two strings.
-             <br>
+             <br><br>
              When we are handling genes, reactions and metabolites, we use another algorithm.
            </div>

Difference between revisions of "Team:USTC-Software/Model"