Team:USTC-Software/Model

Model

Biological Models

Flux Balance Analysis

Flux Balance Analysis (FBA) is a mathematical method for simulating metabolism in metabolic networks. It uses linear programming to calculate the reaction fluxes when the model is at steady state.

Let S be the stoichiometric matrix, which encodes all reactions in a model: each column is a reaction and each row is a metabolite. In this matrix, metabolites that are consumed take negative coefficients, and metabolites that are produced take positive ones. Furthermore, v is a vector that holds the flux of every reaction. At steady state, the system satisfies: $$ \textbf{S}\textbf{v}=0 $$ FBA then maximizes or minimizes the objective function \(\textbf{c}^\textrm{T} \textbf{v}\), where c contains the weight with which each reaction contributes to the objective. Because both the constraints and the objective are linear in v, FBA is a linear programming problem that standard solvers handle efficiently.

The conceptual basis of constraint-based modeling. With no constraints, the flux distribution of a biological network may lie at any point in a solution space. When mass balance constraints imposed by the stoichiometric matrix S (labeled 1) and capacity constraints imposed by the lower and upper bounds (\(a_i\) and \(b_i\)) (labeled 2) are applied to a network, it defines an allowable solution space. The network may acquire any flux distribution within this space, but points outside this space are denied by the constraints. Through optimization of an objective function, FBA can identify a single optimal flux distribution that lies on the edge of the allowable solution space.
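
Putting the steady-state condition, the flux bounds, and the objective together gives the linear program below. This is the standard textbook form rather than a transcript of our code, and \(Z\) is a symbol introduced here for the optimal objective value: $$ Z = \max_{\textbf{v}} \ \textbf{c}^\textrm{T}\textbf{v} \quad \text{subject to} \quad \textbf{S}\textbf{v}=0, \quad a_i \le v_i \le b_i \ \text{ for every reaction } i $$

As a toy illustration (a made-up network, not one of our models), take a chain of three reactions: uptake of A (\(v_1\)), conversion of A to B (\(v_2\)), and export of B (\(v_3\)). With rows for A and B, $$ \textbf{S} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}, \qquad \textbf{S}\textbf{v}=0 \ \Rightarrow \ v_1 = v_2 = v_3 $$ If the uptake is bounded by \(0 \le v_1 \le 10\) and the objective is the export flux, \(\textbf{c} = (0, 0, 1)^\textrm{T}\), then the optimum is \(Z = 10\) at \(\textbf{v} = (10, 10, 10)^\textrm{T}\), a point on the edge of the allowable solution space.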

Flux Variability Analysis

Flux Variability Analysis (FVA) is an extension of FBA. It computes the minimum and maximum value that each reaction flux can take while still satisfying the constraints and keeping the objective at its optimal value. For each reaction this means solving two linear programming problems, one minimization and one maximization.

However, when the metabolic network contains internal loops, the computed flux ranges can be unrealistically large in absolute value. Users can select the "loopless" option to exclude such internal loops and get a more consistent result.
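
The pair of linear programs behind FVA can be written out as follows. This is a sketch of the usual formulation, assuming the objective is held exactly at its optimum; \(Z\) (the optimal FBA objective value) and \(a_j\), \(b_j\) (the lower and upper flux bounds) are symbols introduced here. For each reaction \(i\): $$ v_i^{\max} = \max_{\textbf{v}} \ v_i, \qquad v_i^{\min} = \min_{\textbf{v}} \ v_i, \qquad \text{both subject to} \quad \textbf{S}\textbf{v}=0, \quad \textbf{c}^\textrm{T}\textbf{v} = Z, \quad a_j \le v_j \le b_j \ \text{ for every reaction } j $$

A model with \(n\) reactions therefore needs \(2n\) linear programs. Flux that circulates around an internal loop cancels out in \(\textbf{S}\textbf{v}=0\) and does not change \(\textbf{c}^\textrm{T}\textbf{v}\), so for reactions in such a loop \(v_i^{\min}\) and \(v_i^{\max}\) are limited only by the bounds \(a_i\) and \(b_i\), which is why their ranges can look unrealistically wide.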

Whole process of analysis

Regulatory Flux Balance Analysis

Regulatory Flux Balance Analysis (rFBA) combines FBA with transcriptional regulation. It integrates a regulatory network and uses it to add constraints on top of those of FBA. This shrinks the solution space, so the results become more credible.
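
One common way to picture the added constraints is with Boolean rules: if the rule for a gene evaluates to false in the current environment, the reactions that depend on that gene get their flux bounds set to zero before FBA is run. The sketch below illustrates that idea only. The function `apply_regulation`, the reaction names, and the simplified `lacY` rule are all hypothetical and are not taken from Foresyn.

```python
def apply_regulation(bounds, gene_rules, reaction_genes, environment):
    """Return new flux bounds with reactions of unexpressed genes closed."""
    # Evaluate each (hypothetical) Boolean rule against the environment.
    expressed = {gene: rule(environment) for gene, rule in gene_rules.items()}
    constrained = {}
    for reaction, (lower, upper) in bounds.items():
        gene = reaction_genes.get(reaction)
        if gene is not None and not expressed[gene]:
            constrained[reaction] = (0.0, 0.0)  # gene off: no flux allowed
        else:
            constrained[reaction] = (lower, upper)
    return constrained


# Illustrative data: two uptake reactions, one of them gene-controlled.
bounds = {"lactose_uptake": (0.0, 10.0), "glucose_uptake": (0.0, 10.0)}
reaction_genes = {"lactose_uptake": "lacY"}
gene_rules = {
    # Simplified rule: lacY is on only with lactose present and glucose absent.
    "lacY": lambda env: env["lactose"] and not env["glucose"],
}

with_glucose = apply_regulation(
    bounds, gene_rules, reaction_genes, {"lactose": True, "glucose": True}
)
lactose_only = apply_regulation(
    bounds, gene_rules, reaction_genes, {"lactose": True, "glucose": False}
)
```

The constrained bounds would then replace the original ones in the FBA problem, which is how the regulatory layer narrows the solution space.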

The combined metabolic/regulatory model can predict the ability of mutant E. coli strains to grow on defined media, as well as time courses of cell growth, substrate uptake, metabolic by-product secretion, and qualitative gene expression under various conditions.

By applying this method in our software, we can make its predictions more precise.

rFBA

Computer Models

Object-Relational Mapping

Object-Relational Mapping (ORM) lets us work with databases more comfortably and safely. It bridges the gap between object-oriented programming languages and relational databases, and it helps prevent SQL injection vulnerabilities.

Object-Relational Mapping

We use ORM to organize our metabolic networks, biobricks data, and users' computational models. It makes creating, querying, listing, and connecting models convenient, with acceptable performance.

Relationship between two types of database

In practice, we use Django ORM with a MySQL backend to provide a fast, flexible, and reliable service.
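
The sketch below illustrates the two ideas above, mapping rows to objects and keeping user input out of the SQL text. It is a stand-in that runs with only the Python standard library (`sqlite3`), not Django ORM or MySQL, and the `Metabolite` class, the `MetaboliteStore` class, and the table layout are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Metabolite:
    """A Python object mapped to one row of the hypothetical metabolite table."""
    name: str
    formula: str


class MetaboliteStore:
    """Tiny hand-written mapper; a real ORM generates this layer for us."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS metabolite "
            "(name TEXT PRIMARY KEY, formula TEXT)"
        )

    def save(self, metabolite):
        # Values travel as bound parameters ("?"), never as part of the SQL text.
        self.connection.execute(
            "INSERT INTO metabolite (name, formula) VALUES (?, ?)",
            (metabolite.name, metabolite.formula),
        )

    def filter_by_name(self, name):
        rows = self.connection.execute(
            "SELECT name, formula FROM metabolite WHERE name = ?", (name,)
        ).fetchall()
        return [Metabolite(*row) for row in rows]


store = MetaboliteStore(sqlite3.connect(":memory:"))
store.save(Metabolite("glucose", "C6H12O6"))
store.save(Metabolite("pyruvate", "C3H4O3"))

found = store.filter_by_name("glucose")
# A classic injection string is treated as an ordinary, non-matching name.
injected = store.filter_by_name("x' OR '1'='1")
```

Because the query text is fixed and the user-supplied value is bound separately, the injection attempt simply matches no rows.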

Django ORM

Message Queues

Our website has to compute large models, and users would otherwise be left waiting a long time for the browser to respond. We therefore use message queues to manage computing tasks.

We separated the programs that compute models from our website, and the two sides exchange information about each computation through message queues. A queue is a data structure that stores items waiting to be handled, in "first come, first served" order. We can therefore store computing tasks in the queue and respond to users instantly with the progress of their tasks.

Message Queues

In Foresyn, we use RabbitMQ and Celery to build our message queues. Celery uses its own protocol to communicate with the web programs and the computation programs, and RabbitMQ acts as the broker that accepts and forwards Celery's messages.
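
The pattern can be sketched with Python's standard library alone. This is not our Celery/RabbitMQ setup: `queue.Queue` stands in for the broker, a thread stands in for the computation program, and the names `submit`, `worker`, `status`, and `results` are hypothetical.

```python
import queue
import threading

tasks = queue.Queue()   # FIFO: first come, first served
status = {}             # task_id -> "queued" | "running" | "done"
results = {}            # task_id -> computed value
completed = []          # task ids in the order they finished
lock = threading.Lock()


def submit(task_id, numbers):
    """Called by the 'website': enqueue the job and return immediately."""
    with lock:
        status[task_id] = "queued"
    tasks.put((task_id, numbers))
    return task_id


def worker():
    """The 'computation program': handle tasks in arrival order."""
    while True:
        item = tasks.get()
        if item is None:            # sentinel: stop the worker
            tasks.task_done()
            break
        task_id, numbers = item
        with lock:
            status[task_id] = "running"
        total = sum(numbers)        # placeholder for an expensive computation
        with lock:
            results[task_id] = total
            status[task_id] = "done"
            completed.append(task_id)
        tasks.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()
submit("task-1", [1, 2, 3])
submit("task-2", [10, 20])
tasks.put(None)
tasks.join()    # wait until every queued item has been processed
thread.join()
```

The web side only ever calls `submit` and reads `status`, so it can answer a progress request without waiting for the computation itself.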

RabbitMQ

Different Searching Engines

While developing Foresyn, we found that there is a trade-off between speed and accuracy. Our strategy is to use a slow but accurate algorithm when searching small datasets, and a fast but less accurate algorithm when searching large ones.

Our database contains only a few models, such as E. coli, but a large number of genes, reactions, and metabolites. We therefore calculate Levenshtein distances for model searching. The Levenshtein distance is defined by the following recursive function: $$ \qquad\operatorname{lev}_{a,b}(i,j) = \begin{cases} \max(i,j) & \text{if } \min(i,j)=0, \\ \min (\operatorname{lev}_{a,b}(i-1,j) + 1,\operatorname{lev}_{a,b}(i,j-1) + 1, \operatorname{lev}_{a,b}(i-1,j-1) + 1) & \text{if } a_i \neq b_j, \\ \min (\operatorname{lev}_{a,b}(i-1,j) + 1,\operatorname{lev}_{a,b}(i,j-1) + 1, \operatorname{lev}_{a,b}(i-1,j-1)) & \text{otherwise,} \end{cases} $$ where \(\operatorname{lev}_{a,b}(i,j)\) is the distance between the first i characters of a and the first j characters of b. Computed with dynamic programming, it has a time complexity of \(O(nm)\), where n and m are the lengths of the two strings.
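
The recurrence can be evaluated row by row, which gives the \(O(nm)\) running time. The following is a minimal sketch rather than the code used in Foresyn. The helper `search_models` and the model names in the usage lines are hypothetical and only show how a distance can rank search results.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b in O(len(a) * len(b)) time."""
    # previous[j] holds lev(i-1, j); row i = 0 is lev(0, j) = j.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # lev(i, 0) = i
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def search_models(query: str, model_names: List[str]) -> List[str]:
    """Rank model names by distance to the query, closest first."""
    return sorted(
        model_names,
        key=lambda name: levenshtein(query.lower(), name.lower()),
    )


# Illustrative usage with made-up model names.
ranked = search_models("e coli", ["iJO1366", "e_coli_core", "ecoli"])
```

Only two rows of the table are kept at a time, so memory use is proportional to the length of the second string.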

For genes, reactions, and metabolites, which form the large datasets, we use a different algorithm that is faster but less accurate.

Biobricks Recommendation

It is troublesome for synthetic biologists to decide which biobricks to use in their computational models. To deal with this issue, we designed a biobrick recommendation system based on keyword extraction and behavior analysis. With our specially designed algorithm, the system can give a precise prediction of the biobricks that users need.

Recommendation

References

  • 1. Orth JD, Thiele I, Palsson BØ. What is flux balance analysis?. *Nat Biotechnol*. 2010;28(3):245–248. doi:10.1038/nbt.1614
  • 2. Gudmundsson S, Thiele I. Computationally efficient flux variability analysis. *BMC Bioinformatics*. 2010;11:489. Published 2010 Sep 29. doi:10.1186/1471-2105-11-489
  • 3. Arne C. Müller, Alexander Bockmayr, Fast thermodynamically constrained flux variability analysis, *Bioinformatics*, Volume 29, Issue 7, 1 April 2013, Pages 903–909, https://doi.org/10.1093/bioinformatics/btt059
  • 4. Abdelmoneim Amer Desouki, Florian Jarre, Gabriel Gelius-Dietrich, Martin J. Lercher, CycleFreeFlux: efficient removal of thermodynamically infeasible loops from flux distributions, *Bioinformatics*, Volume 31, Issue 13, 1 July 2015, Pages 2159–2165, https://doi.org/10.1093/bioinformatics/btv096
  • 5. Ambler, Scott W. "Mapping objects to relational databases." On the World Wide Web: http://www.AmbySoft.com (2000).
  • 6. Vinoski, Steve. "Advanced message queuing protocol." IEEE Internet Computing 6 (2006): 87-89.
  • 7. https://medium.com/@jairvercosa/manger-vs-query-sets-in-django-e9af7ed744e0
  • 8. https://www.nature.com/articles/nbt.1614
  • 9. https://www.cloudamqp.com/blog/2014-12-03-what-is-message-queuing.html
  • 10. https://www.rabbitmq.com/