Difference between revisions of "Team:USTC-Software/Model"

Line 1: Line 1:
 
{{USTC-Software/html/header}}
 
{{USTC-Software/html/header}}
 
<html>
 
<html>
<!--Corresponding link: <li><a href="https://2019.igem.org/Team:USTC-Software/Collaborations">Collaborations</a></li>-->
 
 
<link href="https://2019.igem.org/Template:USTC-Software/css/default?action=raw&ctype=text/css" rel="stylesheet"
 
<link href="https://2019.igem.org/Template:USTC-Software/css/default?action=raw&ctype=text/css" rel="stylesheet"
 
       type="text/css"/>
 
       type="text/css"/>
Line 38: Line 37:
 
             c contains the weight of all reactions contributing to the function. Now it is clear that FBA is a linear
 
             c contains the weight of all reactions contributing to the function. Now it is clear that FBA is a linear
 
             programming problem, and it just works.
 
             programming problem, and it just works.
 +
            <div class="psgImg">
 +
              <img src="https://static.igem.org/mediawiki/2019/e/e4/T--USTC-Software--model_fba_banlanced.png" alt="">
 +
              <p>The conceptual basis of constraint-based modeling. With no constraints, the flux
 +
                distribution of a biological network may lie at any point in a solution space. When mass balance
 +
                constraints imposed by the stoichiometric matrix S (labeled 1) and capacity constraints imposed
 +
                by the lower and upper bounds (\(a_i\) and \(b_i\)) (labeled 2) are applied to a network, they define an
 +
                allowable solution space. The network may acquire any flux distribution within this space, but
 +
                points outside this space are denied by the constraints. Through optimization of an objective
 +
                function, FBA can identify a single optimal flux distribution that lies on the edge of the
 +
                allowable solution space.</p>
 +
            </div>
 
           </div>
 
           </div>
 
           <h4>Flux Variability Analysis</h4>
 
           <h4>Flux Variability Analysis</h4>
Line 48: Line 58:
 
             realistic. Users can select the "loopless" option to avoid such internal loops, and get a more consistent
 
             realistic. Users can select the "loopless" option to avoid such internal loops, and get a more consistent
 
             result.
 
             result.
 +
            <div class="psgImg">
 +
              <img src="https://static.igem.org/mediawiki/2019/f/f3/T--USTC-Software--model_analysis.png" alt="">
 +
              <p>Flux Variability Analysis</p>
 +
            </div>
 
           </div>
 
           </div>
 
           <h3>Computer Models</h3>
 
           <h3>Computer Models</h3>
Line 55: Line 69:
 
             gap between object-oriented programming languages and relational databases and avoids the vulnerability of
 
             gap between object-oriented programming languages and relational databases and avoids the vulnerability of
 
             SQL Injection.
 
             SQL Injection.
             <br><br>
+
             <br>
 +
            <div class="psgImg">
 +
              <img src="https://static.igem.org/mediawiki/2019/c/cb/T--USTC-Software--model_mapping.png" alt="">
 +
              <p>Object-Relational Mapping</p>
 +
            </div>
 
             We use ORM to organize our metabolic networks, biobricks data, and user's computational models. It brings us
 
             We use ORM to organize our metabolic networks, biobricks data, and user's computational models. It brings us
 
             the convenience of creating, querying, listing, and connecting models, with acceptable performance.
 
             the convenience of creating, querying, listing, and connecting models, with acceptable performance.
             <br><br>
+
             <br>
 +
            <div class="psgImg">
 +
              <img src="https://static.igem.org/mediawiki/2019/f/f0/T--USTC-Software--model_DB.png" alt="">
 +
              <p>Relationship between two types of database</p>
 +
            </div>
 
             In practice, we use Django ORM with MySQL backend to provide fast, flexible, and reliable service.
 
             In practice, we use Django ORM with MySQL backend to provide fast, flexible, and reliable service.
 +
            <br>
 +
            <div class="psgImg">
 +
              <img src="https://static.igem.org/mediawiki/2019/1/19/T--USTC-Software--model_django.png" alt="">
 +
              <p>Django ORM</p>
 +
            </div>
 
           </div>
 
           </div>
 
           <h4>Message Queues</h4>
 
           <h4>Message Queues</h4>
Line 70: Line 97:
 
             it obeys the "First Come, First Serve" principle. So, we can store our computing tasks in the queue, and
 
             it obeys the "First Come, First Serve" principle. So, we can store our computing tasks in the queue, and
 
             respond to users instantly about the progress of their tasks.
 
             respond to users instantly about the progress of their tasks.
             <br><br>
+
             <br>
 +
            <div class="psgImg">
 +
              <img src="https://static.igem.org/mediawiki/2019/3/3f/T--USTC-Software--model_msg.png" alt="">
 +
              <p>Message Queues</p>
 +
            </div>
 
             In Foresyn, we use RabbitMQ and Celery to build our message queues. Here, Celery uses its protocol to
 
             In Foresyn, we use RabbitMQ and Celery to build our message queues. Here, Celery uses its protocol to
 
             communicate with web programs and computation programs. And RabbitMQ acts as a broker who accepts and
 
             communicate with web programs and computation programs. And RabbitMQ acts as a broker who accepts and
 
             forwards messages of Celery.
 
             forwards messages of Celery.
 +
            <br>
 +
            <div class="psgImg">
 +
              <img src="https://static.igem.org/mediawiki/2019/f/f3/T--USTC-Software--model_rabit.png" alt="">
 +
              <p>RabbitMQ</p>
 +
            </div>
 
           </div>
 
           </div>
 
           <h4>Different Searching Engines</h4>
 
           <h4>Different Searching Engines</h4>
Line 148: Line 184:
  
 
<link href="https://2019.igem.org/Template:USTC-Software/css/pageContent?action=raw&ctype=text/css" rel="stylesheet">
 
<link href="https://2019.igem.org/Template:USTC-Software/css/pageContent?action=raw&ctype=text/css" rel="stylesheet">
 +
 +
<!--add figure <index>-->
 +
<script src="https://2019.igem.org/Template:USTC-Software/js/insertNumForPics?action=raw&ctype=text/javascript"></script>
  
 
</html>
 
</html>
  
 
{{USTC-Software/html/footer}}
 
{{USTC-Software/html/footer}}

Revision as of 11:55, 20 October 2019

Model

Biological Models

Flux Balance Analysis

Flux Balance Analysis (FBA) is a mathematical method for simulating metabolism in metabolic networks. It uses linear programming to calculate reaction fluxes when the network is at steady state.

Let S be the stoichiometric matrix of the model, with one row per metabolite and one column per reaction. In this matrix, metabolites consumed take negative coefficients and metabolites produced take positive ones. Furthermore, v is a vector that holds the flux of every reaction. At steady state, it satisfies: $$ \textbf{S}\textbf{v}=0 $$ FBA then maximizes or minimizes the objective function \(\textbf{c}^\textrm{T} \textbf{v}\), where c contains the weight with which each reaction contributes to the objective, subject to lower and upper bounds on each flux. Because both the constraints and the objective are linear in v, FBA is a linear programming problem and can be solved with standard solvers.

The conceptual basis of constraint-based modeling. With no constraints, the flux distribution of a biological network may lie at any point in a solution space. When mass balance constraints imposed by the stoichiometric matrix S (labeled 1) and capacity constraints imposed by the lower and upper bounds (\(a_i\) and \(b_i\)) (labeled 2) are applied to a network, they define an allowable solution space. The network may acquire any flux distribution within this space, but points outside this space are denied by the constraints. Through optimization of an objective function, FBA can identify a single optimal flux distribution that lies on the edge of the allowable solution space.
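As a rough illustration of these constraints, the sketch below checks a few candidate flux vectors against the mass balance \(\textbf{S}\textbf{v}=0\) and the flux bounds, then picks the feasible candidate with the best objective value. The three-reaction network, its bounds, and the candidate list are hypothetical and chosen only for illustration. A real FBA run would hand the same S, c, and bounds to a linear programming solver instead of comparing hand-picked candidates.

```python
from typing import List, Sequence

# Hypothetical toy network: R1: -> A, R2: A -> B, R3: B ->
# Rows are metabolites (A, B); columns are reactions (R1, R2, R3).
S = [
    [1, -1, 0],   # A is produced by R1 and consumed by R2
    [0, 1, -1],   # B is produced by R2 and consumed by R3
]
lower = [0, 0, 0]       # lower flux bounds (assumed values)
upper = [10, 5, 1000]   # upper flux bounds (assumed values)
c = [0, 0, 1]           # objective weights: maximize flux through R3


def mat_vec(matrix: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Return the matrix-vector product matrix * v."""
    return [sum(s * x for s, x in zip(row, v)) for row in matrix]


def is_feasible(v: Sequence[float], tol: float = 1e-9) -> bool:
    """True if v satisfies S v = 0 and lower <= v <= upper."""
    balanced = all(abs(x) <= tol for x in mat_vec(S, v))
    bounded = all(lo - tol <= x <= hi + tol for lo, x, hi in zip(lower, v, upper))
    return balanced and bounded


def objective(weights: Sequence[float], v: Sequence[float]) -> float:
    """Return the objective value c^T v."""
    return sum(w * x for w, x in zip(weights, v))


# Hand-picked candidates; only some of them lie in the allowable space.
candidates = [[5, 5, 5], [10, 10, 10], [5, 4, 5], [3, 3, 3]]
feasible = [v for v in candidates if is_feasible(v)]
best = max(feasible, key=lambda v: objective(c, v))
```

Here `[10, 10, 10]` violates the upper bound on R2 and `[5, 4, 5]` violates mass balance, so only the other two candidates remain in the allowable space.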

Flux Variability Analysis

Flux Variability Analysis (FVA) is an extension of FBA. For each reaction, it computes the minimum and maximum flux that still satisfies the constraints and attains the same optimal objective value, by solving two linear programming problems per reaction (one minimization and one maximization).

However, when the metabolic network contains internal loops, the computed flux ranges can be too large in absolute value to be realistic. Users can select the "loopless" option to exclude such internal loops and get a more consistent result.

Flux Variability Analysis
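The idea behind FVA can be sketched on a toy network with two parallel routes from A to B. The network, its 0 to 10 bounds, and the brute-force enumeration over integer fluxes are all assumptions made for illustration. The enumeration stands in for the pair of linear programs that FVA solves for each reaction. It matches the true ranges in this example only because the bounds are integers.

```python
from itertools import product

# Hypothetical toy network:
# R1: -> A, R2: A -> B, R3: A -> B (alternative route), R4: B ->
S = [
    [1, -1, -1, 0],   # metabolite A
    [0, 1, 1, -1],    # metabolite B
]
c = [0, 0, 0, 1]      # objective: maximize flux through R4
UPPER = 10            # every flux is bounded to the range 0..10 (assumed)
N_REACTIONS = 4


def is_balanced(v):
    """True if the flux vector v satisfies S v = 0."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)


def objective(v):
    """Return the objective value c^T v."""
    return sum(w * x for w, x in zip(c, v))


# Step 1 (FBA): find the optimal objective over all feasible integer fluxes.
feasible = [v for v in product(range(UPPER + 1), repeat=N_REACTIONS) if is_balanced(v)]
optimum = max(objective(v) for v in feasible)

# Step 2 (FVA): among the flux vectors that reach the optimum,
# record the minimum and maximum flux of each reaction.
optimal = [v for v in feasible if objective(v) == optimum]
fva_ranges = [
    (min(v[j] for v in optimal), max(v[j] for v in optimal))
    for j in range(N_REACTIONS)
]
```

R1 and R4 are pinned to a single value at the optimum, while R2 and R3 can trade flux against each other, which is the kind of variability FVA is meant to reveal.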

Computer Models

Object-Relational Mapping

Object-Relational Mapping (ORM) lets us work with databases more comfortably and safely. It bridges the gap between object-oriented programming languages and relational databases, and helps avoid SQL injection vulnerabilities because queries are built from parameters rather than from concatenated strings.

Object-Relational Mapping
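The mapping idea can be sketched with the Python standard library alone. This is not Django ORM and not Foresyn's schema. The `Reaction` class, the `ReactionStore` helper, the table layout, and the sample names are hypothetical. The sketch shows rows mapped to objects and values passed as query parameters, so that hostile input is treated as data rather than as SQL.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reaction:
    """Object-side view of one row in the (hypothetical) reaction table."""
    id: Optional[int]
    name: str
    model_name: str


class ReactionStore:
    """Tiny hand-rolled mapper between Reaction objects and a SQL table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS reaction ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, model_name TEXT NOT NULL)"
        )

    def create(self, name: str, model_name: str) -> Reaction:
        # Values are bound as parameters, never formatted into the SQL text.
        cur = self.conn.execute(
            "INSERT INTO reaction (name, model_name) VALUES (?, ?)",
            (name, model_name),
        )
        return Reaction(cur.lastrowid, name, model_name)

    def filter_by_model(self, model_name: str) -> List[Reaction]:
        rows = self.conn.execute(
            "SELECT id, name, model_name FROM reaction "
            "WHERE model_name = ? ORDER BY id",
            (model_name,),
        )
        return [Reaction(*row) for row in rows]


conn = sqlite3.connect(":memory:")
store = ReactionStore(conn)
store.create("PGI", "e_coli_core")
store.create("PFK", "e_coli_core")

# A classic injection string is matched literally and finds nothing.
injected = store.filter_by_model("x' OR '1'='1")
```

A full ORM such as Django's adds schema migrations, relations between models, and lazy query sets on top of this basic pattern.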

We use ORM to organize our metabolic networks, BioBricks data, and users' computational models. It makes creating, querying, listing, and connecting models convenient, with acceptable performance.

Relationship between two types of database

In practice, we use Django ORM with a MySQL backend to provide a fast, flexible, and reliable service.

Django ORM

Message Queues

Our website has to compute large models, and making users wait in their browsers for that long would be a poor experience. So we use message queues to manage computing tasks.

We separate the programs that compute models from our website, and use message queues to send and receive information about the computation. A queue is a data structure that stores items waiting to be handled, and it obeys the "First Come, First Served" principle. So we can store our computing tasks in the queue and respond to users instantly about the progress of their tasks.

Message Queues
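The pattern described above can be sketched with Python's standard `queue` and `threading` modules. This is a single-process stand-in and not the RabbitMQ and Celery setup used in Foresyn. The task names, the `submit` and `worker` functions, and the summation that stands in for a model computation are all hypothetical.

```python
import queue
import threading

tasks = queue.Queue()          # FIFO: first come, first served
status = {}                    # task id -> progress string shown to the user
status_lock = threading.Lock()
processed_order = []           # records the order in which tasks were handled


def submit(task_id, payload):
    """Enqueue a task and return immediately, without waiting for the result."""
    with status_lock:
        status[task_id] = "queued"
    tasks.put((task_id, payload))
    return task_id


def worker():
    """Take tasks off the queue one by one until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        task_id, payload = item
        with status_lock:
            status[task_id] = "running"
        result = sum(payload)  # stand-in for a long model computation
        with status_lock:
            status[task_id] = f"done: {result}"
        processed_order.append(task_id)
        tasks.task_done()


# The web side submits tasks and can report "queued" right away.
submit("task-1", [1, 2, 3])
submit("task-2", [10, 20])

# The computation side runs separately and drains the queue in order.
thread = threading.Thread(target=worker, daemon=True)
thread.start()
tasks.put(None)   # sentinel: stop after the queued tasks are finished
tasks.join()
thread.join()
```

In a deployed system the queue lives in a separate broker process, so the web server and the workers can run on different machines and restart independently.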

In Foresyn, we use RabbitMQ and Celery to build our message queues. Celery uses its own protocol to communicate with the web programs and the computation programs, and RabbitMQ acts as the broker that accepts and forwards Celery's messages.

RabbitMQ

Different Searching Engines

While developing Foresyn, we found a trade-off between speed and accuracy. Our strategy is to use a slow but accurate algorithm when searching small datasets, and a fast but less accurate algorithm when searching large ones.

Our database holds only a few models, such as E. coli, but a large number of genes, reactions, and metabolites. We therefore calculate Levenshtein distances for model searching. The Levenshtein distance is defined by the following recursive function: $$ \qquad\operatorname{lev}_{a,b}(i,j) = \begin{cases} \max(i,j) & \text{if } \min(i,j)=0, \\ \min (\operatorname{lev}_{a,b}(i-1,j) + 1,\operatorname{lev}_{a,b}(i,j-1) + 1, \operatorname{lev}_{a,b}(i-1,j-1) + 1) & \text{if } a_i \neq b_j, \\ \min (\operatorname{lev}_{a,b}(i-1,j) + 1,\operatorname{lev}_{a,b}(i,j-1) + 1, \operatorname{lev}_{a,b}(i-1,j-1)) & \text{otherwise.} \end{cases} $$ Evaluated with dynamic programming, it has a time complexity of \(O(nm)\), where n and m are the lengths of the two strings.
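The recurrence can be evaluated row by row with dynamic programming, as in the sketch below. The `rank_models` helper and the list of model names are hypothetical and serve only to show how the distance might drive a search over a small set of names. They are not taken from Foresyn's code.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein distance between a and b in O(len(a) * len(b)) time."""
    if not a:
        return len(b)
    if not b:
        return len(a)
    # previous[j] holds lev(i-1, j); row 0 is the distance from "" to b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # lev(i, 0) = i
        for j, char_b in enumerate(b, start=1):
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(deletion, insertion, substitution))
        previous = current
    return previous[-1]


def rank_models(query: str, names: List[str]) -> List[str]:
    """Sort names by their case-insensitive distance to the query, closest first."""
    return sorted(names, key=lambda name: levenshtein(query.lower(), name.lower()))


# Illustrative model names; the real database contents may differ.
model_names = ["iJO1366", "e_coli_core", "iML1515"]
ranked = rank_models("e_coli_cor", model_names)
```

Keeping only the previous row reduces memory use to \(O(m)\) while the running time stays \(O(nm)\), which is acceptable when only a few model names have to be compared.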

When handling genes, reactions, and metabolites, we use a different algorithm.

Biobricks Recommendation

TBD

References

  • 1. Orth JD, Thiele I, Palsson BØ. What is flux balance analysis? *Nat Biotechnol*. 2010;28(3):245–248. doi:10.1038/nbt.1614
  • 2. Gudmundsson S, Thiele I. Computationally efficient flux variability analysis. *BMC Bioinformatics*. 2010;11:489. Published 2010 Sep 29. doi:10.1186/1471-2105-11-489
  • 3. Arne C. Müller, Alexander Bockmayr, Fast thermodynamically constrained flux variability analysis, *Bioinformatics*, Volume 29, Issue 7, 1 April 2013, Pages 903–909, https://doi.org/10.1093/bioinformatics/btt059
  • 4. Abdelmoneim Amer Desouki, Florian Jarre, Gabriel Gelius-Dietrich, Martin J. Lercher, CycleFreeFlux: efficient removal of thermodynamically infeasible loops from flux distributions, *Bioinformatics*, Volume 31, Issue 13, 1 July 2015, Pages 2159–2165, https://doi.org/10.1093/bioinformatics/btv096
  • 5. Ambler, Scott W. "Mapping objects to relational databases." On the World Wide Web: http://www.AmbySoft.com (2000).
  • 6. Vinoski, Steve. "Advanced message queuing protocol." IEEE Internet Computing 6 (2006): 87-89.