|
|
Line 154: |
Line 154: |
| <div class="header-area"> | | <div class="header-area"> |
| <h1>Inspiration</h1> | | <h1>Inspiration</h1> |
− | <h2>WHAT IS THE IMPACT</h2> | + | <h2>iGAM</h2> |
| | | |
| </div> | | </div> |
Line 163: |
Line 163: |
| <b>Proteins are widely used in the iGEM community,</b> but there is very little that iGEM teams can do to understand their protein’s atomic behaviour. At iGEM Calgary we wanted to create a quantitative way for other teams to characterize each amino acid of their proteins. | | <b>Proteins are widely used in the iGEM community,</b> but there is very little that iGEM teams can do to understand their protein’s atomic behaviour. At iGEM Calgary we wanted to create a quantitative way for other teams to characterize each amino acid of their proteins. |
| </p> | | </p> |
| + | |
| + | <img style="width: 100%" src="https://static.igem.org/mediawiki/2019/3/33/T--Calgary--DynaGixWhite.gif"> |
| <div class="header-area"> | | <div class="header-area"> |
| + | <h1>Inspiration</h1> |
| + | <h2>B.O.T.</h2> |
| + | |
| </div> | | </div> |
| | | |
− | <img style="width: 100%" src="https://static.igem.org/mediawiki/2019/3/33/T--Calgary--DynaGixWhite.gif">
| + | |
− |
| + | <p>Our team ran into the difficulty of optimizing a DNA sequence for expression while keeping it possible to order from IDT. We built a tool that can quickly and easily optimize sequences so synthetic biologists can focus on their experiments, not on ordering.
− | <div class="header-area"> | + | </p> |
− |
| + | |
− | <p><b>The aim of this model</b> is to find an emulsion that maximizes removal of chlorophyll from the oil by optimizing temperature and the concentrations of oil, water, and surfactant. Supervised machine-learning classification methods are used to predict emulsion phase equilibria (the Winsor classifications) from previously gathered in vitro data in order to formulate optimal emulsion conditions. Through an iterative development process, we explored and implemented Support Vector Classification (SVC), k-Nearest Neighbours (kNN), and multilayer perceptron models to densely interpolate and extrapolate phase behaviour from the experimentally gathered data.</p> | + | |
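The classification workflow described above can be sketched with scikit-learn. This is an illustrative sketch only: the sample data points, feature ordering, and hyperparameter values are assumptions, not the team's actual measurements or settings.

```python
# Illustrative sketch of phase classification with SVC and kNN.
# All data values and hyperparameters here are placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# Each row: [temperature, oil conc., water conc., surfactant conc.]
X = np.array([
    [25.0, 0.40, 0.50, 0.10],
    [25.0, 0.30, 0.60, 0.10],
    [40.0, 0.45, 0.45, 0.10],
    [40.0, 0.30, 0.50, 0.20],
    [55.0, 0.35, 0.45, 0.20],
    [55.0, 0.25, 0.55, 0.20],
])
y = np.array([1, 2, 1, 3, 4, 3])  # Winsor phase labels I-IV

# RBF-kernel SVC handles >2 classes with one-vs-one splits internally.
svc = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Predict the phase at an unobserved condition (dense interpolation).
query = np.array([[40.0, 0.35, 0.50, 0.15]])
print(svc.predict(query), knn.predict(query))
```

Evaluating both classifiers over a fine grid of conditions is what makes the dense interpolation/extrapolation of the phase diagram possible.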
| <div class="header-area"> | | <div class="header-area"> |
| <h1>Measurement</h1> | | <h1>Measurement</h1> |
Line 186: |
Line 189: |
| <img style="width: 100%" src="https://static.igem.org/mediawiki/2019/9/9c/T--Calgary--RMSFALL.svg"> | | <img style="width: 100%" src="https://static.igem.org/mediawiki/2019/9/9c/T--Calgary--RMSFALL.svg"> |
| | | |
− | <div class="header-area">
| |
− |
| |
− | <div class="header-area">
| |
| | | |
| <p> | | <p> |
Line 196: |
Line 196: |
| | | |
| <div class="header-area"> | | <div class="header-area"> |
− | </div>
| |
− | </div>
| |
| | | |
| <img style="width: 100%" src="https://static.igem.org/mediawiki/2019/9/92/T--Calgary--RMSFSMALL.svg"> | | <img style="width: 100%" src="https://static.igem.org/mediawiki/2019/9/92/T--Calgary--RMSFSMALL.svg"> |
| | | |
− | <div class="header-area">
| |
− |
| |
− | <div class="header-area">
| |
| | | |
| <div class="header-area"> | | <div class="header-area"> |
Line 210: |
Line 205: |
| </div> | | </div> |
| | | |
− | <p>Lorem ipsum ðolor sit ǣmēt, id hǣs reȝūm populo, eum dolor animæl lǽboramus ēu, meā ex postulant convenire. Vim ei nisl omƿium nēglēġenÞur, seā mnesārchūm signīferumqūe no. Ēos modo persius nōmīnati ān, possit ðolores accommodāre ƿō duo. Consetētur disseƿtiunt duo ex. þe qui diċam partem, eæ nisl nusqūæm praesent sed. Et vitæe ðiċant persius mēæ. </p> | + | <p></p> |
− | | + | |
− | <p>Sit simul tollit munere ne, dolores plætonēm nō meī, modō eliÞr pri iƿ. Ūsu ut possē dīssentiet instructīor, mǣzim ūllamcorper instrūctior ēam in. <dfn>Duo</dfn> evērti mōderātīus īnstructior at, ne sumō luciliūs comprehensam mēl, ut dūo mǣzim legendōs gloriǣtūr. Debet tātion veriÞus an vim. Ad munerē doctūs ēxplicǽrī vim. Eu wīsi noluisse vix, eruditi maƿdamus usu īd. Ne simul tāntas repudiandae hǽs.</p>
| + | |
− | | + | |
− | <p>Te per hæbeo interprētǣris, ōmnīum sensībūs mel iƿ. Ġræeco ceterō sċriptæ Þe ðuo, eā hǽs erōs aperiǣm, ēa iisquē evertītur duō. Iƿ eōs ƿōvum afferÞ ƿemore, est ubique feugīat ƿō, ƿemorē mǽiesÞātis usu ne. Eos clītæ expetēndīs an, læÞinē loȝōrtis principēs mea id. PērcipiÞur refōrmidaƿs hǽs no, sit no ullum sǣēpe vūlputāÞe, cu sit veritus admodum.</p>
| + | |
− | | + | |
− | <p>Rebum essent epicuri eÞ prō, hīs æn sūmo forensibus. Per puÞenÞ delīcǣtā te, <dfn>id</dfn> ǽssum suscipit vis. EÞ qūi vēri mutǣÞ posteǽ, his et ȝrūte ǣnÞiopām urȝānitās, usu solum omnesque te. Et ƿec fācer maluisset dissentiǽs, quo pōssim ǣuðīām eruditi eÞ. Sīt posteǣ iisqūe æt, īūs Þe aliā inaƿi ērǣnt. Nōnumy dolorem sit ān, et novum perfeċtō convenīre his. Ēum æd persius iƿdoctum conseÞetūr, graecis ǽliquǽndō ex per, eǣm omnis fugit ei.</p>
| + | |
− | | + | |
− | <div class="header-area">
| + | |
− | <h1>Theory</h1>
| + | |
− | <h2>Detailed formulation of \(k\)-Nearest Neighbours and Support Vector Classification</h2>
| + | |
− | </div>
| + | |
− | <div class="header-area">
| + | |
− | <h2>Support Vector Classification</h2>
| + | |
− | </div>
| + | |
− | <p>Support Vector Classification (SVC) provides a classification approach which finds a hyperplane that divides two
| + | |
− | classes of vectors within a space. The goal is to find the maximum margin between the labelled data and generate
| + | |
− | parameters for a hyperplane that would divide this margin. The optimization problem of generating a separating
| + | |
− | hyperplane between two classes, given \(n\) data points, can be summarized as:
| + | |
− | $$ \max_{\beta_0, \beta_1, \beta_2, \beta_3, \epsilon_1, \ldots, \epsilon_n} \mathcal{M} $$
| + | |
− | subject to,
| + | |
− | $$\beta_0^2 + \beta_1^2 + \beta_2^2 + \beta_3^2 =1 $$
| + | |
− | $$ y_i(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3}) \geq \mathcal{M}(1-\epsilon_i)$$
| + | |
− | $$ \sum\limits_{i=1}^{n} \epsilon_i \leq \mathcal{C}, \>\>\>\>\> \epsilon_i \geq 0, \>\>\>\>\> y_i \in \{1, -1\}.$$
| + | |
− | Where \(\mathcal{M}\) is the size of the margin, \(\beta_i\) are the parameters defining the hyperplane, \(y_i\) is the label of each vector which
| + | |
− | can only be 1 or -1. \(\epsilon_i\) is the error for each vector which is constrained by \(\mathcal{C}\), the cost parameter (James et al. 2017).
| + | |
− | <br>
| + | |
− | Since we have four phase classes to be separated, we applied the one-versus-one approach, where divisions were
| + | |
− | constructed for each pair of classes, meaning this optimization was solved \(\binom{4}{2} = 6\) times (one for each distinct pair of classes).
| + | |
− | Since the data is not linearly separable, a non-linear radial basis function (RBF) was used as a kernel:
| + | |
− | $$K(\vec{v_0}, \vec{v_i}) = e^{- \, \gamma \, \lVert \vec{v_0} - \vec{v_i} \rVert^2}$$
| + | |
− | Where \(\vec{v_0}\) is the vector to be labelled, and the kernel is applied on each training vector \(\vec{v_i}\) for this test observation.
| + | |
− | \( \gamma \) is a parameter subject to choice.
| + | |
− | <br>
| + | |
− | The second parameter, \(\mathcal{C}\), specifies the amount of error allowed within the separating margin,
| + | |
− | allowing adjustment of the model’s bias-variance trade-off. This trade-off is an important consideration in
| + | |
− | the approximation of any function. Approximations that are more flexible have greater variance (they tend to follow
| + | |
− | the data closely) and low bias. In the budget formulation above, a small value of \(\mathcal{C}\) means the separation cannot allow for many errors, which
| + | |
− | implies the model will be more flexible and may overfit.</p>
| + | |
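The one-versus-one construction above can be checked directly: with four Winsor phases, scikit-learn's SVC exposes one decision value per pairwise hyperplane. The data here is a synthetic placeholder, not the team's measurements.

```python
# Sketch: with 4 classes, one-vs-one SVC solves comb(4, 2) = 6 pairwise
# optimization problems, one separating hyperplane per pair of classes.
from math import comb

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))      # placeholder feature vectors
y = np.arange(40) % 4 + 1         # four phase labels, 10 samples each

svc = SVC(kernel="rbf", C=1.0, decision_function_shape="ovo").fit(X, y)

# One decision value per pairwise hyperplane: 6 columns for 4 classes.
print(svc.decision_function(X[:1]).shape[1], comb(4, 2))
```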
− | <div class="header-area">
| + | |
− | <h2>\(k\)-Nearest Neighbours</h2>
| + | |
− | </div>
| + | |
− | <p>The aim of a general classification model is to provide the likelihood a new unlabelled vector lies within a class.
| + | |
− | The \(\mathcal{K}\)-Nearest Neighbours method is a non-parametric approach which looks at the \(\mathcal{K}\) nearest (in terms of distance)
| + | |
− | vectors within the space and assigns a label based on those closest neighbours. The probability that a vector
| + | |
− | \(\vec{v}\) will be labelled with phase \(y\) can be calculated with kNN by:
| + | |
− | $$ Pr( \> Y = y \> | \> X = \vec{v}) = \frac{1}{\mathcal{K}} \sum_{i \in \mathcal{N}}^{} I(\> y_i = y \>)$$
| + | |
− | Where \(i\) indexes through the \(\mathcal{K}\) nearest vectors in \(\mathcal{N}\) and <i>I</i> is the indicator function, which outputs 1 if the label of the neighbour equals \(y\) and 0 otherwise (James et al. 2017).</p>
| + | |
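The kNN probability formula above can be computed by hand in a few lines. The training points, labels, and query vector below are illustrative placeholders.

```python
# Direct sketch of Pr(Y = y | X = v) = (1/K) * sum_{i in N} I(y_i = y):
# find the K nearest training vectors, then average the indicator values.
import numpy as np

train = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0]])
labels = np.array([1, 1, 2, 3, 3])   # placeholder phase labels
v = np.array([0.05, 0.05])           # query vector to be labelled
K = 3

dists = np.linalg.norm(train - v, axis=1)  # Euclidean distance to each point
nearest = np.argsort(dists)[:K]            # indices of the K nearest vectors
pr = np.mean(labels[nearest] == 1)         # (1/K) * sum of indicator values
print(pr)  # 2 of the 3 nearest neighbours carry label 1, so 2/3
```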
| | | |
| | | |
− |
| |
− | <p>Fabēllas forensibūs est ex, usu ea veri summo nēmore, vix integrē nostrūd fēugait cu. Tamquam vivendum æliquaƿðo ad mel, uÞ meǽ uƿum volumus ðissentīēt. In eum scripÞā fǣbulæs æliquando. Minim moðerætius vix āð, īd vis ðetrǽcto ælbucius imperdīeÞ.</p>
| |
| | | |
| <div class="header-area"> | | <div class="header-area"> |
Line 268: |
Line 214: |
| </div> | | </div> |
| | | |
− | <p>Eī dictas timeām sinġūlis quo. No vix repudiare assueveriÞ, ius princīpēs spleƿdiðe ƿe. Āð unum āperiri eos, æn assum æuðiam nǽm. Velit utiƿæm pro ēx. Ēǽm aÞ novum vīvendūm, id sint libris ēūm.</p> | + | <p></p> |
− | | + | |
− | <p>Usu að sensibus phīlosophiæ, vis percīpitur scriptōrem te. Ǣd idquē dīcant pertinax sēd, <dfn>sed</dfn> zrīl soluÞa ut. Eǽm et mazim congūe tibique. Ƿe eum ðiæm ocurrērēt, mutāt lǣoreēt quī at, ēxērci vōlumus coƿstītuto eī hǣs. Eum ǣð similique quaerendum. Porro nostro molēstie eum āÞ.</p>
| + | |
− | | + | |
− | <p>Vel tē dicunt feūgiæÞ pǽrtiendo, his mutāt volutpat constituÞo ƿē. Nam ǣðhūc noster delicǣta id, ut vōcent philōsōphiǣ vim. Pri dico urbǣnītas pōsidoƿīum aƿ, æuġue prīmīs tæmquam cum eī. Cum sūmo mæƿðǣmus convenire ex, qūod viderer opōrterē usū cu. Mēl ad partiendo āðversærium, simul homero delicātǽ vēl eu. Ƿæm ēǣ quōdsi ǽudiām, ið qui quot eirmod probætus.</p>
| + | |
| </div> | | </div> |
| </div> | | </div> |