|
|
| Line 1,543: |
Line 1,543: |
| | <p>The team's initial software idea was to build a metagenomic analysis desktop app for finding a protein with desired characteristics that would use local system resources and would work on any operating system. Before even starting to write the code, our team had a set of concerns about the software being a desktop app, data cleaning, and hardware problems like storage and processing power.</p> | | <p>The team's initial software idea was to build a metagenomic analysis desktop app for finding a protein with desired characteristics that would use local system resources and would work on any operating system. Before even starting to write the code, our team had a set of concerns about the software being a desktop app, data cleaning, and hardware problems like storage and processing power.</p> |
| | | | |
| − | <p>After having our first consultation with Michail Chrunov, desktop software developer, a decision was made to create the software as a web service application. According to Mr. Chrunov, this would increase the reachability and would be more convenient for those who just want to try it without having to install or download additional files. Additionally, this way, the app would use our hardware so we could be sure the analysis to be completed without any hardware-related errors.</p>
| |
| | | | |
| − | <p><b style="color: #f6cd61!important">Implementation:</b></p>
| |
| − | <ul>
| |
| − | <li>We decided to build software as a web-service application instead of a desktop app.</li><br>
| |
| − | </ul>
| |
| | | | |
| − | <p>The next major challenge was solving one of the biggest problems in metagenomics, that is, removing the duplicates from the multi-million sequence FASTA format files and to standardize the sequences for further analysis. Although at first glance, it seemed like a simple enough problem, we later found out that it is a well-known ongoing issue within the bioinformatics field. We had lengthy consultations and discussions with Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, and Kotryna Kvederavičiūtė, and it took us numerous hours until we came up with an algorithmic solution for the problem. After implementing it and improving the code, we ended up having a tool that not only removes duplicates but also standardizes and cleans up the data, which is a must-have for a flawless analysis.</p>
| |
| | | | |
| − | <p><b style="color: #f6cd61!important">Implementation:</b></p>
| |
| − | <ul>
| |
| − | <li>Solved the duplicate problem and implemented standardization & cleanup algorithms.</li><br>
| |
| − | </ul>
| |
| − |
| |
| − | <p>We then moved on to tackling the problem with space for data storage. Since metagenomic data collections take up terabytes of data, we figured that it wouldn’t be cost-effective to let every user store their data on our servers. Dr. Dapkūnas suggested that we could use the best-in-class clustering tool for this job, which takes out only a certain percentage of representative sequences and clusters them into a new, up to 70% smaller file. We were also told that if we are working with big data, the Apache Spark cluster-computing framework is a must.</p>
| |
| − |
| |
| − | <p>The general discussions about our software included topics such as application, performance, legal rights, value, and its impact on the scientific community and the need for such software. We took in many opinions and suggestions from real professionals of their fields and came out of this event with a result we are proud of and hope it can make a breakthrough in the scientific community.</p>
| |
| | <p class="page-heading" style="font-size:1rem!important;text-align:center;">Photos from BioHackathon</p> | | <p class="page-heading" style="font-size:1rem!important;text-align:center;">Photos from BioHackathon</p> |
| | <img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/9/9e/T--Vilnius-Lithuania--Int-Bio-1.jpg"> | | <img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/9/9e/T--Vilnius-Lithuania--Int-Bio-1.jpg"> |
| Line 1,569: |
Line 1,555: |
| | <div class="modal-content"> | | <div class="modal-content"> |
| | <span class="closeModal" id="closeModal20">×</span> | | <span class="closeModal" id="closeModal20">×</span> |
| | + | <p class="page-heading">Michail Chrunov</p> |
| | + | <p>After our first consultation with Michail Chrunov, a desktop software developer, we decided to build the software as a web service application. According to Mr. Chrunov, this would increase its reach and be more convenient for those who just want to try it without having to install or download additional files. Additionally, the app would run on our hardware, so we could be sure that the analysis completes without any hardware-related errors.</p> |
| | + | |
| | + | <p><b style="color: #f6cd61!important">Implementation:</b></p> |
| | + | <ul> |
| | + | <li>We decided to build software as a web-service application instead of a desktop app.</li><br> |
| | + | </ul><br> |
| | + | <p class="page-heading" style="font-size:1rem!important;text-align:center;">Photos from BioHackathon</p> |
| | <img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/5/59/T--Vilnius-Lithuania--int-bio-1.jpg"> | | <img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/5/59/T--Vilnius-Lithuania--int-bio-1.jpg"> |
| | </div> | | </div> |
| Line 1,577: |
Line 1,571: |
| | <div class="modal-content"> | | <div class="modal-content"> |
| | <span class="closeModal" id="closeModal21">×</span> | | <span class="closeModal" id="closeModal21">×</span> |
| | + | <p class="page-heading">Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, Kotryna Kvederavičiūtė</p> |
| | + | <p>The next major challenge was solving one of the biggest problems in metagenomics: removing duplicates from multi-million-sequence FASTA files and standardizing the sequences for further analysis. Although it seemed like a simple enough problem at first glance, we later found out that it is a well-known, ongoing issue in the bioinformatics field. We had lengthy consultations and discussions with Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, and Kotryna Kvederavičiūtė, and it took us many hours to come up with an algorithmic solution. After implementing it and improving the code, we ended up with a tool that not only removes duplicates but also standardizes and cleans up the data, which is a must-have for a flawless analysis.</p> |
| | + | |
| | + | <p><b style="color: #f6cd61!important">Implementation:</b></p> |
| | + | <ul> |
| | + | <li>Solved the duplicate problem and implemented standardization & cleanup algorithms.</li><br> |
| | + | </ul> |
| | + | |
| | + | <p>We then moved on to the problem of data storage space. Since metagenomic data collections take up terabytes, we figured that it would not be cost-effective to let every user store their data on our servers. Dr. Dapkūnas suggested that we use a best-in-class clustering tool for this job, which selects only a certain percentage of representative sequences and clusters them into a new file that is up to 70% smaller. We were also told that when working with big data, the Apache Spark cluster-computing framework is a must.</p> |
| | + | |
| | + | <p>The general discussions about our software covered topics such as its application, performance, legal rights, value, impact on the scientific community, and the need for such software. We took in many opinions and suggestions from professionals in their fields and came out of this event with a result we are proud of, and we hope it can make a breakthrough in the scientific community.</p> |
| | <img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/4/43/T--Vilnius-Lithuania--int-bio-2.jpg"> | | <img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/4/43/T--Vilnius-Lithuania--int-bio-2.jpg"> |
| | </div> | | </div> |
Stage 1: Creating the team
iGEM provides a wide range of possibilities for aspiring young scientists aiming to develop and implement their ideas. iGEM involves different stakeholders, such as students, academia/scientists, public authorities, and private businesses, who exchange ideas in developing solutions for the same issues.
In our team, we clearly understood the importance of team dynamics, work planning, and the allocation of responsibilities. The organizational architecture of our team's resources is based on strategic planning activities. To plan our project development timeline appropriately, we consulted different stakeholders who contributed to our team's functioning in one way or another. We worked in close collaboration with the private sector (our general sponsor, Thermo Fisher Scientific Baltics) and the public sector (our alma mater, Vilnius University).
From the very beginning, we understood that coordinating the whole team, which in our case encompassed laboratory, human practices, IT, design, mathematical modeling, marketing, and public relations areas, requires a defined, coherent management system. Therefore, our team management and formation activities had two directions: formal and informal. The former was aimed at outlining precise team objectives and allocating responsibilities and workload; the latter was intended to promote various team-building activities.
International Conference The COINS 2019
Both the 2018 and 2019 Vilnius-Lithuania iGEM teams participated in The COINS, an international life sciences conference. Here we attended the company fair and gave both oral and poster presentations of the 2018 Vilnius-Lithuania iGEM team project.
More than 300 students attended the presentation to hear about the iGEM competition and synthetic biology. It was the very first presentation by Vilnius-Lithuania iGEM 2018 to our local community after the team had come back from Boston.
The attention of participants was captured not only by last year's team but also by the current one. New iGEM participants were reaching out to potential sponsors and mentors and communicating with interested members of academia.
Most importantly, our team had an exclusive opportunity to talk with the 2014 Nobel Prize laureate in Physiology or Medicine, Prof. John O'Keefe, and hear the story of his life and research, as well as his path to becoming such an esteemed member of the scientific community. He talked passionately about the importance of interdisciplinary thinking, especially philosophy, and of not giving up when things get rough.
Because Prof. O'Keefe is a neuroscientist with a background in engineering, he was able to point out the benefits of applying engineering principles when working with biological systems. This was important to us, as we were already thinking of working with optogenetics, whose origins are also closely related to neuroscience.
We also took the professor's advice on interdisciplinarity, so later on we discussed our project with mathematicians and physicists. Moreover, every time we organized an event, we tried to invite as intellectually diverse a group of people as possible (for example, we asked the philosopher Dr. Jonas Čiurlionis to participate in our discussion on the future of humanity and technology).
Photos from International Conference The COINS 2019
Prof. E. Sužiedėlienė and Dr. Justas Dapkūnas
Having the idea that we could extract light-modular transcription regulators from metagenomes, we searched for a way to find the sequences that our newly discovered proteins bind to.
Our team met Dr. Justas Dapkūnas, who works in the field of bioinformatics as a researcher in the analysis of protein structure, surface properties, and interactions. We reasoned that by using bioinformatic tools, we could predict a DNA sequence that a protein of interest could bind to.
However, team members heard from Dr. Dapkūnas that this sort of protein structure modeling is a possible but extremely challenging task. The first difficulty stems from the fact that sometimes completely unrelated proteins bind to the same DNA sequences, and other times homologous proteins bind to entirely different DNA sequences. Secondly, it could be possible to predict the DNA sequence by docking a very closely related protein with a known structure to genomic DNA from the organism the protein of interest comes from. However, this method would not be possible in our case, as our protein sequences would be gathered from metagenomic databases.
After this conversation with Dr. Dapkūnas, we understood that we would not be able to find the needed DNA sequences in silico and would have to find them through in vitro experimentation. Therefore, we asked Prof. Edita Sužiedėlienė for advice. She suggested performing double-stranded SELEX (systematic evolution of ligands by exponential enrichment). This approach starts by expressing our protein of interest, fused with an affinity tag, in E. coli. The next step would be immobilizing the protein on beads and adding chemically synthesized double-stranded oligonucleotides containing random nucleotides. Professor Sužiedėlienė warned that the assay might be noisy; therefore, we should be prepared to optimize the conditions as much as possible.
Implementations:
- We understood that finding DNA binding sites in silico would be too challenging. Therefore, we decided to use an in vitro method.
- We decided to use the double-stranded SELEX method to find the binding sites of the undescribed protein in our research.
Photos from meeting with Prof. E. Sužiedėlienė and Dr. Justas Dapkūnas
Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, Kotryna Kvederavičiūtė
The next major challenge was solving one of the biggest problems in metagenomics: removing duplicates from multi-million-sequence FASTA files and standardizing the sequences for further analysis. Although it seemed like a simple enough problem at first glance, we later found out that it is a well-known, ongoing issue in the bioinformatics field. We had lengthy consultations and discussions with Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, and Kotryna Kvederavičiūtė, and it took us many hours to come up with an algorithmic solution. After implementing it and improving the code, we ended up with a tool that not only removes duplicates but also standardizes and cleans up the data, which is a must-have for a flawless analysis.
Implementation:
- Solved the duplicate problem and implemented standardization & cleanup algorithms.
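To illustrate the kind of cleanup described above, here is a minimal sketch in Python. It is not our actual algorithm: the function names (`parse_fasta`, `standardize`, `deduplicate`) and the specific standardization rules (uppercasing, removing whitespace, stripping a trailing `*` stop symbol, dropping empty records) are assumptions chosen for the example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def standardize(seq: str) -> str:
    """Hypothetical cleanup rules: no whitespace, uppercase, no trailing '*'."""
    return "".join(seq.split()).upper().rstrip("*")


def deduplicate(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first record for each distinct standardized sequence."""
    seen: Dict[str, str] = {}  # standardized sequence -> first header seen
    for header, seq in records:
        clean = standardize(seq)
        if clean and clean not in seen:  # drop empty records and exact repeats
            seen[clean] = header
    return [(header, seq) for seq, header in seen.items()]
```

For files with millions of sequences, the dictionary keys could be replaced with fixed-size hashes of each sequence to reduce memory use; that trade-off is left out here to keep the sketch short.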
We then moved on to the problem of data storage space. Since metagenomic data collections take up terabytes, we figured that it would not be cost-effective to let every user store their data on our servers. Dr. Dapkūnas suggested that we use a best-in-class clustering tool for this job, which selects only a certain percentage of representative sequences and clusters them into a new file that is up to 70% smaller. We were also told that when working with big data, the Apache Spark cluster-computing framework is a must.
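The idea of keeping only representative sequences can be sketched as a greedy clustering pass. This toy example is an assumption-laden stand-in and not the clustering tool referred to above: it uses the standard library's `difflib.SequenceMatcher` ratio as a rough similarity score and a hypothetical 0.9 threshold, whereas production tools rely on much faster k-mer filtering and alignment.

```python
from difflib import SequenceMatcher
from typing import List


def identity(a: str, b: str) -> float:
    """Rough similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_representatives(seqs: List[str], threshold: float = 0.9) -> List[str]:
    """Greedy clustering: the longest sequences become cluster representatives.

    A sequence is dropped when it is at least `threshold` similar to an
    existing representative; otherwise it starts a new cluster.
    """
    reps: List[str] = []
    for seq in sorted(seqs, key=len, reverse=True):
        if not any(identity(seq, rep) >= threshold for rep in reps):
            reps.append(seq)
    return reps
```

This pass compares every sequence against every representative, so it only suits small inputs; it is meant to show how the output file shrinks when near-identical sequences collapse into one representative.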
The general discussions about our software covered topics such as its application, performance, legal rights, value, impact on the scientific community, and the need for such software. We took in many opinions and suggestions from professionals in their fields and came out of this event with a result we are proud of, and we hope it can make a breakthrough in the scientific community.
Dr. Nicholas Ting Xun Ong
Our laboratory experiments showed that Gal4-QPas1 and LexA-QPas1 chimeras repress promoters up to 3-fold, which did not quite meet our expectations. To improve our system, we decided to seek advice from a scientist with expertise in working with BphP1. We therefore reached out to Dr. Nicholas Ting Xun Ong, whose Ph.D. research focused mainly on the BphP1/PpsR2 optogenetic system in E. coli.
The response we got was that, in general, building chimeric proteins to function as transcriptional regulators, e.g., an engineered PpsR2 repressor, is not an easy task. Dr. Ong also suggested taking a look at his former lab's recent publication about such efforts.
Dr. Ong's research shows that the low dynamic range between the induced and the uninduced states is not linked to PpsR2 binding to DNA. Instead, the key factor is how BphP1 interacts with PpsR2, that is, how BphP1 disrupts the dimeric structure of PpsR2 in the light-excited state. In one of his papers, he was able to show 76-fold repression by using PpsR2; however, BphP1, even after thorough optimization, derepressed only 2.5-fold.
Dr. Ong recommended trying the high-expression pBR_crtE promoter and was very optimistic about our idea to try Q-Pas1, a minimal PpsR2 domain, fused with a DNA binding domain. He said he had considered this idea but, due to a lack of time, did not try it.
Implementations after the consultation:
- We decided to try Q-Pas1 fused with the HTH domain from the PpsR2 protein.
- We decided to work with the pBR_crtE promoter, suggested by Dr. N. Ong.
- Knowing that all the promoters Dr. Ong used had two operator sequences, we decided to test new promoters, designed based on the ColE promoter, with only one operator sequence.