Difference between revisions of "Team:Vilnius-Lithuania/Human Practices"

Line 1,543: Line 1,543:
 
<p>The team's initial software idea was to build a metagenomic analysis desktop app for finding a protein with desired characteristics that would use local system resources and would work on any operating system. Before even starting to write the code, our team had a set of concerns about the software being a desktop app, data cleaning, and hardware problems like storage and processing power.</p>
 
  
<p>After our first consultation with Michail Chrunov, a desktop software developer, we decided to build the software as a web service application. According to Mr. Chrunov, this would broaden its reach and be more convenient for those who just want to try it without installing or downloading additional files. Additionally, the app would run on our hardware, so we could be sure that the analysis would complete without hardware-related errors.</p>
 
  
<p><b style="color: #f6cd61!important">Implementation:</b></p>
 
<ul>
 
<li>We decided to build the software as a web service application instead of a desktop app.</li><br>
 
</ul>
 
  
<p>The next major challenge was solving one of the biggest problems in metagenomics: removing duplicates from multi-million-sequence FASTA files and standardizing the sequences for further analysis. Although it seemed like a simple enough problem at first glance, we later found out that it is a well-known, ongoing issue in the bioinformatics field. We had lengthy consultations and discussions with Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, and Kotryna Kvederavičiūtė, and it took us many hours to arrive at an algorithmic solution. After implementing it and improving the code, we ended up with a tool that not only removes duplicates but also standardizes and cleans up the data, which is essential for a reliable analysis.</p>
 
  
<p><b style="color: #f6cd61!important">Implementation:</b></p>
 
<ul>
 
<li>Solved the duplicate problem and implemented standardization and cleanup algorithms.</li><br>
 
</ul>
 
 
<p>We then moved on to the problem of data storage space. Since metagenomic data collections take up terabytes, we figured that it would not be cost-effective to let every user store their data on our servers. Dr. Dapkūnas suggested using the best-in-class clustering tool for this job, which clusters the sequences and keeps only a certain percentage of representative ones in a new file that is up to 70% smaller. We were also advised that, when working with big data, the Apache Spark cluster-computing framework is a must.</p>
 
 
<p>The general discussions about our software covered its applications, performance, legal rights, value, impact on the scientific community, and the need for such software. We took in many opinions and suggestions from professionals in their fields and came out of this event with a result we are proud of, one that we hope can make a breakthrough in the scientific community.</p>
 
 
<p class="page-heading" style="font-size:1rem!important;text-align:center;">Photos from BioHackathon</p>
 
 
<img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/9/9e/T--Vilnius-Lithuania--Int-Bio-1.jpg">
 
Line 1,569: Line 1,555:
 
   <div class="modal-content">
 
   <div class="modal-content">
 
     <span class="closeModal" id="closeModal20">&times;</span>
 
     <span class="closeModal" id="closeModal20">&times;</span>
 +
<p class="page-heading">Michail Chrunov</p>
 +
<p>After our first consultation with Michail Chrunov, a desktop software developer, we decided to build the software as a web service application. According to Mr. Chrunov, this would broaden its reach and be more convenient for those who just want to try it without installing or downloading additional files. Additionally, the app would run on our hardware, so we could be sure that the analysis would complete without hardware-related errors.</p>
 +
 +
<p><b style="color: #f6cd61!important">Implementation:</b></p>
 +
<ul>
 +
<li>We decided to build the software as a web service application instead of a desktop app.</li><br>
 +
</ul><br>
 +
<p class="page-heading" style="font-size:1rem!important;text-align:center;">Photos from BioHackathon</p>
 
<img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/5/59/T--Vilnius-Lithuania--int-bio-1.jpg">
 
 
   </div>
 
   </div>
Line 1,577: Line 1,571:
 
   <div class="modal-content">
 
   <div class="modal-content">
 
     <span class="closeModal" id="closeModal21">&times;</span>
 
     <span class="closeModal" id="closeModal21">&times;</span>
 +
<p class="page-heading">Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, Kotryna Kvederavičiūtė</p>
 +
<p>The next major challenge was solving one of the biggest problems in metagenomics: removing duplicates from multi-million-sequence FASTA files and standardizing the sequences for further analysis. Although it seemed like a simple enough problem at first glance, we later found out that it is a well-known, ongoing issue in the bioinformatics field. We had lengthy consultations and discussions with Dr. Justas Dapkūnas, Dr. Kliment Olechnovič, and Kotryna Kvederavičiūtė, and it took us many hours to arrive at an algorithmic solution. After implementing it and improving the code, we ended up with a tool that not only removes duplicates but also standardizes and cleans up the data, which is essential for a reliable analysis.</p>
 +
 +
<p><b style="color: #f6cd61!important">Implementation:</b></p>
 +
<ul>
 +
<li>Solved the duplicate problem and implemented standardization and cleanup algorithms.</li><br>
 +
</ul>
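<p>The team's actual algorithm is not described on this page, so the following is only a minimal sketch of the general idea, assuming that "duplicates" means sequences that are identical after standardization. The function names (<code>read_fasta</code>, <code>standardize</code>, <code>deduplicate</code>) and the normalization rules (uppercasing, removing whitespace, dropping trailing <code>*</code> stop symbols) are hypothetical choices made for illustration.</p>

```python
import hashlib
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(handle: TextIO) -> Iterator[Record]:
    """Stream (header, sequence) pairs from a FASTA file, one record at a time."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def standardize(seq: str) -> str:
    """Assumed cleanup rules: no whitespace, uppercase, no trailing '*' symbols."""
    return "".join(seq.split()).upper().rstrip("*")


def deduplicate(records: Iterable[Record]) -> Iterator[Record]:
    """Yield the first record for each distinct standardized sequence."""
    seen = set()
    for header, seq in records:
        clean = standardize(seq)
        if not clean:
            continue  # drop records with an empty sequence
        # Store a fixed-size digest instead of the full sequence to save memory.
        digest = hashlib.sha1(clean.encode("utf-8")).digest()
        if digest in seen:
            continue
        seen.add(digest)
        yield header, clean
```

<p>Because both functions are generators, the file is processed record by record, and only one 20-byte digest per distinct sequence is kept in memory. Near-duplicates (sequences that differ by a few residues) are not handled by this sketch.</p>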
 +
 +
<p>We then moved on to the problem of data storage space. Since metagenomic data collections take up terabytes, we figured that it would not be cost-effective to let every user store their data on our servers. Dr. Dapkūnas suggested using the best-in-class clustering tool for this job, which clusters the sequences and keeps only a certain percentage of representative ones in a new file that is up to 70% smaller. We were also advised that, when working with big data, the Apache Spark cluster-computing framework is a must.</p>
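<p>The clustering tool is not named here, so the snippet below is not its method. It is only a rough illustration of how representative-based clustering can shrink a data set, assuming a simple greedy scheme. The function names, the default threshold, and the use of <code>difflib.SequenceMatcher</code> as a stand-in for sequence identity are all assumptions made for this sketch.</p>

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable, List


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; real tools use alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences: Iterable[str], threshold: float = 0.9) -> Dict[str, List[str]]:
    """Map each representative sequence to the members of its cluster."""
    clusters: Dict[str, List[str]] = {}
    # Longest sequences are considered first, so they become representatives.
    for seq in sorted(sequences, key=len, reverse=True):
        for rep in clusters:
            if similarity(rep, seq) >= threshold:
                clusters[rep].append(seq)  # close enough to an existing representative
                break
        else:
            clusters[seq] = [seq]  # no match found: start a new cluster
    return clusters
```

<p>Storing only the keys of the returned dictionary (the representatives) is what reduces the file size. This pairwise comparison scales quadratically in the worst case, so it is suitable only for small examples, not for terabyte-scale collections.</p>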
 +
 +
<p>The general discussions about our software covered its applications, performance, legal rights, value, impact on the scientific community, and the need for such software. We took in many opinions and suggestions from professionals in their fields and came out of this event with a result we are proud of, one that we hope can make a breakthrough in the scientific community.</p>
 
<img style="width:50%;margin:auto" src="https://2019.igem.org/wiki/images/4/43/T--Vilnius-Lithuania--int-bio-2.jpg">
 
 
   </div>
 
   </div>

Revision as of 16:28, 19 October 2019
