Modeling has become crucial to all scientific fields, and synthetic biology is no exception.
Computational methods can answer questions that cannot be addressed in the
lab. With Xylencer, we wanted to leverage this power to answer questions that are key to developing our
phage therapy. We combined a broad array of computational techniques: temporal and spatial
models, a custom physical protein modeling workflow, comparative genomics, and machine learning. This
page gives an overview of the questions we answered with these techniques and
links to more in-depth information.
Overview
Spatial-temporal modeling
How will our therapy function in a real-world setting?
Testing phage therapy on a single field, let alone a larger area, is far beyond the scope of an iGEM project. Still, field application is one of the crucial stages of the Xylencer project. To better understand how our therapy would spread and how efficiently it would cure an X. fastidiosa infection, we employed spatial-temporal modeling. Because X. fastidiosa poses a major threat to the European continent, the EU has already created models to assess the efficiency of different biocontainment approaches [1]. We took these models as a starting point and incorporated the Xylencer phages to assess the efficiency of our solution. This first required constructing a model of the interaction between the Xylencer phage and an X. fastidiosa colony inside the plant. Combining this model with the latest EU model yielded our final spatial spread model. From this model we concluded that the Xylencer phages can be effective at combating X. fastidiosa, which supports their potential as a much-needed cure.
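As a rough illustration of what a phage-colony interaction model can look like, the sketch below integrates a generic two-equation bacterium-phage system: logistic bacterial growth, mass-action infection, and first-order phage decay. It is not the model used in the project. The function name `simulate`, the equations, and every parameter value (`r`, `K`, `k`, `beta`, `m`) are assumptions chosen only for illustration.

```python
def simulate(b0, p0, days, dt=0.001, r=0.5, K=1.0, k=2.0, beta=5.0, m=0.2):
    """Integrate a toy bacterium-phage model with forward Euler steps.

    b: bacterial density, as a fraction of the carrying capacity K
    p: phage density, in arbitrary units
    r: bacterial growth rate, k: infection rate constant,
    beta: phages released per unit of infection, m: phage decay rate.
    All values are illustrative placeholders, not fitted parameters.
    """
    b, p = b0, p0
    steps = int(round(days / dt))
    for _ in range(steps):
        infection = k * b * p                      # mass-action infection term
        db = r * b * (1.0 - b / K) - infection     # logistic growth minus lysis
        dp = beta * infection - m * p              # phage release minus decay
        b = max(b + dt * db, 0.0)                  # densities cannot go negative
        p = max(p + dt * dp, 0.0)
    return b, p


# Compare an untreated colony with one that receives a small phage dose.
untreated, _ = simulate(b0=0.1, p0=0.0, days=30)
treated, phage = simulate(b0=0.1, p0=0.01, days=30)
print(f"bacteria without phage: {untreated:.3f}, with phage: {treated:.3g}")
```

A within-plant model of this kind produces a bacterial load per plant over time, which is the quantity a spatial spread model would need as input for each location.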
Genome-scale machine learning
How do we ensure our phage delivery bacterium is safe?
We revolutionize phage therapy with the phage delivery bacterium (PDB). However, selecting a compatible and safe strain for this role is complicated. Phage replication requires a high similarity in cell metabolism between X. fastidiosa and the PDB. This is problematic because most species closely related to Xylella are known to be phytopathogenic, making them unsafe to use. A few species have reported non-pathogenic strains, but some of these strains were later shown to be pathogenic under different testing conditions [2]. This casts doubt on the classification of the remaining non-pathogens. By combining comparative genomics with machine learning, we found a genetic basis for non-pathogenicity and selected a set of non-pathogens that conform to it. With this information, we selected Xanthomonas arboricola strain CITA 44 as the prime candidate for our PDB. By analyzing the machine learning model, we also identified the lack of a Type III secretion system and the lack of transposons linked to specific pathogenicity islands as important factors for non-pathogenicity in the Xanthomonas genus.
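The idea of linking gene content to pathogenicity can be illustrated with a much simpler stand-in than a trained classifier: score each gene family by how much more often it occurs in pathogenic strains than in non-pathogenic ones. The sketch below does exactly that. It is not the model used in the project; the function name `presence_difference`, the strain names, and the gene-family names are hypothetical toy data.

```python
def presence_difference(genomes, labels):
    """Rank gene families by how much more often they occur in pathogens.

    genomes: dict mapping strain name -> set of gene-family names
    labels:  dict mapping strain name -> True if pathogenic, else False
    Returns (gene_family, score) pairs sorted by descending score, where
    score = fraction of pathogens carrying the family
            minus fraction of non-pathogens carrying it.
    """
    pathogens = [s for s in genomes if labels[s]]
    harmless = [s for s in genomes if not labels[s]]
    if not pathogens or not harmless:
        raise ValueError("need at least one strain of each class")
    families = set().union(*genomes.values())
    scores = []
    for fam in families:
        in_path = sum(fam in genomes[s] for s in pathogens) / len(pathogens)
        in_harm = sum(fam in genomes[s] for s in harmless) / len(harmless)
        scores.append((fam, in_path - in_harm))
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: four strains and three gene families.
genomes = {
    "strain_A": {"gyrB", "T3SS_core", "IS_transposase"},
    "strain_B": {"gyrB", "T3SS_core"},
    "strain_C": {"gyrB"},
    "strain_D": {"gyrB"},
}
labels = {"strain_A": True, "strain_B": True, "strain_C": False, "strain_D": False}

ranking = presence_difference(genomes, labels)
for family, score in ranking:
    print(f"{family}: {score:+.2f}")
```

A family present in every strain scores zero and carries no information, while a family found only in pathogens scores highest. A real classifier additionally has to handle correlated features and shared ancestry between strains, which this score ignores.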
Physical modeling
Would our protein fusions be theoretically possible?
An important part of our therapy is the enhanced Xylencer phages. The creation of these phages hinges on a fusion of the phage capsid protein and an adhesion protein. Capsid proteins form complex multimers to construct the phage capsid, and a fusion protein must not restrict the capsid's ability to assemble. To assess whether the fusion sterically hinders capsid assembly, we made use of protein modeling.
Regular proteins can nowadays be modeled easily with a single web server. However, fusion proteins cannot be aligned to a single template, since they are composed of multiple proteins. This makes it hard to model them in high quality. We developed a workflow (described below), based on freely accessible web services, that allows anyone with basic knowledge of protein modeling to model and visualize fusion proteins. Using this workflow, we were able to show that either the chitin-binding domain of chitinase A1 (Bacillus circulans) or a GFP can be fused to the decorator protein of phage lambda, our model phage, without obstructing the trimer interface. We also used the same workflow to visualize the integration of the quorum sensing fusion protein RpfCch into the outer membrane of the phage delivery bacterium.
Fusion protein modeling workflow
A prevalent hypothesis on how proteins fold is that protein domains, the smallest functional units of proteins, fold first and do so individually. Only after the domains have folded do the remaining parts of the protein fold, in such a way as to minimize the free energy of the entire protein. By mirroring this proposed folding order, our workflow aims to achieve the best possible model. The workflow first detects the protein domains and models each detected domain individually. The full sequence is then threaded through the domain models, and the connecting linkers are modeled ab initio (Figure 3).
Methods
Protein domains were detected by running the entire sequence through the ThreaDomEx [3] web server. Detected domains were modeled using the I-TASSER [4] web server. Generated models were manually trimmed to remove linker sequences using UCSF ChimeraX [5], by removing approximately 5 to 20 amino acids at the start and end of each domain, allowing for more flexible modeling. The trimmed models were renumbered and assembled by the AIDA [6] web server. Additionally, for RpfCch the positioning in the outer membrane was calculated using the OPM [7] web server. The final models were visualized using ChimeraX.
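The trimming and renumbering step was done by hand in ChimeraX, but the operation itself is simple enough to sketch in a few lines for anyone who prefers to script it. The example below is an illustrative alternative, not part of the original workflow. It assumes a single-chain model in the fixed-width PDB format, keeps only ATOM records, and uses a hypothetical function name, `trim_and_renumber`.

```python
def trim_and_renumber(pdb_lines, n_start, n_end, first_number=1):
    """Drop residues from both termini of a single-chain model and renumber.

    pdb_lines: iterable of PDB-format lines
    n_start, n_end: number of residues to remove at the N- and C-terminus
    Only ATOM records are kept. Residue numbers are read from columns 23-26
    and the insertion code from column 27 of the fixed-width PDB format.
    """
    atoms = [ln.rstrip("\n") for ln in pdb_lines if ln.startswith("ATOM")]
    order = []  # residue keys in order of first appearance
    for ln in atoms:
        key = ln[22:27]
        if not order or order[-1] != key:
            order.append(key)
    kept = order[n_start:len(order) - n_end]
    new_number = {key: first_number + i for i, key in enumerate(kept)}
    out = []
    for ln in atoms:
        key = ln[22:27]
        if key in new_number:
            # Rewrite the residue number and blank the insertion code.
            out.append(f"{ln[:22]}{new_number[key]:>4d} {ln[27:]}")
    return out
```

For a hypothetical domain model stored in `domain1.pdb`, trimming ten residues from each end would look like `trimmed = trim_and_renumber(open("domain1.pdb"), 10, 10)`. How many residues to remove remains a manual judgment, since it depends on where the linker starts in each model.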
Model Examples
References
[1] Bragard, C., et al. (2019). Update of the Scientific Opinion on the risks to plant health posed by Xylella fastidiosa in the EU territory. EFSA Journal, 17(5).
[2] Ferrante, P., & Scortichini, M. (2018). Xanthomonas arboricola pv. fragariae: a confirmation of the pathogenicity of the pathotype strain. European Journal of Plant Pathology, 150(3), 825-829.
[3] Wang, Y., Wang, J., Li, R., Shi, Q., Xue, Z., & Zhang, Y. (2017). ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly. Nucleic Acids Research, 45(W1), W400-W407.
[4] Roy, A., Kucukural, A., & Zhang, Y. (2010). I-TASSER: a unified platform for automated protein structure and function prediction. Nature Protocols, 5(4), 725.
[5] Goddard, T. D., Huang, C. C., Meng, E. C., Pettersen, E. F., Couch, G. S., Morris, J. H., & Ferrin, T. E. (2018). UCSF ChimeraX: Meeting modern challenges in visualization and analysis. Protein Science, 27(1), 14-25.
[6] Xu, D., Jaroszewski, L., Li, Z., & Godzik, A. (2015). AIDA: ab initio domain assembly for automated multi-domain protein structure prediction and domain–domain interaction prediction. Bioinformatics, 31(13), 2098-2105.
[7] Lomize, M. A., Pogozheva, I. D., Joo, H., Mosberg, H. I., & Lomize, A. L. (2011). OPM database and PPM web server: resources for positioning of proteins in membranes. Nucleic Acids Research, 40(D1), D370-D376.