Modeling has become crucial in every scientific field, and synthetic biology is no exception.
Computational methods can answer questions that cannot be addressed in the lab. With Xylencer, we
wanted to leverage this power to answer questions that are key to developing our phage therapy. We
combined a broad array of computational techniques: temporal and spatial models, a custom physical
protein modeling workflow, comparative genomics, and machine learning. This page gives an overview
of the questions we answered with these techniques and provides links to more in-depth
information.
Spatial-temporal modeling
How will our therapy function in a real-world setting?
Testing phage therapy on a single field, let alone a larger area, is far beyond the scope of
an iGEM project. Still, this is one of the crucial stages in the Xylencer project. To gain a better
understanding of how our therapy would spread and how efficient it would be at curing an X.
fastidiosa infection, we employed spatial-temporal modeling. Since X. fastidiosa
poses a big threat to the European continent, the EU has already created models that can be used to
assess the efficiency of different biocontainment approaches. We took these models as a starting
point and incorporated the Xylencer phages to assess the efficiency of our solution. This first
required the construction of a model for the interaction between the Xylencer phage and an X.
fastidiosa colony inside the plant. Combining this model with the latest EU model
yielded our final spatial spread model. From this model we could conclude that the Xylencer phages
can be effective at combating X. fastidiosa, which would provide a much-needed cure.
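To make the idea of a phage and colony interaction model concrete, here is a minimal sketch in Python. It is not the Xylencer model or the EU spread model: it assumes a generic textbook formulation with logistic bacterial growth, mass-action infection, and first-order phage decay, and every parameter name and value (r, K, k, beta, m) is an arbitrary placeholder rather than a measured quantity.

```python
def simulate(b0, p0, r=0.5, K=1e8, k=1e-9, beta=50.0, m=0.1, dt=0.01, t_end=60.0):
    """Integrate a minimal bacteria-phage model with classical RK4.

    dB/dt = r*B*(1 - B/K) - k*B*P     (logistic growth minus infection)
    dP/dt = beta*k*B*P - m*P          (burst of new phages minus decay)

    All parameter values are illustrative assumptions, not fitted to
    X. fastidiosa. Returns a list of (time, bacteria, phages) tuples.
    """
    def deriv(b, p):
        infection = k * b * p
        return r * b * (1.0 - b / K) - infection, beta * infection - m * p

    b, p = float(b0), float(p0)
    history = [(0.0, b, p)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        k1b, k1p = deriv(b, p)
        k2b, k2p = deriv(b + 0.5 * dt * k1b, p + 0.5 * dt * k1p)
        k3b, k3p = deriv(b + 0.5 * dt * k2b, p + 0.5 * dt * k2p)
        k4b, k4p = deriv(b + dt * k3b, p + dt * k3p)
        # Clamp at zero so rounding can never produce negative densities.
        b = max(b + dt / 6.0 * (k1b + 2 * k2b + 2 * k3b + k4b), 0.0)
        p = max(p + dt / 6.0 * (k1p + 2 * k2p + 2 * k3p + k4p), 0.0)
        history.append((i * dt, b, p))
    return history


# Same starting colony, with and without an initial phage dose.
untreated = simulate(b0=1e6, p0=0.0)
treated = simulate(b0=1e6, p0=1e4)
```

Comparing the bacterial column of the two runs shows how a phage dose changes the colony trajectory under these assumed parameters. A spatial model would couple many such local systems through dispersal terms.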
How do we ensure our phage delivery bacterium is safe?
We revolutionize phage therapy with the phage delivery bacterium (PDB). However, selecting a
compatible and safe strain for this role is a complicated matter. Phage replication
requires a high similarity in cell metabolism between X. fastidiosa and the PDB. This
is problematic, as most species closely related to Xylella are known to be
phytopathogenic, making them unsafe to use. For a few species, non-pathogenic strains have been
reported, but some of these strains were later shown to be pathogenic under different testing
conditions. This casts doubt on the classification of the other reported non-pathogens. By combining
comparative genomics with machine learning, we found a genetic basis for non-pathogenicity and
selected a set of non-pathogens that conform to this genetic basis. With this information, we
selected Xanthomonas arboricola strain CITA 44 as the prime candidate for our PDB. By
analyzing the machine learning model, we also identified the lack of a Type III secretion system
and the lack of transposons linked to specific pathogenicity islands as important factors for
non-pathogenicity in the Xanthomonas genus.
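One simple way to see how a genetic basis can be read out of comparative genomics data is to score each gene family by how well its presence separates pathogenic from non-pathogenic genomes. The sketch below uses information gain, the criterion behind decision-tree splits. It is an assumption made for illustration and not necessarily the method we used; the strain names, gene-family labels, and pathogenicity labels are invented toy values, not real genome data.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        h -= p * math.log2(p)
    return h


def information_gain(genomes, feature):
    """Reduction in label entropy after splitting on presence of `feature`."""
    labels = [label for _, label in genomes.values()]
    present = [label for genes, label in genomes.values() if feature in genes]
    absent = [label for genes, label in genomes.values() if feature not in genes]
    n = len(labels)
    remainder = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - remainder


def rank_features(genomes):
    """Return (feature, gain) pairs sorted from most to least informative."""
    features = sorted(set().union(*(genes for genes, _ in genomes.values())))
    scored = [(f, information_gain(genomes, f)) for f in features]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented toy data: strain -> (gene families present, is_pathogenic).
toy_genomes = {
    "strain_P1": ({"core", "T3SS", "PAI_transposon"}, True),
    "strain_P2": ({"core", "T3SS", "PAI_transposon"}, True),
    "strain_P3": ({"core", "T3SS"}, True),
    "strain_N1": ({"core"}, False),
    "strain_N2": ({"core"}, False),
    "strain_N3": ({"core", "accessory"}, False),
}

for feature, gain in rank_features(toy_genomes):
    print(f"{feature}: {gain:.3f}")
```

In this toy set the T3SS feature separates the two classes perfectly, while a gene family present in every genome carries no information. Real analyses need many more genomes and a check for phylogenetic confounding before any feature is interpreted biologically.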
Would our protein fusions be theoretically possible?
An important part of our therapy is the enhanced Xylencer phages. The creation of these phages
hinges on a fusion of the phage capsid protein and an adhesion protein. Capsid proteins
form complex multimers to construct the phage capsid, and a fusion partner must not restrict the
capsid’s ability to assemble. To assess whether the fusion sterically hinders capsid assembly,
we used protein modeling.
Regular proteins can nowadays be modeled easily with a single web server. Fusion proteins,
however, cannot be aligned to a single template because they are composed of multiple
proteins, which makes high-quality models hard to obtain. We developed a
workflow (described below) based on freely accessible web services that allows anyone with
basic knowledge of protein modeling to model and visualize fusion proteins. Using this
workflow, we showed that either the chitin-binding domain of chitinase A1
(Bacillus circulans) or a GFP can be fused to the decorator protein of phage lambda,
our model phage, without obstructing the trimer interface.
We also used the same workflow to visualize the integration of the quorum sensing fusion protein
RpfCch into the outer membrane of the phage delivery bacterium.
Fusion protein modeling workflow
A prevalent hypothesis on how proteins fold is that protein domains, the smallest functional
units of proteins, fold first and do so individually. Only after the domains have folded do
the remaining parts of the protein fold in such a way as to minimize the free energy of the
entire protein. By following this folding order, our workflow aims to
achieve the best possible model. The workflow starts with the detection of the different
protein domains, after which each detected domain is modeled individually. Finally, the full
sequence is threaded through the domain models and the connecting linkers are modeled ab
initio (Figure 3).
Protein domains are detected by running the entire sequence through the ThreaDomEx web server. Detected domains
are modeled using the award-winning I-TASSER web server. The generated models
are manually trimmed in UCSF ChimeraX to remove linker sequences: approximately 5 to 20 amino
acids are removed at the start and end of each domain, allowing for more flexible modeling. The trimmed
models are renumbered and assembled by the AIDA web
server. Additionally, for RpfCch the positioning in the outer membrane was
calculated using the OPM web server.
The final models were visualized using ChimeraX.
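The trimming and renumbering step can also be scripted. The sketch below is an illustration only; in the workflow above this step was done by hand in ChimeraX. It assumes a single-chain model in standard fixed-column PDB format, and the function names, trim counts, and toy coordinates are hypothetical.

```python
def trim_and_renumber(pdb_lines, n_trim_start, n_trim_end, first_number=1):
    """Drop residues from both termini of a model and renumber the rest.

    Only ATOM/HETATM records are kept. A residue is identified by chain ID,
    residue number and insertion code (PDB columns 22-27).
    """
    atom_lines = [l for l in pdb_lines if l.startswith(("ATOM", "HETATM"))]

    # Collect residue identifiers in order of first appearance.
    residue_ids = []
    for line in atom_lines:
        rid = line[21:27]
        if not residue_ids or residue_ids[-1] != rid:
            residue_ids.append(rid)

    kept = residue_ids[n_trim_start:len(residue_ids) - n_trim_end]
    new_number = {rid: first_number + i for i, rid in enumerate(kept)}

    renumbered = []
    for line in atom_lines:
        rid = line[21:27]
        if rid in new_number:
            # Columns 23-26 hold the right-justified residue number;
            # column 27 (insertion code) is blanked after renumbering.
            renumbered.append(line[:22] + f"{new_number[rid]:>4d}" + " " + line[27:])
    return renumbered


def toy_ca_line(serial, resname, chain, resseq):
    """Build one C-alpha ATOM record for the demo (x is set to the residue number)."""
    x = float(resseq)
    return (f"ATOM  {serial:>5d}  CA  {resname:>3s} {chain}{resseq:>4d}    "
            f"{x:8.3f}{0.0:8.3f}{0.0:8.3f}  1.00  0.00           C")


# Toy domain of 10 residues numbered 101-110; trim 2 at the start and 3 at the end.
toy_model = [toy_ca_line(i, "ALA", "A", 100 + i) for i in range(1, 11)]
trimmed = trim_and_renumber(toy_model, n_trim_start=2, n_trim_end=3)
for line in trimmed:
    print(line)
```

Scripting the trim makes the number of residues removed explicit and repeatable. Where each domain actually ends still has to be decided by inspecting the model.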
C. Bragard et al., “Update of the Scientific Opinion on the risks to plant health posed by Xylella fastidiosa in the EU territory,” EFSA J., vol. 17, no. 5, May
Ferrante, P., & Scortichini, M. (2018). Xanthomonas arboricola pv.
confirmation of the pathogenicity of the pathotype strain. European journal of plant pathology, 150(3), 825-829.
Wang, Y., Wang, J., Li, R., Shi, Q., Xue, Z., & Zhang, Y. (2017). ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly. Nucleic acids research, 45(W1),
Roy, A., Kucukural, A., & Zhang, Y. (2010). I-TASSER: a unified platform
protein structure and function prediction. Nature protocols, 5(4), 725.
Goddard, T. D., Huang, C. C., Meng, E. C., Pettersen, E. F., Couch, G. S., Morris, J. H., & Ferrin, T. E. (2018). UCSF ChimeraX: Meeting modern challenges in
and analysis. Protein Science, 27(1), 14-25.
Xu, D., Jaroszewski, L., Li, Z., & Godzik, A. (2015). AIDA: ab initio
for automated multi-domain protein structure prediction and
prediction. Bioinformatics, 31(13), 2098-2105.
Lomize, M. A., Pogozheva, I. D., Joo, H., Mosberg, H. I., & Lomize, A. L. (2011). OPM database and PPM web server: resources for positioning of proteins in
acids research, 40(D1), D370-D376.